In a

previous blog entry, I talked about the properties of

Sturm Chains. In today's blog entry, I will show how they can be used to establish Sturm's Theorem.

Definition 1: Number of Sign Changes Across a Sturm Chain at xLet

f_{0}, f_{1}, ..., f_{s} be a Sturm Chain where for all

0 ≤ i ≤ s, f_{i} ≠ 0. The

number of sign changes across a Sturm Chain at x is the number of times that a neighboring function changes sign: the number of times when

f_{i}(x) is negative and

f_{i+1}(x) is positive or when

f_{i}(x) is negative and

f_{i+1}(x) is positive.

Example:For the function

f(x) = x^{5} - 3x - 1, the Sturm Chain is (see

here for details on how this Sturm Chain is constructed):

f_{0}(x) = x^{5} - 3x - 1f_{1}(x) = 5x^{4} - 3f_{2}(x) = 12x + 5f_{3}(x) = 1For

x=-2, we see that:

f_{0} = -f_{1} = +f_{2} = -f_{3} = +So, the

number of sign changes across the Sturm Chain at

x=-2 is

3.

Definition 2: Open and Closed IntervalsAn interval is

open on

a,b if for all

x in

a,b,

a is less than

x and is less than

b. An open interval is represented

(a,b). [It is called

open because there is no minimum or maximum value]

An interval is

closed on

a,b if for all

x in

a,b, a ≤ x ≤ b. A closed interval is represented

[a,b] (It is called closed because it has both a minimum and a maximum value)

We can also mix the two notations so that

x in

[a,b) means that

a ≤ x and

x is less than

b.

x in

(a,b] means that

a is less than

x ≤ b.

I use this notation below.

Lemma 1:Let

f be a continuous function such that

f(a) and

f(b) have different signs,

Then:

There exists

x such that

x in

(a,b) and

f(x)=0Proof:

This follows directly from the Intermediate Value Theorem (see Theorem,

here).

QED

Corollary 1.1:If

f is continuous on

[a,b] and for all

x in

[a,b]:

f(x) ≠ 0Then:

for all

x in

[a,b]:

f(x) has the same sign.

Proof:

(1) Assume that

f changes sign over

[a,b](2) Then by Lemma 1 above there exists

x such that

f(x)=0(3) But this is not true so we reject our assumption at step #1.

QED

Corollary 1.2:Let

f_{0}, f_{1}, ..., f_{s} be a Sturm Chain

Let

k be the only zero for any

f_{i} and it is a zero where

i ≥ 1.Let

a,b be an interval such that

k is in

[a,b]Then:

There are no sign changes across the Sturm Chain for

x in

[a,b]Proof:

(1) For all

x in

[a,b]:

f_{i-1}(x) ≠ 0 and

f_{i+1}(x) ≠ 0 and if

x ≠ k, then

f_{i}(k) ≠ 0.(2) Since

f_{i}(k)=0, we know that

f_{i-1}(k) and

f_{i+1}(k) have opposite signs. (see for properties of

Sturm Chains, see Definition 1,

here)

(3) Since

f_{i-1} and

f_{i+1} are nonzero on

[a,b], we know from Corollary 1.1 above that

f_{i-1} and

f_{i+1 } have the same opposite signs for all

x in

[a,b].

(4) But this means that any sign change on

f_{i} doesn't affect the total number of sign changes from

f_{i-1} to

f_{i} to

f_{i+1} since:

Before

k:

There are the following possibilities for all

f_{i-1}, f_{i}, f_{i+1}+, +, - = 1 sign change

+, -, - = 1 sign change

-, +, + = 1 sign change

-, -, + = 1 sign change

After

k:

There are the following possibilities for

f_{i-1}, f_{i}, f_{i+1}+, -, - = 1 sign change

+, +, - = 1 sign change

-, -, + = 1 sign change

-,+,+ = 1 sign change

(5) The above property is true even if there are multiple Sturm functions that are zero at

k. The key point is that by the properties of Sturm Chains, we know that no neighboring functions are both

0 (see

here for details on the properties of a Sturm Chain).

(6) It is possible that

f_{i}, f_{i+2}, etc. are all 0 on

k. But even in this case, the above logic holds and there are no sign changes for any of the Sturm functions.

(5) Because in all other cases, the Sturm functions

are nonzero, it follows from Corollary 1.1 above that they don't change sign and we are done.

QED

Corollary 1.3:Let

f_{0}, f_{1}, ..., f_{s} be a Sturm Chain

Let

[a,b] be an interval such that for all

x in

[a,b]:

f_{0}(x) ≠ 0 (but the other Sturm functions may have a zero)

.Let

Z(x) be the number of sign changes that occur across a Sturm Chain for a given x (see Definition 1 above)

Then:

There are no sign changes over the Sturm Chain in

[a,b]. That is,

Z(a) - Z(b) = 0.

Proof:

(1) Let

f_{0}, f_{1}, ..., f_{s} be a Sturm Chain

(2) Assume that

k is the only zero in

[a,b] for any function in the Sturm Chain where

i ≥ 1.

(3) In this case, the Theorem holds using Corollary 1.2 above.

(4) Assume that we order all zeros in

[a,b] for any function in the Sturm Chain and up to the

nth zero

k, there is no sign change across the Sturm Chain.

(7) Let

k be the

nth zero in

[a,b] and

k' be the

n+1th zero in

[a,b] and

k'' be the

n+2th zero. Let

i,i',i'' be the function that is zero so that we have:

f_{i}(k)=0, f_{i'}(k')=0, and

f_{i''}(k'') = 0.

(8) Since

k' is the only zero on

(k,k''), we can use Corollary 1.2 above to complete the inductive proof.

QED

Lemma 2:Let

f_{0}, f_{1}, ..., f_{s} be a Sturm Chain

Let

k be the only zero for

f_{0} in

[a,b]Let

Z(x) be the number of sign changes that occur across a Sturm Chain for a given

x.

Then:

Z(a) - Z(b) = 1.

_{}Proof:

(1) Let

h be a point before

k and

l be a pointer after

k such that

f'(h) has the same sign as

f'(k) and as

f'(l). (See Property 3 of Sturm Chains in Definition 1,

here)

(4) We have the following cases to consider:

Case I:

f(h) is positive and

f(l) is negative

(a) For all

x in

(h,l):

f(x) is decreasing, so its derivative

f'(x) is negative. [See Lemma

here if needed]

(b) So at

h:

f_{0}(h)=f(h) is positive and

f_{1}(h)=f'(h) is negative.

(c) At

l:

f_{0}(l) is negative and

f_{1}(l) is negative.

(d) So,

Z(h) - Z(l) = 1.

Case II:

f(h) is negative and

f(l) is positive

(a) For all

x in

(h,l):

f(x) is increasing, so its derivative

f'(x) is positive. [See Lemma

here if needed]

(b) So at

h:

f_{0}(h) is negative and

f_{1}(h) is positive.

(c) At

l:

f_{0}(l) is positive and

f_{1}(l) is positive.

(d) So, again,

Z(h) - Z(l) = 1.

QED

Theorem: Sturm's TheoremLet

f(x) be an algebraic equation with real coefficients and with only simple roots.

Let

(a,b) be an interval such that

f(a) ≠ 0 and

f(b) ≠ 0 and

a is less than

b.

Let

Z(x) be the number of sign changes that occur across a Sturm Chain for a given x (see Definition 1 above if needed).

Then:

The number of real roots occurring in

(a,b) is equal to

Z(a) - Z(b)Proof:

(1) Let

f_{0}, f_{1}, ..., f_{s} be a Sturm Chain for

f(x) where

f_{0} = f. [See Lemma 1,

here]

(2) Let

[a,b] be an interval such that

a is less than

b and

a,b are real numbers.

(3) Let

Z(x) be the number of the sign changes across the Sturm Chain for a given

x.

(4) There are three cases that we need to consider to prove this theorem.

Case I: No zeros for any function in the Sturm Chain

(a) To prove the theorem for this case, we need to show that

Z(a) - Z(b) = 0.

(b) For all

x in

[a,b], for all

0 ≤ i ≤ s, f_{i}(x) ≠ 0(c) But then by Corollary 1.1 above, for

f_{i}, there is no sign changes for any function in the Sturm Chain.

(d) So,

Z(a) - Z(b)=0Case II: No zeros in

(a,b) for

f_{0}(x) but at least one zero for the other functions in the Sturm Chain.

This case is equivalent to Corollary 1.3 above.

Case III: At least one zero in

(a,b) for

f(x)(a)

∃x such that

a ≤ x ≤ b and

f_{0}(x) = 0(b) To prove this, I will use induction.

(c) Assume that there is only one such zero in

[a,b](d) Using Lemma 2 above, we know that in this case,

Z(a) - Z(b) = 1.

(e) Assume that the theorem is true up to the

nth zero of

f in

[a,b](f) Let

k be the

nth zero,

k' be the

n+1th zero and

k'' be the

n+2th zero.

(g) Then there is only zero in

(k,k'') which is located at

k'.

(h) Let

l be a point in

(k,k') and

l' be a point in

(k',k'').(i) Then, by Lemma 2 above

Z(l) - Z(l') = 1.

(j) From step e above, we have

Z(a) - Z(l) = n and from step i above we have

Z(a) - Z(l') = n+1.

QED

References