Definition 1: Number of Sign Changes Across a Sturm Chain at x

Let f

_{0}, f

_{1}, ..., f

_{s}be a Sturm Chain where for all 0 ≤ i ≤ s, f

_{i}≠ 0. The number of sign changes across a Sturm Chain at x is the number of times that a neighboring function changes sign: the number of times when f

_{i}(x) is negative and f

_{i+1}(x) is positive or when f

_{i}(x) is negative and f

_{i+1}(x) is positive.

Example:

For the function f(x) = x

^{5}- 3x - 1, the Sturm Chain is (see here for details on how this Sturm Chain is constructed):

f

_{0}(x) = x

^{5}- 3x - 1

f

_{1}(x) = 5x

^{4}- 3

f

_{2}(x) = 12x + 5

f

_{3}(x) = 1

For x=-2, we see that:

f

_{0}= -

f

_{1}= +

f

_{2}= -

f

_{3}= +

So, the number of sign changes across the Sturm Chain at x=-2 is 3.

Definition 2: Open and Closed Intervals

An interval is open on a,b if for all x in a,b, a is less than x and is less than b. An open interval is represented (a,b). [It is called open because there is no minimum or maximum value]

An interval is closed on a,b if for all x in a,b, a ≤ x ≤ b. A closed interval is represented [a,b] (It is called closed because it has both a minimum and a maximum value)

We can also mix the two notations so that x in [a,b) means that a ≤ x and x is less than b.

x in (a,b] means that a is less than x ≤ b.

I use this notation below.

Lemma 1:

Let f be a continuous function such that f(a) and f(b) have different signs,

Then:

There exists x such that x in (a,b) and f(x)=0

Proof:

This follows directly from the Intermediate Value Theorem (see Theorem, here).

QED

Corollary 1.1:

If f is continuous on [a,b] and for all x in [a,b]:

f(x) ≠ 0

Then:

for all x in [a,b]:

f(x) has the same sign.

Proof:

(1) Assume that f changes sign over [a,b]

(2) Then by Lemma 1 above there exists x such that f(x)=0

(3) But this is not true so we reject our assumption at step #1.

QED

Corollary 1.2:

Let f

_{0}, f

_{1}, ..., f

_{s}be a Sturm Chain

Let k be the only zero for any f

_{i}and it is a zero where i ≥ 1.

Let a,b be an interval such that k is in [a,b]

Then:

There are no sign changes across the Sturm Chain for x in [a,b]

Proof:

(1) For all x in [a,b]:

f

_{i-1}(x) ≠ 0 and f

_{i+1}(x) ≠ 0 and if x ≠ k, then f

_{i}(k) ≠ 0.

(2) Since f

_{i}(k)=0, we know that f

_{i-1}(k) and f

_{i+1}(k) have opposite signs. (see for properties of Sturm Chains, see Definition 1, here)

(3) Since f

_{i-1}and f

_{i+1}are nonzero on [a,b], we know from Corollary 1.1 above that f

_{i-1}and f

_{i+1 }have the same opposite signs for all x in [a,b].

(4) But this means that any sign change on f

_{i}doesn't affect the total number of sign changes from f

_{i-1}to f

_{i}to f

_{i+1}since:

Before k:

There are the following possibilities for all f

_{i-1}, f

_{i}, f

_{i+1}

+, +, - = 1 sign change

+, -, - = 1 sign change

-, +, + = 1 sign change

-, -, + = 1 sign change

After k:

There are the following possibilities for f

_{i-1}, f

_{i}, f

_{i+1}

+, -, - = 1 sign change

+, +, - = 1 sign change

-, -, + = 1 sign change

-,+,+ = 1 sign change

(5) The above property is true even if there are multiple Sturm functions that are zero at k. The key point is that by the properties of Sturm Chains, we know that no neighboring functions are both 0 (see here for details on the properties of a Sturm Chain).

(6) It is possible that f

_{i}, f

_{i+2}, etc. are all 0 on k. But even in this case, the above logic holds and there are no sign changes for any of the Sturm functions.

(5) Because in all other cases, the Sturm functions are nonzero, it follows from Corollary 1.1 above that they don't change sign and we are done.

QED

Corollary 1.3:

Let f

_{0}, f

_{1}, ..., f

_{s}be a Sturm Chain

Let [a,b] be an interval such that for all x in [a,b]:

f

_{0}(x) ≠ 0 (but the other Sturm functions may have a zero).

Let Z(x) be the number of sign changes that occur across a Sturm Chain for a given x (see Definition 1 above)

Then:

There are no sign changes over the Sturm Chain in [a,b]. That is, Z(a) - Z(b) = 0.

Proof:

(1) Let f

_{0}, f

_{1}, ..., f

_{s}be a Sturm Chain

(2) Assume that k is the only zero in [a,b] for any function in the Sturm Chain where i ≥ 1.

(3) In this case, the Theorem holds using Corollary 1.2 above.

(4) Assume that we order all zeros in [a,b] for any function in the Sturm Chain and up to the nth zero k, there is no sign change across the Sturm Chain.

(7) Let k be the nth zero in [a,b] and k' be the n+1th zero in [a,b] and k'' be the n+2th zero. Let i,i',i'' be the function that is zero so that we have: f

_{i}(k)=0, f

_{i'}(k')=0, and f

_{i''}(k'') = 0.

(8) Since k' is the only zero on (k,k''), we can use Corollary 1.2 above to complete the inductive proof.

QED

Lemma 2:

Let f

_{0}, f

_{1}, ..., f

_{s}be a Sturm Chain

Let k be the only zero for f

_{0}in [a,b]

Let Z(x) be the number of sign changes that occur across a Sturm Chain for a given x.

Then:

Z(a) - Z(b) = 1.

_{}Proof:

(1) Let h be a point before k and l be a pointer after k such that f'(h) has the same sign as f'(k) and as f'(l). (See Property 3 of Sturm Chains in Definition 1, here)

(4) We have the following cases to consider:

Case I: f(h) is positive and f(l) is negative

(a) For all x in (h,l):

f(x) is decreasing, so its derivative f'(x) is negative. [See Lemma here if needed]

(b) So at h:

f

_{0}(h)=f(h) is positive and f

_{1}(h)=f'(h) is negative.

(c) At l:

f

_{0}(l) is negative and f

_{1}(l) is negative.

(d) So, Z(h) - Z(l) = 1.

Case II: f(h) is negative and f(l) is positive

(a) For all x in (h,l):

f(x) is increasing, so its derivative f'(x) is positive. [See Lemma here if needed]

(b) So at h:

f

_{0}(h) is negative and f

_{1}(h) is positive.

(c) At l:

f

_{0}(l) is positive and f

_{1}(l) is positive.

(d) So, again, Z(h) - Z(l) = 1.

QED

Theorem: Sturm's Theorem

Let f(x) be an algebraic equation with real coefficients and with only simple roots.

Let (a,b) be an interval such that f(a) ≠ 0 and f(b) ≠ 0 and a is less than b.

Let Z(x) be the number of sign changes that occur across a Sturm Chain for a given x (see Definition 1 above if needed).

Then:

The number of real roots occurring in (a,b) is equal to Z(a) - Z(b)

Proof:

(1) Let f

_{0}, f

_{1}, ..., f

_{s}be a Sturm Chain for f(x) where f

_{0}= f. [See Lemma 1, here]

(2) Let [a,b] be an interval such that a is less than b and a,b are real numbers.

(3) Let Z(x) be the number of the sign changes across the Sturm Chain for a given x.

(4) There are three cases that we need to consider to prove this theorem.

Case I: No zeros for any function in the Sturm Chain

(a) To prove the theorem for this case, we need to show that Z(a) - Z(b) = 0.

(b) For all x in [a,b], for all 0 ≤ i ≤ s, f

_{i}(x) ≠ 0

(c) But then by Corollary 1.1 above, for f

_{i}, there is no sign changes for any function in the Sturm Chain.

(d) So, Z(a) - Z(b)=0

Case II: No zeros in (a,b) for f

_{0}(x) but at least one zero for the other functions in the Sturm Chain.

This case is equivalent to Corollary 1.3 above.

Case III: At least one zero in (a,b) for f(x)

(a) ∃x such that a ≤ x ≤ b and f

_{0}(x) = 0

(b) To prove this, I will use induction.

(c) Assume that there is only one such zero in [a,b]

(d) Using Lemma 2 above, we know that in this case, Z(a) - Z(b) = 1.

(e) Assume that the theorem is true up to the nth zero of f in [a,b]

(f) Let k be the nth zero, k' be the n+1th zero and k'' be the n+2th zero.

(g) Then there is only zero in (k,k'') which is located at k'.

(h) Let l be a point in (k,k') and l' be a point in (k',k'').

(i) Then, by Lemma 2 above Z(l) - Z(l') = 1.

(j) From step e above, we have Z(a) - Z(l) = n and from step i above we have Z(a) - Z(l') = n+1.

QED

References

- Heinrich Dorrie (Translated by David Antin), 100 Great Problems of Elementary Mathematics (Dover, 1965)

## No comments:

Post a Comment