In a
previous blog entry, I talked about the properties of
Sturm Chains. In today's blog entry, I will show how they can be used to establish Sturm's Theorem.
Definition 1: Number of Sign Changes Across a Sturm Chain at xLet
f0, f1, ..., fs be a Sturm Chain where for all
0 ≤ i ≤ s, fi ≠ 0. The
number of sign changes across a Sturm Chain at x is the number of times that a neighboring function changes sign: the number of times when
fi(x) is negative and
fi+1(x) is positive or when
fi(x) is negative and
fi+1(x) is positive.
Example:For the function
f(x) = x5 - 3x - 1, the Sturm Chain is (see
here for details on how this Sturm Chain is constructed):
f0(x) = x5 - 3x - 1f1(x) = 5x4 - 3f2(x) = 12x + 5f3(x) = 1For
x=-2, we see that:
f0 = -f1 = +f2 = -f3 = +So, the
number of sign changes across the Sturm Chain at
x=-2 is
3.
Definition 2: Open and Closed IntervalsAn interval is
open on
a,b if for all
x in
a,b,
a is less than
x and is less than
b. An open interval is represented
(a,b). [It is called
open because there is no minimum or maximum value]
An interval is
closed on
a,b if for all
x in
a,b, a ≤ x ≤ b. A closed interval is represented
[a,b] (It is called closed because it has both a minimum and a maximum value)
We can also mix the two notations so that
x in
[a,b) means that
a ≤ x and
x is less than
b.
x in
(a,b] means that
a is less than
x ≤ b.
I use this notation below.
Lemma 1:Let
f be a continuous function such that
f(a) and
f(b) have different signs,
Then:
There exists
x such that
x in
(a,b) and
f(x)=0Proof:
This follows directly from the Intermediate Value Theorem (see Theorem,
here).
QED
Corollary 1.1:If
f is continuous on
[a,b] and for all
x in
[a,b]:
f(x) ≠ 0Then:
for all
x in
[a,b]:
f(x) has the same sign.
Proof:
(1) Assume that
f changes sign over
[a,b](2) Then by Lemma 1 above there exists
x such that
f(x)=0(3) But this is not true so we reject our assumption at step #1.
QED
Corollary 1.2:Let
f0, f1, ..., fs be a Sturm Chain
Let
k be the only zero for any
fi and it is a zero where
i ≥ 1.Let
a,b be an interval such that
k is in
[a,b]Then:
There are no sign changes across the Sturm Chain for
x in
[a,b]Proof:
(1) For all
x in
[a,b]:
fi-1(x) ≠ 0 and
fi+1(x) ≠ 0 and if
x ≠ k, then
fi(k) ≠ 0.(2) Since
fi(k)=0, we know that
fi-1(k) and
fi+1(k) have opposite signs. (see for properties of
Sturm Chains, see Definition 1,
here)
(3) Since
fi-1 and
fi+1 are nonzero on
[a,b], we know from Corollary 1.1 above that
fi-1 and
fi+1 have the same opposite signs for all
x in
[a,b].
(4) But this means that any sign change on
fi doesn't affect the total number of sign changes from
fi-1 to
fi to
fi+1 since:
Before
k:
There are the following possibilities for all
fi-1, fi, fi+1+, +, - = 1 sign change
+, -, - = 1 sign change
-, +, + = 1 sign change
-, -, + = 1 sign change
After
k:
There are the following possibilities for
fi-1, fi, fi+1+, -, - = 1 sign change
+, +, - = 1 sign change
-, -, + = 1 sign change
-,+,+ = 1 sign change
(5) The above property is true even if there are multiple Sturm functions that are zero at
k. The key point is that by the properties of Sturm Chains, we know that no neighboring functions are both
0 (see
here for details on the properties of a Sturm Chain).
(6) It is possible that
fi, fi+2, etc. are all 0 on
k. But even in this case, the above logic holds and there are no sign changes for any of the Sturm functions.
(5) Because in all other cases, the Sturm functions
are nonzero, it follows from Corollary 1.1 above that they don't change sign and we are done.
QED
Corollary 1.3:Let
f0, f1, ..., fs be a Sturm Chain
Let
[a,b] be an interval such that for all
x in
[a,b]:
f0(x) ≠ 0 (but the other Sturm functions may have a zero)
.Let
Z(x) be the number of sign changes that occur across a Sturm Chain for a given x (see Definition 1 above)
Then:
There are no sign changes over the Sturm Chain in
[a,b]. That is,
Z(a) - Z(b) = 0.
Proof:
(1) Let
f0, f1, ..., fs be a Sturm Chain
(2) Assume that
k is the only zero in
[a,b] for any function in the Sturm Chain where
i ≥ 1.
(3) In this case, the Theorem holds using Corollary 1.2 above.
(4) Assume that we order all zeros in
[a,b] for any function in the Sturm Chain and up to the
nth zero
k, there is no sign change across the Sturm Chain.
(7) Let
k be the
nth zero in
[a,b] and
k' be the
n+1th zero in
[a,b] and
k'' be the
n+2th zero. Let
i,i',i'' be the function that is zero so that we have:
fi(k)=0, fi'(k')=0, and
fi''(k'') = 0.
(8) Since
k' is the only zero on
(k,k''), we can use Corollary 1.2 above to complete the inductive proof.
QED
Lemma 2:Let
f0, f1, ..., fs be a Sturm Chain
Let
k be the only zero for
f0 in
[a,b]Let
Z(x) be the number of sign changes that occur across a Sturm Chain for a given
x.
Then:
Z(a) - Z(b) = 1.
Proof:
(1) Let
h be a point before
k and
l be a pointer after
k such that
f'(h) has the same sign as
f'(k) and as
f'(l). (See Property 3 of Sturm Chains in Definition 1,
here)
(4) We have the following cases to consider:
Case I:
f(h) is positive and
f(l) is negative
(a) For all
x in
(h,l):
f(x) is decreasing, so its derivative
f'(x) is negative. [See Lemma
here if needed]
(b) So at
h:
f0(h)=f(h) is positive and
f1(h)=f'(h) is negative.
(c) At
l:
f0(l) is negative and
f1(l) is negative.
(d) So,
Z(h) - Z(l) = 1.
Case II:
f(h) is negative and
f(l) is positive
(a) For all
x in
(h,l):
f(x) is increasing, so its derivative
f'(x) is positive. [See Lemma
here if needed]
(b) So at
h:
f0(h) is negative and
f1(h) is positive.
(c) At
l:
f0(l) is positive and
f1(l) is positive.
(d) So, again,
Z(h) - Z(l) = 1.
QED
Theorem: Sturm's TheoremLet
f(x) be an algebraic equation with real coefficients and with only simple roots.
Let
(a,b) be an interval such that
f(a) ≠ 0 and
f(b) ≠ 0 and
a is less than
b.
Let
Z(x) be the number of sign changes that occur across a Sturm Chain for a given x (see Definition 1 above if needed).
Then:
The number of real roots occurring in
(a,b) is equal to
Z(a) - Z(b)Proof:
(1) Let
f0, f1, ..., fs be a Sturm Chain for
f(x) where
f0 = f. [See Lemma 1,
here]
(2) Let
[a,b] be an interval such that
a is less than
b and
a,b are real numbers.
(3) Let
Z(x) be the number of the sign changes across the Sturm Chain for a given
x.
(4) There are three cases that we need to consider to prove this theorem.
Case I: No zeros for any function in the Sturm Chain
(a) To prove the theorem for this case, we need to show that
Z(a) - Z(b) = 0.
(b) For all
x in
[a,b], for all
0 ≤ i ≤ s, fi(x) ≠ 0(c) But then by Corollary 1.1 above, for
fi, there is no sign changes for any function in the Sturm Chain.
(d) So,
Z(a) - Z(b)=0Case II: No zeros in
(a,b) for
f0(x) but at least one zero for the other functions in the Sturm Chain.
This case is equivalent to Corollary 1.3 above.
Case III: At least one zero in
(a,b) for
f(x)(a)
∃x such that
a ≤ x ≤ b and
f0(x) = 0(b) To prove this, I will use induction.
(c) Assume that there is only one such zero in
[a,b](d) Using Lemma 2 above, we know that in this case,
Z(a) - Z(b) = 1.
(e) Assume that the theorem is true up to the
nth zero of
f in
[a,b](f) Let
k be the
nth zero,
k' be the
n+1th zero and
k'' be the
n+2th zero.
(g) Then there is only zero in
(k,k'') which is located at
k'.
(h) Let
l be a point in
(k,k') and
l' be a point in
(k',k'').(i) Then, by Lemma 2 above
Z(l) - Z(l') = 1.
(j) From step e above, we have
Z(a) - Z(l) = n and from step i above we have
Z(a) - Z(l') = n+1.
QED
References