In today's blog, I will show a proof for Newton's formula. Interestingly, the proof revolves around a peculiar definition. The proof is taken from Jean-Pierre Tignol's Galois' Theory of Algebraic Equations and the definition in question is denoted as τ(a,b).
If you are reading Tignol's book, you may wonder why I am using same notation (σk and sk) in the opposite way. I do this in order to stay consistent with Harold Edwards who I referenced in my previous blogs.
τ(a,b) is defined as the sum of unique a-combinations of the n roots of a polynomial labeled x1, x2, ..., xn where one of the root is set to the power of b [For review of why a polynomial of degree n necessarily has n roots, see here for discussion on the fundamental theorem of algebra which guarantees this result]. This means that there are two different conditions that need to be considered. When b ≥ 2, regardless of which number is put to the power of b, there are n*C(n-1,a-1) terms in this equation. When b = 1, there are still C(n,a) possible terms in this equation.
If a ≥ 2, then n*C(n,a-1) is greater than C(n,a) since C(n,a) = (n/a)*C(n-1,a-1)
[Here's the reason why: (n/a)*C(n-1,a-1) = (n/a)*(n-1)!/[(n - 1 -[a-1])!(a-1)!] = n!/[(a)[n - 1 - a + 1]!(a-1)!] = n!/[a!][n-a] = C(n,a)]
With these two conditions in mind, we can now look at the definitions that are needed for the proof.
Definition 1: τ(a,b)
for b ≥ 2:
τ(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xibxfj(1)*xfj(2)*...*xfj(a-1)
for b = 1:
τ(a,1) = ∑ (j=1,C(n,a)) xfj(1)*xfj(2)*...*xfj(k)
where:
C(n-1,a-1) = (n-1)!/{(a-1)!(n-1-[a-1])!}
fj(x) is a function where:
- the range of fj(x) is {1, 2, ..., n}
- fj is such that for all x,y if x ≠ y, then fj(x) ≠ fj(y)
- each fj is a unique (a-1)-way selection of n-1 integers (not including i) so that we can assume that fj(1) is less than fj(2), etc. [since there is only one permutation of any selection that has this characteristic]
For n=3 with {x1, x2, x3 }
C(n-1,a-1) = C(2,1) = 2!/1!(2-1)! = 2
So there are n*C(n-1,a-1) =3*2 = 6 terms
τ(2,2) =
x12x2 + x12x3 + x22x1 +
x22x3 +
x32x1 +
x32x2
Definition 2: sk
sk = ∑ (i=1, n) xik
Example:
For n=3 with {x1, x2, x3},
s4 = x14 + x24 + x34
Lemma 1: τ(1,b) = sb
Proof:
(1) τ(1,b) = ∑ (i=1,n) xib using Definition 1 above.
(2) Then τ(1,b) = sb using Definition 2 above.
QED
Definition 3: σi
σk = ∑ (i=1,C(n,k)) xfi(1)*xfi(2)*...*xfi(k)
where:
C(n,k) = (n)!/{(k)!(n-k)!}
fi(x) is a function where:
- the range of fi(x) is {1, 2, ..., n}
- fi is such that for x,y if x ≠ y, then fi(x) ≠ fi(y)
- each fi is a unique k-way selection of n integers so that we can assume that fi(1) is less than fi(2), etc. [since there is only one permutation of any selection that has this characteristic]
For n=3 with {x1, x2, x3 }
C(n,k) = C(3,2) = 3!/2!(3-2)! = 3
So there are 3 terms
σ2 = x1*x2 + x1*x3 + x2*x3
Lemma 2: τ(a,1) = σa
Proof:
This is clear since for b=1, τ(a,1) and σa have the same definition.
QED
Lemma 3:
For a less than n and b greater than 2:
τ(a,b) = σasb-1 - τ(a+1,b-1)
Proof:
(1) τ(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xibxfj(1)*xfj(2)*...*xfj(a-1) [See Definition 1 above]
= ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xi(b-1)xixfj(1)*xfj(2)*...*xfj(a-1)
(2) σk = ∑ (i=1,C(n,k)) xfi(1)*xfi(2)*...*xfi(k) [See Definition 3 above]
(3) sb-1 = ∑ (i=1, n) xi(b-1)
(4) σasb-1 = [ ∑ (i=1,C(n,a)) xfi(1)*xfi(2)*...*xfi(a)][ ∑ (j=1, n) xj(b-1)]
(5) τ(a+1,b-1) = ∑ (i=1, n) ∑ (j=1, C(n-1,a)) xib-1xfj(1)*xfj(2)*...*xfj(a)
(6) Now each term in step #4 consists of the following form:
xjb-1*xfi(1)*...*xfi(a)
(7) Each term can then be categorized into two cases:
Case I: xj is the same as one of the values in xfi(1), xfi(2), ..., xfi(a)
Case II: xj is not the same as any of the values in xfi(1), xfi(2), ..., xfi(a)
(8) If C1 is the sum of all the terms that fall in Case I and C2 is the sum of all terms that fall into Case II, we can see that:
σasb-1 = C1 + C2
(9) Now, it is clear that for a less than n and b greater than 2:
C1 = τ(a,b) [See step #1 above]
(10) It is also clear that for a less than n and b greater than 2:
C2 = τ(a+1,b-1)
(11) Since C1 = σasb-1 - C2, it follows that:
τ(a,b) = σasb-1 - τ(a+1,b-1)
QED
Lemma 4:
if a is less than n:
τ(a,2) = σas1 - (a+1)σa+1
Proof:
(1) τ(a,2) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xi2xfj(1)*xfj(2)*...*xfj(a-1) = = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) [xixfj(1)*xfj(2)*...*xfj(a-1) ]*xi
(2) σa = ∑ (i=1,C(n,a)) xfi(1)*xfi(2)*...*xfi(a)
(3) s1 = ∑ (i=1, n) xi
(4) σas1 = [∑ (i=1,C(n,a)) xfi(1)*xfi(2)*...*xfi(a)][∑ (j=1, n) xj]
(5) Now each term in step #4 consists of the following form:
xj*xfi(1)*...*xfi(a)
(7) Each term can then be categorized into two cases:
Case I: xj is the same as one of the values in xfi(1), xfi(2), ..., xfi(a)
Case II: xj is not the same as any of the values in xfi(1), xfi(2), ..., xfi(a)
(8) If C1 is the sum of all the terms that fall in Case I and C2 is the sum of all terms that fall into Case II, we can see that:
σas1 = C1 + C2
(9) Now, it is clear that for a less than n:
C1 = τ(a,2) [See step #1 above]
(10) It is also clear that for a less than n:
C2 = (a+1)σa+1 since:
(a) σ(a+1) = ∑ (i=1,C(n,a+1)) xfi(1)*xfi(2)*...*xfi(a)*xfi(a+1)
(b) Now, removing all the terms that are case I, it is clear that there a+1 ways to reach each term in step (a).
For example, the combination x1*x2*...*xa*xa+1
This can be reached from:
x1 * (x2x3*...*xa+1) x2 * (x1x3*...*xa+1) ... xa+1*(x1x2*...*xa)
We can make the same argument for each possible term in C2
(11) Since C1 = σas1 - C2, it follows that:
τ(a,2) = σas1 - (a+1)σa+1
QED
Lemma 5:
if b ≥ 2, then:
τ(n,b) = σnsb-1
Proof:
(1) τ(n,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xibxfj(1)*xfj(2)*...*xfj(a-1)
(2) Since b ≥ 2, we have:
τ(n,b) = x1bx2*...*xn + x2bx1*...*xn + ... + xnbx1*...*xn-1 =
= x1b-1x1x2*...*xn + x2b-1x1x2*...*xn + ... + xnb-1x1*...*xn =
= (x1b-1 + x2b-1 + ... + xnb-1)(x1x2*...xn)
(3) σn = x1*x2*...*xn
(4) sb-1 = x1b-1 + x2b-1 + ... + xnb-1
(5) It follows from step #2, that:
τ(n,b) = σnsb-1
QED
Theorem 6: Newton's Formula
Proof:
(1) τ(1,k) = sk [See Lemma 1 above]
(2) Using Lemma 3 above if k ≥ 3, we have:
τ(1,k) = σ1sk-1 - τ(2,k-1) = sk
(3) Assuming that k-1 ≥ 3, Lemma 3 gives us:
τ(2,k-1) = σ2sk-2 - τ(3,k-2)
(4) Following this pattern, if k ≤ n, we eventually get the following:
sk = σ1sk-1 - σ2sk-2 + ... + (-1)kτ(k-1,2)
(5) Using Lemma 4 above:
τ(k-1,2) = σk-1s1 - (k)σk
(6) Thus, we have:
sk = σ1sk-1 - σ2sk-2 + ... + (-1)kσk-1s1 + (-1)k+1kσk
(7) Subtracting sk from both sides and then multiplying -1 gives us:
0 = sk - σ1sk-1 + σ2sk-2 + ... + (-1)k-1σk-1s1 + (-1)kkσk
(8) If k is greater than n, then we eventually get to:
sk = σ1sk-1 - σ2sk-2 + ... + (-1)n+1τ(n,k+1-n)
(9) Using Lemma 5 above gives us:
sk = σ1sk-1 - σ2sk-2 + ... + (-1)n+1σnsk-n
(10) Subtracting sk from both sides and then multiplying by -1 gives us:
0 = sk - σ1sk-1 + σ2sk-2 + ... + (-1)nσnsk-n
(11) So, combining step #7 and step #10, if we assume that:
σ0 = 1 and if k is greater than n, σk=0, then:
QED
References
- Jean-Pierre Tignol, Galois' Theory of Algebraic Equations, World Scientific, 2001
- Harold M. Edwards, Galois Theory, Springer, 1984.