Friday, July 27, 2007

Newton's Identities: Proof of Newton's Formula

In a previous blog, I talked about Newton's Identities and the formula that Sir Isaac Newton used to generate these identities.

In today's blog, I will show a proof for Newton's formula. Interestingly, the proof revolves around a peculiar definition. The proof is taken from Jean-Pierre Tignol's Galois' Theory of Algebraic Equations and the definition in question is denoted as τ(a,b).

If you are reading Tignol's book, you may wonder why I am using same notation (σk and sk) in the opposite way. I do this in order to stay consistent with Harold Edwards who I referenced in my previous blogs.

τ(a,b) is defined as the sum of unique a-combinations of the n roots of a polynomial labeled x1, x2, ..., xn where one of the root is set to the power of b [For review of why a polynomial of degree n necessarily has n roots, see here for discussion on the fundamental theorem of algebra which guarantees this result]. This means that there are two different conditions that need to be considered. When b ≥ 2, regardless of which number is put to the power of b, there are n*C(n-1,a-1) terms in this equation. When b = 1, there are still C(n,a) possible terms in this equation.

If a ≥ 2, then n*C(n,a-1) is greater than C(n,a) since C(n,a) = (n/a)*C(n-1,a-1)

[Here's the reason why: (n/a)*C(n-1,a-1) = (n/a)*(n-1)!/[(n - 1 -[a-1])!(a-1)!] = n!/[(a)[n - 1 - a + 1]!(a-1)!] = n!/[a!][n-a] = C(n,a)]

With these two conditions in mind, we can now look at the definitions that are needed for the proof.

Definition 1: τ(a,b)

for b ≥ 2:

τ(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xibxfj(1)*xfj(2)*...*xfj(a-1)

for b = 1:

τ(a,1) = ∑ (j=1,C(n,a)) xfj(1)*xfj(2)*...*xfj(k)

where:

C(n-1,a-1) = (n-1)!/{(a-1)!(n-1-[a-1])!}

fj(x) is a function where:
  • the range of fj(x) is {1, 2, ..., n}
  • fj is such that for all x,y if x ≠ y, then fj(x) ≠ fj(y)
  • each fj is a unique (a-1)-way selection of n-1 integers (not including i) so that we can assume that fj(1) is less than fj(2), etc. [since there is only one permutation of any selection that has this characteristic]
Example:

For n=3 with {x1, x2, x3 }

C(n-1,a-1) = C(2,1) = 2!/1!(2-1)! = 2

So there are n*C(n-1,a-1) =3*2 = 6 terms

τ(2,2) =
x12x2 + x12x3 + x22x1 +
x22x3 +
x32x1 +
x32x2

Definition 2: sk

sk = ∑ (i=1, n) xik

Example:

For n=3 with {x1, x2, x3},

s4 = x14 + x24 + x34

Lemma 1: τ(1,b) = sb

Proof:

(1) τ(1,b) = ∑ (i=1,n) xib using Definition 1 above.

(2) Then τ(1,b) = sb using Definition 2 above.

QED

Definition 3: σi

σk = ∑ (i=1,C(n,k)) xfi(1)*xfi(2)*...*xfi(k)

where:

C(n,k) = (n)!/{(k)!(n-k)!}

fi(x) is a function where:
  • the range of fi(x) is {1, 2, ..., n}
  • fi is such that for x,y if x ≠ y, then fi(x) ≠ fi(y)
  • each fi is a unique k-way selection of n integers so that we can assume that fi(1) is less than fi(2), etc. [since there is only one permutation of any selection that has this characteristic]
Example: σ2

For n=3 with {x1, x2, x3 }

C(n,k) = C(3,2) = 3!/2!(3-2)! = 3

So there are 3 terms

σ2 = x1*x2 + x1*x3 + x2*x3

Lemma 2: τ(a,1) = σa

Proof:

This is clear since for b=1, τ(a,1) and σa have the same definition.

QED

Lemma 3:

For a less than n and b greater than 2:

τ(a,b) = σasb-1 - τ(a+1,b-1)

Proof:

(1) τ(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xibxfj(1)*xfj(2)*...*xfj(a-1) [See Definition 1 above]

= ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xi(b-1)xixfj(1)*xfj(2)*...*xfj(a-1)

(2) σk = ∑ (i=1,C(n,k)) xfi(1)*xfi(2)*...*xfi(k) [See Definition 3 above]

(3) sb-1 = ∑ (i=1, n) xi(b-1)

(4) σasb-1 = [ ∑ (i=1,C(n,a)) xfi(1)*xfi(2)*...*xfi(a)][ ∑ (j=1, n) xj(b-1)]

(5) τ(a+1,b-1) = ∑ (i=1, n) ∑ (j=1, C(n-1,a)) xib-1xfj(1)*xfj(2)*...*xfj(a)

(6) Now each term in step #4 consists of the following form:

xjb-1*xfi(1)*...*xfi(a)

(7) Each term can then be categorized into two cases:

Case I: xj is the same as one of the values in xfi(1), xfi(2), ..., xfi(a)

Case II: xj is not the same as any of the values in xfi(1), xfi(2), ..., xfi(a)

(8) If C1 is the sum of all the terms that fall in Case I and C2 is the sum of all terms that fall into Case II, we can see that:

σasb-1 = C1 + C2

(9) Now, it is clear that for a less than n and b greater than 2:

C1 = τ(a,b) [See step #1 above]

(10) It is also clear that for a less than n and b greater than 2:

C2 = τ(a+1,b-1)

(11) Since C1 = σasb-1 - C2, it follows that:

τ(a,b) = σasb-1 - τ(a+1,b-1)

QED

Lemma 4:

if a is less than n:

τ(a,2) = σas1 - (a+1)σa+1

Proof:

(1) τ(a,2) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xi2xfj(1)*xfj(2)*...*xfj(a-1) = = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) [xixfj(1)*xfj(2)*...*xfj(a-1) ]*xi

(2) σa = ∑ (i=1,C(n,a)) xfi(1)*xfi(2)*...*xfi(a)

(3) s1 = ∑ (i=1, n) xi

(4) σas1 = [∑ (i=1,C(n,a)) xfi(1)*xfi(2)*...*xfi(a)][∑ (j=1, n) xj]

(5) Now each term in step #4 consists of the following form:

xj*xfi(1)*...*xfi(a)

(7) Each term can then be categorized into two cases:

Case I: xj is the same as one of the values in xfi(1), xfi(2), ..., xfi(a)

Case II: xj is not the same as any of the values in xfi(1), xfi(2), ..., xfi(a)

(8) If C1 is the sum of all the terms that fall in Case I and C2 is the sum of all terms that fall into Case II, we can see that:
σas1 = C1 + C2

(9) Now, it is clear that for a less than n:

C1 = τ(a,2) [See step #1 above]

(10) It is also clear that for a less than n:

C2 = (a+1)σa+1 since:

(a) σ(a+1) = ∑ (i=1,C(n,a+1)) xfi(1)*xfi(2)*...*xfi(a)*xfi(a+1)

(b) Now, removing all the terms that are case I, it is clear that there a+1 ways to reach each term in step (a).

For example, the combination x1*x2*...*xa*xa+1

This can be reached from:

x1 * (x2x3*...*xa+1) x2 * (x1x3*...*xa+1) ... xa+1*(x1x2*...*xa)

We can make the same argument for each possible term in C2

(11) Since C1 = σas1 - C2, it follows that:

τ(a,2) = σas1 - (a+1)σa+1

QED

Lemma 5:

if b ≥ 2, then:

τ(n,b) = σnsb-1

Proof:

(1) τ(n,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xibxfj(1)*xfj(2)*...*xfj(a-1)

(2) Since b ≥ 2, we have:

τ(n,b) = x1bx2*...*xn + x2bx1*...*xn + ... + xnbx1*...*xn-1 =

= x1b-1x1x2*...*xn + x2b-1x1x2*...*xn + ... + xnb-1x1*...*xn =

= (x1b-1 + x2b-1 + ... + xnb-1)(x1x2*...xn)

(3) σn = x1*x2*...*xn

(4) sb-1 = x1b-1 + x2b-1 + ... + xnb-1

(5) It follows from step #2, that:

τ(n,b) = σnsb-1

QED

Theorem 6: Newton's Formula



Proof:

(1) τ(1,k) = sk [See Lemma 1 above]

(2) Using Lemma 3 above if k ≥ 3, we have:

τ(1,k) = σ1sk-1 - τ(2,k-1) = sk

(3) Assuming that k-1 ≥ 3, Lemma 3 gives us:

τ(2,k-1) = σ2sk-2 - τ(3,k-2)

(4) Following this pattern, if k ≤ n, we eventually get the following:

sk = σ1sk-1 - σ2sk-2 + ... + (-1)kτ(k-1,2)

(5) Using Lemma 4 above:

τ(k-1,2) = σk-1s1 - (k)σk

(6) Thus, we have:

sk = σ1sk-1 - σ2sk-2 + ... + (-1)kσk-1s1 + (-1)k+1k

(7) Subtracting sk from both sides and then multiplying -1 gives us:

0 = sk - σ1sk-1 + σ2sk-2 + ... + (-1)k-1σk-1s1 + (-1)kk

(8) If k is greater than n, then we eventually get to:

sk = σ1sk-1 - σ2sk-2 + ... + (-1)n+1τ(n,k+1-n)

(9) Using Lemma 5 above gives us:

sk = σ1sk-1 - σ2sk-2 + ... + (-1)n+1σnsk-n

(10) Subtracting sk from both sides and then multiplying by -1 gives us:

0 = sk - σ1sk-1 + σ2sk-2 + ... + (-1)nσnsk-n

(11) So, combining step #7 and step #10, if we assume that:

σ0 = 1 and if k is greater than n, σk=0, then:



QED

References