Fermat's Last Theorem: Newton's Identities: Proof of Newton's Formula

In a previous blog, I talked about Newton's Identities and the formula that Sir Isaac Newton used to generate these identities.

In today's blog, I will show a proof for Newton's formula. Interestingly, the proof revolves around a peculiar definition. The proof is taken from Jean-Pierre Tignol's Galois' Theory of Algebraic Equations and the definition in question is denoted as τ(a,b).

If you are reading Tignol's book, you may wonder why I am using the same notation (σ_k and s_k) in the opposite way. I do this in order to stay consistent with Harold Edwards, whom I referenced in my previous blogs.

τ(a,b) is defined as the sum of the unique a-combinations of the n roots of a polynomial, labeled x₁, x₂, ..., x_n, where one of the roots in each combination is raised to the power of b [For review of why a polynomial of degree n necessarily has n roots, see here for discussion on the fundamental theorem of algebra which guarantees this result]. This means that there are two different conditions that need to be considered. When b ≥ 2, any one of the n roots can be the one raised to the power of b and the remaining a-1 roots are chosen from the other n-1, so there are n*C(n-1,a-1) terms in this sum. When b = 1, no root is distinguished, so there are only C(n,a) terms in this sum.

If a ≥ 2, then n*C(n-1,a-1) is greater than C(n,a) since C(n,a) = (n/a)*C(n-1,a-1)

[Here's the reason why: (n/a)*C(n-1,a-1) = (n/a)*(n-1)!/[(n - 1 -[a-1])!(a-1)!] = n!/[(a)[n - 1 - a + 1]!(a-1)!] = n!/([a!][n-a]!) = C(n,a)]
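If you want to convince yourself of this identity and of the two term counts, here is a small Python sketch. The helper name count_terms is my own invention for illustration, and math.comb assumes Python 3.8 or later.

```python
from math import comb

def count_terms(n, a, b):
    """Number of terms in tau(a, b) for n roots, per the counting argument above."""
    if b >= 2:
        # choose the root raised to the power b, then a-1 of the other n-1 roots
        return n * comb(n - 1, a - 1)
    # b = 1: plain a-combinations of the n roots
    return comb(n, a)

# Check C(n,a) = (n/a)*C(n-1,a-1), cross-multiplied so everything stays an integer.
identity_holds = all(
    a * comb(n, a) == n * comb(n - 1, a - 1)
    for n in range(1, 10)
    for a in range(1, n + 1)
)
print(identity_holds)
print(count_terms(3, 2, 2), count_terms(3, 2, 1))
```

For n=3 and a=2 this gives 6 terms when b ≥ 2 and 3 terms when b = 1.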

With these two conditions in mind, we can now look at the definitions that are needed for the proof.

Definition 1: τ(a,b)

for b ≥ 2:

τ(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) x_i^b*x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)}

for b = 1:

τ(a,1) = ∑ (j=1,C(n,a)) x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a)}

where:

C(n-1,a-1) = (n-1)!/{(a-1)!(n-1-[a-1])!}

f_j(x) is a function where:

the range of f_j(x) is {1, 2, ..., n}
f_j is such that for all x,y if x ≠ y, then f_j(x) ≠ f_j(y)
for b ≥ 2, each f_j is a unique (a-1)-way selection of n-1 integers (not including i); for b = 1, each f_j is a unique a-way selection of all n integers. In both cases, we can assume that f_j(1) is less than f_j(2), etc. [since there is only one permutation of any selection that has this characteristic]

Example:

For n=3 with {x₁, x₂, x₃}, a=2, and b=2:

C(n-1,a-1) = C(2,1) = 2!/[1!(2-1)!] = 2

So there are n*C(n-1,a-1) =3*2 = 6 terms

τ(2,2) = x₁²x₂ + x₁²x₃ + x₂²x₁ + x₂²x₃ + x₃²x₁ + x₃²x₂
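To make the double sum in Definition 1 concrete, here is a rough Python sketch that lists the terms of τ(a,b) for b ≥ 2 as index pairs. The function name tau_terms and the (i, others) representation are my own choices for illustration and are not from Tignol.

```python
from itertools import combinations

def tau_terms(n, a):
    """Terms of tau(a, b) for b >= 2, as (i, others): x_i^b times the x_m for m in others."""
    terms = []
    for i in range(1, n + 1):
        # the other a-1 indices are chosen from the n-1 indices different from i
        remaining = [m for m in range(1, n + 1) if m != i]
        for others in combinations(remaining, a - 1):
            terms.append((i, others))
    return terms

b = 2
terms = tau_terms(3, 2)
for i, others in terms:
    print(f"x{i}^{b} * " + " * ".join(f"x{m}" for m in others))
print(len(terms))
```

For n=3 and a=2 this lists six terms, one for each choice of the squared root and one other root, which matches the count n*C(n-1,a-1).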

Definition 2: s_k

s_k = ∑ (i=1, n) x_i^k

Example:

For n=3 with {x₁, x₂, x₃},

s₄ = x₁⁴ + x₂⁴ + x₃⁴

Lemma 1: τ(1,b) = s_b

Proof:

(1) τ(1,b) = ∑ (i=1,n) x_i^b using Definition 1 above.

(2) Then τ(1,b) = s_b using Definition 2 above.

QED

Definition 3: σ_k

σ_k = ∑ (i=1,C(n,k)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(k)}

where:

C(n,k) = (n)!/{(k)!(n-k)!}

f_i(x) is a function where:

the range of f_i(x) is {1, 2, ..., n}
f_i is such that for all x,y if x ≠ y, then f_i(x) ≠ f_i(y)
each f_i is a unique k-way selection of n integers so that we can assume that f_i(1) is less than f_i(2), etc. [since there is only one permutation of any selection that has this characteristic]

Example: σ₂

For n=3 with {x₁, x₂, x₃ }

C(n,k) = C(3,2) = 3!/[2!(3-2)!] = 3

So there are 3 terms

σ₂ = x₁*x₂ + x₁*x₃ + x₂*x₃
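Definitions 2 and 3 are easy to evaluate numerically. Here is a minimal Python sketch, assuming the three roots are 1, 2, and 3 (sample values I picked for illustration) and using function names s and sigma of my own choosing. It needs Python 3.8 or later for math.prod.

```python
from itertools import combinations
from math import prod

def s(k, xs):
    """Power sum s_k: the sum of the k-th powers of the roots."""
    return sum(x ** k for x in xs)

def sigma(k, xs):
    """Elementary symmetric polynomial sigma_k: sum of products over all k-combinations."""
    return sum(prod(c) for c in combinations(xs, k))

roots = [1, 2, 3]        # assumed sample roots
print(s(4, roots))       # 1 + 16 + 81
print(sigma(2, roots))   # 1*2 + 1*3 + 2*3
```

With these roots, s₄ = 98 and σ₂ = 11.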

Lemma 2: τ(a,1) = σ_a

Proof:

This is clear since for b=1, τ(a,1) and σ_a have the same definition.

QED

Lemma 3:

For a less than n and b greater than 2:

τ(a,b) = σ_a*s_{b-1} - τ(a+1,b-1)

Proof:

(1) τ(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) x_i^b*x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)} [See Definition 1 above]

= ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) x_i^(b-1)*x_i*x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)}

(2) σ_k = ∑ (i=1,C(n,k)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(k)} [See Definition 3 above]

(3) s_{b-1} = ∑ (i=1, n) x_i^(b-1) [See Definition 2 above]

(4) σ_a*s_{b-1} = [ ∑ (i=1,C(n,a)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(a)}][ ∑ (j=1, n) x_j^(b-1)]

(5) τ(a+1,b-1) = ∑ (i=1, n) ∑ (j=1, C(n-1,a)) x_i^(b-1)*x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a)}

(6) Now each term in step #4 consists of the following form:

x_j^(b-1)*x_{f_i(1)}*...*x_{f_i(a)}

(7) Each term can then be categorized into two cases:

Case I: x_j is the same as one of the values in x_{f_i(1)}, x_{f_i(2)}, ..., x_{f_i(a)}

Case II: x_j is not the same as any of the values in x_{f_i(1)}, x_{f_i(2)}, ..., x_{f_i(a)}

(8) If C₁ is the sum of all the terms that fall in Case I and C₂ is the sum of all terms that fall into Case II, we can see that:

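σ_a*s_{b-1} = C₁ + C₂

(9) In Case I, x_j equals one of the a factors, so the term is x_j^b multiplied by the other a-1 factors. Each term of τ(a,b) arises in this way exactly once, so for a less than n and b greater than 2:

C₁ = τ(a,b) [See step #1 above]

(10) In Case II, the term is x_j^(b-1) multiplied by a factors that are all different from x_j. Since b greater than 2 means that b-1 ≥ 2, each term of τ(a+1,b-1) arises in this way exactly once [See step #5 above], so for a less than n and b greater than 2:

C₂ = τ(a+1,b-1)

(11) Since C₁ = σ_a*s_{b-1} - C₂, it follows that:

τ(a,b) = σ_a*s_{b-1} - τ(a+1,b-1)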

QED
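Lemma 3 can be spot-checked numerically. Here is a rough Python sketch, assuming sample roots 2, 3, 5, 7 (values I picked for illustration). The functions s, sigma, and tau are my own direct translations of Definitions 1 through 3, not code from any library.

```python
from itertools import combinations
from math import prod

def s(k, xs):
    """Power sum s_k."""
    return sum(x ** k for x in xs)

def sigma(k, xs):
    """Elementary symmetric polynomial sigma_k."""
    return sum(prod(c) for c in combinations(xs, k))

def tau(a, b, xs):
    """tau(a, b) per Definition 1: one root to the power b times a-1 other roots."""
    if b == 1:
        return sigma(a, xs)
    total = 0
    for i, xi in enumerate(xs):
        rest = xs[:i] + xs[i + 1:]   # the n-1 roots other than x_i
        total += sum(xi ** b * prod(c) for c in combinations(rest, a - 1))
    return total

xs = [2, 3, 5, 7]   # assumed sample roots
n = len(xs)

# Lemma 3: tau(a,b) = sigma_a * s_{b-1} - tau(a+1, b-1) for a < n and b > 2
lemma3_holds = all(
    tau(a, b, xs) == sigma(a, xs) * s(b - 1, xs) - tau(a + 1, b - 1, xs)
    for a in range(1, n)
    for b in range(3, 7)
)
print(lemma3_holds)
```

A numerical check like this is not a proof, but it is a quick guard against sign or index slips in the statement.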

Lemma 4:

if a is less than n:

τ(a,2) = σ_a*s₁ - (a+1)*σ_{a+1}

Proof:

(1) τ(a,2) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) x_i²*x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)} = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) [x_i*x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)}]*x_i

(2) σ_a = ∑ (i=1,C(n,a)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(a)}

(3) s₁ = ∑ (i=1, n) x_i

(4) σ_a*s₁ = [∑ (i=1,C(n,a)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(a)}][∑ (j=1, n) x_j]

(5) Now each term in step #4 consists of the following form:

x_j*x_{f_i(1)}*...*x_{f_i(a)}

(6) Each term can then be categorized into two cases:

Case I: x_j is the same as one of the values in x_{f_i(1)}, x_{f_i(2)}, ..., x_{f_i(a)}

Case II: x_j is not the same as any of the values in x_{f_i(1)}, x_{f_i(2)}, ..., x_{f_i(a)}

(7) If C₁ is the sum of all the terms that fall in Case I and C₂ is the sum of all terms that fall into Case II, we can see that:

σ_a*s₁ = C₁ + C₂

(8) Now, it is clear that for a less than n:

C₁ = τ(a,2) [See step #1 above]

(9) It is also clear that for a less than n:

C₂ = (a+1)*σ_{a+1} since:

(a) σ_{a+1} = ∑ (i=1,C(n,a+1)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(a)}*x_{f_i(a+1)}

(b) Now, setting aside all the terms that are Case I, it is clear that there are a+1 ways to reach each term in step (a).

For example, the combination x₁*x₂*...*x_a*x_{a+1}

This can be reached from:

x₁ * (x₂x₃*...*x_{a+1})
x₂ * (x₁x₃*...*x_{a+1})
...
x_{a+1} * (x₁x₂*...*x_a)

We can make the same argument for each possible term in C₂

(10) Since C₁ = σ_a*s₁ - C₂, it follows that:

τ(a,2) = σ_a*s₁ - (a+1)*σ_{a+1}

QED
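Here is a similar numerical spot-check for Lemma 4, again as a sketch with sample roots 2, 3, 5, 7 that I picked for illustration. The functions s, sigma, and tau are my own translations of Definitions 1 through 3.

```python
from itertools import combinations
from math import prod

def s(k, xs):
    """Power sum s_k."""
    return sum(x ** k for x in xs)

def sigma(k, xs):
    """Elementary symmetric polynomial sigma_k."""
    return sum(prod(c) for c in combinations(xs, k))

def tau(a, b, xs):
    """tau(a, b) per Definition 1."""
    if b == 1:
        return sigma(a, xs)
    total = 0
    for i, xi in enumerate(xs):
        rest = xs[:i] + xs[i + 1:]   # the n-1 roots other than x_i
        total += sum(xi ** b * prod(c) for c in combinations(rest, a - 1))
    return total

xs = [2, 3, 5, 7]   # assumed sample roots
n = len(xs)

# Lemma 4: tau(a,2) = sigma_a * s_1 - (a+1) * sigma_{a+1} for a < n
lemma4_holds = all(
    tau(a, 2, xs) == sigma(a, xs) * s(1, xs) - (a + 1) * sigma(a + 1, xs)
    for a in range(1, n)
)
print(lemma4_holds)
```

The factor (a+1) is exactly the count from the argument above: each product of a+1 distinct roots is reached once for each choice of which root plays the part of x_j.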

Lemma 5:

if b ≥ 2, then:

τ(n,b) = σ_n*s_{b-1}

Proof:

(1) τ(n,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,n-1)) x_i^b*x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(n-1)} [See Definition 1 above with a = n]

(2) Since b ≥ 2 and C(n-1,n-1) = 1, we have:

τ(n,b) = x₁^b*x₂*...*x_n + x₂^b*x₁*x₃*...*x_n + ... + x_n^b*x₁*...*x_{n-1} =

= x₁^(b-1)*x₁x₂*...*x_n + x₂^(b-1)*x₁x₂*...*x_n + ... + x_n^(b-1)*x₁x₂*...*x_n =

= (x₁^(b-1) + x₂^(b-1) + ... + x_n^(b-1))(x₁x₂*...*x_n)

(3) σ_n = x₁*x₂*...*x_n

(4) s_{b-1} = x₁^(b-1) + x₂^(b-1) + ... + x_n^(b-1)

(5) It follows from step #2, that:

τ(n,b) = σ_n*s_{b-1}

QED
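Lemma 5 is the boundary case a = n, and it can be spot-checked the same way. This is a sketch with sample roots 2, 3, 5 that I picked for illustration, and the functions are my own translations of the definitions.

```python
from itertools import combinations
from math import prod

def s(k, xs):
    """Power sum s_k."""
    return sum(x ** k for x in xs)

def sigma(k, xs):
    """Elementary symmetric polynomial sigma_k."""
    return sum(prod(c) for c in combinations(xs, k))

def tau(a, b, xs):
    """tau(a, b) per Definition 1."""
    if b == 1:
        return sigma(a, xs)
    total = 0
    for i, xi in enumerate(xs):
        rest = xs[:i] + xs[i + 1:]   # the n-1 roots other than x_i
        total += sum(xi ** b * prod(c) for c in combinations(rest, a - 1))
    return total

xs = [2, 3, 5]   # assumed sample roots
n = len(xs)

# Lemma 5: tau(n,b) = sigma_n * s_{b-1} for b >= 2
lemma5_holds = all(
    tau(n, b, xs) == sigma(n, xs) * s(b - 1, xs)
    for b in range(2, 7)
)
print(lemma5_holds)
```

With a = n there is only one way to pick the other n-1 roots, which is why no subtraction term appears.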

Theorem 6: Newton's Formula

Proof:

(1) τ(1,k) = s_k [See Lemma 1 above]

(2) Using Lemma 3 above if k ≥ 3, we have:

τ(1,k) = σ₁*s_{k-1} - τ(2,k-1) = s_k

(3) Assuming that k-1 ≥ 3, Lemma 3 gives us:

τ(2,k-1) = σ₂*s_{k-2} - τ(3,k-2)

(4) Following this pattern, if k ≤ n, we eventually get the following:

s_k = σ₁*s_{k-1} - σ₂*s_{k-2} + ... + (-1)^k*τ(k-1,2)

(5) Using Lemma 4 above:

τ(k-1,2) = σ_{k-1}*s₁ - k*σ_k

(6) Thus, we have:

s_k = σ₁*s_{k-1} - σ₂*s_{k-2} + ... + (-1)^k*σ_{k-1}*s₁ + (-1)^(k+1)*k*σ_k

(7) Subtracting s_k from both sides and then multiplying by -1 gives us:

0 = s_k - σ₁*s_{k-1} + σ₂*s_{k-2} - ... + (-1)^(k-1)*σ_{k-1}*s₁ + (-1)^k*k*σ_k

(8) If k is greater than n, then we eventually get to:

s_k = σ₁*s_{k-1} - σ₂*s_{k-2} + ... + (-1)^(n+1)*τ(n,k+1-n)

(9) Using Lemma 5 above gives us:

s_k = σ₁*s_{k-1} - σ₂*s_{k-2} + ... + (-1)^(n+1)*σ_n*s_{k-n}

(10) Subtracting s_k from both sides and then multiplying by -1 gives us:

0 = s_k - σ₁*s_{k-1} + σ₂*s_{k-2} - ... + (-1)^n*σ_n*s_{k-n}

(11) So, combining step #7 and step #10, if we assume that:

σ₀ = 1 and if k is greater than n, σ_k=0, then:

∑ (i=0, k-1) (-1)^i*σ_i*s_{k-i} + (-1)^k*k*σ_k = 0

QED
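The two forms of the result, step #7 for k ≤ n and step #10 for k greater than n, can be checked numerically. Here is a minimal Python sketch, assuming sample roots 2, 3, 5, 7 that I picked for illustration. The function names are my own and are not taken from Tignol or Edwards.

```python
from itertools import combinations
from math import prod

def s(k, xs):
    """Power sum s_k."""
    return sum(x ** k for x in xs)

def sigma(k, xs):
    """Elementary symmetric polynomial sigma_k (sigma_0 = 1, sigma_k = 0 for k > n)."""
    return sum(prod(c) for c in combinations(xs, k))

def newton_k_le_n(k, xs):
    """Right-hand side of step #7, which should be 0 when k <= n."""
    total = s(k, xs)
    for i in range(1, k):
        total += (-1) ** i * sigma(i, xs) * s(k - i, xs)
    return total + (-1) ** k * k * sigma(k, xs)

def newton_k_gt_n(k, xs):
    """Right-hand side of step #10, which should be 0 when k > n."""
    n = len(xs)
    return sum((-1) ** i * sigma(i, xs) * s(k - i, xs) for i in range(n + 1))

xs = [2, 3, 5, 7]   # assumed sample roots
n = len(xs)
print([newton_k_le_n(k, xs) for k in range(1, n + 1)])
print([newton_k_gt_n(k, xs) for k in range(n + 1, n + 6)])
```

Both lists should contain only zeros. Note that combinations returns nothing when asked for more items than there are roots, so sigma automatically gives 0 for k greater than n.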

References