Fermat's Last Theorem: Waring's Method

In today's blog, I will show Edward Waring's method for expressing any symmetric polynomial in terms of the elementary symmetric polynomials (s₁, ..., s_n). For review of the elementary symmetric polynomials, see here. For review of polynomials, see here. For review of a field, see here.

The content in today's blog is taken from Jean-Pierre Tignol's Galois' Theory of Algebraic Equations.

Definition 1: Symmetric Polynomial

A polynomial P(x₁, ..., x_n) in n indeterminates is symmetric if and only if it is not altered when the indeterminates are arbitrarily permuted among themselves.

That is, for every permutation σ of 1, ..., n, we have:

P(x_σ(1), ..., x_σ(n)) = P(x₁, ..., x_n)
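Definition 1 can be checked mechanically for small cases. The sketch below is only an illustration: it assumes a polynomial is stored as a Python dict that maps each exponent tuple (i₁, ..., i_n) to its coefficient, and the name is_symmetric is my own, not something from Tignol's book.

```python
from itertools import permutations

def is_symmetric(poly, n):
    """Return True if poly (dict: exponent tuple -> coefficient) is unchanged
    by every permutation of its n indeterminates."""
    for sigma in permutations(range(n)):
        # Permuting the indeterminates permutes the entries of each exponent tuple.
        permuted = {tuple(exps[sigma[k]] for k in range(n)): c
                    for exps, c in poly.items()}
        if permuted != poly:
            return False
    return True

# x1^2*x2 + x1*x2^2 is symmetric in two indeterminates; x1^2*x2 alone is not.
sym = {(2, 1): 1, (1, 2): 1}
not_sym = {(2, 1): 1}
print(is_symmetric(sym, 2))      # True
print(is_symmetric(not_sym, 2))  # False
```

The check tries all n! permutations, so it is only practical for small n.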

Definition 2: Symmetric Rational Fraction

A rational fraction P/Q in n indeterminates is symmetric if it is not altered when the indeterminates are permuted; i.e. for every permutation σ of 1,..., n:

P(x_σ(1), ..., x_σ(n)) /Q(x_σ(1), ..., x_σ(n)) = P(x₁, ..., x_n)/Q(x₁, ..., x_n)

Note: This does not mean that P,Q are necessarily symmetric. Still, it will be seen later that every symmetric rational fraction can be represented as the quotient of symmetric polynomials.

Definition 3: ∑ x₁^i₁x₂^i₂*...*x_n^i_n

We can use this notation to characterize a symmetric polynomial. In this case, each monomial has the form expressed in the sum.

In using this notation, it is important to specify n, the number of indeterminates. Otherwise, it is not clear which monomials the sum contains.

For example for a symmetric polynomial in two variables:

∑ x₁²x₂ = x₁²x₂ + x₁x₂²

For a symmetric polynomial in three variables, we have:

∑ x₁²x₂ = x₁²x₂ + x₁x₂² + x₁²x₃ + x₁x₃² + x₂²x₃ + x₂x₃²
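The two sums above can be generated by listing every distinct rearrangement of the exponents. Here is a minimal sketch, assuming the exponent tuple is padded with zeros to length n and that a polynomial is stored as a dict from exponent tuples to coefficients (sym_sum is a hypothetical helper name):

```python
from itertools import permutations

def sym_sum(exps):
    """Sum of all distinct monomials obtained by permuting the exponent tuple.

    The length of exps is the number of indeterminates n.
    """
    return {p: 1 for p in set(permutations(exps))}

# Two indeterminates: x1^2*x2 + x1*x2^2
print(sorted(sym_sum((2, 1))))     # [(1, 2), (2, 1)]
# Three indeterminates: the same notation now gives six monomials
print(len(sym_sum((2, 1, 0))))     # 6
```

This shows why n has to be stated: the same exponents (2, 1) give two monomials for n = 2 and six for n = 3.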

We can also use this notation to represent the elementary symmetric polynomials:

s₁ = ∑ x₁

s₂ = ∑ x₁x₂

...

s_{n-1} = ∑ x₁*...*x_{n-1}

s_n = ∑ x₁*...*x_n
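As a small illustration, s_k can be built by choosing which k of the n indeterminates appear in each monomial. The dict representation (exponent tuple mapped to coefficient) and the function name elementary are assumptions made for this sketch:

```python
from itertools import combinations

def elementary(k, n):
    """s_k in n indeterminates as a dict: exponent tuple -> coefficient."""
    poly = {}
    for chosen in combinations(range(n), k):
        # Exponent 1 for each chosen indeterminate, 0 for the others.
        exps = tuple(1 if idx in chosen else 0 for idx in range(n))
        poly[exps] = 1
    return poly

# s_2 in three indeterminates: x1*x2 + x1*x3 + x2*x3
print(sorted(elementary(2, 3), reverse=True))  # [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
# s_3 in three indeterminates is the single monomial x1*x2*x3
print(elementary(3, 3))                        # {(1, 1, 1): 1}
```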

Definition 4: deg (∑ x₁^i₁x₂^i₂*...*x_n^i_n)= (i₁, ..., i_n)

Using the notation ∑ x₁^i₁x₂^i₂*...*x_n^i_n, by degree, I mean the n-tuple of exponents (i₁, ..., i_n) that characterizes the symmetric polynomial.

More generally, for any non-zero polynomial P = P(x₁, ..., x_n) in n indeterminates x₁, ..., x_n over a field, the degree of P is defined as the largest n-tuple (i₁, ..., i_n) for which the coefficient of x₁^i₁*...*x_n^i_n in P is nonzero. [See Definition 6 below for the ordering of n-tuples]

For purposes of comparison, we assume that i₁, ..., i_n are ordered such that i₁ ≥ i₂ ≥ ... ≥ i_n

Here are some examples:

deg s₁ = (1, 0, 0, ..., 0)

deg s₂ = (1,1,0,....,0)

...

deg s_{n-1} = (1,1,...,1,0)

deg ∑ x₁²x₂ = (2,1,0, ..., 0)
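To make the definition concrete, here is a hedged sketch that computes the degree of a polynomial stored as a dict from exponent tuples to coefficients (my own representation, not the book's). It relies on the fact that Python compares tuples entry by entry from the left, which is the comparison this definition needs.

```python
def degree(poly):
    """Largest exponent tuple with a nonzero coefficient.

    Python's max() on tuples compares them entry by entry from the left.
    """
    return max(exps for exps, c in poly.items() if c != 0)

# The sum of x1^2*x2 over three indeterminates (six monomials)
p = {(2, 1, 0): 1, (2, 0, 1): 1, (1, 2, 0): 1,
     (1, 0, 2): 1, (0, 2, 1): 1, (0, 1, 2): 1}
print(degree(p))  # (2, 1, 0)

# s_2 in three indeterminates
s2 = {(1, 1, 0): 1, (1, 0, 1): 1, (0, 1, 1): 1}
print(degree(s2))  # (1, 1, 0)
```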

Definition 5: Nⁿ

Let Nⁿ be the set of n-tuples of nonnegative integers, so that every degree in the sense of Definition 4 is an element of Nⁿ.

So that:

Nⁿ = { (i₁, ..., i_n), (j₁, ..., j_n), (k₁, ..., k_n), ... }

Definition 6: Ordering of Nⁿ: (i₁, ..., i_n) ≥ (j₁, ..., j_n)

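(i₁, ..., i_n) ≥ (j₁, ..., j_n) if and only if one of the following holds:

for all u, i_u = j_u (the two n-tuples are equal) or

there exists v, such that i_v is greater than j_v and for all u less than v, i_u = j_u

For example, i₁ greater than j₁ is the case v = 1, and i₁ = j₁ with i₂ greater than j₂ is the case v = 2. This is the lexicographic ordering.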
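The ordering can be written out as a short function. This is only a sketch (lex_ge is a hypothetical name); it also shows that Python's built-in tuple comparison follows the same rule.

```python
def lex_ge(i, j):
    """True if the n-tuple i is >= the n-tuple j in the ordering of Definition 6."""
    for a, b in zip(i, j):
        if a != b:
            # The first position v where the tuples differ decides the comparison.
            return a > b
    return True  # all entries are equal

print(lex_ge((4, 1, 1), (3, 3, 0)))  # True
print(lex_ge((3, 2, 1), (3, 3, 0)))  # False
# Python's built-in tuple comparison uses the same rule.
print((4, 1, 1) >= (3, 3, 0))        # True
```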

Lemma 1: deg(P+Q) ≤ max(deg P, deg Q)

Proof:

(1) Let P = ∑ x₁^i₁x₂^i₂*...*x_n^i_n

(2) Let Q = ∑ x₁^j₁x₂^j₂*...*x_n^j_n

(3) deg P = (i₁, ..., i_n)

(4) deg Q = (j₁, ..., j_n)

(5) Assume that (i₁, ..., i_n) ≥ (j₁, ..., j_n). [Otherwise, swap the roles of P and Q]

(6) So, max(deg P,deg Q) = (i₁, ..., i_n)

(7) If the coefficient of x₁^i₁*...*x_n^i_n in P+Q is nonzero, then:

deg(P+Q) = (i₁, ..., i_n) [This follows from definition 4 above]

(8) If that coefficient is zero (which can only happen when deg P = deg Q and the two leading terms cancel), then:

deg(P+Q) is less than (i₁, ..., i_n)

(9) We know that deg(P+Q) cannot be higher because addition may change the coefficient but it cannot change the power of any of the terms.

QED
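Here is a small example of the strict inequality in Lemma 1, where the leading terms cancel. The polynomials P and Q and the helper names add and degree are my own choices for illustration; polynomials are stored as dicts from exponent tuples to coefficients.

```python
def add(p, q):
    """Sum of two polynomials, dropping any coefficient that cancels to zero."""
    out = dict(p)
    for exps, c in q.items():
        out[exps] = out.get(exps, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def degree(poly):
    """Largest exponent tuple present (Python compares tuples lexicographically)."""
    return max(poly)

P = {(2, 0): 1, (1, 1): 1}     # x1^2 + x1*x2
Q = {(2, 0): -1, (0, 2): 1}    # -x1^2 + x2^2

print(degree(P), degree(Q))    # (2, 0) (2, 0)
print(degree(add(P, Q)))       # (1, 1)
```

Both P and Q have degree (2, 0), but the x1^2 terms cancel in P+Q, so the degree drops to (1, 1).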

Lemma 2: deg(PQ) = deg P + deg Q

Proof:

(1) Let P = ∑ x₁^i₁x₂^i₂*...*x_n^i_n

(2) Let Q = ∑ x₁^j₁x₂^j₂*...*x_n^j_n

(3) deg P = (i₁, ..., i_n)

(4) deg Q = (j₁, ..., j_n)

(5) deg P + deg Q = (i₁ + j₁, ..., i_n+j_n)

(6) P*Q = (∑ x₁^i₁x₂^i₂*...*x_n^i_n)*(∑ x₁^j₁x₂^j₂*...*x_n^j_n) = x₁^{i₁+j₁}x₂^{i₂+j₂}*...*x_n^{i_n+j_n} + (terms of lower degree)

(7) deg(P*Q) = (i₁ + j₁, ..., i_n+j_n)

QED
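Lemma 2 can be spot-checked on a product of elementary symmetric polynomials. This sketch assumes the dict representation (exponent tuple mapped to coefficient); mul is a hypothetical helper, not a library function.

```python
def mul(p, q):
    """Product of two polynomials stored as dicts: exponent tuple -> coefficient."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            # Multiplying monomials adds their exponent tuples entry by entry.
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

s1 = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}   # deg s1 = (1, 0, 0)
s2 = {(1, 1, 0): 1, (1, 0, 1): 1, (0, 1, 1): 1}   # deg s2 = (1, 1, 0)

prod = mul(s1, s2)
print(max(prod))          # (2, 1, 0), which is (1, 0, 0) + (1, 1, 0)
print(prod[(1, 1, 1)])    # 3
```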

Corollary 2.1: deg a^x = x*deg(a)

Proof:

deg a^x = deg (a*a*...*a) = deg(a) + deg(a) + ... + deg(a) = x*deg(a)

QED

Lemma 3: Nⁿ does not contain any infinite strictly decreasing sequence of elements.

That is, if we take any x ∈ Nⁿ, we can only decrease it a finite number of times.

Proof:

(1) This is clearly true for n=1.

(2) So, we can assume that this is true up to n-1.

(3) Assume that we have an infinite strictly decreasing sequence in Nⁿ such that:

(i₁₁, i₁₂, ..., i_1n) is greater than (i₂₁, i₂₂, ..., i_2n), which is greater than ..., which is greater than (i_m1, i_m2, ..., i_mn), which is greater than ...

(4) Using Definition 6 above, we know that:

i₁₁ ≥ i₂₁ ≥ ... ≥ i_m1 ≥ ....

(5) Since the i_m1 are nonnegative integers and the sequence never increases, it can strictly decrease only finitely many times, so this sequence is eventually constant.

(6) Let us assume that it becomes constant starting with i_M1 so that for all m ≥ M, i_m1 = i_M1

(7) By our assumption in step #3, it follows that the following sequence must also be infinite:

(i_M1, i_M2, ..., i_Mn) is greater than (i_(M+1)1, i_(M+1)2, ...., i_(M+1)n) is greater than ... and so on.

(8) Since all of the first elements are equal and from definition 6 above, we can remove the first element in all cases to get the following infinite sequence:

(i_M2, ..., i_Mn) is greater than (i_(M+1)2, ...., i_(M+1)n) is greater than ... and so on.

(9) But now we have a contradiction, since we assumed in step #2 that there is no infinite strictly decreasing sequence of elements in N^{n-1}

(10) So, we reject our assumption in step #3.

QED

Theorem 4: Waring's Method (Fundamental Theorem of Symmetric Polynomials)

A polynomial in n indeterminates x₁, ..., x_n over a field F can be expressed as a polynomial in s₁, ..., s_n if and only if it is symmetric.

In other words, for any symmetric polynomial P, there exists a polynomial g such that:

P(x₁, ..., x_n) = g(s₁, ..., s_n)

where s₁, s₂, ..., s_n are the elementary symmetric polynomials.

Proof:

(1) Let P ∈ F[x₁, ..., x_n] be a non-zero symmetric polynomial.

(2) Let deg P = (i₁, ..., i_n) ∈ Nⁿ where i₁ ≥ i₂ ≥ ... ≥ i_n

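(3) I will now show that P can be expressed as a polynomial in the elementary symmetric polynomials. [The converse is immediate: each s_k is symmetric, so any polynomial in s₁, ..., s_n is symmetric]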

(4) Let us define the following polynomial:

f = s₁^{i₁ - i₂} * s₂^{i₂ - i₃} * ... * s_{n-1}^{i_{n-1} - i_n} * s_n^{i_n}

(5) Using Lemma 2 above, we have:

deg f = deg(s₁^{i₁ - i₂}) + deg(s₂^{i₂ - i₃}) + ... + deg(s_n^{i_n})

(6) Using Corollary 2.1 above, we have:

deg f = (i₁ - i₂)*deg(s₁) + (i₂ - i₃)*deg(s₂) + ... + i_n*deg(s_n)

(7) Based on the definition of the elementary symmetric polynomials (see here), we have:

deg f = (i₁ - i₂,0,...,0) + (i₂ - i₃, i₂ - i₃,0,...,0) + ... + (i_n, ..., i_n) = (i₁, i₂, ..., i_n)

(8) Also from the definition of the elementary symmetric polynomials, we know that the leading coefficient of f is 1.

(9) So, we can restate f as:

f = x₁^i₁*...*x_n^i_n + (terms of lower degree)

(10) Let a ∈ F^× (that is, a is a nonzero element of F) be the leading coefficient of P such that:

P = ax₁^i₁*...*x_n^i_n + (terms of lower degree)

(11) Let:

P₁ = P - af

(12) We can see that P₁ has the following properties:

(a) deg P₁ is less than deg P, since the leading terms of P and af cancel [See definition 4 above]

(b) P₁ is symmetric since P and f are symmetric

(c) We can assume that P₁ is nonzero. [If it were 0, we would be done with the proof. We only need to handle the case when P₁ is nonzero to finish the proof]

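(13) Now since P₁ is a symmetric polynomial, we can repeat steps #2 through #12 with P₁ in place of P to define a polynomial P₂

(14) In this way, we can continue to reduce P_i until we reach P_i - af = 0 (where a and f are the leading coefficient and the product from step #4 for that P_i). P is then the sum of all the terms af subtracted along the way, which is a polynomial in s₁, ..., s_n.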

(15) We know that this process will eventually complete from Lemma 3 above.

QED
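The proof is constructive, so it can be turned into a short program. The following is a minimal sketch under my own assumptions, not Tignol's presentation: polynomials are dicts from exponent tuples to integer coefficients, and the names elementary, mul and waring are hypothetical. The input is assumed to be symmetric; otherwise the function raises an error.

```python
from itertools import combinations

def elementary(k, n):
    """s_k in n indeterminates, as a dict {exponent tuple: coefficient}."""
    return {tuple(1 if idx in chosen else 0 for idx in range(n)): 1
            for chosen in combinations(range(n), k)}

def mul(p, q):
    """Product of two polynomials in the dict representation."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def waring(poly, n):
    """Express a symmetric polynomial in terms of s_1, ..., s_n.

    Returns {(a_1, ..., a_n): coefficient}, where the key stands for
    s_1^a_1 * ... * s_n^a_n.
    """
    poly = {e: c for e, c in poly.items() if c != 0}
    result = {}
    while poly:
        lead = max(poly)   # deg P: Python compares tuples lexicographically
        a = poly[lead]     # leading coefficient
        # Exponent of s_k is i_k - i_(k+1), with i_(n+1) taken as 0 (step 4).
        s_exps = tuple(lead[k] - (lead[k + 1] if k + 1 < n else 0)
                       for k in range(n))
        if min(s_exps) < 0:
            raise ValueError("input is not symmetric")
        f = {(0,) * n: 1}  # start from the constant polynomial 1
        for k, e in enumerate(s_exps):
            for _ in range(e):
                f = mul(f, elementary(k + 1, n))
        # Next P is P - a*f (step 11); drop coefficients that cancel to zero.
        for e, c in f.items():
            new_c = poly.get(e, 0) - a * c
            if new_c == 0:
                poly.pop(e, None)
            else:
                poly[e] = new_c
        result[s_exps] = a
    return result

# x1^2 + x2^2 = s1^2 - 2*s2
print(waring({(2, 0): 1, (0, 2): 1}, 2))   # {(2, 0): 1, (0, 1): -2}
```

The loop ends because the degree strictly decreases on every pass, which is exactly what Lemma 3 guarantees.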

Example 4.1: S = ∑ x₁⁴x₂x₃ + ∑ x₁³x₂³

(1) Let:

S = ∑ x₁⁴x₂x₃ + ∑ x₁³x₂³

(so that: S = x₁⁴x₂x₃ + x₁x₂⁴x₃ + x₁x₂x₃⁴ + x₁³x₂³ + x₁³x₃³ + x₂³x₃³)

[See Definition 3 above for details if needed]

(2) Since (4,1,1) is greater than (3,3,0) [see Definition 6 above], it follows that [see Definition 4 above]:

deg(S) = (4,1,1)

(3) Let f = s₁^{4-1} * s₂^{1-1} * s₃^1 = s₁³s₃

(4) Using the definitions for s₁, ..., s₃, we have:

s₁³s₃ = (∑ x₁)³(x₁x₂x₃) = ∑ x₁⁴x₂x₃ + 3∑ x₁³x₂²x₃ + 6x₁²x₂²x₃²

(5) If we let S₁ = S - f, then we get:

S₁ = ∑ x₁³x₂³ - 3∑ x₁³x₂²x₃ - 6x₁²x₂²x₃²

(6) deg S₁ = (3,3,0)

(7) Let f₁ = s₁^{3-3} * s₂^{3-0} * s₃^0 = s₂³

(8) Using the definitions for s₁, ..., s₃, we have:

s₂³ = (∑ x₁x₂)³ = ∑ x₁³x₂³ + 3∑ x₁³x₂²x₃ + 6x₁²x₂²x₃²

(9) If we let S₂ = S₁ - f₁, then we get:

S₂ = - 6∑ x₁³x₂²x₃ - 12x₁²x₂²x₃²

(10) deg S₂ = (3,2,1)

(11) Let f₂ = s₁^{3-2} * s₂^{2-1} * s₃^1 = s₁s₂s₃

(12) Using the definitions for s₁, ..., s₃, we have:

s₁s₂s₃ = (∑ x₁)(∑ x₁x₂)(∑ x₁x₂x₃) = ∑x₁³x₂²x₃ + 3x₁²x₂²x₃²

(13) If we let S₃ = S₂ + 6f₂, then we get:

S₃ = 6x₁²x₂²x₃²

(14) deg(S₃) = (2,2,2)

(15) Let f₃ = s₁^{2-2} * s₂^{2-2} * s₃^2 = s₃²

(16) Using the definitions for s₁, ..., s₃, we have:

s₃² = (∑ x₁x₂x₃)² = x₁²x₂²x₃²

(17) We can see that S₃ - 6f₃ = 0 so we are done.

(18) The resulting function in terms of the elementary symmetric polynomials is:

S = s₁³s₃ + s₂³ - 6s₁s₂s₃ + 6s₃²
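The final identity can be spot-checked numerically. The sketch below evaluates both sides at integer points; the helper names sym_sum_value, lhs and rhs are my own, and a numeric check of this kind is a sanity test rather than a replacement for the derivation above.

```python
from itertools import permutations, product

def sym_sum_value(exps, x):
    """Evaluate the sum over all distinct permutations of exps at the point x."""
    total = 0
    for p in set(permutations(exps)):
        term = 1
        for xi, e in zip(x, p):
            term *= xi ** e
        total += term
    return total

def lhs(x):
    # S = sum of x1^4*x2*x3 plus sum of x1^3*x2^3, in three indeterminates
    return sym_sum_value((4, 1, 1), x) + sym_sum_value((3, 3, 0), x)

def rhs(x):
    x1, x2, x3 = x
    s1 = x1 + x2 + x3
    s2 = x1 * x2 + x1 * x3 + x2 * x3
    s3 = x1 * x2 * x3
    return s1**3 * s3 + s2**3 - 6 * s1 * s2 * s3 + 6 * s3**2

print(lhs((1, 2, 3)), rhs((1, 2, 3)))   # 467 467
# Compare the two sides on every integer point of a 7 x 7 x 7 grid.
print(all(lhs(x) == rhs(x) for x in product(range(-3, 4), repeat=3)))  # True
```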

Lemma 5:

Let P,Q be polynomials such that P/Q is symmetric

Then:

P is symmetric if and only if Q is symmetric

Proof:

(1) Assume that P is symmetric

(2) Assume that Q is not symmetric such that Q' is a permutation of Q and Q' ≠ Q.

(3) Let P' be the same permutation as Q'.

(4) Since P is symmetric, P' = P

(5) But P'/Q' = P/Q' ≠ P/Q

(6) But this is impossible since we assumed that P/Q is symmetric.

(7) Therefore, we reject our assumption in step #2.

(8) We can make the exact same argument if we assume that Q is symmetric and P is not.

QED

Theorem 6:

A rational fraction in n indeterminates x₁, ..., x_n over a field F can be expressed as a rational fraction in s₁, ..., s_n if it is symmetric

Proof:

(1) Let P,Q be polynomials in n indeterminates x₁, ..., x_n such that the rational fraction P/Q is symmetric.

(2) We can assume that P,Q are not symmetric.

If P is symmetric, then Q is too (from Lemma 5 above) and we can use Theorem 4 above to get our result. So, to prove the theorem, we need only handle the case where both P,Q are not symmetric.

(3) Since Q is not symmetric, let Q₁, ..., Q_r be the distinct polynomials (other than Q) obtained from Q through permutations of the indeterminates.

(4) The product QQ₁*...*Q_r is symmetric since any permutation of the indeterminates simply permutes the factors.

(5) Since P/Q is symmetric, it follows that P/Q = P*(Q₁*...*Q_r)/[Q*(Q₁*...*Q_r)] is symmetric too.

(6) It further follows that P*(Q₁*...*Q_r) is symmetric from Lemma 5 above.

(7) Using Theorem 4 above, we know that there exist polynomials f,g such that:

P*(Q₁*...*Q_r) = f(s₁, ..., s_n)

and

Q*(Q₁*...*Q_r) = g(s₁, ..., s_n)

(8) Thus,

P/Q = f(s₁, ..., s_n)/g(s₁, ..., s_n)

QED
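To see the construction in the proof at work, here is a small example of my own (it is not from Tignol's book). Take P = x₁(x₁ + x₂) and Q = x₁²x₂ in two indeterminates. Neither is symmetric, but P/Q = (x₁ + x₂)/(x₁x₂) is. The only other polynomial obtained from Q by permutation is Q₁ = x₁x₂², and the sketch checks at a few sample points that P*Q₁ = s₁s₂² and Q*Q₁ = s₂³.

```python
from fractions import Fraction

def P(x1, x2):
    return x1 * (x1 + x2)      # not symmetric

def Q(x1, x2):
    return x1**2 * x2          # not symmetric

def Q1(x1, x2):
    return Q(x2, x1)           # Q with the two indeterminates swapped

for x1, x2 in [(1, 2), (2, 5), (-3, 4)]:
    s1, s2 = x1 + x2, x1 * x2
    # Numerator and denominator after multiplying by Q1 are both symmetric.
    assert P(x1, x2) * Q1(x1, x2) == s1 * s2**2
    assert Q(x1, x2) * Q1(x1, x2) == s2**3
    # The fraction itself is unchanged: P/Q = (s1*s2^2)/(s2^3).
    assert Fraction(P(x1, x2), Q(x1, x2)) == Fraction(s1 * s2**2, s2**3)

print("all checks passed")
```

In this example f(s₁, s₂) = s₁s₂² and g(s₁, s₂) = s₂³, so P/Q = s₁/s₂ after cancelling.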

References

Jean-Pierre Tignol, Galois' Theory of Algebraic Equations, World Scientific, 2001

Wednesday, September 09, 2009
