Wednesday, September 09, 2009

Waring's Method

In today's blog, I will show Edward Waring's method for expressing any symmetric polynomial in terms of the elementary symmetric polynomials (s1, ..., sn). For review of the elementary symmetric polynomials, see here. For review of polynomials, see here. For review of a field, see here.

The content in today's blog is taken from Jean-Pierre Tignol's Galois' Theory of Algebraic Equations.

Definition 1: Symmetric Polynomial

A polynomial P(x1, ..., xn) in n indeterminates is symmetric if and only if it is not altered when the indeterminates are arbitarily permuted among themselves.

That is, for every permutation σ of 1, ..., n, we have:

P(xσ(1), ..., xσ(n)) = P(x1, ..., xn)

Definition 2: Symmetric Rational Fraction

A rational fraction P/Q in n indeterminates is symmetric if it is not altered when the indeterminates are permuted; i.e. for every permutation σ of 1,..., n:

P(xσ(1), ..., xσ(n)) /Q(xσ(1), ..., xσ(n)) = P(x1, ..., xn)/Q(x1, ..., xn)

Note: This is not mean that P,Q are necessarily symmetric. Still, it will be seen later that every symmetric rational fraction can be represented as the quotient of symmetric polynomials.

Definition 3: ∑ x1i1x2i2*...*xnin

We can use this notation to characterize a symmetric polynomial. In this case, each monomial has the form expressed in the sum.

In using this notation, it is important to specify the value of n and the number of indeterminates. Otherwise, it is not clear how it is expressed.

For example for a symmetric polynomial in two variables:

∑ x12x2 = x12x2 + x1x22

For a symmetric polynomial in three variables, we have:

∑ x12x2 = x12x2 + x1x22+ x12x3 + x1x32 + x22x3 + x2x32

We can also use this notation to represent the elementary symmetric polynomials:

s1 = ∑ x1

s2 = ∑ x1x2

...

sn-1= ∑ x1*...*xn-1

sn = ∑ x1*...*xn


Definition 4: deg (∑ x1i1x2i2*...*xnin)= (i1, ..., in)

Using the notation ∑ x1i1x2i2*...*xnin, by degree, I mean the set of i1, ..., in that make up the symmetric polynomial.

For any non-zero polynomial P = P(x1, ..., xn) in n indeterminates x1, ..., xn over a field, the degree of P is defined as the largest n-tuple (i1, ..., in) for which the coefficient x1i1*...*xnin in P is nonzero.

For purposes of comparison, we assume that i1, ..., in are ordered such that i1 ≥ i2 ≥ ... ≥ ... in

Here are some examples:

deg s1 = (1, 0, 0, ..., 0)

deg s2 = (1,1,0,....,0)

...

deg sn-1 = (1,1,...1,0)

deg ∑ x12x2 = (2,1,0, ..., 0)


Definition 5: Nn

Let Nn be the set of n-tuples of integers of the form of Definition 4.

So that:

Nn = { (i1, ..., in), (j1, ..., jn), (k1, ..., kn), ... }


Definition 6: Ordering of Nn: (i1, ..., in) ≥ (j1, ..., jn)

(i1, ..., in) ≥ (j1, ..., jn) if and only if for all values:

i1 is greater than j1 or

i1 = j1 and i2 is greater than j2 or

for all u, iu = ju or

there exists v, such that iv is greater than jv and for all u less than v, iu = ju


Lemma 1: deg(P+Q) ≤ max(deg P, deg Q)

Proof:

(1) Let P = ∑ x1i1x2i2*...*xnin

(2) Let Q = ∑ x1j1x2j2*...*xnjn

(3) deg P = (i1, ..., in)

(4) deg Q = (j1, ..., jn)

(5) Assume that (i1, ..., in) is greater than (j1, ..., jn)

(6) So, max(deg P,deg Q) = (i1, ..., in)

(7) If none of the resulting coefficients changes to zero, then:

deg(P+Q) = (i1, ..., in) [This follows from definition 4 above]

(8) If at least one of the resulting coefficients changes to zero, then:

deg(P+Q) is less than (i1, ..., in)

(9) We know that deg(P+Q) cannot be higher because addition may change the coefficient but it cannot change the power of any of the terms.

QED


Lemma 2: deg(PQ) = deg P + deg Q

Proof:

(1) Let P = ∑ x1i1x2i2*...*xnin

(2) Let Q = ∑ x1j1x2j2*...*xnjn

(3) deg P = (i1, ..., in)

(4) deg Q = (j1, ..., jn)

(5) deg P + deg Q = (i1 + j1, ..., in+jn)

(6) P*Q = (∑ x1i1x2i2*...*xnin)*(∑ x1j1x2j2*...*xnjn) = ∑ (x1i1+j1x2i2+j2*...*xnin+jn)

(7) deg(P*Q) = (i1 + j1, ..., in+jn)

QED

Corollary 2.1: deg ax = x*deg(a)

Proof:

deg ax = deg (a*a*...*a) = deg(a) + deg(a) + ... + deg(a) = x*deg(a)

QED


Lemma 3: Nn does not contain any infinite strictly decreasing sequence of elements.

That is if we take any x ∈ Nn, we can only decrease it a finite amount of times.

Proof:

(1) This is clearly true for n=1.

(2) So, we can assume that this is true up to n-1.

(3) Assume that we have an infinite strictly decreasing sequence in Nn such that:

(i11, i12, ..., i1n) is greater than (i21, i22, ..., i2n) which is greater than ... which is greater than (im1, im2, ..., imn) is greater than ....

(4) Using Definition 6 above, we know that:

i11 ≥ i21 ≥ ... ≥ im1 ≥ ....

(5) Since i11 is finite, it follows that the only way that this can be infinite is if this sequence is eventually constant.

(6) Let us assume that it becomes constant starting with iM1 so that for all m ≥ M, im1 = iM1

(7) By our assumption in step #3, it follows that the following sequence must also be infinite:

(iM1, iM2, ..., iMn) is greater than (i(M+1)1, i(M+1)2, ...., i(M+1)n) is greater than ... and so on.

(8) Since all of the first elements are equal and from definition 6 above, we can remove the first element in all cases to get the following infinite sequence:

(iM2, ..., iMn) is greater than (i(M+1)2, ...., i(M+1)n) is greater than ... and so on.

(9) But now we have a contradiction. Since we assumed in step #2 that there are no infinite strictly decreasing sequence of elements in Nn-1

(10) So, we reject our assumption in step #3.

QED

Theorem 4: Waring's Method (Fundamental Theorem of Symmetric Polynomials)

A polynomial in n indeterminates x1, ..., xn over a field F can be expressed as a polynomial in s1, ..., sn if and only if it is symmetric.

In other words, for any symmetric polynomial, there exists a function g such that:

P(x1, ..., xn) = g(s1, ..., sn)

where s1, s2, ..., sn are the elementary symmetric polynomials.

Proof:

(1) Let P ∈ F[x1, ..., xn] be a non-zero symmetric polynomial.

(2) Let deg P = (i1, ..., in) ∈ Nn where i1 ≥ i2 ≥ ... ≥ in

(3) I will now show that P can be expressed as a function of the elementary symmetric polynomials.

(4) Let us define the following polynomial:

f = s1i1 - i2s2i2-i3*...*sn-1in-1-insnin

(5) Using Lemma 2 above, we have:

deg f = deg(s1i1 - i2) + deg(s2i2 - i3) + ... + deg(snin)

(6) Using Corollary 2.1 above, we have:

deg f = (i1 - i2)deg(s1) + (i2 - i3)deg(s2) + ... + indeg(sn)

(7) Based on the definition of the elementary symmetric polynomials (see here), we have:

deg f = (i1 - i2,0,...,0) + (i2 - i3, i2 - i3,0,...,0) + ... + (in, ..., in) = (i1, i2, ..., in)

(8) Also from the definition of the elementary symmetric polynomials, we know that the leading coefficient of f is 1.

(9) So, we can restate f as:

f = x1i1*...*xnin + (terms of lower degree)

(10) Let a ∈ Fx be the leading coefficient of P such that:

P = ax1i1*...*xnin + (terms of lower degree)

(11) Let:

P1 = P - af

(12) We can see that P1 has the following properties:

(a) deg P1 is less than deg P [See definition 4 above]

(b) P1 is symmetric since P and f are symmetric

(c) We can assume that P1 is nonzero. [If it were 0, we would be done with the proof. We only need to handle the case when P1 is nonzero to finish the proof]

(13) Now since P1 is a symmetric polynomial, we can repeat step #12 such that we define a polynomial P2

(14) In this way, we can continue to reduce Pi until we Pi - af = 0.

(15) We know that this process will eventually complete from Lemma 3 above.

QED

Example 4.1: S = ∑ x14x2x3 + ∑ x13x23

(1) Let:

S = ∑ x14x2x3 + ∑ x13x23

(so that: S = x14x2x3 + x1x24x3 + x1x2x34 + x13x23 + x13x33 + x23x33)

[See Definition 3 above for details if needed]

(2) Since (4,1,1) is greater than (3,3,0) [see Definition 6 above], it follows that [see Definition 4 above]:

deg(S) = (4,1,1)

(3) Let f = s14-1s21-1s31 = s13s3

(4) Using the definitions for s1, ..., s3, we have:

s13s3 = (∑ x1)3(x1x2x3) = ∑ x14x2x3 + 3∑ x13x22x3 + 6x12x22x32

(5) If we let S1 = S - f, then we get:

S1 = ∑ x13x23 - 3∑ x13x22x3 - 6x12x22x32

(6) deg S1 = (3,3,0)

(7) Let f1 = s13-3s23-0s30 = s23

(8) Using the definitions for s1, ..., s3, we have:

s23 = (∑ x1x2)3 = ∑ x13x23 + 3∑ x13x22x3 + 6x12x22x32

(9) If we let S2 = S1 - f1, then we get:

S2 = - 6∑ x13x22x3 - 12x12x22x32

(10) deg S2 = (3,2,1)

(11) Let f2 = s13-2s22-1s11 = s1s2s3

(12) Using the definitions for s1, ..., s3, we have:

s1s2s3 = (∑ x1)(∑ x1x2)(∑ x1x2x3) = ∑x13x22x3 + 3x12x22x32

(13) If we let S3 = S2 + 6f2, then we get:

S3 = 6x12x22x32

(14) deg(S3) = (2,2,2)

(15) Let f3 = s12-2s22-2s32 = s32

(16) Using the definitions for s1, ..., s3, we have:

s32 = (∑ x1x2x3)2 = x12x22x32

(17) We can see that S3 - 6f3 = 0 so we are done.

(18) The resulting function in terms of the elementary symmetric polynomials is:

S = s13s3 + s23 - 6s1s2s3 + 6s32


Lemma 5:

Let P,Q be polynomials such that P/Q is symmetric

Then:

P is symmetric if and only Q is symmetric

Proof:

(1) Assume that P is symmetric

(2) Assume that Q is not symmetric such that Q' is a permutation of Q and Q' ≠ Q.

(3) Let P' be the same permutation as Q'.

(4) Since P is symmetric, P' = P

(5) But P'/Q' = P/Q' ≠ P/Q

(6) But this is impossible since we assumed that P/Q is symmetric.

(7) Therefore, we reject our assumption in step #2.

(8) We can make the exact same argument if we assume that Q is symmetric and P is not.

QED

Theorem 6:

A rational fraction in n indeterminates x1, ..., xn over a field F can be expressed as a rational fraction in s1, ..., sn if it is symmetric

Proof:

(1) Let P,Q be polynomials in n indeterminates x1, ..., xn such that the rational fraction P/Q is symmetric.

(2) We can assume that P,Q are not symmetric.

If P is symmetric, then Q is too (from Lemma 5 above) and we can use Theorem 4 above to get our result. So, to prove the theorem, we need only handle the case where both P,Q are not symmetric.

(3) Since Q is not symmetric, let Q1, ..., Qr be the distinct polynomials (other than Q) obtained from Q through permutations of the indeterminates.

(4) The product QQ1*...*Qr is symmetric since any permutation of the indeterminates simply permutes the factors.

(5) Since P/Q is symmetric, it follows that P/Q = P*(Q1*...*Qr)/[Q*(Q1*...*Qr)] is symmetric too.

(6) It further follows that P*(Q1*...*Qr) is symmetric from Lemma 5 above.

(7) Using Theorem 4 above, we know that there exists functions f,g such that:

P*(Q1*...*Qr) = f(s1, ..., sn)

and

Q*(Q1*...*Qr) = g(s1, ..., sn)

(8) Thus,

P/Q = f(s1, ..., sn)/g(s1, ..., sn)

QED

References