Saturday, May 20, 2006

Fundamental Theorem of Algebra: The Proof

In today's blog, I complete the proof for the Fundamental Theorem of Algebra. In my next blog, I will use this result to factor Fermat's Last Theorem into cyclotomic integers.

Today's proof is taken from David Antin's translation of Heinrich Dorrie's 100 Great Problems of Elementary Mathematics.

Lemma 1: if an algebraic equation f(x) has a root α, then f(x) can be divided by x-α without a remainder and the degree of the result f'(x) is less than the degree of f(x).

Proof:

(1) Let f(x) = xn + a1xn-1 + ... + an-1x + an

(2) Let α be a root such that f(α) = 0

(3) Now, if we divide the polynomial by (x-α), we get the following (see here if proof needed):

f(x)/(x - α) = f1(x) + R/(x-α)

where R is a constant and f1(x) is a polynomial with order n-1.

(4) Multiplying both sides with x-α gives us:

f(x) = (x - α)f1(x) + R

(5) Now, if we substitute α for x we get:

f(α) = 0 which means that the constant in the equation is 0 so R = 0.

QED

Theorem: Fundamental Theorem of Algebra

For any polynomial equation of order n, there exist n roots ri such that:

xn + a1xn-1 + ... + an-1x + an = (x - r1)(x - r2)*...*(x - rn)

Proof:

(1) Let f(x) = xn + a1xn-1 + ... + an-1x + an

(2) We know that f(x) has at least one solution α1. [See here for proof]

(3) Using Lemma 1 above, we know that:

f(x)/(x - α1) = f'(x) where deg f'(x) = n-1.

So that we have:

f(x) = (x - α1)f'(x)

(4) But we know f'(x) has at least one solution (from here) so we can repeat steps #2 and #3 to get:

f'(x)/(x - α2) = f''(x) where deg f''(x) = n-2.

which combined with step #3 gives us:

f(x) = (x - α1)(x - α2)f''(x)

(5) Eventually we get to the point where the degree of fn(x) = 1.

In this case, fn(x) = x - αn.

(6) This establishes that there are n roots for a given equation f(x) where the degree is n.

(7) Putting this all together gives us:

f(x) = (x - α1)(x - α2)*...*(x - αn)

(8) Now, since f(x)=0 only when one of the values αi=x, we see that the n roots αi are the only solutions.

(9) So, we have proven that each equation is equal to n roots.

One important point to remember is that the n roots are not necessarily distinct. That is, it is possible that αi = αj where i ≠ j.

QED

References

Friday, May 19, 2006

Fundamental Theorem of Algebra: At least one solution

In today's blog, I continue the proof for the Fundamental Theorem of Algebra. Today, I will show the proof that all polynomials in the complex domain have at least one root that leads to 0.

Today's proof is taken from David Antin's translation of Heinrich Dorrie's 100 Great Problems of Elementary Mathematics.

Lemma 1: If f(x) = xn + a1xn-1 + ... + an-1x + an where ai and x are complex numbers, then there exists at least one solution r such that f(r)=0.

Proof:

(1) Let f(x) = xn + a1xn-1 + ... + an-1x + an

(2) Assume that for all x, f(x) ≠ 0

(3) We know that there exists a value x0 such that w0=f(x0) and w0 is the smallest absolute number. [See Lemma 2 here for proof]

(4) From step #2, we can assume that absolute(w0) is greater than 0.

(5) We can plot the minimal point, w0, on the plane of complex numbers (see here for more details if needed)

(6) From this point, we can define a small circle K with radius R.

(7) Now, for any point x in K, x = x0 + ζ

In other words, complex numbers form a one-to-one mapping between the possible values for f(x) and the Cartesian coordinate system. If we use the form r(cos θ + i sin θ), then we see that r is the radius of K [See here if more information needed]

(8) Using the plane of complex numbers, we know that there exists ρ, θ such that ζ = ρ(cos θ + isin θ) (see here for review of how cos θ + isin θ can be used in this situation)

(9) In the above case, ρ = the absolute magnitude of ζ

(10) So, for any value x, there exists a value w and a value ζ such that:

w = f(x) = f(x0 + ζ) = (x0 + ζ)n + a1(x0 + ζ)n-1 + ... + an

(11) From the equation in #10, we can rearrange the values to get the following:

w = f(x0) + c1ζ + c2ζ2 + ... + cnζn = w0 + c1ζ + c2ζ2 + ... + cnζn

(12) Now, it is quite possible that some of the ci values are 0 so we can rearrange the values so the first nonzero coefficient is c and the power is v, the next is c' and the power is v', and so on where each c,v are nonzero and v is less than v' is less than v'', etc.:

w = w0 + cζv + c'ζv' + c''ζv'' + ...

(13) Since we assume that w0 is nonzero, we can divide both sides by w0 to get:

w/w0 = 1 + (cζv)/w0 + (c'ζv')/w0 + (cζv'')/w0 + ...

(14) Now let us define some values to make this equation more manageable:

Let q = c/w0

Let ξ = (c'ζv' + c''ζv'' + ...)/(cζv)

So that:

w/w0 = 1 + qζv(1 + ζξ)

(15) Now, since q, ζ are complex numbers, we can represent them both using r(cos + isin) form (see here if more information needed)

So that there exists h, λ such that:

q = h(cos λ + i sin λ)

From step #8, there exists ρ, θ such that:

ζ = p(cos θ + i sin θ)

(16) To shorten the equation we can use:

1λ = cos λ + i sin λ

And use:

1θ = cos θ + i sin θ

So that we have:

q = h * 1λ

ζ = ρ * 1θ

(17) Using step #16, we get:

v = h*1λ*(ρ*1θ)v = h*pv*1λ*(1θ)v

(18) Now, using Euler's Formula (see Lemma #1, Lemma #2 here if needed), we know that:

(1θ)v = 1

1*1λ = 1λ + vθ

(19) Within the circle K, we can now consider only the values of x that are associated with θ = (π - λ)/v. [Since a circle includes all values of θ between 0 and, see here if needed]

(20) In this case:

1λ + vθ = 1λ + v(π - λ)/v = 1λ - λ + π = 1π

(21) Now 1π = cos (π) + isin(π) = -1 + i*0 = -1. [See here if needed]

(22) So, in this case, combining step #21 with step #17:

v = h*ρv*(-1) = -h*ρv

(23) Combining step #22 with step #14:

w/w0 = 1 - hρv(1 + ζξ)

(24) Now, we can set the radius of K to any value so we can constrain ζξ to be as close to 0 as we wish so that for all purposes, we have:

w/w0 = 1 - hρv

[The idea here is that the radius of K was selected arbitarily in step #6, any nonzero radius r will do]


(25) Now, we can choose any value for ρ so we choose a value such that ρ is greater than 0 and less than (1/h)(1/v).

We can do this since ρ is the magnitude of ζ (see step #9). In step #19, we constrained θ and in step #24, we constrained the maximum magnitude of the circle K.

Even with all of the above constraints, we are still left with a set of values and we can select a value of x such that the magnitude is less than the radius of K and less than (1/h)(1/v) but still greater than 0.

(26) But then v is greater than 0 and less than h*(1/h) = 1 since:

h*[(1/h)(1/v)]v = 1

(27) But this means there is a value of x such that w/w0 is greater than 0 and less than 1.

(28) But this is a contradiction because it means that w is less than w0 which is impossible from step #3.

The reasoning here is that if w/w0 = fraction, this implies that w = w0 * fraction which implies that w is less than w0

(25) So we are forced to reject our assumption in step #2.

QED

References

Sunday, May 14, 2006

Fundamental Theorem of Algebra: Preliminaries

Cyclotomic integers allow Fermat's Last Theorem to be factored in the following way:

xn + yn = (x + y)(x + αy)(x + α2y)*...*(x + αn-1y)

In the above factorization, n is an odd prime and α is a root of unity where αn = 1 and αi ≠ 1 for all 1 ≤ i ≤ n-1.

To establish this refactorization, I will be using the Fundamental Theorem of Algebra.

The main idea behind the Fundamental Theorem of Algebra is that for any given polynomial of a single variable of order n, there are at least n zeros to this equation in the complex domain.

This is an easy proof to state but difficult to prove. Rene Descartes and Jean le Rond d'Alembert knew about this result but it was not until Carl Friedrich Gauss that the fundamental theorem was rigorously proved. d'Alembert thought that the existence of a minimum point for a complex equation was obvious and needed no proof. I do not find this point so obvious so I begin with its proof.

The details in today's proof are taken from B. N. Delone's article on Algebra from Mathematics: Its Content, Methods, and Meaning by A. D. Aleksandrov, A. N. Kolmogorov, and M. A. Lavrent'ev and translated by S. H. Gould.

Theorem 1: Bolzano-Weierstrass Theorem

If a rectangle contains an infinite sequence of points (z1, z2, ..., zn, ... in its interior, then there exists a point z0 such that in any arbitrarily small neighborhood of z0, there are infinitely many points of the sequence z1, z2, ..., zn, ...

Proof:

(1) Let R1 be a rectanlge that contains an infinite sequence of points.

(2) We divide it up into 4 equal parts using two lines parallel to each of its sides.

(3) At least one of these four parts contains infinitely many points, let us label it R2

(4) We now divide up R2 into 4 equal parts using two lines parallel to each of its sides.

(5) One of these four parts contains infinitely many points and we label it R3

(6) In this way, we are able to generate a sequence of nested rectangles which we can label R1, R2, ..., Rn

(7) We can think of each side of this rectangle representing two nested intervals that exist on the x and y axises.

(8) Using Lemma 1 here, we there exists a point on the x-axis and a point on the y-axis that each of these nested intervals have in common.

(9) But this means that there is a point within the nested rectangles such that any arbitrary small neighborhood contains an infinity of points.

QED

Theorem 2: There exists a minimum point for c0xn + c1xn-1 + ... + cn.

Let f(x) = c0xn + c1xn-1 + ... + cn.

There exists a value x0 such that w0 = c0(x0)n + c1(x0)n-1 + ... + cn where absolute(w0) is the minimum value.

Proof:

(1) Let g = absolute(f(0))

(2) Let G be a number greater than g.

(3) Let R be a number such that if absolute(x) > R, then absolute(f(x)) is greater than G

(4) Now, if f(0)=0, then x0=0 and w0=0 so we can assume going forward that f(0) ≠ 0

(5) If f(0) is greater than 0 and all f(x) ≥ g, then x0=0 and w0=g so we can assume that there exists at least 1 point x' such that absolute(f(x')) is less than g.

(6) Based on a nonzero g, we can set up the follow sequence:

0, g/n, 2g/n, ..., ng/n = g

(7) We can find a value i, cn such that cn = (i/n)g and all values absolute(f(x)) ≥ cn.

(8) From i, we can also find a value cn' such that cn' = [(i+1)/n]g and there exists at least one value x' such that absolute(f(x')) is less than cn'

(9) We can find cn,i,cn' regardless of the value of n so we can let n increase to infinity.

(10) Now for all values of n, we can assume that absolute(x) ≤ R since if absolute(x) is greater than R, then absolute(f(x)) is greater than G and therefore greater than g. For purposes here, let's call these values xn

(11) So, we only need to consider the points xn that lie inside a rectangle of sides 2R and with its center at the origin.

(12) By Theorem 1 above, there exists a point z0 such that every neighborhood of z0 contains infinitely many points of the sequence z1, z2, ..., zn. Let us call this point x0

(13) For any point x, we have:

absolute(f(x)) is greater than cn = cn' - g/n which is greater than absolute(f(xn)) - g/n = absolute(f(x0)) + absolute(f(xn)) - absolute(f(x0)) - g/n.

(14) This inequality is true for all values of n so as n increases toward infinity, we see that difference absolute(f(xn)) - absolute(f(x0)) becomes arbitrarily small in absolute value with g/n.

(15) Consequently, all absolute(f(xn)) ≥ absolute(f(x0)) so x0 is the minimum point.

QED

References

Saturday, May 13, 2006

Cyclotomic Integers: Factoring Fermat's Last Theorem

Today's blog continues the discussion of Kummer's proof of Fermat's Last Theorem for regular primes. If you would like to review the historical context for this proof, start here.

The major reason why cyclotomic integers are interesting in relation to Fermat's Last Theorem is because they enable us to factor Fermat's Last Theorem in the following way:

zn = xn + yn = (x + y)(x + αy)(x + α2y) .... (x + αn-1y)

Below I will show how I can derive this factoring using the Fundamental Theorem of Algebra.

Lemma 1: Let α be a primitive root of unity such that n is an odd prime and αn = 1, and let x,y,z be integers such that xn + yn = zn, then:

zn = xn + yn = (x + y)(x + αy)(x + α2y) .... (x + αn-1y)

Proof:

(1) We know that xn - 1 has n root from the Fundamental Theorem of Algebra.

(2) We also note that for all αi where 0 ≤ i ≤ n-1, we have i)n = 1.

NOTE: αn = 1 so it is really the same as α0.

(3) Based on #2, the Fundamental Theorem of Algebra gives us:

xn - 1 = (x - 1)*(x - α)*(x - α2)*...*(x - αn-1)

QED

Theorem 1: if n is odd, then zn = xn + yn = (x + y)(x + αy)(x + α2y) .... (x + αn-1y)

Proof:

(1) an - 1 = (a - 1)*(a - α)*(a - α2)*...*(a - αn-1) [From Lemma 1 above]

(2) Since a can be any value, let a = -x/y so that:

(-x/y)n - 1 = [(-x/y) - 1]*[(-x/y) - α]*...*[(-x/y) - αn-1] = -(x)n/yn - 1

(3) If we multiply (-y)n=-(yn) to both sides, we get:

xn + yn = (x + y)*(x + yα)*...*(x + αn-1y)

QED

Cyclotomic Integers: Division Algorithm

Today's blog continues the discussion of Kummer's proof of Fermat's Last Theorem for regular primes. If you would like to review the historical context for this proof, start here.

Today, I will continue reviewing the basic properties of cyclotomic integers. Today's content comes directly from Chapter 4 of Harold M. Edwards' Fermat's Last Theorem: A Genetic Introduction to Algebraic Number Theory.

Lemma 1: Criteria for division of a cyclotomic integer by a rational integer.

A cyclotomic integer is divisible by a rational integer if and only if all of its integer coefficients are congruent modulo the rational integer.

Proof:

(1) Let f(α) be a cyclotomic integer.

(2) f(α) = a0 + a1α + ... aλ-1αλ-1 [See Lemma 1 here for details]

(3) Let d be a rational integer.

(4) It is clear that if d divides f(α), then a0 ≡ a1 ≡ ... ≡ aλ-1 ≡ 0 (mod d).

(5) Assume a0 ≡ a1 ≡ ... ≡ aλ-1 (mod d).

(6) Then, using Corrolary 2.1 from here, we know that we can add aλ-1 to each of the coefficients to get:

f(α) = (a0 - aλ-1) + (a1 - aλ-1)α + ... + (aλ - 2 - aλ - 1λ-2 + (aλ-1 - aλ-1λ-1

(7) But then it is clear that d divides each of the coefficients so it divides f(α).

QED

Corollary 1.1: Division Algorithm for a cyclotomic integer f(α) by a rational integer d

if f(α) = a0 + a1α + ... + aλ-1αλ-1, the result is:

[(a0 - aλ-1)/d] + [(a1 - aλ-1)/d]α + ... + [(aλ-2 - aλ-1)/d]αλ-2

Proof:

(1) If d divides f(α), then a0 ≡ a1 ≡ ... ≡ aλ-1 (mod d) [From Lemma 1 above]

(2) By Corollary 2.1 here, we can add constant -c to each coefficient and still maintain the same value so that:

f(α) (a0 - aλ-1) + (a1 - aλ-1)α + ... + (aλ - 2 - aλ - 1λ-2 + (aλ-1 - aλ-1λ-1

(3) From this we know that d divides each coefficient and the result of this corrollary follows.

QED

Lemma 2: Division Algorithm for Cyclotomic Integers

A cyclotomic integer h(α) is divisible by another cyclotomic integer f(α) if and only if:

h(α)*f(α)-1 is divisible by the Nf(α)


(1) Let f(α),h(α) be cyclotomic integers.

(2) We can assume f(α) ≠ 0

(a) Assume f(α) = 0.

(b) Then, g(α) exists only if h(α) = 0 in which case g(α) can take any value.

(c) This resolves f(α) = 0 so we only need to address the case where f(α) ≠ 0.

(3) Assume f(α) * g(α) = h(α)

(4) Then, Nf(α)*g(α) = h(α) *f(α2)*f(α3)*...*f(αλ-1) = h(α)*f(α)-1

(5) We see that f(α) divides h(α) if and only if Nf(α) divides h(α)*f(α3)*...*f(αλ-1).

(6) But Nf(α) is a rational integer. [See Lemma 5 here]

(7) Let i(α) = h(α)*f(α)-1

(8) i(α) = b0 + b1α + ... + bλ-1αλ-1

(9) Using Lemma 1 above, we see that Nf(α) divides i(α) if and only if for any two coefficients bj ≡ bk (mod Nf(α))

(10) But from step #5, we have that f(α) divides h(α) if and only if after computing i(α) from step #8, we find that all coeffients of i(α) are congruent modulo Nf(α).

QED

Corollary 2.1: Method for determining result of division between two cyclotomic integers.

if f(α) = a0 + a1α + ... + aλ-1αλ-1 and:

h(α)f(α)-1 = b0 + b1α + ... + bλ-1αλ-1,

the result is:

[(b0 - bλ-1)/Nf(α)] + [(b1 - bλ-1)/Nf(α)]α + ... + [(bλ-2 - bλ-1)/Nf(α)]αλ-2

Proof:

(1) Let f(α), g(α), h(α) be cyclotomic integers with h(α) = f(α)g(α).

(2) Multiplying both sides by f(α)-1 gives us:

Nf(α)g(α) = h(α)f(α)-1

(3) Since Nf(α) is a rational integer, we can apply Corollary 1.1 above to get the desired result.

QED

Friday, May 12, 2006

Cyclotomic Integers: Units and Primes

Today's blog continues the discussion of Kummer's proof of Fermat's Last Theorem for regular primes. If you would like to review the historical context for this proof, start here.

Today, I will continue reviewing the basic properties of cyclotomic integers. Today's content comes directly from Chapter 4 of Harold M. Edwards' Fermat's Last Theorem: A Genetic Introduction to Algebraic Number Theory.

1. Unit

Definition 1: Cyclotomic Unit

A cyclotomic unit is a cyclotomic integer whose norm is 1.

So, if f(α) is a unit, then f(α)*f(α2)*...*f(αλ-1) = 1.

Definition 2: Cyclotomic Inverse

If f(α) is a unit, then f(α2)*..*f(αλ-1) is called the inverse and it represented as f(α)-1.

Lemma 1: if f(α)*g(α) = 1, then f(α) is a unit and g(α) = f(α2)*f(α3)*...*f(αλ-1).

Proof:

(1) Let f(α)*g(α) = 1.

(2) Nf(α)*Ng(α) = N(1) = 1 [See Lemma 6, here]

(3) So, Nf(α) = 1 [See Lemma 5, here]

(4) So, Nf(α) = f(α)*f(α2)*...*f(αλ-1) = 1 [Definition of Norm for Cyclotomic integers, here]

(5) So, f(α)*f(α2)*...*f(αλ-1) = f(α)*g(α) [Combining step #1 with step #4]

(6) Dividing both sides of step #5 by f(α) gives us the desired result.

QED

Lemma 2: A cyclotomic unit is a factor of any cyclotomic integer.

Proof:

(1) Let f(α) be a unit.

(2) Let h(α) be any cyclotomic integer.

(3) Let g(α) = h(α)*f(α)-1 [See definition 2 above]

(4) Then, h(α) = g(α)*f(α)

QED

2. Cyclotomic Primes

Definition 3: Irreducible

A cyclotomic integer h(α) is irreducible if for any factorization h(α)=f(α)*g(α), either f(α) or g(α) is a unit.

Definition 4: Cyclotomic Prime

A cyclotomic integer h(α) is prime if:

(a) if h(α) divides f(α)*g(α), then h(α) divides f(α) or g(α).

(b) there exists at least one cyclotomic integer f(α) that h(α) does not divide.

(c) if h(α) is not a factor of f(α) and it is not a factor of g(α), then it is not a factor of f(α)*g(α)

In rational integers, all irreducible nonunits are also primes. One of the question that needs to be addressed is whether this is still the case with cyclotomic integers. I will answer this question in a later blog.

3. More Properties of Units

Lemma 3: a unit * a unit = a unit

Proof:

(1) Let g(α),h(α) be units.

(2) By Lemma 6, here, if i(α)=g(α)*h(α), then Ni(α) = Ng(α)*Nh(α) = 1*1 = 1.

QED

Lemma 4: 1/unit = a unit

Proof:

(1) Let h(α) be a unit

(2) Nh(α) = h(α)*h(α2)*...*h(αλ-1) = 1. [See Definition 2, here for definition of norm for cyclotomic integers]

(3) From (#2), we can see that:

1/h(α) = h(α2)*...*h(αλ-1)

(4) Further, we can see that:

Norm(1/h(α)) = Nh(α2)*N(α3)*...*Nh(αλ-1) = 1*1*1*...*1 = 1 [See Lemma 6, here]

QED

Lemma 5: unit/unit = unit

Proof:

(1) Let h(α),g(α) be units.

(2) h(α)/g(α) = h(α)*(1/g(α))

(3) From Lemma 4 above, we know that (1/g(α)) is a unit.

(4) From Lemma 3 above, we know that a unit*unit = unit.

QED

Tuesday, May 09, 2006

Basic Properties of Cyclotomic Integers

Today's blog continues the discussion of Kummer's proof of Fermat's Last Theorem for regular primes. If you would like to review the historical context for this proof, start here.

Today, I will review the basic properties of cyclotomic integers. Today's content comes directly from Chapter 4 of Harold M. Edwards' Fermat's Last Theorem: A Genetic Introduction to Algebraic Number Theory.

1. Notation

For Kummer's notation, he used λ to represent the odd prime number and α to represent the root of unity so that we have:

Definition 1:
αλ = 1

2. Standard Form of Cyclotomic Integers

Lemma 1:
If a0, a1, ... aλ-1 are integers, then all cyclotomic integers for a given value of λ can be represented in the following form:

a0 + a1α + a2α2 + ... + aλ-1αλ-1

Proof:

(1) Let's assume that we have cyclotomic integer = a0 + a1α + a2α2 + ... + aλ-1αλ-1 + aλαλ

(2) By definition 1 above, αλ = 1

(3) So that we have:

(a0 + aλ) + a1α + a2α2 + ... + aλ-1αλ-1

(4) We can do the same thing for any power of αi where i ≥ λ

(5) So we can conclude that all values can be reduced to the form required.

QED

Lemma 2: For any given value of λ, 1 + α+α2 + ... + αλ-1 = 0

Proof:

(1) Since αλ = 1, we have:

1 + α+α2 + ... + αλ-1λ + α+α2 + ... + αλ-1 =

= α(αλ-1 + 1 + α+α2 + ... + αλ-2)

(2) Now, we know that α ≠ 0 since 0λ = 0 which contradicts with definition 1.

(3) We also know that α ≠ 1 since α is a λth root of unity [using Euler's Identity, see here], we know that α = e2iπ/λ

(4) So, therefore, 1 + α+α2 + ... + αλ-1 = 0

QED

Corollary 2.1: for any given integer c, a0 + a1α + a2α2 + ... + aλ-1αλ-1 = (a0 + c) + (a1 + c)α + (a2 + c)α2 + ... + (aλ-1 + c)αλ-1.

Proof:

(1) 1 + α + α2 + ... + αλ-1 = 0 [From Lemma 2 above]

(2) c + cα + cα2 + ... + cαλ-1 = c*0 = 0

(3) So that:

a0 + a1α + a2α2 + ... + aλ-1αλ-1 = a0 + a1α + a2α2 + ... + aλ-1αλ-1 + 0 =

= a0 + a1α + a2α2 + ... + aλ-1αλ-1 + c + cα + cα2 + ... + cαλ-1 =

= (a0 + c) + (a1 + c)α + (a2 + c)α2 + ... + (aλ-1 + c)αλ-1.

QED


3. Conjugates

Since each cyclotomic value can be represented as:

a0 + a1α + a2α2 + ... + aλ-1αλ-1

Kummer used the following shorthand to represent a cyclotomic integer:

f(α), g(α), φ(α), F(α), etc.

One important point that we find is that if f(α) = g(α), then f(α2) = g(α2) and so on up until λ - 1.

Lemma 2.5: Conjugates preserve relations between equations

That is, if f(α) = g(α), then f(αi) = g(αi) where i is a positive number less than λ, αi ≠ 1 and αλ = 1.

Proof:

(1) Let f(α) = a0 + a1α + ... + aλ-1αλ-1

(2) For any value f(αi) we see that:

f(αi) = a0 + a1αi + ... + aλ-1αi*(λ-1)

(3) In step #1, let j be the possible values ranging from 1 to λ -1. Combining this with step #2, we get:

f(αi) = ∑ ajαj*i

(4) To prove this lemma, we need to show each element j*i is congruent to a unique value of i modulo λ

In other words, we are trying to prove that each element of the f(αi) is distinct.

(5) This turns out to be the case from Lemma 1 here.

QED

For this reason, we say that f(α), f(α2), ...., and f(αλ-1) are conjugates of each other.

4. Norm

Definition 2: Norm of a cyclotomic integer f(α)

Nf(α) = f(α)*f(α2)*...*f(αλ-1)

I will now use this definition in the following proofs.

Lemma 3: Nf(α) = Nf(αi) for all values of i between 1 and λ-1.

Proof:

(1) Nf(αi) = f(αi)*f(α2*i)*...*f(αi(λ-1))

(2) Now, each value i, 2*i, 3*i, ... (λ-1)*i maps to a distinct value of 1,2,3,...,(λ-1) modulo λ (see Lemma 1 here)

(3) So in each case, i,2*i, etc. maps to a1*λ+1, a2*λ+2, etc.

(4) So we get Nf(αi) = f(αa0*λ+1)*f(αa1*λ+2)*...*f(αaλ-1*λ+λ-1) where ai is a nonnegative integer.

(5) Since αn*λ=1, we get:

Nf(αi) = f(α)*f(α2)*...*f(αλ-1)

QED

Lemma 4: αj = αλ-j

Proof:

(1) From roots of unity and Euler's Formula, we know that:

α = e(i2π/λ) = cos(2π/λ) + isin(2π/λ)

(2) We also know that the complex conjugate of a + bi is a - bi, so the complex conjugate for α is:

α = cos(2π/λ) - isin(2π/λ)

(3) Likewise, we know that the complex conjugate for αj is:

αj = cos(2jπ/λ) - isin(2jπ/λ)

(4) Using Euler's Formula, we see that:

e-2jπ/λ = cos(-2jπ/λ) + isin(-2jπ/λ)

(5) Since cos(-x) = cos(x) and sin(-x) = -sin(x) [see here], we can use (#4) to get:

e-2jπ/λ = cos(2jπ/λ) - isin(2jπ/λ)

which is from #3, the complex conjugate for αj

(6) Now, e-2jπ/λ = (e2π/λ)-j =

= α-j = α-jλ =

= αλ - j

QED

Corollary 4.1: f(αj) = f(αλ-j)

Proof:

(1) From Lemma 1, we have:

f(α) = a0 + a1α + a2α2 + ... + aλ-1αλ-1

(2) From this,

f(αj) = a0 + a1αj + a2α2*j + ... aλ-1αj*(λ-1)

(3) Now, from Lemma 4, we know that:

f(αj) = a0 + a1αλ-j + a2αλ - 2*j + ... aλ-1αλ - j*(λ-1)

(4) And, we know that:

f(αλ-j) = a0 + a1αλ-j + a2α(λ-j)*2 + ... aλ-1α(λ - j)*(λ - 1)

(5) Now,

n*λ - j*n ≡ λ - j*n (mod λ) [See here if you need a review of modular arithmetic]

(6) So that we see that step #3 and step #4 are equal so that:

f(αj) = f(αλ-j)

QED

Corollary 4.2: f(αj)*f(αλ-j) is a nonnegative real number

Proof:

(1) f(αj) * f(αλ-j) = f(αj)* f(αj) [From Corollary 4.1 above]

(2) So that:
f(αj) * f(αλ-j) = (a0 + a1αj + ... + aλ-1αj*(λ-1))(a0 + a1αj + ... + aλ-1αj*(λ-1)) =

= (a0)2 + (a1)2j*αj) + ... + (aλ-1)2j*(λ-1)*αj*(λ-1))

(3) Since each α*α is a nonnegative number, the conclusion follows.

QED

Lemma 5: For any cyclotomic integer f(α), its norm is a nonnegative rational integer.

Proof:

(1) Using Lemma 1 above, we know that:

Nf(α) = a0 + a1α + a2α2 + ... + aλ-1αλ-1

(2) By Lemma 3 above, we can substitute any conjugate αj and get the same norm so that:

Nf(αj) = Nf(α)

(3) But by changing to a conjugate, we keep the same coefficients but get the following:

Nf(αj) = a0 + a1αj + a2αj*2 + ... + aλ-1α(λ-1)*j

(4) Combining the two equations gets us:

a0 + a1αj + a2αj*2 + ... + aλ-1α(λ-1)*j = a0 + a1α + a2α2 + ... + aλ-1αλ-1

(5) Subtracting one from the other gives us:

a0 - a0 + (a1 - ajj + ... = 0

(6) Since we know that each of these j,2*j,...,(λ-1)*j matches up with a value 1,2,...,λ-1, we know that:

a1 = aj

(7) Further, since j can be any value from 2 thru λ-1, we can conclude the following:

a1 = a2 = a3 = ... = aλ-1

(8) So that:
Nf(α) = a0 + a1(α + α2 + ... + αλ-1)

(9) From Lemma 2, we know that:

1 + α+α2 + ... + αλ-1 = 0

so that:

α+α2 + ... + αλ-1 = -1

(10) So, we apply (#9) to (#8) to give us:

Nf(α) = a0 - a1

(11) We know that it is nonnegative since:

Nf(α) = [f(α1)*f(αλ-1)]*[f(α2)*f(αλ-2)]*...

(12) From Corollary 4.2 above, we know that multiplication of (λ-1)/2 pairs of nonnegative values will result in a nonnegative value.

QED

Lemma 6: f(α)g(α) = h(α) → Nf(α)*Ng(α) = Nh(α)

Proof:

(1) Let f(α)g(α) = h(α)

(2) By Definition 2 above:

Nf(α) = f(α)*f(α2)*...*f(αλ-1)

Ng(α) = g(α)*g(α2)*...*g(αλ-1)

Nh(α) = h(α)*h(α2)*...*h(αλ-1)

(3) Using step #1 gives us:

Nh(α) = f(α)*g(α)*f(α2)*g(α2)*...*f(αλ-1)*g(αλ-1) =

= f(α)*f(α2)*...*f(αλ-1) *g(α)*g(α2)*...*g(αλ-1) =

= Nf(α)*Ng(α)

QED

Sunday, May 07, 2006

Fermat's Last Theorem: Proof for regular primes

One of the highpoints of the 19th century mathematics is Kummer's proof of Fermat's Last Theorem for regular primes.

Kummer's theory of ideal numbers is one of the foundations of algebraic number theory. In future blogs, I will talk about some of the other very important proofs that came out at this time (impossibility of a general method for quintic equations, transcendence of π, and the fundamental theorem of algebra) and show how Dedekind reinterpreted many of these developments into the modern concepts of ideals, rings, groups, and fields.

Kummer's proof comes down to three major points.

(A) For certain primes (which Kummer called "regular primes"), cyclotomic integers can be said to have a form of unique factorization. [See here for discussion on ideal numbers and how they "save" unique factorization for cyclotomic integers]

(B) For a regular prime λ, there is no solution to xλ + yλ = zλ where x,y,z are pairwise relatively prime all prime to λ

(C) For a regular prime λ, there is no solution to xλ + yλ = zλ where x,y, z are pairwise relatively prime and where λ divides z.

For the full proof, go here.

References