Fermat's Last Theorem: 2005-11-27

Today's blog provides the solution to Pell's Equation. For background on the history behind Pell's Equation, start here. For those not familiar with Continued Fractions, start here. I will also be using Matrices (for a review of matrix math, see here.) For a review of how matrices can be used to represent continued fractions, see here.

Pell's Equation is presented within the context of solving Fermat's Last Theorem: n=5. For those interested in the history of Fermat's Last Theorem: n=5, start here. For those interested in the proof itself, start here.

Today's blog is again based on Harold M. Stark's An Introduction to Number Theory.

Method for Solving Pell's Equation of the form: x² - dy² = 1 where x,y are integers.

(1) Use Continued Fractions to identify the period for the quadratic integer √d (see here for details on the method for doing this). We will need to know at what point the period starts and its length.

For example, let's look at √3

√3 = [ 1, 1, 2 ]

The period starts at n=1 and it has a length of 2.

(2) Use Matrix Theory and Continued Fractions to find M_n^-1 (see here for more details)

In the case of √3:

(3) Use Matrix Theory and Continued Fractions to find M_n+kp

where p is the length of the period (not to be confused with the numerators p_n) and k is any positive integer. For the example, I will use k = 1.

In the case of √3 (see here for details on how q_n and p_n are defined):

(4) Now, to find the solution, get the product of M_n^-1 with M_n+p.

In the case of √3 (see here for a review of matrix products):

(5) Then, we find the answer in the result where x,y are found here:

x = a
y = c

In this case of √3 :

x=2, y=1

And we see that:

(2)² - 3(1)² = 4 - 3 = 1

NOTE: Setting k=2, 3, 4 etc. in step (3) gives us:

a=2: x=7, y = 4
a=3: x=26, y = 15
a=4: x=97, y=56

QED
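
The five steps above can be sketched in Python. This is only a sketch under stated assumptions, not code from Stark's book: the function names are mine, and because the matrices in this post were shown as images, I am assuming that M_n has the rows (q_n-2, p_n-2) and (q_n-1, p_n-1). That layout is chosen so that the √3 example comes out as stated above.

```python
def matrix_m(terms, n):
    """M_n under the assumed layout: rows (q_{n-2}, p_{n-2}) and (q_{n-1}, p_{n-1})."""
    p2, q2, p1, q1 = 0, 1, 1, 0  # p_{-2}, q_{-2}, p_{-1}, q_{-1}
    for a in terms[:n]:
        p2, q2, p1, q1 = p1, q1, a * p1 + p2, a * q1 + q2
    return [[q2, p2], [q1, p1]]


def inverse(m):
    """Inverse of a 2x2 integer matrix whose determinant is +1 or -1."""
    (a, b), (c, d) = m
    det = a * d - b * c  # equals its own reciprocal when it is +1 or -1
    return [[d * det, -b * det], [-c * det, a * det]]


def product(x, y):
    """Ordinary 2x2 matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def pell_step(prefix, period, k):
    """Return (x, y) read from the first column of M_n^-1 * M_{n+k*j}."""
    n, j = len(prefix), len(period)
    terms = list(prefix) + list(period) * k
    result = product(inverse(matrix_m(terms, n)), matrix_m(terms, n + k * j))
    return result[0][0], result[1][0]


# sqrt(3) = [1, 1, 2]: the period (1, 2) starts at n = 1 and has length 2.
for k in range(1, 5):
    x, y = pell_step([1], [1, 2], k)
    print(k, x, y, x * x - 3 * y * y)
```

Each printed line shows k, the pair x, y, and the value of x² - 3y², so the pairs listed in the NOTE can be checked directly.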

Finally, here is a lemma about this method:

Lemma 1: The above method solves Pell's Equation (Sufficiency)

In other words:

(a) We let r_k, s_k, t_k, u_k be the result of M_n^-1M_n+kj:

(b) We define M_n based on the Continued Fraction Approximation Algorithm so that:

(c) We further assume that:
n = the start of the period for √d and j is the length of the period such that:
√d = [ a₀, a₁, ... a_n-1, α_n ]

and

α_n = [ a_n, ... a_n+j-1, α_n ]

(d) k is any positive integer.

Then

r_k² - dt_k² = (-1)^kj

Proof:

(1) Let γ_n = α_nq_n-1 + q_n-2. [See here for definition of α_n and q_n ]

(2) We know that γ_n ≠ 0. [See here for proof]

(3) Let δ = γ_n+kj/γ_n. [We can do this since γ_n ≠ 0]

(4) We note that δ(1,√d) = (1,√d)M_n^-1M_n+kj
[See here for proof]

(5) From this, we can know that:
(dt_k - s_k) + (r_k - u_k)√d = 0. [See here for proof ]

(6) From a basic property of irrational numbers (see here), we know that:

dt_k - s_k = 0
r_k - u_k = 0

Therefore, we have:

u_k = r_k
s_k = dt_k

(7) Applying this to (a) above gives us:

(8) From this, we know that:

(9) Now using the fundamentals of determinants (see here) and some lemmas found here, we know that:

det (M_n^-1M_n+kj) = det(M_n^-1)det(M_n+kj) =
det (M_n)^-1det (M_n+kj)=
[(-1)ⁿ]^-1(-1)^n+kj =
(-1)^-n(-1)^n+kj = (-1)^kj

(10) Likewise, we know that:

(11) So putting this together, we have shown that:

r_k² - dt_k² = (-1)^kj

QED
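
Lemma 1 can be spot-checked numerically. The sketch below makes two assumptions that are mine and not from the post: M_n is taken to have the rows (q_n-2, p_n-2) and (q_n-1, p_n-1), and the continued fraction of √d is produced by the standard integer recurrence, which relies on the period of √d ending with the term 2a₀. It then confirms u_k = r_k, s_k = dt_k and r_k² - dt_k² = (-1)^kj for a few values of d.

```python
from math import isqrt


def sqrt_cf(d):
    """Return (a0, period) for sqrt(d); d must not be a perfect square."""
    a0 = isqrt(d)
    m, den, a, period = 0, 1, a0, []
    while a != 2 * a0:  # the period of sqrt(d) ends with the term 2*a0
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        period.append(a)
    return a0, period


def matrix_m(terms, n):
    """M_n under the assumed layout: rows (q_{n-2}, p_{n-2}) and (q_{n-1}, p_{n-1})."""
    p2, q2, p1, q1 = 0, 1, 1, 0  # p_{-2}, q_{-2}, p_{-1}, q_{-1}
    for a in terms[:n]:
        p2, q2, p1, q1 = p1, q1, a * p1 + p2, a * q1 + q2
    return [[q2, p2], [q1, p1]]


def lemma_entries(d, k):
    """Return (r, s, t, u, j) where [[r, s], [t, u]] = M_1^-1 * M_{1+k*j}."""
    a0, period = sqrt_cf(d)
    j = len(period)
    terms = [a0] + period * k
    (a, b), (c, e) = matrix_m(terms, 1)  # for sqrt(d) the period starts at n = 1
    det = a * e - b * c                  # +1 or -1
    ia, ib, ic, ie = e * det, -b * det, -c * det, a * det  # inverse entries
    (w, x), (y, z) = matrix_m(terms, 1 + k * j)
    r, s = ia * w + ib * y, ia * x + ib * z
    t, u = ic * w + ie * y, ic * x + ie * z
    return r, s, t, u, j


for d in (2, 3, 7, 13):
    for k in (1, 2):
        r, s, t, u, j = lemma_entries(d, k)
        print(d, k, (r, s, t, u), r * r - d * t * t, (-1) ** (k * j))
```

The last two numbers on each printed line are r_k² - dt_k² and (-1)^kj, which the lemma says must agree.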

Corollary 1.1: There are either an infinite number of solutions to Pell's equation or no solutions. (Existence).

(1) We know that:

det(M_n^-1M_n+kj) = (-1)^kj [From Lemma 1]

(2) There are infinitely many values of k involved since k is any positive integer.

(3) When j is even, (-1)^kj = 1 for all k.

(4) If j is odd, (-1)^kj = 1 for all even k and (-1)^kj = -1 for all odd k.

(5) Different values of k give different pairs of numbers r_k and t_k

(a) Suppose r_k = r_m, t_k = t_m

(b) M_n^-1M_n+kj = M_n^-1M_n+mj

(c) M_n+kj = M_n(M_n^-1M_n+kj) = M_n(M_n^-1M_n+mj) = M_n+mj

(d) Thus n+kj = n + mj and k = m (see here for proof)

(e) Therefore, k ≠ m → r_k ≠ r_m or t_k ≠ t_m

QED

Today's blog is a continuation of the discussion about the famous math problem known as Pell's Equation.

In today's blog, I will show proofs for two properties of Continued Fractions (for those not familiar with Continued Fractions, start here):

All rational numbers can be represented as a finite continued fraction.
All irrational numbers that are solutions to a quadratic equation have periodic continued fractions (that is, they repeat the same pattern of integers over and over again after a certain point)

For the irrational numbers, I mean any irrational number that is a solution to a quadratic equation that is of this form:
ax² + bx + c = 0

where a,b,c are integers.

For a periodic continued fraction with a period of size j, where the period begins at or before index i, we have: a_i = a_i+j.

The proofs outlined in today's blog are based on work done by Harold M. Stark in his book An Introduction to Number Theory.

Theorem 1: A continued fraction is finite if and only if it represents a rational number.

(1) We know that if a continued fraction is finite, then it is representable as a rational number. (See Lemma 2, here)

(2) So, all we need to prove is that a rational number is representable as a finite continued fraction.

(3) For any rational number, there are two integers, let's call them c,d₀ such that the rational number is representable by c/d₀. [Definition of a rational number]

(4) To generate a continued fraction, we need to generate a set of integers a₀ through a_n.

(5) Let's set a₀ = floor(c/d₀).

A floor is a function that returns the greatest integer that is less than or equal to a certain rational value. For example, the floor of (3/4) is 0. The floor of (5/2) is 2 and the floor of (-8) is -8. Finally, the floor of (-8.1) is -9.

(6) If a₀ = c/d₀, then we are done and the continued fraction is [a₀]. Going forward, we will assume that a₀ is strictly less than c/d₀.

(7) Subtracting a₀ from the original number results in a rational number which we can call β₀:

c/d₀ = a₀ + β₀

We note that β₀ is greater than 0 by our assumption in #6 and less than 1 by the definition of the floor.

(8) Now, from #7, we know that d₀ * β₀ is an integer since:

c = d₀ * a₀ + d₀*β₀

And therefore:

d₀ * β₀ = c - d₀*a₀

Which is an integer since c is an integer and d₀ is an integer and a₀ is an integer.

(9) So, we know there exists an integer d₁ = d₀ * β₀.

We also note that this integer has the following properties:

(a) d₁ ≥ 0.

We know that it is not negative since we can assume d₀ is positive (if both c,d₀ are negative, we can replace both by their negatives without changing c/d₀; if c is positive and d₀ is negative, we can likewise make c negative and d₀ positive).

(b) d₁ is less than d₀

We know it is less than d₀ since β₀ is less than 1 and since d₁ = d₀ * β₀

(10) From #9, we note that β₀ = d₁/d₀

(11) Now, a₁ = floor(1/β₀) = floor(d₀/d₁)

Now if 1/β₀ is an integer (for example, 1/(1/2) = 2), then we are done and the continued fraction is [a₀,a₁]. So, let's assume that we are not done.

(12) So, then, there exists a rational value β₁ that is equal to the difference between 1/β₀ and a₁, which is greater than 0 and less than 1, so that:

d₀/d₁ = a₁ + β₁

This is the same form as step #7 and the same principles apply. We can derive d₂ = d₁ * β₁ that is a positive integer less than d₁ and so on.

(13) Because the d_i are nonnegative integers that get strictly smaller at each step, we know that eventually one of them will equal 0.

(14) When d_i = 0 we are done and so we have proven that our sequence will eventually end.

QED
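
The construction in this proof is easy to try out. Here is a minimal Python sketch (the function name is mine) that takes the floor, keeps the remainder d₁, d₂, ... and stops when the remainder reaches 0. It assumes d₀ is positive, as in step #9.

```python
def continued_fraction(c, d0):
    """Partial quotients [a0, a1, ...] of c/d0, where d0 is a positive integer."""
    if d0 <= 0:
        raise ValueError("d0 must be positive")
    quotients = []
    while d0 != 0:
        a = c // d0             # floor(c/d0); // rounds toward minus infinity
        quotients.append(a)
        c, d0 = d0, c - a * d0  # next fraction is d_i/d_(i+1), with d_(i+1) < d_i
    return quotients


print(continued_fraction(415, 93))
print(continued_fraction(-81, 10))
```

The loop has to stop because the second value is a nonnegative integer that shrinks on every pass, which is exactly the argument in steps #13 and #14.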

Lemma 1: If a continued fraction is periodic, then it represents a quadratic number.

(1) A quadratic number is any real number that satisfies the quadratic equation:
ax² + bx + c = 0 where a,b,c are integers.

For example, √2 is a quadratic number since it satisfies the equation (a=1,b=0,c=-2) where:
x² - 2 = 0

(2) Let α be a real number that is represented by a periodic continued fraction that repeats starting at a_m with a period of k.

So α = [ a₀, a₁, ..., a_m-1, α_m]

And

α_m = [ a_m, a_m+1, ..., a_m+k-1, a_m, a_m+1, ...]

And

α_m = [ a_m, a_m+1, ..., a_m+k-1, α_m]

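(3) So, applying Lemma 1 from a previous blog to this expansion of α_m, we get:
α_m = (α_m*p_m + p_m-1)/(α_m*q_m + q_m-1)

Here p_m, p_m-1, q_m, q_m-1 stand for the last two convergent numerators and denominators of the repeating block [ a_m, ..., a_m+k-1 ]. All that is used below is that they are integers.

(4) And multiplying both sides by (α_m*q_m + q_m-1) gives us:

(α_m)²*q_m + α_m*q_m-1 = (α_m*p_m + p_m-1)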

And:
(α_m)²*q_m + α_m*q_m-1 - α_mp_m - p_m-1= 0

And:
(α_m)²*q_m + (α_m)(q_m-1 - p_m) - p_m-1= 0

(5) Applying lemma 1 from a previous blog to the first equation in #2 gives us:
α = (α_m*p_m-1 + p_m-2)/(α_m*q_m-1 + q_m-2)

(6) So,

α*(α_m*q_m-1 + q_m-2) = (α_m*p_m-1 + p_m-2)

and

α*α_m*q_m-1 + α*q_m-2 - α_m*p_m-1 - p_m-2 = 0

and

α_m(α*q_m-1 - p_m-1) = p_m-2 - α*q_m-2

and

α_m = (p_m-2 - α*q_m-2)/(α*q_m-1 - p_m-1)

(7) Now inserting (#6) into (#4) gives us:

[(p_m-2 - α*q_m-2)/(α*q_m-1 - p_m-1)]²*q_m +
[(p_m-2 - α*q_m-2)/(α*q_m-1 - p_m-1)](q_m-1 - p_m) - p_m-1 = 0

And multiplying both sides by (α*q_m-1 - p_m-1)² gives us:
(p_m-2 - α*q_m-2)²*q_m +
(p_m-2 - α*q_m-2)(α*q_m-1 - p_m-1)(q_m-1 - p_m) - (α*q_m-1 - p_m-1)²(p_m-1) = 0

(8) Expanding the products above and collecting the powers of α gives us a quadratic equation with integer coefficients of the form:
aα² + bα + c = 0

QED
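
Step #4 of this proof can be made concrete for a purely periodic value x = [ block, x ]. The Python sketch below (function names are mine) builds the last two convergents of the repeating block, reads off the integer quadratic that x must satisfy, and checks it against a floating-point approximation of x. It assumes every term of the block is a positive integer.

```python
def block_quadratic(block):
    """Integers (A, B, C) with A*x^2 + B*x + C = 0 for x = [block, x]."""
    p2, q2, p1, q1 = 0, 1, 1, 0  # convergents before the block starts
    for a in block:
        p2, q2, p1, q1 = p1, q1, a * p1 + p2, a * q1 + q2
    # x = (x*p1 + p2)/(x*q1 + q2)  =>  q1*x^2 + (q2 - p1)*x - p2 = 0
    return q1, q2 - p1, -p2


def evaluate(block, rounds=60):
    """Approximate x = [block, block, ...] by unrolling the period `rounds` times."""
    x = 1.0
    for _ in range(rounds):
        for a in reversed(block):
            x = a + 1.0 / x
    return x


for block in ([2], [1, 2]):
    A, B, C = block_quadratic(block)
    x = evaluate(block)
    print(block, (A, B, C), x, A * x * x + B * x + C)
```

The block [2] is the repeating part of √2 and the block [1, 2] is the repeating part of √3, so the two values of x are 1 + √2 and (1 + √3)/2.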

Lemma 2: If α is a quadratic number, and [a₀, a₁, ... , a_n-1, α_n] is a continued fraction representation where a_i are integers and α_n is a real number, then α_n is also a quadratic number.

(1) Since α is a quadratic number, there exists integers a,b, and c such that:
a(α)² + b(α) + c = 0. [Definition of a quadratic number]

(2) From Lemma 1 in a previous blog, we know that:
α = (α_n * p_n-1 + p_n-2)/(α_n* q_n-1 + q_n-2)

(3) Inserting (2) into (1), multiplying both sides by (α_n*q_n-1 + q_n-2)², and then regrouping around (α_n)², α_n, and the constant term gives us three integers A_n, B_n, C_n such that:
A_n(α_n)² + B_n(α_n) + C_n = 0.

where

A_n = a(p_n-1)² + bp_n-1q_n-1 + c(q_n-1)²

B_n = 2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2

C_n = a(p_n-2)² + bp_n-2q_n-2 + c(q_n-2)² = A_n-1

(4) By definition, this means that α_n is a quadratic number.

QED
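
As a sanity check on the formulas for A_n, B_n, C_n, here is a short Python sketch for α = √3, which satisfies x² - 3 = 0 (so a=1, b=0, c=-3). The function names are mine, and the values α_n are computed in floating point, so the last printed number on each line is only approximately 0.

```python
from math import sqrt


def convergent_pair(terms, n):
    """Return (p_{n-1}, q_{n-1}, p_{n-2}, q_{n-2}) for the given partial quotients."""
    p2, q2, p1, q1 = 0, 1, 1, 0  # p_{-2}, q_{-2}, p_{-1}, q_{-1}
    for a in terms[:n]:
        p2, q2, p1, q1 = p1, q1, a * p1 + p2, a * q1 + q2
    return p1, q1, p2, q2


def transformed(a, b, c, terms, n):
    """A_n, B_n, C_n from Lemma 2."""
    p1, q1, p2, q2 = convergent_pair(terms, n)
    A = a * p1 * p1 + b * p1 * q1 + c * q1 * q1
    B = 2 * a * p1 * p2 + b * p1 * q2 + b * p2 * q1 + 2 * c * q1 * q2
    C = a * p2 * p2 + b * p2 * q2 + c * q2 * q2
    return A, B, C


terms = [1, 1, 2, 1, 2, 1, 2]  # partial quotients of sqrt(3)
alpha_n = sqrt(3)              # alpha_0 is alpha itself
for n in range(5):
    A, B, C = transformed(1, 0, -3, terms, n)
    print(n, (A, B, C), A * alpha_n ** 2 + B * alpha_n + C)
    alpha_n = 1 / (alpha_n - terms[n])  # alpha_(n+1) = 1/(alpha_n - a_n)
```

Note that C_n on each line equals A_n-1 from the line before, as stated in step #3.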

Corollary 2.1: (B_n)² - 4A_nC_n = b² - 4ac

where:

A_n = a(p_n-1)² + bp_n-1q_n-1 + c(q_n-1)²

B_n = 2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2

C_n = a(p_n-2)² + bp_n-2q_n-2 + c(q_n-2)² = A_n-1

(1) From Lemma 2 above:

(B_n)² - 4A_nC_n =

= (2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2)(2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2) - 4*[a(p_n-1)² + bp_n-1q_n-1 + c(q_n-1)²][a(p_n-2)² + bp_n-2q_n-2 + c(q_n-2)²] =

= (2ap_n-1p_n-2 )(2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2) +
(bp_n-1q_n-2)(2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2) +
(bp_n-2q_n-1)(2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2) +
(2cq_n-1q_n-2)(2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2) -
(4a(p_n-1)²)[a(p_n-2)² + bp_n-2q_n-2 + c(q_n-2)² ] -
(4bp_n-1q_n-1)[a(p_n-2)² + bp_n-2q_n-2 + c(q_n-2)² ] -
(4c(q_n-1)²)[a(p_n-2)² + bp_n-2q_n-2 + c(q_n-2)² ]

(2) Multiplying each term, we get:

(i) (2ap_n-1p_n-2 )(2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2) =

4a²(p_n-1)²(p_n-2)² +
2ab(p_n-1)²(p_n-2)(q_n-2) +
2ab(p_n-1)(p_n-2)²(q_n-1) +
4ac(p_n-1)(p_n-2)(q_n-1)(q_n-2)

(ii) (bp_n-1q_n-2)(2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2) =

2ab(p_n-1)²(p_n-2)(q_n-2) +
b²(p_n-1)²(q_n-2)² +
b²(p_n-1)(p_n-2)(q_n-1)(q_n-2) +
2bc(p_n-1)(q_n-1)(q_n-2)²

(iii) (bp_n-2q_n-1)(2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2) =

2ab(p_n-1)(p_n-2)²(q_n-1) +
b²(p_n-1)(p_n-2)(q_n-1)(q_n-2) +
b²(p_n-2)²(q_n-1)² +
2bc(p_n-2)(q_n-1)²(q_n-2)

(iv) (2cq_n-1q_n-2)(2ap_n-1p_n-2 + bp_n-1q_n-2 + bp_n-2q_n-1 + 2cq_n-1q_n-2) =

4ac(p_n-1)(p_n-2)(q_n-1)(q_n-2) +
2bc(p_n-1)(q_n-1)(q_n-2)² +
2bc(p_n-2)(q_n-1)²(q_n-2) +
4c²(q_n-1)²(q_n-2)²

(v) -(4a(p_n-1)²)[a(p_n-2)² + bp_n-2q_n-2 + c(q_n-2)² ] =

-4a²(p_n-1)²(p_n-2)² + -4ab(p_n-1)²(p_n-2)(q_n-2) + -4ac(p_n-1)²(q_n-2)²

(vi) -(4bp_n-1q_n-1)[a(p_n-2)² + bp_n-2q_n-2 + c(q_n-2)² ] =

-4ab(p_n-1)(p_n-2)²(q_n-1) +
-4b²(p_n-1)(p_n-2)(q_n-1)(q_n-2) +
-4bc(p_n-1)(q_n-1)(q_n-2)²

(vii) -(4c(q_n-1)²)[a(p_n-2)² + bp_n-2q_n-2 + c(q_n-2)² ] =

-4ac(p_n-2)²(q_n-1)² +
-4bc(p_n-2)(q_n-1)²(q_n-2) +
-4c²(q_n-1)²(q_n-2)²

(3) We can see that many of these terms line up and cancel out:

= [ 4a²(p_n-1)²(p_n-2)² - 4a²(p_n-1)²(p_n-2)² ]+ { from 2.i and 2.v }

[ 2ab(p_n-1)²p_n-2q_n-2 + 2ab(p_n-1)²p_n-2q_n-2 - 4ab(p_n-1)²p_n-2q_n-2] + { from 2.i, 2.ii, and 2.v }

[ 2ab(p_n-1)(p_n-2)²(q_n-1) + 2ab(p_n-1)(p_n-2)²(q_n-1) - 4ab(p_n-1)(p_n-2)²(q_n-1) ]+ { from 2.i, 2.iii, and 2.vi }

[ 2bcp_n-2(q_n-1)²q_n-2 + 2bcp_n-2(q_n-1)²q_n-2 - 4bcp_n-2(q_n-1)²q_n-2 ] + { from 2.iii, 2.iv, and 2.vii }

[ 2bcp_n-1q_n-1(q_n-2)² + 2bcp_n-1q_n-1(q_n-2)² - 4bcp_n-1q_n-1(q_n-2)² ] + { from 2.ii, 2.iv, and 2.vi }

( 4c²(q_n-1)²(q_n-2)² - 4c²(q_n-1)²(q_n-2)² ) + { from 2.iv and 2.vii }

( b²p_n-1p_n-2q_n-1q_n-2 + b²p_n-1p_n-2q_n-1q_n-2- 4b²p_n-1p_n-2q_n-1q_n-2 ) + { from 2.ii, 2.iii, and 2.vi }

(4acp_n-1p_n-2q_n-1q_n-2 + 4acp_n-1p_n-2q_n-1q_n-2) + { from 2.i and 2.iv }

[b²(p_n-1)²(q_n-2)² + b²(p_n-2)²(q_n-1)² ] + { from 2.ii and 2.iii }

[-4ac(p_n-2)²(q_n-1)² +-4ac(p_n-1)²(q_n-2)² ] { from 2.v and 2.vii }

= -2b²p_n-1p_n-2q_n-1q_n-2 + 8acp_n-1p_n-2q_n-1q_n-2 +
b²(p_n-1)²(q_n-2)² +
b²(p_n-2)²(q_n-1)² +
-4ac(p_n-2)²(q_n-1)² +
-4ac(p_n-1)²(q_n-2)² =

= b²[(p_n-1)²(q_n-2)² + (p_n-2)²(q_n-1)² - 2p_n-1p_n-2q_n-1q_n-2] - 4ac[(p_n-1)²(q_n-2)² + (p_n-2)²(q_n-1)² - 2p_n-1p_n-2q_n-1q_n-2] =

= (b² - 4ac)[(p_n-1)²(q_n-2)² + (p_n-2)²(q_n-1)² - 2p_n-1p_n-2q_n-1q_n-2] =

= (b² - 4ac)(p_n-1q_n-2 - p_n-2q_n-1)²

(4) Applying Lemma 2 from a previous blog, we get:

(b² - 4ac)(p_n-1q_n-2 - p_n-2q_n-1)²= (b² - 4ac)(±1)² =
= b² - 4ac.

QED
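
The long expansion above only uses the fact that p_n-1q_n-2 - p_n-2q_n-1 is 1 or -1, so it can be spot-checked with exact integer arithmetic on random inputs. The Python sketch below (names are mine) picks random integers a, b, c and random positive partial quotients, builds the convergents, and compares the two discriminants.

```python
import random


def transformed(a, b, c, p1, q1, p2, q2):
    """A_n, B_n, C_n with p1 = p_{n-1}, q1 = q_{n-1}, p2 = p_{n-2}, q2 = q_{n-2}."""
    A = a * p1 * p1 + b * p1 * q1 + c * q1 * q1
    B = 2 * a * p1 * p2 + b * p1 * q2 + b * p2 * q1 + 2 * c * q1 * q2
    C = a * p2 * p2 + b * p2 * q2 + c * q2 * q2
    return A, B, C


def discriminant_preserved(trials=1000, seed=0):
    """True if B_n^2 - 4*A_n*C_n == b^2 - 4*a*c in every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (rng.randint(-9, 9) for _ in range(3))
        p2, q2, p1, q1 = 0, 1, 1, 0  # p_{-2}, q_{-2}, p_{-1}, q_{-1}
        for _ in range(rng.randint(0, 8)):
            t = rng.randint(1, 9)    # a random positive partial quotient
            p2, q2, p1, q1 = p1, q1, t * p1 + p2, t * q1 + q2
        A, B, C = transformed(a, b, c, p1, q1, p2, q2)
        if B * B - 4 * A * C != b * b - 4 * a * c:
            return False
    return True


print(discriminant_preserved())
```

A random test like this is not a proof, but it is a quick way to catch a sign or index slip in a hand expansion of this size.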

Corollary 2.2: A_n, B_n, C_n can only span over a finite set of integers.

(1) Let λ_n = p_nq_n - α(q_n)²

(2) Rearranging values gives us:
p_n = λ_n/q_n + αq_n

(3) We note that absolute(λ_n) is less than 1 since:

(a) We know that α - p_n/q_n is less than 1/q_n² and greater than -1/q_n². [From Theorem 2, here]

(b) Multiplying all sides by (q_n)² gives us that:

α*(q_n)² - p_nq_n is between -1 and 1.

(c) Multiplying all sides by -1 gives us that:

p_nq_n - α(q_n)² is between -1 and 1.

(d) This proves that λ_n is between -1 and 1 or rather absolute(λ_n) is less than 1.

(4) Inserting #2 into the formula for A_n in Lemma 2 gives us:

A_n = a( λ_n-1/q_n-1 + αq_n-1)² + b( λ_n-1/q_n-1 + αq_n-1)q_n-1 + c(q_n-1)²

Working this through and rearranging values gives us:

A_n = (q_n-1)²(aα² + bα + c) + 2aαλ_n-1 + a(λ_n-1)²/(q_n-1)² + bλ_n-1

Since we know that aα² + bα + c = 0 (since α is a quadratic number based on a,b,c), we get:

A_n = 2aαλ_n-1 + a(λ_n-1)²/(q_n-1)² + bλ_n-1

Since absolute(λ_n-1) is less than 1, this tells us that:

absolute(A_n) is less than 2*absolute(aα) + absolute(a) + absolute(b).

To understand the details for why this is true, apply the triangle inequality to the three terms of A_n:

absolute(A_n) ≤ 2*absolute(aα)*absolute(λ_n-1) + absolute(a)*(λ_n-1)²/(q_n-1)² + absolute(b)*absolute(λ_n-1)

Since absolute(λ_n-1) is less than 1 and q_n-1 is an integer that is at least 1, each term on the right is no larger than the corresponding term of 2*absolute(aα) + absolute(a) + absolute(b), and the first term is strictly smaller. This holds whatever the signs of a and b are, so:

absolute(A_n) is less than 2*absolute(aα) + absolute(a) + absolute(b)

Since C_n = A_n-1, we know that:

absolute(C_n) is less than 2*absolute(aα) + absolute(a) + absolute(b)

Finally, since (B_n)² = 4A_nC_n + b²-4ac, we know that:

(B_n)² ≤ 4[ 2*absolute(aα) + absolute(a) + absolute(b)]² + absolute(b²-4ac)

(5) The important idea from all this is that there are only a finite number of integers that A_n, B_n, and C_n can equal, that is, only those integers which make up the ranges in #4.

QED
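
To see the bounds of Corollary 2.2 in action, the Python sketch below computes A_n, B_n, C_n for α = √d (so a=1, b=0, c=-d) over many values of n and counts how many distinct combinations occur. The function names are mine, and the partial quotients of √d come from the standard integer recurrence rather than from anything in this post.

```python
from math import isqrt, sqrt


def sqrt_cf_terms(d, count):
    """First `count` partial quotients of sqrt(d); d must not be a perfect square."""
    a0 = isqrt(d)
    m, den, a = 0, 1, a0
    terms = [a0]
    while len(terms) < count:
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        terms.append(a)
    return terms


def triples(d, count):
    """(A_n, B_n, C_n) for n = 1..count, starting from x^2 - d = 0."""
    p2, q2, p1, q1 = 0, 1, 1, 0  # p_{-2}, q_{-2}, p_{-1}, q_{-1}
    out = []
    for t in sqrt_cf_terms(d, count):
        p2, q2, p1, q1 = p1, q1, t * p1 + p2, t * q1 + q2
        A = p1 * p1 - d * q1 * q1
        B = 2 * p1 * p2 - 2 * d * q1 * q2
        C = p2 * p2 - d * q2 * q2
        out.append((A, B, C))
    return out


for d in (3, 7, 13, 61):
    found = triples(d, 200)
    bound = 2 * sqrt(d) + 1  # 2*|a*alpha| + |a| + |b| with a=1, b=0
    print(d, len(set(found)), max(abs(A) for A, _, _ in found), bound)
```

Each printed line gives d, the number of distinct combinations among the first 200, the largest absolute(A_n) seen, and the bound from #4.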

Theorem 2: A continued fraction is periodic if and only if it represents an irrational quadratic number.

(1) In Lemma 1, it was shown that a periodic continued fraction represents a quadratic number. Since a periodic continued fraction is infinite, Theorem 1 tells us that this number is irrational.

(2) Now, I will show that all quadratic irrational numbers are representable as periodic continued fractions.

(3) Let α be a quadratic number (that is, an irrational that is a solution to a quadratic equation)

(4) We know that there are an infinite number of values a_i and α_i that make it up (from Theorem 1 above, otherwise it would not be irrational)

(5) But by Corollary 2.2, there are only a finite number of combinations A_n, B_n, C_n that describe the equation that α_n solves.

(6) This means that it is inevitable that some specific combination A_n, B_n, C_n occurs for infinitely many values of n.

(7) For each specific combination of A_n, B_n, C_n, there are at most two possible real values that satisfy the equation. (This observation comes from the solution to the quadratic equation, found here)

(8) So, eventually, some α_i repeats: there exist i and j ≥ 1 with α_i = α_i+j. Since every later a_n and α_n is determined by α_i, the continued fraction repeats with period j from a_i onward.

QED

Lemma 3: If a value is representable by a finite continued fraction with an odd number of elements, then it is representable by an even number of elements and likewise if it is representable by an even number of elements, it can be represented by an odd number of elements.

(1) Let [a₀, a₁, ..., a_n] be a finite continued fraction of n+1 elements.

(2) Now, if a_n ≥ 2, then:
[ a₀, a₁, ..., a_n] = [ a₀, a₁, ..., (a_n) - 1, 1 ]

(3) Otherwise, if a_n = 1, then:
[ a₀, a₁, ..., a_n-1, 1 ] = [a₀, a₁, ... , (a_n-1) + 1 ]

where a_n-1 is the element just before a_n.

QED
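
Both cases of Lemma 3 can be checked with exact fractions. In the Python sketch below (function names are mine), `other_parity` applies step #2 or step #3 and `value` evaluates a finite continued fraction. A single-element list is handled by step #2, which is my own choice since [a₀] = [a₀ - 1, 1] for any integer a₀.

```python
from fractions import Fraction


def value(terms):
    """Exact value of the finite continued fraction [a0, a1, ..., an]."""
    x = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        x = a + 1 / x
    return x


def other_parity(terms):
    """Same value, written with one more or one fewer element."""
    terms = list(terms)
    if len(terms) == 1 or terms[-1] >= 2:
        return terms[:-1] + [terms[-1] - 1, 1]  # step 2: split the last element
    return terms[:-2] + [terms[-2] + 1]         # step 3: absorb a trailing 1


original = [4, 2, 6, 7]
longer = other_parity(original)
print(original, longer, value(original), value(longer))
print(other_parity(longer))
```

Applying `other_parity` twice returns the original list, which shows the two steps undo each other.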

Fermat's Last Theorem

Friday, December 02, 2005

Pell's Equation: The Solution

Monday, November 28, 2005

Continued Fractions: Basic Properties
