Fermat's Last Theorem

Sunday, October 21, 2007

Gottfried Wilhelm Leibniz

Gottfried Wilhelm Leibniz was born on July 1, 1646 in Leipzig, Saxony which, today, is part of Germany. His father was a professor of moral philosophy at the University of Leipzig which had opened in Saxony in 1409. His father died when Gottfried was six years old. Young Leibniz inherited his father's library.

Leibniz was raised by his mother. At age seven, he entered the Nicolai School in Leipzig. Leibniz immersed himself in self-study in an effort to be able to read his father's books. By the age of 12 he had grown very advanced in Latin and had begun to study Greek. He would later write about his dissatisfaction with the logic of Aristotle.

At 14, he entered the University of Leipzig where he most likely studied philosophy, mathematics, rhetoric, Latin, Greek, and Hebrew. During the summer of 1663, he visited the University of Jena where he gained his first exposure to fundamental mathematical ideas such as proofs. Leipzig at the time was not very strong in mathematics so it is believed that Jena played a very important role in the development of his understanding of mathematics. Leibniz was greatly influenced by the ideas of Erhard Weigel who believed that all the universe could be viewed in terms of numbers.

Leibniz received his bachelor's degree in law and a master's in philosophy from Leipzig. Despite this great progress, when he presented his thesis for his doctorate, his advancement was denied. The details for why this occurred are unclear. Normally, this meant that Leibniz would need to wait a year before resubmitting his doctoral thesis. Instead, Leibniz presented his doctoral thesis to the University of Altdorf where he gained his doctorate.

Leibniz was offered a position at the University of Altdorf which he decided not to accept. Later, he made the acquaintance of Baron Johann Christian von Boineburg. He soon had become "secretary, assistant, librarian, lawyer, and advisor to the Baron and his family." (E J Aiton, Leibniz: A biography, Bristol-Boston, 1984) At this point, Leibniz's interests and works rested primarily in literary ambitions. One writer noted that during this period of Leibniz's life, he would have passed as "a typical late renaissance humanist." (G M Ross, Leibniz, Oxford, 1984)

In 1672, Leibniz was sent by Boineburg to meet with the French in an effort to dissuade Louis XIV from invading the German regions. Leibniz put forward a plan of invading Egypt that was very similar to the plan that Napoleon would later carry out. At this point, Leibniz met the mathematicians and scientists of Paris. In particular, he studied mathematics and science under Christiaan Huygens from the Netherlands. His ventures in math and science in Paris were more successful than his political efforts.

Baron Boineburg died on December 15 of 1672. The Baron's family continued to sponsor Leibniz. In January of 1673, Leibniz gave up on his efforts at peace in Paris and went now to England to convince the British of peace. There, he met with Hooke, Boyle, and Pell. Leibniz presented his ideas for an automatic counting machine. On April 19, 1673, Leibniz was elected as a member to the Royal Society of London. Once again, his scientific pursuits were more successful than the political ones.

In 1674, Leibniz began to take interest in the problem of infinitesimals. He corresponded with Oldenburg from the Royal Society who let him know that Newton and Gregory had found very general methods for the problem. By autumn of 1676, Leibniz had worked out much of his notation for calculus. At this time, he received a letter from Newton. In 1677, he received a second letter from Newton. In this second letter, Newton questioned whether Leibniz had stolen Newton's method. Newton pointed out that not a "single previously unsolved problem was solved." (Quoted in the MacTutor article) Later, Leibniz's notation would prove very important in the advancement of calculus.

Leibniz had hoped to join the Academy of Sciences in Paris but no opportunity to join came his way. In October of 1676, he accepted a position as librarian and Court Councillor to the Duke of Hanover, Johann Friedrich.

During this time, Leibniz worked on many outside projects. He worked unsuccessfully on developing wind-powered pumps to drain water from the Harz mountain mines. From these efforts, he developed his knowledge of geology and proposed a theory that the earth was at one time molten lava.

By 1679, he had developed a "binary system of arithmetic." He also worked on the problem of determinants, which he wrote about around 1684.

In 1680, the Duke of Hanover died and Leibniz began working for his brother Ernst August. Leibniz began to work on the family tree which included the House of Brunswick. As part of this effort, he traveled to Bavaria, Austria, and Italy. In each of these places, he met with scholars and other famous writers. He would publish the results of his research in nine large volumes. Still, he never completed the work that Ernst August had requested.

In 1684 and 1686, Leibniz began to publish his theory of calculus. The next year, in 1687, Newton published his Principia.

In 1710, Leibniz published Theodicee in which he argued that even if the world is not perfect, it is the best possible world. In 1714, he published his famous work Monadologia.

Unfortunately, it was the dispute with Newton that filled his last years. The main argument against Leibniz was the two letters that he had received from Oldenburg. Leibniz claimed that there was not enough information presented in the letters to give him the methods he found. In 1711, a paper by Keill was read to the Royal Society which accused Leibniz of plagiarism. In 1713, the Royal Society investigated the issue and ruled against Leibniz. Leibniz was never asked to give his version of events and Newton himself wrote the final report.

In 1714, George Ludwig from the House of Hanover became King of England. Leibniz was not invited to join him.

Leibniz died on November 14, 1716. In his life, he had corresponded with over 600 figures of his time. The Berlin Academy of Science emerged as the result of Leibniz's work. While it is clear that Newton invented calculus first, it is also clear that Leibniz extended Newton's work in very important ways that Newton did not appreciate.

References

"Gottfried Wilhelm von Leibniz", MacTutor
"Gottfried Leibniz", Wikipedia.org

Sunday, August 12, 2007

Newton's Identities: Euler's Generalization

As mentioned earlier, Sir Isaac Newton's main purpose in coming up with his "identities" (see here for introduction to Newton's Identities) was to find a formula for determining whether two cubic equations possessed a common root.

Leonhard Euler was able to find a very general solution for finding this formula for any two equations of any degree. This general solution is today known as a resultant.

In today's blog, I will show the general solution for the resultant and then show that this equation has the properties that it equals 0 if and only if two equations have at least one solution in common.

The content in today's blog is taken from Galois' Theory of Algebraic Equations by Jean-Pierre Tignol.

Definition 1: Resultant

Let P = a_nXⁿ + a_n-1X^n-1 + ... + a₁X + a₀ where a_n ≠ 0

Let Q = b_mX^m + b_m-1X^m-1 + ... + b₁X + b₀ where b_m ≠ 0

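The resultant of P and Q is the determinant of the following (m+n) x (m+n) matrix, whose first m columns hold the coefficients of P (each column shifted down one row from the previous one) and whose last n columns hold the coefficients of Q in the same manner:

\[
R = \begin{pmatrix}
a_n & 0 & \cdots & 0 & b_m & 0 & \cdots & 0 \\
a_{n-1} & a_n & \cdots & 0 & b_{m-1} & b_m & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_0 & 0 & 0 & \cdots & b_0
\end{pmatrix}
\]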

For review of computing the determinant using the method of cofactor expansion, see here.

Here is the theorem justifying this construction:

Theorem: Common Roots of Two Polynomials

Let P,Q be the polynomials described in definition.

Let R be the resultant of P,Q

R = 0 if and only if P,Q have a common root

Proof:

(1) Assume that P,Q have a common root u

(2) Then (x - u) divides both P,Q and there exists P₁, Q₁ such that:

P = (x - u)P₁

Q = (x - u)Q₁
and degree of P₁ is less than degree of P [See Theorem here for details]
and degree of Q₁ is less than degree of Q [See Theorem here for details]

(3) We can also see that:

Q₁ = Q/(x - u)
P₁ = P/(x - u)

(4) So that:

PQ₁ = (x - u)P₁*Q/(x - u) = QP₁

(5) We can see that P,Q,P₁,Q₁ are all polynomials and:

There exists a_i such that:

P = a_nxⁿ + a_n-1x^n-1 + ... + a₁x + a₀

where a_n ≠ 0.

There exists b_j such that:

Q = b_mx^m + b_m-1x^m-1 + ... + b₁x + b₀

where b_m ≠ 0

There exists z_k such that:

P₁ = -(z₁x^n-1 + z₂x^n-2 + ... + z_n-1x + z_n)

where z₁ ≠ 0

There exists y_l such that:

Q₁ = y₁x^m-1 + y₂x^m-2 + ... + y_m-1x + y_m

where y₁ ≠ 0

(6) From step #4, we can see that:

PQ₁ - QP₁ = 0

(7) In the expression in step #6, we can see that the coefficient for x^k = ∑ (i+j=k) (a_iy_m-j) + ∑ (i+j=k) (b_iz_n-j) since:

(a) For each term i in P, a_i is the coefficient for xⁱ

(b) For each term j in Q₁, y_m-j is the coefficient for x^j.

Consider 1 = m - (m-1); 2 = m - (m - 2); m-1 = m - (1); m = m - (0)

(c) For PQ₁, the coefficient is ∑ (i+j=k) (a_iy_m-j)

(d) For each term i in Q, b_i is the coefficient for xⁱ

(e) For each term j in P₁, -z_n-j is the coefficient for x^j.

Consider 1 = n - (n-1); 2 = n - (n-2); n-1 = n - (1); n = n - (0)

(f) For QP₁, the coefficient is ∑ (i+j=k) (-b_iz_n-j)

(g) For PQ₁ - QP₁, then, the coefficient is:

∑(i+j=k) (a_iy_m-j + b_iz_n-j) = ∑ (i+j=k) (a_iy_m-j) + ∑(i+j=k) (b_iz_n-j)

(8) We can further simplify this expression by defining s,t such that:

s = m-j
t = n-j

so that:

j = m - s = n - t

and:

i + j =k → i + m - s = k → i - s = k - m

i + j = k → i + n - t = k → i - t = k - n

which gives us:

∑ (i+j=k) (a_iy_m-j) + ∑(i+j=k) (b_iz_n-j) =

∑ (i - s = k - m) (a_iy_s) + ∑ (i - t = k - n) (b_iz_t)

(9) Now since the degree of P is n, the degree of P₁ is n-1, the degree of Q is m, and the degree of Q₁ is m-1, it follows that the degree of PQ₁ = m+n-1 and the degree of QP₁ = m+n-1.

(10) So, we can use the result in step #6 and the result in step #9 to build m + n linear equations where each linear equation represents a different value of k since the sum of coefficients for each power of x must equal 0.

for k = m + n - 1, we have:

a_ny₁x^(m+n-1) + b_mz₁x^(m+n-1) = 0

for k = m + n - 2, we have:

a_ny₂x^(m+n-2) + a_n-1y₁x^(m+n-2) + b_mz₂x^(m+n-2) + b_m-1z₁x^(m+n-2) = 0

...

for k = 1, we have:

a₁y_mx¹ + a₀y_m-1x¹ + b₁z_nx¹ + b₀z_n-1x¹ = 0

for k = 0, we have:

a₀y_m + b₀z_n = 0

(11) Factoring out x^k from each of these equations, gives us:

for k = m + n - 1, we have:

a_ny₁ + b_mz₁ = 0

for k = m + n - 2, we have:

a_ny₂ + a_n-1y₁ + b_mz₂ + b_m-1z₁ = 0

...

for k = 1, we have:

a₁y_m + a₀y_m-1 + b₁z_n + b₀z_n-1 = 0

for k = 0, we have:

a₀y_m + b₀z_n = 0

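(12) For each of these linear equations, we can view the unknowns as consisting of y_s and z_t so that we get a homogeneous system of m+n linear equations in the m+n unknowns y₁, ..., y_m, z₁, ..., z_n. The determinant of the coefficient matrix of this system is the resultant from Definition 1.

(13) Now, it is clear that this system is none other than:

RX = 0

where R is the coefficient matrix of the system and X is the column vector (y₁, ..., y_m, z₁, ..., z_n).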

(14) We also know that there exists a nontrivial solution X = (y₁, ..., y_m, z₁, ..., z_n), since from step #5 above, y₁ ≠ 0 and z₁ ≠ 0.

(15) Since a nontrivial solution exists, we know that det R = 0 [See Theorem 6, here]

(16) Assume that det(R)=0

(17) It follows that the homogeneous system of linear equations RX = 0 has a nontrivial solution. [See Theorem 6, here]

(18) We can label this nontrivial solution in the same way as in step #14 above: X = (y₁, ..., y_m, z₁, ..., z_n)

(19) Multiplying out R with the nontrivial solution gives us the same set of m+n equations as step #11 above:

for the first equation, we have:

a_ny₁ + b_mz₁ = 0

for the second equation, we have:

a_ny₂ + a_n-1y₁ + b_mz₂ + b_m-1z₁ = 0

...

for the (m+n-1)th equation, we have:

a₁y_m + a₀y_m-1 + b₁z_n + b₀z_n-1 = 0

for the (m+n)th equation, we have:

a₀y_m + b₀z_n = 0

(20) Since these are exactly the equations of step #11, we can use the values y_s and z_t to build polynomials P₁ and Q₁ as in step #5 (now of degree at most n-1 and m-1), and running step #6 and step #7 in reverse gives PQ₁ - QP₁ = 0, that is, PQ₁ = QP₁. Since the solution is nontrivial, P₁ and Q₁ are not both zero, and since PQ₁ = QP₁ with P,Q nonzero, neither of them is zero.

(21) To complete this proof, let's assume that P and Q are relatively prime, that is, they do not have any root in common.

(22) Using PQ₁ = QP₁, we see that P divides QP₁. Since P,Q are relatively prime, P must divide P₁.

(23) But this is impossible since P₁ is nonzero and has a lower degree than P.

(24) Therefore, we have a contradiction and we can conclude that P and Q have a root in common.

QED
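As a sanity check on the theorem, here is a minimal Python sketch (mine, not from Tignol's book) that builds the matrix of the linear system from steps #10 to #12 and computes its determinant with exact fractions. The function names `sylvester_matrix`, `determinant`, and `resultant` are my own labels, and the column layout is an assumption based on those steps.

```python
from fractions import Fraction


def sylvester_matrix(p, q):
    """Build the (m+n) x (m+n) matrix for P (degree n) and Q (degree m).

    p and q list coefficients from the leading term down, for example
    x^2 - 3x + 2 is [1, -3, 2]. Assumed layout: the first m columns hold
    shifted copies of P's coefficients (they multiply y_1..y_m) and the
    last n columns hold shifted copies of Q's coefficients (z_1..z_n).
    """
    n, m = len(p) - 1, len(q) - 1
    size = m + n
    rows = [[Fraction(0)] * size for _ in range(size)]
    for col in range(m):
        for i, coeff in enumerate(p):
            rows[col + i][col] = Fraction(coeff)
    for col in range(n):
        for i, coeff in enumerate(q):
            rows[col + i][m + col] = Fraction(coeff)
    return rows


def determinant(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for c in range(size):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, size) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap flips the sign
        det *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return det


def resultant(p, q):
    return determinant(sylvester_matrix(p, q))


shared = resultant([1, -3, 2], [1, -5, 6])     # (x-1)(x-2) and (x-2)(x-3)
disjoint = resultant([1, -3, 2], [1, -7, 12])  # (x-1)(x-2) and (x-3)(x-4)
```

With these inputs, `shared` comes out as 0 because both polynomials have the root 2, while `disjoint` comes out as 12, which is nonzero as the theorem predicts.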

References

Jean-Pierre Tignol, Galois' Theory of Algebraic Equations, World Scientific, 2001

Sunday, August 05, 2007

Newton's Identities: Newton's Purpose

When Newton came up with his identities, he had a specific goal in mind. Even though his formula can be generalized to work with any degree of the polynomial (see here for details), Newton was especially interested in cubic polynomials. He wanted to figure out a formula for determining whether two polynomials had a common root.

Newton created his identities for this purpose. In today's blog, I will show how Newton's identities can be used to determine if two polynomials of degree 3 have a common root. In a future blog, I will show Euler's generalization of this idea which led to the concept of the resultant.

Consider two cubic equations:

f(x) = x³ + bx² + cx + d

g(x) = x³ + Bx² + Cx + D

Let r,s,t be the roots of f(x).

It is clear that f(x) and g(x) have common roots if and only if g(r)*g(s)*g(t) = 0.

Here's Newton's insight:

Theorem: Common Roots in Cubic Equations

For any two cubic equations:

f(x) = x³ + bx² + cx + d

g(x) = x³ + Bx² + Cx + D

There exists an equation P(b,c,d,B,C,D) which equals 0 if and only if the two cubic equations have a common root.

Proof:

(1) Let r,s,t be the three roots for f(x). [We know this is the case from the Fundamental Theorem of Algebra, see here, and also from the general formula for cubic equations, see here]

(2) Assume that g(x) shares a root with f(x) so that either g(r)=0 or g(s)=0 or g(t)=0.

(3) So, clearly:

g(r)*g(s)*g(t) = 0

(4) This means that:

[r³ + Br² + Cr + D][s³ + Bs² + Cs + D][t³ + Bt² + Ct + D] = 0

(5) Now we can expand this product into a sum of terms:

g(r)*g(s)*g(t) = a₁*r³s³t³ + a₂r³s³t² + ... + a₃₄D³

(6) Since the product is symmetric in r, s, t, that is, we can switch any two of them and get the same result, it is clear that all rearrangements of a given form have the same coefficient.

For example, consider every term of the form: r³s²t.

r³*Bs²*Ct = BC*(r³s²t)

r³*Cs*Bt² = BC*(r³t²s)

Br²*s³*Ct = BC*(s³r²t)

Cr*s³*Bt² = BC*(s³t²r)

Br²*Cs*t³ = BC*(t³r²s)

Cr*Bs²*t³ = BC*(t³s²r)

(7) So, by grouping the terms by form (and I will use r^xs^yt^z to refer to the sum of all combinations of this form), we have:

g(r)g(s)g(t) = r³s³t³ + B*r³s³t² + C*r³s³t + ... + C³rst + D³

(8) It turns out that each of these sums is covered by one of Newton's identities. Besides D³, there are 19 possible combinations. If we combine this with Newton's identities (see here), we can restate each sum r^xs^yt^z as an expression in the coefficients b,c,d of f(x).

For example:

(every r³s²t) = bcd - 3d²

So, the term itself is:

BC*(bcd - 3d²) = BCbcd - 3BCd²

(9) Since the entire equation can be expressed in terms b,c,d,B,C,D, we have shown that there exists an equation P(b,c,d,B,C,D) that equals 0 if and only if f(x) and g(x) have a common root.

So, we get:

P = -d³ + Bcd² - B²bd² + B³d² - Cc²d + 2Cbd² + BCbcd - 3BCd² - B²Ccd - C²b²d + 2C²cd + BC²bd - C³d - 3Dbcd + 3Dd² + 2DBb²d + DBcd + seventeen more terms obtained from these by reversing the sign and interchanging B with b, C with c, and D with d, that is D³ - bCD² + b²BD² - ...

QED
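The starting point of the proof, that g(r)*g(s)*g(t) = 0 exactly when g shares a root with f, is easy to try out. Below is a small Python sketch of my own (the helper names are hypothetical and not from Edwards) that builds monic cubics from chosen integer roots and multiplies out g over the roots of f.

```python
def poly_eval(coeffs, x):
    """Horner evaluation; coeffs run from the leading term down."""
    value = 0
    for coeff in coeffs:
        value = value * x + coeff
    return value


def monic_cubic_from_roots(r, s, t):
    """Return [1, b, c, d] for (x - r)(x - s)(x - t)."""
    return [1, -(r + s + t), r * s + r * t + s * t, -(r * s * t)]


def product_over_roots(g, roots):
    """Compute g(r)*g(s)*g(t) for the given roots r, s, t of f."""
    result = 1
    for root in roots:
        result *= poly_eval(g, root)
    return result


roots_of_f = (1, 2, 3)
g_shared = monic_cubic_from_roots(2, 5, 7)    # shares the root 2 with f
g_disjoint = monic_cubic_from_roots(4, 5, 6)  # no root in common with f

with_common_root = product_over_roots(g_shared, roots_of_f)
without_common_root = product_over_roots(g_disjoint, roots_of_f)
```

Here `with_common_root` is 0 and `without_common_root` is -8640. The point of Newton's approach is that the same number can be computed from the coefficients alone, without knowing r, s, t; this sketch takes the shortcut of starting from known roots.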

References

Harold M. Edwards, Galois Theory, Springer, 1984.

Friday, July 27, 2007

Newton's Identities: Proof of Newton's Formula

In a previous blog, I talked about Newton's Identities and the formula that Sir Isaac Newton used to generate these identities.

In today's blog, I will show a proof for Newton's formula. Interestingly, the proof revolves around a peculiar definition. The proof is taken from Jean-Pierre Tignol's Galois' Theory of Algebraic Equations and the definition in question is denoted as τ(a,b).

If you are reading Tignol's book, you may wonder why I am using the same notation (σ_k and s_k) in the opposite way. I do this in order to stay consistent with Harold Edwards who I referenced in my previous blogs.

τ(a,b) is defined as the sum of the unique a-combinations of the n roots of a polynomial, labeled x₁, x₂, ..., x_n, where one of the roots is raised to the power of b [For review of why a polynomial of degree n necessarily has n roots, see here for discussion on the fundamental theorem of algebra which guarantees this result]. This means that there are two different conditions that need to be considered. When b ≥ 2, each choice of which root is raised to the power of b gives a different term, so there are n*C(n-1,a-1) terms in this sum. When b = 1, there are only C(n,a) possible terms in this sum.

If a ≥ 2, then n*C(n-1,a-1) is greater than C(n,a) since C(n,a) = (n/a)*C(n-1,a-1)

[Here's the reason why: (n/a)*C(n-1,a-1) = (n/a)*(n-1)!/[(n - 1 -[a-1])!(a-1)!] = n!/[(a)[n - 1 - a + 1]!(a-1)!] = n!/([a!][n-a]!) = C(n,a)]

With these two conditions in mind, we can now look at the definitions that are needed for the proof.

Definition 1: τ(a,b)

for b ≥ 2:

τ(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) x_i^bx_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)}

for b = 1:

τ(a,1) = ∑ (j=1,C(n,a)) x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a)}

where:

C(n-1,a-1) = (n-1)!/{(a-1)!(n-1-[a-1])!}

f_j(x) is a function where:

the range of f_j(x) is {1, 2, ..., n}
f_j is such that for all x,y if x ≠ y, then f_j(x) ≠ f_j(y)
each f_j is a unique (a-1)-way selection of n-1 integers (not including i) so that we can assume that f_j(1) is less than f_j(2), etc. [since there is only one permutation of any selection that has this characteristic]

Example:

For n=3 with {x₁, x₂, x₃ }

C(n-1,a-1) = C(2,1) = 2!/1!(2-1)! = 2

So there are n*C(n-1,a-1) =3*2 = 6 terms

τ(2,2) = x₁²x₂ + x₁²x₃ + x₂²x₁ + x₂²x₃ + x₃²x₁ + x₃²x₂
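To make Definition 1 concrete, here is a short Python sketch of my own that enumerates the terms of τ(a,b) directly. The function name `tau` and the sample values 2, 3, 5 are just illustrative choices.

```python
from itertools import combinations
from math import prod


def tau(a, b, xs):
    """Sum of products of a distinct roots, one of them raised to the power b.

    For b = 1 every root has exponent 1, so each a-combination is counted once.
    For b >= 2 each root takes a turn as the one raised to b and is paired
    with every (a-1)-combination of the remaining roots.
    """
    if b == 1:
        return sum(prod(combo) for combo in combinations(xs, a))
    total = 0
    for i, x in enumerate(xs):
        others = [y for j, y in enumerate(xs) if j != i]
        for combo in combinations(others, a - 1):
            total += x ** b * prod(combo)
    return total


sample = [2, 3, 5]
example_value = tau(2, 2, sample)  # the six-term sum from the example above
```

With x₁ = 2, x₂ = 3, x₃ = 5 the six terms are 12 + 20 + 18 + 45 + 50 + 75, so `example_value` is 220.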

Definition 2: s_k

s_k = ∑ (i=1, n) x_i^k

Example:

For n=3 with {x₁, x₂, x₃},

s₄ = x₁⁴ + x₂⁴ + x₃⁴

Lemma 1: τ(1,b) = s_b

Proof:

(1) τ(1,b) = ∑ (i=1,n) x_i^b using Definition 1 above.

(2) Then τ(1,b) = s_b using Definition 2 above.

QED

Definition 3: σ_k

σ_k = ∑ (i=1,C(n,k)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(k)}

where:

C(n,k) = (n)!/{(k)!(n-k)!}

f_i(x) is a function where:

the range of f_i(x) is {1, 2, ..., n}
f_i is such that for x,y if x ≠ y, then f_i(x) ≠ f_i(y)
each f_i is a unique k-way selection of n integers so that we can assume that f_i(1) is less than f_i(2), etc. [since there is only one permutation of any selection that has this characteristic]

Example: σ₂

For n=3 with {x₁, x₂, x₃ }

C(n,k) = C(3,2) = 3!/2!(3-2)! = 3

So there are 3 terms

σ₂ = x₁*x₂ + x₁*x₃ + x₂*x₃

Lemma 2: τ(a,1) = σ_a

Proof:

This is clear since for b=1, τ(a,1) and σ_a have the same definition.

QED

Lemma 3:

For a less than n and b greater than 2:

τ(a,b) = σ_as_b-1 - τ(a+1,b-1)

Proof:

(1) τ(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) x_i^bx_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)} [See Definition 1 above]

= ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) x_i^(b-1)x_ix_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)}

(2) σ_k = ∑ (i=1,C(n,k)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(k)} [See Definition 3 above]

(3) s_b-1 = ∑ (i=1, n) x_i^(b-1)

(4) σ_as_b-1 = [ ∑ (i=1,C(n,a)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(a)}][ ∑ (j=1, n) x_j^(b-1)]

(5) τ(a+1,b-1) = ∑ (i=1, n) ∑ (j=1, C(n-1,a)) x_i^(b-1)x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a)}

(6) Now each term in step #4 consists of the following form:

x_j^b-1*x_{f_i(1)}*...*x_{f_i(a)}

(7) Each term can then be categorized into two cases:

Case I: x_j is the same as one of the values in x_{f_i(1)}, x_{f_i(2)}, ..., x_{f_i(a)}

Case II: x_j is not the same as any of the values in x_{f_i(1)}, x_{f_i(2)}, ..., x_{f_i(a)}

(8) If C₁ is the sum of all the terms that fall in Case I and C₂ is the sum of all terms that fall into Case II, we can see that:

σ_as_b-1 = C₁ + C₂

(9) Now, it is clear that for a less than n and b greater than 2:

C₁ = τ(a,b) [See step #1 above]

(10) It is also clear that for a less than n and b greater than 2:

C₂ = τ(a+1,b-1)

(11) Since C₁ = σ_as_b-1 - C₂, it follows that:

τ(a,b) = σ_as_b-1 - τ(a+1,b-1)

QED

Lemma 4:

if a is less than n:

τ(a,2) = σ_as₁ - (a+1)σ_a+1

Proof:

(1) τ(a,2) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) x_i²x_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)} = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) [x_ix_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(a-1)} ]*x_i

(2) σ_a = ∑ (i=1,C(n,a)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(a)}

(3) s₁ = ∑ (i=1, n) x_i

(4) σ_as₁= [∑ (i=1,C(n,a)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(a)}][∑ (j=1, n) x_j]

(5) Now each term in step #4 consists of the following form:

x_j*x_{f_i(1)}*...*x_{f_i(a)}

(7) Each term can then be categorized into two cases:

Case I: x_j is the same as one of the values in x_{f_i(1)}, x_{f_i(2)}, ..., x_{f_i(a)}

Case II: x_j is not the same as any of the values in x_{f_i(1)}, x_{f_i(2)}, ..., x_{f_i(a)}

(8) If C₁ is the sum of all the terms that fall in Case I and C₂ is the sum of all terms that fall into Case II, we can see that:

σ_as₁ = C₁ + C₂

(9) Now, it is clear that for a less than n:

C₁ = τ(a,2) [See step #1 above]

(10) It is also clear that for a less than n:

C₂ = (a+1)σ_a+1 since:

(a) σ_(a+1) = ∑ (i=1,C(n,a+1)) x_{f_i(1)}*x_{f_i(2)}*...*x_{f_i(a)}*x_{f_i(a+1)}

(b) Now, removing all the terms that are in Case I, it is clear that there are a+1 ways to reach each term in step (a).

For example, the combination x₁*x₂*...*x_a*x_a+1

This can be reached from:

x₁ * (x₂x₃*...*x_a+1)
x₂ * (x₁x₃*...*x_a+1)
...
x_a+1 * (x₁x₂*...*x_a)

We can make the same argument for each possible term in C₂

(11) Since C₁ = σ_as₁ - C₂, it follows that:

τ(a,2) = σ_as₁ - (a+1)σ_a+1

QED

Lemma 5:

if b ≥ 2, then:

τ(n,b) = σ_ns_b-1

Proof:

(1) τ(n,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,n-1)) x_i^bx_{f_j(1)}*x_{f_j(2)}*...*x_{f_j(n-1)} [See Definition 1 above with a = n]

(2) Since b ≥ 2, we have:

τ(n,b) = x₁^bx₂*...*x_n + x₂^bx₁*...*x_n + ... + x_n^bx₁*...*x_n-1 =

= x₁^b-1x₁x₂*...*x_n + x₂^b-1x₁x₂*...*x_n + ... + x_n^b-1x₁*...*x_n =

= (x₁^b-1 + x₂^b-1 + ... + x_n^b-1)(x₁x₂*...x_n)

(3) σ_n = x₁*x₂*...*x_n

(4) s_b-1 = x₁^b-1 + x₂^b-1 + ... + x_n^b-1

(5) It follows from step #2, that:

τ(n,b) = σ_ns_b-1

QED
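Lemmas 3, 4, and 5 can be spot-checked on a concrete set of roots. The following is a rough Python sketch of my own, assuming four arbitrary integer roots; the helper names `sigma`, `power_sum`, and `tau` are just labels for Definitions 3, 2, and 1. A numeric check like this is not a proof, but it catches sign and index slips.

```python
from itertools import combinations
from math import prod


def sigma(k, xs):
    """Elementary symmetric sum: products of k distinct roots."""
    return sum(prod(combo) for combo in combinations(xs, k))


def power_sum(k, xs):
    """s_k: the sum of the k-th powers of the roots."""
    return sum(x ** k for x in xs)


def tau(a, b, xs):
    """tau(a,b) per Definition 1 (b = 1 reduces to sigma_a)."""
    if b == 1:
        return sigma(a, xs)
    total = 0
    for i, x in enumerate(xs):
        others = xs[:i] + xs[i + 1:]
        for combo in combinations(others, a - 1):
            total += x ** b * prod(combo)
    return total


xs = [2, 3, 5, 7]  # arbitrary sample roots, so n = 4
n = len(xs)

# Lemma 3: tau(a,b) = sigma_a * s_(b-1) - tau(a+1,b-1) for a < n, b > 2
lemma3_ok = all(
    tau(a, b, xs) == sigma(a, xs) * power_sum(b - 1, xs) - tau(a + 1, b - 1, xs)
    for a in range(1, n)
    for b in range(3, 7)
)

# Lemma 4: tau(a,2) = sigma_a * s_1 - (a+1) * sigma_(a+1) for a < n
lemma4_ok = all(
    tau(a, 2, xs) == sigma(a, xs) * power_sum(1, xs) - (a + 1) * sigma(a + 1, xs)
    for a in range(1, n)
)

# Lemma 5: tau(n,b) = sigma_n * s_(b-1) for b >= 2
lemma5_ok = all(
    tau(n, b, xs) == sigma(n, xs) * power_sum(b - 1, xs)
    for b in range(2, 7)
)
```

All three flags should evaluate to True for any choice of roots, since the lemmas are identities in x₁, ..., x_n.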

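Theorem 6: Newton's Formula

For k ≥ 1, with σ₀ = 1 and σ_k = 0 if k is greater than n:

∑ (i=0, k-1) (-1)ⁱσ_i s_k-i + (-1)^k kσ_k = 0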

Proof:

(1) τ(1,k) = s_k [See Lemma 1 above]

(2) Using Lemma 3 above if k ≥ 3, we have:

τ(1,k) = σ₁s_k-1 - τ(2,k-1) = s_k

(3) Assuming that k-1 ≥ 3, Lemma 3 gives us:

τ(2,k-1) = σ₂s_k-2 - τ(3,k-2)

(4) Following this pattern, if k ≤ n, we eventually get the following:

s_k = σ₁s_k-1 - σ₂s_k-2 + ... + (-1)^kτ(k-1,2)

(5) Using Lemma 4 above:

τ(k-1,2) = σ_k-1s₁ - (k)σ_k

(6) Thus, we have:

s_k = σ₁s_k-1 - σ₂s_k-2 + ... + (-1)^kσ_k-1s₁ + (-1)^k+1kσ_k

(7) Subtracting s_k from both sides and then multiplying by -1 gives us:

0 = s_k - σ₁s_k-1 + σ₂s_k-2 + ... + (-1)^k-1σ_k-1s₁ + (-1)^kkσ_k

(8) If k is greater than n, then we eventually get to:

s_k = σ₁s_k-1 - σ₂s_k-2 + ... + (-1)ⁿ⁺¹τ(n,k+1-n)

(9) Using Lemma 5 above gives us:

s_k = σ₁s_k-1 - σ₂s_k-2 + ... + (-1)ⁿ⁺¹σ_ns_k-n

(10) Subtracting s_k from both sides and then multiplying by -1 gives us:

0 = s_k - σ₁s_k-1 + σ₂s_k-2 + ... + (-1)ⁿσ_ns_k-n

(11) So, combining step #7 and step #10, if we assume that:

σ₀ = 1 and if k is greater than n, σ_k=0, then:

∑ (i=0, k-1) (-1)ⁱσ_i s_k-i + (-1)^k kσ_k = 0

QED
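Newton's formula itself can be checked on sample roots for values of k both below and above n. This is a minimal Python sketch of my own, assuming the formula in the form that step #7 and step #10 give, namely ∑ (i=0, k-1) (-1)ⁱσ_i s_k-i + (-1)^k kσ_k = 0; the name `newton_lhs` is hypothetical.

```python
from itertools import combinations
from math import prod


def sigma(k, xs):
    """Elementary symmetric sum of degree k.

    combinations() yields one empty tuple for k = 0 and nothing for k > n,
    so sigma_0 = 1 and sigma_k = 0 for k > n come out automatically.
    """
    return sum(prod(combo) for combo in combinations(xs, k))


def power_sum(k, xs):
    """s_k: the sum of the k-th powers of the roots."""
    return sum(x ** k for x in xs)


def newton_lhs(k, xs):
    """Left-hand side of Newton's formula; it should be 0 for every k >= 1."""
    total = sum(
        (-1) ** i * sigma(i, xs) * power_sum(k - i, xs) for i in range(k)
    )
    return total + (-1) ** k * k * sigma(k, xs)


roots = [2, 3, 5]  # n = 3, so k = 1..3 uses step #7 and k = 4..7 uses step #10
formula_holds = all(newton_lhs(k, roots) == 0 for k in range(1, 8))
```

`formula_holds` should be True; swapping in other integer roots is a quick way to test both cases of the proof.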

References

Jean-Pierre Tignol, Galois' Theory of Algebraic Equations, World Scientific, 2001
Harold M. Edwards, Galois Theory, Springer, 1984.

Monday, July 16, 2007

Newton's Identities: Newton's Formula

Newton's identities (see here for an introduction to Newton's identities) are relationships between the roots of a cubic polynomial and its coefficients. They were first presented by Albert Girard but were presented in even greater detail independently by Sir Isaac Newton.

In today's blog, I will present Newton's formula and show how it leads to Newton's identities. In a future blog, I will provide the proof for the formula.

Definition 1: Newton's Formula for Newton's Identities

For a polynomial of degree n, with roots { r₁, ..., r_n }:

For any integer i, let s_i be the sum based on the roots:

s_i = r₁ⁱ + r₂ⁱ + ... + r_nⁱ.

For any integer j, Let σ_j be the elementary symmetric polynomial for j in n [see here for details if needed on elementary symmetric polynomials]

Then, Newton's formula is:

∑ (i=0, k-1) (-1)ⁱσ_i s_k-i + (-1)^k kσ_k = 0

where σ₀ = 1 and if k is greater than n, σ_k=0.

We can now use it to build some equations for s_i and σ_j.

If k=1, then we have:

s₁ = σ₁

If k ≥ 2, then we have:

s_k = ∑ (i=1, k-1) (-1)ⁱ⁺¹s_k-iσ_i + (-1)^k+1kσ_k

Using the above formula gives us:

s₁ = σ₁
s₂ = s₁σ₁ - 2σ₂

s₃ = s₂σ₁ - s₁σ₂ + 3σ₃

s₄ = s₃σ₁ - s₂σ₂ + s₁σ₃ - 4σ₄

s₅ = s₄σ₁ - s₃σ₂ + s₂σ₃ - s₁σ₄ + 5σ₅

...

Now, building each formula based on the previous formula gives us the following formulas for s_k in terms of σ_i:

s₁ = σ₁

s₂ = s₁σ₁ - 2σ₂ = (σ₁)σ₁ - 2σ₂ = σ₁² - 2σ₂

s₃ = s₂σ₁ - s₁σ₂ + 3σ₃ = (σ₁² - 2σ₂)σ₁ - (σ₁)σ₂ + 3σ₃ = σ₁³ -3σ₁σ₂ + 3σ₃

s₄ = s₃σ₁ - s₂σ₂ + s₁σ₃ - 4σ₄ = σ₁( σ₁³ -3σ₁σ₂ + 3σ₃) - σ₂(σ₁² - 2σ₂) + σ₃(σ₁) - 4σ₄ =

= σ₁⁴ - 3σ₁²σ₂ + 3σ₁σ₃ - σ₁²σ₂ + 2σ₂² + σ₁σ₃ - 4σ₄ =

= σ₁⁴ -4σ₁²σ₂ + 4σ₁σ₃ + 2σ₂² - 4σ₄

...

Further, we can use these same formulas for each σ_i so that:

σ₁ = s₁

σ₂ = (1/2)[s₁σ₁ - s₂]

σ₃ = (1/3)[s₃ - s₂σ₁ + s₁σ₂]

σ₄ = (1/4)[-s₄ + s₃σ₁ - s₂σ₂ + s₁σ₃]

σ₅ = (1/5)[s₅ - s₄σ₁ + s₃σ₂ - s₂σ₃ + s₁σ₄]
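The two lists of formulas above are the same recurrence read in opposite directions, which is easy to express in code. Here is a small Python sketch of my own (the function names are hypothetical) that converts between the σ_i and the s_k, using exact fractions for the division by k.

```python
from fractions import Fraction


def power_sums_from_sigmas(sigmas, count):
    """Given [sigma_1, ..., sigma_n], return [s_1, ..., s_count]."""
    n = len(sigmas)

    def sig(j):
        # sigma_j = 0 once j is greater than n
        return sigmas[j - 1] if 1 <= j <= n else 0

    s = []
    for k in range(1, count + 1):
        # s_k = sum_{i=1}^{k-1} (-1)^(i+1) s_(k-i) sigma_i + (-1)^(k+1) k sigma_k
        value = sum((-1) ** (i + 1) * s[k - i - 1] * sig(i) for i in range(1, k))
        value += (-1) ** (k + 1) * k * sig(k)
        s.append(value)
    return s


def sigmas_from_power_sums(s, n):
    """Given [s_1, ..., s_n], return [sigma_1, ..., sigma_n]."""
    sig = []
    for k in range(1, n + 1):
        partial = sum(
            (-1) ** (i + 1) * s[k - i - 1] * sig[i - 1] for i in range(1, k)
        )
        # Solve the same recurrence for sigma_k.
        sig.append(Fraction((-1) ** (k + 1) * (s[k - 1] - partial), k))
    return sig


# Roots 1, 2, 3 give sigma_1 = 6, sigma_2 = 11, sigma_3 = 6.
s_values = power_sums_from_sigmas([6, 11, 6], 4)
sigma_values = sigmas_from_power_sums([6, 14, 36], 3)
```

For the roots 1, 2, 3 this gives `s_values == [6, 14, 36, 98]`, matching 1+2+3, 1+4+9, 1+8+27, and 1+16+81, and `sigma_values` recovers 6, 11, 6.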

Now, I will show how these equations can be used to derive Newton's identities for cubic polynomials (that is, where n = 3) [See here for review of Newton's identities]. I am assuming an equation of the following form:

x³ + bx² + cx + d = 0

Here are the justifications for each formula presented previously.

Identity 1: r + s + t = -b

Proof:

σ₁ = r + s + t [See Definition 1, here]

σ₁ = (-1)¹(b) = -b [See Lemma 1, here]

QED

Identity 2: r² + s² + t² = b² - 2c

Proof:

s₂ = r² + s² + t² [See Definition 1 above]

s₂ = σ₁² - 2σ₂ [See formula above]

σ₁² - 2σ₂= ((-1)¹b)² - 2(-1)²(c) = b² - 2c.

QED

Identity 3: r³ + s³ + t³ = -b³ + 3bc - 3d

Proof:

s₃ = r³ + s³ + t³ [See Definition 1 above]

s₃ = σ₁³ -3σ₁σ₂ + 3σ₃[See formula above]

σ₁³ -3σ₁σ₂ + 3σ₃ = [(-1)¹b]³ - 3(-1)¹b(-1)²c + 3(-1)³d =

= -b³ +3bc -3d.

QED

Identity 4: rs + rt + st = c

Proof:

σ₂ = rs + rt + st [See Definition 1, here]

σ₂ = (-1)²c = c

QED

Identity 5: r²s + r²t + s²r + s²t + t²r + t²s = -bc + 3d

Proof:

(rs + rt + st)(r + s + t) - 3rst = r²s + r²t + s²r + s²t + t²r + t²s

(rs + rt + st)(r + s + t) - 3rst = σ₂σ₁ - 3σ₃ = (-1)¹b(-1)²c - 3*(-1)³d = -bc + 3d.

QED

Identity 6: r³s + r³t + s³r + s³t + t³r + t³s = b²c - 2c² - bd

Proof:

(r³ + s³ + t³)(r + s + t) - (r⁴ + s⁴ + t⁴) = r³s + r³t + s³r + s³t + t³r + t³s

(r³ + s³ + t³)(r + s + t) - (r⁴ + s⁴ + t⁴) =( s₃)(σ₁) - s₄

( s₃)(σ₁) - s₄ = (σ₁³ -3σ₁σ₂ + 3σ₃)(σ₁) - (σ₁⁴ - 4σ₁²σ₂ + 4σ₁σ₃ + 2σ₂² - 4σ₄) =

= σ₁²σ₂ - σ₁σ₃ - 2σ₂² + 4σ₄ = [(-1)b]²(-1)²c - 2[(-1)²c]² - [(-1)b(-1)³d] =

= b²c - 2c² - bd + 0 = b²c - 2c² - bd

QED

Identity 7: r²s² + r²t² + s²t² = c² - 2bd

Proof:

r²s² + r²t² + s²t² = (1/2)(r² + s² + t²)(r² + s² + t²) - (1/2)(r⁴ + s⁴ + t⁴) =

= (1/2)s₂*s₂ - (1/2)s₄ =

= (1/2)(σ₁² - 2σ₂)(σ₁² - 2σ₂) - (1/2)( σ₁⁴ - 4σ₁²σ₂ + 4σ₁σ₃ + 2σ₂² - 4σ₄) =

= (1/2)σ₁⁴ - 2σ₁²σ₂ + 2σ₂² - (1/2)σ₁⁴ + 2σ₁²σ₂ - 2σ₁σ₃ - σ₂² + 2σ₄=

= σ₂² - 2σ₁σ₃ + 2σ₄ =

= (c)² - 2*(-1)(b)(-1)(d) + 2*(0) = c² -2bd.

QED

Identity 8: r³s² + r³t² + s³r² + s³t² + t³r² + t³s² = -bc² + 2b²d + cd

Proof:

(1) (r²s² + r²t² + s²t²)(r + s + t) - (rst)(rs + rt + st) =
r³s² + r³t² + s³r² + s³t² + t³r² + t³s²

(2) (r²s² + r²t² + s²t²)(r + s + t) - (rst)(rs + rt + st) =

=(σ₂² - 2σ₁σ₃ + 2σ₄ )(σ₁) - (σ₃)(σ₂) =

= σ₁σ₂² - 2σ₁²σ₃ + 2σ₁σ₄ - σ₂σ₃ =

= (-1)b(c)² - 2(b)²(-d) + 2(-b)(0) - (c)(-d) =

= -bc² + 2b²d + cd

QED

Identity 9: r³s³ + r³t³ + s³t³ = c³ - 3bcd + 3d²

Proof:

(1) r³s³ + r³t³ + s³t³ = (r²s² + r²t² + s²t²)(rs + rt + st) - (r³s²t + r³st² + s³r²t + s³t²r + t³r²s + t³s²r )

(2) (r²s² + r²t² + s²t²)(rs + rt + st) - (r³s²t + r³st² + s³r²t + s³t²r + t³r²s + t³s²r ) = (c² - 2bd)(σ₂) - (bcd - 3d²) =

= (c² - 2bd)(c) - (bcd - 3d²) = c³ - 2bcd -bcd + 3d² =

= c³ - 3bcd +3d²

QED

Identity 10: rst = -d

Proof:

σ₃ = rst

σ₃ = (-1)³d = -d

QED

Identity 11: r²st + s²rt + t²rs = bd

Proof:

r²st + s²rt + t²rs = rst(r + s + t) = σ₃*σ₁ = (-1)(d)(-1)(b) = bd

QED

Identity 12: r³st + s³rt + t³rs = -b²d + 2cd

Proof:

(rst)(r² + s² + t²) = r³st + s³rt + t³rs

(rst)(r² + s² + t²) = σ₃s₂ =

= σ₃(σ₁² - 2σ₂) = (-d)(b² -2c) = -b²d + 2cd.

QED

Identity 13: r²s²t + r²st² + rs²t² = -cd

Proof:

(rst)(rt + rs + st) = r²s²t + r²st² + rs²t²

(rst)(rt + rs + st) = σ₃σ₂ = (-d)(c) = -cd.

QED

Identity 14: r³s²t + r³st² + s³r²t + s³t²r + t³r²s + t³s²r = bcd - 3d²

Proof:

(rst)[(rs + rt + st)(r + s + t) - 3rst] = r³s²t + r³st² + s³r²t + s³t²r + t³r²s + t³s²r
(rst)[(rs + rt + st)(r + s + t) - 3rst] = σ₃[(σ₂)(σ₁) - 3σ₃] =
σ₃[(σ₂)(σ₁) - 3σ₃] = (-1)d[c(-b) - 3(-1)d] = (-d)[-bc + 3d] = bcd - 3d²

QED

Identity 15: r³s³t + r³t³s + s³t³r = -c²d + 2bd²

Proof:

(rst)[(1/2)(r² + s² + t²)(r² + s² + t²) - (1/2)(r⁴ + s⁴ + t⁴) ] = r³s³t + r³t³s + s³t³r

(rst)[(1/2)(r² + s² + t²)(r² + s² + t²) - (1/2)(r⁴ + s⁴ + t⁴) ] =

= σ₃[(1/2)(s₂)(s₂) - (1/2)s₄ ] =

= σ₃[(1/2)(σ₁² - 2σ₂)(σ₁² - 2σ₂) - (1/2)( σ₁⁴ - 4σ₁²σ₂ + 4σ₁σ₃ + 2σ₂² - 4σ₄)] =

= σ₃[(1/2)σ₁⁴ - 2σ₁²σ₂ + 2σ₂² - (1/2)σ₁⁴ + 2σ₁²σ₂ - 2σ₁σ₃ - σ₂² + 2*0] =

= σ₃[σ₂² -2σ₁σ₃] = (-1)d[(c)² - 2*(-1)b(-1)d] =

= (-d)[c² - 2bd] = -c²d + 2bd².

QED

Identity 16: r²s²t² = d²

Proof:

r²s²t² = (rst)² = (σ₃)² = [(-1)³d]² = d²

QED

Identity 17: r³s²t² + s³r²t² + t³r²s² = -bd²

Proof:

(rst)(rst)(r + s + t) = r³s²t² + s³r²t² + t³r²s²

(rst)(rst)(r + s + t) = σ₃²*σ₁ = (-b)d²= -bd²

QED

Identity 18: r³s³t² + r³t³s² + s³t³r² = cd²

Proof:

(rs + rt + st)(rst)(rst) = r³s³t² + r³t³s² + s³t³r²

(rs + rt + st)(rst)(rst) = σ₂*σ₃² = (c)(d)² = cd²

QED

Identity 19: r³s³t³ = -d³

Proof:

r³s³t³ = (rst)³ = (σ₃)³ = (-d)³ = -d³

QED
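Since the nineteen identities involve a lot of sign bookkeeping, it helps to test a few of them against actual numbers. Below is a rough Python sketch of my own, assuming the sample roots 2, 3, 5; the helper `sym` is a hypothetical name for "the sum of all rearrangements of r^x s^y t^z".

```python
from itertools import permutations


def sym(exponents, roots):
    """Sum of r^x s^y t^z over the distinct rearrangements of the exponents."""
    r, s, t = roots
    return sum(
        r ** x * s ** y * t ** z for x, y, z in set(permutations(exponents))
    )


roots = (2, 3, 5)
r, s, t = roots
# Coefficients of x^3 + b x^2 + c x + d = (x - r)(x - s)(x - t)
b, c, d = -(r + s + t), r * s + r * t + s * t, -(r * s * t)

checks = {
    "Identity 2": sym((2, 0, 0), roots) == b**2 - 2*c,
    "Identity 3": sym((3, 0, 0), roots) == -b**3 + 3*b*c - 3*d,
    "Identity 5": sym((2, 1, 0), roots) == -b*c + 3*d,
    "Identity 6": sym((3, 1, 0), roots) == b**2*c - 2*c**2 - b*d,
    "Identity 7": sym((2, 2, 0), roots) == c**2 - 2*b*d,
    "Identity 8": sym((3, 2, 0), roots) == -b*c**2 + 2*b**2*d + c*d,
    "Identity 9": sym((3, 3, 0), roots) == c**3 - 3*b*c*d + 3*d**2,
    "Identity 12": sym((3, 1, 1), roots) == -b**2*d + 2*c*d,
    "Identity 14": sym((3, 2, 1), roots) == b*c*d - 3*d**2,
    "Identity 15": sym((3, 3, 1), roots) == -c**2*d + 2*b*d**2,
    "Identity 19": sym((3, 3, 3), roots) == -d**3,
}
all_hold = all(checks.values())
```

For these roots b = -10, c = 31, d = -30, and every entry in `checks` should come out True. The remaining identities can be added to the dictionary in the same way.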

References

"Newton's Identities", Wikipedia
Jean-Pierre Tignol, Galois' Theory of Algebraic Equations, World Scientific, 2001
Harold M. Edwards, Galois Theory, Springer, 1984.
