Sunday, October 21, 2007

Gottfried Wilhelm Leibniz

Gottfried Wilhelm Leibniz was born on July 1, 1646 in Leipzig, Saxony which, today, is part of Germany. His father was a professor of moral philosophy at the University of Leipzig which had opened in Saxony in 1409. His father died when Gottfried was six years old. Young Leibniz inherited his father's library.

Leibniz was raised by his mother. At age seven, he entered the Nicolai School in Leibzig. Leibniz immersed himself in self-study in an effort to be able to read his father's books. By the age of 12 he had grown very advanced in Latin and had begun to study Greek. He would later write about his dissatisfaction with the logic of Aristotle.

At 14, he entered the University of Leipzig where he most likely studied philosophy, mathematics, rhetoric, Latin, Greek, and Hebrew. During the summer of 1663, he visited the University of Jena where he gained his first exposure to fundamental mathematical ideas such as proofs. Leipzig at the time was not very strong in mathematics so it is believed that Jena played a very important role in the development of his understanding of mathematics. Leibniz was greatly influenced by the ideas of Erhard Weigel who believed that all the universe could be viewed in terms of numbers.

Leibniz received his bachelors degree in law and a masters in philosophy from Leipzig. Despite this great progress, when he presented his thesis for his doctorate, his advancement was denied. The details for why this occurred are unclear. Normally, this meant that Leibniz would need to wait a year before resubmitting his doctoral thesis. Instead, Leibniz presented his doctorate thesis to the University of Altdorf where he gained his doctorate.

Leibniz was offered a position at the University of Altdorf which he decided not to accept. Later, he made the acquaintance of Baron Johann Christian von Boineburg. He soon had become "secretary, assistant, librarian, lawyer, and advisor the Baron and his family." (E J Aiton, Leibniz: A biography, Bristol-Boston, 1984) At this point, Leibniz's interests and works rested primarily in literary ambitions. One writer noted that during this period of Leibniz's life, he would have passed as "a typical late renaissance humanist." (G M Ross, Leibniz, Oxford, 1984)

In 1672, Leibniz was sent by Boineburg to meet with the French in an effort to dissuade Louis XIV from invading the German regions. Leibniz put forward a plan of invading Egypt that was very similar to the plan the Napolean would later carry out. At this point, Leibniz met the mathematicians and scientists of Paris. In particular, he studied mathematics and science under Christian Huygens from the Netherlands. His ventures in math and science in Paris were more successful than his political efforts.

Baron Boineburg died on December 15 of 1672. The Baron's family continued to sponsor Leibniz. In January of 1673, Leibniz gave up on his efforts at peace in Paris and went now to England to convince the British of peace. There, he met with Hooke, Boyle, and Pell. Leibniz presented his ideas for an automatic counting machine. On April 19, 1673, Leibniz was elected as a member to the Royal Society of London. Once again, his scientific pursuits were more successful than the political ones.

In 1674, Leibniz began to take interest in the problem of infinitesimals. He corresponded with Oldenburg from the Royal Society who let him know that Newton and Gregory had found very general methods to the problem. By autumn of 1676, Leibniz had worked out much of his notation for calculus. At this time, he received a letter from Newton. In 1677, he received a second letter from Newton. In this second letter, Newton questions whether Leibniz stole Newton's method. Newton pointed out that not a "single previously unsolved problem was solved." (Quoted in the MacTutor article) Later, Leibniz's notation would prove very important in the advancement of calculus.

Leibniz had hoped to join the Academy of Sciences in Paris but no opportunity to join came his way. In October of 1676, he accepted a position as librarian and Court Councillor to the Duke of Hanover, Johann Friedrich.

During this time, Leibniz worked on many outside projects. He worked unsuccessfully on developing wind-powered pumps to drain water from the Harz mountain mines. From these efforts, he developed his knowledge of geology and proposed a theory that the earth was at one time molten lava.

By 1679, he had developed a "binary system of arithmetic." He also worked on the problem of determinants which had written around 1684.

In 1680, the Duke of Hanover died and Leibniz began working for his brother Ernst August. Leibniz began to work on the family tree which included the House of Brunswick. As part of this effort, he traveled to Bavaria, Austria, and Italy. In each of these places, he met with scholars and other famous writers. He would publish the results of his research in nine large volumes. Still, he never completed the work that Ernst August had requested.

In 1684 and 1686, Leibniz began to publish his theory of calculus. The next year, in 1687, Newton published his Principia.

In 1710, Leibniz published Theodicee in which he argued that even if the world is not perfect, it is the best possible world. In 1714, he published his famous work Monadologia.

Unfortunately, it was the dispute with Newton that filled his last years. The main argument against Leibniz were the two letters that he had received from Oldenburg. Leibniz claimed that there was not enough information presented in the letters to give him the methods he found. In 1711, a paper by Keill was read to the Royal Society which accused Leibniz of plagiarism. In 1713, the Royal Society investigated the issue and ruled against Leibniz. Leibniz was never asked to give his version of events and Newton himelf wrote the final report.

In 1714, George Ludwig from the House of Hanover became King of England. Leibniz was not invited to join him.

Leibniz died on November 14, 1716. In his life, he had corresponded with over 600 figures of his time. The Berlin Academy of Science emerged as the result of Leibniz's work. While it is clear that Newton invented calculus first, it is also clear that Leibniz extended Newton's work in very important ways that Newton did not appreciate.

References

Sunday, August 12, 2007

Newton's Identities: Euler's Generalization

As mentioned earlier, Sir Isaac Newton's main purpose in coming up with his "identities" (see here for introduction to Newton's Identities) was to find a formula for determining whether two cubic equations possessed a common root.

Leonhard Euler was able to find a very general solution for finding this formula for any two equations of any degree. This general solution is today known as a resultant.

In today's blog, I will show the general solution for the resultant and then show that this equation has the properties that it equals 0 if and only if two equations have at least one solution in common.

The content in today's blog is taken from Galois' Theory of Algebraic Equations by Jean-Pierre Tignol.

Definition 1: Resultant

Let P = anXn + an-1Xn-1 + ... + a1X + a0 where an ≠ 0

Let Q = bmXm + bm-1Xm-1 + ... + b1X + b0 where bm ≠ 0

The resultant of P and Q is the determinant of the following (m+n) x (m+n) matrix:



For review of computing the determinant using the method of cofactor expansion, see here.

Here is the theorem justifying this construction:

Theorem: Common Roots of Two Polynomials

Let P,Q be the polynomials described in definition.

Let R be the resultant of P,Q

R = 0 if and only if P,Q have a common root

Proof:

(1) Assume that P,Q have a common root u

(2) Then (x - u) divides both P,Q and there exists P1, Q1 such that:

P = (x - u)P1

Q = (x - u)Q1

and degree of P1 is less than degree of P [See Theorem here for details]
and degree of Q1 is less than degree of Q [See Theorem here for details]

(3) We can also see that:

Q1 = Q/(x - u)
P1 = P/(x - u)

(4) So that:

PQ1 = (x - u)P1*Q/(x - u) = QP1

(5) We can see that P,Q,P1,Q1 are all polynomials and:

There exists ai such that:

P = anxn + an-1xn-1 + ... + a1x + a0

where an ≠ 0.

There exists bj such that:

Q = bmxm + bm-1xm-1 + ... + b1x + b0

where bm ≠ 0

There exists zk such that:

P1 = -(z1xn-1 + z2xn-2 + ... + zn-1x + zn)

where z1 ≠ 0

There exists yl such that:

Q1 = y1xm-1 + y2xm-2 + ... + ym-1x + ym

where y1 ≠ 0

(6) From step #4, we can see that:

PQ1 - QP1 = 0

(7) In the expression in step #6, we can see that the coefficient for xk = ∑ (i+j=k) (aiym-j) + ∑ (i+j=k) (bizn-j) since:

(a) For each term i in P, ai is the coefficient for xi

(b) For each term j in Q1, ym-j is the coefficient for xj.

Consider 1 = m - (m-1); 2 = m - (m - 2); m-1 = m - (1); m = m - (0)

(c) For PQ1, the coefficient is ∑ (i+j=k) (aiym-j)

(d) For each term i in Q, bi is the coefficient for xi

(e) For each term j in P1, -zn-j is the coefficient for xj.

Consider 1 = n - (n-1); 2 = n - (n-2); n-1 = n - (1); n = n - (0)

(f) For QP1, the coefficient is ∑ (i+j=k) (-bizn-j)

(g) For PQ1 - QP1, then, the coefficient is:

∑(i+j=k) (aiym-j + bizn-j) = ∑ (i+j=k) (aiym-j) + ∑(i+j=k) (bizn-j)

(8) We can further simplify this expression by defining s,t such that:

s = m-j
t = n-j

so that:

j = m - s = n - t

and:

i + j =k → i + m - s = k → i - s = k - m

i + j = k → i + n - t = k → i - t = k - n

which gives us:

∑ (i+j=k) (aiym-j) + ∑(i+j=k) (bizn-j) =

∑ (i - s = k - m) (aiys) + ∑ (i - t = k - n) (bizt)

(9) Now since the degree of P is n, the degree of P1 is n-1, the degree of Q is m, and the degree of Q1 is m-1, it follows that the degree of PQ1 = m+n-1 and the degree of QP1 = m+n-1.

(10) So, we can use the result in step #6 and the result in step #9 to build m + n linear equations where each linear equation represents a different value of k since the sum of coefficents for each power of x must equal 0.

for k = m + n - 1, we have:

any1x(m+n-1) + bmz1x(m+n-1) = 0

for k = m + n - 2, we have:

any2x(m+n-2) + an-1y1x(m+n-2) + bmz2x(m+n-2) + bm-1z1x(m+n-2) = 0

...

for k = 1, we have:

a1ymx1 + a0ym-1x1 + b1znx1 + b0zn-1x1 = 0

for k = 0, we have:

a0ym + b0zn = 0

(11) Factoring out xk from each of these equations, gives us:

for k = m + n - 1, we have:

any1 + bmz1 = 0

for k = m + n - 2, we have:

any2 + an-1y1 + bmz2 + bm-1z1 = 0

...

for k = 1, we have:

a1ym + a0ym-1 + b1zn + b0zn-1 = 0

for k = 0, we have:

a0ym + b0zn = 0

(12) For each of these linear equations, we can view the unknowns as consisting of ys and zt so that we get the following matrix representing a homogeneous system of linear equations:



(13) Now, it is clear that the equation above is none other than:

RX = 0

(14) We also know that there exists a nontrivial solution since from step #5 above,



(15) Since a nontrivial solution exists, we know that det R = 0 [See Theorem 6, here]

(16) Assume that det(R)=0

(17) It follows that R can be expressed as a homogeneous system of linear equations with a nontrivial solution. [See Theorem 6, here]

(18) We can label this nontrivial solution the same step #14 above:



(19) Multiplying out R with the nontrivial solution gives us the same set of m+n equations as step #11 above:

for the first equation, we have:

any1 + bmz1 = 0

for the second equation, we have:

any2 + an-1y1 + bmz2 + bm-1z1 = 0

...

for the (m+n-1)th equation, we have:

a1ym + a0ym-1 + b1zn + b0zn-1 = 0

for the (m+n)th equation, we have:

a0ym + b0zn = 0

(20) Since these are the exact same as step #11, we know that we can factor them out into P,Q,P1,Q1 just as we did in step #6 and step #7 above.

(21) To complete this proof, let's assume that P and Q are relatively prime -- that is, they don't have at least one solution in common.

(22) Using PQ1 = QP1, we conclude that P must divide P1 since it cannot divide Q (since P,Q are relatively prime).

(23) But this is impossible since P1 has a lower degree than P.

(24) Therefore, we have a contradiction and we can conclude that P and Q have a solution in common.

QED

References

Sunday, August 05, 2007

Newton's Identities: Newton's Purpose

When Newton came up with his identities, he had a specific goal in mind. Even though his formula can be generalized to work with any degree of the polynomial (see here for details), Newton was especially interested in cubic polynomials. He wanted to figure out a formula for determining whether two polynomials had a common root.

Newton created his identities for this purpose. In today's blog, I will show how Newton's identities can be used to determine if two polynomials of degree 3 have a common root. In a future blog, I will show Euler's generalization of this idea which led to the concept of the resultant.

Consider two cubic equations:

f(x) = x3 + bx2 + cx + d

g(x) = x3 + Bx2 + Cx + D

Let r,s,t be the roots of f(x).

It is clear that f(x) and g(x) have common roots if and only if g(r)*g(s)*g(t) = 0.

Here's Newton's insight:

Theorem: Common Roots in Cubic Equations

For any two cubic equations:

f(x) = x3 + bx2 + cx + d

g(x) = x3 + Bx2 + Cx + D

There exists an equation P(b,c,d,B,C,D) which only equals 0 if and only if the two cubic equations have a common root.

Proof:

(1) Let r,s,t be the three roots for f(x). [We know this is the case from the Fundamental Theorem of Algebra, see here, and also from the general formula for cubic equations, see here]

(2) Assume that g(x) shares a root with f(x) so that either g(r)=0 or g(s)=0 or g(t)=0.

(3) So, clearly:

g(r)*g(s)*g(t) = 0

(4) This means that:

[r3 + br2 + cr + d][s3 + bs2 + cs + d][t3 + bt2 + ct + d] = 0

(5) Now we can divide up this equation into a sum of the following factors:

g(r)*g(s)*g(t) = a1*r3s3t3 + a2r3s3t2 + ... + a34d3

(6) Since all three equations are symmetric, that it we can switch any two and get the same result, it is clear that all combinations have the same coefficients.

For example, consider every term of the form: r3s2t.

r3*bs2*ct = bc*(r3s2t)

r3*cs*bt2 = bc*(r3t2s)

br2*s3*ct = bc*(s3r2t)

cr*s3*bt2 = bc*(s3t2r)

br2*cs*t3 = bc*(t3r2s)

cr*bs2*t3 = bc*(t3s2r)

(7) So, by grouping the terms by form (and I will rxsytz to refer to all combinations of this form), we have:

g(r)g(s)g(t) = r3s3t3 + b*r3s3t2 + c*r3s3t + ... + c3rst + d3

(8) It turns out that each of these terms is one of Newton's identities. Besides d3, there are 19 possible combinations. If we combine this with Newton's identities (see here), we can restate the sums rxsytz into a term that consists of the coefficients B,C,D.

For example:

(every r3s2t) = -CD

So, the term itself is:

bc*(-CD) = -bcCD

(9) Since the entire equation can be expressed in terms b,c,d,B,C,D, we have shown that there exists an equation P(b,c,d,B,C,D) that equals 0 if and only if f(x) and g(x) have a common root.

So, we get:

P = -d3 + Bcd2 - B2bd2 + B3d2 - Cc2d + 2Cbd2 - 3BCd2 - B2Ccd - C2b2d + 2C2cd + BC2bd - C3d - 3Dbcd + 3Dd2 + 2DBb2d + DBcd + seventeen more terms obtained from these by reversing the sign and interchanging B with b, C with c, and D with d, that is D3 - bCD2 + b2BD2 - ...

QED

References

Friday, July 27, 2007

Newton's Identities: Proof of Newton's Formula

In a previous blog, I talked about Newton's Identities and the formula that Sir Isaac Newton used to generate these identities.

In today's blog, I will show a proof for Newton's formula. Interestingly, the proof revolves around a peculiar definition. The proof is taken from Jean-Pierre Tignol's Galois' Theory of Algebraic Equations and the definition in question is denoted as Ï„(a,b).

If you are reading Tignol's book, you may wonder why I am using same notation (σk and sk) in the opposite way. I do this in order to stay consistent with Harold Edwards who I referenced in my previous blogs.

Ï„(a,b) is defined as the sum of unique a-combinations of the n roots of a polynomial labeled x1, x2, ..., xn where one of the root is set to the power of b [For review of why a polynomial of degree n necessarily has n roots, see here for discussion on the fundamental theorem of algebra which guarantees this result]. This means that there are two different conditions that need to be considered. When b ≥ 2, regardless of which number is put to the power of b, there are n*C(n-1,a-1) terms in this equation. When b = 1, there are still C(n,a) possible terms in this equation.

If a ≥ 2, then n*C(n,a-1) is greater than C(n,a) since C(n,a) = (n/a)*C(n-1,a-1)

[Here's the reason why: (n/a)*C(n-1,a-1) = (n/a)*(n-1)!/[(n - 1 -[a-1])!(a-1)!] = n!/[(a)[n - 1 - a + 1]!(a-1)!] = n!/[a!][n-a] = C(n,a)]

With these two conditions in mind, we can now look at the definitions that are needed for the proof.

Definition 1: Ï„(a,b)

for b ≥ 2:

Ï„(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xibxfj(1)*xfj(2)*...*xfj(a-1)

for b = 1:

Ï„(a,1) = ∑ (j=1,C(n,a)) xfj(1)*xfj(2)*...*xfj(k)

where:

C(n-1,a-1) = (n-1)!/{(a-1)!(n-1-[a-1])!}

fj(x) is a function where:
  • the range of fj(x) is {1, 2, ..., n}
  • fj is such that for all x,y if x ≠ y, then fj(x) ≠ fj(y)
  • each fj is a unique (a-1)-way selection of n-1 integers (not including i) so that we can assume that fj(1) is less than fj(2), etc. [since there is only one permutation of any selection that has this characteristic]
Example:

For n=3 with {x1, x2, x3 }

C(n-1,a-1) = C(2,1) = 2!/1!(2-1)! = 2

So there are n*C(n-1,a-1) =3*2 = 6 terms

Ï„(2,2) =
x12x2 + x12x3 + x22x1 +
x22x3 +
x32x1 +
x32x2

Definition 2: sk

sk = ∑ (i=1, n) xik

Example:

For n=3 with {x1, x2, x3},

s4 = x14 + x24 + x34

Lemma 1: Ï„(1,b) = sb

Proof:

(1) Ï„(1,b) = ∑ (i=1,n) xib using Definition 1 above.

(2) Then Ï„(1,b) = sb using Definition 2 above.

QED

Definition 3: σi

σk = ∑ (i=1,C(n,k)) xfi(1)*xfi(2)*...*xfi(k)

where:

C(n,k) = (n)!/{(k)!(n-k)!}

fi(x) is a function where:
  • the range of fi(x) is {1, 2, ..., n}
  • fi is such that for x,y if x ≠ y, then fi(x) ≠ fi(y)
  • each fi is a unique k-way selection of n integers so that we can assume that fi(1) is less than fi(2), etc. [since there is only one permutation of any selection that has this characteristic]
Example: σ2

For n=3 with {x1, x2, x3 }

C(n,k) = C(3,2) = 3!/2!(3-2)! = 3

So there are 3 terms

σ2 = x1*x2 + x1*x3 + x2*x3

Lemma 2: τ(a,1) = σa

Proof:

This is clear since for b=1, τ(a,1) and σa have the same definition.

QED

Lemma 3:

For a less than n and b greater than 2:

τ(a,b) = σasb-1 - τ(a+1,b-1)

Proof:

(1) Ï„(a,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xibxfj(1)*xfj(2)*...*xfj(a-1) [See Definition 1 above]

= ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xi(b-1)xixfj(1)*xfj(2)*...*xfj(a-1)

(2) σk = ∑ (i=1,C(n,k)) xfi(1)*xfi(2)*...*xfi(k) [See Definition 3 above]

(3) sb-1 = ∑ (i=1, n) xi(b-1)

(4) σasb-1 = [ ∑ (i=1,C(n,a)) xfi(1)*xfi(2)*...*xfi(a)][ ∑ (j=1, n) xj(b-1)]

(5) Ï„(a+1,b-1) = ∑ (i=1, n) ∑ (j=1, C(n-1,a)) xib-1xfj(1)*xfj(2)*...*xfj(a)

(6) Now each term in step #4 consists of the following form:

xjb-1*xfi(1)*...*xfi(a)

(7) Each term can then be categorized into two cases:

Case I: xj is the same as one of the values in xfi(1), xfi(2), ..., xfi(a)

Case II: xj is not the same as any of the values in xfi(1), xfi(2), ..., xfi(a)

(8) If C1 is the sum of all the terms that fall in Case I and C2 is the sum of all terms that fall into Case II, we can see that:

σasb-1 = C1 + C2

(9) Now, it is clear that for a less than n and b greater than 2:

C1 = Ï„(a,b) [See step #1 above]

(10) It is also clear that for a less than n and b greater than 2:

C2 = Ï„(a+1,b-1)

(11) Since C1 = σasb-1 - C2, it follows that:

τ(a,b) = σasb-1 - τ(a+1,b-1)

QED

Lemma 4:

if a is less than n:

τ(a,2) = σas1 - (a+1)σa+1

Proof:

(1) Ï„(a,2) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xi2xfj(1)*xfj(2)*...*xfj(a-1) = = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) [xixfj(1)*xfj(2)*...*xfj(a-1) ]*xi

(2) σa = ∑ (i=1,C(n,a)) xfi(1)*xfi(2)*...*xfi(a)

(3) s1 = ∑ (i=1, n) xi

(4) σas1 = [∑ (i=1,C(n,a)) xfi(1)*xfi(2)*...*xfi(a)][∑ (j=1, n) xj]

(5) Now each term in step #4 consists of the following form:

xj*xfi(1)*...*xfi(a)

(7) Each term can then be categorized into two cases:

Case I: xj is the same as one of the values in xfi(1), xfi(2), ..., xfi(a)

Case II: xj is not the same as any of the values in xfi(1), xfi(2), ..., xfi(a)

(8) If C1 is the sum of all the terms that fall in Case I and C2 is the sum of all terms that fall into Case II, we can see that:
σas1 = C1 + C2

(9) Now, it is clear that for a less than n:

C1 = Ï„(a,2) [See step #1 above]

(10) It is also clear that for a less than n:

C2 = (a+1)σa+1 since:

(a) σ(a+1) = ∑ (i=1,C(n,a+1)) xfi(1)*xfi(2)*...*xfi(a)*xfi(a+1)

(b) Now, removing all the terms that are case I, it is clear that there a+1 ways to reach each term in step (a).

For example, the combination x1*x2*...*xa*xa+1

This can be reached from:

x1 * (x2x3*...*xa+1) x2 * (x1x3*...*xa+1) ... xa+1*(x1x2*...*xa)

We can make the same argument for each possible term in C2

(11) Since C1 = σas1 - C2, it follows that:

τ(a,2) = σas1 - (a+1)σa+1

QED

Lemma 5:

if b ≥ 2, then:

τ(n,b) = σnsb-1

Proof:

(1) Ï„(n,b) = ∑ (i=1, n) ∑ (j=1, C(n-1,a-1)) xibxfj(1)*xfj(2)*...*xfj(a-1)

(2) Since b ≥ 2, we have:

Ï„(n,b) = x1bx2*...*xn + x2bx1*...*xn + ... + xnbx1*...*xn-1 =

= x1b-1x1x2*...*xn + x2b-1x1x2*...*xn + ... + xnb-1x1*...*xn =

= (x1b-1 + x2b-1 + ... + xnb-1)(x1x2*...xn)

(3) σn = x1*x2*...*xn

(4) sb-1 = x1b-1 + x2b-1 + ... + xnb-1

(5) It follows from step #2, that:

τ(n,b) = σnsb-1

QED

Theorem 6: Newton's Formula



Proof:

(1) Ï„(1,k) = sk [See Lemma 1 above]

(2) Using Lemma 3 above if k ≥ 3, we have:

τ(1,k) = σ1sk-1 - τ(2,k-1) = sk

(3) Assuming that k-1 ≥ 3, Lemma 3 gives us:

τ(2,k-1) = σ2sk-2 - τ(3,k-2)

(4) Following this pattern, if k ≤ n, we eventually get the following:

sk = σ1sk-1 - σ2sk-2 + ... + (-1)kτ(k-1,2)

(5) Using Lemma 4 above:

τ(k-1,2) = σk-1s1 - (k)σk

(6) Thus, we have:

sk = σ1sk-1 - σ2sk-2 + ... + (-1)kσk-1s1 + (-1)k+1kσk

(7) Subtracting sk from both sides and then multiplying -1 gives us:

0 = sk - σ1sk-1 + σ2sk-2 + ... + (-1)k-1σk-1s1 + (-1)kkσk

(8) If k is greater than n, then we eventually get to:

sk = σ1sk-1 - σ2sk-2 + ... + (-1)n+1τ(n,k+1-n)

(9) Using Lemma 5 above gives us:

sk = σ1sk-1 - σ2sk-2 + ... + (-1)n+1σnsk-n

(10) Subtracting sk from both sides and then multiplying by -1 gives us:

0 = sk - σ1sk-1 + σ2sk-2 + ... + (-1)nσnsk-n

(11) So, combining step #7 and step #10, if we assume that:

σ0 = 1 and if k is greater than n, σk=0, then:



QED

References

Monday, July 16, 2007

Newton's Identities: Newton's Formula

Newton's identities (see here for an introduction to Newton's identities) are relationships between the roots of a cubic polynomial and its coefficients. They were first presented by Albert Girard but were presented in even greater detail independently by Sir Isaac Newton.

In today's blog, I will present Newton's formula and show how it leads to Newton's identities. In a future blog, I will provide the proof for the formula.

Definition 1: Newton's Formula for Newton's Identities

For a polynomial of degree n, with roots { r1, ..., rn }:

For any integer i, let si be the sum based on the roots:

si = r1i + r2i + ... + rni.

For any integer j, Let σj be the elementary symmetric polynomial for j in n [see here for details if needed on elementary symmetric polynomials]

Then, Newton's formula is:



where σ0 = 1 and if k is greater than n, σk=0.

We can now use it to build some equations for si and σj.

If k=1, then we have:

s1 = σ1

If k ≥ 2, then we have:

sk = ∑ (i=1, k-1) (-1)i+1sk-iσi + (-1)k+1kσk

Using the above formula gives us:

s1 = σ1

s2 = s1σ1 - 2σ2

s3 = s2σ1 - s1σ2 + 3σ3

s4 = s3σ1 - s2σ2 + s1σ3 - 4σ4

s5 = s4σ1 - s3σ2 + s2σ3 - s1σ4 + 5σ5

...

Now, building each formula based on the previous formula gives us the following formulas for sk in term so of σi:

s1 = σ1

s2 = s1σ1 - 2σ2 = (σ1)σ1 - 2σ2 = σ12 - 2σ2

s3 = s2σ1 - s1σ2 + 3σ3 = (σ12 - 2σ2)σ1 - (σ1)σ2 + 3σ3 = σ13 -3σ1σ2 + 3σ3

s4 = s3σ1 - s2σ2 + s1σ3 - 4σ4 = σ1( σ13 -3σ1σ2 + 3σ3) - σ2(σ12 - 2σ2) + σ3(σ1) - 4σ4 =

= σ14 - 3σ12σ2 + 3σ1σ3 - σ12σ2 + 2σ22 + σ1σ3 - 4σ4 =

= σ14 -4σ12σ2 + 4σ1σ3 + 2σ22 - 4σ4


...

Further, we can use these same formulas for each σi so that:

σ1 = s1

σ2 = (1/2)[s1σ1 - s2]

σ3 = (1/3)[s3 - s2σ1 + s1σ2]

σ4 = (1/4)[-s4 + s3σ1 - s2σ2 + s1σ1

σ5 = (1/5)[s5 - s4σ1 + s3σ2 - s2σ3 + s1σ4]


Now, I will show these equations can be used to derive Newton's identities for cubic polynomials (that is, where n = 3). [See here for review of Newton's identities] where I am assuming an equation of the following form:

x3 + bx2 + cx + d = 0

Here are the justifications for each formula presented previously.

Identity 1:
r + s + t = -b

Proof:

σ1 = r + s + t
[See Definition 1, here]

σ1 = (-1)1(b) = -b
[See Lemma 1, here]

QED

Identity 2: r2 + s2 + t2 = b2 - 2c

Proof:

s2 = r2 + s2 + t2 [See Definition 1 above]

s2 = σ12 - 2σ2 [See formula above]

σ12 - 2σ2 = ((-1)1b)2 - 2(-1)2(c) = b2 - 2c.

QED

Identity 3: r3 + s3 + t3 = -b3 + 3bc - 3d

Proof:

s3 = r3 + s3 + t3 [See Definition 1 above]

s3 = σ13 -3σ1σ2 + 3σ3 [See formula above]

σ13 -3σ1σ2 + 3σ3 = [(-1)1b]3 - 3(-1)1b(-1)2c + 3(-1)3d =

= -b3 +3bc -3d.

QED

Identity 4: rs + rt + st = c

Proof:

σ2 = rs + rt + sr + st +tr + ts [See Definition 1, here]

σ2 = (-1)2c = c

QED

Identity 5: r2s + r2t + s2r + s2t + t2r + t2s = -bc + 3d

Proof:

(rs + rt + sr + st +tr + ts)(r + s + t) - 3rst = r2s + r2t + s2r + s2t + t2r + t2s

(rs + rt + sr + st +tr + ts)(r + s + t) - 3rst = σ2σ1 - 3σ3 = (-1)1b(-1)2c - 3*(-1)3d = -bc + 3d.

QED

Identity 6: r3s + r3t + s3r + s3t + t3r + t3s = b2c - 2c2 - bd

Proof:

(r3 + s3 + t3)(r + s + t) - (r4 + s4 + t4) = r3s + r3t + s3r + s3t + t3r + t3s

(r3 + s3 + t3)(r + s + t) - (r4 + s4 + t4) =( s3)(σ1) - s4

( s3)(σ1) - s4 = (σ13 -3σ1σ2 + 3σ3)(σ1) - (σ14 - 4σ12σ2 + 4σ1σ3 + 2σ22 - 4σ4) =

= σ12σ2 - σ1σ3 - 2σ22 + 4σ4 = [(-1)b]2(-1)2c - 2[(-1)2c]2 - [(-1)b(-1)3d] =

= b2c - 2c2 - bd + 0 = b2c - 2c2 - bd

QED

Identity 7: r2s2 + r2t2 + s2t2 = c2 - 2bd

Proof:

r2s2 + r2t2 + s2t2 = (1/2)(r2 + s2 + t2)(r2 + s2 + t2) - (1/2)(r4 + s4 + t4) - =

= (1/2)s2*s2 - (1/2)s4 =

= (1/2)(σ12 - 2σ2)(σ12 - 2σ2) - (1/2)( σ14 - 4σ12σ2 + 4σ1σ3 + 2σ22 - 4σ4) =

= (1/2)σ14 - 2σ12σ2 + 2σ22 - (1/2)σ14 + 2σ12σ2 - 2σ1σ3 - σ22 + 2σ4 =

= σ22 - 2σ1σ3 + 2σ4 =

= (c)2 - 2*(-1)(b)(-1)(d) + 2*(0) = c2 -2bd.

QED

Identity 8: r3s2 + r3t2 + s3r2 + s3t2 + t3r2 + t3s2 = -bc2 + 2b2d + cd

Proof:

(1) (r2s2 + r2t2 + s2t2)(r + s + t) - (rst)(rs + rt + st) =
r3s2 + r3t2 + s3r2 + s3t2 + t3r2 + t3s2

(2) (r2s2 + r2t2 + s2t2)(r + s + t) - (rst)(rs + rt + st) =

=(σ22 - 2σ1σ3 + 2σ4 )(σ1) - (σ3)(σ2) =

= σ1σ22 - 2σ12σ3 + 2σ1σ4 - σ2σ3 =

= (-1)b(c)2 - 2(b)2(-d) + 2(-b)(0) - (c)(-d) =

= -bc2 + 2b2d + cd

QED


Identity 9: r3s3 + r3t3 + s3t3 = c3 - 3bcd + 3d2

Proof:

(1) r3s3 + r3t3 + s3t3 = (r2s2 + r2t2 + s2t2)(rs + rt + st) - (r3s2t + r3st2 + s3r2t + s3t2r + t3r2s + t3s2r )

(2) (r2s2 + r2t2 + s2t2)(rs + rt + st) - (r3s2t + r3st2 + s3r2t + s3t2r + t3r2s + t3s2r ) = (c2 - 2bd)(σ2) - (bcd - 3d2) =

=
(c2 - 2bd)(c) - (bcd - 3d2) = c3 - 2bcd -bcd + 3d2 =

= c3 - 3bcd +3d2

QED

Identity 10: rst = -d

Proof:

σ3 = rst

σ3 = (-1)3d = -d

QED

Identity 11: r2st + s2rt + t2rs = bd

Proof:

r2st + s2rt + t2rs = rst(r + s + t) = σ3*σ1 = (-1)(d)(-1)(b) = bd

QED


Identity 12: r3st + s3rt + t3rs = -b2d + 2cd

Proof:

(rst)(r2 + s2 + t2) = r3st + s3rt + t3rs

(rst)(r2 + s2 + t2) = σ3s2 =

= σ3(
σ12 - 2σ2) = (-d)(b2 -2c) = -b2d + 2cd.

QED

Identity 13: r2s2t + r2st2 + rs2t2 = -cd

Proof:

(rst)(rt + rs + st) = r2s2t + r2st2 + rs2t2

(rst)(rt + rs + st) = σ3σ2 = (-d)(c) = -cd.

QED
Identity 14: r3s2t + r3st2 + s3r2t + s3t2r + t3r2s + t3s2r = bcd - 3d2

Proof:

(rst)[(rs + rt + st)(r + s + t) - 3rst] = r3s2t + r3st2 + s3r2t + s3t2r + t3r2s + t3s2r
(rst)[(rs + rt + st)(r + s + t) - 3rst] = σ3[(σ2)(σ1) - 3σ3] =
σ3[(σ2)(σ1) - 3σ3] = (-1)d[c(-b) - 3(-1)d] = (-d)[-bc + 3d] = bcd - 3d2

QED

Identity 15: r3s3t + r3t3s + s3t3r = -c2d + 2bd2

Proof:

(rst)[(1/2)(r2 + s2 + t2)(r2 + s2 + t2) - (1/2)(r4 + s4 + t4) ] = r3s3t + r3t3s + s3t3r

(rst)[(1/2)(r2 + s2 + t2)(r2 + s2 + t2) - (1/2)(r4 + s4 + t4) ] =

= σ3[(1/2)(s2)(s2) - (1/2)s4 ] =

= σ3[(1/2)(σ12 - 2σ2)(σ12 - 2σ2) - (1/2)( σ14 - 4σ12σ2 + 4σ1σ3 + 2σ22 - 4σ4)] =

= σ3[(1/2)σ14 - 2σ12σ2 + 2σ22 - (1/2)σ14 + 2σ12σ2 - 2σ1σ3 - 2σ22 + 2*0)] =

= σ3[σ22 -2σ1σ3] = (-1)d[(c)2 - 2*(-1)b(-1)d] =

= (-d)[c2 - 2bd] = -c2d + bd2.

QED

Identity 16: r2s2t2 = d2

Proof:

r2s2t2 = (rst)2 = (σ3)2 = [(-1)3d]2 = d2

QED

Identity 17: r3s2t2 + s3r2t2 + t3r2s2 = -bd2

Proof:

(rst)(rst)(r + s + t) = r3s2t2 + s3r2t2 + t3r2s2

(rst)(rst)(r + s + t) = σ32*σ1 = (-b)d2= -bd2

QED

Identity 18: r3s3t2 + r3t3s2 + s3t3r2 = cd2

Proof:

(rs + rt + st)(rst)(rst) =
r3s3t2 + r3t3s2 + s3t3r2

(rs + rt + st)(rst)(rst) = σ2*σ32 = (c)(d)2 = cd2

QED

Identity 19: r3s3t3 = -d3

Proof:

r3s3t3 = (rst)3 = (σ3)3 = (-d)3 = -d3

QED

References