Fermat's Last Theorem: 2008-07-27

It is perhaps a bit surprising that a study of functions with multiple parameters led to the proof by Niels Abel that the quintic equation was not solvable by radicals.

In today's blog, I will focus on a very interesting result from Augustin-Louis Cauchy. He discovered that there are limits to the number of values a function of multiple parameters can take when you change the order of the parameters. The content in today's blog is taken from Peter Pesic's Abel's Proof.

Joseph-Louis Lagrange had shown that the number of values a function of n parameters can take from permuting parameters will necessarily divide n!. I went over this result in a previous entry.
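
Here is a small sketch of Lagrange's divisibility result. The helper count_values and the example function g are my own illustrative names, not from Pesic's book. It counts the distinct outputs of a function over every ordering of one sample of distinct inputs, then checks that the count divides n!.

```python
from itertools import permutations
from math import factorial

def count_values(f, sample):
    """Number of distinct outputs of f over all orderings of `sample`."""
    return len({f(*args) for args in permutations(sample)})

# Hypothetical example function of n = 4 parameters.
def g(x1, x2, x3, x4):
    return x1 * x2 + x3 * x4

# Distinct inputs chosen so that different groupings give different numbers.
sample = (1, 2, 3, 5)
m = count_values(g, sample)
divides = factorial(len(sample)) % m == 0
print(m, divides)  # 3 True
```

The count comes from one numeric sample, so it is only a lower bound on the number of formal values. Here the sample is chosen so that the three groupings give 17, 13 and 11.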

Cauchy found additional limits on the number of values a function can take. For a function with n parameters, if p is the highest prime that divides n, then the function can take 1, 2, or at least p possible values from permuting the order of its parameters.

That a function with n parameters can take 1 value is easy to show. Consider the following function:

f(x₁, x₂, ..., x_n) = x₁ + x₂ + ... + x_n

Swapping any two parameters doesn't change the sum, so no permutation will change its value. In this case, the function of n parameters takes on only 1 value.

It is also easy to show a function of n parameters can take on 2 values. Consider the following function:

f(x₁, x₂, ..., x_n) = (x₁ - x₂)*(x₁ - x₃)*...*(x₁ - x_n)*...*(x_{n-1} - x_n)

Now, any swap of two parameters will keep the absolute value but change the sign of the function. So, as long as the parameters are all distinct, the function takes on exactly 2 values: f and -f.
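
As a quick numerical check of the two examples so far, here is a minimal sketch. The names total, diff_product and count_values are mine. It evaluates each function on every ordering of four distinct numbers and collects the distinct results.

```python
from itertools import combinations, permutations
from math import prod

def count_values(f, sample):
    """Number of distinct outputs of f over all orderings of `sample`."""
    return len({f(*args) for args in permutations(sample)})

def total(*xs):
    # x1 + x2 + ... + xn
    return sum(xs)

def diff_product(*xs):
    # Product of (x_i - x_j) over all pairs with i < j.
    return prod(a - b for a, b in combinations(xs, 2))

sample = (1, 2, 4, 7)
values = {diff_product(*args) for args in permutations(sample)}
print(count_values(total, sample))  # 1
print(sorted(values))               # [-540, 540]
```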

So, this is where Cauchy's Theorem comes in. Cauchy proves that if a function of n parameters takes on more than 2 values, then it necessarily takes on at least p values where p is the highest prime dividing n.

Here's an example where this occurs. Consider the following function:

f(x₁, x₂, x₃, x₄, x₅) = (x₁ + x₂ + x₃ + x₄) - x₅

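How many values can this function take on if we permute the parameters? The function equals the sum of all five parameters minus twice the parameter in the last position, so its value depends only on which parameter ends up in the place of x₅. For five distinct parameters, we can see that there are exactly 5 possible values that the function can take.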
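
To see this concretely, here is a short sketch that evaluates the function on all 120 orderings of the sample inputs 1 through 5. The sample is my own choice.

```python
from itertools import permutations

def f(x1, x2, x3, x4, x5):
    return (x1 + x2 + x3 + x4) - x5

sample = (1, 2, 3, 4, 5)
values = sorted({f(*args) for args in permutations(sample)})

# The inputs sum to 15, so each value is 15 minus twice the last parameter.
print(values)  # [5, 7, 9, 11, 13]
```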

Let's start out with some definitions:

Definition 1: f(P)^u

Let P be a permutation and let f(P)^u represent the value of the function after the permutation P is applied to the function f, u times. I will use f(P)⁰ to mean that the permutation has been applied 0 times.

Definition 2: Order of a Permutation

A permutation P is said to be of order k if k is the smallest positive number of applications of P that returns the parameters to their original ordering, so that P^k(f(x₁, ..., x_n)) = f(x₁, ..., x_n). Using the above notation, this means that f(P)^k = f(P)⁰.

In other words, the permutation after being applied k times returns the function to its original ordering of the parameters.

Definition 3: f(P)^-u

By -u, I mean the inverse of the permutation. Since each permutation is one-to-one and onto, it follows that it has an inverse which is also a permutation. f(P)^-u means that we apply the inverse of the permutation u times.

This gives us the result that if we apply the permutation P u times to f(P)^-u, we are left with f(P)^(-u+u) = f(P)⁰
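
The three definitions can be sketched in code. This is only an illustration under my own conventions: a permutation is a tuple of 0-based indices where entry i is the place that position i is sent to, and compose, power and order are hypothetical helper names.

```python
def compose(p, q):
    """The permutation 'first p, then q' on indices 0..n-1."""
    return tuple(q[p[i]] for i in range(len(p)))

def power(p, u):
    """p applied u times; a negative u applies the inverse of p |u| times."""
    n = len(p)
    if u < 0:
        inverse = [0] * n
        for i, target in enumerate(p):
            inverse[target] = i
        p, u = tuple(inverse), -u
    result = tuple(range(n))
    for _ in range(u):
        result = compose(result, p)
    return result

def order(p):
    """Smallest k >= 1 such that applying p k times restores the original ordering."""
    identity = tuple(range(len(p)))
    k, current = 1, p
    while current != identity:
        current = compose(current, p)
        k += 1
    return k

# x1 -> x2 -> x3 -> x1 with x4 and x5 left alone (0-based indices).
cycle = (1, 2, 0, 3, 4)
identity = tuple(range(5))
print(order(cycle))                                          # 3
print(compose(power(cycle, -2), power(cycle, 2)) == identity)  # True
```

The last line mirrors the remark above: applying P u times after its inverse has been applied u times leaves the original ordering.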

Now, I will use these definitions in the following lemmas from Cauchy:

Lemma 1:

Let f be a function that takes n parameters.

Let p be the largest prime that divides n

Let m be the number of values that f takes on when we permute the order of f's parameters.

If m is less than p, then any permutation of order p will leave the value of f unchanged.

Proof:

(1) Let P be a permutation of order p such as (x₁ → x₂ → x₃ → ... → x_p → x₁)

(2) So, we have f(P)^p = f(P)⁰

(3) Now, let's consider the list of p values that f takes as we apply P to f repeatedly:

f(P)⁰, f(P)¹, f(P)², ..., f(P)^(p-1)

(4) Since we are assuming that m is at most p-1, it follows from the pigeonhole principle that two of these p values must be the same.

(5) Let's label the two exponents r and r' where 0 ≤ r' ≤ p-2 and 1 ≤ r ≤ p-1 and further r' is less than r so that:

f(P)^r = f(P)^r'

(6) Now since both r, r' are less than p, we know that p-r ≥ 1.

(7) Now, if we apply the permutation P to both values (p-r) times, we get:

f(P)^(r+p-r) = f(P)^(r'+p-r)

(8) Since r+p-r = p and f(P)^p = f(P)⁰, we are left with:

f(P)⁰ = f(P)^(r'+p-r)

(9) Let j = r'+p-r so that we have:

f(P)⁰ = f(P)^j

(10) We can see that f(P)⁰ = f(P)^bj where b is any integer that we choose since:

(a) For b=1, this is clearly the case from step #9.

(b) We assume it is true up to b-1 so that:

f(P)⁰ = f(P)^((b-1)j) = f(P)^(bj-j)

(c) Now, we apply P to each side j times so that we have:

f(P)^(0+j) = f(P)^(bj-j+j)

(d) This gives us that:

f(P)^j = f(P)^bj

(e) Applying step #9 gives us:

f(P)⁰ = f(P)^bj

(f) We can make the same argument for the case when b is negative.

(g) Let P^-1 be the inverse permutation of P.

(h) Applying P^-1 j times to each value in step #9 gives us:

f(P)^-j = f(P)⁰

(i) Now, we assume that it is true up to b-1 so that we have:

f(P)^(-(b-1)j) = f(P)^(-bj+j) = f(P)⁰

(j) Now, we apply P^-1 j times to each side to get:

f(P)^-bj = f(P)^-j

(k) Applying step #10h gives us:

f(P)^-bj = f(P)⁰

(11) Now, we know that j is a number with 1 ≤ j < p since j = r' + p - r where r' ≥ 0, p - r ≥ 1 (step #6), and r is greater than r'.

(12) Since p is prime and 1 ≤ j < p, it follows that gcd(p,j)=1 and using Bezout's Identity, we know that there exist integers a',b such that:

a'p + bj = 1

(13) Let a = -a', then we have:

-ap + bj = 1

which is the same as:

bj = ap + 1

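(14) Applying this equation to f gives us:

f(P)^bj = f(P)^(ap+1)

(15) From step #10 this gives us that:

f(P)⁰ = f(P)^(ap+1)

(16) But from step #2, applying P any multiple of p times returns the original ordering, so this gives us that:

f(P)⁰ = f(P)^ap

(17) This means that:

f(P)^ap = f(P)^(ap+1)

(18) The right-hand side is the left-hand side with P applied one more time, and the left-hand side is the original value f(P)⁰. So, f(P)⁰ = f(P)¹, which is only possible if P doesn't change the value of the function.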

QED
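
The arithmetic in steps #12 and #13 can be checked directly. Here is a small sketch that, for a prime p and each possible j, searches for a b with bj = ap + 1. The function name bezout_b is my own, and the brute-force search is just the simplest way to show that such a b exists.

```python
def bezout_b(p, j):
    """Smallest positive b with b*j = a*p + 1 for some integer a >= 0."""
    for b in range(1, p):
        if (b * j) % p == 1:
            return b
    raise ValueError("no solution: p and j are not coprime")

p = 5
for j in range(1, p):
    b = bezout_b(p, j)
    a = (b * j - 1) // p
    print(f"j={j}: {b}*{j} = {a}*{p} + 1")
```

For p = 5 and j = 3, for example, this finds b = 2 and a = 1, since 2*3 = 1*5 + 1.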

Let's explore Lemma 1 in more detail. Consider our function with 2 values where:

f(x₁, x₂, x₃) = (x₁ - x₂)*(x₁ - x₃)*(x₂ - x₃)

Since n=3, the highest prime is p=3. The function takes only 2 values, which is less than p, so Lemma 1 applies. Let's check it for the following permutation of order 3:

P = (x₁ → x₂ → x₃ → x₁)

So, applying P once gives us:

f(P)¹ = (x₂ - x₃)*(x₂ - x₁)*(x₃ - x₁) = (x₂ - x₃)*(-1)*(x₁ - x₂)*(-1)*(x₁ - x₃) = (x₁ - x₂)*(x₁ - x₃)*(x₂ - x₃)

So, we see that Lemma 1 holds.
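
The same check can be run over every permutation of a given order instead of just one. The sketch below uses my own helper names and numeric sample inputs. It tests the difference product for n=3 and n=5.

```python
from itertools import combinations, permutations
from math import prod

def diff_product(*xs):
    # Product of (x_i - x_j) over all pairs with i < j.
    return prod(a - b for a, b in combinations(xs, 2))

def order(p):
    """Smallest k >= 1 such that applying p k times gives the identity."""
    identity = tuple(range(len(p)))
    k, current = 1, p
    while current != identity:
        current = tuple(p[i] for i in current)
        k += 1
    return k

def unchanged_by_order(f, sample, k):
    """True if every index permutation of order k leaves f(sample) unchanged."""
    base = f(*sample)
    return all(
        f(*(sample[i] for i in perm)) == base
        for perm in permutations(range(len(sample)))
        if order(perm) == k
    )

print(unchanged_by_order(diff_product, (1, 2, 4), 3))         # True
print(unchanged_by_order(diff_product, (1, 2, 4, 7, 11), 5))  # True
print(unchanged_by_order(diff_product, (1, 2, 4), 2))         # False
```

The last line is a reminder that Lemma 1 says nothing about a single swap, which does change the sign.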

Now, it turns out that this lemma has a surprising corollary.

Corollary 1.1:

Let f be a function that takes n parameters.

Let p be the largest prime that divides n

Let m be the number of values that f takes on when we permute the order of f's parameters.

If m is less than p, then any permutation of order 3 will leave the value of f unchanged.

Proof:

(1) Let P₁ be a permutation of order 3 which we can define as x_i₁ → x_i₂ → x_i₃ → x_i₁

(2) I will show that it can be built from the application of two permutations of order p.

(3) Let P₂ be the permutation of order p such that:

x_i₁ → x_i₂ → ... → x_{i_p} → x_i₁

We can view this as (1234...p)

(4) Let P₃ be another permutation of order p such that:

x_i₂ → x_i₃, x_i₃ → x_i₁, x_i₄ → x_i₂, x_i₅ → x_i₄, x_i₆ → x_i₅, ..., x_i₁ → x_{i_p}

We can view this as (1p...5423)

(5) Now, if we perform the first permutation (P₂) and then the second (P₃), we get:

x_i₁ → x_i₂ → x_i₃, x_i₂ → x_i₃ → x_i₁, x_i₃ → x_i₄ → x_i₂, x_i₄ → x_i₅ → x_i₄, x_i₅ → x_i₆ → x_i₅, ..., x_{i_p} → x_i₁ → x_{i_p}

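(6) In other words, we have the three-order permutation:

x_i₁ → x_i₃ → x_i₂ → x_i₁

This is P₁ applied twice (equivalently, the inverse of P₁).

(7) By Lemma 1, neither of the order-p permutations of step #3 and step #4 changes the value of the function, so applying P₂ and then P₃ does not change the value either. Since P₁ has order 3, applying this result twice is the same as applying P₁ four times, which is just P₁. So, it follows P₁ does not change the value of the function.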

QED
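
The composition in step #5 is easy to get wrong by hand, so here is a sketch that computes it. The helpers cycle_to_map, then and composite_for are my own names, permutations are written as dictionaries on the labels 1..p, and I assume p is at least 5 so that the second cycle can be written out as in step #4.

```python
def cycle_to_map(cycle, n):
    """Mapping label -> image for a single cycle on the labels 1..n."""
    mapping = {i: i for i in range(1, n + 1)}
    for pos, label in enumerate(cycle):
        mapping[label] = cycle[(pos + 1) % len(cycle)]
    return mapping

def then(first, second):
    """Apply `first`, then `second`."""
    return {i: second[first[i]] for i in first}

def composite_for(p):
    p2 = cycle_to_map(list(range(1, p + 1)), p)                 # (1 2 3 ... p)
    p3 = cycle_to_map([1] + list(range(p, 3, -1)) + [2, 3], p)  # (1 p ... 5 4 2 3)
    return then(p2, p3)

result = composite_for(5)
moved = {i: image for i, image in result.items() if i != image}
print(moved)  # {1: 3, 2: 1, 3: 2}
```

Only the first three labels move, so the composite is a permutation of order 3 on them, whatever the prime p.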

Let's see this result in action. Let's revise our example above to now have 5 parameters so that we have n=5, p=5 and the number of different values is 2:

f(x₁, x₂, x₃, x₄, x₅) = (x₁ - x₂)*(x₁ - x₃)*...*(x₂ - x₃)*...*(x₄ - x₅)

Now, let's see what happens when we apply the following permutation of order 3:

P = (x₁ → x₂ → x₃ → x₁)

We get:

f(P)¹ = (x₂ - x₃)*(x₂ - x₁)*(x₂ - x₄)*(x₂ - x₅)*(x₃ - x₁)*(x₃ - x₄)*(x₃ - x₅)*(x₁ - x₄)*(x₁ - x₅)*(x₄ - x₅)

Now, we only need to consider the factors where the first index is greater than the second index, since every other factor already appears in f. This occurs twice so that we have:

(x₂ - x₁)*(x₃ - x₁)

Since it occurs exactly twice, the two sign changes cancel:

(-1)*(x₁ - x₂)*(-1)*(x₁ - x₃) = (x₁-x₂)*(x₁ - x₃)

So, we see that the corollary holds.
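
The sign count in this example can be reproduced mechanically. In the sketch below, which uses my own names, the permutation is a dictionary on the indices 1..5 and I list every factor (x_i - x_j) whose indices come out in the wrong order after the permutation is applied.

```python
from itertools import combinations

# x1 -> x2 -> x3 -> x1, with x4 and x5 left alone.
cycle = {1: 2, 2: 3, 3: 1, 4: 4, 5: 5}

# Factors (x_i - x_j), i < j, that turn into (x_a - x_b) with a > b.
flipped = [
    (cycle[i], cycle[j])
    for i, j in combinations(range(1, 6), 2)
    if cycle[i] > cycle[j]
]
sign = (-1) ** len(flipped)

print(flipped)  # [(2, 1), (3, 1)]
print(sign)     # 1
```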

Cauchy was able to take this result one step further:

Corollary 1.2:

Let f be a function that takes n parameters.

Let p be the largest prime that divides n

Let m be the number of values that f takes on when we permute the order of f's parameters.

If m is less than p, then the application of any two permutations of order 2 (each a swap of two parameters), one after the other, will leave the value of f unchanged.

Proof:

(1) Let P₁ be an order-2 permutation such that:

x_j₁ → x_j₂

(2) Let P₂ be an order-2 permutation with overlap with P₁

x_j₂ → x_j₃

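(3) So, we can define an order-3 permutation that is equivalent to the application of these two permutations of order 2, one after the other (applying them in the opposite order gives its inverse, which is also of order 3):

x_j₁ → x_j₂ → x_j₃ → x_j₁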

(4) Again, we know that application of the two overlapping order-2 permutations cannot change the value. If they did, this would imply that the above order-3 permutation would also change the value which goes against Corollary 1.1 above.

(5) Let's redefine P₂ so that it does not overlap with P₁ so that:

x_j₃ → x_j₄

(6) But in this case, it is equivalent to application of two order-3 permutations.

(7) Let us define P₃ as:

x_j₁ → x_j₂ → x_j₃ → x_j₁

(8) And define P₄ as:

x_j₃ → x_j₁ → x_j₄ → x_j₃

(9) We can see that they are equivalent since applying P₃ and then P₄ gives:

x_j₁ → x_j₂ → x_j₂

x_j₂ → x_j₃ → x_j₁

x_j₃ → x_j₁ → x_j₄

x_j₄ → x_j₄ → x_j₃

(10) Again, the application of two order-2 permutations cannot change the value since the application of two order-3 permutations cannot change the value.

QED
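
Both cases of this proof come down to composing small permutations, which can be sketched as follows. The dictionaries map each label j₁..j₄ (written here as 1..4) to its image, and then is my own helper name.

```python
def then(first, second):
    """Apply `first`, then `second`."""
    return {i: second[first[i]] for i in first}

# Overlapping swaps: (x_j1 x_j2) and (x_j2 x_j3).
swap_12 = {1: 2, 2: 1, 3: 3, 4: 4}
swap_23 = {1: 1, 2: 3, 3: 2, 4: 4}
overlapping = then(swap_12, swap_23)

# Two order-3 permutations from steps #7 and #8.
p3 = {1: 2, 2: 3, 3: 1, 4: 4}   # x_j1 -> x_j2 -> x_j3 -> x_j1
p4 = {1: 4, 2: 2, 3: 1, 4: 3}   # x_j3 -> x_j1 -> x_j4 -> x_j3
disjoint = then(p3, p4)

print(overlapping)  # {1: 3, 2: 1, 3: 2, 4: 4}
print(disjoint)     # {1: 2, 2: 1, 3: 4, 4: 3}
```

The first result is a permutation of order 3 on three labels. The second is the pair of non-overlapping swaps x_j₁ ↔ x_j₂ and x_j₃ ↔ x_j₄.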

Now, we are ready for Cauchy's main theorem.

Theorem 2: For a function with n parameters, if p is the highest prime that divides n, then the function can take 1, 2, or at least p possible values from permuting the order of its parameters.

Proof:

(1) To prove this theorem, we only need to prove that if the number of values is less than p, then it is 1 or 2.

(2) Assume that the number of values that the function takes is less than p.

(3) Assume further that the number of values is at least 3.

(4) Let's label three of them V₁, V₂, V₃, all different from one another.

(5) Let's define a permutation P₁ as the reordering of V₁ so that it becomes V₂ so that:

Let V₁ = f(x_i₁, x_i₂, ..., x_{i_n})

Let V₂ = f(x_j₁, x_j₂, ..., x_{j_n})

Then P₁ = (x_i₁ → x_j₁, ..., x_{i_n} → x_{j_n})

(6) Now, we know that P₁ has the same effect on the value as a single order-2 permutation since:

(a) We can view P₁ as a sequence of order-2 permutations, since any reordering can be carried out one swap at a time.

(b) By Corollary 1.2 above, any two order-2 permutations applied one after the other leave the value unchanged. So, the swaps in the sequence can be dropped in pairs without affecting the resulting value.

(c) So, since the value changes, exactly one order-2 permutation of the sequence is left over to change the value.

(7) Using the same logic as step #5, we can define a permutation P₂ that changes V₂ to V₃ and we can assume that this permutation consists of a sequence of order-2 permutations.

(8) But by the same logic as step #6, P₂ must also have the same effect on the value as a single order-2 permutation.

(9) But now we have a contradiction since if we can apply P₁ to V₁ to change the value to V₂ and we can apply P₂ to V₂ to change the value to V₃, then we have a situation where application of two order-2 permutations results in a change of value from V₁ to V₃, which contradicts Corollary 1.2.

(10) Therefore, we must reject our assumption in step #3 and conclude that the number of values is at most 2.

QED
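
As a sanity check on the theorem for n=5 and p=5, here is a sketch that counts the values of a handful of functions. The candidate functions beyond the ones discussed above, the helper count_values, and the sample inputs are my own choices, and a numeric count like this is an illustration and not a proof.

```python
from itertools import combinations, permutations
from math import prod

def count_values(f, sample):
    """Number of distinct outputs of f over all orderings of `sample`."""
    return len({f(*args) for args in permutations(sample)})

candidates = {
    "sum": lambda *xs: sum(xs),
    "difference product": lambda *xs: prod(a - b for a, b in combinations(xs, 2)),
    "sum minus last": lambda x1, x2, x3, x4, x5: (x1 + x2 + x3 + x4) - x5,
    "pair product": lambda x1, x2, x3, x4, x5: x1 * x2 + x3 + x4 + x5,
    "weighted sum": lambda *xs: sum(x * 100 ** k for k, x in enumerate(xs)),
}

p = 5
sample = (2, 3, 5, 7, 11)  # distinct inputs chosen to avoid accidental ties
counts = {name: count_values(f, sample) for name, f in candidates.items()}

for name, m in counts.items():
    print(f"{name}: {m} values, allowed by the theorem: {m in (1, 2) or m >= p}")
```

With these inputs the counts come out as 1, 2, 5, 10 and 120, so none of them falls strictly between 2 and p.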

References

Peter Pesic, Abel's Proof: An Essay on the Sources and Meaning of Mathematical Unsolvability, The MIT Press, 2004.

Fermat's Last Theorem

Saturday, August 02, 2008

Cauchy's Theorem on the Permutations of a Function
