Cayley–Hamilton theorem
In linear algebra, the Cayley–Hamilton theorem states that every square matrix over a commutative ring satisfies its own characteristic equation.
The characteristic polynomial of an matrix is defined as, where is the determinant operation, is a variable scalar element of the base ring, and is the identity matrix. Since each entry of the matrix is either constant or linear in, the determinant of is a degree- monic polynomial in, so it can be written as
By replacing the scalar variable with the matrix, one can define an analogous matrix polynomial expression,
The Cayley–Hamilton theorem states that this polynomial expression is equal to the zero matrix, which is to say that that is, the characteristic polynomial is an annihilating polynomial for
One use for the Cayley–Hamilton theorem is that it allows to be expressed as a linear combination of the lower matrix powers of :
When the ring is a field, the Cayley–Hamilton theorem is equivalent to the statement that the minimal polynomial of a square matrix divides its characteristic polynomial.
A special case of the theorem was first proved by Hamilton in 1853 in terms of inverses of linear functions of quaternions. This corresponds to the special case of certain real or complex matrices. Cayley in 1858 stated the result for and smaller matrices, but only published a proof for the case. As for matrices, Cayley stated “..., I have not thought it necessary to undertake the labor of a formal proof of the theorem in the general case of a matrix of any degree”. The general case was first proved by Ferdinand Frobenius in 1878.
Examples
1 × 1 matrices
For a matrix, the characteristic polynomial is given by, and so is trivial.2 × 2 matrices
As a concrete example, letIts characteristic polynomial is given by
The Cayley–Hamilton theorem claims that, if we define
then
We can verify by computation that indeed,
For a generic matrix,
the characteristic polynomial is given by, so the Cayley–Hamilton theorem states that
which is indeed always the case, evident by working out the entries of.
Applications
Determinant and inverse matrix
For a general invertible matrix, i.e., one with nonzero determinant, −1 can thus be written as an order polynomial expression in : As indicated, the Cayley–Hamilton theorem amounts to the identityThe coefficients are given by the elementary symmetric polynomials of the eigenvalues of. Using Newton identities, the elementary symmetric polynomials can in turn be expressed in terms of power sum symmetric polynomials of the eigenvalues:
where is the trace of the matrix. Thus, we can express in terms of the trace of powers of.
In general, the formula for the coefficients is given in terms of complete exponential Bell polynomials as
In particular, the determinant of equals. Thus, the determinant can be written as the trace identity:
Likewise, the characteristic polynomial can be written as
and, by multiplying both sides by , one is led to an expression for the inverse of as a trace identity,
Another method for obtaining these coefficients for a general matrix, provided no root be zero, relies on the following alternative expression for the determinant,
Hence, by virtue of the Mercator series,
where the exponential only needs be expanded to order, since is of order, the net negative powers of automatically vanishing by the C–H theorem. Differentiation of this expression with respect to allows one to express the coefficients of the characteristic polynomial for general as determinants of matrices,
; Examples
For instance, the first few Bell polynomials are = 1,,, and.
Using these to specify the coefficients of the characteristic polynomial of a matrix yields
The coefficient gives the determinant of the matrix, minus its trace, while its inverse is given by
It is apparent from the general formula for cn−''k'', expressed in terms of Bell polynomials, that the expressions
always give the coefficients of and of in the characteristic polynomial of any matrix, respectively. So, for a matrix, the statement of the Cayley–Hamilton theorem can also be written as
where the right-hand side designates a matrix with all entries reduced to zero. Likewise, this determinant in the case, is now
This expression gives the negative of coefficient of in the general case, as seen below.
Similarly, one can write for a matrix,
where, now, the determinant is,
and so on for larger matrices. The increasingly complex expressions for the coefficients is deducible from Newton's identities or the Faddeev–LeVerrier algorithm.
''n''-th power of matrix
The Cayley–Hamilton theorem always provides a relationship between the powers of , which allows one to simplify expressions involving such powers, and evaluate them without having to compute the power n or any higher powers of.As an example, for the theorem gives
Then, to calculate, observe
Likewise,
Notice that we have been able to write the matrix power as the sum of two terms. In fact, matrix power of any order can be written as a matrix polynomial of degree at most, where is the size of a square matrix. This is an instance where Cayley–Hamilton theorem can be used to express a matrix function, which we will discuss below systematically.
Matrix functions
Given an analytic functionand the characteristic polynomial of degree of an matrix, the function can be expressed using long division as
where is some quotient polynomial and is a remainder polynomial such that.
By the Cayley–Hamilton theorem, replacing by the matrix gives, so one has
Thus, the analytic function of the matrix can be expressed as a matrix polynomial of degree less than.
Let the remainder polynomial be
Since, evaluating the function at the eigenvalues of yields
This amounts to a system of linear equations, which can be solved to determine the coefficients. Thus, one has
When the eigenvalues are repeated, that is for some, two or more equations are identical; and hence the linear equations cannot be solved uniquely. For such cases, for an eigenvalue with multiplicity, the first derivatives of vanish at the eigenvalue. This leads to the extra linearly independent solutions
which, combined with others, yield the required equations to solve for.
Finding a polynomial that passes through the points is essentially an interpolation problem, and can be solved using Lagrange or Newton interpolation techniques, leading to Sylvester's formula.
For example, suppose the task is to find the polynomial representation of
The characteristic polynomial is, and the eigenvalues are. Let. Evaluating at the eigenvalues, one obtains two linear equations, and.
Solving the equations yields and. Thus, it follows that
If, instead, the function were, then the coefficients would have been and ; hence
As a further example, when considering
then the characteristic polynomial is, and the eigenvalues are.
As before, evaluating the function at the eigenvalues gives us the linear equations and ; the solution of which gives, and. Thus, for this case,
which is a rotation matrix.
Standard examples of such usage is the exponential map from the Lie algebra of a matrix Lie group into the group. It is given by a matrix exponential,
Such expressions have long been known for,
where the are the Pauli matrices and for,
which is Rodrigues' rotation formula. For the notation, see 3D rotation group#A note on Lie algebras.
More recently, expressions have appeared for other groups, like the Lorentz group, and, as well as. The group is the conformal group of spacetime, its simply connected cover. The expressions obtained apply to the standard representation of these groups. They require knowledge of the eigenvalues of the matrix to exponentiate. For , closed expressions have been obtained for all irreducible representations, i.e. of any spin.
File:GeorgFrobenius.jpg|220px|thumb|right|Ferdinand Georg Frobenius, German mathematician. His main interests were elliptic functions, differential equations, and later group theory.
In 1878 he gave the first full proof of the Cayley-Hamilton theorem.
Algebraic number theory
The Cayley–Hamilton theorem is an effective tool for computing the minimal polynomial of algebraic integers. For example, given a finite extension of and an algebraic integer which is a non-zero linear combination of the we can compute the minimal polynomial of by finding a matrix representing the -linear transformationIf we call this transformation matrix, then we can find the minimal polynomial by applying the Cayley–Hamilton theorem to.
Proofs
The Cayley–Hamilton theorem is an immediate consequence of the existence of the Jordan normal form for matrices over algebraically closed fields, see. In this section, direct proofs are presented.As the examples above show, obtaining the statement of the Cayley–Hamilton theorem for an matrix
requires two steps: first the coefficients of the characteristic polynomial are determined by development as a polynomial in of the determinant
and then these coefficients are used in a linear combination of powers of that is equated to the zero matrix:
The left-hand side can be worked out to an matrix whose entries are polynomial expressions in the set of entries of, so the Cayley–Hamilton theorem states that each of these expressions equals. For any fixed value of, these identities can be obtained by tedious but straightforward algebraic manipulations. None of these computations, however, can show why the Cayley–Hamilton theorem should be valid for matrices of all possible sizes, so a uniform proof for all is needed.
Preliminaries
If a vector of size is an eigenvector of with eigenvalue, in other words if, thenwhich is the zero vector since . This holds for all possible eigenvalues, so the two matrices equated by the theorem certainly give the same result when applied to any eigenvector. Now if admits a basis of eigenvectors, in other words if is diagonalizable, then the Cayley–Hamilton theorem must hold for, since two matrices that give the same values when applied to each element of a basis must be equal.
Consider now the function which maps matrices to matrices given by the formula, i.e. which takes a matrix and plugs it into its own characteristic polynomial. Not all matrices are diagonalizable, but for matrices with complex coefficients many of them are: the set of diagonalizable complex square matrices of a given size is dense in the set of all such square matrices. Now viewed as a function we see that this function is continuous. This is true because the entries of the image of a matrix are given by polynomials in the entries of the matrix. Since
and since the set is dense, by continuity this function must map the entire set of matrices to the zero matrix. Therefore, the Cayley–Hamilton theorem is true for complex numbers, and must therefore also hold for - or -valued matrices.
While this provides a valid proof, the argument is not very satisfactory, since the identities represented by the theorem do not in any way depend on the nature of the matrix, nor on the kind of entries allowed. We shall therefore now consider only arguments that prove the theorem directly for any matrix using algebraic manipulations only; these also have the benefit of working for matrices with entries in any commutative ring.
There is a great variety of such proofs of the Cayley–Hamilton theorem, of which several will be given here. They vary in the amount of abstract algebraic notions required to understand the proof. The simplest proofs use just those notions needed to formulate the theorem, but involve technical computations that render somewhat mysterious the fact that they lead precisely to the correct conclusion. It is possible to avoid such details, but at the price of involving more subtle algebraic notions: polynomials with coefficients in a non-commutative ring, or matrices with unusual kinds of entries.