Mercer's theorem

In mathematics, specifically functional analysis, Mercer's theorem is a representation of a symmetric positive-definite function on a square as a sum of a convergent sequence of product functions. This theorem is one of the most notable results of the work of James Mercer. It is an important theoretical tool in the theory of integral equations; it is used in the Hilbert space theory of stochastic processes, for example the Karhunen–Loève theorem; and it is also used in the reproducing kernel Hilbert space theory, where it characterizes a symmetric positive-definite kernel as a reproducing kernel.

Introduction

To explain Mercer's theorem, we first consider an important special case; see below for a more general formulation.
A kernel, in this context, is a symmetric continuous function
$$K : [a,b] \times [a,b] \to \mathbb{R},$$
where K(x, y) = K(y, x) for all x, y ∈ [a, b].
K is said to be a positive-definite kernel if and only if
$$\sum_{i=1}^{n} \sum_{j=1}^{n} K(x_i, x_j)\, c_i c_j \geq 0$$
for all finite sequences of points x₁, ..., x_n of [a, b] and all choices of real numbers c₁, ..., c_n. Note that the term "positive-definite" is well established in the literature despite the weak inequality in the definition.
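As a rough numerical illustration of this definition, the sketch below evaluates the double sum of K(x_i, x_j) c_i c_j for randomly drawn points and coefficients. The kernel K(x, y) = min(x, y) on [0, 1] is an assumption chosen for the example, not something fixed by the text; it is a standard positive-definite kernel. A random check of this kind can only fail to find a counterexample; it does not prove positive-definiteness.

```python
import random

def K(x, y):
    # Example kernel (assumed for illustration): K(x, y) = min(x, y) on [0, 1].
    return min(x, y)

def quadratic_form(points, coeffs):
    # Double sum over i and j of K(x_i, x_j) * c_i * c_j.
    return sum(K(xi, xj) * ci * cj
               for xi, ci in zip(points, coeffs)
               for xj, cj in zip(points, coeffs))

random.seed(0)
values = []
for _ in range(200):
    n = random.randint(1, 8)
    points = [random.random() for _ in range(n)]
    coeffs = [random.uniform(-1.0, 1.0) for _ in range(n)]
    values.append(quadratic_form(points, coeffs))

# Smallest value seen; it should be non-negative up to rounding error.
smallest = min(values)
print(smallest >= -1e-12)
```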
The fundamental characterization of stationary positive-definite kernels is given by Bochner's theorem. It states that a continuous function K(x, y) = k(x − y) is positive-definite if and only if k can be expressed as the Fourier transform of a finite non-negative measure μ:
$$k(x - y) = \int_{-\infty}^{\infty} e^{i \omega (x - y)} \, d\mu(\omega).$$
This spectral representation reveals the connection between positive definiteness and harmonic analysis. When the kernel is stationary, that is, when it can be expressed as a one-variable function of the distance between points rather than a two-variable function of the positions of pairs of points, it provides a stronger and more direct characterization of positive definiteness than the abstract definition in terms of inequalities.
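To make the spectral representation concrete, here is a minimal sketch for one well-known pair: the Gaussian kernel k(d) = exp(−d²/2) is the Fourier transform of the standard normal probability measure. The choice of kernel, the integration window and the step count are all assumptions made for the example. Because the density is even, only the cosine part of the complex exponential contributes.

```python
import math

def gaussian_density(w):
    # Density of the standard normal measure (finite and non-negative).
    return math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)

def fourier_transform_of_measure(d, half_width=12.0, steps=6000):
    # Trapezoidal approximation of the integral of exp(i*w*d) against the measure.
    # The imaginary part vanishes by symmetry, so only cos(w*d) is integrated.
    h = 2.0 * half_width / steps
    total = 0.0
    for m in range(steps + 1):
        w = -half_width + m * h
        weight = 0.5 if m in (0, steps) else 1.0
        total += weight * math.cos(w * d) * gaussian_density(w)
    return total * h

def k(d):
    # Stationary Gaussian kernel K(x, y) = k(x - y).
    return math.exp(-0.5 * d * d)

max_error = max(abs(fourier_transform_of_measure(d) - k(d))
                for d in (0.0, 0.7, 1.5, 3.0))
print(max_error < 1e-8)
```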
Associated to K is a linear operator on functions defined by the integral
$$[T_K \varphi](x) = \int_a^b K(x, s)\, \varphi(s)\, ds.$$
We assume φ can range through the space L²[a, b] of real-valued square-integrable functions; however, in many cases the associated reproducing kernel Hilbert space can be strictly larger than L²[a, b]. Since T_K is a linear operator, we can talk about eigenvalues and eigenfunctions of T_K.
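The action of the integral operator can be approximated by replacing the integral with a quadrature rule. The sketch below uses the midpoint rule; the kernel min(x, s) on [0, 1], the test function φ = 1 and the helper name apply_T_K are illustrative assumptions. For this choice the integral has the closed form x − x²/2, which gives something to compare against.

```python
def K(x, s):
    # Example kernel (assumed for illustration).
    return min(x, s)

def apply_T_K(phi, x, a=0.0, b=1.0, n=2000):
    # Midpoint-rule approximation of the integral of K(x, s) * phi(s) over [a, b].
    h = (b - a) / n
    return h * sum(K(x, a + (m + 0.5) * h) * phi(a + (m + 0.5) * h)
                   for m in range(n))

def constant_one(s):
    return 1.0

# Closed form for phi = 1: the integral of min(x, s) over [0, 1] is x - x**2 / 2.
approx = apply_T_K(constant_one, 0.5)
exact = 0.5 - 0.5 ** 2 / 2
print(abs(approx - exact) < 1e-6)
```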
Theorem. Suppose K is a continuous symmetric positive-definite kernel. Then there is an orthonormal basis {e_i}_i of L²[a, b] consisting of eigenfunctions of T_K such that the corresponding sequence of eigenvalues {λ_i}_i is nonnegative. The eigenfunctions corresponding to non-zero eigenvalues are continuous on [a, b] and K has the representation
$$K(s, t) = \sum_{j=1}^{\infty} \lambda_j \, e_j(s)\, e_j(t),$$
where the convergence is absolute and uniform.
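A minimal sketch of the expansion for one concrete kernel may help. It assumes K(s, t) = min(s, t) on [0, 1], whose eigenpairs are known in closed form (this is the covariance kernel of Brownian motion): λ_j = 1/((j − 1/2)²π²) and e_j(t) = √2 sin((j − 1/2)πt). The function names and the evaluation grid are illustrative choices. The code compares partial sums of the series against the kernel and records the largest error on the grid for increasing numbers of terms.

```python
import math

def eigenvalue(j):
    # lambda_j = 1 / ((j - 1/2)^2 * pi^2), for j = 1, 2, ...
    return 1.0 / (((j - 0.5) * math.pi) ** 2)

def eigenfunction(j, t):
    # e_j(t) = sqrt(2) * sin((j - 1/2) * pi * t), orthonormal in L^2[0, 1].
    return math.sqrt(2.0) * math.sin((j - 0.5) * math.pi * t)

def mercer_partial_sum(s, t, terms):
    # Sum of lambda_j * e_j(s) * e_j(t) over the first `terms` indices.
    return sum(eigenvalue(j) * eigenfunction(j, s) * eigenfunction(j, t)
               for j in range(1, terms + 1))

grid = [i / 10 for i in range(11)]

def max_error(terms):
    # Largest deviation from min(s, t) over the grid.
    return max(abs(mercer_partial_sum(s, t, terms) - min(s, t))
               for s in grid for t in grid)

errors = [max_error(n) for n in (10, 100, 1000)]
print(errors)
```

The errors shrink as more terms are included, in line with the uniform convergence stated in the theorem, although for this kernel the decay is only of order 1/N.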

Details

We now explain in greater detail the structure of the proof of
Mercer's theorem, particularly how it relates to spectral theory of compact operators.

The map K ↦ T_K is injective. T_K is a non-negative symmetric compact operator on L²[a, b]; moreover K(x, x) ≥ 0.

To show compactness, show that the image of the unit ball of L²[a, b] under T_K is equicontinuous and apply Ascoli's theorem to conclude that the image of the unit ball is relatively compact in C([a, b]) with the uniform norm and, a fortiori, in L²[a, b].
Now apply the spectral theorem for compact operators on Hilbert spaces to T_K to show the existence of the orthonormal basis {e_i}_i of L²[a, b] satisfying
$$\lambda_i e_i(t) = [T_K e_i](t) = \int_a^b K(t, s)\, e_i(s)\, ds.$$
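The eigenvalue equation for T_K can be checked numerically for a kernel whose eigenpairs are known. The sketch below again assumes K(t, s) = min(t, s) on [0, 1] with λ_j = 1/((j − 1/2)²π²) and e_j(t) = √2 sin((j − 1/2)πt); the helper names and the midpoint quadrature are illustrative choices. It measures how far the quadrature of K against e_j is from λ_j e_j at a few sample points.

```python
import math

def K(t, s):
    return min(t, s)

def e(j, t):
    # Closed-form eigenfunction of the min kernel on [0, 1].
    return math.sqrt(2.0) * math.sin((j - 0.5) * math.pi * t)

def lam(j):
    # Corresponding eigenvalue.
    return 1.0 / (((j - 0.5) * math.pi) ** 2)

def T_K_e(j, t, n=4000):
    # Midpoint-rule approximation of the integral of K(t, s) * e_j(s) over [0, 1].
    h = 1.0 / n
    return h * sum(K(t, (m + 0.5) * h) * e(j, (m + 0.5) * h) for m in range(n))

# Largest residual of the eigenvalue equation over a few indices and points.
residual = max(abs(T_K_e(j, t) - lam(j) * e(j, t))
               for j in (1, 2, 3) for t in (0.25, 0.5, 0.9))
print(residual < 1e-5)
```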
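If λ_i ≠ 0, the eigenvector (eigenfunction) e_i is seen to be continuous on [a, b]. Now
$$\sum_{i=1}^{\infty} \lambda_i \, |e_i(t)\, e_i(s)| \leq \sup_{x \in [a,b]} |K(x, x)|,$$
which shows that the series
$$\sum_{i=1}^{\infty} \lambda_i \, e_i(t)\, e_i(s)$$
converges absolutely and uniformly to a kernel K₀ which is easily seen to define the same operator as the kernel K. Hence K = K₀, from which Mercer's theorem follows.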
Finally, to show non-negativity of the eigenvalues, one can write λ⟨f, f⟩ = ⟨f, T_K f⟩ for an eigenfunction f with eigenvalue λ and express the right-hand side as an integral that is well approximated by its Riemann sums, which are non-negative by positive-definiteness of K. This implies λ⟨f, f⟩ ≥ 0 and hence λ ≥ 0.
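The Riemann-sum step can be illustrated directly. A midpoint Riemann sum of the double integral of f(x) K(x, y) f(y) is itself a quadratic form in the values h·f(x_i), so positive-definiteness of K forces it to be non-negative. The kernel min(x, y) on [0, 1] and the test function f are arbitrary choices made for this sketch.

```python
import math

def K(x, y):
    return min(x, y)

def f(x):
    # An arbitrary sign-changing test function (illustrative choice).
    return math.cos(5.0 * x) - 0.3

def riemann_sum_inner_product(n=150):
    # Midpoint Riemann sum of the double integral of f(x) * K(x, y) * f(y)
    # over [0, 1] x [0, 1], which approximates <f, T_K f>.
    h = 1.0 / n
    nodes = [(m + 0.5) * h for m in range(n)]
    return h * h * sum(f(x) * K(x, y) * f(y) for x in nodes for y in nodes)

value = riemann_sum_inner_product()
print(value >= 0.0)
```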

Trace

The following is immediate:
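Theorem. Suppose K is a continuous symmetric positive-definite kernel; T_K has a sequence of nonnegative eigenvalues {λ_i}_i. Then
$$\int_a^b K(t, t)\, dt = \sum_i \lambda_i.$$
This shows that the operator T_K is a trace class operator and
$$\operatorname{trace}(T_K) = \int_a^b K(t, t)\, dt.$$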
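As a quick sanity check of the trace identity, consider once more the assumed example K(s, t) = min(s, t) on [0, 1], with eigenvalues λ_j = 1/((j − 1/2)²π²). The integral of K(t, t) = t over [0, 1] is 1/2, and the partial sums of the eigenvalues should approach the same number from below. The variable names are illustrative.

```python
import math

def eigenvalue(j):
    # Eigenvalues of the min kernel on [0, 1].
    return 1.0 / (((j - 0.5) * math.pi) ** 2)

# Integral of K(t, t) = t over [0, 1], in closed form.
diagonal_integral = 0.5

# Partial sums of the eigenvalues for increasing truncation levels.
partial_traces = [sum(eigenvalue(j) for j in range(1, n + 1))
                  for n in (10, 1000, 100000)]
print(partial_traces, diagonal_integral)
```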

Generalizations

Mercer's theorem itself is a generalization of the result that any symmetric positive-semidefinite matrix is the Gramian matrix of a set of vectors.
The first generalization replaces the interval [a, b] with any compact Hausdorff space X, and Lebesgue measure on [a, b] is replaced by a finite countably additive measure μ on the Borel algebra of X whose support is X. This means that μ(U) > 0 for any nonempty open subset U of X.
A recent generalization replaces these conditions by the following: the set X is a first-countable topological space endowed with a Borel measure μ. X is the support of μ and, for all x in X, there is an open set U containing x and having finite measure. Then essentially the same result holds:
Theorem. Suppose K is a continuous symmetric positive-definite kernel on X. If the function κ is in L¹_μ(X), where κ(x) = K(x, x) for all x in X, then there is an orthonormal set {e_i}_i of L²_μ(X) consisting of eigenfunctions of T_K such that the corresponding sequence of eigenvalues {λ_i}_i is nonnegative. The eigenfunctions corresponding to non-zero eigenvalues are continuous on X and K has the representation
$$K(s, t) = \sum_{j=1}^{\infty} \lambda_j \, e_j(s)\, e_j(t),$$
where the convergence is absolute and uniform on compact subsets of X.
The next generalization deals with representations of measurable kernels.
Let (X, M, μ) be a σ-finite measure space. An L² kernel on X is a function
$$K \in L^2_{\mu \otimes \mu}(X \times X).$$
L² kernels define a bounded operator T_K by the formula
$$\langle T_K \varphi, \psi \rangle = \int_{X \times X} K(y, x)\, \varphi(y)\, \psi(x)\, d[\mu \otimes \mu](y, x).$$
T_K is a compact operator. If the kernel K is symmetric, by the spectral theorem, T_K has an orthonormal basis of eigenvectors. Those eigenvectors that correspond to non-zero eigenvalues can be arranged in a sequence {e_i}_i.
Theorem. If K is a symmetric positive-definite kernel on (X, M, μ), then
$$K(y, x) = \sum_{i \in \mathbb{N}} \lambda_i \, e_i(y)\, e_i(x),$$
where the convergence is in the L² norm. Note that when continuity of the kernel is not assumed, the expansion no longer converges uniformly.

Mercer's condition

A real-valued function K(x, y) is said to fulfill Mercer's condition if for all square-integrable functions g(x) one has
$$\iint g(x)\, K(x, y)\, g(y)\, dx\, dy \geq 0.$$

Discrete analog

This is analogous to the definition of a positive-semidefinite matrix. This is a matrix K of dimension N which satisfies, for all vectors g, the property
$$(g, Kg) = g^{T} K g = \sum_{i=1}^{N} \sum_{j=1}^{N} g_i \, K_{ij} \, g_j \geq 0.$$
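The discrete condition is easy to try out on a small matrix. The sketch below builds K as BᵀB for a hypothetical 2×3 matrix B, which makes K positive semidefinite because gᵀKg is then the squared length of Bg, and evaluates the double sum for random vectors g. The matrix B and the sampling range are assumptions chosen only for the example.

```python
import random

def quadratic_form(K, g):
    # Double sum over i and j of g_i * K_ij * g_j.
    n = len(g)
    return sum(g[i] * K[i][j] * g[j] for i in range(n) for j in range(n))

# Hypothetical example: K = B^T B is positive semidefinite for any real matrix B.
B = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]
N = 3
K = [[sum(B[r][i] * B[r][j] for r in range(len(B))) for j in range(N)]
     for i in range(N)]

random.seed(1)
forms = [quadratic_form(K, [random.uniform(-5.0, 5.0) for _ in range(N)])
         for _ in range(500)]

# The vector (-2, 1, 1) lies in the null space of B, so the form vanishes there,
# which shows why the inequality in the definition is weak rather than strict.
null_value = quadratic_form(K, [-2.0, 1.0, 1.0])
print(min(forms) >= -1e-9, null_value)
```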

Examples

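A positive constant function
$$K(x, y) = c$$
satisfies Mercer's condition, as then the integral becomes by Fubini's theorem
$$\iint g(x)\, c\, g(y)\, dx\, dy = c \int g(x)\, dx \int g(y)\, dy = c \left( \int g(x)\, dx \right)^{2},$$
which is indeed non-negative.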
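The factorization in this example can be mirrored numerically. The sketch below assumes the domain [0, 1], a constant c = 2.5 and an arbitrary test function g, none of which come from the text, and compares a midpoint Riemann sum of the double integral with c times the square of the single integral.

```python
import math

def g(x):
    # An arbitrary test function (illustrative choice).
    return math.sin(3.0 * x) - 0.2 * x

c = 2.5          # positive constant kernel K(x, y) = c
n = 200
h = 1.0 / n
nodes = [(m + 0.5) * h for m in range(n)]

# Midpoint Riemann sum of the double integral of g(x) * c * g(y) over [0, 1]^2.
double_integral = h * h * sum(g(x) * c * g(y) for x in nodes for y in nodes)

# The same quantity via the factorization: c * (integral of g)^2.
single_integral = h * sum(g(x) for x in nodes)
factored = c * single_integral ** 2

print(abs(double_integral - factored) < 1e-9, double_integral >= 0.0)
```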