Permutation matrix

In mathematics, particularly in matrix theory, a permutation matrix is a square binary matrix that has exactly one entry of 1 in each row and each column with all other entries 0. An permutation matrix can represent a permutation of elements. Pre-multiplying an -row matrix by a permutation matrix, forming, results in permuting the rows of, while post-multiplying an -column matrix, forming, permutes the columns of.
Every permutation matrix P is orthogonal, with its inverse equal to its transpose:. Indeed, permutation matrices can be characterized as the orthogonal matrices whose entries are all non-negative.

The two permutation/matrix correspondences

There are two natural one-to-one correspondences between permutations and permutation matrices, one of which works along the rows of the matrix, the other along its columns. Here is an example, starting with a permutation in two-line form at the upper left:
The row-based correspondence takes the permutation to the matrix at the upper right. The first row of has its 1 in the third column because. More generally, we have where when and otherwise.
The column-based correspondence takes to the matrix at the lower left. The first column of has its 1 in the third row because. More generally, we have where is 1 when and 0 otherwise. Since the two recipes differ only by swapping i with j, the matrix is the transpose of ; and, since is a permutation matrix, we have. Tracing the other two sides of the big square, we have and.

Permutation matrices permute rows or columns

Multiplying a matrix M by either or on either the left or the right will permute either the rows or columns of M by either or ⁻¹. The details are a bit tricky.
To begin with, when we permute the entries of a vector by some permutation, we move the entry of the input vector into the slot of the output vector. Which entry then ends up in, say, the first slot of the output? Answer: The entry for which, and hence. Arguing similarly about each of the slots, we find that the output vector is
even though we are permuting by, not by. Thus, in order to permute the entries by, we must permute the indices by.
Now, suppose that we pre-multiply some n-row matrix by the permutation matrix. By the rule for matrix multiplication, the entry in the product is
where is 0 except when, when it is 1. Thus, the only term in the sum that survives is the term in which, and the sum reduces to. Since we have permuted the row index by, we have permuted the rows of M themselves by. A similar argument shows that post-multiplying an n-column matrix M by permutes its columns by.
The other two options are pre-multiplying by or post-multiplying by, and they permute the rows or columns respectively by ⁻¹, instead of by.

The transpose is also the inverse

A related argument proves that, as we claimed above, the transpose of any permutation matrix P also acts as its inverse, which implies that P is invertible. If, then the entry of its transpose is. The entry of the product is then
Whenever, the term in this sum is the product of two different entries in the column of P; so all terms are 0, and the sum is 0. When, we are summing the squares of the entries in the row of P, so the sum is 1. The product is thus the identity matrix. A symmetric argument shows the same for, implying that P is invertible with.

Multiplying permutation matrices

Given two permutations of elements and, the product of the corresponding column-based permutation matrices and is given, as you might expect, by
where the composed permutation applies first and then, working from right to left:
This follows because pre-multiplying some matrix by and then pre-multiplying the resulting product by gives the same result as pre-multiplying just once by the combined.
For the row-based matrices, there is a twist: The product of and is given by
with applied before in the composed permutation. This happens because we must post-multiply to avoid inversions under the row-based option, so we would post-multiply first by and then by.
Some people, when applying a function to an argument, write the function after the argument, rather than before it. When doing linear algebra, they work with linear spaces of row vectors, and they apply a linear map to an argument by using the map's matrix to post-multiply the argument's row vector. They often use a left-to-right composition operator, which we here denote using a semicolon; so the composition is defined either by
or, more elegantly, by
with applied first. That notation gives us a simpler rule for multiplying row-based permutation matrices:

Matrix group

When is the identity permutation, which has for all i, both and are the identity matrix.
There are permutation matrices, since there are permutations and the map is a one-to-one correspondence between permutations and permutation matrices. By the formulas above, those permutation matrices form a group of order under matrix multiplication, with the identity matrix as its identity element, a group that we denote. The group is a subgroup of the general linear group of invertible matrices of real numbers. Indeed, for any field F, the group is also a subgroup of the group, where the matrix entries belong to F.
Let denote the symmetric group, or group of permutations, on where the group operation is the standard, right-to-left composition ""; and let denote the opposite group, which uses the left-to-right composition "". The map that takes to its column-based matrix is a faithful representation, and similarly for the map that takes to.

Doubly stochastic matrices

Every permutation matrix is doubly stochastic. The set of all doubly stochastic matrices is called the Birkhoff polytope, and the permutation matrices play a special role in that polytope. The Birkhoff–von Neumann theorem says that every doubly stochastic real matrix is a convex combination of permutation matrices of the same order, with the permutation matrices being precisely the extreme points of the Birkhoff polytope. The Birkhoff polytope is thus the convex hull of the permutation matrices.

Linear-algebraic properties

Just as each permutation is associated with two permutation matrices, each permutation matrix is associated with two permutations, as we can see by relabeling the example in the big square above starting with the matrix P at the upper right:
So we are here denoting the inverse of C as and the inverse of R as. We can then compute the linear-algebraic properties of P from some combinatorial properties that are shared by the two permutations and.
A point is fixed by just when it is fixed by, and the trace of P is the number of such shared fixed points. If the integer k is one of them, then the standard basis vector is an eigenvector of P.
To calculate the complex eigenvalues of P, write the permutation as a composition of disjoint cycles, say. For, let the length of the cycle be, and let be the set of complex solutions of, those solutions being the roots of unity. The multiset union of the is then the multiset of eigenvalues of P. Since writing as a product of cycles would give the same number of cycles of the same lengths, analyzing would give the same result. The multiplicity of any eigenvalue v is the number of i for which contains v.
From group theory we know that any permutation may be written as a composition of transpositions. Therefore, any permutation matrix factors as a product of row-switching elementary matrices, each of which has determinant −1. Thus, the determinant of the permutation matrix P is the sign of the permutation, which is also the sign of.

Restricted forms

Costas array, a permutation matrix in which the displacement vectors between the entries are all distinct
n-queens puzzle, a permutation matrix in which there is at most one entry in each diagonal and antidiagonal