Penalty method
In mathematical optimization, penalty methods are a class of algorithms for solving constrained optimization problems.
A penalty method replaces a constrained optimization problem by a series of unconstrained problems whose solutions ideally converge to the solution of the original constrained problem. The unconstrained problems are formed by adding a term, called a penalty function, to the objective function that consists of a penalty parameter multiplied by a measure of violation of the constraints. The measure of violation is nonzero when the constraints are violated and is zero in the region where constraints are not violated.
Description
Let us say we are solving the following constrained problem:

\min_x f(x) \quad \text{subject to} \quad c_i(x) \le 0, \; i \in I,

This problem can be solved as a series of unconstrained minimization problems

\min_x f_p(x) := f(x) + p \sum_{i \in I} g(c_i(x)),

where

g(c_i(x)) = \max(0, c_i(x))^2.

In the above equations, g(c_i(x)) is the exterior penalty function while p is the penalty coefficient. When the penalty coefficient is 0, f_p = f, meaning that we do not take the constraints into account.
In each iteration of the method, we increase the penalty coefficient p, solve the unconstrained problem, and use its solution as the initial guess for the next iteration. Solutions of the successive unconstrained problems asymptotically converge to the solution of the original constrained problem.
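The iteration above can be sketched in a few lines. The problem below (minimizing (x − 2)² subject to x ≥ 3, with constrained optimum x* = 3) is our own toy example, not from the article; it uses the quadratic exterior penalty max(0, c(x))² and warm-starts each unconstrained solve from the previous solution.

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem (our own illustration):
# minimize f(x) = (x - 2)^2  subject to  c(x) = 3 - x <= 0  (i.e. x >= 3).
# The constrained optimum is x* = 3.
def f(x):
    return (x[0] - 2.0) ** 2

def c(x):
    return 3.0 - x[0]  # positive exactly when the constraint is violated

def penalized(x, p):
    # Quadratic exterior penalty: f(x) + p * max(0, c(x))^2
    return f(x) + p * max(0.0, c(x)) ** 2

x = np.array([0.0])  # initial guess, infeasible on purpose
for p in [1.0, 10.0, 100.0, 1000.0, 10000.0]:
    res = minimize(penalized, x, args=(p,), method="Nelder-Mead")
    x = res.x  # warm-start the next, more heavily penalized, problem

print(x[0])  # approaches the constrained optimum x* = 3 as p grows
```

Note that for any finite p the minimizer sits slightly on the infeasible side (here x_p = 3 − 1/(1 + p)), which is characteristic of exterior penalty methods.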
Common penalty functions in constrained optimization are the quadratic penalty function and the deadzone-linear penalty function.
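The two penalty terms just mentioned can be written down directly for a single inequality constraint c(x) ≤ 0; the function names below are ours, chosen for illustration.

```python
# Penalty terms for a single inequality constraint c(x) <= 0,
# evaluated on the constraint value c_val = c(x).

def quadratic_penalty(c_val):
    # Smooth and differentiable; grows quadratically with the violation.
    return max(0.0, c_val) ** 2

def deadzone_linear_penalty(c_val):
    # Zero inside the feasible region (the "dead zone"), linear outside it.
    return max(0.0, c_val)

print(quadratic_penalty(-1.0), deadzone_linear_penalty(-1.0))  # 0.0 0.0 (feasible)
print(quadratic_penalty(2.0), deadzone_linear_penalty(2.0))    # 4.0 2.0 (violated)
```

Both vanish wherever the constraint holds, as the definition of a violation measure requires; they differ in how steeply they punish infeasibility.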
Convergence
We first consider the set of global optimizers of the original problem, X*. Assume that the objective f has bounded level sets and that the original problem is feasible. Then:
- For every penalty coefficient p, the set of global optimizers of the penalized problem, Xp*, is non-empty.
- For every ε > 0, there exists a penalty coefficient p such that the set Xp* is contained in an ε-neighborhood of the set X*.
A second theorem considers local optimizers. Let x* be a non-degenerate local optimizer of the original problem. Then there exists a neighborhood V* of x* and some p0 > 0 such that, for all p > p0, the penalized objective fp has exactly one critical point xp in V*, and xp approaches x* as p → ∞. Also, the objective value f(xp) is weakly increasing with p.
Practical applications
Image compression optimization algorithms can make use of penalty functions for selecting how best to compress zones of colour to single representative values. The penalty method is often used in computational mechanics, especially in the finite element method, to enforce conditions such as contact.

The advantage of the penalty method is that, once we have a penalized objective with no constraints, we can apply any unconstrained optimization method to it. The disadvantage is that, as the penalty coefficient p grows, the unconstrained problem becomes ill-conditioned: some coefficients become very large, which may cause numerical errors and slow convergence of the unconstrained minimization.
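The ill-conditioning can be made concrete on a toy problem of our own: minimizing f(x, y) = x² + y² subject to 1 − x ≤ 0. On the infeasible side, the quadratic-penalty objective x² + y² + p(1 − x)² has Hessian diag(2 + 2p, 2), so its condition number grows linearly with p.

```python
import numpy as np

# Our own illustration of ill-conditioning: for the penalized objective
# x^2 + y^2 + p * max(0, 1 - x)^2, the Hessian on the infeasible side
# is diag(2 + 2p, 2). Its condition number is (2 + 2p) / 2 = 1 + p.
for p in [1.0, 100.0, 10000.0]:
    hessian = np.diag([2.0 + 2.0 * p, 2.0])
    print(p, np.linalg.cond(hessian))  # condition number equals 1 + p
```

This growing spread of curvatures is exactly what slows gradient-based unconstrained solvers and amplifies rounding errors at large p.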