Non-negative least squares


In mathematical optimization, the problem of non-negative least squares is a type of constrained least squares problem where the coefficients are not allowed to become negative. That is, given a matrix and a vector of response variables, the goal is to find
Here means that each component of the vector should be non-negative, and denotes the Euclidean norm.
Non-negative least squares problems turn up as subproblems in matrix decomposition, e.g. in algorithms for PARAFAC and non-negative matrix/tensor factorization. The latter can be considered a generalization of NNLS.
Another generalization of NNLS is bounded-variable least squares, with simultaneous upper and lower bounds.

Quadratic programming version

The NNLS problem is equivalent to a quadratic programming problem
where = and =. This problem is convex, as is positive semidefinite and the non-negativity constraints form a convex feasible set.

Algorithms

The first widely used algorithm for solving this problem is an active-set method published by Lawson and Hanson in their 1974 book Solving Least Squares Problems. In pseudocode, this algorithm looks as follows:
  • Inputs:
  • * a real-valued matrix of dimension,
  • * a real-valued vector of dimension,
  • * a real value, the tolerance for the stopping criterion.
  • Initialize:
  • * Set.
  • * Set.
  • * Set to an all-zero vector of dimension.
  • * Set.
  • * Let denote the sub-vector with indexes from R
  • Main loop: while and :
  • * Let in be the index of in.
  • * Add to.
  • * Remove from.
  • * Let be restricted to the variables included in.
  • * Let be vector of same length as.
Let denote the sub-vector with indexes from P
Let denote the sub-vector with indexes from R.
  • * Set
  • * Set to zero
  • * While :
  • ** Let.
  • ** Set to.
  • ** Move to all indices in such that.
  • ** Set
  • ** Set to zero.
  • * Set to.
  • * Set to.
  • Output: x
This algorithm takes a finite number of steps to reach a solution and smoothly improves its candidate solution as it goes, but is very slow in practice, owing largely to the computation of the pseudoinverse. Variants of this algorithm are available in MATLAB as the routine and in SciPy as.
Many improved algorithms have been suggested since 1974. Fast NNLS is an optimized version of the Lawson–Hanson algorithm. Other algorithms include variants of Landweber's gradient descent method, coordinate-wise optimization based on the quadratic programming problem above, and an active set method called TNT-NN.