Point-set registration


In computer vision, pattern recognition, and robotics, point-set registration, also known as point-cloud registration or scan matching, is the process of finding a spatial transformation that aligns two point clouds. The purpose of finding such a transformation includes merging multiple data sets into a globally consistent model, and mapping a new measurement to a known data set to identify features or to estimate its pose. Raw 3D point cloud data are typically obtained from Lidars and RGB-D cameras. 3D point clouds can also be generated from computer vision algorithms such as triangulation, bundle adjustment, and more recently, monocular image depth estimation using deep learning. For 2D point set registration used in image processing and feature-based image registration, a point set may be 2D pixel coordinates obtained by feature extraction from an image, for example corner detection. Point cloud registration has extensive applications in autonomous driving, motion estimation and 3D reconstruction, object detection and pose estimation, robotic manipulation, simultaneous localization and mapping, panorama stitching, virtual and augmented reality, and medical imaging.
As a special case, registration of two point sets that only differ by a 3D rotation, is called the Wahba Problem and also related to the orthogonal procrustes problem.

Formulation

The problem may be summarized as follows:
Let be two finite size point sets in a finite-dimensional real vector space, which contain and points respectively. The problem is to find a transformation to be applied to the moving "model" point set such that the difference between and the static "scene" set is minimized. In other words, a mapping from to is desired which yields the best alignment between the transformed "model" set and the "scene" set. The mapping may consist of a rigid or non-rigid transformation. The transformation model may be written as, using which the transformed, registered model point set is:
The output of a point set registration algorithm is therefore the optimal transformation such that is best aligned to, according to some defined notion of distance function :
where is used to denote the set of all possible transformations that the optimization tries to search for. The most popular choice of the distance function is to take the square of the Euclidean distance for every pair of points:
where denotes the vector 2-norm, is the corresponding point in set that attains the shortest distance to a given point in set after transformation. Minimizing such a function in rigid registration is equivalent to solving a least squares problem.

Types of algorithms

When the correspondences are given before the optimization, for example, using feature matching techniques, then the optimization only needs to estimate the transformation. This type of registration is called correspondence-based registration. On the other hand, if the correspondences are unknown, then the optimization is required to jointly find out the correspondences and transformation together. This type of registration is called simultaneous pose and correspondence registration.

Rigid registration

Given two point sets, rigid registration yields a rigid transformation which maps one point set to the other. A rigid transformation is defined as a transformation that does not change the distance between any two points. Typically such a transformation consists of translation and rotation. In rare cases, the point set may also be mirrored. In robotics and computer vision, rigid registration has the most applications.

Non-rigid registration

Given two point sets, non-rigid registration yields a non-rigid transformation which maps one point set to the other. Non-rigid transformations include affine transformations such as scaling and shear mapping. However, in the context of point set registration, non-rigid registration typically involves nonlinear transformation. If the eigenmodes of variation of the point set are known, the nonlinear transformation may be parametrized by the eigenvalues. A nonlinear transformation may also be parametrized as a thin plate spline.

Other types

Some approaches to point set registration use algorithms that solve the more general graph matching problem. However, the computational complexity of such methods tend to be high and they are limited to rigid registrations.
In this article, we will only consider algorithms for rigid registration, where the transformation is assumed to contain 3D rotations and translations.
The PCL is an open-source framework for n-dimensional point cloud and 3D geometry processing. It includes several point registration algorithms.

Correspondence-based registration

Correspondence-based methods assume the putative correspondences are given for every point. Therefore, we arrive at a setting where both point sets and have points and the correspondences are given.

Outlier-free registration

In the simplest case, one can assume that all the correspondences are correct, meaning that the points are generated as follows:where is a uniform scaling factor, is a proper 3D rotation matrix, is a 3D translation vector and models the unknown additive noise. Specifically, if the noise is assumed to follow a zero-mean isotropic Gaussian distribution with standard deviation, i.e.,, then the following optimization can be shown to yield the maximum likelihood estimate for the unknown scale, rotation and translation:Note that when the scaling factor is 1 and the translation vector is zero, then the optimization recovers the formulation of the Wahba problem. Despite the non-convexity of the optimization due to non-convexity of the set, seminal work by Berthold K.P. Horn showed that actually admits a closed-form solution, by decoupling the estimation of scale, rotation and translation. Similar results were discovered by Arun et al. In addition, in order to find a unique transformation, at least non-collinear points in each point set are required.
More recently, Briales and Gonzalez-Jimenez have developed a semidefinite relaxation using Lagrangian duality, for the case where the model set contains different 3D primitives such as points, lines and planes. Interestingly, the semidefinite relaxation is empirically tight, i.e., a certifiably globally optimal solution can be extracted from the solution of the semidefinite relaxation.

Robust registration

The least squares formulation is known to perform arbitrarily badly in the presence of outliers. An outlier correspondence is a pair of measurements that departs from the generative model. In this case, one can consider a different generative model as follows:where if the th pair is an inlier, then it obeys the outlier-free model, i.e., is obtained from by a spatial transformation plus some small noise; however, if the th pair is an outlier, then can be any arbitrary vector. Since one does not know which correspondences are outliers beforehand, robust registration under the generative model is of paramount importance for computer vision and robotics deployed in the real world, because current feature matching techniques tend to output highly corrupted correspondences where over of the correspondences can be outliers.
Next, we describe several common paradigms for robust registration.

Maximum consensus

seeks to find the largest set of correspondences that are consistent with the generative model for some choice of spatial transformation. Formally speaking, maximum consensus solves the following optimization:where denotes the cardinality of the set. The constraint in enforces that every pair of measurements in the inlier set must have residuals smaller than a pre-defined threshold. Unfortunately, recent analyses have shown that globally solving problem is NP-Hard, and global algorithms typically have to resort to branch-and-bound techniques that take exponential-time complexity in the worst case.
Although solving consensus maximization exactly is hard, there exist efficient heuristics that perform quite well in practice. One of the most popular heuristics is the Random Sample Consensus scheme. RANSAC is an iterative hypothesize-and-verify method. At each iteration, the method first randomly samples 3 out of the total number of correspondences and computes a hypothesis using Horn's method, then the method evaluates the constraints in to count how many correspondences actually agree with such a hypothesis. The algorithm terminates either after it has found a consensus set that has enough correspondences, or after it has reached the total number of allowed iterations. RANSAC is highly efficient because the main computation of each iteration is carrying out the closed-form solution in Horn's method. However, RANSAC is non-deterministic and only works well in the low-outlier-ratio regime, because its runtime grows exponentially with respect to the outlier ratio.
To fill the gap between the fast but inexact RANSAC scheme and the exact but exhaustive BnB optimization, recent researches have developed deterministic approximate methods to solve consensus maximization.

Outlier removal

Outlier removal methods seek to pre-process the set of highly corrupted correspondences before estimating the spatial transformation. The motivation of outlier removal is to significantly reduce the number of outlier correspondences, while maintaining inlier correspondences, so that optimization over the transformation becomes easier and more efficient.
Parra et al. have proposed a method called Guaranteed Outlier Removal that uses geometric constraints to prune outlier correspondences while guaranteeing to preserve inlier correspondences. GORE has been shown to be able to drastically reduce the outlier ratio, which can significantly boost the performance of consensus maximization using RANSAC or BnB. Yang and Carlone have proposed to build pairwise translation-and-rotation-invariant measurements from the original set of measurements and embed TRIMs as the edges of a graph whose nodes are the 3D points. Since inliers are pairwise consistent in terms of the scale, they must form a clique within the graph. Therefore, using efficient algorithms for computing the maximum clique of a graph can find the inliers and effectively prune the outliers. The maximum clique based outlier removal method is also shown to be quite useful in real-world point set registration problems. Similar outlier removal ideas were also proposed by Parra et al..