A Review of Point Cloud Alignment Methods and Their Applications

Abstract: Point cloud alignment is a key technique in computer vision that estimates the transformation between two point clouds. With the development of optimization methods and deep learning techniques, the robustness and efficiency of point cloud alignment have improved significantly, and recent studies have combined the two families of methods to further improve performance. Meanwhile, advances in 3D sensing and 3D reconstruction have given rise to the new research field of cross-source point cloud alignment. This paper reviews recent advances in point cloud alignment, covering both homologous and cross-source techniques, and explores the interplay between optimization and deep learning. It also reviews relevant benchmark datasets and the applications of point cloud alignment in different domains, and closes with future research directions.


Introduction
Point cloud alignment has a wide range of applications, including but not limited to 3D modeling [3], autonomous driving [10], cultural heritage preservation [11], medical image processing, and augmented reality. However, point cloud data are usually incomplete, contain noise and errors, and vary in appearance with the acquisition viewpoint, all of which pose challenges for alignment [4] [5]. In terms of data source, point cloud alignment can be divided into two categories, homologous and cross-source. In terms of technical implementation, homologous alignment can be further divided into feature-based methods, optimization-based methods, and deep learning methods [7] [8]. Figure 1 illustrates some representative methods. Feature-based methods first extract feature points from the point clouds, then find the correspondences between these feature points and calculate the optimal transformation matrix that aligns them. Optimization-based methods, on the other hand, gradually improve the alignment between point clouds through an iterative process. In recent years, with the rapid development of artificial intelligence, deep learning for point cloud alignment has become a research hotspot. Methods of this type automatically learn and extract point cloud features by training deep neural networks, and show good performance and broad application prospects [9].

Theoretical Foundations
Point cloud data consist of a set of points in three-dimensional space, where each point carries positional information (X, Y, Z coordinates) and may also carry additional attributes such as color and intensity. These data are typically acquired by laser scanners, stereo vision systems, or other 3D imaging techniques. Because of limited equipment accuracy and environmental factors, point cloud data usually contain noise and incomplete information, which creates challenges for processing and analysis [12]. Accurate integration of point cloud data from different sources into a unified 3D model is essential for detailed spatial analysis and decision making. In architecture and engineering, accurate 3D models of buildings or construction sites can be generated by aligning point clouds acquired from multiple scans [13]. In archaeology, point cloud alignment allows researchers to reconstruct complete models of ruins or artifacts for better historical analysis [14].
Point cloud alignment consists of three basic steps: feature extraction, feature matching, and transform estimation. The feature extraction stage identifies key feature points and their descriptors in the point cloud [15]. The feature matching stage identifies corresponding points by comparing the feature points of different point clouds [16]. The transform estimation stage then computes the transformation that best aligns the matched points. With the development of deep learning techniques, especially Convolutional Neural Networks (CNNs) and Deep Belief Networks (DBNs), the degree of automation of feature extraction and matching has improved significantly [17]. These methods improve the accuracy and efficiency of matching by learning complex patterns from large amounts of data. In addition, handling large-scale point cloud data and improving the robustness of algorithms are active topics in current research [18,19].
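The transform estimation step can be illustrated with a short sketch. Assuming the matching stage has already produced paired points and that the transformation is rigid, a standard closed-form least-squares solution uses the SVD of the cross-covariance matrix. The function name `estimate_rigid_transform` and the synthetic data are illustrative assumptions and do not come from any of the reviewed papers.

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Return (R, t) minimizing sum ||R @ src[i] + t - dst[i]||^2 over matched pairs."""
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) in the SVD solution.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t

# Synthetic check (assumed data): rotate 30 degrees about z, then translate.
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -0.2, 1.0])
rng = np.random.default_rng(0)
src = rng.uniform(-1.0, 1.0, size=(50, 3))
dst = src @ R_true.T + t_true          # dst[i] corresponds to src[i]
R_est, t_est = estimate_rigid_transform(src, dst)
```

With noise-free, correctly matched pairs the estimate reproduces the true rotation and translation up to floating-point error; with real data the quality of the result depends on the quality of the matches.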
Despite significant progress in point cloud alignment techniques, improving alignment accuracy and processing speed remains a major challenge in this field. In complex environments in particular, ensuring the robustness and accuracy of the alignment is a central concern of current research.

Optimization-based Methods
Optimization-based point cloud alignment methods have an important place in the field of computer vision. One of the best-known methods is the Iterative Closest Point (ICP) algorithm proposed by Besl and McKay [1] in 1992. The algorithm alternates between two steps: it searches for the nearest point pairs between the two point clouds, and it computes the rigid-body transformation that minimizes the distance between those pairs. Although ICP performs well in simple cases, it depends heavily on the initial estimate and is prone to converging to local optima. To overcome these limitations, researchers have developed several variants of ICP.
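The two alternating steps of ICP can be sketched in a few lines. This is a minimal illustration and not the implementation of [1]: it assumes a brute-force nearest-neighbour search (practical code would use a k-d tree), point-to-point distances, and an SVD-based rigid fit. The names `rigid_fit` and `icp` are hypothetical, and the test data are a small synthetic grid with a small misalignment so that the initial nearest neighbours are already correct.

```python
import numpy as np

def rigid_fit(src, dst):
    """Closed-form least-squares rigid transform for paired points (SVD-based)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def icp(src, dst, iterations=30, tol=1e-12):
    """Point-to-point ICP: returns (R, t) such that R @ src[i] + t lies on dst."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    prev_err = np.inf
    for _ in range(iterations):
        # Step 1: nearest neighbour in dst for every current source point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nn = d2.argmin(axis=1)
        err = d2[np.arange(len(cur)), nn].mean()
        # Step 2: rigid transform that best aligns the current pairs.
        R, t = rigid_fit(cur, dst[nn])
        cur = cur @ R.T + t
        # Accumulate the increment into the overall transform.
        R_total, t_total = R @ R_total, R @ t_total + t
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total

# Synthetic example (assumed data): a 3x3x3 grid, rotated 10 degrees and shifted.
axis = np.array([-1.0, 0.0, 1.0])
src = np.array([[x, y, z] for x in axis for y in axis for z in axis])
a = np.deg2rad(10.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.05, -0.03, 0.08])
dst = src @ R_true.T + t_true
R_est, t_est = icp(src, dst)
```

The example succeeds because the initial misalignment is small relative to the point spacing. With a larger rotation the nearest-neighbour pairs would be wrong at the start, which is the dependence on initialization that the text describes.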
To improve processing speed and efficiency on large-scale point cloud data, Rusinkiewicz and Levoy [20] proposed multiresolution ICP in 2001, which aligns the point clouds at successively finer resolutions and refines the transformation from coarse to fine, reducing computation time and improving alignment accuracy. To improve robustness, Chetverikov et al. [21] proposed Trimmed ICP in 2002, which discards a fixed percentage of point pairs (usually those with the largest distances) in each iteration and thereby improves robustness to noise and outliers. In the same year, Granger and Pennec [22] proposed Probabilistic ICP, which introduces a probabilistic model of point-pair matching to account for uncertainty in the alignment process and improves performance in the presence of noise and measurement errors.
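The trimming idea can be added to an ICP loop with one extra step: sort the pairs by residual and fit the transform only to the closest fraction. The sketch below is an illustration under stated assumptions and not the algorithm of [21] in full: the fixed `keep_ratio`, the brute-force neighbour search, and the names `rigid_fit` and `trimmed_icp` are all hypothetical choices, and the data are synthetic.

```python
import numpy as np

def rigid_fit(src, dst):
    """Closed-form least-squares rigid transform for paired points (SVD-based)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def trimmed_icp(src, dst, keep_ratio=0.8, iterations=10):
    """ICP variant that fits each update only to the best-matching pairs."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    n_keep = max(3, int(keep_ratio * len(src)))
    for _ in range(iterations):
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nn = d2.argmin(axis=1)
        residuals = d2[np.arange(len(cur)), nn]
        # Trimming: keep the n_keep pairs with the smallest residuals.
        keep = np.argsort(residuals)[:n_keep]
        R, t = rigid_fit(cur[keep], dst[nn[keep]])
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total

# Synthetic example (assumed data): a grid plus three far-away outliers
# in the source cloud that have no counterpart in the target cloud.
axis = np.array([-1.0, 0.0, 1.0])
grid = np.array([[x, y, z] for x in axis for y in axis for z in axis])
outliers = np.array([[5.0, 5.0, 5.0], [-6.0, 4.0, 0.0], [0.0, -7.0, 3.0]])
src = np.vstack([grid, outliers])
a = np.deg2rad(10.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.05, -0.03, 0.08])
dst = grid @ R_true.T + t_true
R_est, t_est = trimmed_icp(src, dst)
```

Because the outlier pairs always have the largest residuals in this example, they never enter the fit, and the transform is estimated from inliers only. Choosing the kept fraction is the main design decision; a ratio that is too high lets outliers back in, and one that is too low discards useful pairs.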
In 2009, Segal et al. [23] proposed Generalized-ICP (GICP). This method places the traditional point-to-point metric and the point-to-plane metric in a single probabilistic framework: it models the local planar structure around each point in both point clouds and minimizes the resulting plane-to-plane distance, which improves alignment accuracy. In 2010, Myronenko and Song [24] proposed Coherent Point Drift (CPD), a probabilistic alignment framework that also handles non-rigidly transformed point cloud data. CPD treats one point cloud as the centroids of a Gaussian mixture model (GMM) and the other as data generated by that model, and fits the alignment with the Expectation-Maximization (EM) algorithm.
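To make the GMM view of CPD concrete, the following sketch computes the E-step: the posterior probability that each data point was generated by each Gaussian centroid, with a uniform component absorbing noise and outliers. It assumes isotropic Gaussians with a shared variance `sigma2` and an outlier weight `w`, as in the usual CPD formulation. The function name `cpd_e_step` and the toy points are illustrative assumptions, and the M-step that updates the transformation is omitted.

```python
import numpy as np

def cpd_e_step(X, Y, sigma2, w=0.1):
    """Posterior P[m, n] that data point X[n] belongs to GMM centroid Y[m].

    X: (N, D) data points; Y: (M, D) centroids (already transformed);
    sigma2: shared isotropic variance; w: weight of the uniform outlier term.
    """
    N, D = X.shape
    M = Y.shape[0]
    d2 = ((Y[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)   # (M, N)
    numerator = np.exp(-d2 / (2.0 * sigma2))
    # Constant contributed by the uniform outlier component.
    c = (2.0 * np.pi * sigma2) ** (D / 2.0) * w / (1.0 - w) * M / N
    return numerator / (numerator.sum(axis=0, keepdims=True) + c)

# Toy example (assumed data): X is Y shifted by a small offset.
Y = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
X = Y + 0.05
P = cpd_e_step(X, Y, sigma2=0.05, w=0.1)
P_no_outliers = cpd_e_step(X, Y, sigma2=0.05, w=0.0)
```

Each column of `P` is a soft assignment of one data point to the centroids. With `w > 0` the column sums fall below one, and the missing mass is the probability that the point is an outlier. In the full EM loop the variance is re-estimated at every iteration, so the assignments typically become sharper as the alignment improves.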
In 2011, Bergstrom and Edlund [25] proposed Robust ICP. This method further enhances the robustness of the alignment process by introducing robust statistical methods to identify and reject outliers. In 2016, Zhou et al. [26] proposed Fast Global Registration (FGR). This method uses an efficient optimization framework to converge quickly to the globally optimal solution, reduces the computational complexity by constructing a sparse linear system and using an edge-based four-parameter alignment technique, and is suitable for processing large-scale point cloud data.
These optimization-based methods have improved in theory and have shown increasing reliability and efficiency in practical applications. With growing computational power and further algorithmic development, they are expected to play an even greater role in areas such as autonomous driving, robot navigation, and cultural heritage digitization.

Feature Learning Methods
Several feature-based methods have attracted much attention in academia and industry for their performance and wide range of applications. First, Rusu et al. [27] proposed FPFH (Fast Point Feature Histograms) in 2009. This method computes histograms of local geometric features around each point, which greatly improves the computational efficiency of feature description and makes it especially suitable for fast matching and alignment of large-scale point cloud data.
Subsequently, Tombari et al. [28] proposed SHOT (Signature of Histograms of Orientations) in 2010, a method that uses the local geometric information of points to generate rotationally invariant feature descriptors, which effectively improves the matching accuracy on complex point cloud data.
Earlier studies include SPIN (Spin Images), proposed by Johnson and Hebert [29] in 1999. This method captures the geometric features of the point cloud surface by generating rotationally invariant local feature images, and is widely used in object recognition and alignment tasks.
In 2004, Frome et al. [30] proposed 3D Shape Context, which captures the local geometric structure of the point cloud by defining a spherical region around each point and computing a histogram of local geometric features, thus improving the accuracy and robustness of point cloud matching.
Following this, in 2009, Zhong [31] proposed the ISS (Intrinsic Shape Signatures) method. By extracting key points in the point cloud and generating descriptors, this method demonstrates strong robustness to noise and is suitable for matching both sparse and dense point clouds.
Finally, RoPS (Rotational Projection Statistics), proposed by Guo et al. [32] in 2013, generates descriptors by projecting the local neighborhood of a point onto multiple rotated planes and computing statistical features. It provides rotational invariance and is suitable for aligning point clouds in different orientations.
Each of the above methods is unique in capturing and describing the geometric features of the point cloud, which has greatly promoted the development and application of point cloud alignment techniques and demonstrated their excellent performance in various types of complex environments.

End-to-end Learning Approach
The end-to-end method outputs the transformation matrix directly from the point cloud data through a neural network, eliminating the need for manual feature extraction and explicit correspondence matching. This simplifies the alignment process and improves its efficiency. End-to-end models are also easier to train than separate keypoint detectors and keypoint descriptors, because they can be evaluated directly by the quality of the resulting alignment, whereas evaluating keypoint detectors and descriptors on their own is relatively difficult.
First, Qi et al. [33] proposed PointNet in 2017, a pioneering work that learns global features directly from the raw point cloud via a deep neural network, while using symmetric functions (e.g., max pooling) to ensure that the model is insensitive to the order of the input points, as shown in Figure 2. In the same year, Qi et al. [34] introduced PointNet++, which adds a local feature learning mechanism to the original PointNet and captures both local and global features of the point cloud through multi-scale grouping (MSG) and hierarchical feature learning, thus significantly improving the accuracy and robustness of the alignment. In addition, Wang and Solomon [35] proposed DCP (Deep Closest Point) in 2019, which achieves efficient alignment between point clouds through a pairwise learning strategy and a differentiable SVD (Singular Value Decomposition) layer. As shown in Figure 3, DCP employs a Transformer structure to capture local and global point cloud relationships, which greatly improves alignment efficiency and accuracy. In the same year, Wu et al. [36] proposed DeepVCP (Deep Virtual Corresponding Points), which generates virtual corresponding points from learned matching probabilities among candidate points and uses these predicted correspondences to guide the alignment. DeepVCP demonstrates superior performance in the presence of large amounts of noise and occlusion.
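The order invariance that PointNet-style networks rely on comes from a simple construction: apply the same function to every point, then reduce over points with a symmetric operation such as max pooling. The sketch below illustrates only this idea. It assumes a single shared linear layer with ReLU and random, untrained weights. The name `global_feature` is hypothetical, and the real architectures are considerably deeper.

```python
import numpy as np

def global_feature(points, W, b):
    """Shared per-point layer followed by max pooling over the point dimension."""
    per_point = np.maximum(points @ W + b, 0.0)   # same weights for every point
    return per_point.max(axis=0)                  # symmetric reduction

rng = np.random.default_rng(0)
points = rng.normal(size=(128, 3))       # assumed toy point cloud
W = rng.normal(size=(3, 16))             # untrained weights, for illustration only
b = rng.normal(size=16)

feature = global_feature(points, W, b)
shuffled = points[rng.permutation(len(points))]
feature_shuffled = global_feature(shuffled, W, b)
```

Reordering the input rows only reorders the per-point features, and the maximum over points does not depend on their order, so both calls yield the same 16-dimensional descriptor.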
FGR (Fast Global Registration), proposed by Zhou et al. [26] in 2016 and discussed above among the optimization-based methods, offers a non-learning point of comparison: it uses an efficient optimization framework to converge quickly to a globally optimal solution. By constructing a sparse linear system and using an edge-based four-parameter alignment technique, FGR reduces the computational complexity and is suitable for processing large-scale point cloud data.
In addition, Yew and Lee [38] proposed RPM-Net (Robust Point Matching Network) in 2020, which enhances robustness to noise and occlusion through an end-to-end learning framework that uses a weighted matching strategy and local feature aggregation, enabling it to perform well in various complex environments. Finally, Bai et al. [39] proposed PointDSC in 2021, which uses deep spatial consistency to prune outlier correspondences and thereby further improves alignment accuracy and efficiency, demonstrating its broad potential in practical applications.

Cross-source Point Cloud Alignment
Cross-source point cloud alignment is an important direction in the field of point cloud processing, which involves the precise alignment of point cloud data from different sensors or different viewpoints. This type of alignment is particularly challenging because point clouds from different sources may differ significantly in resolution, scale, noise level, and density. The following are some important cross-source point cloud alignment methods that show how these differences can be overcome to achieve effective data fusion.
Coherent Point Drift (CPD) is a probabilistic alignment method proposed by Myronenko and Song [40] in 2010, which is mainly used to deal with point cloud data of different resolutions and scales. CPD assumes that one point cloud is a realization of a Gaussian mixture model defined by the other, and uses the expectation-maximization algorithm to estimate the correspondence between the two. Its advantage is that it is robust and can effectively handle non-rigid transformations between point clouds, which makes it very useful in cross-source point cloud alignment.
Robust Point Matching (RPM), proposed by Gold et al. [41] in 1998, is an iterative correspondence-based alignment technique that improves robustness by introducing soft correspondences and a deterministic annealing process. RPM not only considers the Euclidean distances between points but also takes into account the overall geometric structure of the shapes, which makes it particularly suitable for data from different sensors. The method sharpens the correspondences by gradually decreasing a temperature parameter, which helps it avoid local minima during optimization.
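The interaction between soft correspondences and the temperature parameter in RPM can be sketched as follows. This is a simplified illustration and not the full method of [41]: it assumes equal-sized point sets, omits the slack rows and columns that RPM uses for outliers, and omits the transformation update between annealing steps. The name `soft_correspondence` and the toy points are hypothetical.

```python
import numpy as np

def soft_correspondence(src, dst, temperature, sinkhorn_iters=50):
    """Soft match matrix M[i, j] between src[i] and dst[j] at a given temperature."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
    M = np.exp(-d2 / temperature)
    # Alternating row/column normalization pushes M toward a doubly
    # stochastic matrix (each point is matched with total weight one).
    for _ in range(sinkhorn_iters):
        M /= M.sum(axis=1, keepdims=True)
        M /= M.sum(axis=0, keepdims=True)
    return M

# Toy example (assumed data): dst is src shifted by a small offset.
src = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
dst = src + 0.05

M_hot = soft_correspondence(src, dst, temperature=1.0)     # fuzzy matches
M_cold = soft_correspondence(src, dst, temperature=0.01)   # near one-to-one
```

At high temperature every source point spreads its weight over several targets, which keeps the objective smooth. As the temperature is lowered, the matrix approaches a one-to-one assignment. Annealing from hot to cold while re-estimating the transformation is what lets the method commit to correspondences gradually.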
Multi-modal Surface Matching (MSM) was proposed by Robinson et al. [42] in 2014 to address the problem of aligning data from different modalities. MSM constructs high-dimensional feature vectors that describe the local surface shape of each point cloud and uses these features to perform the alignment. The key to this approach is its ability to deal with the inherent differences between data captured by different sensors, such as point clouds generated by optical scanners and LiDAR, to achieve a more accurate and stable alignment.
Sparse Canonical Correlation Analysis (SCCA) [43] is a statistical method for analyzing and understanding the relationship between two high-dimensional data sets. In point cloud alignment, SCCA can be used to discover correlated features in point cloud datasets that may have been captured by different sensors under different conditions. By maximizing the correlation between these features, SCCA helps align and fuse point cloud data from different sources.
The Generalized-ICP algorithm itself was proposed by Segal et al. [44] in 2009, and research on incorporating scale estimation into the ICP framework began to increase around 2012 to accommodate the need to align point cloud data at different scales. This approach aligns the point clouds by simultaneously optimizing the rotation, translation, and scale parameters, which makes it better suited to point cloud data from sensors with different resolutions.
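When correspondences are given, rotation, translation, and scale can be estimated jointly in closed form. The sketch below uses Umeyama's least-squares similarity estimate as one possible inner step of a scale-aware ICP loop. It is a swapped-in illustration and not the specific method of any paper cited here, and the name `estimate_similarity` and the synthetic data are assumptions.

```python
import numpy as np

def estimate_similarity(src, dst):
    """Return (scale, R, t) minimizing sum ||scale * R @ src[i] + t - dst[i]||^2."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    var_s = (xs ** 2).sum() / len(src)        # variance of the source cloud
    cov = xd.T @ xs / len(src)                # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                        # avoid returning a reflection
    R = U @ S @ Vt
    scale = np.trace(np.diag(D) @ S) / var_s
    t = mu_d - scale * (R @ mu_s)
    return scale, R, t

# Synthetic example (assumed data): the target is a scaled, rotated, shifted copy.
rng = np.random.default_rng(1)
src = rng.uniform(-1.0, 1.0, size=(30, 3))
a = np.deg2rad(40.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
scale_true = 2.5
t_true = np.array([1.0, -2.0, 0.5])
dst = scale_true * (src @ R_true.T) + t_true
scale_est, R_est, t_est = estimate_similarity(src, dst)
```

In a scale-aware ICP loop this estimate would replace the rigid fit after each nearest-neighbour search. The nearest-neighbour step itself becomes less reliable when the initial scale difference is large, so a rough scale normalization beforehand is a common precaution.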
These cross-source point cloud alignment methods use a variety of techniques to overcome the discrepancies between data acquired from different devices or under different environmental conditions, and thereby support 3D modeling, robot navigation, and other multi-sensor integration applications. As technology continues to advance, more innovative methods are expected to further improve the performance of cross-source point cloud alignment.

Datasets and benchmarking
The research and development of point cloud alignment techniques relies on high-quality datasets, which not only provide a rich resource for algorithm development but also serve as benchmarks for evaluating the performance of different algorithms. The following are some of the key datasets that are widely used in the field of point cloud alignment, each with its own characteristics and purposes.

ShapeNet
ShapeNet is a widely used database of 3D objects containing tens of thousands of models collected from multiple categories (e.g., furniture, vehicles, and tools). These models are provided in a standardized format suitable for shape analysis and point cloud alignment tasks. The 3D models provided by ShapeNet are often used to train and test deep learning models for recognizing and aligning 3D shapes.

3DMatch
The 3DMatch dataset is specifically designed for 3D point cloud matching and alignment tasks. It contains a large number of 3D point cloud fragments collected from real-world indoor scenes using high-precision scanners. By providing a series of paired point clouds and the ground-truth transformations between them, 3DMatch is used to evaluate the effectiveness and robustness of point cloud alignment algorithms on real-world data.
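Given a ground-truth transformation, an estimated alignment is commonly scored by its rotation error and translation error. The sketch below shows one widely used definition of each. The function names and the example values are illustrative assumptions and are not the official evaluation protocol of any particular benchmark.

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Angle (degrees) of the residual rotation between estimate and ground truth."""
    cos_angle = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth translations."""
    return np.linalg.norm(t_est - t_gt)

# Example (assumed values): the estimate is off by 10 degrees about z
# and by 0.5 units along z.
a = np.deg2rad(10.0)
R_gt = np.eye(3)
R_est = np.array([[np.cos(a), -np.sin(a), 0.0],
                  [np.sin(a),  np.cos(a), 0.0],
                  [0.0,        0.0,       1.0]])
t_gt = np.array([1.0, 2.0, 3.0])
t_est = np.array([1.0, 2.0, 3.5])

rot_err = rotation_error_deg(R_est, R_gt)
trans_err = translation_error(t_est, t_gt)
```

Benchmarks often count a pair as successfully aligned when both errors fall below chosen thresholds. The thresholds differ between datasets, so results are only comparable when the same values are used.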

KITTI
The KITTI dataset is a benchmark dataset in autonomous driving research and contains a variety of sensor data collected from moving vehicles, including LiDAR scans from which point clouds are generated. The point cloud alignment component of KITTI focuses on evaluating vehicle environment perception, such as road and vehicle detection and 3D reconstruction. This makes KITTI an ideal platform for developing and evaluating point cloud alignment techniques for autonomous driving applications.

ModelNet
ModelNet is a large-scale 3D CAD model dataset containing carefully selected models from 40 object classes (e.g., chairs, tables, and airplanes), as shown in Figure 4. ModelNet is commonly used for benchmarking 3D object recognition and point cloud alignment, and is especially useful for investigating the generalization capabilities of algorithms.
These datasets and benchmarks are crucial for the development of point cloud alignment algorithms. They not only provide the basis for algorithm development and testing, but also drive progress and innovation through systematic evaluation and comparison. As more specialized datasets are developed, point cloud alignment techniques are expected to keep improving to meet the growing demands of applications.

Applications
(1) In the field of autonomous driving, point cloud alignment is one of the core technologies for precise vehicle positioning and environment modeling. By aligning point cloud data from on-board LiDAR sensors in real time, autonomous driving systems can construct detailed 3D maps of their surroundings and continuously track dynamic changes around the vehicle, such as the locations of pedestrians, other vehicles, and roadblocks. This information is critical for path planning, obstacle avoidance, and vehicle control, and helps ensure the operational safety and efficiency of self-driving vehicles.
(2) In cultural heritage preservation, point cloud alignment techniques are used to reconstruct accurate 3D models of monuments from scanned data acquired at different times and from different viewpoints. These 3D models are not only used for documentation and archiving, but also support the restoration and maintenance of monuments by allowing possible damage or structural problems to be assessed through virtual reconstruction.
These application examples clearly demonstrate the wide range of applications and the critical role of point cloud alignment technology in a variety of industries, and it is expected that the scope and impact of its application will further expand as related technologies advance.

Conclusion
After years of development, point cloud alignment has achieved substantial results. As the technology matures and becomes more deeply embedded in production and daily life, new applications pose greater challenges.
(1) Processing large-scale point cloud data. With the improving resolution of scanning equipment and the expansion of application areas, the amount of generated point cloud data is increasing dramatically. How to process these large-scale datasets efficiently, especially in resource-limited environments such as mobile devices or real-time systems, is a major challenge.
(2) Real-time processing requirements. For applications such as self-driving vehicles and robot navigation, point cloud data need to be aligned quickly and accurately in real time or near real time. This requires algorithms to have not only high accuracy but also very low latency.
To cope with these challenges and to improve the performance and practicality of point cloud alignment algorithms, future work is needed in the following directions.
(1) Adaptive learning methods. Future research can focus more on developing adaptive learning algorithms that automatically adjust their parameters or structures according to different application scenarios and data characteristics. This adaptivity can significantly improve the generalization and performance of the algorithms on unseen data.
(2) Integration of deep learning with traditional models. Although deep learning has shown great potential in point cloud alignment, combining it with traditional geometric processing methods (e.g., ICP and its variants) may make it possible to exploit the stability and low resource consumption of traditional methods while retaining the flexibility of deep learning.
(3) New sensing technologies. New sensing technologies and alignment methods should be explored to better capture details in complex environments while reducing equipment costs and the computational requirements of data processing.
As a core technology in the field of 3D vision, point cloud alignment will continue to expand its applications in multiple fields as computing power improves and machine learning methods advance. The incorporation of deep learning in particular has brought major improvements to point cloud alignment, and future research is expected to address the limitations of existing techniques and promote the further development of this technology.

Fig 1. Categories of point cloud alignment

Fig 4. Sample diagram of the dataset