


Point cloud registration is inescapable for 3D vision! Understand all mainstream solutions and challenges in one article
Apr 02, 2024, 11:31 AM

Point clouds, as collections of points, promise to transform how three-dimensional (3D) surface information of objects is acquired and generated in 3D reconstruction, industrial inspection, and robotic manipulation. The most challenging yet essential step is point cloud registration: obtaining a spatial transformation that aligns and matches two point clouds acquired in two different coordinate frames. This survey presents an overview and the fundamentals of point cloud registration, systematically classifies and compares the various methods, and addresses the open technical problems in the field, aiming to guide academic researchers and engineers from outside the area and to foster discussion toward a unified vision of point cloud registration.
點(diǎn)雲(yún)獲取的一般方式
分為主動(dòng)和被動(dòng)方式,由感測(cè)器主動(dòng)獲取的點(diǎn)雲(yún)為主動(dòng)方式,後期通過(guò)重建的方式為被動(dòng)。
從SFM到MVS的密集重建。 (a)SFM。 (b)SfM產(chǎn)生的點(diǎn)雲(yún)範(fàn)例。 (c)PMVS演算法流程圖,一種基於patch的多視角立體演算法。 (d)PMVS產(chǎn)生的密集點(diǎn)雲(yún)範(fàn)例。
結(jié)構(gòu)光重建方法:
#剛性配準(zhǔn)與非剛性配準(zhǔn)
在一個(gè)環(huán)境中,變換可以分解為旋轉(zhuǎn)和平移,在適當(dāng)?shù)膭傂宰儞Q後,一個(gè)點(diǎn)雲(yún)被映射到另一點(diǎn)雲(yún),同時(shí)保持相同的形狀和大小。
在非剛性配準(zhǔn)中,建立非剛性變換以將掃描資料wrap到目標(biāo)點(diǎn)雲(yún)。非剛性變換包含反射、旋轉(zhuǎn)、縮放和平移,而不是剛性配準(zhǔn)僅包含平移和旋轉(zhuǎn)。非剛性配準(zhǔn)的使用主要有兩個(gè)原因:(1) 資料收集的非線性和校準(zhǔn)誤差會(huì)導(dǎo)致剛性物體掃描的低頻扭曲;(2) 對(duì)隨著時(shí)間改變其形狀和移動(dòng)場(chǎng)景或目標(biāo)執(zhí)行配準(zhǔn)。
剛性配準(zhǔn)的範(fàn)例:(a)兩個(gè)點(diǎn)雲(yún):讀取點(diǎn)雲(yún)(綠色)和參考點(diǎn)雲(yún)(紅色);不使用(b)和使用(c)剛性配準(zhǔn)演算法的情況下,點(diǎn)雲(yún)融合到公共座標(biāo)系中。
然而,點(diǎn)雲(yún)配準(zhǔn)的效能被Variant Overlap、雜訊和異常值、高運(yùn)算成本、配準(zhǔn)成功的各種指標(biāo)受限。
配準(zhǔn)的方法有哪些?
在過(guò)去的幾十年裡,人們提出了越來(lái)越多的點(diǎn)雲(yún)配準(zhǔn)方法,從經(jīng)典的ICP演算法到與深度學(xué)習(xí)技術(shù)相結(jié)合的解決方案。
1)ICP方案
ICP演算法是一種迭代演算法,可在理想條件下確保配準(zhǔn)的準(zhǔn)確性、收斂速度和穩(wěn)定性。從某種意義上說(shuō),ICP可以被視為期望最大化(EM)問(wèn)題,因此它基於對(duì)應(yīng)關(guān)係計(jì)算和更新新的變換,然後應(yīng)用於讀取數(shù)據(jù),直到誤差度量收斂。然而,這並不能保證ICP達(dá)到全域最優(yōu),ICP演算法可以大致分為四個(gè)步驟:如下圖所示,點(diǎn)選擇、點(diǎn)匹配、點(diǎn)拒絕和誤差度量最小化。
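The loop can be sketched in a few dozen lines. Below is a bare-bones, brute-force numpy illustration of point-to-point ICP (no sampling or rejection stage, fixed iteration count, closed-form SVD/Kabsch update), assuming the two clouds overlap fully and start close together:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t aligning matched pairs src[i] -> dst[i] (SVD/Kabsch)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(read_pts, ref_pts, iters=30):
    """Point-to-point ICP: match nearest points, estimate, apply, repeat."""
    cur = read_pts.copy()
    for _ in range(iters):
        # point matching: nearest reference point for every read point
        d = np.linalg.norm(cur[:, None, :] - ref_pts[None, :, :], axis=-1)
        matched = ref_pts[d.argmin(axis=1)]
        # minimize the error metric for the current correspondences
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
    # recover the accumulated transform from original to final positions
    return best_rigid_transform(read_pts, cur)

# Demo: recover a known small rigid motion between identical clouds.
rng = np.random.default_rng(0)
ref = rng.normal(size=(100, 3))
theta = 0.05
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.05, -0.02, 0.03])
read = (ref - t_true) @ R_true   # so that read @ R_true.T + t_true == ref

R_est, t_est = icp(read, ref)
assert np.allclose(read @ R_est.T + t_est, ref, atol=1e-5)
```

The demo works precisely because the initial misalignment is small; start it far from the optimum and the nearest-neighbor matches go wrong, which is the local-minimum behavior the text warns about.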
2) Feature-based methods
As seen in ICP-based algorithms, establishing correspondences before estimating the transformation is crucial. If we obtain proper correspondences that describe the true relationship between the two point clouds, a good final result is guaranteed. We can therefore attach landmarks to the scanned target, or manually pick equivalent point pairs in post-processing, to compute the transformation from the points of interest (the picked points); this transformation can finally be applied to the read point cloud. As shown in Figure 12(c), the point clouds are loaded in the same coordinate frame and drawn in different colors. Figures 12(a) and 12(b) show two point clouds captured from different viewpoints, with point pairs selected from the reference and read data respectively; the registration result is shown in Figure 12(d). However, such methods are unfriendly to measurement subjects that cannot carry landmarks, nor can they serve applications that require automatic registration. Meanwhile, to shrink the correspondence search space and avoid the initial-transformation assumption of ICP-based algorithms, feature-based registration was introduced, in which keypoints designed by researchers are extracted. Keypoint detection and correspondence establishment are typically the main steps of this approach.
Commonly used feature descriptors include PFH, SHOT, and others. It is equally important to design an algorithm that removes outliers and efficiently estimates the transformation from the inliers.
3)基於學(xué)習(xí)的方法
在使用點(diǎn)雲(yún)作為輸入的應(yīng)用程式中,估計(jì)特徵描述符的傳統(tǒng)策略在很大程度上依賴點(diǎn)雲(yún)中目標(biāo)的獨(dú)特幾何特性。然而,現(xiàn)實(shí)世界的數(shù)據(jù)往往因目標(biāo)而異,可能包含平面、異常值和雜訊。此外,去除的失配通常包含有用的信息,可以用於學(xué)習(xí)?;秾W(xué)習(xí)的技術(shù)可以適用於對(duì)語(yǔ)義資訊進(jìn)行編碼,並且可以在特定任務(wù)中推廣。大多數(shù)與機(jī)器學(xué)習(xí)技術(shù)整合的配準(zhǔn)策略比經(jīng)典方法更快、更穩(wěn)健,並靈活地?cái)U(kuò)展到其他任務(wù),如物體姿勢(shì)估計(jì)和物體分類。同樣,基於學(xué)習(xí)的點(diǎn)雲(yún)配準(zhǔn)的一個(gè)關(guān)鍵挑戰(zhàn)是如何提取對(duì)點(diǎn)雲(yún)的空間變化不變、對(duì)噪音和異常值更具穩(wěn)健性的特徵。
以學(xué)習(xí)為基礎(chǔ)的方法代表作為:PointNet 、PointNet 、PCRNet ?、Deep Global Registration 、Deep Closest Point、Partial Registration Network 、Robust Point Matching 、PointNetLK 、3DRegNet。
4)具有機(jī)率密度函數(shù)的方法
基於機(jī)率密度函數(shù)(PDF)的點(diǎn)雲(yún)配準(zhǔn),使得使用統(tǒng)計(jì)模型進(jìn)行配準(zhǔn)是一個(gè)研究得很好的問(wèn)題,該方法的關(guān)鍵思想是用特定的機(jī)率密度函數(shù)表示數(shù)據(jù),如高斯混合模型(GMM)和常態(tài)分佈(ND)。配準(zhǔn)任務(wù)被重新表述為對(duì)齊兩個(gè)相應(yīng)分佈的問(wèn)題,然後是測(cè)量和最小化它們之間的統(tǒng)計(jì)差異的目標(biāo)函數(shù)。同時(shí),由於PDF的表示,點(diǎn)雲(yún)可以被視為一個(gè)分佈,而不是許多單獨(dú)的點(diǎn),因此它避免了對(duì)應(yīng)關(guān)係的估計(jì),並且具有良好的抗噪聲性能,但通常比基於ICP的方法慢。
5) Other methods
Fast global registration: Fast Global Registration (FGR) provides an initialization-free, fast strategy for point cloud registration. Specifically, FGR operates on candidate matches covering the surfaces and performs neither correspondence updates nor closest-point queries. What makes the method special is that joint registration is produced directly by a single optimization of a robust objective defined densely over the surfaces. By contrast, existing approaches typically generate candidate or multiple correspondences between the two point clouds and then compute and update a global result. In FGR, correspondences are established once at the start of the optimization and are not re-estimated in subsequent steps, so expensive nearest-neighbor queries are avoided and the computational cost stays low. As a result, the per-correspondence linear processing in each iteration and the linear system for pose estimation are efficient. FGR has been evaluated on several datasets, such as the UWA benchmark and the Stanford Bunny, against point-to-point and point-to-plane ICP as well as ICP variants such as Go-ICP; experiments show that FGR performs excellently in the presence of noise.
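The robust objective at the heart of FGR is the scaled Geman-McClure penalty, whose equivalent line-process weights gate out gross mismatches in closed form; FGR anneals the scale parameter downward ("graduated non-convexity") during optimization. A small numpy sketch (constants chosen for illustration only):

```python
import numpy as np

def geman_mcclure(r, mu):
    """Scaled Geman-McClure penalty: behaves like r^2 for small residuals
    but saturates near mu for large ones, so a gross mismatch contributes
    only a bounded cost instead of dominating the objective."""
    return mu * r**2 / (mu + r**2)

def line_process_weight(r, mu):
    """Closed-form weight of the equivalent line process: ~1 for inlier
    residuals, ~0 for outliers. Annealing mu downward sharpens this gating."""
    return (mu / (mu + r**2)) ** 2

residuals = np.array([0.01, 0.05, 2.0])   # two plausible matches, one outlier
w = line_process_weight(residuals, mu=0.25)
assert w[0] > 0.9 and w[2] < 0.05         # the outlier is effectively ignored

# Saturation: even a huge residual costs less than mu.
assert geman_mcclure(100.0, 0.25) < 0.25
```

Because the weights have this closed form, each iteration reduces to a weighted least-squares (linear) pose update, which is where the method's speed comes from.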
Four-point congruent sets: the 4-point congruent sets (4PCS) algorithm provides an initial transformation for the read data without requiring any assumption about its starting position. Typically, a rigid registration transformation between two point clouds can be uniquely defined by a pair of triplets, one from the reference data and the other from the read data. This method, however, searches a small candidate set for special 4-point bases, i.e., four coplanar congruent points in each point cloud, as shown in Figure 27, and solves for the optimal rigid transformation under the largest common pointset (LCP) criterion. The algorithm performs well even when the overlap between the paired point clouds is low and outliers are present. To adapt it to different applications, many researchers have introduced important follow-up work on the classic 4PCS solution.
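The 4-point bases work because the ratios at which a coplanar quadruple's two segments intersect are affine (hence rigid) invariants, so congruent bases in the other cloud can be found by ratio lookup instead of exhaustive search. A hedged numpy sketch of the invariant:

```python
import numpy as np

def intersection_ratios(a, b, c, d):
    """For a coplanar base {a, b, c, d} whose segments ab and cd intersect
    at a point e, 4PCS uses the ratios r1 = |a-e|/|a-b|, r2 = |c-e|/|c-d|,
    which any rigid transform leaves unchanged."""
    # Solve a + r1*(b - a) = c + r2*(d - c) in the least-squares sense.
    A = np.stack([b - a, -(d - c)], axis=1)
    r1, r2 = np.linalg.lstsq(A, c - a, rcond=None)[0]
    return r1, r2

# A coplanar quadruple whose segments cross at (1, 1, 0).
base = np.array([[0.0, 0.0, 0.0],
                 [2.0, 2.0, 0.0],
                 [0.0, 2.0, 0.0],
                 [2.0, 0.0, 0.0]])

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([3.0, -1.0, 0.5])
moved = base @ R.T + t

# The base's intersection ratios survive an arbitrary rigid motion.
assert np.allclose(intersection_ratios(*base), intersection_ratios(*moved))
```

Matching these ratios (together with the four inter-point distances) is what lets 4PCS prune the candidate set to a small number of congruent bases per trial.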
The above is the detailed content of Point cloud registration is inescapable for 3D vision! Understand all mainstream solutions and challenges in one article. For more information, please follow other related articles on the PHP Chinese website!

