Category-level object pose estimation based on point clouds

[Abstract] For the category-level object pose estimation problem, this paper proposes a method that takes only the point cloud scanned by a depth camera as input and, knowing only the category of the target object, accurately estimates its three-dimensional pose. The method does not rely on large manually annotated datasets: it is trained solely on data produced by virtual simulation, yet achieves high accuracy on real data. The method first filters background noise from the input point cloud, then normalizes the point cloud with a center prediction module, then predicts each point's coordinates in the standard (canonical) coordinate system by deforming the template point cloud of the corresponding category, and finally recovers the three-dimensional pose of the target object by least squares. Experimental results show that the method generalizes well and achieves high accuracy on real data.

[Keywords]  Pose estimation; Point cloud processing; Deep learning; Pattern recognition

0 Introduction

To grasp a target, a robotic arm must first understand the scene and identify the object of interest, then solve for that object's pose relative to the robot's camera coordinate system, and finally select a suitable grasp point on the object's surface and send it to the robot's decision-making system for execution. The arm perceives its surroundings through visual sensors (such as RGB cameras or depth cameras, together with the corresponding perception algorithms). Compared with traditional RGB cameras, depth cameras can sense the depth of objects in the scene, which helps robots perceive the real world more accurately. With the introduction of devices such as Kinect and RealSense, depth cameras that acquire depth information have developed rapidly and become widespread, and they now serve widely as the "eyes" through which robotic arms perceive the real world. How to perceive and understand the work scene through a depth camera, detect or segment the target object, accurately estimate its pose, and select an appropriate grasp point have become core problems in robotic-arm research.
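To make the sensing step concrete, the sketch below back-projects a depth image into a camera-frame point cloud using a pinhole camera model. This is a minimal illustration, not code from the paper; the function name and the intrinsics values (fx, fy, cx, cy) are illustrative assumptions.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) into an Nx3 point cloud
    in the camera coordinate frame, using pinhole intrinsics."""
    h, w = depth.shape
    # Pixel grids: u varies along columns (x), v along rows (y)
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop invalid (zero-depth) pixels

# Usage with made-up intrinsics for a 640x480 sensor
depth = np.random.uniform(0.5, 2.0, size=(480, 640))
cloud = depth_to_point_cloud(depth, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```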

In recent years, object detection and segmentation based on deep learning have achieved remarkable results; the related techniques are relatively mature and have been successfully deployed in many industrial applications. Object pose estimation generally refers to estimation with six degrees of freedom, that is, the rigid transformation of the target object relative to the current camera coordinate system: (R, t) ∈ SE(3), where R ∈ SO(3) is the rotation and t ∈ ℝ³ is the translation.
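The final step the abstract describes, recovering this (R, t) by least squares, has a standard closed-form solution once each observed point has a predicted canonical coordinate. The sketch below shows that solution (the Kabsch/Umeyama SVD alignment) as one plausible realization of such a least-squares step; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def solve_pose_least_squares(canonical, observed):
    """Recover the rigid transform (R, t) in SE(3) mapping predicted
    canonical coordinates onto observed camera-frame points, i.e.
    minimizing sum_i ||R @ c_i + t - o_i||^2 (Kabsch/SVD solution)."""
    mu_c = canonical.mean(axis=0)
    mu_o = observed.mean(axis=0)
    H = (canonical - mu_c).T @ (observed - mu_o)  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_o - R @ mu_c
    return R, t

# Sanity check: recover a known pose from synthetic correspondences
rng = np.random.default_rng(0)
canonical = rng.normal(size=(100, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.1, -0.2, 0.5])
observed = canonical @ R_true.T + t_true
R_est, t_est = solve_pose_least_squares(canonical, observed)
```

The SVD solution is exact and needs no iteration, which is why a per-point canonical-coordinate prediction (as in the deformation-based module described in the abstract) can be turned into a pose with a single linear-algebra step.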
