To complete my project, I reproduced the object classification, part segmentation, and scene segmentation of PointNet++ to look for inspiration and ideas, and recorded the pitfalls I ran into along the way.
download code
https://github.com/yanx27/Pointnet_Pointnet2_pytorch
My environment is PyTorch 1.7 + CUDA 11.0.
train
The PointNet++ code enables 3D object classification, object part segmentation, and semantic scene segmentation.
object classification
Download the ModelNet40 dataset and store it in the folder data/modelnet40_normal_resampled/.
## e.g., pointnet2_ssg without normal features
python train_classification.py --model pointnet2_cls_ssg --log_dir pointnet2_cls_ssg
python test_classification.py --log_dir pointnet2_cls_ssg
## e.g., pointnet2_ssg with normal features
python train_classification.py --model pointnet2_cls_ssg --use_normals --log_dir pointnet2_cls_ssg_normal
python test_classification.py --use_normals --log_dir pointnet2_cls_ssg_normal
## e.g., pointnet2_ssg with uniform sampling
python train_classification.py --model pointnet2_cls_ssg --use_uniform_sample --log_dir pointnet2_cls_ssg_fps
python test_classification.py --use_uniform_sample --log_dir pointnet2_cls_ssg_fps
- When running the following command from the repository root,
python train_classification.py --model pointnet2_cls_ssg --log_dir pointnet2_cls_ssg
an error may be reported:
ImportError: cannot import name 'PointNetSetAbstraction'
The reason is that the import in pointnet2_cls_ssg.py is written as if the working directory were the models folder, but the actual working directory is the parent directory of models. In pointnet2_cls_ssg.py, change
from pointnet2_utils import PointNetSetAbstraction
to
from models.pointnet2_utils import PointNetSetAbstraction
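An alternative to editing the import can be sketched as follows, assuming the script is launched from the repository root and the model files live in a models subfolder: put that folder on sys.path before importing the model, so that the bare `from pointnet2_utils import ...` resolves. The helper name add_models_dir is hypothetical and not part of the repository.

```python
import os
import sys

def add_models_dir(repo_root, subdir="models"):
    """Prepend <repo_root>/<subdir> to sys.path (only once) and return the path."""
    models_dir = os.path.abspath(os.path.join(repo_root, subdir))
    if models_dir not in sys.path:
        sys.path.insert(0, models_dir)
    return models_dir

# Example: call this near the top of the training script, before importing the model.
models_dir = add_models_dir(os.getcwd())
```

Either approach should work; editing the import is the smaller change, while the sys.path approach leaves the model files untouched.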
Refer to the README.md file. Classification is not my main focus, so I will skip it here.
part segmentation
Part segmentation separates an object into its parts, for example separating the legs of a chair from the rest of the chair.
Download the ShapeNet dataset and store it in the folder data/shapenetcore_partanno_segmentation_benchmark_v0_normal/.
Running is also simple:
## e.g., pointnet2_msg
python train_partseg.py --model pointnet2_part_seg_msg --normal --log_dir pointnet2_part_seg_msg
python test_partseg.py --normal --log_dir pointnet2_part_seg_msg
ShapeNet dataset txt file format: each line has seven values. The first three are the xyz position coordinates of the point, the next three are the normal of the point, and the last one is the part category the point belongs to; for example, 1 means the point belongs to part category 1 of the 50 part categories.
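As a minimal sketch of that layout, the snippet below splits rows into coordinates, normals, and part labels with numpy. The two sample rows are made up for illustration and are not taken from the dataset.

```python
import io
import numpy as np

# Two made-up rows in the layout described above: x y z nx ny nz label
sample = io.StringIO(
    "0.10 0.20 0.30 0.0 0.0 1.0 0.0\n"
    "0.40 0.50 0.60 0.0 1.0 0.0 3.0\n"
)
data = np.loadtxt(sample)          # shape (num_points, 7)
xyz = data[:, 0:3]                 # point coordinates
normals = data[:, 3:6]             # per-point normals
labels = data[:, 6].astype(int)    # part label of each point
print(xyz.shape, normals.shape, labels)
```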
Here is code that visualizes a txt file from the ShapeNet dataset with open3d (random colors):
import open3d as o3d
import numpy as np
'''
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
ROOT_DIR = os.path.dirname(BASE_DIR)
sys.path.append(BASE_DIR)
sys.path.append(os.path.join(ROOT_DIR, 'data_utils'))
'''
txt_path = '/home/lin/CV_AI_learning/Pointnet_Pointnet2_pytorch-master/data/shapenetcore_partanno_segmentation_benchmark_v0_normal/02691156/1b3c6b2fbcf834cf62b600da24e0965.txt'
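# Read the txt point cloud with numpy
pcd = np.genfromtxt(txt_path, delimiter=" ")
pcd_vector = o3d.geometry.PointCloud()
# Load the point coordinates
# The first three values of each row are the x, y, z coordinates, loaded through .points
# Normals and colors, if present, are loaded through .normals and .colors
pcd_vector.points = o3d.utility.Vector3dVector(pcd[:, :3])
pcd_vector.normals = o3d.utility.Vector3dVector(pcd[:, 3:6])
# Column 6 is the part label; give each label a random color in [0, 1]
labels = pcd[:, 6].astype(int)
palette = np.random.rand(labels.max() + 1, 3)
pcd_vector.colors = o3d.utility.Vector3dVector(palette[labels])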
o3d.visualization.draw_geometries([pcd_vector])
If the GPU memory is not enough, reduce batch_size.
I trained once, and then continued training from the resulting best_model.pth for 150 epochs. On a single RTX 3080, one epoch took six to seven minutes, and 150 epochs took more than half a day.
Most code found online only reports a few segmentation metrics and stops, without any visualization. I referred to this blog to visualize the results: Visualization of PointNet++ segmentation prediction results. That blog first uses the network to save the prediction results for the input as txt files, and then visualizes them with Matplotlib. The process is a bit complicated, and it is simpler to visualize with open3d. The code is as follows:
import tqdm
import matplotlib
import torch
import os
import warnings
import numpy as np
import open3d as o3d
from torch.utils.data import Dataset
import pybullet as p
from models.pointnet2_part_seg_msg import get_model as pointnet2
import time
warnings.filterwarnings('ignore')
matplotlib.use("Agg")
def pc_normalize(pc):
    # Center the cloud and scale it so the farthest point lies at distance 1
    centroid = np.mean(pc, axis=0)
    pc = pc - centroid
    m = np.max(np.sqrt(np.sum(pc ** 2, axis=1)))
    pc = pc / m
    return pc, centroid, m
def generate_pointcloud(color_image, depth_image, width=1280, height=720, fov=50, near=0.01, far=5):
    rgbd_image = o3d.geometry.RGBDImage.create_from_color_and_depth(color_image, depth_image, convert_rgb_to_intensity=False)
    intrinsic = o3d.camera.PinholeCameraIntrinsic(o3d.camera.PinholeCameraIntrinsicParameters.Kinect2DepthCameraDefault)
    aspect = width / height
    projection_matrix = p.computeProjectionMatrixFOV(fov, aspect, near, far)
    intrinsic.set_intrinsics(width=width, height=height, fx=projection_matrix[0]*width/2, fy=projection_matrix[5]*height/2, cx=width/2, cy=height/2)
    point_cloud = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd_image, intrinsic)
    point_cloud.estimate_normals(search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.01, max_nn=30))
    return point_cloud
class PartNormalDataset(Dataset):
    def __init__(self, point_cloud, npoints=2500, normal_channel=False):
        self.npoints = npoints  # number of points to sample
        self.cat = {}
        self.normal_channel = normal_channel  # whether to use normal information
        position_data = np.asarray(point_cloud.points)
        normal_data = np.asarray(point_cloud.normals)
        self.raw_pcd = np.hstack([position_data, normal_data]).astype(np.float32)
        self.cat = {'board': '12345678'}
        # The output is a tuple, e.g. ('Airplane', 123.txt)
        self.classes = {'board': 0}
        data = self.raw_pcd
        if not self.normal_channel:  # check whether normal information is used
            self.point_set = data[:, 0:3]
        else:
            self.point_set = data[:, 0:6]
        self.point_set[:, 0:3], self.centroid, self.m = pc_normalize(self.point_set[:, 0:3])  # normalize the coordinates
        # Randomly sample point indices; sampling with replacement is allowed
        choice = np.random.choice(self.point_set.shape[0], self.npoints, replace=True)
        # resample
        self.point_set = self.point_set[choice, :]  # sample by index

    def __getitem__(self, index):
        cat = list(self.cat.keys())[0]
        cls = self.classes[cat]  # convert the class name to an index
        cls = np.array([cls]).astype(np.int32)
        # point_set is the point cloud, cls is the object category index;
        # centroid and m are kept so the normalization can be undone later
        return self.point_set, cls, self.centroid, self.m

    def __len__(self):
        return 1
class Generate_txt_and_3d_img:
    def __init__(self, num_classes, testDataLoader, model, visualize=False):
        self.testDataLoader = testDataLoader
        self.num_classes = num_classes
        self.heat_map = False  # whether to output a heat map
        self.visualize = visualize  # whether to visualize with open3d
        self.model = model
        self.generate_predict()
        self.o3d_draw_3d_img()

    def __getitem__(self, index):
        return self.predict_pcd_colored

    def generate_predict(self):
        for _, (points, label, centroid, m) in tqdm.tqdm(enumerate(self.testDataLoader),
                                                         total=len(self.testDataLoader), smoothing=0.9):
            # points: point cloud data, label: object category, centroid and m: normalization parameters
            points = points.transpose(2, 1)
            # print('1', target.shape)  # 1 torch.Size([1, 2048])
            xyz_feature_point = points[:, :6, :]
            model = self.model
            seg_pred, _ = model(points, self.to_categorical(label, 1))
            seg_pred = seg_pred.cpu().data.numpy()
            if self.heat_map:
                out = np.asarray(np.sum(seg_pred, axis=2))
                # min-max normalization (the parentheses were misplaced in the original)
                seg_pred = (out - np.min(out)) / (np.max(out) - np.min(out))
            else:
                seg_pred = np.argmax(seg_pred, axis=-1)  # network prediction, from shape (b, n, c)
            seg_pred = np.concatenate([np.asarray(xyz_feature_point), seg_pred[:, None, :]],
                                      axis=1).transpose((0, 2, 1)).squeeze(0)
            self.predict_pcd = seg_pred
            self.centroid = centroid
            self.m = m

    def o3d_draw_3d_img(self):
        pcd = self.predict_pcd
        pcd_vector = o3d.geometry.PointCloud()
        # Load the point coordinates, undoing the normalization
        pcd_vector.points = o3d.utility.Vector3dVector(self.m * pcd[:, :3] + self.centroid)
        # colors = np.random.randint(255, size=(2,3))/255
        colors = np.array([[0.8, 0.8, 0.8], [1, 0, 0]])
        pcd_vector.colors = o3d.utility.Vector3dVector(colors[list(map(int, pcd[:, 6])), :])
        if self.visualize:
            coord_mesh = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.1, origin=[0, 0, 0])
            o3d.visualization.draw_geometries([pcd_vector, coord_mesh])
        self.predict_pcd_colored = pcd_vector

    def to_categorical(self, y, num_classes):
        """ 1-hot encodes a tensor """
        new_y = torch.eye(num_classes)[y.cpu().data.numpy(),]
        if (y.is_cuda):
            return new_y.cuda()
        return new_y
def load_models(model_dict={'PointNet': [pointnet2(num_classes=2,normal_channel=True).eval(),r'./log/part_seg/pointnet2_part_seg_msg/checkpoints']}):
    model = list(model_dict.values())[0][0]
    checkpoints_dir = list(model_dict.values())[0][1]
    weight_dict = torch.load(os.path.join(checkpoints_dir, 'best_model.pth'))
    model.load_state_dict(weight_dict['model_state_dict'])
    return model
class Open3dVisualizer():
    def __init__(self):
        self.point_cloud = o3d.geometry.PointCloud()
        self.o3d_started = False
        self.vis = o3d.visualization.VisualizerWithKeyCallback()
        self.vis.create_window()

    def __call__(self, points, colors):
        self.update(points, colors)
        return False

    def update(self, points, colors):
        coord_mesh = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.15, origin=[0, 0, 0])
        self.point_cloud.points = points
        self.point_cloud.colors = colors
        # self.point_cloud.transform([[1,0,0,0],[0,-1,0,0],[0,0,-1,0],[0,0,0,1]])
        # self.vis.clear_geometries()
        # Add geometries if it is the first time
        if not self.o3d_started:
            self.vis.add_geometry(self.point_cloud)
            self.vis.add_geometry(coord_mesh)
            self.o3d_started = True
        else:
            self.vis.update_geometry(self.point_cloud)
            self.vis.update_geometry(coord_mesh)
        self.vis.poll_events()
        self.vis.update_renderer()
if __name__ == '__main__':
    num_classes = 2  # number of classes in the dataset: 13 for S3DIS, 50 for ShapeNet
    color_image = o3d.io.read_image('image/rgb1.jpg')
    depth_image = o3d.io.read_image('image/depth1.png')
    point_cloud = generate_pointcloud(color_image=color_image, depth_image=depth_image)
    TEST_DATASET = PartNormalDataset(point_cloud, npoints=30000, normal_channel=True)
    testDataLoader = torch.utils.data.DataLoader(TEST_DATASET, batch_size=1, shuffle=False, num_workers=0, drop_last=True)
    predict_pcd = Generate_txt_and_3d_img(num_classes, testDataLoader, load_models(), visualize=True)
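The script normalizes the cloud in pc_normalize before prediction and undoes it when drawing (self.m * pcd[:, :3] + self.centroid). A minimal numpy sketch of that round trip, with made-up points and the hypothetical helper names normalize and denormalize, looks like this:

```python
import numpy as np

def normalize(pc):
    """Center the cloud and scale it so the farthest point has distance 1."""
    centroid = pc.mean(axis=0)
    centered = pc - centroid
    scale = np.max(np.linalg.norm(centered, axis=1))
    return centered / scale, centroid, scale

def denormalize(pc_norm, centroid, scale):
    """Undo normalize(): scale back, then shift back."""
    return pc_norm * scale + centroid

# Three made-up points
pts = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
pts_norm, centroid, scale = normalize(pts)
restored = denormalize(pts_norm, centroid, scale)
```

Keeping centroid and scale around is what lets the prediction be drawn in the original camera coordinates instead of the unit sphere.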
The script above adapts that approach to visualize the prediction for a single point cloud. The point cloud is generated from an RGB image and a depth image; if you want to input a point cloud directly, only small changes are needed. Currently it only handles data in the ShapeNet dataset format. Note that if --normal was used during training, normal_channel must be set to True.
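In generate_pointcloud, the camera intrinsics are derived from the projection matrix: fx = projection_matrix[0]*width/2 and fy = projection_matrix[5]*height/2. The sketch below reproduces that arithmetic without PyBullet, assuming the projection matrix follows the usual OpenGL perspective convention (entry 0 is f / aspect and entry 5 is f, with f = 1 / tan(fov / 2) and fov given in degrees). That convention is an assumption worth checking against the PyBullet documentation; focal_from_vertical_fov is a hypothetical helper.

```python
import math

def focal_from_vertical_fov(fov_deg, height):
    """Pinhole focal length in pixels for a vertical field of view in degrees."""
    return (height / 2.0) / math.tan(math.radians(fov_deg) / 2.0)

width, height, fov = 1280, 720, 50
aspect = width / height
# OpenGL-style projection terms: m0 = f / aspect, m5 = f, with f = 1 / tan(fov / 2)
f = 1.0 / math.tan(math.radians(fov) / 2.0)
fx = (f / aspect) * width / 2.0
fy = f * height / 2.0
print(fx, fy, focal_from_vertical_fov(fov, height))
```

Under this assumption fx and fy come out equal, which corresponds to square pixels.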
To check the training result, I used a chair file from modelnet40 to make a prediction.
The chair is roughly divided into four parts. The backrest and the legs are segmented well, but part of the armrest is assigned to the seat cushion; after all, the training time was not long. The modelnet40 dataset is only used for classification and has no segmentation annotations, so here is a visualization of an annotated chair point cloud from shapenet, to show how the parts of a chair are labeled (this is not the chair above).
Here it is more obvious that the chair is divided into four parts: backrest, armrest, cushion, and legs.
After a first look at the results, you can start making your own dataset for training. You can refer to this article I wrote: "CloudCompare Creates ShapeNet Format Point Cloud Dataset".
scene segmentation
Part segmentation networks can be easily extended to semantic scene segmentation, where point labels become semantic object classes instead of target part labels.
Experiments are conducted on the Stanford 3D semantic parsing dataset (S3DIS). This dataset contains 3D scans from Matterport scanners of 6 areas, including 271 rooms. Each point in the scan is annotated with a semantic label from one of 13 categories (chair, table, floor, wall, etc. plus clutter).
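To make the per-point semantic labels concrete, here is a small sketch that maps integer labels to the 13 category names and counts points per category. The index order of the list is an assumption and should be verified against the class list in the training script; count_points_per_class and the example labels are made up for illustration.

```python
from collections import Counter

# Assumed class order; verify against the class list in the training script.
S3DIS_CLASSES = ['ceiling', 'floor', 'wall', 'beam', 'column', 'window', 'door',
                 'table', 'chair', 'sofa', 'bookcase', 'board', 'clutter']

def count_points_per_class(labels):
    """Map integer per-point labels to class names and count them."""
    counts = Counter(S3DIS_CLASSES[i] for i in labels)
    return dict(counts)

# Made-up labels for six points
example = count_points_per_class([1, 1, 2, 8, 12, 12])
print(example)
```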
First download the S3DIS dataset and save it to the folder data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/.
Then process the data; the processed data will be saved to data/stanford_indoor3d/.
cd data_utils
python collect_indoor3d_data.py
run:
## Check model in ./models
## e.g., pointnet2_ssg
python train_semseg.py --model pointnet2_sem_seg --test_area 5 --log_dir pointnet2_sem_seg
python test_semseg.py --log_dir pointnet2_sem_seg --test_area 5 --visual
After the above steps finish, obj files with the prediction results are generated in log/sem_seg/pointnet2_sem_seg/visual/. They can be visualized with open3d, but not with the o3d.io.read_triangle_mesh function, because the obj files generated here also carry color information that encodes the semantic labels. Instead, read the file into a list and build an o3d.geometry.PointCloud() from it for display. The code is as follows:
import copy
import numpy as np
import open3d as o3d
import os
objFilePath = 'log/sem_seg/pointnet2_sem_seg/visual/Area_5_office_8_gt.obj'
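# Read the vertex lines, which have the form "v x y z r g b"
points = []
with open(objFilePath) as file:
    for line in file:
        strs = line.split()
        if not strs:
            continue
        if strs[0] == "v":
            points.append(np.array(strs[1:7], dtype=float))
        if strs[0] == "vt":
            break
# points is a list; convert it to an array for easier handling
pcd = np.array(points)
pcd_vector = o3d.geometry.PointCloud()
pcd_vector.points = o3d.utility.Vector3dVector(pcd[:, :3])
colors = pcd[:, 3:6]
# Open3D expects colors in [0, 1]; rescale if the file stores 0-255 values
if colors.max() > 1:
    colors = colors / 255.0
pcd_vector.colors = o3d.utility.Vector3dVector(colors)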
o3d.visualization.draw_geometries([pcd_vector])
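The parsing step can also be sketched as a small standalone function without open3d, which makes it easy to check on a short string. The function name parse_colored_obj and the sample text are hypothetical, and the rescaling rule (values above 1 are treated as 0-255 colors) is an assumption about how the file was written.

```python
def parse_colored_obj(text):
    """Return (points, colors) from OBJ text whose vertex lines are 'v x y z r g b'."""
    points, colors = [], []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 7 and parts[0] == "v":
            values = [float(v) for v in parts[1:7]]
            points.append(values[:3])
            colors.append(values[3:])
    # Assumption: values above 1 mean the file stores 0-255 colors
    if colors and max(max(c) for c in colors) > 1.0:
        colors = [[v / 255.0 for v in c] for c in colors]
    return points, colors

# Made-up sample with two colored vertices
sample = "v 1.0 2.0 3.0 255 0 0\nv 4.0 5.0 6.0 0 255 0\n"
points, colors = parse_colored_obj(sample)
print(points, colors)
```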
Let's take a look at the result for office_8 in Area_5:
original picture:
ground truth:
predict:
OK, I have roughly reproduced PointNet++, focusing on point cloud segmentation in preparation for completing my project. Generally speaking, training and prediction are not difficult, but the visualization needed to actually see the results took quite a long time. Part segmentation and scene segmentation are essentially the same task; in the code, the two simply use different models for training. Next, I plan to make my own dataset for training, starting with the part segmentation model, since building a scene segmentation dataset in the S3DIS format is a bit troublesome. In short, by following this blog you should be able to get PointNet++ running.
Addendum: a self-made book seam recognition project using PointNet++. The dataset and code are on GitHub: https://github.com/struggler176393/Pointnet_book_seam .