Training, testing, and visualizing a custom data set with the PointNet++ model [with source code]
1. Create a custom data set
Following the layout of the ModelNet40 data set, a custom data set needs two things: point cloud data sorted into one folder per category, and configuration files that record that data.
1.1 Classified point cloud data
1.1.1 Introduction to the original data set
Each category folder contains several point cloud files in txt format.
Each file stores one point per row, and each row holds six values: x, y, z, nx, ny, nz.
x, y, and z are the spatial coordinates of the point, and nx, ny, and nz are the components of its normal vector.
To view the txt point cloud files, install CloudCompare and open them there.
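The six-value row format described above can also be read programmatically. The following is a minimal sketch, not part of the original project: the function name parse_point_cloud is hypothetical, the two sample points are made up, and the comma delimiter is an assumption (pass delimiter=None for whitespace-separated files).

```python
import io


def parse_point_cloud(lines, delimiter=","):
    """Parse rows of 'x,y,z,nx,ny,nz' into two lists of 3-tuples.

    `lines` is any iterable of strings (an open file works too).
    The comma delimiter is an assumption; use delimiter=None for whitespace.
    """
    xyz, normals = [], []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        values = [float(v) for v in line.split(delimiter)]
        if len(values) != 6:
            raise ValueError(f"line {line_no}: expected 6 values, got {len(values)}")
        xyz.append(tuple(values[:3]))      # spatial coordinates
        normals.append(tuple(values[3:]))  # normal vector components
    return xyz, normals


# Two made-up points, used only for illustration.
sample = io.StringIO("0.1,0.2,0.3,0.0,0.0,1.0\n-0.5,0.4,0.0,1.0,0.0,0.0\n")
xyz, normals = parse_point_cloud(sample)
print(len(xyz), xyz[0], normals[0])
```

Rejecting rows that do not have exactly six values makes a malformed file fail at load time rather than later during training.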
1.1.2 Create the custom data set
The goal of this experiment is to detect whether a pantograph is open or closed, so two categories of point cloud data, open and close, need to be collected. Test is an initially empty directory that will hold the final real, unlabeled test data. process_data.py is a script that automatically generates the data set configuration files; it is introduced later.
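The directory layout described above can be prepared with a few lines of Python. This is only a sketch: the helper create_dataset_skeleton and the root folder name pantograph are hypothetical, while open, close, and Test are the directory names used in this article.

```python
import tempfile
from pathlib import Path


def create_dataset_skeleton(root):
    """Create the open/, close/ and Test/ directories under root if missing.

    Returns the sorted names of the directories found under root.
    """
    root = Path(root)
    for name in ("open", "close", "Test"):
        (root / name).mkdir(parents=True, exist_ok=True)
    return sorted(p.name for p in root.iterdir() if p.is_dir())


# Demonstrated in a temporary directory; "pantograph" is a hypothetical root name.
with tempfile.TemporaryDirectory() as tmp:
    created = create_dataset_skeleton(Path(tmp) / "pantograph")
    print(created)
```

The root folder name matters because process_data.py takes the name of its working directory as the prefix of the generated configuration files, so with the hypothetical root above the files would be named pantograph_train.txt, pantograph_test.txt, and so on.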
1.2 Configuration files for recording point cloud data
1.2.1 Introduction to the original data set
Besides the classified point cloud data, some configuration files are needed to record the point cloud file paths, the categories, the training set, and the test set.
(1) filelist.txt records the relative file path of point cloud data
(2) modelnet40_shape_names.txt records all categories
(3) modelnet40_train.txt records the training set divided from the data set
(4) modelnet40_test.txt records the test set split from the data set
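To illustrate how these files fit together, here is a minimal sketch of turning a split file into (sample ID, class index) pairs. It is not the loader shipped with any PointNet++ implementation: the helpers read_lines and load_split are hypothetical, the file contents are made up, and the sketch assumes that the category name is everything before the last underscore in a sample ID (for example airplane_0001).

```python
import tempfile
from pathlib import Path


def read_lines(path):
    """Return the non-empty, stripped lines of a text file."""
    return [ln.strip() for ln in Path(path).read_text().splitlines() if ln.strip()]


def load_split(root, prefix, split):
    """Return (sample_id, class_index) pairs for the 'train' or 'test' split."""
    names = read_lines(Path(root) / f"{prefix}_shape_names.txt")
    class_index = {name: i for i, name in enumerate(names)}
    samples = []
    for sample_id in read_lines(Path(root) / f"{prefix}_{split}.txt"):
        # Assumption: "night_stand_0007" -> category "night_stand"
        category = sample_id.rsplit("_", 1)[0]
        samples.append((sample_id, class_index[category]))
    return samples


# Made-up file contents in a temporary directory, for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "modelnet40_shape_names.txt").write_text("airplane\nnight_stand\n")
    (root / "modelnet40_train.txt").write_text("airplane_0001\nnight_stand_0007\n")
    train = load_split(root, "modelnet40", "train")
    print(train)
```

Splitting on the last underscore rather than the first keeps category names that themselves contain underscores intact.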
1.2.2 Create the custom data set
Write the data set preprocessing script, process_data.py:
import os
import random
import shutil
from pathlib import Path
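# Get the working directory the script is run from
current_directory = os.getcwd()
directory_name = os.path.basename(current_directory)
# Directories that hold the labeled data, one per category
folder_paths = [Path(current_directory + "/open"), Path(current_directory + "/close")]
# Reset the Test directory to an empty state
test_folder_path = Path(current_directory + "/Test")
shutil.rmtree(test_folder_path, ignore_errors=True)  # do not fail if Test does not exist yet
os.mkdir(test_folder_path)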
for folder_path in folder_paths:
    file_list = list(folder_path.glob("*"))
    for i, file_path in enumerate(file_list, start=1):
        new_file_name = f"{str(i).zfill(4)}{file_path.suffix}"
        new_file_path = folder_path / new_file_name
        if new_file_path.exists():
            continue
        file_path.rename(new_file_path)
# Target files to generate (configuration files)
txt_file_path = Path(current_directory + "/filelist.txt")  # all labeled data set files (with extension)
txt_file_path2 = Path(current_directory + f"/{directory_name}_train.txt")  # labeled training files (without extension)
txt_file_path3 = Path(current_directory + f"/{directory_name}_test.txt")  # labeled test files (without extension)
txt_file_path4 = Path(current_directory + f"/{directory_name}_shape_names.txt")  # all category names
txt_file_path5 = Path(current_directory + f"/{directory_name}_realtest.txt")  # unlabeled real test files (without extension)
# Open the target files
with txt_file_path.open("w") as txt_file, txt_file_path2.open("w") as txt_file2, txt_file_path3.open(
        "w") as txt_file3, txt_file_path4.open("w") as txt_file4:
    for folder_path in folder_paths:
        folder_name = folder_path.name
        file_list = list(folder_path.glob("*"))
        random.shuffle(file_list)  # shuffle the file list
        total_files = len(file_list)
        train_files_count = int(total_files * 0.8)  # 80% of the files for training
        for i, file_path in enumerate(file_list, start=1):
            new_file_name = f"{folder_name}_{str(i).zfill(4)}{file_path.suffix}"
            new_file_path = folder_path / new_file_name
            file_path.rename(new_file_path)  # rename the file to category_number
            txt_file.write(
                str(new_file_path.relative_to(os.path.dirname(folder_path))) + "\n")  # write the new relative path to filelist.txt
            if i <= train_files_count:
                txt_file2.write(new_file_path.stem + "\n")  # first 80% (without extension) go to _train.txt
            else:
                txt_file3.write(new_file_path.stem + "\n")  # remaining 20% (without extension) go to _test.txt
        txt_file4.write(folder_name + "\n")  # write the folder (category) name to _shape_names.txt
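# The code below fills the Test directory with real test data. It simply draws some files from the
# labeled data; their labels show up only in the directory name, so here they can be treated as unlabeled.
# Randomly pick n files from the open directory
n_test = 5
open_files = list(Path(current_directory + "/open").glob("*"))
selected_open_files = random.sample(open_files, n_test)  # raises ValueError if fewer than n_test files exist
# Randomly pick n files from the close directory
close_files = list(Path(current_directory + "/close").glob("*"))
selected_close_files = random.sample(close_files, n_test)
# Copy the selected files into the Test directory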
test_folder_path = Path(current_directory + "/Test"