Producing a 3D point cloud dataset [LiDAR]

I've been working with robots for the past two years. Earlier this year, instead of focusing only on cameras, I decided to start working with lidar. After a lot of research, I chose a 32-beam RoboSense device.

I had to spend some time setting it up, in particular building a suitable mount that could also carry the camera. After some experimenting, the lidar was finally up and running, and I quickly fell in love with the data.

The next step in my project is to start developing a system to detect and track 3D objects using LiDAR point clouds. Applications are diverse but include detecting fixed objects (buildings, traffic signs, etc.) to create 3D maps, as well as detecting moving objects (pedestrians, cars, etc.) to avoid collisions.

Before developing any of the above applications, I first needed to learn how to load point cloud data into TensorFlow (the tool I use for deep learning) efficiently. Currently, my dataset consists of 12,200 point cloud and image pairs; the images provide context for understanding what the lidar is looking at. I also preprocessed all the point clouds to keep only the data approximately within the camera's field of view, rather than the original 360° sweep.
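
The exact cropping step isn't shown here, but a rough sketch of reducing a cloud to an approximate forward-facing field of view with Open3D and NumPy might look like the following; the 90° FOV and the assumption that the camera looks along the +X axis are illustrative guesses, not the actual setup.

import numpy as np
import open3d as o3d

def crop_to_camera_fov(pcd_path, fov_deg=90.0):
    # Load the cloud and get an (N, 3) array of X, Y, Z coordinates
    cloud = o3d.io.read_point_cloud(pcd_path)
    points = np.asarray(cloud.points)

    # Horizontal angle of every point relative to the assumed forward (+X) axis
    angles = np.degrees(np.arctan2(points[:, 1], points[:, 0]))

    # Keep only points in front of the sensor and within half the FOV
    mask = (points[:, 0] > 0) & (np.abs(angles) <= fov_deg / 2.0)
    return points[mask]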

Trying to load the data into TensorFlow was more challenging than I expected. First, the point clouds are stored as PCD (Point Cloud Data) files, a format designed for 3D point cloud data. TensorFlow cannot handle this type of file directly, so a conversion is required. The Open3D library is an easy-to-use point cloud manipulation tool: with it, I can easily load a PCD file and extract its points as a NumPy array of X, Y, and Z coordinates. Another tool, NSDT 3DConvert, was used to visualize the PCD point clouds and confirm in Google Colab that the points were extracted correctly:

https://3dconvert.nsdt.cloud
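
As a quick illustration of the Open3D step, loading a PCD and extracting its points as a NumPy array takes only a few lines; the file name below is a placeholder.

import numpy as np
import open3d as o3d

# "example.pcd" is a placeholder file name
cloud = o3d.io.read_point_cloud("example.pcd")
points = np.asarray(cloud.points)   # (N, 3) array of X, Y, Z coordinates
print(points.shape)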

With the new tools in hand, I uploaded the 12,200 PCDs and 12,200 JPGs to my Google Drive and mounted it in Google Colab. I then wrote some code to load each PCD, extract the points, and put them into a NumPy array, a structure that TensorFlow handles easily. I confidently ran the code and was horrified to see that, after a few minutes, Colab complained about running out of memory while converting the point clouds. Bad news, because I plan to collect and process far more data than I currently have.
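
For context, the naive approach that exhausted the memory looked roughly like this (a reconstruction, not the original code): load every PCD and keep all of the points in RAM at once. The Drive path is illustrative.

import glob

import numpy as np
import open3d as o3d

# Illustrative Drive path
all_points = []
for pcd_path in sorted(glob.glob("/content/drive/MyDrive/pcd/*.pcd")):
    cloud = o3d.io.read_point_cloud(pcd_path)
    all_points.append(np.asarray(cloud.points))

# Holding 12,200 full-resolution clouds in memory at once is what
# exhausted Colab's RAM.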

Fortunately, this is a common problem when working with large datasets, and tools like TensorFlow have features to handle it. The solution is the tf.data Dataset API, which provides methods for creating efficient input pipelines. Quoting the API's documentation, the use of datasets follows a common pattern (a minimal example follows the list):

  • Create a source dataset based on your input data.
  • Apply dataset transformations to preprocess the data.
  • Iterate over the dataset and process elements.
  • Iterations are performed in a streaming manner, so the complete data set does not need to be loaded into memory.
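
As a minimal, generic illustration of that pattern (unrelated to the point cloud data itself):

import tensorflow as tf

# 1. Create a source dataset from in-memory values
dataset = tf.data.Dataset.from_tensor_slices([1.0, 2.0, 3.0, 4.0])

# 2. Apply a transformation to preprocess each element
dataset = dataset.map(lambda x: x * 2.0)

# 3. Iterate; elements are produced one at a time, in a streaming fashion
for element in dataset:
    print(element.numpy())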

So, essentially, the Dataset API lets me create a pipeline in which the data is loaded in chunks as the TensorFlow training loop requests it, avoiding the out-of-memory problem. I reviewed how to use the API and wrote some code to build the data pipeline. Following step 1 of the pattern above, the code first builds the list of URLs of all the PCDs and images; then, for step 2, it loads the PCDs and converts them to NumPy point arrays, and loads and normalizes the images. But this time I ran into trouble again.

For efficiency, everything in the Dataset API (and, really, in all TensorFlow APIs) operates on tensors inside the graph. The Dataset API provides functions for loading data from several formats, but none of them handles PCD. After researching possible solutions, I decided that instead of keeping the data as thousands of PCD and JPEG files and letting TensorFlow load and preprocess them, I would preprocess all the data offline and package it into an HDF5 file.

HDF5 is an open-source file format that supports large, complex, heterogeneous data, and I verified that the Dataset API supports it. The main advantage of this format, besides working well with TensorFlow, is that I can pack all the data into one large, well-structured file that is easy to move around. I wrote a simple Python script to load all the PCDs, extract the points, and package them into a tidy HDF5 file together with the corresponding images.

import glob
from os.path import join

import cv2
import h5py
import numpy as np
import open3d as o3d


def generate_hdf5_dataset_with_padding(path, run_name, hdf5_filename):

	# Build main path
	path = join(path, run_name)
	
	# Get files
	jpgs = sorted(glob.glob(path+"/jpg/*.jpg"))
	pcds = sorted(glob.glob(path+"/pcd/*.pcd"))

	# Open HDF5 file in write mode
	with h5py.File(hdf5_filename, 'w') as f:
		
		images = []
		point_clouds = []
		
		# Determine the size of largest point cloud for padding
		max_size = 0

		for i, jpg in enumerate(jpgs):

			base_name = jpg[jpg.rfind("/")+1:jpg.find(".jpg")]

			# Load the image
			image = cv2.cvtColor(cv2.imread(jpgs[i]), cv2.COLOR_BGR2RGB) 
			images.append(image)

		
			# Load the point cloud
			cloud = o3d.io.read_point_cloud(pcds[i])
			points= np.asarray(cloud.points)
			point_clouds.append(points)
			
			# Keep track of largest size
			if points.shape[0] > max_size:
				max_size = points.shape[0]
			
			if ((i+1) % 1000 == 0):
				print("Processed ",(i+1)," pairs of files.")
				
		print("Max size ", max_size)
		print("Padding ...")

		# Pad the point clouds with 0s
		padded_point_clouds = []
		for points in point_clouds:
			pad_amount = max_size - points.shape[0]
			
			points_padded = np.pad(points, ((0, pad_amount),(0, 0)), 'constant', constant_values=(0, 0))
			padded_point_clouds.append(points_padded)

		# Create an images and a point clouds dataset in the file
		f.create_dataset("images", data = np.asarray(images))
		f.create_dataset("point_clouds", data = np.asarray(padded_point_clouds))
		
	print("Done!")

After uploading the HDF5 file (~18 GB) to my Drive, I returned to Colab and added the corresponding Dataset API code. Essentially, step 1 of the pattern loads the images and points from the HDF5 file and creates the corresponding pairs, step 2 randomly samples some points from each point cloud (I will explain why in a later article) and normalizes the image. Once this is done, step 3 is ready to serve the data nicely upon request.

import tensorflow as tf
import tensorflow_io as tfio

# Example values only; the constants actually used are not shown in the original
SAMPLE_SIZE = 4096
BATCH_SIZE = 8

def resize_and_format_data(points, image):

  # Sample a random number of points
  idxs = tf.range(tf.shape(points)[0])
  ridxs = tf.random.shuffle(idxs)[:SAMPLE_SIZE]
  points = tf.gather(points, ridxs)

  # Normalize pixels in the input image
  image = tf.cast(image, dtype=tf.float32)
  image = image/127.5
  image -= 1
  
  return points, image

def get_training_dataset(hdf5_path):
  # Get the point clouds
  x_train = tfio.IODataset.from_hdf5(hdf5_path, dataset='/point_clouds')
  # Get the images
  y_train = tfio.IODataset.from_hdf5(hdf5_path, dataset='/images')
  # Zip them to create pairs
  training_dataset = tf.data.Dataset.zip((x_train,y_train))
  # Apply the data transformations
  training_dataset = training_dataset.map(resize_and_format_data)
  
  # Shuffle, prepare batches, etc ...
  training_dataset = training_dataset.shuffle(100, reshuffle_each_iteration=True)
  training_dataset = training_dataset.batch(BATCH_SIZE)
  training_dataset = training_dataset.repeat()
  training_dataset = training_dataset.prefetch(tf.data.AUTOTUNE)

  # Return dataset
  return training_dataset
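
A quick sanity check of the pipeline can be done by pulling a single batch before handing the dataset to training; the HDF5 path below is a placeholder and the constants are the example values defined earlier.

# Placeholder path; SAMPLE_SIZE and BATCH_SIZE are set above
training_dataset = get_training_dataset("/content/drive/MyDrive/lidar_dataset.h5")

# Pull a single batch to confirm the shapes before training
points_batch, image_batch = next(iter(training_dataset))
print(points_batch.shape)   # (BATCH_SIZE, SAMPLE_SIZE, 3)
print(image_batch.shape)    # (BATCH_SIZE, height, width, 3)

# Because the dataset repeats, model.fit needs steps_per_epoch, e.g.:
# model.fit(training_dataset, steps_per_epoch=..., epochs=...)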

I tried the data pipeline with some very basic training code and it worked really well: no more out-of-memory errors. I'm not sure this is the most efficient way to feed the data, but it does the trick, and building the pipeline was a good first exercise in point cloud manipulation. Next up: training a first TensorFlow model on the point clouds.


Original link: Producing 3D point cloud data set—BimAnt

Origin: blog.csdn.net/shebao3333/article/details/133301550