Interpretation of CenterPoint source code process (1)

This article uses the following configuration file, located under configs/centerpoint in the mmdetection3d project:
centerpoint_02pillar_second_secfpn_4x8_cyclic_20e_nus.py

1. Data processing part (mainly for point cloud) – train_pipeline process
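
As an orientation aid, the twelve steps covered below can be written as an mmdetection3d-style pipeline list. This is only a sketch: the order follows this article, while the argument values (point_cloud_range, class_names, the empty db_sampler, the flip ratios) are illustrative assumptions and should be checked against the actual configuration file.

```python
# Illustrative values only; they are assumptions, not copied from the real config.
point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]  # assumed range
class_names = ['car', 'truck', 'pedestrian']               # assumed, shortened list
db_sampler = dict()  # placeholder; the real sampler config is much longer

train_pipeline = [
    dict(type='LoadPointsFromFile', coord_type='LIDAR', load_dim=5,
         use_dim=[0, 1, 2, 3, 4]),
    dict(type='LoadPointsFromMultiSweeps', sweeps_num=10,
         use_dim=[0, 1, 2, 4]),
    dict(type='LoadAnnotations3D', with_bbox_3d=True, with_label_3d=True),
    dict(type='ObjectSample', db_sampler=db_sampler),
    dict(type='GlobalRotScaleTrans',
         rot_range=[-0.78539816, 0.78539816],
         scale_ratio_range=[0.95, 1.05],
         translation_std=[0, 0, 0]),
    dict(type='RandomFlip3D', sync_2d=False,
         flip_ratio_bev_horizontal=0.5, flip_ratio_bev_vertical=0.5),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectNameFilter', classes=class_names),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d']),
]

# Print the step names in execution order.
for step in train_pipeline:
    print(step['type'])
```

Each dict names a transform class through its type key; the remaining keys are passed to that class as initialization parameters.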

1. LoadPointsFromFile

1.1 Function: Load point cloud from file.

1.2 Initialization parameters

  • coord_type: Coordinate system type; one of 'LIDAR', 'DEPTH', 'CAMERA'
  • load_dim: Number of dimensions of each loaded point; the default is 6, and it is set to 5 for the nuScenes dataset
  • use_dim: The dimensions to use; the default is [0, 1, 2], i.e. only xyz is used
  • shift_height: Whether to use the shifted height, the default is false
  • use_color: Whether to use color features, the default is false
  • file_client_args: (optional) File client configuration; the default is the disk backend, i.e. the file is read directly from the given path.

1.3 Functions within the class (__init__ is omitted for all of the following)

  • _load_points: load point cloud data
  • __call__: Called by the pipeline to read the point cloud from the file; returns the result dict containing the point cloud data
  • __repr__: Returns the module description string
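
The interplay of load_dim and use_dim can be sketched as follows. This is not the mmdetection3d implementation: load_points is a hypothetical helper, plain Python lists stand in for the library's point classes, and the file is assumed to be a flat float32 buffer with five values per point (x, y, z, intensity, ring index).

```python
from array import array


def load_points(raw_bytes, load_dim=5, use_dim=(0, 1, 2)):
    """Decode a flat float32 buffer into rows of load_dim values, keep use_dim."""
    flat = array('f')
    flat.frombytes(raw_bytes)
    if len(flat) % load_dim != 0:
        raise ValueError('buffer length is not a multiple of load_dim')
    # Cut the flat buffer into one row per point.
    rows = (flat[i:i + load_dim] for i in range(0, len(flat), load_dim))
    # Keep only the requested columns of every row.
    return [[row[d] for d in use_dim] for row in rows]


# Two fake points in the assumed layout: x, y, z, intensity, ring index.
raw = array('f', [1.0, 2.0, 3.0, 0.5, 0.0,
                  4.0, 5.0, 6.0, 0.25, 1.0]).tobytes()
points = load_points(raw, load_dim=5, use_dim=(0, 1, 2))
print(points)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

load_dim describes how the file is laid out, while use_dim only selects which of those columns are passed on to later steps.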

2. LoadPointsFromMultiSweeps

2.1 Function: Load multi-frame point cloud data

2.2 Initialization parameters

  • sweeps_num: Number of sweeps (frames) to load, default 10
  • load_dim: Number of dimensions of each loaded point, default 5
  • use_dim: The dimensions to use, default [0, 1, 2, 4]; dimension 4 holds the timestamp difference
  • time_dim: Index of the dimension that stores the timestamp of each point, default 4
  • file_client_args: Same as above
  • pad_empty_sweeps: Whether to repeat the keyframe when no sweeps are available, the default is false
  • remove_close: Whether to remove points that are too close to the origin, default false
  • test_mode: If true, sweeps are not randomly sampled and only the nearest N frames are selected, the default is false

2.3 Functions in a class

  • _load_points: load point cloud data
  • _remove_close: Remove all points within a certain radius of the origin
  • __call__: Called by the pipeline to load the multi-sweep point cloud data from the files; returns the result dict containing the point cloud data
  • __repr__: Returns the module description string
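
The roles of remove_close and time_dim can be sketched as below. The helpers remove_close and merge_sweeps are hypothetical, the "close" region is assumed to be a square around the origin in x and y, and the coordinate transform that aligns each sweep with the keyframe is left out.

```python
def remove_close(points, radius=1.0):
    """Drop points whose |x| and |y| are both below radius (assumed criterion)."""
    return [p for p in points
            if not (abs(p[0]) < radius and abs(p[1]) < radius)]


def merge_sweeps(key_points, sweeps, key_timestamp, time_dim=4, radius=1.0):
    """Concatenate keyframe and sweep points, storing the time lag in time_dim."""
    merged = []
    for p in key_points:
        q = list(p)
        q[time_dim] = 0.0  # keyframe points have zero time lag
        merged.append(q)
    for sweep_points, sweep_timestamp in sweeps:
        lag = key_timestamp - sweep_timestamp
        for p in remove_close(sweep_points, radius):
            q = list(p)
            q[time_dim] = lag  # older sweeps get a larger lag
            merged.append(q)
    return merged


key = [[10.0, 0.0, 0.0, 0.5, 0.0]]
sweep = ([[0.2, 0.3, 0.0, 0.1, 0.0], [5.0, 5.0, 1.0, 0.2, 0.0]], 9.5)
merged = merge_sweeps(key, [sweep], key_timestamp=10.0)
print(merged)
# [[10.0, 0.0, 0.0, 0.5, 0.0], [5.0, 5.0, 1.0, 0.2, 0.5]]
```

The sweep point at (0.2, 0.3) is dropped because it lies inside the assumed exclusion square, and the remaining sweep point carries a lag of 0.5 in dimension 4.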

3. LoadAnnotations3D

3.1 Function: Load the 3D box annotations, and load the point cloud instance masks and semantic masks, storing each item in its associated field.

3.2 Initialization parameters

  • with_bbox_3d: whether to load 3D box, the default is true
  • with_label_3d: Whether to load the 3D box label, the default is true
  • with_attr_label: Whether to load the attribute label, the default is false
  • with_mask_3d: Whether to load the point cloud 3D instance mask, the default is false
  • with_seg_3d: Whether to load the point cloud 3D semantic mask, the default is false
  • with_bbox: whether to load 2D box, default false
  • with_label: whether to load 2D labels, default false
  • with_mask: Whether to load the 2D instance mask, the default is false
  • with_seg: Whether to load the 2D semantic mask, the default is false
  • with_bbox_depth: whether to load 2.5D box, default false
  • poly2mask: Whether to convert polygon annotations to binary masks, the default is true
  • seg_3d_dtype: Data type of the 3D semantic mask, default int64
  • file_client_args: Same as above

3.3 Functions within the class
The actual processing behind the following functions is carried out in the mmdet3d.CustomDataset class.

  • _load_bboxes_3d: Load 3D box annotations; returns two keys, gt_bboxes_3d and bbox3d_fields
  • _load_bboxes_depth: Load 2.5D box annotations; returns two keys, center2d and depths
  • _load_labels_3d: Load label annotations; returns the gt_labels_3d key
  • _load_attr_labels: Load attribute labels; returns the attr_labels key
  • _load_masks_3d: Load 3D mask annotations; returns two keys, pts_instance_mask and pts_mask_fields
  • _load_semantic_seg_3d: Load 3D semantic segmentation annotations; returns two keys, pts_semantic_mask and pts_seg_fields
  • __call__: Load every enabled annotation type and return the result dict containing them
  • __repr__: Returns the module description string

4. ObjectSample

4.1 Function: Sample ground-truth (GT) objects and paste them into the data

4.2 Initialization parameters

  • db_sampler (dict): Configuration of the ground-truth database sampler
  • sample_2d (bool): Whether to also paste the 2D image patch of each sampled object onto the image; it should be set to true for multi-modal cut-and-paste, the default is false
  • use_ground_plane (bool): Whether to use the ground plane to adjust the 3D labels, default false

4.3 Functions in a class

  • remove_points_in_boxes (static method): Remove the points that fall inside the sampled bboxes
  • __call__: Sample ground-truth objects into the data; the returned result contains three keys, gt_bboxes_3d, gt_labels_3d and points
  • __repr__: Returns the module description string
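
The idea behind remove_points_in_boxes can be sketched as follows. This is a simplification under stated assumptions: boxes are axis-aligned tuples (cx, cy, cz, dx, dy, dz) centered at their geometric center, whereas real boxes also carry a yaw angle, and paste_objects is a hypothetical helper rather than the library's sampling logic.

```python
def point_in_box(point, box):
    """True if point lies inside the axis-aligned box (cx, cy, cz, dx, dy, dz)."""
    return all(abs(point[i] - box[i]) <= box[i + 3] / 2 for i in range(3))


def remove_points_in_boxes(points, boxes):
    """Keep only the points that fall in none of the boxes."""
    return [p for p in points if not any(point_in_box(p, b) for b in boxes)]


def paste_objects(points, gt_boxes, sampled_boxes, sampled_points):
    """Clear space for the sampled objects, then add their boxes and points."""
    kept = remove_points_in_boxes(points, sampled_boxes)
    return sampled_points + kept, gt_boxes + sampled_boxes


scene_points = [(0.0, 0.0, 0.0), (5.0, 5.0, 0.0)]
scene_boxes = [(0.0, 0.0, 0.0, 1.0, 1.0, 1.0)]
new_boxes = [(5.0, 5.0, 0.0, 2.0, 2.0, 2.0)]
new_points = [(5.1, 5.0, 0.0)]

points, boxes = paste_objects(scene_points, scene_boxes, new_boxes, new_points)
print(points)  # [(5.1, 5.0, 0.0), (0.0, 0.0, 0.0)]
print(len(boxes))  # 2
```

Removing the original points first prevents background points from ending up inside a pasted object.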

5. GlobalRotScaleTrans

5.1 Function: Apply a global rotation, scaling and translation to a 3D scene

5.2 Initialization parameters

  • rot_range (list[float]): Range of the rotation angle, default [-0.78539816, 0.78539816] (approximately [-pi/4, pi/4])
  • scale_ratio_range (list[float]): Range of the scaling factor, default [0.95, 1.05]
  • translation_std (list[float]): Standard deviation of the translation noise; the scene is translated randomly by adding noise sampled from a Gaussian distribution, the default is [0, 0, 0]
  • shift_height (bool): Whether to shift the height value, the default is false

5.3 Functions in a class

  • _trans_bbox_points: Translate bbox and point cloud
  • _rot_bbox_points: Rotate bbox and point cloud
  • _scale_bbox_points: scale bbox and point cloud
  • _random_scale: Randomly set the scale factor
  • update_transform: update the transformation matrix
  • __call__: rotate, scale, translate bbox and point cloud
  • __repr__
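
The three steps can be sketched for points as below. The sketch assumes the order rotate, scale, translate given for __call__, a counter-clockwise rotation about the z axis, and a hypothetical helper global_rot_scale_trans; the boxes, which the real transform updates as well, are left out.

```python
import math
import random


def rotate_z(point, angle):
    """Rotate a point counter-clockwise about the z axis (assumed convention)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def global_rot_scale_trans(points, rot_range, scale_ratio_range,
                           translation_std, rng):
    """Apply one random rotation, scale and translation to the whole scene."""
    angle = rng.uniform(*rot_range)
    scale = rng.uniform(*scale_ratio_range)
    shift = [rng.gauss(0.0, std) for std in translation_std]
    out = []
    for p in points:
        rotated = rotate_z(p, angle)
        out.append(tuple(scale * v + t for v, t in zip(rotated, shift)))
    # Return the sampled parameters so that they can be recorded or undone.
    return out, dict(angle=angle, scale=scale, shift=shift)


rng = random.Random(0)
moved, params = global_rot_scale_trans(
    [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0)],
    rot_range=(-0.78539816, 0.78539816),
    scale_ratio_range=(0.95, 1.05),
    translation_std=(0.0, 0.0, 0.0),
    rng=rng)
print(params)
```

Because the same parameters are applied to every point, the relative geometry of the scene is preserved and only its global pose and size change.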

6. RandomFlip3D

6.1 Function: Randomly flip the point cloud and bboxes.
Note: If the input dict already contains a "flip" key, that flag is used. Otherwise, whether to flip is decided randomly according to the ratio given at initialization.

6.2 Initialization parameters

  • sync_2d (bool, optional): Whether to keep the 2D image flip in sync with the 3D flip. If true, the 2D image is flipped according to the 3D flip; if false, whether to flip the 2D image is decided randomly and independently. Defaults to true.
  • flip_ratio_bev_horizontal (float, optional): Probability of a horizontal flip in bird's-eye view, default 0.0
  • flip_ratio_bev_vertical (float, optional): Probability of a vertical flip in bird's-eye view, default 0.0
  • **kwargs: Additional keyword arguments

6.3 Functions in a class

  • random_flip_data_3d: Flip the 3D data
  • update_transform: Update the transformation matrix
  • __call__: Flip the point cloud and the boxes in bbox3d_fields, and also flip the 2D image and its annotations
  • __repr__
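
The flag handling described in the note, and the flip itself, can be sketched as follows. decide_flip and flip_points are hypothetical helpers; the key name pcd_horizontal_flip is taken from the meta keys listed under Collect3D, and the axis convention (a horizontal flip negates y, a vertical flip negates x) is an assumption for LiDAR coordinates. Boxes and their yaw angles are not handled.

```python
import random


def decide_flip(input_dict, key, ratio, rng):
    """Use an existing flag if present, otherwise draw one with probability ratio."""
    if key not in input_dict:
        input_dict[key] = rng.random() < ratio
    return input_dict[key]


def flip_points(points, direction):
    """Mirror points in bird's-eye view (assumed axis convention)."""
    if direction == 'horizontal':
        return [(x, -y, z) for x, y, z in points]
    if direction == 'vertical':
        return [(-x, y, z) for x, y, z in points]
    raise ValueError(f'unknown direction: {direction}')


rng = random.Random(0)
sample = {'points': [(1.0, 2.0, 3.0)]}
if decide_flip(sample, 'pcd_horizontal_flip', ratio=1.0, rng=rng):
    sample['points'] = flip_points(sample['points'], 'horizontal')
print(sample['points'])  # [(1.0, -2.0, 3.0)]
```

Storing the decision in the dict means that a later call with the same dict repeats the same flip instead of drawing a new one.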

7. PointsRangeFilter

7.1 Function: Filter point cloud by range

7.2 Initialization parameters

  • point_cloud_range (list[float]): point cloud range

7.3 Functions in a class

  • __call__: filter point cloud by range
  • __repr__
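
A minimal sketch of range filtering, assuming point_cloud_range is laid out as [x_min, y_min, z_min, x_max, y_max, z_max] and that points on the boundary are discarded; filter_points_by_range is a hypothetical helper.

```python
def filter_points_by_range(points, point_cloud_range):
    """Keep the points strictly inside the 3D range."""
    x_min, y_min, z_min, x_max, y_max, z_max = point_cloud_range
    return [p for p in points
            if x_min < p[0] < x_max
            and y_min < p[1] < y_max
            and z_min < p[2] < z_max]


pc_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]  # illustrative values
cloud = [(0.0, 0.0, 0.0), (60.0, 0.0, 0.0), (10.0, -10.0, 4.0)]
print(filter_points_by_range(cloud, pc_range))  # [(0.0, 0.0, 0.0)]
```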

8. ObjectRangeFilter

8.1 Function: Filter objects (ground-truth boxes) by range

8.2 Initialization parameters

  • point_cloud_range (list[float]): point cloud range

8.3 Functions in a class

  • __call__: filter objects by range
  • __repr__
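
Object filtering can be sketched in the same spirit. The sketch assumes that only the bird's-eye-view position of the box center is tested (x and y, not z), that boxes are tuples starting with (cx, cy, ...), and that labels are dropped together with their boxes; filter_objects_by_range is a hypothetical helper.

```python
def filter_objects_by_range(boxes, labels, point_cloud_range):
    """Keep the boxes whose center lies inside the x/y extent of the range."""
    x_min, y_min, _, x_max, y_max, _ = point_cloud_range
    keep = [x_min < b[0] < x_max and y_min < b[1] < y_max for b in boxes]
    kept_boxes = [b for b, k in zip(boxes, keep) if k]
    kept_labels = [l for l, k in zip(labels, keep) if k]
    return kept_boxes, kept_labels


obj_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]  # illustrative values
gt_boxes = [(10.0, 5.0, -1.0), (70.0, 0.0, -1.0), (0.0, 0.0, 10.0)]
gt_labels = [0, 1, 2]
print(filter_objects_by_range(gt_boxes, gt_labels, obj_range))
# ([(10.0, 5.0, -1.0), (0.0, 0.0, 10.0)], [0, 2])
```

Filtering boxes and labels with the same mask keeps the two lists aligned.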

9. ObjectNameFilter

9.1 Function: Filter ground-truth objects by category name

9.2 Initialization parameters

  • classes (list[str]): A list of class names that need to be retained for training

9.3 Functions in a class

  • __call__: filter objects by name
  • __repr__
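
A sketch of name-based filtering, assuming that labels are integer indices into classes and that objects of any other category carry a label outside that range, such as -1. filter_objects_by_name is a hypothetical helper.

```python
def filter_objects_by_name(boxes, labels, classes):
    """Keep the objects whose label is a valid index into classes."""
    valid = set(range(len(classes)))
    keep = [label in valid for label in labels]
    kept_boxes = [b for b, k in zip(boxes, keep) if k]
    kept_labels = [l for l, k in zip(labels, keep) if k]
    return kept_boxes, kept_labels


train_classes = ['car', 'pedestrian']  # illustrative class list
all_boxes = ['box_a', 'box_b', 'box_c']  # stand-ins for real 3D boxes
all_labels = [0, -1, 1]  # -1 marks a category that is not trained on
print(filter_objects_by_name(all_boxes, all_labels, train_classes))
# (['box_a', 'box_c'], [0, 1])
```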

10. PointShuffle

10.1 Function: Shuffle the order of the points in the input point cloud

10.2 Initialization parameters: none

10.3 Functions in a class

  • __call__: Shuffle the order of the points
  • __repr__
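
A sketch of the shuffle, assuming that any per-point mask has to be permuted with the same index so that it stays aligned with the points. shuffle_points is a hypothetical helper and the rng argument is added only to make the example repeatable.

```python
import random


def shuffle_points(points, point_mask=None, rng=None):
    """Permute the points, applying the same permutation to an optional mask."""
    rng = rng or random
    order = list(range(len(points)))
    rng.shuffle(order)
    shuffled = [points[i] for i in order]
    mask = None if point_mask is None else [point_mask[i] for i in order]
    return shuffled, mask


pts = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (2.0, 2.0, 2.0)]
pt_mask = [0, 1, 2]  # one label per point
new_pts, new_mask = shuffle_points(pts, pt_mask, rng=random.Random(0))
# Each point still carries its own label after shuffling.
print(all(p[0] == float(m) for p, m in zip(new_pts, new_mask)))  # True
```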

11. DefaultFormatBundle3D

11.1 Function: Format and pack 3D information by default

Note: This transform simplifies the formatting of common fields for voxels, mainly "proposals", "gt_bboxes", "gt_labels", "gt_masks" and "gt_semantic_seg". The fields are converted as follows:

    - img: (1)transpose, (2)to tensor, (3)to DataContainer (stack=True)
    - proposals: (1)to tensor, (2)to DataContainer
    - gt_bboxes: (1)to tensor, (2)to DataContainer
    - gt_bboxes_ignore: (1)to tensor, (2)to DataContainer
    - gt_labels: (1)to tensor, (2)to DataContainer

11.2 Initialization parameters

  • class_names: list of classes
  • with_gt (bool): Whether to use the ground truth, the default is true
  • with_label (bool): Whether to use labels, default true

11.3 Functions in a class

  • __call__: Convert and format the common fields and write them back into results
  • __repr__: returns a string describing the module

12. Collect3D

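12.1 Function: Collect the data from the loader that is relevant to the specific task

Note:
1) This transform is usually the last stage of the data loading pipeline. A typical set of keys is "img", "proposals", "gt_bboxes", "gt_bboxes_ignore", "gt_labels", "gt_masks".
2) The img_meta item is always populated. Its contents depend on meta_keys, and by default it contains:

- 'img_shape': size of the image input to the network, a tuple (h, w, c). Note that the image may be zero-padded on the right/bottom
- 'scale_factor': preprocessing scale
- 'flip': whether the image was flipped
- 'filename': path to the image file
- 'ori_shape': original shape of the image, a tuple (h, w, c)
- 'pad_shape': image size after padding
- 'lidar2img': transformation matrix from lidar to image
- 'depth2img': transformation matrix from depth to image
- 'cam2img': transformation matrix from the camera coordinate system to the image coordinate system
- 'pcd_horizontal_flip': whether the point cloud was flipped horizontally
- 'pcd_vertical_flip': whether the point cloud was flipped vertically
- 'box_mode_3d': 3D box mode
- 'box_type_3d': 3D box type
- 'img_norm_cfg': dict of normalization information
    - mean: per-channel mean
    - std: per-channel standard deviation
    - to_rgb: whether to convert from BGR to RGB
- 'pcd_trans': translation applied to the point cloud
- 'sample_idx': index of the sample keyframe
- 'pcd_scale_factor': point cloud scale factor
- 'pcd_rotation': rotation applied to the point cloud
- 'pts_filename': path to the point cloud file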

12.2 Initialization parameters

  • keys (Sequence[str]): The keys to collect from results
  • meta_keys (Sequence[str], optional): The meta keys; their values are converted to an mmcv.DataContainer and stored in data[img_metas].

12.3 Inner-class functions

  • __call__: Collect the requested keys from results
  • __repr__: returns a string describing the module
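
The collection step can be sketched with plain dicts. collect is a hypothetical helper; the real transform additionally wraps the meta information in an mmcv.DataContainer, which is omitted here, and the sample values are made up.

```python
def collect(results, keys, meta_keys):
    """Keep the requested keys and gather the meta keys under 'img_metas'."""
    # Meta keys that are absent from results are simply skipped.
    img_metas = {k: results[k] for k in meta_keys if k in results}
    data = {'img_metas': img_metas}
    for key in keys:
        data[key] = results[key]
    return data


results = {
    'points': [(0.0, 0.0, 0.0)],
    'gt_bboxes_3d': ['box_a'],
    'gt_labels_3d': [0],
    'pcd_horizontal_flip': False,
    'pts_filename': 'sample.bin',  # made-up file name
    'scratch': 'not needed downstream',
}
data = collect(results,
               keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'],
               meta_keys=['pcd_horizontal_flip', 'pts_filename', 'img_shape'])
print(sorted(data))  # ['gt_bboxes_3d', 'gt_labels_3d', 'img_metas', 'points']
```

Everything that is listed in neither keys nor meta_keys, such as the scratch entry above, is dropped at this point.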

To be continued in "Interpretation of CenterPoint source code process (2)".

Origin blog.csdn.net/weixin_36354875/article/details/127757667