ScanNet dataset introduction and download

In the near future, I plan to reproduce the PointContrast model for semantic segmentation, but I got stuck on downloading the dataset, so I felt it was necessary to understand what this dataset actually contains. I would like to thank two bloggers whose posts (linked below) are very well written; here I will sort out my own notes.

Introduction to Datasets

The GitHub address of the dataset: https://github.com/ScanNet/ScanNet

ScanNet is an RGB-D video dataset containing 2.5 million views from over 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations. The full ScanNet v2 release is about 1.2 TB in total. (In practice you do not need to download everything; download selectively according to your task.)

An RGB-D sensor is a depth-sensing device paired with an RGB (red, green, blue) camera. It augments the conventional image with per-pixel depth information (the distance from the sensor), i.e. RGB-D = RGB + depth map.

1. How was the dataset created?

Stanford's doctoral team collected 3D reconstruction data, annotated it in an efficient way, and then collected more data. RGB-D video sequences are captured with a depth sensor attached to an iPad, driven by an iPad app; the videos are uploaded to a server and reconstructed automatically, and the reconstructions are then handed to Amazon Mechanical Turk, which crowdsources the annotation work.

2. How is the data labeled?

In a given 3D scene, annotators draw out the objects, which can be chairs, tables, computers, and so on, so that each object and its location are identified. Each image typically requires 5 people to annotate. The resulting data can be used for training tasks such as object classification. The main goal is to give 3D data a semantic interpretation, which helps robots better understand the world.

Dataset introduction

A total of 1,513 scanned scenes were collected (the number of points differs from scene to scene, so for an end-to-end pipeline you may need to sample each scene down to the same number of points, as sketched below), covering 21 object categories in total; 1,201 scenes are used for training and 312 scenes for testing.
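
A minimal sketch of that sampling step, assuming the scene's points are already loaded into a NumPy array (the array shapes and the target count of 8192 are just examples):

import numpy as np

def sample_points(points, num_points=8192):
    # Randomly pick a fixed number of points from one scene; if the scene has
    # fewer points than num_points, sample with replacement instead.
    n = points.shape[0]
    idx = np.random.choice(n, num_points, replace=(n < num_points))
    return points[idx]

# Two scenes of different sizes end up with the same number of points.
scene_a = np.random.rand(150000, 6)   # e.g. xyz + rgb per point
scene_b = np.random.rand(42000, 6)
print(sample_points(scene_a).shape, sample_points(scene_b).shape)  # (8192, 6) (8192, 6)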

Data in ScanNet is organized as RGB-D sequences. Each sequence is stored in a directory named scene<spaceId>_<scanId>, i.e. scene%04d_%02d (for example, scene0000_01 is the second scan of space 0), where each space corresponds to a unique location (indexed from 0). The raw data captured during scanning, the camera poses and surface mesh reconstructions, and the annotation metadata are all stored together for a given sequence. The directory has the following structure:

Dataset directory structure: the official description below is a bit hard to follow at first, but roughly speaking each sequence contains both 2D and 3D data; if you only work with point clouds, the 2D data is not needed.

<scanId>
|-- <scanId>.sens
    RGB-D sensor stream containing color frames, depth frames, camera poses and other data
    (compressed binary format; RGB frames are 1296×968 and depth frames are 640×480)
|-- <scanId>_vh_clean.ply
    High quality reconstructed mesh
|-- <scanId>_vh_clean_2.ply
    Cleaned and decimated mesh for semantic annotations
|-- <scanId>_vh_clean_2.0.010000.segs.json
    Over-segmentation of annotation mesh
|-- <scanId>.aggregation.json, <scanId>_vh_clean.aggregation.json
    Aggregated instance-level semantic annotations on lo-res, hi-res meshes, respectively
|-- <scanId>_vh_clean_2.0.010000.segs.json, <scanId>_vh_clean.segs.json
    Over-segmentation of lo-res, hi-res meshes, respectively (referenced by aggregated semantic annotations)
|-- <scanId>_vh_clean_2.labels.ply
    Visualization of aggregated semantic segmentation; colored by nyu40 labels (see img/legend; ply property 'label' denotes the nyu40 label id)
|-- <scanId>_2d-label.zip
    Raw 2d projections of aggregated annotation labels as 16-bit pngs with ScanNet label ids
|-- <scanId>_2d-instance.zip
    Raw 2d projections of aggregated annotation instances as 8-bit pngs
|-- <scanId>_2d-label-filt.zip
    Filtered 2d projections of aggregated annotation labels as 16-bit pngs with ScanNet label ids
|-- <scanId>_2d-instance-filt.zip
    Filtered 2d projections of aggregated annotation instances as 8-bit pngs
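
For point-cloud work, the low-resolution mesh <scanId>_vh_clean_2.ply is usually all that is needed. Below is a minimal sketch of loading its vertices and colors, assuming the third-party plyfile package and the usual x/y/z and red/green/blue vertex properties (open3d would work just as well):

import numpy as np
from plyfile import PlyData

def load_scannet_ply(path):
    # Load vertex positions and RGB colors from a *_vh_clean_2.ply file.
    ply = PlyData.read(path)
    v = ply['vertex']
    points = np.stack([v['x'], v['y'], v['z']], axis=1)           # (N, 3) float
    colors = np.stack([v['red'], v['green'], v['blue']], axis=1)  # (N, 3) uint8
    return points, colors

points, colors = load_scannet_ply('scans/scene0000_00/scene0000_00_vh_clean_2.ply')
print(points.shape, colors.shape)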

Data format and visualization code

For example, here is the surface mesh segmentation file *.segs.json:

./ContrastiveSceneContexts-main/downstream/semseg/datasets/preprocessing/scannet/SCANNET_DATA/scans/scene0000_00/scene0000_00_vh_clean_2.0.010000.segs.json

{
    "params": {            // segmentation parameters
        "kThresh": "0.0001",
        "segMinVerts": "20",
        "minPoints": "750",
        "maxPoints": "30000",
        "thinThresh": "0.05",
        "flatThresh": "0.001",
        "minLength": "0.02",
        "maxLength": "1"
    },
    "sceneId": "scene0000_00",
    "segIndices": [....]   // per-vertex index of the mesh segment each vertex belongs to
}
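
A quick sketch of reading this file with Python's standard json module; segIndices contains one entry per mesh vertex giving the id of the over-segmentation segment that vertex belongs to:

import json

with open('scene0000_00_vh_clean_2.0.010000.segs.json') as f:
    segs = json.load(f)

seg_indices = segs['segIndices']        # one segment id per mesh vertex
num_vertices = len(seg_indices)
num_segments = len(set(seg_indices))
print(num_vertices, 'vertices grouped into', num_segments, 'segments')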

Semantic annotation file *.aggregation.json

ContrastiveSceneContexts-main/downstream/semseg/datasets/preprocessing/scannet/SCANNET_DATA/scans/scene0000_00/scene0000_00.aggregation.json

{
    "sceneId": "scannet.scene0000_00",   // id of the annotated scene
    "appId": "Aggregator.v2",            // version of the tool used to create the annotation
    "segGroups": [
        {
            "id": 0,
            "objectId": 0,
            "segments": [43652,43832,43632,53294,44062,44013,44158,53070,53173,53253],
            "label": "window"
        },
        {
            ...
        },
        ....
    ],
    "segmentsFile": "scannet.scene0000_00_vh_clean_2.0.010000.segs.json"   // the referenced *.segs.json segmentation file
}

BenchmarkScripts/util_3d.py gives examples of parsing the semantic instance information from the *.segs.json, *.aggregation.json, and *_vh_clean_2.ply mesh files, and an example semantic segmentation visualization (worth trying later) is in BenchmarkScripts/3d_helpers/visualize_labels_on_mesh.py.
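
The sketch below roughly mirrors that parsing logic (it is not the official util_3d.py code): it combines segIndices from the *.segs.json file with segGroups from the *.aggregation.json file to give every mesh vertex an instance id and a text label. The file names follow the scene0000_00 example above:

import json
import numpy as np

def per_vertex_instances(segs_path, agg_path):
    # segIndices assigns every mesh vertex to a segment; each segGroup in the
    # aggregation file lists the segments that make up one object instance.
    with open(segs_path) as f:
        seg_indices = np.array(json.load(f)['segIndices'])
    with open(agg_path) as f:
        seg_groups = json.load(f)['segGroups']

    instance_ids = np.full(seg_indices.shape[0], -1, dtype=np.int32)  # -1 = unannotated
    labels = {}
    for group in seg_groups:
        mask = np.isin(seg_indices, group['segments'])   # vertices of this instance
        instance_ids[mask] = group['objectId']
        labels[group['objectId']] = group['label']       # e.g. 0 -> "window"
    return instance_ids, labels

inst_ids, inst_labels = per_vertex_instances(
    'scene0000_00_vh_clean_2.0.010000.segs.json',
    'scene0000_00.aggregation.json')
print(len(inst_ids), 'vertices,', len(inst_labels), 'annotated instances')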

Dataset download method

https://blog.csdn.net/qq_25763027/article/details/126112421

https://blog.csdn.net/qq_35781447/article/details/115078283#comments_24548052

The authors of these two posts explain the download process in great detail, and I followed their method for my own download; it is highly recommended. If that does not work, you can only email the official maintainers to request the Python download script (I have not tried this myself).

Once you have the download script, you can run commands like the following, where -o is the output directory and --type selects the type of file to download:

python download_scannet.py -o <path to store the dataset> --type _vh_clean_2.ply
python download_scannet.py -o <path to store the dataset> --type _vh_clean_2.labels.ply
python download_scannet.py -o <path to store the dataset> --type _vh_clean_2.0.010000.segs.json
python download_scannet.py -o <path to store the dataset> --type .aggregation.json
# concrete example: python download_scannet.py -o scannet/ --type _vh_clean_2.labels.ply
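
To fetch several file types in one go, the same script can be called in a loop. The sketch below assumes download_scannet.py accepts exactly the -o and --type flags shown above and sits in the current directory:

import subprocess

# File types needed for point-cloud semantic segmentation (same as the commands above).
types = [
    '_vh_clean_2.ply',
    '_vh_clean_2.labels.ply',
    '_vh_clean_2.0.010000.segs.json',
    '.aggregation.json',
]

for t in types:
    subprocess.run(['python', 'download_scannet.py', '-o', 'scannet/', '--type', t],
                   check=True)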

Downloaded file directory structure

├── scannet
│   ├── scans
│   │   ├── [scene_id]									
│   │   │   ├── [scene_id].aggregation.json
│   │   │   ├── [scene_id]_vh_clean_2.0.010000.segs.json
│   │   │   ├── [scene_id]_vh_clean_2.labels.ply
│   │   │   ├── [scene_id]_vh_clean_2.ply
│   ├── scans_test
│   │   ├── [scene_id]								
│   │   │   ├── [scene_id]_vh_clean_2.ply
│   ├── scannetv2-labels.combined.tsv
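
A small sketch to sanity-check that every scene under scans/ contains the four files listed above (the scannet/ root path is just an example matching this tree):

import os

SCANNET_DIR = 'scannet'   # example root path matching the tree above
SUFFIXES = ['.aggregation.json', '_vh_clean_2.0.010000.segs.json',
            '_vh_clean_2.labels.ply', '_vh_clean_2.ply']

scans_dir = os.path.join(SCANNET_DIR, 'scans')
for scene_id in sorted(os.listdir(scans_dir)):
    missing = [s for s in SUFFIXES
               if not os.path.isfile(os.path.join(scans_dir, scene_id, scene_id + s))]
    if missing:
        print(scene_id, 'is missing:', missing)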
