1. Project introduction
In this project, a convolutional neural network will be created to detect and classify objects using Waymo's dataset. You will be given a dataset of images of urban environments with labeled cyclists, pedestrians, and vehicles.
First, extensive data analysis is performed, including computation of label distributions, display of sample images, and checking for object occlusion.
Example nighttime imagery from the Waymo dataset, with annotations for vehicles and pedestrians
Use this analysis to decide which augmentations make sense for this project; then, train a neural network to detect and classify objects.
Use TensorBoard to monitor training and decide when it's over. Finally, experiment with different hyperparameters to improve the performance of the model.
This project uses the TensorFlow Object Detection API, where you can deploy your model to get predictions on images sent to the API. Code is also provided for creating a short video of the model's predictions.
2. Environment configuration
Create the project
Local Setup : Use the instructions below to create a Docker container using a local GPU, or create a similar environment on a cloud provider's GPU instance.
Project files
First, get the project files from the associated GitHub repository ( https://github.com/udacity/nd013-c1-vision-starter ).
Docker installation
For local installations, if you have your own Nvidia GPU, you can use the provided Dockerfile and requirements in the build directory of the starter code.
The instructions below are also included in the build directory of the starter code.
Requirements
- NVIDIA GPU with latest drivers installed
- Docker / nvidia-docker
Build
Use the following command:
docker build -t project-dev -f Dockerfile .
Create a container:
docker run --gpus all -v <PATH TO LOCAL PROJECT FOLDER>:/app/project/ --network=host -ti project-dev bash
Add any other flags you find useful for your system (for example, --shm-size).
Setup
Once inside the container, gsutil needs to be installed, by running:
curl https://sdk.cloud.google.com | bash
Once gsutil is installed and added to your path, you can authenticate with:
gcloud auth login
Debug
- If you have any problems installing the TF Object Detection API, please follow the "Installation" page of the TensorFlow 2 Object Detection API tutorial documentation.
3. Project Description
Object Detection in Urban Environments ( https://github.com/udacity/nd013-c1-vision-starter )
Local setup
For a local installation, if you have your own Nvidia GPU, you can use the Dockerfile and requirements provided in the build directory ( https://github.com/udacity/nd013-c1-vision-starter/tree/main/build ).
Follow the README in that directory (build/README.md) to create a Docker container and install all prerequisites.
Download and process data
For this project, we will use data from the Waymo Open dataset.
The files can be downloaded directly from the website as tar files, or from a Google Cloud bucket as individual tf records.
The first goal of this project is to download the data from Waymo's Google Cloud bucket to your local machine. For this project, we only need a subset of the data provided (for example, we do not need the Lidar data). Therefore, we download and trim each file immediately. In download_process.py, see the create_tf_example function, which performs this processing. This function takes the components of a Waymo Tf record and saves them in the Tf Object Detection API format. An example of such a function is described in the "Training Custom Object Detector" page of the TensorFlow 2 Object Detection API tutorial documentation. The label_map.pbtxt file is already provided.
The command to run the script is as follows:
python download_process.py --data_dir {processed_file_location} --size {number of files you want to download}
The script downloads 100 files (unless the size parameter is changed), so be patient! After the script finishes, you can check the data_dir folder to see if the files were downloaded and processed correctly.
Structure
Data
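One quick way to check the data_dir folder is to list the processed files and their sizes. The sketch below is only an illustration: it assumes the processed files sit directly in data_dir and end in .tfrecord (like the segment file used for the inference video later in this guide), and the helper names summarize_records and report are made up for this example.

```python
from pathlib import Path


def summarize_records(data_dir):
    """Return (file name, size in bytes) for every .tfrecord file in data_dir."""
    # Assumption: processed files sit directly in data_dir and end in .tfrecord.
    records = sorted(Path(data_dir).glob("*.tfrecord"))
    return [(p.name, p.stat().st_size) for p in records]


def report(data_dir):
    """Print one line per file and flag empty files, which suggest a failed download."""
    summary = summarize_records(data_dir)
    for name, size in summary:
        flag = "  <-- empty, check this file" if size == 0 else ""
        print(f"{name}: {size} bytes{flag}")
    print(f"{len(summary)} tfrecord files found")
    return summary
```

Calling report("/path/to/data_dir") after the download prints the file count, which can be compared with the value passed to --size.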
The data that will be used for training, validation and testing can be organized as follows:
- train : contains training data
- val : contains validation data
- test : contains test files to test the model and create inference videos
By completing and executing the create_splits.py file, the downloaded data will be split into training, validation and test sets.
Experiments
Experiment folders will be organized as follows:
experiments/
    - pretrained_model/
    - exporter_main_v2.py - to create an inference model
    - model_main_tf2.py - to launch training
    - reference/ - reference training with the unchanged config file
    - experiment0/ - create a new folder for each experiment you run
    - experiment1/ - create a new folder for each experiment you run
    - experiment2/ - create a new folder for each experiment you run
    - label_map.pbtxt
    ...
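Creating a new folder per experiment can be scripted. The following is a minimal sketch, not part of the starter code: the helper name new_experiment is hypothetical, and it assumes the reference config is stored as experiments/reference/pipeline_new.config, as described in Step 2 of this guide.

```python
import shutil
from pathlib import Path


def new_experiment(experiments_dir, reference="reference", config_name="pipeline_new.config"):
    """Create the next experimentN/ folder and copy the reference config into it."""
    experiments_dir = Path(experiments_dir)
    # Find the first unused index: experiment0, experiment1, ...
    index = 0
    while (experiments_dir / f"experiment{index}").exists():
        index += 1
    target = experiments_dir / f"experiment{index}"
    target.mkdir(parents=True)
    # Copy the reference config as a starting point, if it exists.
    source = experiments_dir / reference / config_name
    if source.exists():
        shutil.copy(source, target / config_name)
    return target
```

Starting each experiment from a copy of the reference config keeps the reference run unchanged, so later results can always be compared against it.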
Instructions
Step 1a - Exploratory Data Analysis (EDA)
Explore datasets using data already in the /home/workspace/data/ directory! This is the most important task in any machine learning project.
- Implement the display_images function in the exploratory data analysis notebook. The output of this function is shown in the following figure:
Expected output of display_images function
- Additional EDA : Feel free to spend more time exploring the data and reporting your findings. Report anything related to the dataset in the written report.
Refer to this analysis when creating the different splits (training, validation and test sets).
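As one example of additional EDA, the label distribution can be computed by counting class ids over all images. The sketch below works on plain lists of class ids, so it does not depend on how the records are parsed. The id-to-name mapping is an assumption and should be checked against label_map.pbtxt; the function name label_distribution is made up for this example.

```python
from collections import Counter

# Assumption: class ids follow the project's label_map.pbtxt; check the file
# and adjust this mapping if it differs.
CLASS_NAMES = {1: "vehicle", 2: "pedestrian", 4: "cyclist"}


def label_distribution(per_image_classes):
    """Count objects per class over an iterable of per-image class-id lists."""
    counts = Counter()
    for classes in per_image_classes:
        counts.update(int(c) for c in classes)
    total = sum(counts.values())
    # Return {class name: (count, share of all objects)}.
    return {
        CLASS_NAMES.get(cid, f"unknown({cid})"): (n, n / total if total else 0.0)
        for cid, n in sorted(counts.items())
    }
```

A strongly skewed result (for example, very few cyclists) is worth noting in the written report, because it affects both the choice of splits and how the metrics should be read.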
Step 1b - Create train-validation split
We discussed cross-validation and the importance of creating meaningful training and validation splits. For this project, the training, validation and test sets must be created using the files located in /home/workspace/data/. The split function in the create_splits.py file does the following:
- Create three subfolders: /home/workspace/data/train/, /home/workspace/data/val/ and /home/workspace/data/test/
- Split the tf record files between these three folders.
After the function is implemented, run the script with the following command:
python create_splits.py --data-dir /home/workspace/data
You can also use any other method to divide the data into training, validation, and testing.
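A minimal sketch of such a split function is shown below. It is one possible approach, not the reference solution: the signature, the 75/15/10 ratios, and the fixed seed are assumptions chosen for illustration, and it splits at the file level by shuffling the .tfrecord files and moving them.

```python
import random
import shutil
from pathlib import Path


def split(data_dir, ratios=(0.75, 0.15, 0.10), seed=42):
    """Move the .tfrecord files in data_dir into train/, val/ and test/ subfolders."""
    data_dir = Path(data_dir)
    files = sorted(data_dir.glob("*.tfrecord"))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(files)

    n_train = int(len(files) * ratios[0])
    n_val = int(len(files) * ratios[1])
    groups = {
        "train": files[:n_train],
        "val": files[n_train:n_train + n_val],
        "test": files[n_train + n_val:],  # whatever is left
    }
    for name, group in groups.items():
        target = data_dir / name
        target.mkdir(exist_ok=True)
        for path in group:
            shutil.move(str(path), str(target / path.name))
    return {name: len(group) for name, group in groups.items()}
```

Splitting by file rather than by frame keeps all frames of one drive in the same set, which avoids near-duplicate images appearing in both training and validation data.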
Step 2 - Edit the configuration file
Now you are ready to train. The Tf Object Detection API relies on configuration files. The configuration used in this project is pipeline.config, which is the config for an SSD Resnet 50 640x640 model. You can learn more about single-shot detectors here ( https://arxiv.org/pdf/1512.02325.pdf ).
- First, let's download the pretrained model ( http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz ) and move it to /home/workspace/experiments/pretrained_model/. Please follow the steps below:
cd /home/workspace/experiments/pretrained_model/
wget http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz
tar -xvzf ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz
rm -rf ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz
- We need to edit the configuration file to change the location of the training and validation files, the location of the label_map file, and the location of the pretrained weights. We also need to adjust the batch size. To do this, execute the following commands:
cd /home/workspace/
python edit_config.py --train_dir /home/workspace/data/train/ --eval_dir /home/workspace/data/val/ --batch_size 2 --checkpoint /home/workspace/experiments/pretrained_model/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint/ckpt-0 --label_map /home/workspace/experiments/label_map.pbtxt
A new configuration file called pipeline_new.config will be created under the /home/workspace/ directory. Move the file to the /home/workspace/experiments/reference/ directory.
Step 3 - Model training and evaluation
Start the training process:
- Training process:
python experiments/model_main_tf2.py --model_dir=experiments/reference/ --pipeline_config_path=experiments/reference/pipeline_new.config
To monitor training, start a tensorboard instance by running python -m tensorboard.main --logdir experiments/reference/ . Report your findings in a written report. The log is shown in the figure below:
Tensorboard training log
After training, start the evaluation process. Starting the evaluation process in parallel with the training process will cause OOM errors in the workspace.
- Evaluation process:
python experiments/model_main_tf2.py --model_dir=experiments/reference/ --pipeline_config_path=experiments/reference/pipeline_new.config --checkpoint_dir=experiments/reference/
By default, the evaluation script only runs for one epoch. Therefore, the eval log in Tensorboard will look like a blue dot.
Note: Both processes will display some Tensorflow warnings, which can be ignored. You may have to manually terminate the evaluation script with CTRL+C.
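Deciding when training is over usually comes down to judging whether the loss curve has flattened. The sketch below shows one simple heuristic on a plain list of loss values (for example, copied from the TensorBoard scalars). The function names, window size, and thresholds are assumptions for illustration; the smoothing is an exponential moving average in the spirit of TensorBoard's smoothing slider, not its exact implementation.

```python
def smooth(values, weight=0.6):
    """Exponential moving average of a list of numbers."""
    if not values:
        return []
    smoothed, last = [], values[0]
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed


def has_plateaued(losses, window=5, min_rel_improvement=0.01, weight=0.6):
    """True if the smoothed loss improved by less than min_rel_improvement
    (relative) between the previous window and the most recent one."""
    if len(losses) < 2 * window:
        return False  # not enough points to compare two windows
    s = smooth(losses, weight)
    previous = sum(s[-2 * window:-window]) / window
    recent = sum(s[-window:]) / window
    return (previous - recent) < min_rel_improvement * abs(previous)
```

A heuristic like this only supports the visual check in TensorBoard. Comparing the training loss with the evaluation loss is still needed to spot overfitting.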
Step 4 - Improve performance
Most likely, this initial experiment did not yield the best results. However, you can make several changes to the configuration file to improve this model.
- One obvious change is to improve the data augmentation strategy. The preprocessor.proto ( https://github.com/tensorflow/models/blob/master/research/object_detection/protos/preprocessor.proto ) file contains the different data augmentation methods available in the Tf Object Detection API. To help visualize these augmentations, a notebook is provided: Explore augmentations.ipynb. Using the notebook, try different combinations of data augmentations and choose the one that you think works best for your dataset. Justify your choice in the written report.
- Keep in mind that the following are also available:
    - Experiment with the optimizer: type of optimizer, learning rate, scheduler, etc.
    - Experiment with the architecture. The Tf Object Detection API model zoo ( https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md ) provides many architectures. Keep in mind that the pipeline.config file is unique to each architecture, so you will have to edit it.
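When choosing augmentations, it helps to keep in mind that geometric augmentations must transform the bounding boxes together with the image. The sketch below illustrates this for a horizontal flip, one of the options listed in preprocessor.proto. It is a plain-Python illustration, not the API's implementation; it assumes boxes in normalized [ymin, xmin, ymax, xmax] order, and the function names are made up for this example.

```python
def flip_boxes_horizontally(boxes):
    """Mirror normalized [ymin, xmin, ymax, xmax] boxes around the vertical image axis."""
    flipped = []
    for ymin, xmin, ymax, xmax in boxes:
        # The old right edge becomes the new left edge, and vice versa.
        flipped.append([ymin, 1.0 - xmax, ymax, 1.0 - xmin])
    return flipped


def flip_image_horizontally(image):
    """Mirror an image given as a list of rows (each row a list of pixels)."""
    return [row[::-1] for row in image]
```

Photometric augmentations such as brightness or contrast changes leave the boxes untouched, which is one reason they are a low-risk first choice for a dataset that contains night and bad-weather scenes.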
Create animated videos
Export the trained model
Modify the arguments of the following command to suit your model:
python experiments/exporter_main_v2.py --input_type image_tensor --pipeline_config_path experiments/reference/pipeline_new.config --trained_checkpoint_dir experiments/reference/ --output_directory experiments/reference/exported/
This will create a new folder experiments/reference/exported/saved_model. You can read more about the Tensorflow SavedModel format here ( https://www.tensorflow.org/guide/saved_model ) .
Finally, a video of the model's inferences can be created for any tf record file. To do this, run the following command (adjust the paths to match your files):
python inference_video.py --labelmap_path label_map.pbtxt --model_path experiments/reference/exported/saved_model --tf_record_path data/test/segment-12200383401366682847_2552_140_2572_140_with_camera_labels.tfrecord --config_path experiments/reference/pipeline_new.config --output_path animation.gif
Submission template
Project Overview
This section should contain a brief description of the project and what we are trying to achieve. Why is object detection such an important component of self-driving car systems?
Setup
This section should contain a brief description of the steps you need to follow to run the code from this repository.
Dataset
Dataset analysis
This section should contain a quantitative and qualitative description of the dataset. It should include images, charts and other visualizations.
Cross-validation
This section should detail the cross-validation strategy and justify your approach.
Training
Reference experiment
This section should detail the results of the reference experiment. It should include training metrics and a detailed explanation of the algorithm's performance.
Improve on the reference
This section should highlight the different strategies you adopted to improve your model. It should contain relevant data and details about your findings.
Reference Code