The source code is here | NVIDIA open-sources its pedestrian generation and re-identification code

A few days ago, NVIDIA open-sourced the code of DG-Net. Let's review this CVPR 2019 oral paper.

The paper, "Joint Discriminative and Generative Learning for Person Re-identification", was presented orally at CVPR 2019 by researchers from NVIDIA, the University of Technology Sydney (UTS), and the Australian National University (ANU). Training deep learning models usually requires a large amount of labeled data, which is often difficult and expensive to collect. The authors explore using generated data to assist training on the person re-identification (re-ID) task: by generating high-quality pedestrian images and coupling the generator with the re-identification model, both the quality of the generated images and the re-ID accuracy are improved at the same time.
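To make the idea concrete, here is a deliberately simplified, hypothetical sketch (not the official DG-Net architecture) of the core mechanism: one encoder extracts an appearance code (clothing, colors), another extracts a structure code (pose, layout), and a decoder recombines codes from two different images to synthesize a new pedestrian image.

```python
import torch
import torch.nn as nn

# Hypothetical, simplified sketch of the DG-Net idea: factor an image into
# an appearance code and a structure code, then recombine codes across images.

class AppearanceEncoder(nn.Module):
    def __init__(self, code_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, code_dim)

    def forward(self, x):
        return self.fc(self.net(x).flatten(1))   # global appearance code

class StructureEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)                       # spatial structure code

class Decoder(nn.Module):
    def __init__(self, code_dim=128):
        super().__init__()
        self.fc = nn.Linear(code_dim, 64)        # inject appearance globally
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, app_code, struct_code):
        h = struct_code + self.fc(app_code)[:, :, None, None]
        return self.net(h)

# Swap codes between two images: keep person A's clothing (appearance)
# rendered in person B's pose (structure).
Ea, Es, G = AppearanceEncoder(), StructureEncoder(), Decoder()
a = torch.randn(1, 3, 128, 64)   # Market-1501 images are 128x64
b = torch.randn(1, 3, 128, 64)
fake = G(Ea(a), Es(b))
print(fake.shape)                # torch.Size([1, 3, 128, 64])
```

In the actual paper the same appearance code also feeds the discriminative re-ID branch, which is what couples generation quality to re-ID accuracy.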


Paper link: https://arxiv.org/abs/1904.07223
Bilibili video: https://www.bilibili.com/video/av51439240/
Tencent video: https://v.qq.com/x/page/t0867x53ady.html

Code address: https://github.com/NVlabs/DG-Net

 

Results of running the code (after 100,000 training iterations):

 

Development environment:

  • Python 3.6
  • NumPy
  • PyTorch 1.0+
  • GPU memory >= 15 GB when using fp32 precision
  • GPU memory >= 10 GB when using fp16 precision (saves some GPU memory)
  • [Optional] APEX (required for fp16 training)
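The repo uses NVIDIA APEX for fp16 training. As a minimal illustration of why half precision roughly halves weight memory (independent of APEX, using only plain PyTorch), compare the storage of the same layer in fp32 and fp16; the layer size here is arbitrary:

```python
import torch.nn as nn

# Count parameter bytes of one layer in fp32, then cast to fp16 and recount.
layer = nn.Linear(1024, 1024)

fp32_bytes = sum(p.numel() * p.element_size() for p in layer.parameters())
layer = layer.half()            # cast weights to float16 in place
fp16_bytes = sum(p.numel() * p.element_size() for p in layer.parameters())

print(fp32_bytes, fp16_bytes)   # fp16 uses half the bytes of fp32
```

Activations and gradients shrink similarly, which is where the 15 GB vs. 10 GB difference above comes from.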

 

Dataset download address:

The Market-1501 dataset is used: http://www.liangzheng.com.cn/Project/project_reid.html

 

Download the trained model:

 

The test results are as follows:

  • Person re-identification accuracy:

  • Generated pedestrian images:

 

Training is launched with a single command:

All options are defined in the yaml config file. Full-precision (fp32) training takes about 15 GB of GPU memory.

python train.py --config configs/latest.yaml

With half-precision (fp16) training, only about 10 GB of GPU memory is used:

python train.py --config configs/latest-fp16.yaml
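Since all hyper-parameters live in the yaml file passed via `--config`, a minimal sketch of that pattern looks like the following (the keys below are illustrative, not the exact names used in `configs/latest.yaml`):

```python
import yaml  # PyYAML

# train.py-style pattern: read hyper-parameters from a yaml config.
# The keys here are hypothetical examples, not the repo's actual keys.
config_text = """
max_iter: 100000
batch_size: 8
lr: 0.0001
fp16: false
"""
config = yaml.safe_load(config_text)
print(config["max_iter"], config["fp16"])
```

This is why switching between fp32 and fp16 only requires pointing `--config` at a different yaml file.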

Training logs can be viewed with TensorBoard:

 tensorboard --logdir logs/latest

 

About the author
Zheng Zhedong, the first author of this paper, is a PhD student at the School of Computer Science at UTS, expected to graduate in June 2021. The paper is the result of his internship at NVIDIA.

Zheng Zhedong has published 8 papers so far. One of them, an ICCV 2017 spotlight cited more than 300 times, was the first to propose using GAN-generated images to assist feature learning for person re-identification. A TOMM journal paper of his was selected as a 2018 Highly Cited Paper by Web of Science, with more than 200 citations. He has also contributed a person re-identification baseline code to the community, which has more than 1,000 stars on GitHub and has been widely adopted.

Other authors of the paper include Yang Xiaodong, a video expert at NVIDIA Research; Yu Zhiding, a face-recognition expert (co-author of SphereFace and the large-margin softmax loss); and Jan Kautz, vice president of NVIDIA Research.

 


Original post: blog.csdn.net/Layumi1993/article/details/94647875