Reference Code:
https://github.com/JDAI-CV/fast-reid/tree/v1.0.0
0. Environment
ubuntu16.04
cuda9.0
python3.6
torch==1.1.0
torchvision==0.3.0
Cython
yacs
tensorboard
future
termcolor
sklearn
tqdm
opencv-python==4.1.0.25
matplotlib
scikit-image
numpy==1.16.4
faiss-gpu==1.6.3
Install apex (do not install the apex package from PyPI directly; build it from source):
git clone https://www.github.com/nvidia/apex
cd apex
# python setup.py install --cuda_ext --cpp_ext
pip install -v --no-cache-dir ./
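To confirm the build worked, a quick import check can be run (an illustrative sketch only, not part of the fast-reid setup):

# Sanity check that apex is importable after building from source.
from apex import amp  # noqa: F401
print("apex imported successfully")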
1. Prepare the data
Refer to https://blog.csdn.net/qq_35975447/article/details/106664593
The data directory structure is as follows:
fast-reid
└── datasets
    └── Market-1501-v15.09.15
        ├── bounding_box_train
        ├── bounding_box_test
        └── query
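To sanity-check this layout before training, a small script like the following can be used (a minimal sketch; the ./datasets root and the .jpg check are illustrative, not part of fast-reid itself):

import os

# Expected Market-1501 layout under the fast-reid root.
root = os.path.join("datasets", "Market-1501-v15.09.15")
for sub in ("bounding_box_train", "bounding_box_test", "query"):
    path = os.path.join(root, sub)
    n = len([f for f in os.listdir(path) if f.endswith(".jpg")]) if os.path.isdir(path) else 0
    print(path, "OK" if n else "MISSING", n, "images")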
2. Modification and training
The code modifications are the same as in https://blog.csdn.net/qq_35975447/article/details/112482765 ; after applying them you can train directly:
CUDA_VISIBLE_DEVICES="0,1" python ./tools/train_net.py --config-file='./configs/Market1501/sbs_R101-ibn.yml'
3. Knowledge distillation training
The Market1501 dataset is used here, so point the dataset entries in the FastDistill yaml configs to Market1501 (e.g. DATASETS.NAMES and DATASETS.TESTS).
1) Train the teacher model:
CUDA_VISIBLE_DEVICES='0,1' python ./projects/FastDistill/train_net.py --config-file ./projects/FastDistill/configs/sbs_r101ibn.yml --num-gpus 2
Non-local blocks are not used here, and the input image size is only 128x256 (results are in the table below).
2) Train the R34 student model alone (no distillation):
CUDA_VISIBLE_DEVICES='0' python ./projects/FastDistill/train_net.py --config-file ./projects/FastDistill/configs/sbs_r34.yml --num-gpus 1
3) JS-divergence loss, with R101 as the teacher model training the R34 student (a sketch of this loss is given after this list):
CUDA_VISIBLE_DEVICES='0,1' python ./projects/FastDistill/train_net.py --config-file ./projects/FastDistill/configs/kd-sbs_r101ibn-sbs_r34.yml --num-gpus 2
4) JS-divergence loss, initializing the student from the pre-trained R34 weights, with R101 as the teacher model:
CUDA_VISIBLE_DEVICES='0,1' python ./projects/FastDistill/train_net.py --config-file ./projects/FastDistill/configs/kd-sbs_r101ibn-sbs_r34.yml --num-gpus 2 MODEL.WEIGHTS projects/FastDistill/logs/market1501/r34/model_best.pth
5) JS-divergence loss + overhaul distillation, with R101 as the teacher model training the student:
CUDA_VISIBLE_DEVICES='0,1' python ./projects/FastDistill/train_net.py --config-file ./projects/FastDistill/configs/kd-sbs_r101ibn-sbs_r34.yml --num-gpus 2 --dist-url 'tcp://127.0.0.1:49153' MODEL.META_ARCHITECTURE DistillerOverhaul
6) JS-divergence loss + overhaul distillation, initializing the student from the pre-trained R34 weights, with R101 as the teacher model:
CUDA_VISIBLE_DEVICES='0,1' python ./projects/FastDistill/train_net.py --config-file ./projects/FastDistill/configs/kd-sbs_r101ibn-sbs_r34.yml --num-gpus 2 --dist-url 'tcp://127.0.0.1:49153' MODEL.WEIGHTS projects/FastDistill/logs/market1501/r34/model_best.pth MODEL.META_ARCHITECTURE DistillerOverhaul
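For reference, the loss in steps 3)-6) is a Jensen-Shannon-style divergence between the teacher's and student's softened logits (labelled JS Div in the table below). A minimal PyTorch sketch follows; it is not FastDistill's exact implementation, and the temperature, weighting, and detachment details are assumptions:

import torch
import torch.nn.functional as F

def js_div_loss(student_logits, teacher_logits, T=1.0):
    # Symmetric JS divergence between softened class distributions.
    p_s = F.softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits.detach() / T, dim=1)
    m = 0.5 * (p_s + p_t)
    # F.kl_div(log_q, p) computes KL(p || q).
    return 0.5 * (F.kl_div(m.log(), p_s, reduction="batchmean")
                  + F.kl_div(m.log(), p_t, reduction="batchmean")) * T * T

# Usage on dummy logits (751 Market-1501 training identities):
s, t = torch.randn(8, 751), torch.randn(8, 751)
print(js_div_loss(s, t).item())

Overhaul distillation (enabled via MODEL.META_ARCHITECTURE DistillerOverhaul) additionally distills intermediate backbone features rather than only logits.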
7) Comparison of results on the Market1501 dataset:
| Model | Rank@1 | mAP |
|---|---|---|
| R101_ibn (teacher) | 95.52% | 88.75% |
| R34 (student) | 91.95% | 79.60% |
| JS Div | 94.74% | 86.85% |
| JS Div + R34 | 94.71% | 86.60% |
| JS Div + Overhaul | 94.80% | 87.39% |
| JS Div + Overhaul + R34 | 95.19% | 87.33% |
Whether or not the student is initialized from pre-trained R34 weights makes little difference to the results; in practice, knowledge distillation does not require pre-training the student model in advance. Judging from these results, knowledge distillation is clearly effective, giving a large improvement over the plain R34 student.