0. Environment

ubuntu16.04
python3.6
cuda9.0

# pip install
torch==1.1.0
joblib
pandas
h5py
torchvision


# apt install
apt-get -y install ffmpeg
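
Before moving on, it can be worth confirming that the ffmpeg tools are actually visible to Python, since the frame-extraction step calls them through subprocess. The sketch below is only a convenience check and not part of the repository; the helper name missing_tools is made up for illustration.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names in `tools` that cannot be found on PATH."""
    # shutil.which returns None when an executable is not found.
    return [name for name in tools if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("ffmpeg and ffprobe are available")
```

If anything is reported as missing, re-run the apt-get step above before generating frames.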

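1. Data preparation

1.1 Download the data

Official site: https://www.crcv.ucf.edu/data/UCF101.php. Two downloads on that page are needed: the first is the AVI video data, the second is the train/test split lists.

UCF101 video classification dataset: http://www.crcv.ucf.edu/datasets/human-actions/ucf101/UCF101.rar

Create a data directory and extract UCF101.rar into it: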

apt-get install unrar
unrar x UCF101.rar

Download the second item, https://www.crcv.ucf.edu/data/UCF101/UCF101TrainTestSplits-DetectionTask.zip, and put it wherever you like; I put it in the data directory.

1.2 Convert the videos to images

Modify the code (see https://github.com/kenshohara/3D-ResNets-PyTorch/issues/202):

# line 18
p = subprocess.run(ffprobe_cmd, capture_output=True)
# change to (capture_output requires Python 3.7+, this setup uses Python 3.6)
p = subprocess.run(ffprobe_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
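
To see why the change is needed: subprocess.run only accepts capture_output from Python 3.7 onward, and passing both pipes explicitly is the equivalent spelling that also works on 3.6. A minimal, self-contained sketch follows; the wrapper name run_and_capture is hypothetical, and it runs the Python interpreter instead of ffprobe so that it works on any machine.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run `cmd` and capture stdout/stderr in a way that also works on Python 3.6."""
    # Equivalent to capture_output=True on Python 3.7+.
    return subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)


# Stand-in command; in the repository this would be the ffprobe command list.
result = run_and_capture([sys.executable, "-c", "print('hello')"])
print(result.returncode, result.stdout.decode().strip())
```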

Run the command to generate JPG files from the videos:

python -m util_scripts.generate_video_jpgs avi_video_dir_path jpg_video_dir_path ucf101

e.g.:
python -m util_scripts.generate_video_jpgs "/root/avi_video_dir_path" "/root/jpg_video_dir_path" ucf101

This step takes roughly an hour or less.
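
For orientation, the conversion boils down to one ffmpeg call per video, writing numbered JPG frames into a directory named after that video. The sketch below only builds such a command and does not run it. The function name build_ffmpeg_cmd, the class/video directory layout, and the image_%05d.jpg naming pattern are assumptions for illustration; the repository script may use different options.

```python
from pathlib import Path


def build_ffmpeg_cmd(video_path, dst_root, pattern="image_%05d.jpg"):
    """Return (output_dir, command) for dumping the frames of one video as JPGs."""
    video_path = Path(video_path)
    # Assumed layout: <dst_root>/<class name>/<video name without extension>/
    dst_dir = Path(dst_root) / video_path.parent.name / video_path.stem
    # "ffmpeg -i <input> <numbered output pattern>" writes one image per frame.
    cmd = ["ffmpeg", "-i", str(video_path), str(dst_dir / pattern)]
    return dst_dir, cmd


dst_dir, cmd = build_ffmpeg_cmd(
    "/root/avi_video_dir_path/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi",
    "/root/jpg_video_dir_path",
)
print(dst_dir)
print(cmd)
```

The output directory would have to exist before such a command is executed, for example via dst_dir.mkdir(parents=True, exist_ok=True).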

1.3 Generate the JSON file

python -m util_scripts.ucf101_json annotation_dir_path jpg_video_dir_path dst_json_path

# eg:

python -m util_scripts.ucf101_json "/root/data/ucfTrainTestlist" "/root/data/UCF101_jpg/" "/root/data/UCF101_json"
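
To make the inputs of this step less opaque: classInd.txt holds lines such as "1 ApplyEyeMakeup", and the train/test lists hold lines such as "ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1" (test lists omit the trailing index). The sketch below parses both into dictionaries. The function names and the output keys (subset, annotations, label) are assumptions modelled loosely on what the repository's JSON looks like, not a copy of its script.

```python
def parse_class_index(lines):
    """Parse classInd.txt lines like '1 ApplyEyeMakeup' into {1: 'ApplyEyeMakeup'}."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        index, name = line.split()
        mapping[int(index)] = name
    return mapping


def parse_split(lines, subset):
    """Map each video id to its subset and class label.

    Only the leading 'ClassName/file.avi' part of each line is used; the class
    index that train lists append is ignored because the directory name already
    gives the label.
    """
    database = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rel_path = line.split()[0]
        label, file_name = rel_path.split("/")
        video_id = file_name.rsplit(".", 1)[0]
        database[video_id] = {"subset": subset, "annotations": {"label": label}}
    return database


classes = parse_class_index(["1 ApplyEyeMakeup", "2 ApplyLipstick"])
train = parse_split(["ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1"], "training")
test = parse_split(["ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi"], "validation")
print(classes, train, test)
```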

2. Prepare the pretrained model

https://drive.google.com/drive/folders/1zvl89AgFAApbH0At-gMuZSeQB_LpNP-M

3. File structure

data
    UCF101_videos
        ApplyEyeMakeup
        ...
        YoYo

    UCF101_jpg
        ApplyEyeMakeup
        ...
        YoYo

    UCF101_json
        ucf101_01.json

    UCF101_txt
        classInd.txt
        testlist01.txt
        testlist02.txt
        testlist03.txt
        trainlist01.txt
        trainlist02.txt
        trainlist03.txt

    models
        resnet-50-kinetics.pth

    results
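
A quick way to confirm the layout before training is to check that the key entries exist under the data directory. This is a minimal sketch: the helper name missing_entries and the EXPECTED list are illustrative, taken from the tree above, and should be adjusted to your own paths.

```python
from pathlib import Path

# Entries taken from the tree above, relative to the data directory.
EXPECTED = [
    "UCF101_videos",
    "UCF101_jpg",
    "UCF101_json/ucf101_01.json",
    "UCF101_txt/classInd.txt",
    "models/resnet-50-kinetics.pth",
    "results",
]


def missing_entries(root, expected=EXPECTED):
    """Return the relative paths from `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


print(missing_entries("./data"))
```

An empty list means every listed entry was found.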

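4. Fine-tuning on multiple GPUs

Here I ran into the error "RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cuda:1". I have been trying to fix it for two days without success, so I am setting it aside for now.

The downloaded pretrained model was trained on multiple GPUs, so it has to be loaded into a multi-GPU (DataParallel) model; otherwise loading fails because the parameter names do not match.

Modify the code: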

# model.py
# line 100

        pretrain = torch.load(pretrain_path, map_location='cpu')

        if torch.cuda.device_count() > 1:
            # With more than one GPU, wrap the model in DataParallel.
            # This prefixes every state_dict key with "module.".
            model = nn.DataParallel(model)
        model.load_state_dict(pretrain['state_dict'])
        # Replace the head on the underlying network, not on the DataParallel wrapper.
        tmp_model = model.module if isinstance(model, nn.DataParallel) else model
        if model_name == 'densenet':
            tmp_model.classifier = nn.Linear(tmp_model.classifier.in_features,
                                             n_finetune_classes)
        else:
            tmp_model.fc = nn.Linear(tmp_model.fc.in_features,
                                     n_finetune_classes)
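
An alternative that avoids wrapping the model just to make the key names match is to rename the checkpoint keys instead. This is not what the code above does; it is a common workaround, sketched here under the assumption that the checkpoint keys differ from the model's only by the leading "module." prefix. The helper name strip_module_prefix is hypothetical.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of `state_dict` with a leading `prefix` removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only strip the prefix at the start of the key; other keys are kept as-is.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Plain values stand in for tensors so the example runs without PyTorch.
example = OrderedDict([("module.conv1.weight", 1), ("module.fc.bias", 2), ("epoch", 3)])
print(list(strip_module_prefix(example).keys()))
```

With a real checkpoint, the result would be passed to load_state_dict on the unwrapped model, i.e. model.load_state_dict(strip_module_prefix(pretrain['state_dict'])).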

If the GPUs you specify start from 0, use "cuda:0" below; if they start from 1, use "cuda:1".

# main.py
opt.device = torch.device('cpu' if opt.no_cuda else 'cuda')

# change to
opt.device = torch.device('cpu' if opt.no_cuda else 'cuda:0')
# or
opt.device = torch.device('cpu' if opt.no_cuda else 'cuda:1')
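
One detail worth keeping in mind when choosing the index: if CUDA_VISIBLE_DEVICES is set, the process renumbers the visible GPUs from 0, so "cuda:0" means the first GPU in that list rather than physical GPU 0. The sketch below illustrates the mapping. The function name physical_gpu is made up, and it assumes the variable contains plain integer ids (it can also hold GPU UUIDs, which this does not handle).

```python
def physical_gpu(logical_index, visible_devices):
    """Translate the N in 'cuda:N' to a physical GPU id for a CUDA_VISIBLE_DEVICES string."""
    ids = [part.strip() for part in visible_devices.split(",") if part.strip()]
    return int(ids[logical_index])


# With CUDA_VISIBLE_DEVICES='1,2,3', cuda:0 is physical GPU 1.
print(physical_gpu(0, "1,2,3"))
# With CUDA_VISIBLE_DEVICES='0,1,2,3', logical and physical ids coincide.
print(physical_gpu(1, "0,1,2,3"))
```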

CUDA_VISIBLE_DEVICES='0,1,2,3' python main.py --root_path ./data --video_path UCF101_jpg/ --annotation_path UCF101_json/ucf101_01.json \
--result_path results --dataset ucf101 --n_classes 101 --n_pretrain_classes 400 \
--pretrain_path models/resnet-50-kinetics.pth --ft_begin_module fc \
--model resnet --model_depth 50 --batch_size 128 --n_threads 4 --checkpoint 5

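If you run out of shared memory, set --n_threads to 1 or lower. If the problem persists, see my other blog post and add the corresponding code at the top of the script you run: https://blog.csdn.net/qq_35975447/article/details/107287614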

5. Training

python main.py --root_path ./data --video_path UCF101_jpg/ --annotation_path UCF101_json/ucf101_01.json \
--result_path results --dataset ucf101 --n_classes 101 --model resnet --model_depth 50 --batch_size 128 \
--n_threads 4 --checkpoint 5

Terminal output:

With batch_size=32, training needs roughly 7 GB of GPU memory.

[Video Classification] Reproducing 3D-ResNets-PyTorch