Smoking detection and recognition 2: PyTorch implementation of smoking detection and recognition (including the smoking dataset and training code)

Table of contents

1. Smoking detection and recognition

2. Smoking dataset

(1) Smoking dataset description

(2) Custom dataset

3. Human detection model

4. Smoking classification model training

(1) Project installation

(2) Prepare data

(3) Smoking recognition classification model training (PyTorch)

(4) Visualize the training process

(5) Smoking recognition results

(6) Some optimization suggestions

(7) Handling some runtime errors

5. Project source code download (Python version)


This is the second article in the "Smoking detection and recognition" series: "PyTorch implementation of smoking detection and recognition (including the smoking dataset and training code)". The project uses the deep learning framework PyTorch to build an accurate, real-time smoking detection and recognition algorithm. The source code supports common classification backbones such as resnet18, resnet34, resnet50, mobilenet_v2 and googlenet, and users can train them on their own data. Even the lightweight mobilenet_v2 model reaches a smoking recognition test accuracy of 95.5607%, which meets the business performance requirements.

Model          Input size   Test accuracy
mobilenet_v2   224×224      95.5607%
googlenet      224×224      96.7290%
resnet18       224×224      95.7944%

Demo of the Python version of smoking detection and recognition:

[ Respect originality, please indicate the source for reprinting ] https://blog.csdn.net/guyuealian/article/details/131521338 


For more articles in the "Smoking detection and recognition" series, see:

  1. Smoking detection and recognition 1: smoking dataset description (including download link): https://blog.csdn.net/guyuealian/article/details/130337263
  2. Smoking detection and recognition 2: PyTorch implementation of smoking detection and recognition (including the smoking dataset and training code): https://blog.csdn.net/guyuealian/article/details/131521338
  3. Smoking detection and recognition 3: Android implementation of smoking detection and recognition (including source code, real-time detection): https://blog.csdn.net/guyuealian/article/details/131521347
  4. Smoking detection and recognition 4: C++ implementation of smoking detection and recognition (including source code, real-time detection): https://blog.csdn.net/guyuealian/article/details/131521352


1. Smoking detection and recognition

There are many ways to implement smoking detection and recognition. This project uses the most conventional one, human detection followed by smoking classification: first, a general human detection model locates each person; next, the smoking detection region is cropped from each detection according to certain rules; finally, a trained smoking behavior classifier labels each region, which completes the smoking detection and recognition task.

The advantage of this approach is that an existing human detection model can be reused, so there is no need to annotate smoking and non-smoking bounding boxes, which reduces the manual labeling cost. Smoking classification data is also relatively easy to collect, and the classification model can be optimized on its own.
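The three stages described above can be sketched as follows. This is only an illustration under stated assumptions: `detect_persons` and `classify` are hypothetical callables standing in for the human detector and the smoking classifier, the image is a toy nested list rather than a real frame, and the crop simply uses the whole person box because the project's actual cropping rules are not described here.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels
Image = List[List[int]]           # toy grayscale image: rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Crop `box` from `image`, clipping the box to the image bounds."""
    height, width = len(image), len(image[0])
    x1, y1, x2, y2 = box
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    return [row[x1:x2] for row in image[y1:y2]]


def detect_smoking(
    image: Image,
    detect_persons: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Run detection, cropping and classification; return (box, label) pairs."""
    results = []
    for box in detect_persons(image):        # stage 1: locate each person
        region = crop(image, box)            # stage 2: cut out the region
        if not region or not region[0]:      # box lies outside the image
            continue
        results.append((box, classify(region)))  # stage 3: classify the crop
    return results


# Stand-ins for the real models (hypothetical, for illustration only).
def fake_detector(image: Image) -> List[Box]:
    return [(1, 0, 3, 2), (10, 10, 12, 12)]


def fake_classifier(region: Image) -> str:
    return "smoking" if sum(map(sum, region)) > 0 else "notsmoking"


toy_image = [[0, 0, 5, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(detect_smoking(toy_image, fake_detector, fake_classifier))
```

Keeping the detector and the classifier as separate callables mirrors the point made above: either model can be swapped or retrained without touching the other.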


2. Smoking dataset

(1) Smoking dataset description

This project mainly uses two smoking datasets, smoking-dataset and smoking-video, with 15000+ images in total. The data is of high quality and can be used to develop deep learning classification models for smoking recognition. The project divides smoking behavior into two classes: smoking and notsmoking (not smoking). For clarity, the behavior categories are defined as follows:

notsmoking: there is no smoke in the smoking detection area, so the behavior is defined as non-smoking (notsmoking). If the subject is smoking but the smoke is not inside the smoking detection area, the behavior is still defined as non-smoking (notsmoking) because of the limitations of the algorithm.

For instructions on using the smoking dataset, see my blog post: https://blog.csdn.net/guyuealian/article/details/130337263

(2) Custom dataset

To add new categories or to train on a custom dataset, proceed as follows:

  • Create the Train and Test datasets: images of the same category must be placed in the same folder, and each sub-folder must be named after its category.

  • Class file: one class name per line: class_name.txt
    (end the last line with a newline)
A
B
C
D

  • Modify the data paths in the configuration file: configs/config.yaml
train_data: # multiple datasets can be added
  - 'data/dataset/train1' 
  - 'data/dataset/train2'
test_data: 'data/dataset/test'
class_name: 'data/dataset/class_name.txt'
...
...
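Because each sub-folder is named after its category, the class file can be derived from the folder layout. The sketch below is a minimal illustration and not part of the project: `list_classes` and `write_class_file` are hypothetical helpers, and the alphabetical ordering is an assumption, so check that the resulting line order matches the class order you expect before training.

```python
import os
import tempfile
from typing import Iterable, List


def list_classes(data_root: str) -> List[str]:
    """Return the sorted names of the sub-folders of data_root (one per class)."""
    return sorted(
        name for name in os.listdir(data_root)
        if os.path.isdir(os.path.join(data_root, name))
    )


def write_class_file(class_names: Iterable[str], out_file: str) -> None:
    """Write one class name per line, ending the file with a newline."""
    with open(out_file, "w", encoding="utf-8") as f:
        for name in class_names:
            f.write(name + "\n")


# Demo on a throwaway directory that mimics the layout described above.
with tempfile.TemporaryDirectory() as root:
    for name in ("smoking", "notsmoking"):
        os.makedirs(os.path.join(root, "train", name))
    out_file = os.path.join(root, "class_name.txt")
    write_class_file(list_classes(os.path.join(root, "train")), out_file)
    with open(out_file, encoding="utf-8") as f:
        print(f.read(), end="")
```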

3. Human detection model

For the human detection training code used in this project, see: Pedestrian detection (human detection) 2: YOLOv5 implementation of human detection (including the human detection dataset and training code)


4. Smoking classification model training

Once the smoking recognition data is ready, you can start training the smoking recognition classification model. The project supports common deep learning models such as resnet18, resnet34, resnet50, mobilenet_v2 and googlenet. Because the smoking recognition model also has to be deployed on the Android platform, the project chooses mobilenet_v2, a lightweight model with a relatively small amount of computation.

The basic structure of the whole project is as follows:

.
├── classifier                 # tools for model training
├── configs                    # training configuration files
├── data                       # training data
├── libs           
│   ├── convert                # tool for converting models to ONNX
│   ├── yolov5                 # human detection
│   ├── detector.py            # human detection demo
│   └── README.md               
├── demo.py              # demo
├── README.md            # project documentation
├── requirements.txt     # project dependencies
└── train.py             # training script

(1) Project installation

The project's Python dependencies are listed in requirements.txt; install them with pip:

numpy==1.16.3
matplotlib==3.1.0
Pillow==6.0.0
easydict==1.9
opencv-contrib-python==4.5.2.52
opencv-python==4.5.1.48
pandas==1.1.5
PyYAML==5.3.1
scikit-image==0.17.2
scikit-learn==0.24.0
scipy==1.5.4
seaborn==0.11.2
tensorboard==2.5.0
tensorboardX==2.1
torch==1.7.1+cu110
torchvision==0.8.2+cu110
tqdm==4.55.1
xmltodict==0.12.0
basetrainer
pybaseutils==0.6.5

For installation details, see the project installation tutorial (beginners should read it first and set up the development environment).

(2) Prepare data

Download the smoking recognition datasets smoking-dataset and smoking-video: https://blog.csdn.net/guyuealian/article/details/130337263

(3) Smoking recognition classification model training (PyTorch)

The project trains and tests the smoking recognition classification model on top of the "Pytorch Basic Training Library Pytorch-Base-Trainer (supporting model pruning distributed training)". The training code is simple to use: put the images of each category in their own directory, fill in the corresponding data paths, and start training.

The training framework is PyTorch. The training code mainly supports the following:

  • Supported backbones: googlenet, resnet[18,34,50], mobilenet_v2, etc. Other backbones can be added.
  • Training parameters are set in the configuration file configs/config.yaml.

Modify the data paths in the configuration file configs/config.yaml:

  • Change train_data and test_data to your own data paths.
  • Use [/] as the path separator, not [\].
  • Project directories, files and paths must not contain Chinese characters; otherwise many exceptions will occur!
# Training datasets; multiple datasets are supported (no Chinese characters in paths)
train_data:
  - 'path/to/smoking/smoking-person/smoking-dataset/trainval'
  - 'path/to/smoking/smoking-person/smoking-video'
# Test dataset (no Chinese characters in paths)
test_data:
  - 'path/to/smoking/smoking-person/smoking-dataset/test'

# Class file
class_name: 'data/class_name.txt'
train_transform: "train"       # data augmentation used for training
test_transform: "val"          # data augmentation used for testing
work_dir: "work_space/"        # directory where output models are saved
net_type: "mobilenet_v2"       # backbone; supported: resnet18/50,mobilenet_v2,googlenet,inception_v3
width_mult: 1.0                # model width multiplier
input_size: [ 224,224 ]        # model input size
rgb_mean: [ 0.5, 0.5, 0.5 ]    # for normalizing inputs to [-1, 1]; sequence of means for each channel
rgb_std: [ 0.5, 0.5, 0.5 ]     # for normalizing; sequence of standard deviations for each channel
batch_size: 128                # batch size
lr: 0.01                       # initial learning rate
optim_type: "SGD"              # optimizer: SGD, Adam
loss_type: "CrossEntropyLoss"  # loss function; supported: CrossEntropyLoss, LabelSmooth
momentum: 0.9                  # SGD momentum
num_epochs: 120                # number of training epochs
num_warn_up: 0                 # warm-up count
num_workers: 8                 # number of data-loading worker processes
weight_decay: 0.0005           # weight decay, default 5e-4
scheduler: "multi-step"        # learning-rate schedule
milestones: [ 30,60,100 ]       # epochs at which the learning rate is lowered
gpu_id: [ 2 ]                  # GPU ID
log_freq: 50                   # logging frequency
progress: True                 # whether to show a progress bar
pretrained: True               # whether to use a pretrained model
finetune: False                # whether to fine-tune
To start training, enter in the terminal: 

python train.py -c configs/config.yaml 
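Before launching the command above, it can help to check the configured paths against the two path rules of this section (use `/` as the separator, no Chinese characters). The following sketch is not part of the project: `path_problems` and `check_config_paths` are hypothetical helpers, and the dictionary stands in for the parsed contents of configs/config.yaml.

```python
from typing import Dict, List


def path_problems(path: str) -> List[str]:
    """Return a list of rule violations for one path (empty if it is fine)."""
    problems = []
    if "\\" in path:
        problems.append("uses '\\' as a separator; use '/' instead")
    if any(ord(ch) > 127 for ch in path):
        problems.append("contains non-ASCII characters (for example Chinese)")
    return problems


def check_config_paths(config: dict) -> Dict[str, List[str]]:
    """Check the path entries of a config dict; return {path: problems}."""
    report = {}
    for key in ("train_data", "test_data", "class_name"):
        value = config.get(key, [])
        paths = [value] if isinstance(value, str) else list(value)
        for path in paths:
            problems = path_problems(path)
            if problems:
                report[path] = problems
    return report


# Example config with one deliberately bad path (made up for illustration).
example_config = {
    "train_data": ["data/dataset/train1", "D:\\数据\\train2"],
    "test_data": "data/dataset/test",
    "class_name": "data/dataset/class_name.txt",
}
for bad_path, problems in check_config_paths(example_config).items():
    print(bad_path, "->", "; ".join(problems))
```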

After training is complete, the accuracy on the training set is above 98.0% and the accuracy on the test set is around 95.0%.

(4) Visualize the training process

The training process is visualized with TensorBoard. For tutorials, see: Project development tutorials and common problems and solutions

Enter the following command in the terminal:

# Requires tensorboard==2.5.0 and tensorboardX==2.1
# Basic usage
tensorboard --logdir=path/to/log/
# For example
tensorboard --logdir=work_space/mobilenet_v2_1.0_CrossEntropyLoss_20230313090258/log

Visualization

(5) Smoking recognition results

After training is complete, the accuracy on the training set is above 95.5% and the accuracy on the test set is about 94.5%. The following table lists the three models that have been trained: mobilenet_v2 reaches a test accuracy of 95.5607%, googlenet 96.7290%, and resnet18 95.7944%.

Model          Input size   Test accuracy
mobilenet_v2   224×224      95.5607%
googlenet      224×224      96.7290%
resnet18       224×224      95.7944%
  • Test image files
# Test images (Linux)
image_dir='data/test_image' # directory of test images
model_file="data/pretrained/mobilenet_v2_1.0_224_224_CrossEntropyLoss_20230629161618/model/best_model_045_95.5607.pth" # model file
out_dir="output/" # directory where detection results are saved
python demo.py --image_dir $image_dir --model_file $model_file --out_dir $out_dir

On Windows, replace the variables $image_dir, $model_file, $out_dir, etc. with their values, for example:

# Test images (Windows)
python demo.py --image_dir data/test_image --model_file data/pretrained/mobilenet_v2_1.0_224_224_CrossEntropyLoss_20230629161618/model/best_model_045_95.5607.pth --out_dir output/

  • Test a video file
# Test a video file (Linux)
video_file="data/video-test.mp4" # test video file, such as *.mp4 or *.avi
model_file="data/pretrained/mobilenet_v2_1.0_224_224_CrossEntropyLoss_20230629161618/model/best_model_045_95.5607.pth" # model file
out_dir="output/" # directory where detection results are saved
python demo.py --video_file $video_file --model_file $model_file --out_dir $out_dir
# Test a video file (Windows)
python demo.py --video_file data/video-test.mp4 --model_file data/pretrained/mobilenet_v2_1.0_224_224_CrossEntropyLoss_20230629161618/model/best_model_045_95.5607.pth --out_dir output/

  • Test a camera
# Test a camera (Linux)
video_file=0 # ID of the test camera
model_file="data/pretrained/mobilenet_v2_1.0_224_224_CrossEntropyLoss_20230629161618/model/best_model_045_95.5607.pth" # model file
out_dir="output/" # directory where detection results are saved
python demo.py --video_file $video_file --model_file $model_file --out_dir $out_dir
# Test a camera (Windows)
python demo.py --video_file 0 --model_file data/pretrained/mobilenet_v2_1.0_224_224_CrossEntropyLoss_20230629161618/model/best_model_045_95.5607.pth --out_dir output/
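The Linux and Windows commands above differ only in how shell variables are expanded. One way to avoid maintaining two variants is to build the argument list in Python, as in the sketch below. `build_demo_command` is a hypothetical helper, not part of the project; it relies only on the `--image_dir`, `--video_file`, `--model_file` and `--out_dir` flags used in the commands above.

```python
import sys
from typing import List, Optional, Union


def build_demo_command(
    model_file: str,
    out_dir: str = "output/",
    image_dir: Optional[str] = None,
    video_file: Optional[Union[str, int]] = None,
) -> List[str]:
    """Build the demo.py argument list for either images or a video/camera."""
    if (image_dir is None) == (video_file is None):
        raise ValueError("pass exactly one of image_dir or video_file")
    cmd = [sys.executable, "demo.py", "--model_file", model_file, "--out_dir", out_dir]
    if image_dir is not None:
        cmd += ["--image_dir", image_dir]
    else:
        cmd += ["--video_file", str(video_file)]  # a camera ID such as 0 becomes "0"
    return cmd


model = (
    "data/pretrained/mobilenet_v2_1.0_224_224_CrossEntropyLoss_20230629161618"
    "/model/best_model_045_95.5607.pth"
)
print(build_demo_command(model, image_dir="data/test_image"))
print(build_demo_command(model, video_file=0))
```

To actually run the demo, pass the returned list to `subprocess.run(cmd, check=True)` from the project root. Because the arguments are passed as a list, no shell variable syntax is involved on either platform.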

The following shows the results of smoking detection and recognition:

 

(6) Some optimization suggestions

If you want to further improve model performance, you can try the following:

  1. Add more training data: collect data from your own business scenario, such as smoking and non-smoking samples from the same scenario, to improve the model's generalization.
  2. Use a model with more parameters: this tutorial uses mobilenet_v2, a relatively lightweight classification model. A larger model (such as resnet50) is theoretically more accurate, but its inference is also slower.
  3. Try different combinations of data augmentations during training.
  4. Add data augmentation: random cropping, random flipping, random rotation, color transformation and similar methods are already supported; you can also try more complex methods such as mixup and CutMix.
  5. Balance the samples: the raw smoking recognition data is imbalanced. The notsmoking class has far more samples than the smoking class, which biases the trained model toward the class with more samples. Sample balancing is recommended.
  6. Clean the dataset: the original data has been cleaned manually, but some blurry, low-quality and ambiguous samples remain. Clean the dataset again before training, otherwise recognition accuracy will suffer.
  7. Tune hyperparameters: for example the learning-rate schedule and the optimizer (SGD, Adam, etc.).
  8. Loss function: the training code already supports cross entropy and LabelSmoothing; you can also try loss functions such as FocalLoss.
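As an illustration of suggestion 5, one common way to balance samples is to weight each class inversely to its frequency. The sketch below is a generic technique, not the project's own code; the label counts are made up for the example, and whether the project's trainer accepts class weights is not stated here.

```python
from collections import Counter
from typing import Dict, Sequence


def inverse_frequency_weights(labels: Sequence[str]) -> Dict[str, float]:
    """Weight each class by total / (num_classes * class_count).

    A perfectly balanced dataset gives every class a weight of 1.0;
    rarer classes get weights above 1.0.
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


# Toy label list (made-up counts): three notsmoking samples per smoking sample.
toy_labels = ["notsmoking"] * 3000 + ["smoking"] * 1000
for name, weight in sorted(inverse_frequency_weights(toy_labels).items()):
    print(f"{name}: {weight:.4f}")
```

In PyTorch, weights like these can be passed, in class-index order, as the `weight` tensor of `torch.nn.CrossEntropyLoss`. Oversampling the smaller class is an alternative that leaves the loss unchanged.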

(7) Handling some runtime errors

  • Project directories, files and paths must not contain Chinese characters; otherwise many exceptions will occur!

  • cannot import name 'load_state_dict_from_url'

Some interface functions are no longer available after version upgrades. Make sure the versions match:

torch==1.7.1

torchvision==0.8.2

Alternatively, in the affected Python file, change

from torchvision.models.resnet import model_urls, load_state_dict_from_url

to:

from torch.hub import load_state_dict_from_url
model_urls = {
    'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth',
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
    'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth',
    'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth',
    'wide_resnet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth',
    'wide_resnet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth',
}
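If the same code has to run against both older and newer torchvision releases, the import can also be resolved at runtime instead of being edited by hand. The sketch below shows one possible way; `import_first` is a hypothetical helper, not part of the project, torch, or torchvision. It is demonstrated with standard-library modules so that it runs without PyTorch installed.

```python
import importlib
from typing import Any, Sequence


def import_first(name: str, modules: Sequence[str]) -> Any:
    """Return attribute `name` from the first module in `modules` that provides it."""
    for module_name in modules:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue                      # module not installed: try the next one
        if hasattr(module, name):
            return getattr(module, name)  # found the attribute
    raise ImportError(f"cannot import name {name!r} from any of {list(modules)}")


# Demonstration with the standard library: the first module does not exist,
# so the function falls back to the second one.
sqrt = import_first("sqrt", ["no_such_module_for_demo", "math"])
print(sqrt(16.0))

# In the project, the equivalent call would be (requires torch/torchvision):
# load_state_dict_from_url = import_first(
#     "load_state_dict_from_url", ["torchvision.models.resnet", "torch.hub"]
# )
```

The `model_urls` dictionary still has to be defined by hand, as in the snippet above, when the installed torchvision no longer exports it.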

5. Project source code download (Python version)

Project source code download: PyTorch implementation of smoking detection and recognition (including the smoking dataset and training code)

The complete project source code includes:

  1. The smoking recognition classification datasets smoking-dataset and smoking-video, with 15000+ images in total. The data is of high quality and can be used to develop deep learning classification models for smoking recognition.
  2. The training code for the smoking recognition classification model: train.py
  3. The test code for the smoking recognition classification model: demo.py
  4. The demo supports testing on images, videos and cameras.
  5. Training on custom datasets is supported.
  6. Supported models: common deep learning models such as resnet18, resnet34, resnet50, mobilenet_v2 and googlenet.
  7. The source code ships with a trained model file, so the test can be run directly: python demo.py
  8. Detection and recognition run in real time on an ordinary computer CPU/GPU.
