Table of contents
1. Pre-knowledge points
2. Dataset preparation
(1) Download the dataset
(2) Label your own dataset steps
(3) Load the dataset
3. Modify the pre-trained model
4. The use of TensorBoard in PyTorch
5. Training of the image classification model
6. Face detection model training
1. Pre-knowledge points
Using PyTorch to implement transfer learning with a pre-trained target detection model
Using PyTorch to implement transfer learning with a pre-trained image classification model
Using PyTorch to load and process datasets
Using Python to batch-modify file names and save them
2. Dataset preparation
(1) Download the dataset
Link: https://pan.baidu.com/s/1QCLjLZBAUbpwppHiHq3e_Q
Extraction code: t563
Tip: The dataset here has already been labeled and processed by me, and it is only a small subset of the full dataset. Readers should first train on this subset to verify that the code runs on the data, and then label their own dataset for training.
(2) Label your own dataset steps
Step 1: Use LabelImg to label your own dataset (in YOLO format; refer to the link above).
Tip: It is recommended that readers scale the images to a fixed size before annotating the dataset, so that neither the images nor the corresponding coordinates need to be rescaled later.
import os
import cv2

def equalScaleImage(imgPath, savePath):
    """
    :param imgPath: directory containing the images to convert
    :param savePath: directory in which to save the resized images
    :return:
    """
    imgs = os.listdir(imgPath)
    for i, imgName in enumerate(imgs):
        img_path = os.path.join(imgPath, imgName)
        img = cv2.imread(img_path)
        newImg = cv2.resize(img, (224, 224))
        save_path = os.path.join(savePath, str(i) + '.png')
        cv2.imwrite(save_path, newImg)
        print('Converting...')
Tip: Be careful with backslashes in the image paths you pass in (on Windows, escape them or use raw strings); otherwise reading the images will fail.
Step 2: Convert the coordinate values produced by labeling. The YOLO format stores normalized values, so the data must be further processed before the face boxes can be drawn on the image.
Tip: Why do the coordinates need to be re-processed here? Mainly because of the input requirements of the pre-trained model we use, as shown in the figure below:
The dataset labeled in YOLO format looks as follows:
Therefore, the coordinates labeled in YOLO format need to be converted accordingly (the conversion program is provided in the link above).
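The conversion itself is simple arithmetic. As a sketch (the function name and signature here are my own, not taken from the linked program): a YOLO box stores a normalized center point plus normalized width and height, which can be turned into absolute corner coordinates like this.

```python
def yolo_to_xyxy(cx, cy, w, h, img_w, img_h):
    """Convert a normalized YOLO box (cx, cy, w, h) into absolute
    corner coordinates (x_min, y_min, x_max, y_max)."""
    x_min = (cx - w / 2) * img_w
    y_min = (cy - h / 2) * img_h
    x_max = (cx + w / 2) * img_w
    y_max = (cy + h / 2) * img_h
    return x_min, y_min, x_max, y_max

# Example: a centered box covering half of a 224x224 image in each dimension
print(yolo_to_xyxy(0.5, 0.5, 0.5, 0.5, 224, 224))  # (56.0, 56.0, 168.0, 168.0)
```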
Step 3: Readers can name the dataset files according to their own needs; a link to a batch-renaming script for the images has also been provided.
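For reference, batch renaming only takes a few lines. This is a minimal sketch; the function name, prefix, and naming scheme are illustrative, and the original script is the one linked above.

```python
import os

def batch_rename(dir_path, prefix='face_'):
    """Rename every file in dir_path to prefix + index + original extension."""
    for i, name in enumerate(sorted(os.listdir(dir_path))):
        ext = os.path.splitext(name)[1]
        os.rename(os.path.join(dir_path, name),
                  os.path.join(dir_path, prefix + str(i) + ext))
```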
Tip: Labeling and processing a dataset is tedious and boring; I hope readers can persevere and finish it.
(3) Load the dataset
First: Since the dataset loaded into the model must follow a specific format, a separate program (MyDataset.py) is needed to load the dataset and process it further; the data must also be converted to Tensor format.
For example, the format of the data fed into the target detection model is:
# A small example of how to use a pre-trained model with your own dataset
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

def example():
    model = fasterrcnn_resnet50_fpn(pretrained=True, progress=True)
    # images: four images, each with shape [C, H, W]
    # boxes: each image contains 11 targets; each target has four coordinates
    images, boxes = torch.rand(4, 3, 600, 1200), torch.rand(4, 11, 4)
    print('boxes: {}'.format(boxes))
    # Convert [x, y, w, h] to [x_min, y_min, x_max, y_max]
    boxes[:, :, 2:4] = boxes[:, :, 0:2] + boxes[:, :, 2:4]
    print('boxes.shape: {}'.format(boxes.shape))
    # Random integer labels in the range [1, 91), with shape [4, 11]
    labels = torch.randint(1, 91, (4, 11))
    print('labels.shape: {}'.format(labels.shape))
    # Put the images into a list
    images = list(image for image in images)
    targets = []
    # Store each image's boxes and corresponding labels in a dictionary
    for i in range(len(images)):
        d = {}
        d['boxes'] = boxes[i]
        d['labels'] = labels[i]
        targets.append(d)
    print('len(images): {}'.format(len(images)))
    print('len(targets): {}'.format(len(targets)))
    print('images: {}'.format(images))
    print('targets: {}'.format(targets))
    # Note: the model is in training mode by default
    # model.train()
    # output = model(images, targets)
    # print(output['loss_classifier'].item())
    # For inference: switch to eval mode and run detection
    model.eval()
    x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
    predictions = model(x)
    print('predictions: {}'.format(predictions))
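Matching the target format shown above, MyDataset.py might be structured as follows. This is only a sketch: the class returns random tensors in place of real samples, and a real implementation would read images and the converted YOLO annotations from disk.

```python
import torch
from torch.utils.data import Dataset

class MyDataset(Dataset):
    """Sketch of a detection dataset returning (image_tensor, target_dict)."""
    def __init__(self, num_samples=8):
        self.num_samples = num_samples

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        image = torch.rand(3, 224, 224)  # stand-in for a loaded image tensor
        # One box in [x_min, y_min, x_max, y_max] form, as required by the model
        boxes = torch.tensor([[50.0, 50.0, 150.0, 150.0]])
        labels = torch.tensor([1], dtype=torch.int64)  # class 1 = face
        return image, {'boxes': boxes, 'labels': labels}

img, target = MyDataset()[0]
```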
3. Modify the pre-trained model
Tip: Since the number of categories in our dataset differs from that of the pre-trained model, the model must be modified accordingly. For changing the number of categories the model outputs, refer to the following article:
Using PyTorch to implement transfer learning with a pre-trained target detection model
4. The use of TensorBoard in PyTorch
Recommend the following blogger:
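As a quick reference, the basic TensorBoard workflow in PyTorch goes through torch.utils.tensorboard.SummaryWriter; the run directory, tag name, and loss values below are arbitrary examples.

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter('runs/face_detection')  # example log directory
for step in range(3):
    fake_loss = 1.0 / (step + 1)  # placeholder value for illustration
    writer.add_scalar('train/loss', fake_loss, step)
writer.close()
# View the curves with: tensorboard --logdir=runs
```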
5. Training of the image classification model
Tip: If you are training a face recognition and detection model, you need to mark not only the position of each face but also the identity it belongs to.
Using PyTorch to implement transfer learning with a pre-trained image classification model
6. Face detection model training
Tip: The code for this section will be placed on GitHub.