1. Dataset download
Download link (extraction code: 2022)
A sample image and label are stored in the dataset folder; you can inspect their channels and other properties to help prepare your own dataset.
2. Dataset preprocessing
Refer to another article: Image preprocessing
- Dataset images
  - 24-bit depth
  - 3-channel RGB
- Dataset labels
  - 8-bit depth
  - Label values are 0, 1, 2, ..., n-1 for n classes

If your dataset labels use the value 255, convert them to 0, 1, 2, etc. (depending on your own setup).
For example:
Background   [0,0,0]       → 0
Person       [192,128,128] → 1
Bike         [0,128,0]     → 2
Car          [128,128,128] → 3
Drone        [128,0,0]     → 4
Boat         [0,0,128]     → 5
Animal       [192,0,128]   → 6
Obstacle     [192,0,0]     → 7
Construction [192,128,0]   → 8
Vegetation   [0,64,0]      → 9
Road         [128,128,0]   → 10
Sky          [0,128,128]   → 11
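If your labels are stored as RGB color masks, they must be converted to single-channel index labels like the mapping above. A minimal sketch, assuming the example palette shown (adjust the colors and indices to your own classes):

```python
import numpy as np

# Palette taken from the example class list above (RGB order)
PALETTE = {
    (0, 0, 0): 0,        # Background
    (192, 128, 128): 1,  # Person
    (0, 128, 0): 2,      # Bike
    (128, 128, 128): 3,  # Car
    (128, 0, 0): 4,      # Drone
    (0, 0, 128): 5,      # Boat
    (192, 0, 128): 6,    # Animal
    (192, 0, 0): 7,      # Obstacle
    (192, 128, 0): 8,    # Construction
    (0, 64, 0): 9,       # Vegetation
    (128, 128, 0): 10,   # Road
    (0, 128, 128): 11,   # Sky
}

def rgb_mask_to_index(mask_rgb):
    """Map an H x W x 3 RGB color mask to an H x W uint8 class-index label."""
    label = np.zeros(mask_rgb.shape[:2], dtype=np.uint8)
    for color, idx in PALETTE.items():
        # Pixels matching this exact color get the corresponding class index
        label[np.all(mask_rgb == color, axis=-1)] = idx
    return label
```

Note that cv2.imread returns BGR order, so flip the channels (e.g. `mask[..., ::-1]`) before calling this function on an OpenCV-loaded mask.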
If GPU memory is insufficient, you can crop the images into 256×256 tiles with the following code (the cropped images and labels are saved to the image and label folders):
```python
import os
import cv2

images_path = './JPEGImages/'
labels_path = './SegmentationClass/'
os.makedirs('./image/', exist_ok=True)
os.makedirs('./label/', exist_ok=True)

image_files = os.listdir(images_path)
for s in image_files:
    image_path = images_path + s
    label_path = labels_path + s[:-4] + '.png'
    image = cv2.imread(image_path)
    # Read the label as a single channel so the class indices are preserved
    label = cv2.imread(label_path, cv2.IMREAD_GRAYSCALE)
    index = 0
    # Split each image into a 4x4 grid of 256x256 tiles
    for i in range(4):
        for j in range(4):
            new_image = image[i*256:(i+1)*256, j*256:(j+1)*256, :]
            new_label = label[i*256:(i+1)*256, j*256:(j+1)*256]
            cv2.imwrite('./image/' + 'b_' + s[:-4] + '_' + str(index) + '.png', new_image)
            cv2.imwrite('./label/' + 'b_' + s[:-4] + '_' + str(index) + '.png', new_label)
            index += 1
    print(s)
```
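The tiling above assumes each input is 1024×1024 (a 4×4 grid of 256×256 tiles). A quick numpy-only sanity check of the slicing logic, using a synthetic array in place of a real image:

```python
import numpy as np

# Synthetic 1024x1024 "image" whose pixel value encodes its tile row
image = np.zeros((1024, 1024, 3), dtype=np.uint8)
for i in range(4):
    image[i*256:(i+1)*256, :, :] = i

tiles = []
for i in range(4):
    for j in range(4):
        # Same slicing as the cropping script above
        tile = image[i*256:(i+1)*256, j*256:(j+1)*256, :]
        tiles.append(tile)

assert len(tiles) == 16
assert all(t.shape == (256, 256, 3) for t in tiles)
```

For images that are not an exact multiple of 256, the border slices come back smaller than 256×256, so pad or skip those tiles before training.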
3. Model download
4. Check the data
- After data preparation is complete, first run voc_annotation.py to generate the data lists: python voc_annotation.py. Generally no error is reported here; if one is, check the following:
- Check whether the labels are 8-bit depth; if not, run the following code:

```python
import os
import cv2

file_names = os.listdir('./SegmentationClass/')
for s in file_names:
    image_path = os.path.join('./SegmentationClass/', s)
    image = cv2.imread(image_path)
    # Collapse to a single channel and overwrite the label in place
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cv2.imwrite('./SegmentationClass/' + s, image)
```
- Check whether the labels contain unexpected values. For example, with 3 classes the labels may only contain 0, 1, and 2. Other values may raise an error, and even if no error is raised here, they can cause problems during training, so be sure to fix them. If there are only a few stray values, you can set them directly to 0; if there are many, trace which step introduced them. If you cannot find the cause, it is advisable to switch to a different dataset.
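To find and zero out stray label values, a minimal sketch; it operates on a label array as loaded with cv2.imread(path, cv2.IMREAD_GRAYSCALE), and num_classes is whatever your dataset uses:

```python
import numpy as np

def clean_label(label, num_classes):
    """Report values outside 0..num_classes-1 and reset them to background (0).

    `label` is an H x W uint8 array, e.g. loaded with
    cv2.imread(path, cv2.IMREAD_GRAYSCALE).
    """
    values = np.unique(label)
    # Anything at or above num_classes is not a valid class index
    stray = values[values >= num_classes].tolist()
    cleaned = label.copy()
    cleaned[cleaned >= num_classes] = 0
    return cleaned, stray
```

Loop this over the files in SegmentationClass, print the stray values per file, and write the cleaned arrays back with cv2.imwrite.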
5. Train the model
- Modify train.py:
  - image size (important)
  - number of classes (important)
  - batch_size and other parameters as needed
- Run python train.py to start training.
6. Model Evaluation
- Modify deeplab.py: the trained model path, the number of classes (including background), and the image size.
- Modify get_miou.py accordingly.
- Run python get_miou.py.
7. Model Prediction
- Modify deeplab.py: the trained model path, the number of classes (including background), and the image size.
- Modify predict.py; name_classes gives the name of each class.
- Run python predict.py.