[Deep learning] [Original] Training U-Net on your own dataset: the whole process and the problems along the way

Project address: https://github.com/zhixuhao/unet

Test environment:

Ubuntu 18.04

CUDA 10.0 + cuDNN 7.6.5

Anaconda3 + Python 3.7

keras==2.2.5

tensorflow-gpu==1.14.0

Step 1: Prepare the data set

Label the data with labelme, then batch-convert the JSON files with labelme_json_to_dataset; each JSON becomes a folder of 5 files (img.png, label.png, label_viz.png, label_names.txt, info.yaml).

For details on how to do the conversion, please refer to this blog: https://blog.csdn.net/qq_29462849/article/details/81037343

To save myself some work, I used the VOC datasets instead and wrote a script that converts the XML annotations into labelme's JSON format, which instantly gives a lot of labelme-style data (there are plenty of VOC datasets to draw from, such as VOC2012 and VOC2007). With the JSON files in hand, run labelme_json_to_dataset on them. Unfortunately, unet cannot use the images converted by labelme directly; there are two important things to do first.
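A rough sketch of that kind of conversion is below. It is only an illustration, not a definitive script: the labelme JSON field names and the version string are assumptions based on labelme 4.x, and each VOC bounding box is written out as a 4-point polygon, so check it against your installed labelme version.

import base64
import json
import os
import xml.etree.ElementTree as ET


def voc_xml_to_labelme(xml_path, img_dir, out_dir):
    """Convert one VOC annotation XML into a labelme-style JSON file."""
    root = ET.parse(xml_path).getroot()
    filename = root.find('filename').text
    size = root.find('size')
    height = int(size.find('height').text)
    width = int(size.find('width').text)

    shapes = []
    for obj in root.iter('object'):
        label = obj.find('name').text
        box = obj.find('bndbox')
        xmin = float(box.find('xmin').text)
        ymin = float(box.find('ymin').text)
        xmax = float(box.find('xmax').text)
        ymax = float(box.find('ymax').text)
        # write the bounding box as a 4-point polygon shape
        shapes.append({
            'label': label,
            'points': [[xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax]],
            'group_id': None,
            'shape_type': 'polygon',
            'flags': {},
        })

    # labelme_json_to_dataset can read the image from the embedded base64 imageData
    with open(os.path.join(img_dir, filename), 'rb') as f:
        image_data = base64.b64encode(f.read()).decode('utf-8')

    ann = {
        'version': '4.5.6',  # assumption: adjust to your labelme version
        'flags': {},
        'shapes': shapes,
        'imagePath': filename,
        'imageData': image_data,
        'imageHeight': height,
        'imageWidth': width,
    }
    out_name = os.path.splitext(os.path.basename(xml_path))[0] + '.json'
    with open(os.path.join(out_dir, out_name), 'w') as f:
        json.dump(ann, f, indent=2)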

First, img.png (from the converted folder) must be converted to an 8-bit image. On inspection, labelme outputs 24-bit images; when I trained with them as-is, every test prediction came out all gray, so this conversion is necessary. The code is easy to find online; here is the version I used.

import os
import glob
import cv2


def togrey(img, outdir):
    """Convert one image to 8-bit grayscale and write it to outdir."""
    src = cv2.imread(img)
    try:
        dst = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
        cv2.imwrite(os.path.join(outdir, os.path.basename(img)), dst)
    except Exception as e:
        print(e)


# overwrite the training and test images in place with their grayscale versions
for file in glob.glob('../data/mydata/train/pic/*.png'):
    togrey(file, '../data/mydata/train/pic/')

for file in glob.glob('../data/mydata/test/pic/*.png'):
    togrey(file, '../data/mydata/test/pic/')

Second, the mask files need to be binarized to pure black and white; otherwise the model's output is completely black after training and testing. This is very important. From my tests, either a black background or a white background works; I used Otsu's global threshold to binarize the masks in batch, as sketched below.
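A minimal sketch of that batch binarization with OpenCV's Otsu threshold (the mask folder path is an assumption; point it at wherever your label files actually live):

import os
import glob
import cv2


def binarize_mask(mask_path, outdir):
    """Binarize one mask with Otsu's global threshold so it only contains 0 and 255."""
    gray = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        print('failed to read', mask_path)
        return
    # the threshold value 0 is ignored when THRESH_OTSU is set; Otsu picks it automatically
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite(os.path.join(outdir, os.path.basename(mask_path)), binary)


for f in glob.glob('../data/mydata/train/label/*.png'):  # assumed mask folder
    binarize_mask(f, '../data/mydata/train/label/')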

 

 

The first image above is the label produced by labelme; the second is the mask actually used for training. Repeated tests showed that inverting the colors of the second image (turning it into a white background) simply inverts the training/test output and otherwise has no effect.

Finally, put the images and mask files in the corresponding paths, change the paths in the unet source code, and start training.
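The training step is essentially the repo's main.py with the paths changed. Here is a minimal sketch; the 'label' mask folder name, the augmentation settings, steps_per_epoch and epochs are my assumptions and should be adapted to your data.

from keras.callbacks import ModelCheckpoint
from data import trainGenerator  # from the unet repo
from model import unet           # from the unet repo

# light augmentation, in the same style as the repo's main.py
data_gen_args = dict(rotation_range=0.2,
                     width_shift_range=0.05,
                     height_shift_range=0.05,
                     shear_range=0.05,
                     zoom_range=0.05,
                     horizontal_flip=True,
                     fill_mode='nearest')

# batch size 2; images in data/mydata/train/pic, masks in data/mydata/train/label (assumed)
myGene = trainGenerator(2, 'data/mydata/train', 'pic', 'label', data_gen_args, save_to_dir=None)

model = unet()
# save the best weights to the same path the test code loads from (./model must exist)
model_checkpoint = ModelCheckpoint('./model/model.h5', monitor='loss', verbose=1, save_best_only=True)
model.fit_generator(myGene, steps_per_epoch=300, epochs=5, callbacks=[model_checkpoint])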

Here is one training/test result:

 

The white blob marks the rough location of the vehicle. Since the image is company-internal material, I won't post it here; I only put up the labelme conversion for comparison.

 

There is also num_class to pay attention to. If there is only one object class, the unet code basically needs no changes: num_class=2. Why 2 and not 1? Simple: the background counts as a class. Also, the test images must be named 0.png, 1.png, and so on. For example, with 10 test images named 0-9.png, set num_image=10 in the test code and change the second argument of predict_generator to 10 as well. If you are comfortable coding, you can of course change it so numeric names are not required. Below is the modified code.

from model import unet                        # from the unet repo
from data import testGenerator, saveResult    # from the unet repo

# 10 test images named 0.png ... 9.png
testGene = testGenerator("data/mydata/test/pic", num_image=10)
model = unet()
model.load_weights("./model/model.h5")
results = model.predict_generator(testGene, 10, verbose=1)  # second argument = number of test images
saveResult("data/mydata", results, num_class=2)

I haven't studied the multi-class case yet; when I figure it out, I will post another blog.

Overall feeling:

The repo has only a few files and little code, but plenty of pitfalls. Online you can find reports of predictions that come out all gray, all white, or all black; I went through those suggestions and they did not solve my problem. In the end the issue was with the label files. Okay, that concludes this test summary!

Second update: the method above works, but after digging further it turns out that an "all black" prediction may well be correct. If the label is not the image labelme renders directly but a map built from category indices, the pixel values are just 0, 1, 2, 3, and of course such an image looks black! You can generate a predicted image and write a small script to print its pixel values to check. So far I have found that many public datasets are in fact built as index-based label maps.
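A minimal sketch of such a check (the file path is an assumption; point it at one of your predicted or label images):

import cv2
import numpy as np

# read the image without any channel conversion
img = cv2.imread('data/mydata/0_predict.png', cv2.IMREAD_UNCHANGED)
values, counts = np.unique(img, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))
# an index-style label map only contains small values such as 0, 1, 2, ...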

Origin: https://blog.csdn.net/FL1623863129/article/details/112308204