Aerial image segmentation in Python based on improved CNN & FCN (full source code & dataset & video tutorial)

1. Aerial image segmentation effect display:

[Figures: example segmentation results — 2.png, 3.png]

2. Background introduction:

With the rapid development of UAV technology, UAVs have received extensive attention in both research and industrial applications. Images and video are the main ways in which a UAV perceives its surroundings. Semantic image segmentation is a research hotspot in computer vision and is widely used in scenarios such as autonomous driving and intelligent robotics. Semantic segmentation of UAV aerial images applies segmentation techniques to aerial imagery so that the UAV can intelligently perceive the targets in a scene. This article first introduces semantic segmentation technology and UAV application development, the related UAV aerial photography datasets, the characteristics of UAV aerial images, and common semantic segmentation evaluation metrics. It then reviews segmentation methods tailored to the characteristics of UAV aerial photography, covering small objects, model real-time performance, and multi-scale fusion. Finally, it summarizes related applications of UAV semantic segmentation, including line detection, agriculture, and building extraction, and analyzes future development trends and challenges.

3. Data Augmentation

What we have are a handful of very large aerial images, which cannot be fed into the network directly: they do not fit in memory and they all have different sizes. We therefore crop them randomly: we generate random x, y coordinates, cut out a 256×256 patch at those coordinates, and then apply the following data augmentation operations:

1. Both the original image and the label image are rotated by 90, 180, and 270 degrees
2. Both the original image and the label image are mirrored along the y-axis
3. The original image is blurred
4. The illumination (gamma) of the original image is adjusted
5. Noise is added to the original image (Gaussian noise, salt-and-pepper noise)

Here I did not use the data augmentation functions that come with Keras; instead I wrote the corresponding augmentation functions with OpenCV.

import random

import cv2
import numpy as np
from tqdm import tqdm

img_w = 256
img_h = 256

image_sets = ['1.png', '2.png', '3.png', '4.png', '5.png']


def gamma_transform(img, gamma):
    gamma_table = [np.power(x / 255.0, gamma) * 255.0 for x in range(256)]
    gamma_table = np.round(np.array(gamma_table)).astype(np.uint8)
    return cv2.LUT(img, gamma_table)


def random_gamma_transform(img, gamma_vari):
    log_gamma_vari = np.log(gamma_vari)
    alpha = np.random.uniform(-log_gamma_vari, log_gamma_vari)
    gamma = np.exp(alpha)
    return gamma_transform(img, gamma)


def rotate(xb, yb, angle):
    M_rotate = cv2.getRotationMatrix2D((img_w / 2, img_h / 2), angle, 1)
    xb = cv2.warpAffine(xb, M_rotate, (img_w, img_h))
    yb = cv2.warpAffine(yb, M_rotate, (img_w, img_h))
    return xb, yb


def blur(img):
    img = cv2.blur(img, (3, 3))
    return img


def add_noise(img):
    for i in range(200):  # add point (salt) noise
        temp_x = np.random.randint(0, img.shape[0])
        temp_y = np.random.randint(0, img.shape[1])
        img[temp_x][temp_y] = 255
    return img


def data_augment(xb, yb):
    if np.random.random() < 0.25:
        xb, yb = rotate(xb, yb, 90)
    if np.random.random() < 0.25:
        xb, yb = rotate(xb, yb, 180)
    if np.random.random() < 0.25:
        xb, yb = rotate(xb, yb, 270)
    if np.random.random() < 0.25:
        xb = cv2.flip(xb, 1)  # flipCode > 0: flip along the y-axis
        yb = cv2.flip(yb, 1)
    if np.random.random() < 0.25:
        xb = random_gamma_transform(xb, 1.0)
    if np.random.random() < 0.25:
        xb = blur(xb)
    if np.random.random() < 0.2:
        xb = add_noise(xb)
    return xb, yb


def creat_dataset(image_num=100000, mode='original'):
    print('creating dataset...')
    image_each = image_num / len(image_sets)
    g_count = 0
    for i in tqdm(range(len(image_sets))):
        count = 0
        src_img = cv2.imread('./data/src/' + image_sets[i])  # 3 channels
        label_img = cv2.imread('./data/label/' + image_sets[i], cv2.IMREAD_GRAYSCALE)  # single channel
        X_height, X_width, _ = src_img.shape
        while count < image_each:
            random_width = random.randint(0, X_width - img_w - 1)
            random_height = random.randint(0, X_height - img_h - 1)
            src_roi = src_img[random_height: random_height + img_h, random_width: random_width + img_w, :]
            label_roi = label_img[random_height: random_height + img_h, random_width: random_width + img_w]
            if mode == 'augment':
                src_roi, label_roi = data_augment(src_roi, label_roi)
            visualize = label_roi * 50  # scale label values so classes are visible when saved as an image
            cv2.imwrite('./aug/train/visualize/%d.png' % g_count, visualize)
            cv2.imwrite('./aug/train/src/%d.png' % g_count, src_roi)
            cv2.imwrite('./aug/train/label/%d.png' % g_count, label_roi)
            count += 1
            g_count += 1
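For reference, a minimal way to invoke this script would be the following sketch (the original post does not show the entry point; the directory layout ./data/src, ./data/label and ./aug/train/{src,label,visualize} is assumed to exist):

# Hypothetical entry point: generate 100,000 augmented 256x256 patches.
if __name__ == '__main__':
    creat_dataset(image_num=100000, mode='augment')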

4. Environment configuration

tensorflow-gpu==2.3.0
numpy==1.21.5
matplotlib==3.5.1
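These can be installed in one step, for example (assuming pip and a CUDA-capable GPU driver are already set up):

pip install tensorflow-gpu==2.3.0 numpy==1.21.5 matplotlib==3.5.1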

5. Create a dataset path index file

There are three files under the "./prepare_dataset" directory in the project root: drive.py, stare.py and chasedb1.py. Set the "data_root_path" parameter in all three files to the absolute path of the dataset prepared above (for example: data_root_path="/home/lee/datasets"). Then run each of them:

python ./prepare_dataset/drive.py
python ./prepare_dataset/stare.py
python ./prepare_dataset/chasedb1.py
This generates "train.txt" and "test.txt", which store the data paths for training and testing respectively (each line contains, in order, the paths of the original image, the label and the FOV mask, separated by spaces).
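Each line of these index files can be parsed with a simple split; a minimal sketch (the project's own loader may differ) looks like this:

# Minimal sketch: read an index file where each line is "<image_path> <label_path> <fov_path>".
def read_index_file(path):
    samples = []
    with open(path) as f:
        for line in f:
            parts = line.strip().split()
            if len(parts) == 3:
                img_path, label_path, fov_path = parts
                samples.append((img_path, label_path, fov_path))
    return samples

train_samples = read_index_file('train.txt')
print(len(train_samples), 'training samples')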

6. Training model

Modify the hyperparameters and other configuration information in the "config.py" file in the root directory. Pay special attention to the two parameters "train_data_path_list" and "test_data_path_list", which should point to the "train.txt" and "test.txt" files created in the previous step. Construct the desired model in "train.py" (all models are implemented in "./models"); for example, to specify the UNet model:

net = models.UNetFamily.U_Net(1,2).to(device) # line 103 in train.py
After modification, execute in the project root directory:

CUDA_VISIBLE_DEVICES=1 python train.py --save UNet_vessel_seg --batch_size 64
The above command runs the training program on GPU 1 and saves the training results in the "./experiments/UNet_vessel_seg" folder. The batch size is 64, and the remaining parameters take their default values from config.py.

You can configure the training settings in config.py, or override them from the command line. The training results are saved to the corresponding directory under "./experiments" (the name of the directory is specified with the "--save" parameter).
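Internally this usually works by letting command-line arguments override the defaults defined in config.py; a simplified sketch of that pattern (argument names taken from the commands above, not the project's exact code) is:

# Simplified sketch of overriding config.py defaults from the command line.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--save', default='UNet_vessel_seg', help='subfolder of ./experiments for results')
parser.add_argument('--batch_size', type=int, default=64)
args = parser.parse_args()
print(args.save, args.batch_size)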

In addition, note the "val_on_test" parameter in the config file. When it is true, performance is evaluated on the test set after each training epoch, and the model with the highest AUC of ROC is saved as "best_model.pth"; when it is false, the evaluation (AUC of ROC) is performed on the validation set instead to decide which model to save. The metric used to select the best model can be changed; the default is AUC of ROC.
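The selection logic amounts to "keep the checkpoint with the best monitored metric"; a hedged sketch of that idea (placeholder callables, not the project's exact code) is:

# Sketch: save the weights whenever the monitored metric improves.
import torch

def train_with_best_checkpoint(net, train_one_epoch, evaluate_auc, num_epochs, save_path):
    """train_one_epoch and evaluate_auc are assumed callables supplied by the training script."""
    best_auc = 0.0
    for epoch in range(num_epochs):
        train_one_epoch(net)
        auc = evaluate_auc(net)  # AUC of ROC on the test or validation set, depending on val_on_test
        if auc > best_auc:
            best_auc = auc
            torch.save(net.state_dict(), save_path)  # e.g. './experiments/UNet_vessel_seg/best_model.pth'
    return best_auc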


7. Test Evaluation

Construct the corresponding model in the "test.py" file (same as above), for example, specify the UNet model:

net = models.UNetFamily.U_Net(1,2).to(device)
The test process also reads the relevant parameters from "./config.py", which can likewise be overridden by command-line arguments at runtime.

Then run:

CUDA_VISIBLE_DEVICES=1 python test.py --save UNet_vessel_seg
The above command loads the trained "./experiments/UNet_vessel_seg/best_model.pth" weights into the corresponding model and evaluates it on the test set. The performance metrics are saved in "performance.txt" in the same folder, and the corresponding visualization results are generated at the same time.
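Loading the saved weights before evaluation follows the standard PyTorch pattern; a minimal sketch (assuming the same U_Net construction as above; the checkpoint layout may differ) is:

# Minimal sketch: restore the best checkpoint before running the test-set evaluation.
import torch
import models

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = models.UNetFamily.U_Net(1, 2).to(device)
checkpoint = torch.load('./experiments/UNet_vessel_seg/best_model.pth', map_location=device)
# The file may hold a plain state_dict or a dict wrapping one; handle both here.
state_dict = checkpoint['net'] if isinstance(checkpoint, dict) and 'net' in checkpoint else checkpoint
net.load_state_dict(state_dict)
net.eval()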
[Figure: training process.png]

8. System integration:

[Figure: 1.png]

9. Complete source code & environment deployment video tutorial & dataset:


Origin blog.csdn.net/cheng2333333/article/details/126663428