"Baidu architects take you to practice deep learning with zero foundation" learning experience

First time using a deep learning framework

Baidu's PaddlePaddle ("Flying Paddle") is, as far as I know, the first open-source, technologically advanced, and fully featured industrial-grade deep learning platform in China. It is also the first framework of this kind that I have worked with. If you want to try it, visit https://www.paddlepaddle.org.cn/

About the course

If you want to learn machine learning and deep learning, I personally suggest building up the basic theory first. You can search Bilibili for Zhejiang University's postgraduate machine learning course, and for courses from many other well-known instructors. After that, you can take the open classes on AI Studio, the learning platform that accompanies PaddlePaddle. It offers detailed open-class training camps on machine learning, deep learning, reinforcement learning, and transfer learning, and you can receive 12 hours of GPU computing power every day. The courses combine theory with practice, and I think the daily homework and competitions that go with them really build your skills. AI Studio also hosts a number of commonly used datasets, and most courses teach with them.

Takeaways from the AI insect recognition competition

Since I only recently came into contact with deep learning, I need to go over the theory again to deepen my understanding. For the final AI insect recognition competition in this training camp, the teacher suggested improving the YOLOv3 object detection algorithm, for example by replacing the backbone network with ResNet or MobileNet, or by using Faster R-CNN to overcome YOLOv3's inaccurate detection of small objects. With my limited ability I have only a preliminary understanding of how these networks work, and it was really difficult to find a way to implement them in code. My highest accuracy in the end was therefore 77.8, one percentage point above the baseline. Looking at the experts who reached 99%, I can only sigh. I will keep studying this area and aim for a good result in the next training camp.

Some ideas for improvement

1. Data fusion
The following code only fuses the images. To keep the data format uniform during preprocessing, we fixed the lists of ground-truth boxes and labels to a length of 50. After fusing two images, I think the ground-truth boxes and labels of both pictures also need to be merged. I ran out of time before the competition, so that part of the code is left for later.

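### Data fusion (mixup)
import numpy as np

def random_mixup(img1, img2, gtboxes1, gtlabels1, gtboxes2, gtlabels2):
    # Blend two images; the box and label arguments are not used yet (see TODO)
    alpha = 5
    lam = np.random.beta(alpha, alpha)  # Beta(5, 5): bell-shaped, roughly Gaussian around 0.5
    # The canvas is large enough to hold both images, each placed at the top-left corner
    height = max(img1.shape[0], img2.shape[0])
    width = max(img1.shape[1], img2.shape[1])
    mix_img = np.zeros(shape=(height, width, 3), dtype='float32')
    mix_img[:img1.shape[0], :img1.shape[1], :] = img1.astype('float32') * lam
    mix_img[:img2.shape[0], :img2.shape[1], :] += img2.astype('float32') * (1. - lam)
    mix_img = mix_img.astype('uint8')
    # TODO: merge the ground-truth boxes and labels of both images
    # mix_gtboxes =
    # mix_gtlabels =

    return mix_img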
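
The box and label merging that the code above leaves open could look roughly like the sketch below. It is not the competition baseline's code: the helper name `merge_mixup_targets` is hypothetical, and it assumes that boxes are stored as relative `[x_center, y_center, w, h]` rows, that padding rows have zero width and height, and that both images sit at the top-left corner of the mixed canvas, as in `random_mixup`.

```python
import numpy as np

MAX_BOXES = 50  # fixed list length mentioned in the text


def merge_mixup_targets(boxes1, labels1, shape1, boxes2, labels2, shape2,
                        max_boxes=MAX_BOXES):
    """Merge ground-truth boxes and labels of two mixed images.

    Assumption: boxes are relative [x_center, y_center, w, h] rows and
    padding rows have zero width and height.
    """
    # Canvas size used by the image blending step
    height = max(shape1[0], shape2[0])
    width = max(shape1[1], shape2[1])

    kept_boxes, kept_labels = [], []
    for boxes, labels, shape in ((boxes1, labels1, shape1),
                                 (boxes2, labels2, shape2)):
        boxes = np.asarray(boxes, dtype='float32').reshape(-1, 4)
        labels = np.asarray(labels, dtype='int32').reshape(-1)
        valid = (boxes[:, 2] > 0) & (boxes[:, 3] > 0)  # drop padding rows
        # The image keeps its top-left origin on the canvas, so relative
        # coordinates only need rescaling by the size ratio.
        sx, sy = shape[1] / width, shape[0] / height
        scale = np.array([sx, sy, sx, sy], dtype='float32')
        kept_boxes.append(boxes[valid] * scale)
        kept_labels.append(labels[valid])

    # Concatenate, then truncate to the fixed length (extra boxes are dropped)
    all_boxes = np.concatenate(kept_boxes, axis=0)[:max_boxes]
    all_labels = np.concatenate(kept_labels, axis=0)[:max_boxes]

    # Pad back to the fixed length with zeros
    out_boxes = np.zeros((max_boxes, 4), dtype='float32')
    out_labels = np.zeros((max_boxes,), dtype='int32')
    out_boxes[:len(all_boxes)] = all_boxes
    out_labels[:len(all_labels)] = all_labels
    return out_boxes, out_labels
```

If the two images together carry more than 50 valid boxes, this sketch silently drops the extras; whether that is acceptable depends on how crowded the insect images are.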

2. Replace the backbone network, borrowing ideas from the article "AI insect recognition sharing (based on YOLOv3 object detection and SE-ResNet classification correction)". The article mentions a model pre-trained on the Objects365 dataset. As a beginner I do not yet know how to use it, so I am leaving it as a future improvement and will come back to it. The article also mentions a producer-consumer pattern for multi-process data loading, which can effectively speed up training and which I still need to understand better.

3. Modify the criterion used to match prediction boxes with anchor boxes. The usual measure is Intersection over Union (IoU). For alternatives, see the Zhihu article on the IoU, GIoU, DIoU, and CIoU loss functions.
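
To make the difference between two of these measures concrete, here is a minimal sketch that computes IoU and GIoU for a pair of boxes. It assumes corner format `[x1, y1, x2, y2]`, which is not necessarily the format the course code uses, and the function name `box_iou_giou` is made up for illustration. GIoU is taken as IoU minus the fraction of the smallest enclosing box that the union does not cover.

```python
def box_iou_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes in [x1, y1, x2, y2] corner format."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Overlap rectangle; width or height is clamped to 0 when boxes are disjoint
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Smallest axis-aligned box that encloses both inputs
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    enclose = enc_w * enc_h
    giou = iou - (enclose - union) / enclose if enclose > 0 else iou
    return iou, giou
```

For two disjoint boxes the IoU is 0 no matter how far apart they are, while GIoU becomes more negative as the gap grows, which is why it can still give a useful training signal when boxes do not overlap.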

Follow-up

In the near future I will implement these ideas and see what results they give!
If you are interested in PaddlePaddle or deep learning, leave a comment and we can learn together. Let's keep going!

Origin blog.csdn.net/weixin_43357695/article/details/108290380