Series Notes | Deep Learning Series (5): Optimization Techniques (Part 2)


We have summarized five tips for deep learning:

This installment continues from the third tip.

3. Early stopping and Regularization

In this section we discuss early stopping and regularization. These two techniques are not unique to deep learning; they are common methods throughout machine learning.

Early stopping

During training, running all the way to the end does not necessarily give the best model, and the likely reason is overfitting. We need to hit the brakes in advance to obtain a better result.
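The idea above can be sketched as a training loop with a patience counter; `train_step` and `validate` are hypothetical stand-ins for one epoch of training and one pass over a held-out validation set:

```python
import numpy as np

def train_with_early_stopping(train_step, validate, max_epochs=100, patience=5):
    """Stop when validation loss has not improved for `patience` epochs."""
    best_loss = np.inf
    best_epoch = 0
    for epoch in range(max_epochs):
        train_step(epoch)           # one pass over the training set
        val_loss = validate(epoch)  # evaluate on the validation set
        if val_loss < best_loss:
            best_loss = val_loss    # new best: remember it (save weights here)
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            break                   # no improvement for `patience` epochs: stop
    return best_epoch, best_loss

# Toy validation curve: improves, then overfits (loss rises again).
curve = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9, 1.0]
epoch, loss = train_with_early_stopping(lambda e: None, lambda e: curve[e],
                                        max_epochs=len(curve), patience=3)
# epoch = 3, loss = 0.5: we stop well before the curve climbs back up
```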

Regularization

When we try to minimize the loss function, we look for a set of weights that not only makes the loss smaller but also keeps the weights themselves close to zero; with such weights, our results will be even better.

L1 regularization:

The new loss function to be minimized becomes L'(θ) = L(θ) + λ Σ_i |w_i|, where the added term penalizes the absolute values of the weights.

L2 Regularization:

The new loss function to be minimized becomes L'(θ) = L(θ) + (λ/2) Σ_i w_i², where the added term penalizes the squared magnitudes of the weights.
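As an illustrative sketch (my own, not from the original post), here is how each penalty changes a plain SGD update: the L2 term multiplies the weights by a factor slightly below 1 each step ("weight decay"), while the L1 term subtracts a constant amount, pushing weights toward zero:

```python
import numpy as np

def sgd_step_with_weight_decay(w, grad, lr=0.1, lam=0.01):
    # Gradient of (lam/2) * sum(w_i^2) is lam * w, so the update becomes
    # w <- (1 - lr*lam) * w - lr * grad: weights shrink multiplicatively.
    return (1 - lr * lam) * w - lr * grad

def sgd_step_with_l1(w, grad, lr=0.1, lam=0.01):
    # Subgradient of lam * sum(|w_i|) is lam * sign(w_i), so every weight
    # is pulled toward zero by the same constant amount lr*lam.
    return w - lr * (grad + lam * np.sign(w))

w = np.array([1.0, -2.0])
# With a zero data gradient, only the penalty acts on the weights:
w_l2 = sgd_step_with_weight_decay(w, grad=np.zeros(2), lr=0.1, lam=0.1)
# -> [0.99, -1.98]: each weight shrinks by 1%
w_l1 = sgd_step_with_l1(w, grad=np.zeros(2), lr=0.1, lam=0.1)
# -> [0.99, -1.99]: each weight moves 0.01 toward zero
```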

Many readers may wonder why smaller weights give better results. Here is an analogy: between the ages of 6 and 14, the density of neurons in the human brain decreases significantly, which suggests that some neurons are ineffective and hinder the brain's progress.

4. Dropout

Dropout shone in the 2012 ImageNet competition; it was one of the techniques behind the winning CNN model.

First, an intuitive understanding of dropout:

When practicing martial arts, you train with weights tied to your legs.

When it is time for the real fight, the weights come off:

There are several ways to understand dropout.

Basic definitions

During training, each neuron "steps aside" (is dropped) with probability p%:

At test time, all the neurons work together, with each weight multiplied by (1 - p%):
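A minimal NumPy sketch of this train/test asymmetry (the random mask and the (1 - p) scaling are the standard formulation; the function names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_train(x, p):
    """Training: each unit is dropped (set to 0) with probability p."""
    mask = rng.random(x.shape) >= p
    return x * mask

def dropout_test(x, p):
    """Testing: all units stay active, but outputs are scaled by (1 - p)
    so the expected activation matches what the next layer saw in training."""
    return x * (1 - p)

x = np.ones(100_000)
# On average, training keeps (1 - p) of the signal, matching the test scaling:
train_mean = dropout_train(x, p=0.5).mean()   # ~ 0.5
test_mean = dropout_test(x, p=0.5).mean()     # exactly 0.5
```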

Dropout is a form of ensemble learning

We discussed ensemble learning in the machine learning column (see the linked article on ensemble learning). Each time we train with dropout, the network structure is different: each is a thinner network:

In fact, during training we train many such thinner networks.

At test time, we take the average of these networks' outputs.

So in deep learning, our overall training and testing procedure is as follows:
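For a single linear unit, the test-time (1 - p) scaling is exactly equal to averaging over all 2^n thinner networks (this exact equivalence holds only in the linear case; for deep nonlinear networks it is an approximation). The toy check below, my own illustration with n = 3 weights, verifies this:

```python
import itertools
import numpy as np

w = np.array([0.5, -1.0, 2.0])   # weights of one linear unit
x = np.array([1.0, 2.0, 3.0])    # a fixed input
p = 0.5                          # dropout probability

# Enumerate all 2^3 "thinner" networks: each mask keeps a subset of units.
outputs = [np.dot(w * np.array(mask), x)
           for mask in itertools.product([0, 1], repeat=3)]
ensemble_avg = np.mean(outputs)

# A single network with weights scaled by (1 - p) gives the same answer.
scaled_output = np.dot(w * (1 - p), x)
# Both equal 2.25 here: averaging the ensemble == scaling the weights.
```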

Many of the images and formulas in this column come from Prof. Hung-yi Lee of National Taiwan University and from Stanford's cs229, cs231n, and cs224n courses. Our thanks and respect go to these classic courses!

About the author: Wu Qiang, PhD from Lanzhou University, Google Developer Expert (GDE, Machine Learning track)

CSDN:https://me.csdn.net/dukuku5038 

Zhihu: https://www.zhihu.com/people/Dr.Wu/activities 

AI comics official account (WeChat): DayuAI-Founder

Series notes:

Series Notes | Deep Learning Series (1): Neural Networks

Series Notes | Deep Learning Series (2): Gradient Descent

Series Notes | Deep Learning Series (3): Backpropagation

Series Notes | Deep Learning Series (4): Optimization Techniques (Part 1)





Origin blog.csdn.net/red_stone1/article/details/103951669