Recall the "three steps" of deep learning:

1. Define a set of candidate functions (the neural network)
2. Define the goodness of a function (the loss)
3. Pick the best set of parameters

For step 3, how do we actually find that best set of parameters?
Gradient descent is one of the most effective methods.
Setup: suppose the model has two parameters θ1 and θ2, and the loss function is L. Then the gradient is:

∇L = [∂L/∂θ1, ∂L/∂θ2]^T

To find the minimum of L, we iterate:

θ^(t+1) = θ^t − η ∇L(θ^t)

That is, at every step the parameters are updated by subtracting the gradient multiplied by the learning rate η.
So why does the update formula above use a minus sign rather than a plus sign?

θ changes along the direction of movement, while the gradient points normal to the contour lines of L, i.e., in the direction of steepest increase. To decrease L, we must move opposite to the gradient, hence the minus sign.
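As a concrete illustration, here is a minimal sketch of the update rule on a hypothetical two-parameter quadratic loss (the loss function, its minimum at (3, −1), the learning rate 0.1, and the iteration count are all invented for the example):

```python
import numpy as np

# Toy loss with two parameters theta1, theta2 (hypothetical example):
# L(theta) = (theta1 - 3)^2 + (theta2 + 1)^2, minimized at (3, -1).
def loss(theta):
    return (theta[0] - 3.0) ** 2 + (theta[1] + 1.0) ** 2

def grad(theta):
    # Analytic gradient: [dL/dtheta1, dL/dtheta2]
    return np.array([2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)])

eta = 0.1                      # learning rate
theta = np.array([0.0, 0.0])   # initial parameters
for _ in range(200):
    theta = theta - eta * grad(theta)   # theta^(t+1) = theta^t - eta * gradient

print(theta)   # close to [3, -1]
```

Each iteration multiplies the distance to the minimum by (1 − 2η), so the parameters converge geometrically to (3, −1).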
With basic Gradient Descent introduced, let's explore some tips for using it.
Setting the learning rate

If the learning rate η is set poorly, the loss may increase instead of decrease.
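This is easy to demonstrate on the toy loss L(θ) = θ² (both learning rates here are illustrative choices, not recommendations): with a small η each step shrinks θ, while an overly large η makes each step overshoot the minimum so the loss grows.

```python
# Minimal sketch: minimize L(theta) = theta^2 with two learning rates.
# Gradient of theta^2 is 2*theta, so each step does theta *= (1 - 2*eta).
# With eta = 0.1 the factor is 0.8 and the loss shrinks; with eta = 1.5
# the factor is -2, every step overshoots, and the loss blows up.
def run_gd(eta, steps=10, theta0=1.0):
    theta = theta0
    for _ in range(steps):
        theta = theta - eta * 2.0 * theta
    return theta ** 2          # final loss

print(run_gd(0.1))   # small: the loss decreased
print(run_gd(1.5))   # huge: the loss increased
```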
Adaptive learning rate
In much machine learning code, the learning rate is set to a fixed value, which then has to be tuned by hand. Experience suggests two general rules:

1. At the beginning of training, a larger learning rate helps.
2. After several rounds, as the result slowly approaches the optimum, the learning rate needs to be turned down.
Adagrad: divide the conventional learning rate by the square root of the sum of squares of all past derivatives:

θ^(t+1) = θ^t − η / √(Σ_{i=0..t} (g^i)²) · g^t

where g^i is the gradient at step i.
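A minimal Adagrad sketch on a hypothetical one-parameter quadratic loss (the loss, η = 1, and the step count are assumptions for illustration):

```python
import numpy as np

# Adagrad on the toy loss L(theta) = (theta - 3)^2.
# Each step divides the base learning rate eta by the square root of
# the accumulated sum of squared past gradients.
def grad(theta):
    return 2.0 * (theta - 3.0)

eta = 1.0
theta = 0.0
sum_sq = 0.0          # running sum of squared gradients
eps = 1e-8            # guards against division by zero
for _ in range(500):
    g = grad(theta)
    sum_sq += g ** 2
    theta -= eta / (np.sqrt(sum_sq) + eps) * g

print(theta)   # approaches 3
```

Note how the effective step size automatically decays as gradients accumulate, matching the two rules above: large steps early, small steps late.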
Stochastic Gradient Descent (SGD)

Goal: make training faster.

GD updates the parameters once after processing all of the training data; SGD updates the parameters after every single sample.

Comparing the effect of GD and SGD:
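The contrast can be sketched on a toy one-weight linear regression problem (the data, learning rate, and epoch counts are invented for the example): GD computes one gradient averaged over all samples per pass, while SGD takes one step per sample.

```python
import numpy as np

# Toy data: y = 2*x with no noise, so the true weight is 2.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x

def gd(epochs=50, eta=0.5):
    w = 0.0
    for _ in range(epochs):
        g = np.mean(2.0 * (w * x - y) * x)        # average over ALL samples
        w -= eta * g                              # one update per epoch
    return w

def sgd(epochs=50, eta=0.5):
    w = 0.0
    for _ in range(epochs):
        for xi, yi in zip(x, y):
            w -= eta * 2.0 * (w * xi - yi) * xi   # one update per sample
    return w

print(gd(), sgd())   # both approach the true weight 2.0
```

Per pass over the data, GD moves once while SGD moves 100 times, which is why SGD typically makes progress much faster early in training (at the cost of noisier steps).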
Feature Scaling

Goal: make different feature dimensions vary on the same scale. With scaled features, which setup is easier to train is obvious at a glance.

Normalization method: for each dimension i, subtract its mean m_i and divide by its standard deviation σ_i:

x_i ← (x_i − m_i) / σ_i
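This standardization takes a few lines of NumPy (the 3×2 data matrix is a made-up example whose second column is 1000× the first):

```python
import numpy as np

# Standardization sketch: for each feature dimension, subtract its mean
# and divide by its standard deviation, so every dimension varies on the
# same scale.
X = np.array([[1.0, 1000.0],
              [2.0, 2000.0],
              [3.0, 3000.0]])   # hypothetical data: column 2 is 1000x column 1

mean = X.mean(axis=0)      # per-feature mean m_i
std = X.std(axis=0)        # per-feature standard deviation sigma_i
X_scaled = (X - mean) / std

print(X_scaled.mean(axis=0))   # each feature now has mean ~0
print(X_scaled.std(axis=0))    # and standard deviation 1
```

After scaling, the loss contours become closer to circular, so the gradient points more directly at the minimum and a single learning rate works for all dimensions.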
Summary: Gradient Descent is the "universal" method for solving optimization problems in machine learning and deep learning, but it also runs into problems of its own, such as local minima and saddle points. We will discuss these in a later post.
Many of the figures and formulas in this column come from Prof. Hung-yi Lee's course at National Taiwan University and from Stanford's cs229, cs231n, and cs224n courses. Our thanks and tribute to these classic courses!
About the author: Wu Qiang, Ph.D., Lanzhou University; Google Developer Expert (GDE, Machine Learning).
CSDN:https://me.csdn.net/dukuku5038
知乎:https://www.zhihu.com/people/Dr.Wu/activities
AI-comics WeChat official account: DayuAI-Founder