To practice plotting with matplotlib, and at the same time to compare how the update step evolves under various optimization algorithms.

First, the best performer here, RMSprop (350 iterations):
```python
import math
import matplotlib.pyplot as plt

def f(x):
    return x**3 - x + x**2

def derivative_f(x):
    return 3 * (x**2) + 2 * x - 1

learning_rate = 0.001
d = 0.9          # decay rate for the running average of squared gradients
e = 1e-8         # epsilon, keeps the denominator away from zero
x = 0.0
Egt = 0.0        # running average E[g^2]
xx, dd, gg, yy = [], [], [], []   # history for plotting

for i in range(100000):
    gradient = derivative_f(x)
    if abs(gradient) < 1e-5:          # converged (the gradient at x=0 is -1,
        print('break at ' + str(i))   # so this cannot trigger on step 0)
        break
    xx.append(x)
    gg.append(gradient)
    # RMSprop: divide the gradient by the RMS of its recent history
    Egt = d * Egt + (1 - d) * gradient**2
    delta = learning_rate * gradient / math.sqrt(Egt + e)
    dd.append(delta)
    x -= delta
    yy.append(f(x))
    print('x = {:6f}, f(x) = {:6f}, gradient = {:6f}'.format(x, f(x), gradient))

fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(xx, dd, label='delta', color='red')
ax2 = ax1.twinx()                     # second y-axis on the same x-axis
ax2.plot(xx, gg, label='gradient', color='blue')
plt.savefig('latex-rms.png', dpi=75)
plt.show()
```
The blue curve is the derivative at each step; the red curve is the update step delta. Note the flat stretch in the red curve: there delta is pinned at exactly learning_rate, because once E[g²] catches up with g², the update learning_rate·g/√(E[g²]+e) reduces to ±learning_rate.
Next, ADADELTA (760 iterations):

The red curve swings back and forth, showing the momentum-like term rising and falling.
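The post only shows the plot for this run, not the code. A minimal sketch of the same experiment with the ADADELTA update, which needs no global learning rate: the step is the gradient scaled by the RMS of past steps over the RMS of past gradients. `d`, `e`, and the stopping threshold are carried over from the RMSprop listing; everything else is an assumption.

```python
import math

def f(x):
    return x**3 - x + x**2

def derivative_f(x):
    return 3 * (x**2) + 2 * x - 1

# ADADELTA: delta = sqrt(E[delta^2] + e) / sqrt(E[g^2] + e) * g
d = 0.9        # decay for both running averages
e = 1e-8       # epsilon; also seeds the very first step
x = 0.0
Egt = 0.0      # running average of squared gradients
Edt = 0.0      # running average of squared steps
for i in range(100000):
    gradient = derivative_f(x)
    if abs(gradient) < 1e-5:   # converged
        break
    Egt = d * Egt + (1 - d) * gradient**2
    delta = math.sqrt(Edt + e) / math.sqrt(Egt + e) * gradient
    Edt = d * Edt + (1 - d) * delta**2
    x -= delta

print('x = {:.6f}'.format(x))
```

Because the first step is on the order of √e, ADADELTA starts very slowly and its steps grow over time, which matches the longer run compared to RMSprop.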
Next, ADAGRAD (1454 iterations):
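No listing is shown for this run either; a sketch under the same setup. ADAGRAD accumulates every past squared gradient, so its steps only ever shrink. The base learning_rate here is a guess (the post does not show the settings used for this run); 0.1 is chosen so the ever-shrinking step still reaches the minimum in a reasonable number of iterations.

```python
import math

def f(x):
    return x**3 - x + x**2

def derivative_f(x):
    return 3 * (x**2) + 2 * x - 1

learning_rate = 0.1   # assumption: not shown in the post
e = 1e-8
x = 0.0
G = 0.0               # sum of ALL squared gradients; never decays
for i in range(100000):
    gradient = derivative_f(x)
    if abs(gradient) < 1e-5:   # converged
        break
    G += gradient**2
    x -= learning_rate * gradient / math.sqrt(G + e)

print('x = {:.6f}, f(x) = {:.6f}'.format(x, f(x)))
```

The only difference from RMSprop is that G is a plain sum instead of a decayed average, which is exactly why ADAGRAD slows down the longer it runs.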
Then Adam (861 iterations):
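A sketch of the Adam variant: it combines RMSprop's squared-gradient average with a first-moment (momentum) average, plus bias correction for both. The β₁/β₂ values are the usual defaults from the Adam paper, not values taken from the post.

```python
import math

def f(x):
    return x**3 - x + x**2

def derivative_f(x):
    return 3 * (x**2) + 2 * x - 1

learning_rate = 0.001
beta1, beta2 = 0.9, 0.999   # assumption: standard defaults
e = 1e-8
x = 0.0
m = v = 0.0
for t in range(1, 100001):
    gradient = derivative_f(x)
    if abs(gradient) < 1e-5:                 # converged
        break
    m = beta1 * m + (1 - beta1) * gradient       # first moment (momentum)
    v = beta2 * v + (1 - beta2) * gradient**2    # second moment (RMSprop part)
    m_hat = m / (1 - beta1**t)                   # bias correction: both averages
    v_hat = v / (1 - beta2**t)                   # start at 0 and need a boost early on
    x -= learning_rate * m_hat / (math.sqrt(v_hat) + e)

print('x = {:.6f}'.format(x))
```

The momentum term is also what makes Adam overshoot and swing back near the minimum, which would show up as oscillation in a delta plot like the ones above.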
Finally, plain gradient descent (3018 iterations):
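For reference, a sketch of the plain version: a fixed step of learning_rate times the raw gradient, with the same learning_rate and stopping threshold as the RMSprop listing. Near the minimum at x = 1/3 the curvature is f''(1/3) = 4, so the error shrinks by roughly a factor of (1 − 4·0.001) per step, which is consistent with this run needing about 3000 iterations.

```python
def f(x):
    return x**3 - x + x**2

def derivative_f(x):
    return 3 * (x**2) + 2 * x - 1

learning_rate = 0.001
x = 0.0
for i in range(100000):
    gradient = derivative_f(x)
    if abs(gradient) < 1e-5:          # converged
        print('break at ' + str(i))
        break
    x -= learning_rate * gradient     # fixed step, no adaptation at all

print('x = {:.6f}'.format(x))
```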