Rate-of-change curves of gradient descent algorithms

I wanted to practice plotting with matplotlib, and at the same time see the rate-of-change curves produced by various gradient descent optimization algorithms.

First, look at the best performer, RMSprop (about 350 iterations to converge):

import math
import matplotlib.pyplot as plt

def f(x):
    return x**3 - x + x**2

def derivative_f(x):
    return 3*(x**2) + 2*x - 1


x = 0.0                  # starting point
y = 0.0
learning_rate = 0.001
gradient = 0.0
e = 0.00000001           # epsilon, keeps the denominator non-zero
d = 0.9                  # decay rate of the running average

Egt = 0.0                # running average of the squared gradient, E[g^2]
delta = 0.0              # update step applied to x

xx = []                  # positions visited
dd = []                  # update steps (delta)
gg = []                  # raw gradients
yy = []                  # function values

for i in range(100000):
    print('x = {:6f}, f(x) = {:6f}, gradient = {:6f}'.format(x, y, gradient))
    # stop once |gradient| falls into the (1e-6, 1e-5) window; the lower
    # bound keeps the very first pass (gradient still 0) from breaking out
    if 0.000001 < abs(gradient) < 0.00001:
        print("break at " + str(i))
        break
    xx.append(x)

    gradient = derivative_f(x)
    gg.append(gradient)

    # RMSprop: exponentially decayed average of squared gradients,
    # then scale the step by 1/sqrt of that average
    Egt = d * Egt + (1 - d) * (gradient ** 2)
    delta = learning_rate * gradient / math.sqrt(Egt + e)
    dd.append(delta)

    x = x - delta
    y = f(x)
    yy.append(y)


fig = plt.figure()

ax1 = fig.add_subplot(111)
ax1.plot(xx, dd, label='delta', color='red')
ax1.legend(loc='upper left')

ax2 = ax1.twinx()        # second y axis sharing the same x axis
ax2.plot(xx, gg, label='gradient', color='blue')
ax2.legend(loc='upper right')

plt.savefig('latex-rms.png', dpi=75)
plt.show()


The blue curve is the gradient (derivative) at each step, and the red curve is the update step delta. Notice that the red curve has a flat plateau: once the gradient is changing slowly, the running average Egt approaches gradient**2, so delta is approximately learning_rate times the sign of the gradient, which means the plateau sits exactly at learning_rate.
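As a quick sanity check (my own sketch, not from the original post): if the gradient sits at a roughly constant value, Egt converges to gradient**2 and the step size collapses to learning_rate, no matter how large the gradient is.

import math

g = 2.0                        # pretend the gradient is stuck at 2.0
Egt = 0.0
for _ in range(100):
    Egt = 0.9 * Egt + 0.1 * g**2          # Egt -> g**2 = 4.0
delta = 0.001 * g / math.sqrt(Egt + 1e-8)
print(delta)                   # ~0.001, i.e. the learning_rate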


Next, look at AdaDelta (about 760 iterations):

The red curve (the update step) swings back and forth across the bottom of the bowl, showing the momentum-like overshoot and correction as AdaDelta's accumulated step size rises and falls.
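The post does not include the AdaDelta listing, so the following is only my reconstruction of how the update loop above would change. AdaDelta keeps a second running average of the squared updates and needs no learning rate; d, e and the stopping rule are carried over from the RMSprop code, so the count here may not land exactly on the 760 reported.

import math

def derivative_f(x):
    return 3*(x**2) + 2*x - 1

x = 0.0
d, e = 0.9, 0.00000001
Egt = 0.0        # running average of squared gradients, E[g^2]
Edt = 0.0        # running average of squared updates,   E[dx^2]
gradient = 0.0

for i in range(100000):
    if 0.000001 < abs(gradient) < 0.00001:
        print("break at " + str(i))
        break
    gradient = derivative_f(x)
    Egt = d * Egt + (1 - d) * gradient**2
    # no learning rate: the step is the ratio RMS[dx] / RMS[g], times the gradient
    delta = math.sqrt(Edt + e) / math.sqrt(Egt + e) * gradient
    Edt = d * Edt + (1 - d) * delta**2
    x = x - delta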


Then AdaGrad (about 1454 iterations):
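Again only a sketch of my own, not the author's code: AdaGrad replaces the decayed average with a plain running sum of squared gradients, so the step size can only shrink over time. The post does not show its settings; I bumped learning_rate to 0.01 here so the sketch converges in a reasonable number of steps, which means the count will not match the 1454 above.

import math

def derivative_f(x):
    return 3*(x**2) + 2*x - 1

x = 0.0
learning_rate = 0.01      # assumed value, not from the post
e = 0.00000001
G = 0.0                   # running SUM (not average) of squared gradients
gradient = 0.0

for i in range(100000):
    if 0.000001 < abs(gradient) < 0.00001:
        print("break at " + str(i))
        break
    gradient = derivative_f(x)
    G = G + gradient**2
    delta = learning_rate * gradient / math.sqrt(G + e)
    x = x - delta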


Then Adam (about 861 iterations):
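Adam combines the two ideas: a decayed average of the gradients themselves (momentum) and a decayed average of their squares (the RMSprop denominator), each with a bias correction. This is again my own minimal sketch with the usual defaults b1 = 0.9, b2 = 0.999; the post does not show the author's Adam code, so the 861 count is not guaranteed to reproduce.

import math

def derivative_f(x):
    return 3*(x**2) + 2*x - 1

x = 0.0
learning_rate, e = 0.001, 0.00000001
b1, b2 = 0.9, 0.999
m, v = 0.0, 0.0           # 1st and 2nd moment estimates
gradient = 0.0

for i in range(1, 100000):
    if 0.000001 < abs(gradient) < 0.00001:
        print("break at " + str(i))
        break
    gradient = derivative_f(x)
    m = b1 * m + (1 - b1) * gradient          # mean of recent gradients
    v = b2 * v + (1 - b2) * gradient**2       # mean of recent squared gradients
    m_hat = m / (1 - b1**i)                   # bias correction for the cold start
    v_hat = v / (1 - b2**i)
    x = x - learning_rate * m_hat / (math.sqrt(v_hat) + e)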


Finally, the original (vanilla) gradient descent (about 3018 iterations):
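For reference, the plain update is just x minus learning_rate times the gradient, with no adaptation at all. A minimal sketch, assuming the same learning_rate = 0.001 and the same stopping rule as the RMSprop listing, which should land in the same ballpark as the count above.

def derivative_f(x):
    return 3*(x**2) + 2*x - 1

x = 0.0
learning_rate = 0.001
gradient = 0.0

for i in range(100000):
    if 0.000001 < abs(gradient) < 0.00001:
        print("break at " + str(i))
        break
    gradient = derivative_f(x)
    x = x - learning_rate * gradient      # fixed step size, nothing adaptive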
