[Notes] Machine Learning - CHANG - 4 - Gradient Descent

Gradient Descent
Gradient descent is an iterative process (in contrast to the closed-form least squares method). The goal is to solve the optimization problem \({\theta}^* = \arg\min_{\theta} L({\theta})\), where \({\theta}\) is a parameter vector and the gradient is the vector of partial derivatives of \(L\) with respect to each parameter. A sketch of the basic update follows.
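
A minimal sketch of the update rule \({\theta}^{t+1} = {\theta}^t - \eta \nabla L({\theta}^t)\), assuming a gradient function `grad_L` and a fixed learning rate `eta` (both names are illustrative, not from the original notes):

```python
import numpy as np

def gradient_descent(grad_L, theta0, eta=0.1, n_iters=100):
    """Vanilla gradient descent: repeatedly step against the gradient of L."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iters):
        theta = theta - eta * grad_L(theta)  # theta^{t+1} = theta^t - eta * grad L(theta^t)
    return theta

# Toy example: L(theta) = ||theta||^2 has gradient 2 * theta and minimizer 0.
print(gradient_descent(lambda th: 2 * th, theta0=[3.0, -4.0]))
```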

To get better results with gradient descent, consider the following tips:

  1. Adjust the learning rate
    At the start, use a larger learning rate so the iterations move quickly; as you approach the target, reduce the learning rate.
    For example, decay it at a \(1/t\) rate: \({\eta}^t = {\eta} / \sqrt{t + 1}\). In addition, different parameters should be given different learning rates (see the sketch after this list).
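
One well-known scheme that combines a \(1/\sqrt{t}\)-style decay with per-parameter learning rates is Adagrad; the notes do not name it explicitly, so this is a hedged sketch of that technique, with `grad_L` and the other names being illustrative:

```python
import numpy as np

def adagrad(grad_L, theta0, eta=0.1, n_iters=100, eps=1e-8):
    """Per-parameter learning rates (Adagrad-style): each parameter's step is
    divided by the root of its own accumulated squared gradients, so parameters
    that have seen large gradients take smaller steps over time."""
    theta = np.asarray(theta0, dtype=float)
    acc = np.zeros_like(theta)                 # running sum of squared gradients, per parameter
    for t in range(n_iters):
        g = grad_L(theta)
        acc += g ** 2
        theta -= eta / np.sqrt(acc + eps) * g  # effective rate decays roughly like 1/sqrt(t)
    return theta

# Same toy loss as before: each coordinate gets its own shrinking step size.
print(adagrad(lambda th: 2 * th, theta0=[3.0, -4.0]))
```

If every gradient had the same magnitude, the accumulated-square denominator would reduce to the \({\eta} / \sqrt{t + 1}\) decay given above; the per-parameter accumulation is what lets differently-scaled parameters get different rates.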


Origin www.cnblogs.com/yanqiang/p/11325772.html