Supervised Learning Applications and Gradient Descent

Main content:
Linear regression
Gradient descent
The normal equations

Linear regression:
Example problem: predicting housing prices

Notation:
m: number of training examples
x: input variable, or feature
y: output variable, or target variable
(x, y): a training example
(xi, yi): the i-th training example

Typical workflow:
training set -> learning algorithm -> hypothesis h (the prediction function)

The prediction function takes the form:
h(x) = a0 + a1*x1
h(x) = ha(x) = a0 + a1*x1 + a2*x2
...
Define x0 = 1, so in general:
h(x) = a0*x0 + a1*x1 + ... + an*xn
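
As a small illustration (not part of the original notes): once x0 = 1 is prepended, the hypothesis is just a dot product between the parameter vector a and the feature vector x. A minimal sketch in Python/NumPy, with arbitrarily chosen example values:

import numpy as np

a = np.array([1.0, 0.5, 2.0])            # parameter vector (a0, a1, a2), values chosen arbitrarily

def h(a, x):
    # Hypothesis h_a(x) = a0*x0 + a1*x1 + ... + an*xn, with x0 = 1 prepended.
    x = np.concatenate(([1.0], x))
    return a @ x

print(h(a, np.array([3.0, 4.0])))        # 1.0 + 0.5*3 + 2.0*4 = 10.5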

Which prediction function is optimal? The one whose parameters a make the following cost function minimal:
J(a) = ( ha(x1) - y1 )^2 + ( ha(x2) - y2 )^2
       + ... + ( ha(xi) - yi )^2 + ...
       + ( ha(xm) - ym )^2
i.e. the sum of squared errors over all m training examples.
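
A minimal sketch of this sum-of-squared-errors cost in Python/NumPy (the tiny training set below is invented purely for illustration):

import numpy as np

def cost(a, X, y):
    # J(a) = sum over i of ( h_a(x_i) - y_i )^2, where each row of X is one training example.
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend x0 = 1 to every example
    residuals = X1 @ a - y                           # h_a(x_i) - y_i for every i
    return np.sum(residuals ** 2)

# Invented data: 3 examples, 1 feature (e.g. house size -> price)
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
print(cost(np.array([0.0, 2.0]), X, y))              # a perfect fit, so J = 0.0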

How do we minimize this function?
Gradient descent:
Intuitively, if we plot J(a) over all possible parameter values, we get a surface. We pick a point at random and walk downhill until we reach the lowest point; when we can go no lower, the algorithm has converged, and that point is the optimal solution.
Roughly, the algorithm works like this: standing at the current point, compute the partial derivatives of J, which give the direction of steepest descent; take a step in that direction, then compute the partial derivatives again and repeat.
We treat the parameters a as a vector (a0, a1, a2, ..., ai, ..., an), initialize it to (0, 0, ..., 0), and then keep updating this vector (each step changes every component: aj := aj - alpha * dJ(a)/daj, where alpha is the learning rate) until the above cost function J(a) stops decreasing.
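
A rough sketch of that loop in Python/NumPy, using the standard update aj := aj - alpha * dJ(a)/daj for the squared-error cost above (the learning rate, iteration count, and data are all assumptions made for illustration):

import numpy as np

def gradient_descent(X, y, alpha=0.01, iterations=1000):
    # Minimize J(a) = sum over i of ( h_a(x_i) - y_i )^2 by batch gradient descent.
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])    # prepend x0 = 1 to every example
    a = np.zeros(X1.shape[1])                         # initialize a = (0, 0, ..., 0)
    for _ in range(iterations):
        residuals = X1 @ a - y                        # h_a(x_i) - y_i for every i
        grad = 2 * X1.T @ residuals                   # partial derivatives of J with respect to each aj
        a = a - alpha * grad                          # step in the steepest-descent direction
    return a

X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
print(gradient_descent(X, y))                         # converges toward (0, 2), i.e. price = 2 * size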

