Machine Learning, Chapter 5 study summary: using gradient descent to find the minimum of the cost function


Gradient descent

Before regularization, when minimizing the cost function of linear regression, we iterate over the θ values with the following update rule:

θ_j := θ_j − α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)    (j = 0, 1, …, n)
We know that the regularized cost function penalizes θ_1 through θ_n but leaves θ_0 alone. So we separate θ_0 out of the iterative update, giving:

θ_0 := θ_0 − α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_0^(i)

θ_j := θ_j − α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)    (j = 1, …, n)

In fact nothing has changed yet; we have merely written the θ_0 update separately, and the θ_j updates now run from j = 1 to n. To minimize the regularized objective function with this method, we add one extra term, (λ/m)·θ_j, to the θ_j update:

θ_j := θ_j − α · [ (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i) + (λ/m) · θ_j ]    (j = 1, …, n)
Rearranging this equation, we obtain:

θ_j := θ_j · (1 − α·λ/m) − α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)
Since λ/m is a very small number, 1 − α·(λ/m) is slightly less than 1, typically something like 0.99. So on each update θ_j first shrinks to roughly 0.99 times its previous value, becoming just a little smaller, and then the usual gradient step is applied.
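The update above can be sketched in NumPy. This is a minimal illustration, not code from the course; the function name and data layout (first column of X all ones for θ_0) are my own assumptions:

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent step for linear regression.

    X: (m, n+1) design matrix whose first column is all ones,
    y: (m,) target vector, alpha: learning rate, lam: lambda.
    theta[0] (the bias term) is deliberately not penalized.
    """
    m = X.shape[0]
    error = X @ theta - y          # h_theta(x^(i)) - y^(i) for every sample
    grad = (X.T @ error) / m       # (1/m) * sum of error * x_j
    reg = (lam / m) * theta        # the added (lambda/m) * theta_j term
    reg[0] = 0.0                   # do not shrink theta_0
    return theta - alpha * (grad + reg)
```

Note that with `lam = 0` this reduces exactly to the unregularized update, which matches the observation that separating out θ_0 changes nothing by itself.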
For the summation in the second term (the partial derivative of the unregularized cost), you can see another article I wrote:

https://blog.csdn.net/Ace_bb/article/details/103996097

The normal equation

Assume we have a set of data with n features and m samples. All the samples together form a design matrix X of dimension m × (n+1), where each row is one sample with x_0 = 1 prepended. The prediction target of each sample forms an m-dimensional vector y.
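As a concrete illustration (the raw numbers here are made up for the example), the m × (n+1) design matrix can be built by prepending a column of ones:

```python
import numpy as np

# Hypothetical raw data: m = 4 samples, n = 2 features.
features = np.array([[2.0, 3.0],
                     [1.0, 0.5],
                     [4.0, 1.0],
                     [3.0, 2.0]])
y = np.array([10.0, 4.0, 9.0, 8.0])

m, n = features.shape
# Prepend the bias column x_0 = 1, giving an m x (n+1) design matrix.
X = np.hstack([np.ones((m, 1)), features])
print(X.shape)  # (4, 3): m rows, n+1 columns
```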
Our goal is to find the vector θ that minimizes the cost function J(θ). When λ > 0, θ can be computed directly with the following formula:

θ = (XᵀX + λ·L)⁻¹ · Xᵀ · y
Here L is an (n+1) × (n+1) matrix whose first diagonal entry is 0, whose remaining diagonal entries are all 1, and whose off-diagonal entries are all 0 — the identity matrix with its top-left entry zeroed, so that θ_0 is not regularized.

When m < n, XᵀX may be singular (non-invertible), so the unregularized normal equation cannot be used. With λ > 0, however, the matrix XᵀX + λ·L is invertible, so the regularized formula works even when m < n.
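The regularized normal equation can be sketched as follows. This is my own illustrative implementation, assuming a NumPy design matrix X whose first column is all ones; it uses `np.linalg.solve` rather than an explicit matrix inverse, which is numerically preferable but computes the same θ:

```python
import numpy as np

def normal_equation(X, y, lam):
    """Closed-form regularized solution: theta = (X^T X + lam*L)^(-1) X^T y.

    X: (m, n+1) design matrix (first column all ones), y: (m,) targets.
    L is the (n+1) x (n+1) identity matrix with its top-left entry
    zeroed, so the bias theta_0 is not regularized.
    """
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0                  # do not penalize theta_0
    # Solve (X^T X + lam*L) theta = X^T y instead of inverting explicitly.
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

With `lam > 0` the system being solved is invertible even when m < n, which is exactly the case where the unregularized version fails.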

------------------------------
Lecture images are from Andrew Ng's course:
https://www.bilibili.com/video/av9912938?p=43



Originally published at blog.csdn.net/Ace_bb/article/details/104073472