Normal equation (normal equations)

  • The normal equation is obtained by setting the partial derivatives of the cost function to zero and solving for the parameters that minimize it:

\[ \frac{\partial}{\partial\theta_j}J\left(\theta\right)=0 \]

  • Assume the training set features form the matrix \(X\) (including the column \(x_0 = 1\)) and the training set targets form the vector \(y\). The normal equation then solves for the parameter vector in closed form (a short derivation is sketched after this list):
    \[ \theta = {{\left( {X^T}X \right)}^{-1}}{X^T}y \]
  • Comparison of gradient descent with the normal equation:
    • Gradient descent: a learning rate \(\alpha\) must be chosen; many iterations are required; it still works well when the number of features n is large; it applies to all kinds of models.
    • Normal equation: no learning rate \(\alpha\) to choose; no iteration, the optimal \(\theta\) is obtained in a single computation; it requires computing \({{\left( {X^T}X \right)}^{-1}}\), which becomes expensive when the number of features n is large because inverting the matrix has time complexity \(O(n^3)\) (usually acceptable while n is under 10000); it only applies to the linear regression model and is not suitable for other models such as logistic regression.
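To see where the closed form comes from, here is a brief sketch, assuming the usual least-squares cost \(J\left(\theta\right)=\frac{1}{2m}{\left(X\theta-y\right)}^T\left(X\theta-y\right)\) (not restated in the notes above). Setting the gradient to zero and solving for \(\theta\) gives

\[ \nabla_\theta J\left(\theta\right)=\frac{1}{m}{X^T}\left(X\theta-y\right)=0 \quad\Rightarrow\quad {X^T}X\theta={X^T}y \quad\Rightarrow\quad \theta={{\left({X^T}X\right)}^{-1}}{X^T}y \]

provided \({X^T}X\) is invertible.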

Programming

Implementation in Programming Exercise 1.1, based on univariate linear regression:

# Normal equation
import numpy as np

def normalEqn(X, y):
    # X.T @ X is equivalent to X.T.dot(X); np.linalg.inv() computes the matrix inverse
    theta = np.linalg.inv(X.T @ X) @ X.T @ y
    return theta

final_theta2 = normalEqn(X, y)  # slightly different from the theta found by batch gradient descent
final_theta2
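As a quick sanity check (a hypothetical example, not part of the exercise: the synthetic data, the column of ones, and the cross-check with np.linalg.lstsq below are assumptions added here), normalEqn can be run on made-up data and compared against NumPy's built-in least-squares solver:

import numpy as np

# hypothetical synthetic data, not from the exercise
np.random.seed(0)
m = 50
x1 = np.random.rand(m)                                 # a single feature
y_demo = 3.0 + 2.0 * x1 + 0.01 * np.random.randn(m)    # targets with a little noise
X_demo = np.column_stack([np.ones(m), x1])             # prepend the x_0 = 1 column

print(normalEqn(X_demo, y_demo))                       # expected to be close to [3.0, 2.0]

# cross-check against NumPy's least-squares solver
theta_lstsq, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)
print(theta_lstsq)

In practice, np.linalg.solve(X.T @ X, X.T @ y) or np.linalg.pinv is usually preferred over an explicit inverse for numerical stability, but the explicit form above matches the formula in the notes.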

After the gradient descent algorithm run earlier finished, we output the \(\theta\) values as follows:

As can be seen, the \(\theta\) values obtained by the two methods are essentially the same.
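For reference, the batch gradient descent side of that comparison looks roughly like this (a sketch only; the update rule is the standard least-squares one, and the learning rate alpha, iteration count, and zero starting point are assumed values, not the exercise's exact settings):

def gradientDescent(X, y, theta, alpha=0.01, iters=1000):
    # batch gradient descent: theta := theta - (alpha/m) * X^T (X theta - y)
    m = len(y)
    for _ in range(iters):
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
    return theta

final_theta1 = gradientDescent(X, y, np.zeros(X.shape[1]))
final_theta1  # expected to be close to final_theta2 from normalEqn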
