Least Squares

1. The origin of the least squares method

In 1805, the French mathematician Legendre published the first clear and concise exposition of the method of least squares.

In 1809, the German mathematician Gauss published "Theory of the Motion of Celestial Bodies" and claimed to have been using the method of least squares since 1795, which led to a priority dispute with Legendre.

In 1829, Gauss proved that among all linear unbiased estimators the least squares estimator has the smallest variance, i.e., that least squares is optimal in this sense (the Gauss-Markov theorem).

 

2. Parameter estimation - least squares method (normal equation)

1. Univariate Linear Regression

      For a univariate linear regression model Y_i = \beta_0 + \beta_1 X_i + e_i, where e_i represents the error term, the fitted sample regression line is \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i.

      Suppose n sets of observations (X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n) are drawn from the population. Infinitely many curves could be fitted through these n points in the plane, and the sample regression function should fit them as well as possible. Intuitively, the most reasonable line runs through the center of the sample data, so the criterion for selecting the best-fitting line is to minimize the total fitting error (i.e., the total residual). There are three candidate criteria:

        (1) Determine the position of the line by minimizing the sum of residuals. However, positive and negative residuals cancel each other out, so a poor fit can still have a residual sum of zero (see the numeric example after this list).
        (2) Determine the position of the line by minimizing the sum of the absolute values of the residuals. This avoids cancellation, but absolute values are awkward to work with analytically.
        (3) The principle of the least squares method: determine the position of the line by minimizing the sum of squared residuals. Besides being convenient to compute, the least squares estimator has excellent statistical properties, although the method is sensitive to outliers.
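
        The cancellation problem in criterion (1) is easy to see numerically. Below is a minimal sketch (assuming NumPy is available; the data values are invented for illustration) comparing the three criteria for a deliberately bad fit whose residual sum is still zero:

    import numpy as np

    # Toy data lying exactly on the line y = 2x.
    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = 2.0 * x

    # A deliberately bad fit: a horizontal line through the mean of y.
    y_hat = np.full_like(y, y.mean())
    residuals = y - y_hat            # [-3, -1, 1, 3]

    print(residuals.sum())           # 0.0  -> criterion (1) cannot detect the bad fit
    print(np.abs(residuals).sum())   # 8.0  -> criterion (2) does
    print((residuals ** 2).sum())    # 20.0 -> criterion (3), the least squares criterion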

      The most commonly used criterion is Ordinary Least Squares (OLS): choose the regression model that minimizes the sum of squared residuals over all observations.

                                     In mathematical terms, Q denotes the residual sum of squares:

                                     Q = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \big(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\big)^2

      To minimize Q, find its extremum: treat \hat{\beta}_0 and \hat{\beta}_1 as the variables, take the partial derivative of Q with respect to each, and set it to zero:

                                     \frac{\partial Q}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} \big(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\big) = 0
                                     \frac{\partial Q}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} X_i \big(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\big) = 0

      Solving these two equations simultaneously gives:

                                 \hat{\beta}_1 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}

      These estimates determine the regression model. Finally, substituting the observed data into the fitted line gives the predictions we want.
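
      As a concrete illustration of the closed-form estimates above, here is a minimal sketch (assuming NumPy is available; the sample data and the helper name fit_univariate_ols are invented for this example):

    import numpy as np

    def fit_univariate_ols(x, y):
        # Closed-form OLS estimates for y = b0 + b1 * x + e (illustrative helper).
        x_bar, y_bar = x.mean(), y.mean()
        b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
        b0 = y_bar - b1 * x_bar
        return b0, b1

    # Invented sample data, roughly following y = 1 + 2x plus noise.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])
    b0, b1 = fit_univariate_ols(x, y)
    print(b0, b1)   # estimates close to 1 and 2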

 

2. Multiple Linear Regression

     Extending to the more general case with model variables x_1, x_2, \ldots, x_n, the model can be expressed as the linear function

                                y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n

     For m samples, the model can be written as the following system of linear equations:

                             \beta_0 + \beta_1 x_{11} + \beta_2 x_{12} + \cdots + \beta_n x_{1n} = y_1
                             \beta_0 + \beta_1 x_{21} + \beta_2 x_{22} + \cdots + \beta_n x_{2n} = y_2
                             \vdots
                             \beta_0 + \beta_1 x_{m1} + \beta_2 x_{m2} + \cdots + \beta_n x_{mn} = y_m

    If the sample values x_{ij} (together with a leading column of ones for the constant term) are collected into a matrix A, the parameters into a vector \beta, and the observed values into a vector Y, the linear system above can be written compactly in matrix form, that is

                                A\beta = Y

    The least squares problem can then be expressed in matrix form as

                           \min_{\beta} \|A\beta - Y\|_2^2          (the vector 2-norm is used here)

    Here m ≥ n; because the constant term is included, the number of attribute columns grows from n to n+1, so A is an m x (n+1) matrix.
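
    As a small illustration of this setup, the sketch below (assuming NumPy is available; the sample values are invented) builds the design matrix A with its leading column of ones:

    import numpy as np

    # Invented sample data: m = 5 samples, n = 2 attributes.
    X = np.array([[1.0, 2.0],
                  [2.0, 0.5],
                  [3.0, 1.5],
                  [4.0, 3.0],
                  [5.0, 2.5]])
    Y = np.array([5.0, 4.0, 7.5, 11.0, 11.5])

    # Prepend a column of ones for the constant term: A has n + 1 = 3 columns.
    A = np.column_stack([np.ones(len(X)), X])
    print(A.shape)   # (5, 3), i.e. m rows and n + 1 columns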

 

    The process of solving this minimization problem is as follows:

                            J(\beta) = \|A\beta - Y\|_2^2 = (A\beta - Y)^{T}(A\beta - Y)

                         = \beta^{T}A^{T}A\beta - \beta^{T}A^{T}Y - Y^{T}A\beta + Y^{T}Y

     The two middle terms in the expansion above are scalars and are transposes of each other, so they are equal (each equals \beta^{T}A^{T}Y). Using this, differentiate with respect to the vector \beta:

                                    \frac{\partial J(\beta)}{\partial \beta} = 2A^{T}A\beta - 2A^{T}Y          (1)

Setting this derivative equal to 0 and solving for \beta gives:

                                             A^{T}A\beta = A^{T}Y, \quad \text{i.e.,} \quad \beta = (A^{T}A)^{-1}A^{T}Y          (2)
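
As a quick numerical check of formula (2), here is a minimal sketch (assuming NumPy is available; the data are randomly generated for the example) that solves the normal equations directly and compares the result with NumPy's built-in least squares routine:

    import numpy as np

    rng = np.random.default_rng(0)

    # Invented data: m = 50 samples, n = 2 attributes, true parameters (1, 2, -3).
    m, n = 50, 2
    X = rng.normal(size=(m, n))
    A = np.column_stack([np.ones(m), X])       # m x (n + 1) design matrix
    Y = A @ np.array([1.0, 2.0, -3.0]) + 0.1 * rng.normal(size=m)

    # Normal equations, formula (2): solve (A^T A) beta = A^T Y.
    beta = np.linalg.solve(A.T @ A, A.T @ Y)

    # Cross-check with NumPy's SVD-based least squares solver.
    beta_lstsq, *_ = np.linalg.lstsq(A, Y, rcond=None)
    print(beta)
    print(np.allclose(beta, beta_lstsq))       # True

In practice, forming A^{T}A can be ill-conditioned, so library routines such as np.linalg.lstsq are usually preferred over inverting A^{T}A explicitly.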

Note: Formula (1) follows from the standard vector-matrix derivative identities

                                \frac{\partial}{\partial \beta}\big(\beta^{T}A^{T}A\beta\big) = 2A^{T}A\beta, \qquad \frac{\partial}{\partial \beta}\big(\beta^{T}A^{T}Y\big) = \frac{\partial}{\partial \beta}\big(Y^{T}A\beta\big) = A^{T}Y

Reference: https://blog.csdn.net/qll125596718/article/details/8248249
