(notes) logistic regression

  Logistic regression is not linear regression: linear regression predicts a continuous value, while logistic regression is a classification algorithm. What is regression? Given some data points, fitting a straight line through them is called regression. Logistic regression establishes a regression equation for the boundary between classes and uses it to classify. Training a logistic regression classifier means finding the best-fit parameters.

Sigmoid function

σ(x) = 1 / (1 + e^(−x))

When x is 0, the function value is 0.5; as x increases, the value approaches 1, and as x decreases, the value approaches 0.

Using the formula z = w0·x0 + w1·x1 + w2·x2 + ... + wn·xn, each of the n feature values is multiplied by a corresponding coefficient and the products are summed. Substituting z into the sigmoid function, outputs greater than 0.5 are assigned to one class and outputs less than 0.5 to the other. We need to find this set of coefficients w such that multiplying them by the feature values and substituting into the sigmoid function outputs the correct classification. This is where gradient ascent comes in.
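As a small sketch of the idea above, the snippet below computes z as the weighted sum of features, passes it through the sigmoid, and thresholds at 0.5. The feature vector and coefficients here are made-up illustrative values, not from the original text:

```python
import numpy as np

def sigmoid(z):
    """Map any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical feature vector and coefficients (illustrative values only).
# x0 is often fixed at 1.0 to act as the intercept term.
x = np.array([1.0, 2.0, -1.5])
w = np.array([0.5, 0.8, 0.3])

z = np.dot(w, x)                 # z = w0*x0 + w1*x1 + ... + wn*xn
prob = sigmoid(z)                # value in (0, 1)
label = 1 if prob > 0.5 else 0   # threshold at 0.5
print(prob, label)
```

With these particular numbers z is positive, so the sigmoid output lands above 0.5 and the point is assigned to class 1.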

Gradient Ascent

If there is a function f(x, y), its gradient is the vector of partial derivatives (∂f/∂x, ∂f/∂y), which gives the direction of steepest increase: moving along the x direction by ∂f/∂x and along the y direction by ∂f/∂y. Gradient ascent repeatedly takes a small step in that direction.

For logistic regression, the n features are multiplied by the corresponding coefficients w and passed through the sigmoid function; the output is compared with the actual label, and the difference serves as the direction of the update. Processing all N data points at once, multiplying the transposed feature matrix by the vector of differences tells us whether each element of the coefficient vector w should increase or decrease.

The update rule is w := w + α · Xᵀ(y − h), where w is the coefficient vector, α is the step size, X is the feature matrix, y is the vector of actual labels, and h is the vector of sigmoid outputs. Looping over this formula yields the final coefficients.

#gradient ascent
from numpy import exp, mat, ones, shape

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def gradAscent(dataMatIn, classLabels):
    dataMatrix = mat(dataMatIn)              #convert input data to an m*n matrix
    labelMat = mat(classLabels).transpose()  #convert 1*n label array to n*1 matrix
    m, n = shape(dataMatrix)
    alpha = 0.001                            #step size
    maxCycles = 500                          #number of iterations of the weight loop
    weights = ones((n, 1))                   #initialize every weight to 1
    for k in range(maxCycles):
        h = sigmoid(dataMatrix * weights)    #sigmoid of the weighted sums, m*1
        error = labelMat - h                 #difference between labels and outputs
        weights = weights + alpha * dataMatrix.transpose() * error  #gradient ascent step
    return weights
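As a quick sanity check, the same update rule can be run on a tiny hand-made dataset. The version below is a plain-NumPy-array re-implementation (arrays and `@` instead of `numpy.matrix`), and the four data points and labels are invented for this sketch, not from the original text:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Same update rule as gradAscent above, written with plain arrays.
def grad_ascent(data, labels, alpha=0.001, max_cycles=500):
    X = np.asarray(data, dtype=float)                    # m x n feature matrix
    y = np.asarray(labels, dtype=float).reshape(-1, 1)   # m x 1 label column
    w = np.ones((X.shape[1], 1))                         # initialize weights to 1
    for _ in range(max_cycles):
        h = sigmoid(X @ w)                               # predictions, m x 1
        w = w + alpha * X.T @ (y - h)                    # w := w + alpha * X^T (y - h)
    return w

# Tiny illustrative dataset: the first column is the constant 1 (intercept),
# the second column is the single real feature. Class 1 has larger values.
data = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.5], [1.0, 2.5]]
labels = [0, 0, 1, 1]

w = grad_ascent(data, labels)
probs = sigmoid(np.asarray(data) @ w).ravel()
print(probs)  # class-0 points should score below 0.5, class-1 points above
```

After 500 iterations the weight on the feature grows while the intercept shrinks, so the two class-0 points fall below the 0.5 threshold and the two class-1 points stay above it.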

 

Summary

The purpose of logistic regression is to obtain the best-fit parameters of a sigmoid function, and these parameters can be found with the gradient ascent method.
