Difference Between Logistic Regression and Linear Regression [machine learning notes]


 

Logistic Regression

  • Used for classification.
  • Besides binary classification, it can also solve multi-class problems.
  • Logistic Regression output is discrete. For example, predicting tomorrow's weather: overcast, sunny, or rainy. Classification attaches a label to something, and the result is usually a discrete value; for example, deciding whether the animal in a picture is a cat or a dog. Classification is usually built on top of regression, and the last layer of a classifier typically uses a softmax function to decide the category. Classification has no notion of approximation: in the end there is only one correct answer, and a wrong answer is simply wrong; there is no notion of "close". The most common classification method is logistic regression, also called logistic classification.
  • Logistic Regression still belongs to the family of linear regression, because its decision boundary is linear; Logistic Regression is a generalized linear model (GLM), also called a log-linear model.
  • Logistic Regression maximizes the log-likelihood, so gradient ascent moves in the positive gradient direction; but to stay consistent with linear regression, one often minimizes the negative log-likelihood instead.
  • In general, LR refers to Logistic Regression rather than Linear Regression.
  • Logistic Regression and Softmax Regression are very popular choices for classification, because they are simple, easy to implement, effective, and easy to interpret; besides classification, they can also be used in recommender systems.
  • The activation function is the sigmoid, which maps the real line into the interval (0, 1); logistic regression can be understood as linear regression normalized by a sigmoid. Parameter estimation in logistic regression amounts to solving for these unknown weights. To predict which class an unknown sample x belongs to, just plug it into the fitted sigmoid function; the simplest decision rule is: if the value lies between 0.5 and 1, the sample belongs to class 1, otherwise it belongs to class 0.
     

The idea behind using Logistic Regression for multi-class classification is: choose one class as the positive samples and treat all other samples as negative to build a binary classification model; build one such binary model per class (as many models as there are classes); then compare the output values of all the binary models on a sample and assign it to the class whose model produces the largest output.
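This one-vs-rest scheme can be sketched in a few lines of pure Python. The per-class weight vectors below are hypothetical toy values standing in for trained binary logistic models; they are not from the original post:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def one_vs_rest_predict(x, weights_per_class):
    """Score x with each binary model and return the class
    whose model produces the largest output."""
    scores = {}
    for label, w in weights_per_class.items():
        # each binary model: sigmoid of the linear combination w . x
        z = sum(wi * xi for wi, xi in zip(w, x))
        scores[label] = sigmoid(z)
    return max(scores, key=scores.get)

# Toy weights for three classes (hypothetical values for illustration)
weights = {
    "cat": [2.0, -1.0],
    "dog": [-1.0, 2.0],
    "bird": [0.5, 0.5],
}
print(one_vs_rest_predict([3.0, 0.0], weights))  # the "cat" model scores highest here
```

Note that the individual sigmoid outputs do not sum to 1 across classes; Softmax Regression addresses exactly that by normalizing over all classes at once.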

Linear Regression

  • Regression solves prediction problems and is usually used to predict a value, such as house prices or future weather conditions. For example, if the actual price of a product is 500 yuan and the regression analysis predicts 499 yuan, we consider it a good regression analysis. The regression output is an approximation of the true value.
  • The samples may be nonlinear; as long as the model is linear in its parameters, linear regression can be used. The model has the form y = w'x + e, where e is a normally distributed error with mean 0. Whether x itself enters linearly does not matter, but sometimes feature selection is needed.
  • Linear Regression minimizes the negative log-likelihood (equivalent to least squares), so gradient descent moves in the negative gradient direction.
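The last bullet can be made concrete with a minimal gradient-descent fit of y = w*x + b on toy data, minimizing squared error by moving in the negative gradient direction (the toy data and learning rate are assumptions for illustration, not from the original post):

```python
# Minimal gradient descent for linear regression y = w*x + b,
# minimizing mean squared error.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated by y = 2x + 1

w, b = 0.0, 0.0
lr = 0.05
for _ in range(2000):
    n = len(xs)
    # gradients of mean squared error with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    w -= lr * grad_w   # step in the negative gradient direction
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # recovers values close to 2.0 and 1.0
```

Minimizing squared error here is equivalent to maximizing the likelihood under the Gaussian-noise assumption e ~ N(0, sigma^2) mentioned above.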

Logistic and Softmax Regression 

Reference blog post:

https://blog.csdn.net/danieljianfeng/article/details/41901063?depth_1-utm_source=distribute.pc_relevant.none-task&utm_source=distribute.pc_relevant.none-task

The basic concept of Logistic Regression

Logistic Regression is a generalized linear regression model, commonly used in data mining, economic forecasting and other fields.

Logistic Regression is essentially a binary classification method: a supervised two-class model based on the Sigmoid function (also known as the "S-shaped function").

The Sigmoid function formula is:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

(Note its derivative form, which will be used later: $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$.)

The Sigmoid function has an S-shaped curve that compresses the real line into the range between 0 and 1.
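The sigmoid and its derivative can be checked numerically with a few lines of Python (a small sketch, not part of the original post):

```python
import math

def sigmoid(z):
    """S-shaped function mapping the real line into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    """Derivative in the form used later: sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

print(sigmoid(0.0))             # 0.5 -- the curve's midpoint
print(sigmoid_derivative(0.0))  # 0.25 -- the slope is steepest at z = 0
print(sigmoid(10.0))            # close to 1: large inputs are squashed toward 1
```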

We know that supervised classification problems come with training samples carrying class labels, so each pair $(x^{(i)}, y^{(i)})$ corresponds to one sample in the training set. The sample information is usually expressed as a linear combination of a set of features, i.e.

$$z = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$$

where $x_1, \ldots, x_n$ denote the $n$ features, each weight $w_i$ represents how important the corresponding feature is, and $b$ is the offset (bias). The formula above is usually written in vector form $z = w^{T} x$ (with $x_0$ fixed equal to 1 so that $w_0$ plays the role of $b$). The Sigmoid function can accordingly be written in the following form:

$$h_w(x) = \frac{1}{1 + e^{-w^{T} x}}$$

Suppose we know a sample's feature values and the weight parameters; plugging them into the equation above yields a number between 0 and 1. If it is greater than 0.5 the sample is generally considered to belong to the positive class, and otherwise to the negative class; in fact, this number reflects the probability that the sample belongs to the positive class.

The question now is: in the training set at hand the samples $(x, y)$ are known, while the model parameters $w$ are unknown. We need to determine $w$ from the training set. Once $w$ is determined, whenever we face a new sample we can plug it into $h_w(x)$ and, depending on whether the result is greater than 0.5, easily obtain the new sample's class.
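Putting the pieces together, determining $w$ from a training set can be sketched with plain gradient descent on the negative log-likelihood, whose per-sample gradient is $(h_w(x) - y)\,x$. This is a minimal pure-Python sketch; the toy data, learning rate, and epoch count are assumptions for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(samples, labels, lr=0.5, epochs=1000):
    """Fit weights w (with x0 = 1 absorbing the bias) by minimizing the
    negative log-likelihood; the per-sample gradient is (h_w(x) - y) * x."""
    n_features = len(samples[0])
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            h = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i in range(n_features):
                w[i] -= lr * (h - y) * x[i]
    return w

def predict(w, x):
    # decision rule: output greater than 0.5 -> positive class
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x))) > 0.5 else 0

# Toy 1-D data with x0 = 1 prepended; class 1 when the feature is large
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]
w = train_logistic(X, y)
print(predict(w, [1.0, 0.2]), predict(w, [1.0, 3.5]))  # 0 1
```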

 



Origin blog.csdn.net/seagal890/article/details/105107179