-
A point: prediction logistic regression model of the sample depends on the weight vector and bias .
-
concept:
No. concept Explanation 1 Training set Sample set contains real class labels 2 training According to the training process to find the optimal set of parameters 3 Loss function It is a function of the model parameters used to measure the pros and cons of the model parameters -
Logistic regression prediction sample X = (X . 1 , X 2 , X . 3 , ......, X n- ) T probability of belonging to the positive class P :
-
Which, w and b are parameters of the model, the training process is to find these two parameters.
-
-
Confusion matrix
Forecast negative class Predicted positive class Negative real class TN FP Real positive class FN TP -
Correct rate
-
Accuracy of calculation formula:
accuracy =
The correct rate is the correct ratio of the number of samples and the total number of sample models predict. It is not always reliable, such as n Example: 99 = Negative Example: 1, positive for the whole prediction result, then the accuracy was 99%.
-
-
Precision:
-
Also known accuracy (precision), using the following formula:
(N-type)
(Negative type)
Seen from the formula, the number of samples correctly predicted positive class and the total number of samples predicted positive class ratio. Negative similar.
-
-
Recall:
-
This formula means that the number of samples predicted positive class and all classes than the number of positive samples. Also known as the true positive rate ( TPR , to true positive Rate) Correspondingly, there are false positive rate ( FPR , false positive Rate):
-
-
ROC curve
-
In the logistic regression, assume we've got a set of w and B, so we can test set data into f (x) for predicting, after substituting the function we get a value from a number between 0 and 1 in order to achieve the forecast, we need a threshold, we will f (x) is greater than the threshold value of the test data as a positive class, otherwise negative category.
-
So threshold selection will directly affect the quality of our logistic regression model.
-
FPR false positive rate and positive real rate of this index with the threshold value changes with the rise with the fall. TPR high and low FPR is our hope.
-
ROC curve in FIG.
-
-
FPR in the horizontal axis, TPR vertical axis, drawn to a different threshold corresponding ROC curve. The higher the arch curve ROC, have described higher in the lower region of the FPR TPR.
-
The area under the ROC curve, AUC (area under curve) can measure the quality of the model.
-
The next section will publish calculation of loss function
There are follow-up gradient descent method to solve the logistic regression, the gradient descent method improvement of content, then offer logistic regression handwritten codes
Thanks for attention