Andrew Ng's Machine Learning Notes--Week 3-1. Classification and Logistic Regression

Week 3-1. Classification and Representation


一、Classification

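Binary classification: y takes one of two values, 0 or 1.
Classes are determined by a threshold on the hypothesis output: predict y=1 if h(x) >= 0.5, otherwise predict y=0.
If linear regression is used, a single outlier (for example, one point far to the right of the rest) can shift the fitted line and therefore the threshold point, which may change the classification of other examples. So linear regression is not recommended for classification.
Linear regression can also produce h(x) < 0 or h(x) > 1, even though y is always 0 or 1. Logistic regression avoids this by keeping 0 <= h(x) <= 1.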
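The effect of an outlier on a linear-regression classifier can be illustrated with a small sketch. The data below is made up for illustration (it is not from the course), and `fit_line` and `threshold_x` are hypothetical helper names; the fit is plain one-variable least squares.

```python
def fit_line(xs, ys):
    """Ordinary least squares for h(x) = b + m*x; returns (b, m)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    b = y_mean - m * x_mean
    return b, m


def threshold_x(b, m, level=0.5):
    """Value of x at which the fitted line crosses the given level."""
    return (level - b) / m


# Made-up 1-D training set: the label is 1 for x >= 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

b, m = fit_line(xs, ys)
clean_cut = threshold_x(b, m)

# Same data plus one positive example far to the right.
b_out, m_out = fit_line(xs + [30], ys + [1])
outlier_cut = threshold_x(b_out, m_out)

print("cut without outlier:", clean_cut)
print("cut with outlier:   ", outlier_cut)
print("h(1) without outlier:", b + m * 1)          # falls below 0
print("h(30) with outlier:  ", b_out + m_out * 30)  # rises above 1
```

With this data the point where h(x) = 0.5 moves from 4.5 to about 5.55 once the outlier is added, so the positive example at x = 5 would be predicted as 0. The fitted line also leaves the [0, 1] range at both ends.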

二、Hypothesis Representation

The origin of logistic regression: the logistic function (also called the sigmoid function), g(z) = 1 / (1 + e^(-z)).
When z approaches +infinity, g(z) approaches 1; when z approaches -infinity, g(z) approaches 0.
The expression for the hypothesis function h(x) in logistic regression: h(x) = g(theta'*x) = 1 / (1 + e^(-theta'*x)).
It means: given the parameters theta and a feature vector x, h(x) = P(y=1|x;theta) is the estimated probability that this data point belongs to the class y=1.
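A minimal sketch of the logistic function and the hypothesis, assuming x is a list whose first entry is the constant 1 for the intercept term. The names `sigmoid` and `hypothesis` are illustrative choices, not taken from the course material.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)).

    The two branches are algebraically identical; splitting on the sign
    of z keeps math.exp from overflowing for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta, x):
    """h(x) = g(theta' * x), read as the estimated P(y=1 | x; theta)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)


print(sigmoid(0))     # midpoint of the curve
print(sigmoid(50))    # close to 1
print(sigmoid(-50))   # close to 0
```

Because g(z) always lies between 0 and 1, h(x) can be interpreted as a probability, and P(y=0|x;theta) = 1 - h(x).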

三、Decision Boundary

Since g(z) >= 0.5 exactly when z >= 0, h(x) >= 0.5 exactly when theta'*x >= 0. So when theta'*x>=0, predict y=1; when theta'*x<0, predict y=0.
Suppose the parameters are theta0=-3, theta1=theta2=1. Then y=1 is predicted when -3+x1+x2>=0, that is, when x1+x2>=3.
The decision boundary is the straight line corresponding to h(x)=0.5, here x1+x2=3. It is a property of the hypothesis function h(x)=g(theta0+theta1x1+theta2x2), which depends on the theta parameters, not a property of the dataset (that is, as long as theta does not change, the decision boundary remains the same even if the dataset changes).
The method of determining the decision boundary: substitute the theta parameters into the argument of g, that is, theta'*x, and set it equal to 0 (if theta'*x>=0, y=1; if theta'*x<0, y=0).
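The rule above can be sketched in code for the example parameters theta0=-3, theta1=theta2=1. The function names `predict` and `boundary_x2` are hypothetical, and the test points are chosen only for illustration.

```python
def predict(theta, x1, x2):
    """Predict y=1 when theta0 + theta1*x1 + theta2*x2 >= 0, else y=0."""
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1 if z >= 0 else 0


def boundary_x2(theta, x1):
    """x2 on the decision boundary for a given x1 (requires theta2 != 0)."""
    return -(theta[0] + theta[1] * x1) / theta[2]


theta = (-3.0, 1.0, 1.0)

print(predict(theta, 1, 1))   # x1 + x2 = 2 < 3, so y = 0
print(predict(theta, 2, 2))   # x1 + x2 = 4 >= 3, so y = 1
print(predict(theta, 1, 2))   # exactly on the line x1 + x2 = 3, so y = 1
print(boundary_x2(theta, 0))  # the boundary passes through (0, 3)
```

Note that no training data appears in this sketch: the boundary is computed from theta alone.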
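Complex nonlinear decision boundaries: adding polynomial features such as x1^2 and x2^2 to theta'*x lets the boundary theta'*x=0 be a curve instead of a straight line.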
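As a sketch of a nonlinear boundary, assume the feature vector is extended to [1, x1, x2, x1^2, x2^2] and take theta = (-1, 0, 0, 1, 1). Both the feature choice and these parameter values are assumptions made for illustration. The boundary theta'*x = 0 then becomes the unit circle x1^2 + x2^2 = 1.

```python
def poly_features(x1, x2):
    """Map (x1, x2) to [1, x1, x2, x1^2, x2^2] (illustrative feature set)."""
    return [1.0, x1, x2, x1 ** 2, x2 ** 2]


def predict_poly(theta, x1, x2):
    """Predict y=1 when theta' * features >= 0, else y=0."""
    z = sum(t * f for t, f in zip(theta, poly_features(x1, x2)))
    return 1 if z >= 0 else 0


# Assumed parameters: z = -1 + x1^2 + x2^2
theta = (-1.0, 0.0, 0.0, 1.0, 1.0)

print(predict_poly(theta, 0.0, 0.0))  # inside the circle, y = 0
print(predict_poly(theta, 1.0, 1.0))  # outside the circle, y = 1
print(predict_poly(theta, 1.0, 0.0))  # on the circle, y = 1
```

The prediction rule is unchanged from the linear case; only the features fed into theta'*x differ.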
