## Machine learning algorithms / models: logistic regression

### Logistic regression

Logistic regression is similar to linear regression, but its outcome is binary. It transforms the classification problem into a form that a linear model can handle.

# Concepts / terms

• Logistic function
A function that maps any value on the real line (−∞, +∞) to a probability between 0 and 1. (Note that its output is a probability, not the final class label.) Its inverse is the log-odds (logit) function.
• Odds
The ratio between the probability of "success" (label 1) and the probability of "failure" (label 0): p / (1 − p).
• Outcome variable
The outcome variable y is modeled through the probability p that the label is 1 (instead of a simple binary label).
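The odds and log-odds definitions above can be checked numerically. This is a minimal sketch, not part of the original post; the function names `odds` and `logit` are chosen here for illustration.

```python
import math

# Numeric check of the definitions: odds(p) = p / (1 - p), and the
# log-odds (logit) is its logarithm -- the inverse of the logistic function.
def odds(p):
    """Odds of 'success' (label 1) against 'failure' (label 0)."""
    return p / (1 - p)

def logit(p):
    """Log-odds of p."""
    return math.log(p / (1 - p))

# p = 0.5 means success and failure are equally likely:
# odds of 1 and log-odds of 0.
even_odds = odds(0.5)
even_logit = logit(0.5)
```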

# Hypothesis function

## Modeling process

First, we should not treat the outcome variable as a simple binary label; we should model it as the probability p that the label is 1.
If we model p directly with a linear function,

p = θ^T x

nothing guarantees that p stays within [0, 1].

So we take a different approach: we model p by applying the logistic response function (the inverse of the logit) to the linear predictor:

p = 1 / (1 + e^(−θ^T x))

This transformation guarantees that the value of p lies within [0, 1].

Note: taking the log-odds of both sides of the equation gives

ln(p / (1 − p)) = θ^T x

The function ln(p / (1 − p)) is the log-odds (logit) function, the inverse of the logistic function.
Once this transformation is done, we can use a linear model to predict the probability.
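The two directions of this transformation can be sketched directly. This is a minimal illustration assuming the standard sigmoid/logit pair; it is not code from the original post.

```python
import numpy as np

# The logistic (sigmoid) function squashes any real-valued linear score
# into (0, 1); the logit maps a probability back onto the real line.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p / (1.0 - p))

z = np.array([-5.0, 0.0, 5.0])   # hypothetical linear scores theta^T x
p = sigmoid(z)                   # every entry lands strictly inside (0, 1)
z_back = logit(p)                # the logit recovers the original scores
```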

## Logistic regression model

• The logistic regression model (estimated probability):
p̂ = σ(θ^T x)
• The logistic (sigmoid) function:
σ(t) = 1 / (1 + e^(−t))
• The logistic regression model prediction:
ŷ = 0 if p̂ < 0.5, and ŷ = 1 if p̂ ≥ 0.5
# Loss function

• Cost for a single training instance:
c(θ) = −log(p̂) if y = 1; c(θ) = −log(1 − p̂) if y = 0
• Logistic regression cost function (log loss), averaged over all m training instances:
J(θ) = −(1/m) Σ [ y^(i) log(p̂^(i)) + (1 − y^(i)) log(1 − p̂^(i)) ]
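The log loss above can be evaluated in a few lines. This is a sketch on hypothetical predicted probabilities, not code from the original post.

```python
import numpy as np

# Log loss J(theta): average per-instance cost over predicted
# probabilities p_hat and true binary labels y.
def log_loss(y, p_hat):
    return -np.mean(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))

y = np.array([1, 0, 1, 0])
p_hat = np.array([0.9, 0.2, 0.7, 0.4])
loss = log_loss(y, p_hat)
```

Note how the formula behaves: a confident correct prediction (p̂ near 1 for y = 1) contributes almost nothing, while a confident wrong prediction drives −log(p̂) toward infinity, which is exactly the penalty intended.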

# Optimization

There is no known closed-form equation to compute the value of θ that minimizes the cost function (no equivalent of the Normal Equation exists). However, the cost function is convex, so gradient descent (or any other optimization algorithm) is guaranteed to find the global minimum (given a suitable learning rate and enough iterations).
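A bare-bones gradient-descent fit can be sketched as follows. The learning rate and iteration count are arbitrary illustration values, and `fit_logistic_gd` is a hypothetical helper, not a library function.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_gd(X, y, lr=0.1, n_iters=5000):
    """Minimize the log loss by batch gradient descent."""
    X_b = np.c_[np.ones(len(X)), X]          # prepend a bias column
    theta = np.zeros(X_b.shape[1])
    m = len(y)
    for _ in range(n_iters):
        p_hat = sigmoid(X_b @ theta)
        gradient = X_b.T @ (p_hat - y) / m   # gradient of the log loss
        theta -= lr * gradient
    return theta

# Tiny separable 1-D example: the label is 1 whenever x > 2.5.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = fit_logistic_gd(X, y)
p_hat = sigmoid(np.c_[np.ones(len(X)), X] @ theta)
```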

# Code examples

```python
import matplotlib.pyplot as plt
import numpy as np

from sklearn import datasets
from sklearn.linear_model import LogisticRegression
```
```python
iris = datasets.load_iris()

list(iris.keys())
```
```python
# Let's try to build a classifier that detects the Iris-Virginica type
# based on a single feature: petal width.

X = iris["data"][:, 3:]                # petal width
y = (iris["target"] == 2).astype(int)  # 1 if Iris-Virginica, else 0
```
```python
log_reg = LogisticRegression()

log_reg.fit(X, y)
```
```python
X_new = np.linspace(0, 3, 1000).reshape(-1, 1)

y_proba = log_reg.predict_proba(X_new)

y_proba
```
```python
plt.scatter(X, y)

plt.plot(X_new, y_proba[:, 1], "g-", label="Iris-Virginica")
plt.plot(X_new, y_proba[:, 0], "b--", label="Not Iris-Virginica")
plt.legend()
plt.show()
```

Note that the two classes partially overlap. At about 1.6 cm there is a decision boundary, where the probabilities of "Iris-Virginica" and "not Iris-Virginica" are both 50%.
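The boundary can also be located programmatically. This is a self-contained sketch that repeats the fit above; the exact value depends on scikit-learn's default regularization.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, 3:]                # petal width
y = (iris["target"] == 2).astype(int)  # 1 if Iris-Virginica, else 0

log_reg = LogisticRegression()
log_reg.fit(X, y)

# The decision boundary is the smallest petal width whose predicted
# Virginica probability reaches 50%.
X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = log_reg.predict_proba(X_new)
decision_boundary = X_new[y_proba[:, 1] >= 0.5][0, 0]
```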

Origin blog.csdn.net/Robin_Pi/article/details/104432832