### Logistic regression

Logistic regression is similar to linear regression, but its outcome is binary. The idea is to transform the classification problem into one that can be handled by a linear model.

# Concepts / terms

- Logit (log-odds) function: a function that maps a class-membership **probability** to the range ±∞ (instead of 0 to 1). (Note that this is not the final probability itself.)

- Odds: the ratio of "success" (1) to "failure" (0).

- Outcome variable: **the probability p that the label is 1** (rather than the simple binary label y).
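
As a quick illustration of these terms, here is a minimal numpy sketch (the probability values below are made up for the example):

```
import numpy as np

# Hypothetical probabilities that the label is 1
p = np.array([0.1, 0.5, 0.9])

odds = p / (1 - p)       # odds: ratio of "success" (1) to "failure" (0)
log_odds = np.log(odds)  # logit: maps a probability in (0, 1) to (-inf, +inf)

print(odds)      # approx [0.111 1.    9.   ]
print(log_odds)  # approx [-2.197 0.     2.197]
```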

# Hypothesis function

## Modeling process

First, the outcome variable should not be treated as a simple binary label; instead, it should be regarded as the probability p that the label is 1.

If we modeled p directly as a linear function of the predictors, nothing would guarantee that the probability p stays within [0, 1]:

$$ p = \theta^T \mathbf{x} $$

Instead, we take a different approach: we model p by applying the logistic response function (the inverse of the logit function) to the predictors:

$$ p = \frac{1}{1 + e^{-\theta^T \mathbf{x}}} $$

This conversion ensures that the value of p is within [0, 1].

Note: solving for the odds and taking the logarithm of both sides of the equation gives:

$$ \log\left(\frac{p}{1-p}\right) = \theta^T \mathbf{x} $$

The left-hand side is the **log-odds (logit) function**; its inverse is the logistic function.

Once this transformation is complete, the log odds is a linear function of the predictors, so we can use a linear model to predict the probability.
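
A minimal numpy sketch of this round trip, showing that the logit and logistic functions are inverses of each other:

```
import numpy as np

def sigmoid(t):
    """Logistic (response) function: maps (-inf, +inf) to (0, 1)."""
    return 1 / (1 + np.exp(-t))

def logit(p):
    """Log-odds function: maps (0, 1) back to (-inf, +inf)."""
    return np.log(p / (1 - p))

t = np.linspace(-4, 4, 5)
p = sigmoid(t)                   # linear-model output -> probability in (0, 1)
print(np.allclose(logit(p), t))  # True: logit undoes the logistic function
```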

## Logistic regression models

- Logistic regression model (estimated probability, vectorized form):

$$ \hat{p} = h_\theta(\mathbf{x}) = \sigma(\theta^T \mathbf{x}) $$

- Logistic function:

$$ \sigma(t) = \frac{1}{1 + e^{-t}} $$

- Logistic regression model prediction:

$$ \hat{y} = \begin{cases} 0 & \text{if } \hat{p} < 0.5 \\ 1 & \text{if } \hat{p} \ge 0.5 \end{cases} $$
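
A minimal sketch of these three pieces in numpy; the parameter values in `theta` below are hypothetical, chosen only for illustration:

```
import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

theta = np.array([-4.0, 2.5])           # hypothetical parameters (bias, weight)
X = np.array([[1.0, 1.2], [1.0, 2.1]])  # two instances; first column is the bias term

p_hat = sigmoid(X @ theta)          # estimated probability: sigma(theta^T x)
y_hat = (p_hat >= 0.5).astype(int)  # predict 1 if the probability is at least 0.5
print(p_hat, y_hat)                 # approx [0.27 0.78] [0 1]
```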

# Loss function

- Cost function for a single training instance:

$$ c(\theta) = \begin{cases} -\log(\hat{p}) & \text{if } y = 1 \\ -\log(1 - \hat{p}) & \text{if } y = 0 \end{cases} $$

- Logistic regression cost function (log loss):

$$ J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left(\hat{p}^{(i)}\right) + \left(1 - y^{(i)}\right) \log\left(1 - \hat{p}^{(i)}\right) \right] $$
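
A minimal numpy sketch of the log loss, assuming predicted probabilities `p_hat` and true labels `y`:

```
import numpy as np

def log_loss(y, p_hat, eps=1e-15):
    # Clip to avoid log(0) for probabilities at the extremes
    p_hat = np.clip(p_hat, eps, 1 - eps)
    # J = -(1/m) * sum[ y*log(p_hat) + (1 - y)*log(1 - p_hat) ]
    return -np.mean(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))

y = np.array([1, 0, 1])
p_hat = np.array([0.9, 0.2, 0.6])
print(log_loss(y, p_hat))  # approx 0.28
```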

# Optimization

There is no known closed-form equation to compute the value of θ that minimizes the logistic regression cost function. However, the cost function is convex, so gradient descent (or any other optimization algorithm) is guaranteed to find the global minimum (given a suitable learning rate and enough iterations).
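
As a sketch of how gradient descent could be applied to this cost function (the learning rate and iteration count below are arbitrary choices, not tuned values):

```
import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def fit_logistic(X, y, lr=0.1, n_iters=5000):
    """Batch gradient descent on the log loss."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        # Gradient of J(theta): (1/m) * X^T (sigma(X theta) - y)
        gradient = X.T @ (sigmoid(X @ theta) - y) / m
        theta -= lr * gradient
    return theta

# Toy usage: first column is a bias feature of 1s
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
print(fit_logistic(X, y))
```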

# Code examples

```
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
```

```
iris = datasets.load_iris()
list(iris.keys())
```

```
# Build a classifier to detect Iris-Virginica based on just one feature: petal width
X = iris["data"][:, 3:] # petal width
y = (iris["target"] == 2).astype(int) # 1 if Iris-Virginica, else 0
```

```
log_reg = LogisticRegression()
log_reg.fit(X, y)
```

```
X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = log_reg.predict_proba(X_new)
y_proba
```

```
plt.scatter(X, y)
plt.plot(X_new, y_proba[:, 1], "g-", label="Iris-Virginica")
plt.plot(X_new, y_proba[:, 0], "b--", label="Not Iris-Virginica")
plt.xlabel("Petal width (cm)")
plt.ylabel("Probability")
plt.legend()
plt.show()
```

Note that the two classes partially overlap. At a petal width of about 1.6 cm there is a decision boundary, where the probabilities of "Iris-Virginica" and "not Iris-Virginica" are both 50%.
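
Continuing with `X_new`, `y_proba`, and `log_reg` from the cells above, we can locate that boundary from the predicted probabilities (a sketch; the exact value depends on the fitted model):

```
# First petal width at which P(Iris-Virginica) reaches 50%
decision_boundary = X_new[y_proba[:, 1] >= 0.5][0, 0]
print(decision_boundary)  # roughly 1.6 (cm)

# Predictions on either side of the boundary
print(log_reg.predict([[1.7], [1.5]]))  # likely [1 0]
```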