# 1 Introduction

Logistic regression: not about logic, and not a regression.

I still remember when I first learned the logistic regression algorithm. Seeing the name "logistic regression," my first impression was that it must be a regression algorithm similar to linear regression, but one that somehow emphasized "logic," or that involved knowledge related to logic. Only later did I discover that the name is a trap: logistic regression is not a regression algorithm but a classification algorithm, and it has nothing to do with logic. The only connection is that its English name, "logistic," happens to transliterate into Chinese as "logic" (some materials also call it logit regression).

# 2 The principle of logistic regression

## 2.1 From linear regression to logistic regression

In a previous blog post, we discussed in detail the difference between regression algorithms and classification algorithms. Since logistic regression is a classification algorithm, why is it not called "logistic classification"? In my view, it is because logistic regression uses the idea of regression to solve a classification problem.

Suppose we have a data set like the one shown below. Using a linear regression algorithm, we can find a roughly linear model, drawn as the black line, that fits it. What a regression algorithm does is this: for each ${{x}_{i}}$ in the data set, the model produces a corresponding ${{y}_{i}}$ (the predicted value).

Once we have the predicted value ${{y}_{i}}$, we can do many things with it, for example: classification. We can segment ${{y}_{i}}$ by picking a value $M$ on the ${{y}}$ axis: when ${{y}_{i}}<M$, we assign the sample to class 0; when ${{y}_{i}}>M$, we assign it to the other class, as shown below:
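As a toy sketch of this idea (the dataset, the fitted line, and the threshold $M$ below are all made up for illustration), using NumPy:

```python
import numpy as np

# Hypothetical 1-D dataset with a roughly linear trend
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=x.shape)

# Fit a linear regression model y = slope * x + intercept (least squares)
slope, intercept = np.polyfit(x, y, 1)
y_pred = slope * x + intercept

# Segment the predicted values with a threshold M on the y axis:
# class 0 where y_pred < M, class 1 where y_pred > M
M = 10.0
labels = (y_pred > M).astype(int)
print(labels)
```

Each sample gets its label purely from the regression output, which is exactly the "regression idea applied to classification" described above.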

## 2.2 The sigmoid function

The sigmoid function, also called the logistic function, is given by:

$g(z)=\frac{1}{1+{{e}^{-z}}}$

It squashes any real input into $(0,1)$, so we can classify by thresholding at 0.5:

$y=\begin{cases}1, & g(z)\ge 0.5 \\ 0, & g(z)<0.5\end{cases}$

where $z$ is the output of a linear model:

$z=f(x)={{\theta }_{0}}+{{\theta }_{1}}{{x}_{1}}+{{\theta }_{2}}{{x}_{2}}+\cdots +{{\theta }_{n}}{{x}_{n}}$

Substituting $z$ into $g$ gives the logistic regression hypothesis:

$h(x)=\frac{1}{1+{{e}^{-({{\theta }_{0}}+{{\theta }_{1}}{{x}_{1}}+{{\theta }_{2}}{{x}_{2}}+\cdots +{{\theta }_{n}}{{x}_{n}})}}}$

or, in vector form:

$h(x)=g(z)=g({{\theta }^{T}}x)=\frac{1}{1+{{e}^{-{{\theta }^{T}}x}}}$
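Putting these formulas together in code (the parameter vector $\theta$ and the sample $x$ below are made-up values, just to show the composition $h(x)=g({{\theta }^{T}}x)$):

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + e^{-z}), squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters theta = [theta_0, theta_1, theta_2] and one sample x
theta = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.3, 0.8])   # x[0] = 1 acts as the bias term

z = theta @ x                    # z = theta^T x, the linear model's output
h = sigmoid(z)                   # h(x) = g(theta^T x), a value in (0, 1)
label = int(h >= 0.5)            # threshold at 0.5, as in the piecewise rule
print(h, label)
```

The linear part can produce any real number; the sigmoid turns it into something we can read as a probability and threshold.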

# 3 The loss function

Interpreting the output $h(x)$ as the probability that a sample belongs to class 1, we have:

$p(y=1|x;\theta )=h(x)$

$p(y=0|x;\theta )=1-h(x)$

The cost of a single prediction is then:

$\text{cost}(h(x),y)=\begin{cases}-\log (h(x)), & y=1 \\ -\log (1-h(x)), & y=0\end{cases}$

Averaging over all $m$ samples gives the overall loss:

$J(\theta )=\frac{1}{m}\sum\limits_{i=1}^{m}{\text{cost}(h({{x}_{i}}),{{y}_{i}})}$

$J(\theta )=-\frac{1}{m}\sum\limits_{i=1}^{m}{({{y}_{i}}\log (h({{x}_{i}}))+(1-{{y}_{i}})\log (1-h({{x}_{i}})))}$
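A quick numerical check (with made-up predictions and labels) that the piecewise single-sample cost and the compact summed formula agree:

```python
import numpy as np

def bce_loss(h, y):
    """J(theta) = -(1/m) * sum_i [ y_i*log(h_i) + (1 - y_i)*log(1 - h_i) ]."""
    return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

# Hypothetical model outputs h(x_i) in (0, 1) and true labels y_i
h = np.array([0.9, 0.2, 0.7, 0.4])
y = np.array([1.0, 0.0, 1.0, 0.0])

# Piecewise form: cost = -log(h) when y = 1, -log(1 - h) when y = 0
piecewise = np.where(y == 1.0, -np.log(h), -np.log(1.0 - h))

# Both views give the same average loss (binary cross-entropy)
print(bce_loss(h, y), piecewise.mean())
```

This is exactly the binary cross-entropy that `nn.BCELoss` computes in the code below.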

# 4 Code implementation

Below is a PyTorch implementation of logistic regression, trained on synthetic two-class data:

```python
import torch
from torch import nn
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data
n_data = torch.ones(100, 2)          # base shape of the data
x0 = torch.normal(2 * n_data, 1)     # class 0 x data (tensor), shape=(100, 2)
y0 = torch.zeros(100)                # class 0 y data (tensor), shape=(100,)
x1 = torch.normal(-2 * n_data, 1)    # class 1 x data (tensor), shape=(100, 2)
y1 = torch.ones(100)                 # class 1 y data (tensor), shape=(100,)

# Note: x and y must have exactly this form (torch.cat concatenates the data)
x = torch.cat((x0, x1), 0).type(torch.FloatTensor)  # FloatTensor = 32-bit float
y = torch.cat((y0, y1), 0).type(torch.FloatTensor)

# Plot the raw data
# plt.scatter(x.data.numpy()[:, 0], x.data.numpy()[:, 1], c=y.data.numpy(), s=100, lw=0, cmap='RdYlGn')
# plt.show()

class LogisticRegression(nn.Module):
    def __init__(self):
        super(LogisticRegression, self).__init__()
        self.lr = nn.Linear(2, 1)
        self.sm = nn.Sigmoid()

    def forward(self, x):
        x = self.lr(x)
        x = self.sm(x)
        return x

logistic_model = LogisticRegression()
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
logistic_model.to(device)

# Define the loss function and the optimizer
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(logistic_model.parameters(), lr=1e-3, momentum=0.9)

# Start training
x_data = x.to(device)
y_data = y.to(device)
for epoch in range(10000):
    out = logistic_model(x_data).squeeze(-1)   # shape (200,), matching y_data
    loss = criterion(out, y_data)
    print_loss = loss.item()
    mask = out.ge(0.5).float()                 # predicted labels: 1 where out >= 0.5
    correct = (mask == y_data).sum()           # number of correctly predicted samples
    acc = correct.item() / x_data.size(0)      # accuracy
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Print the current loss and accuracy every 20 epochs
    if (epoch + 1) % 20 == 0:
        print('*' * 10)
        print('epoch {}'.format(epoch + 1))          # number of training epochs
        print('loss is {:.4f}'.format(print_loss))   # loss
        print('acc is {:.4f}'.format(acc))           # accuracy

# Visualize the result: the decision boundary is the line w0*x + w1*y + b = 0
w0, w1 = logistic_model.lr.weight[0]
w0 = float(w0.item())
w1 = float(w1.item())
b = float(logistic_model.lr.bias.item())
plot_x = np.arange(-7, 7, 0.1)
plot_y = (-w0 * plot_x - b) / w1
plt.scatter(x.data.numpy()[:, 0], x.data.numpy()[:, 1], c=y.data.numpy(), s=100, lw=0, cmap='RdYlGn')
plt.plot(plot_x, plot_y)
plt.show()
```

# 5 Summary

Advantages of logistic regression:

1) The prediction is a probability between 0 and 1;

2) It works with both continuous and categorical independent variables;

3) It is easy to use and to interpret.

Disadvantages:

1) The model is sensitive to multicollinearity among the independent variables. For example, putting two highly correlated variables into the model at the same time may flip the regression sign of the weaker one, so that it no longer matches expectations. Techniques such as factor analysis or variable clustering are then needed to select representative variables and reduce the correlation among candidates;

2) The prediction curve is S-shaped, so the mapping from log(odds) to probability is nonlinear. At both ends, the probability changes very little as log(odds) changes, because the marginal effect (slope) is tiny, while in the middle the probability is very sensitive. As a result, over many intervals a change in a variable has little discriminative effect on the target probability, which makes it hard to pick a threshold.

