Implementing Binary Logistic Regression with Gradient Descent in Python
Hypothesis function for binary logistic regression
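The hypothesis is the sigmoid of a linear combination of the features:
$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$
The prediction is 1 when the function value is greater than or equal to 0.5, and 0 when it is less than 0.5. The range of the function is (0, 1).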
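The thresholding rule can be sketched in a few lines. This is a minimal illustration, not the post's own code: the helper names `sigmoid` and `predict` and the toy values of `theta` and `X` are assumptions chosen only to show the 0.5 cutoff.

```python
import numpy as np

def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, X):
    """Return 1 where the hypothesis is >= 0.5, otherwise 0."""
    return (sigmoid(X @ theta) >= 0.5).astype(int)

# Hypothetical toy values, chosen only for illustration.
theta = np.array([-0.5, 2.0])      # [bias, weight]
X = np.array([[1.0, -1.0],         # first column is the constant 1
              [1.0, 0.25],
              [1.0, 1.0]])

print(sigmoid(X @ theta))          # hypothesis values, all inside (0, 1)
print(predict(theta, X))           # thresholded class labels
```

The middle sample gives a linear score of exactly 0, so the hypothesis is exactly 0.5 and the sample is assigned to class 1, as the "greater than or equal to" rule requires.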
Loss function for binary logistic regression
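For a label y in {0, 1}, binary logistic regression models the probability as
$$P(y \mid x; \theta) = h_\theta(x)^{y} \, (1 - h_\theta(x))^{1 - y}$$
Taking the negative log-likelihood over the m training samples, the cost function can be written as
$$J(\theta) = -\sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$
(written as a plain sum, without a 1/m factor, to match the code below; a constant factor only rescales the learning rate).
The partial derivative of the loss function is
$$\frac{\partial J(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$
This has the same form as the result for linear regression; the only difference is that the hypothesis function h has changed.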
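As a quick sanity check on the gradient formula, the sketch below compares the analytic gradient, the sum over samples of (h(x_i) - y_i) * x_i, with a central finite-difference estimate of the summed negative log-likelihood cost. The function names, the random toy data and the step size `eps` are assumptions made for illustration and do not come from the original post.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Negative log-likelihood summed over all samples (no 1/m factor)."""
    h = sigmoid(X @ theta)
    return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    """Analytic gradient: sum_i (h(x_i) - y_i) * x_i."""
    return X.T @ (sigmoid(X @ theta) - y)

def numeric_gradient(theta, X, y, eps=1e-6):
    """Central finite differences, used only to check the formula."""
    g = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = eps
        g[j] = (cost(theta + step, X, y) - cost(theta - step, X, y)) / (2 * eps)
    return g

# Hypothetical random problem: 20 samples, a bias column plus 2 features.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 2))])
y = (rng.random(20) < 0.5).astype(float)
theta = rng.normal(size=3)

# Largest absolute difference between the two gradients.
print(np.max(np.abs(gradient(theta, X, y) - numeric_gradient(theta, X, y))))
```

If the two gradients agree to within the finite-difference error, the derivative is consistent with the cost function it is supposed to come from.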
Regularization
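To avoid overfitting, a regularization term is usually added to the cost function. For binary logistic regression with an L2 penalty of strength λ (the form used by the code below, which penalizes every component of θ, including the bias), the cost becomes
$$J(\theta) = -\sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2} \sum_{j=0}^{n} \theta_j^2$$
Accordingly, the partial derivative of the loss function gains one extra term:
$$\frac{\partial J(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \lambda \theta_j$$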
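To see what the extra term does in practice, here is a small sketch that runs plain gradient descent with and without an L2 penalty on a tiny, linearly separable data set. The helper names `regularized_gradient` and `fit`, the toy data, and the settings `alpha=0.01` and `n_iter=5000` are all assumptions for illustration. Like the code in this post, the sketch penalizes the bias term too.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_gradient(theta, X, y, lam):
    """Data gradient plus lam * theta, from a (lam / 2) * ||theta||^2 penalty."""
    return X.T @ (sigmoid(X @ theta) - y) + lam * theta

def fit(X, y, lam, alpha=0.01, n_iter=5000):
    """Fixed-step gradient descent starting from theta = 0."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        theta = theta - alpha * regularized_gradient(theta, X, y, lam)
    return theta

# Hypothetical separable toy data: bias column plus one feature.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

for lam in (0.0, 1.0):
    theta = fit(X, y, lam)
    print("lam =", lam, "theta =", theta, "norm =", np.linalg.norm(theta))
```

On separable data the unregularized weights keep growing as the iterations continue, while the penalty pulls them toward a finite, smaller solution, which is the overfitting control the regularization term is meant to provide.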
Python implementation
import numpy as np
import matplotlib.pyplot as plt
# Number of features
n = 2
# Build the training set
X1 = np.arange(-2., 2., 0.02)
m = len(X1)
X2 = X1 + np.random.randn(m)
# print(X1, X2)
one = np.full(m, 1.0)
Y = 1 / (np.full(m, 1.0) + np.exp(-(0.1 * np.random.randn(m)-np.full(m, 0.6) + 5*X1 + 2 * X2)))
# Round the probabilities to 0/1 labels (np.int was removed in NumPy 1.24, so use the built-in int)
Y = np.array([int(round(i)) for i in Y])
# Gradient descent
theta = np.random.rand(n+1)
print(theta)
X = np.vstack([np.full(m, 1), X1, X2]).T
# theta from the previous iteration
pre = np.zeros(n + 1)
diff = 1e-10
max_loop = 10000
alpha = 0.01
lamda = 1
now_diff = 0
while max_loop > 0:
    # Gradient of the unregularized cost: sum over samples of (h(x_i) - y_i) * x_i
    grad = X.T @ (1 / (1. + np.exp(-(X @ theta))) - Y)
    # Gradient step, including the regularization term lamda * theta
    theta = theta - (alpha * grad + alpha * lamda * theta)
    print("%d iterations left" % max_loop, "theta = ", theta)
    # Stop once theta has essentially stopped changing
    now_diff = np.linalg.norm(theta - pre)
    if now_diff <= diff:
        break
    pre = theta
    max_loop -= 1
# Print the result
print("find theta : ", theta, "now_diff : ", now_diff)
# Plot the training samples (red circles: class 1, green triangles: class 0)
X_1 = np.array([(X1[i], X2[i]) for i in range(m) if Y[i] == 1])
X_0 = np.array([(X1[i], X2[i]) for i in range(m) if Y[i] == 0])
plt.scatter(X_1[:, 0], X_1[:, 1], c='r', marker='o')
plt.scatter(X_0[:, 0], X_0[:, 1], c='g', marker='v')
# Plot the fitted decision boundary: theta[0] + theta[1]*X1 + theta[2]*X2 = 0
point1 = np.arange(-2, 2, 0.02)
point2 = (theta[0] + theta[1] * point1)/(-theta[2])
plt.plot(point1, point2)
plt.xlabel('X1')
plt.ylabel('X2')
plt.show()
The resulting plot is shown below.