Logistic Regression Study Notes 2 (Python)

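I. Reading the data:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# path points to the CSV file; it has no header row, so the column names are supplied here
pdData = pd.read_csv(path, header=None, names=['Exam 1', 'Exam 2', 'Admitted'])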

Plotting:

positive = pdData[pdData['Admitted'] == 1]
negative = pdData[pdData['Admitted'] == 0]

fig, ax = plt.subplots(figsize=(10,5))
ax.scatter(positive['Exam 1'], positive['Exam 2'], s=30, c='b', marker='o', label='Admitted')
ax.scatter(negative['Exam 1'], negative['Exam 2'], s=30, c='r', marker='*', label='Not Admitted')
ax.legend()
ax.set_xlabel('Exam 1')
ax.set_ylabel('Exam 2')

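II. Logistic regression:

Goal: build a classifier by solving for the three parameters θ0, θ1, θ2.

Set a threshold, then decide the admission result by comparing the predicted probability against it.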
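
Three parameters for two exam scores imply an intercept θ0, so the feature matrix needs a constant column of ones. Below is a minimal sketch of that preparation. The toy DataFrame stands in for the real CSV, and the names toy, data, X, y and theta are illustrative assumptions, not part of the original notes.

```python
import numpy as np
import pandas as pd

# Toy stand-in for pdData (made-up values); the notes read these columns from a CSV file.
toy = pd.DataFrame({
    'Exam 1': [34.6, 60.2, 79.0],
    'Exam 2': [78.0, 86.3, 75.3],
    'Admitted': [0, 1, 1],
})

# Prepend a constant column so that theta0 acts as the intercept.
toy.insert(0, 'Ones', 1)

data = toy.to_numpy(dtype=float)   # columns: Ones, Exam 1, Exam 2, Admitted
X = data[:, :-1]                   # features including the intercept column, shape (m, 3)
y = data[:, -1:]                   # labels kept as a column vector, shape (m, 1)
theta = np.zeros((1, 3))           # one row holding theta0, theta1, theta2
```

Keeping theta as a (1, 3) row vector matches the grad[0, j] indexing used by the gradient function later in these notes.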

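Modules to implement

sigmoid : maps a value to a probability
model : returns the predicted value
cost : computes the loss for given parameters
gradient : computes the gradient direction for each parameter
descent : updates the parameters
accuracy : computes the accuracy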

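1. sigmoid function:

def sigmoid(z):
    # Map any real value (or array of values) into the interval (0, 1)
    return 1 / (1 + np.exp(-z))

2. model: returns the predicted value

def model(X, theta):
    # X has shape (m, n) and theta has shape (1, n), so the result has shape (m, 1)
    return sigmoid(np.dot(X, theta.T))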
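
A quick sanity check of these two pieces may help: sigmoid should return 0.5 at zero and saturate toward 0 and 1 at the tails, and an all-zero theta should predict 0.5 for every sample. This is only a sketch; X_demo and theta_zero are made-up values, and model is written the way the later cost and gradient code appears to expect (theta as a row vector).

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def model(X, theta):
    # theta is a (1, n) row vector and X is (m, n); the result is (m, 1)
    return sigmoid(np.dot(X, theta.T))

X_demo = np.array([[1.0, 2.0, 3.0],
                   [1.0, -1.0, 0.5]])
theta_zero = np.zeros((1, 3))

mid = sigmoid(0)                           # the midpoint of the curve
tails = sigmoid(np.array([-10.0, 10.0]))   # close to 0 and close to 1
probs = model(X_demo, theta_zero)          # every entry is 0.5 when theta is all zeros
```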

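3. cost: computes the loss for given parameters

def cost(X, y, theta):
    # Average negative log-likelihood (cross-entropy) over all samples
    left = np.multiply(-y, np.log(model(X, theta)))
    right = np.multiply(1 - y, np.log(1 - model(X, theta)))
    return np.sum(left - right) / len(X)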
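
The function above averages -[y ln h + (1 - y) ln(1 - h)] over the samples, where h is the model output. One way to check an implementation: with theta set to all zeros every prediction is 0.5, so the cost must equal ln 2 (about 0.693) no matter what the labels are. The sketch below repeats the helper functions so that it runs on its own; X_demo and y_demo are made-up values.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def model(X, theta):
    return sigmoid(np.dot(X, theta.T))

def cost(X, y, theta):
    left = np.multiply(-y, np.log(model(X, theta)))
    right = np.multiply(1 - y, np.log(1 - model(X, theta)))
    return np.sum(left - right) / len(X)

# Made-up samples: intercept column, two exam scores, and a label column.
X_demo = np.array([[1.0, 34.6, 78.0],
                   [1.0, 60.2, 86.3],
                   [1.0, 79.0, 75.3],
                   [1.0, 45.1, 56.3]])
y_demo = np.array([[0.0], [1.0], [1.0], [0.0]])
theta_zero = np.zeros((1, 3))

initial_cost = cost(X_demo, y_demo, theta_zero)   # should match np.log(2)
```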

4. Computing the gradient

def gradient(X, y, theta):
    grad = np.zeros(theta.shape)
    error = (model(X, theta) - y).ravel()   # prediction error per sample
    for j in range(len(theta.ravel())):
        term = np.multiply(error, X[:, j])
        grad[0, j] = np.sum(term) / len(X)  # partial derivative with respect to theta_j

    return grad
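
The loop computes, for each j, the mean of (h - y) times the j-th feature. The same result can be written as a single matrix product, which is a possible alternative and not what the notes use. The sketch below compares the two on random made-up data; gradient_loop, gradient_vectorized and the demo arrays are names introduced here for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def model(X, theta):
    return sigmoid(np.dot(X, theta.T))

def gradient_loop(X, y, theta):
    # Same logic as the loop version in the notes.
    grad = np.zeros(theta.shape)
    error = (model(X, theta) - y).ravel()
    for j in range(len(theta.ravel())):
        term = np.multiply(error, X[:, j])
        grad[0, j] = np.sum(term) / len(X)
    return grad

def gradient_vectorized(X, y, theta):
    # (1, m) times (m, n) gives a (1, n) row, one entry per parameter.
    error = model(X, theta) - y
    return np.dot(error.T, X) / len(X)

rng = np.random.default_rng(0)
X_demo = np.hstack([np.ones((5, 1)), rng.normal(size=(5, 2))])
y_demo = np.array([[0.0], [1.0], [1.0], [0.0], [1.0]])
theta_demo = np.array([[0.1, -0.2, 0.3]])

g_loop = gradient_loop(X_demo, y_demo, theta_demo)
g_vec = gradient_vectorized(X_demo, y_demo, theta_demo)
```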

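1) Three gradient descent variants, selected through batchSize: batch gradient descent (batchSize equal to the number of samples), stochastic gradient descent (batchSize = 1), and mini-batch gradient descent (anything in between).

2) Three stopping strategies: by iteration count (STOP_ITER), by the cost values (STOP_COST), or by the gradient (STOP_GRAD).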

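import time

def descent(data, theta, batchSize, stopType, thresh, alpha):
    # Solve for theta by gradient descent.
    # shuffleData, stopCriterion and the STOP_* constants are not shown in these notes
    # and must be defined before descent is called.

    init_time = time.time()
    i = 0  # iteration count
    k = 0  # start index of the current batch
    n = data.shape[0]  # number of samples
    X, y = shuffleData(data)
    grad = np.zeros(theta.shape)  # current gradient
    costs = [cost(X, y, theta)]  # history of loss values

    while True:
        grad = gradient(X[k:k+batchSize], y[k:k+batchSize], theta)
        k += batchSize  # advance to the next batch
        if k >= n:
            k = 0
            X, y = shuffleData(data)  # reshuffle after a full pass over the data
        theta = theta - alpha*grad  # parameter update
        costs.append(cost(X, y, theta))  # loss after the update
        i += 1

        if stopType == STOP_ITER: value = i
        elif stopType == STOP_COST: value = costs
        elif stopType == STOP_GRAD: value = grad
        if stopCriterion(stopType, value, thresh): break

    return theta, i-1, costs, grad, time.time() - init_time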
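
descent relies on shuffleData, stopCriterion and the three STOP_* constants, none of which appear in these notes. The following is one plausible sketch of them, inferred only from how descent calls them. The constant values, the exact stopping tests, and the assumption that the label sits in the last column of data are guesses, not the original definitions.

```python
import numpy as np

# Assumed integer codes for the three stopping strategies.
STOP_ITER = 0
STOP_COST = 1
STOP_GRAD = 2

def stopCriterion(stopType, value, thresh):
    # Assumed tests: value is whatever descent passes in for the given strategy.
    if stopType == STOP_ITER:
        return value > thresh                          # value is the iteration count
    if stopType == STOP_COST:
        return abs(value[-1] - value[-2]) < thresh     # value is the list of costs
    if stopType == STOP_GRAD:
        return np.linalg.norm(value) < thresh          # value is the gradient array
    raise ValueError('unknown stopType: %r' % (stopType,))

def shuffleData(data):
    # Return shuffled features and labels, assuming the label is the last column.
    # np.random.permutation returns a shuffled copy, so the caller's array is untouched.
    shuffled = np.random.permutation(data)
    X = shuffled[:, :-1]
    y = shuffled[:, -1:]
    return X, y

demo = np.arange(12.0).reshape(4, 3)
X_demo, y_demo = shuffleData(demo)
```

With helpers like these, a call such as descent(data, theta, 16, STOP_ITER, 5000, 0.001) would run mini-batch descent for a fixed number of iterations; the specific numbers here are placeholders.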

5. Computing accuracy
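
The notes stop here without code for this step. A minimal sketch of how accuracy could be computed, assuming a threshold of 0.5 on the model output: predict, accuracy and the demo arrays are hypothetical names and values introduced for illustration, not the original implementation.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def model(X, theta):
    return sigmoid(np.dot(X, theta.T))

def predict(X, theta, threshold=0.5):
    # 1 where the predicted probability reaches the threshold, otherwise 0
    return (model(X, theta) >= threshold).astype(int)

def accuracy(X, y, theta, threshold=0.5):
    # Fraction of samples whose predicted class equals the label
    return np.mean(predict(X, theta, threshold) == y)

# Made-up data: with this theta, the sign of the second column decides the class.
X_demo = np.array([[1.0, 2.0, 0.0],
                   [1.0, -3.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [1.0, -1.0, 0.0]])
y_demo = np.array([[1], [0], [0], [0]])
theta_demo = np.array([[0.0, 1.0, 0.0]])

acc = accuracy(X_demo, y_demo, theta_demo)   # three of the four samples are classified correctly
```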
