Logistic Regression Classifier

Logistic regression is a generalized linear model, and its form is essentially the same as that of linear regression: for an input $\pmb{x}$, it computes the linear combination $\pmb{w}^T \pmb{x} + b$ and maps the result into the interval $(0, 1)$ with the logistic function. Logistic regression thus tries to learn a function that predicts through a linear combination of the attributes:

$f(\pmb{x}) = \pmb{w}^T \pmb{x} + b$

Logistic regression tries to learn a suitable weight vector $\pmb{w}$ and a real number $b$ such that, for the label vector $\pmb{y}$, $f(\pmb{x}) \approx \pmb{y}$.

import numpy as np
import matplotlib.pyplot as plt

Reading the training dataset

def load_data_set():
    data_matrix = []
    label_matrix = []
    with open('testSet.txt', 'r') as file:
        for line in file.readlines():
            data = line.strip().split()
            # Prepend a constant 1.0 so the bias b is absorbed into the weights as w0
            data_matrix.append([1.0, float(data[0]), float(data[1])])
            label_matrix.append(int(data[2]))

    return data_matrix, label_matrix
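
If you don't have testSet.txt on hand (it ships with the Machine Learning in Action sample code), a minimal sketch like the one below can generate a stand-in file in the same whitespace-separated "x1 x2 label" format the loader expects. The two Gaussian clusters, their parameters, and the sample count are assumptions made purely for illustration; the outputs shown later in this post correspond to the original file.

# Hypothetical stand-in for testSet.txt: two Gaussian clusters,
# one sample per line in the "x1 x2 label" format parsed above.
rng = np.random.default_rng(0)  # fixed seed for reproducibility
with open('testSet.txt', 'w') as f:
    for _ in range(50):
        f.write(f"{rng.normal(-1.5, 1.0):.6f}\t{rng.normal(6.0, 2.0):.6f}\t0\n")
    for _ in range(50):
        f.write(f"{rng.normal(1.5, 1.0):.6f}\t{rng.normal(12.0, 2.0):.6f}\t1\n")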

The Sigmoid function

The sigmoid function, also called the logistic function, maps any real number into the interval (0, 1), which makes it suitable for binary classification.

$S(x) = \frac{1}{1 + e^{-x}}$

The derivative of the sigmoid function is

$S'(x) = S(x)\,(1 - S(x))$

def sigmoid(X):
    # Works elementwise on scalars, arrays, and matrices
    return 1.0 / (1 + np.exp(-X))
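
As a quick sanity check (not part of the original post), the derivative identity above can be verified numerically with a centered finite difference; the test point and step size below are arbitrary choices.

x0 = 0.7       # arbitrary test point (assumption)
eps = 1e-6     # finite-difference step (assumption)
numeric = (sigmoid(x0 + eps) - sigmoid(x0 - eps)) / (2 * eps)
analytic = sigmoid(x0) * (1 - sigmoid(x0))
print(numeric, analytic)   # both approximately 0.2217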
data_matrix, label_matrix = load_data_set()
data_matrix = np.mat(data_matrix)                  # 100 x 3 design matrix
label_matrix = np.mat(label_matrix).transpose()    # 100 x 1 column vector of labels
m, n = data_matrix.shape
m, n
(100, 3)
alpha = 0.001              # learning rate
max_cycles = 500           # number of gradient ascent iterations
weights = np.ones((n, 1))  # initialize all weights to 1
weights
array([[1.],
       [1.],
       [1.]])
h = sigmoid(data_matrix * weights)  # predicted probabilities under the initial weights
h
matrix([[0.9999997 ],
        [0.98616889],
        [0.99887232],
        [0.99892083],
        [0.99999619],
        [0.99979122],
        [0.99999945],
        [0.99553342],
        [0.99998516],
        [0.99998882],
        [0.99984482],
        [0.99999982],
        [0.99524519],
        [0.99975551],
        [0.99793879],
        [0.97128332],
        [0.99919801],
        [0.97477903],
        [0.77681757],
        [0.99957748],
        [0.9980066 ],
        [0.22252829],
        [0.99999498],
        [0.26394949],
        [0.8246228 ],
        [0.99999261],
        [0.99991432],
        [0.01392443],
        [0.99215449],
        [0.99999407],
        [0.99007735],
        [0.99994736],
        [0.999999  ],
        [0.05986936],
        [0.99921454],
        [0.99997998],
        [0.99997966],
        [0.99982544],
        [0.99999104],
        [0.99998525],
        [0.97919678],
        [0.99971059],
        [0.99997751],
        [0.93705909],
        [0.9890627 ],
        [0.99996675],
        [0.1359093 ],
        [0.99921684],
        [0.99999079],
        [0.99999622],
        [0.99995015],
        [0.99998279],
        [0.99982675],
        [0.99999982],
        [0.9994663 ],
        [0.99964232],
        [0.9999885 ],
        [0.99997259],
        [0.99999121],
        [0.99542831],
        [0.98631076],
        [0.96925991],
        [0.99995761],
        [0.99999899],
        [0.99999879],
        [0.21808844],
        [0.99995494],
        [0.99999908],
        [0.99999865],
        [0.99998668],
        [0.99999443],
        [0.53267014],
        [0.99999957],
        [0.97651256],
        [0.99998887],
        [0.99993141],
        [0.33507029],
        [0.98891672],
        [0.99968925],
        [0.98927143],
        [0.99613509],
        [0.03702176],
        [0.99999797],
        [0.99999593],
        [0.83044946],
        [0.17239595],
        [0.9820568 ],
        [0.9999997 ],
        [0.99973113],
        [0.75736609],
        [0.59244738],
        [0.99999982],
        [0.9999823 ],
        [0.88578868],
        [0.82357126],
        [0.98572192],
        [0.9999961 ],
        [0.26402371],
        [0.99999196],
        [0.99999989]])

Gradient ascent

Gradient ascent is an algorithm for finding a function's maximum: a function increases fastest along the direction of its gradient.

For a function $y = f(x)$, its derivative is written $f'(x)$ or $\frac{dy}{dx}$. The derivative is the slope of $f(x)$ at the point $x$; it indicates how a small change in the input is scaled to obtain the corresponding change in the output:

$f(x + \epsilon) \approx f(x) + \epsilon \frac{dy}{dx}$

For functions with multi-dimensional inputs we need partial derivatives: the partial derivative $\frac{\partial f(\pmb{x})}{\partial x_i}$ measures how $f$ changes at the point $\pmb{x}$ when only $x_i$ increases. The gradient is the derivative taken with respect to a vector: the gradient of $f$ is the vector containing all of its partial derivatives.

The gradient vector points uphill, so moving in the direction of the gradient increases $f$. This is called the method of steepest ascent, or the gradient ascent algorithm.

Gradient ascent proposes the new point

$\pmb{x}' = \pmb{x} + \epsilon \nabla_{\pmb{x}} f(\pmb{x})$

where $\epsilon$ is the learning rate, a positive scalar that determines the step size; it is usually chosen to be a small constant.
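
To make the update rule concrete, here is a tiny sketch (not from the original post) that applies it to the toy function $f(x) = -(x - 3)^2$, whose maximum is at $x = 3$; the starting point and learning rate are arbitrary choices.

def grad_f(x):
    # Derivative of the toy objective f(x) = -(x - 3)**2
    return -2 * (x - 3)

x = 0.0        # arbitrary starting point
epsilon = 0.1  # learning rate (assumption)
for _ in range(100):
    x = x + epsilon * grad_f(x)   # the update x' = x + eps * f'(x)
print(x)       # converges to ~3.0, the maximizer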

The input to the sigmoid function is $z = w_0 x_0 + w_1 x_1 + \cdots + w_n x_n$

Through repeated iterations we keep updating the weight vector $\pmb{w}$ (with the bias $b$ absorbed as $w_0$) so that $f(\pmb{x})$ approaches $\pmb{y}$; the derivation below shows where the update comes from.
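
The update used in the loop below comes from the standard maximum-likelihood derivation, a step the original post does not spell out: differentiating the log-likelihood of the logistic model with respect to $\pmb{w}$ gives

$\nabla_{\pmb{w}} \ell(\pmb{w}) = \sum_{i=1}^{m} \left( y_i - S(\pmb{w}^T \pmb{x}_i) \right) \pmb{x}_i = X^T (\pmb{y} - \pmb{h})$

so the gradient ascent step is $\pmb{w}' = \pmb{w} + \alpha X^T (\pmb{y} - \pmb{h})$, which is exactly the weight update in the code.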

for i in range(max_cycles):
    h = sigmoid(data_matrix * weights)   # predicted probabilities, m x 1
    error = label_matrix - h             # y - h
    # Gradient ascent step: w <- w + alpha * X^T (y - h)
    weights += alpha * data_matrix.transpose() * error
weights
array([[ 4.12414349],
       [ 0.48007329],
       [-0.6168482 ]])
# Broadcasting the (3, 1) weight array against one sample's three features
weights * [1.0, -0.017612, 14.053064]
array([[ 4.12414349e+00, -7.26344151e-02,  5.79568524e+01],
       [ 4.80073293e-01, -8.45505083e-03,  6.74650071e+00],
       [-6.16848197e-01,  1.08639304e-02, -8.66860719e+00]])
def test(x):
    # Predict the positive class when the modeled probability exceeds 0.5
    return sigmoid(np.sum(np.multiply(weights.transpose(), x))) > 0.5

Testing the model's accuracy

correct = 0
for x, y in zip(data_matrix, label_matrix):
    # y is a 1x1 matrix row; int() extracts the scalar label
    if test(x) == (int(y) == 1):
        correct += 1
correct / len(label_matrix)
0.96
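
Equivalently (a vectorized one-liner, not in the original post), the same accuracy can be computed without an explicit loop; note that this measures accuracy on the data the model was trained on, not generalization.

# Threshold the predicted probabilities at 0.5 and average the matches
predictions = sigmoid(data_matrix * weights) > 0.5   # 100 x 1 boolean matrix
np.mean(predictions == (label_matrix == 1))          # 0.96, matching the loop above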

References

周志华. 机器学习 (Machine Learning). 清华大学出版社, 2016.
Ian Goodfellow, Yoshua Bengio, Aaron Courville. 深度学习 (Deep Learning). 人民邮电出版社, 2017.
Peter Harrington (trans. 李锐). 机器学习实战 (Machine Learning in Action). 人民邮电出版社, 2013.

Finally

  • Given the author's limited expertise, omissions are inevitable; readers are welcome to point out mistakes at any time so that unnecessary misunderstandings can be avoided!

Reposted from blog.csdn.net/qq_44486439/article/details/109683821