Machine Learning - Logistic Regression and SoftMax Regression

Logistic regression

Logistic regression is essentially a classification algorithm; the word "regression" appears in its name only for historical reasons.

Main idea

Compute the probability that the input vector x belongs to the positive class and to the negative class, and predict the class with the larger probability. The probability of belonging to the positive class follows a logistic distribution.
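
In symbols, the decision rule (this is also what the predict method in the code below implements) is

$$ \hat{y} = \begin{cases} 1, & P(y = 1 \mid x) > P(y = 0 \mid x) \\ 0, & \text{otherwise} \end{cases} $$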

Algorithm introduction

Model

The logistic regression model has few parameters. Compared with Naive Bayes, it uses the prior assumption that the probability of belonging to the positive class follows a logistic distribution (the logistic function), that is

$$ P(y = 1 \mid x) = \frac{1}{1 + e^{-w \cdot x}} $$
So what does this distribution mean? The odds of an event are the ratio of the probability that it occurs to the probability that it does not occur; writing p = P(y = 1 | x), we have
$$ \log \frac{p}{1 - p} = w \cdot x $$
It can be seen that the log-odds of the logistic regression model is a linear function of the input vector x.
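
To see that the two forms above are equivalent, divide the probability of the positive class by that of the negative class:

$$ \frac{P(y = 1 \mid x)}{1 - P(y = 1 \mid x)} = \frac{1 / (1 + e^{-w \cdot x})}{e^{-w \cdot x} / (1 + e^{-w \cdot x})} = e^{w \cdot x}, \qquad \text{so} \qquad \log \frac{p}{1 - p} = w \cdot x $$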

Strategy

The logarithmic (log) loss function is minimized, which is equivalent to maximizing the likelihood function.
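
Concretely, with labels $y_i \in \{0, 1\}$ and $p_i = P(y_i = 1 \mid x_i)$ as defined by the model above, the log loss over $N$ samples is

$$ L(w) = -\sum_{i=1}^{N} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right] $$

which is exactly the negative log-likelihood of the training data, so minimizing it is the same as maximizing the likelihood.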

Learning method

Following the strategy above, the problem becomes an optimization problem, which is usually solved by gradient descent (steepest descent).

$$ \mathrm{grad} = \sum_{i=1}^{N} x_i \left( y_i - y_i^{pred} \right) $$
Note: this formula is very similar to the gradient of the squared loss, but here it is derived from the log loss (equivalently, from the maximum-likelihood objective).
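
The derivation is short: differentiating the log loss above with respect to $w$ and using $\sigma'(z) = \sigma(z)(1 - \sigma(z))$ for the logistic function $\sigma$ gives

$$ \frac{\partial L}{\partial w} = -\sum_{i=1}^{N} x_i \left( y_i - \sigma(w \cdot x_i) \right) $$

so gradient ascent on the likelihood (equivalently, descent on the loss) updates $w \leftarrow w + \alpha \sum_i x_i (y_i - y_i^{pred})$, which is exactly the update used in the code below.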

SoftMax regression

SoftMax regression is the generalization of logistic regression to multi-class classification problems.

Main idea

Compute the probability that the input vector x belongs to each class, and take the class with the highest probability as the prediction.

Algorithm introduction

The only difference between the SoftMax regression and logistic regression models is that in SoftMax each class has its own weight vector w, so the parameter is an n_class * n_dimension matrix. For more details, please refer to SoftMax Regression.
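
For reference (this formula is not stated above but is what the code below implements), with one weight vector $w_l$ per class the probability of class $l$ is the softmax of the per-class scores:

$$ P(y = l \mid x) = \frac{e^{w_l \cdot x}}{\sum_{k=1}^{L} e^{w_k \cdot x}} $$

and the prediction is the class with the largest probability, $\hat{y} = \arg\max_l P(y = l \mid x)$.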

Code

Logistic regression

"""
逻辑斯谛回归
"""

import numpy as np


class LR:
    def __init__(self, alpha=0.01, maxstep=1000):
        self.w = None
        self.maxstep = maxstep
        self.alpha = alpha

    def sig(self, z):
        # Logistic function: the probability of the positive class
        return 1.0 / (1 + np.exp(-z))

    def bgd(self, X_data, y_data):  # the loss is the log loss, whose mathematical form matches the (negative) log-likelihood
        # Batch gradient descent
        b = np.ones((X_data.shape[0], 1))
        X = np.hstack((X_data, b))  # append a bias column so the threshold is learned as part of w
        w = np.ones(X.shape[1])  # initialize the weight of each feature
        i = 0
        while i <= self.maxstep:
            i += 1
            err = y_data - self.sig(w @ X.T)
            w += self.alpha * err @ X  # note: this looks like the squared-loss gradient, but it is derived from the log loss
        self.w = w
        return

    def fit(self, X_data, y_data):
        self.bgd(X_data, y_data)
        return

    def predict(self, x):
        x = np.append(x, 1)
        PT = self.sig(self.w @ x.T)
        if PT > 1 - PT:
            return 1
        else:
            return 0


if __name__ == '__main__':
    from sklearn import datasets

    data = datasets.load_digits(n_class=2)
    X_data = data['data']
    y_data = data['target']
    clf = LR()
    from machine_learning_algorithm.cross_validation import validate
    g = validate(X_data, y_data, ratio=0.2)
    for item in g:
        X_train, y_train, X_test, y_test = item
        clf.fit(X_train, y_train)
        score = 0
        for x, y in zip(X_test, y_test):
            if clf.predict(x)==y:
                score += 1
        print(score/len(y_test))
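
A practical note on the sig function above: np.exp(-z) can overflow and trigger runtime warnings when z is a large negative number. A common remedy, sketched below under the assumption that the input is a NumPy array (it is optional and not part of the original class), is to branch on the sign of z so that exp is only ever applied to non-positive arguments; scipy.special.expit achieves the same effect.

import numpy as np

def stable_sig(z):
    # Numerically stable logistic function: exp is only evaluated on
    # non-positive arguments, so it never overflows.
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    ez = np.exp(z[~pos])  # here z < 0, so exp(z) is safe
    out[~pos] = ez / (1.0 + ez)
    return out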

SoftMax regression

"""
SoftMax回归,逻辑斯蒂回归的多分类推广。所以,本质还是一种分类算法
"""
import numpy as np


class SoftMax:
    def __init__(self, maxstep=10000, C=1e-4, alpha=0.4):
        self.maxstep = maxstep
        self.C = C  # weight-decay (regularization) coefficient lambda, similar to a penalty coefficient
        self.alpha = alpha  # learning rate

        self.w = None  # weights

        self.L = None  # number of classes
        self.D = None  # dimension of the input data
        self.N = None  # total number of samples

    def init_param(self, X_data, y_data):
        # Initialization; for now, assume all input data is numeric
        b = np.ones((X_data.shape[0], 1))
        X_data = np.hstack((X_data, b))  # append the bias term
        self.L = len(np.unique(y_data))
        self.D = X_data.shape[1]
        self.N = X_data.shape[0]
        self.w = np.ones((self.L, self.D))  # L*D: each class has its own set of weight parameters w
        return X_data

    def bgd(self, X_data, y_data):
        # Train with batch gradient descent
        step = 0
        while step < self.maxstep:
            step += 1
            prob = np.exp(X_data @ self.w.T)  # N*L; each row holds the unnormalized probabilities of that sample belonging to every class
            nf = np.transpose([prob.sum(axis=1)])  # N*1 normalizing factor
            nf = np.repeat(nf, self.L, axis=1)  # N*L
            prob = -prob / nf  # normalize; the minus sign here is only for convenience in computing the gradient below
            for i in range(self.N):
                prob[i, int(y_data[i])] += 1
            grad = -1.0 / self.N * prob.T @ X_data + self.C * self.w  # gradient; the second term is the weight-decay term
            self.w -= self.alpha * grad
        return

    def fit(self, X_data, y_data):
        X_data = self.init_param(X_data, y_data)
        self.bgd(X_data, y_data)
        return

    def predict(self, X):
        b = np.ones((X.shape[0], 1))
        X = np.hstack((X, b))  # append the bias term
        prob = np.exp(X @ self.w.T)
        return np.argmax(prob, axis=1)


if __name__ == '__main__':
    from sklearn.datasets import load_digits

    data = load_digits()
    X_data = data['data']
    y_data = data['target']
    clf = SoftMax(maxstep=10000, alpha=0.1, C=1e-4)

    from machine_learning_algorithm.cross_validation import validate

    g = validate(X_data, y_data, ratio=0.2)
    for item in g:
        X_train, y_train, X_test, y_test = item
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        score = 0
        for yt, yp in zip(y_test, y_pred):
            score += 1 if yt == yp else 0
        print(score / len(y_test))
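
As with the logistic code, np.exp(X @ self.w.T) in SoftMax can overflow once the scores grow large. A standard trick, shown below as a sketch rather than a change to the class above, is to subtract the row-wise maximum score before exponentiating; since softmax is invariant to adding a constant to every score in a row, the resulting probabilities are unchanged.

import numpy as np

def stable_softmax(scores):
    # Row-wise softmax with the maximum subtracted for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)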
