PyTorch learning - the second model (logistic regression)

Reference blog: "Systematic PyTorch learning, Note 2: PyTorch's dynamic graph, automatic differentiation and logistic regression"
$$class=\left\{ \begin{array}{rcl} 0 & & {0.5 > y}\\ 1 & & {0.5 \le y}\\ \end{array} \right.$$
Classify according to the value of y: when it is less than 0.5 the sample is assigned to class 0, and when it is greater than or equal to 0.5 it is assigned to class 1.
Linear regression: the independent variable is $x$, the dependent variable is $y$, and the relationship is $y = wx + b$, whose graph is a straight line. It is a method for analyzing the relationship between the independent variable $x$ and the dependent variable $y$ (a scalar). Note that "linear" here is with respect to $w$: each $w$ only affects one $x$. The decision boundary is a straight line.
Logistic regression: the independent variable is $x$ and the dependent variable is $y$, except that here $y$ becomes a probability. Relation:
$$y=f(wx+b)$$
$$f(x)=\frac{1}{1+e^{-x}}$$
The graph of $y$ is an S-shaped (sigmoid) curve rather than a straight line; it is the decision boundary $wx + b = 0$ that stays linear. Logistic regression is a method for analyzing the relationship between the independent variable $x$ and the dependent variable $y$ (a probability).
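As a small illustration of the two formulas above, the following sketch computes y = f(wx + b) for a scalar input and then applies the 0.5 threshold. The values of w and b and the helper names sigmoid and predict_class are made up for this example; they are not taken from the trained model.

```python
import math

def sigmoid(z):
    """Logistic function f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(x, w, b):
    """Return (probability, class) for a scalar input x."""
    y = sigmoid(w * x + b)
    return y, (1 if y >= 0.5 else 0)

# Hypothetical parameters, chosen only for illustration
w, b = 2.0, -1.0

for x in (-1.0, 0.5, 3.0):
    y, cls = predict_class(x, w, b)
    print(f"x={x:+.1f}  y={y:.4f}  class={cls}")
```

With these values, x = 0.5 gives wx + b = 0, so y is exactly 0.5 and the sample falls on the class 1 side of the threshold.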

Data generation

Here we randomly generate two classes of samples (labelled 0 and 1), 100 samples per class, each sample having two features.

import torch

"""Data generation"""
torch.manual_seed(1)  # fix the random seed so the samples are reproducible

sample_nums = 100
mean_value = 1.7
bias = 1

n_data = torch.ones(sample_nums, 2)
x0 = torch.normal(mean_value*n_data, 1) + bias  # class 0 features, shape=(100, 2)
y0 = torch.zeros(sample_nums)   # class 0 labels, shape=(100,)
x1 = torch.normal(-mean_value*n_data, 1) + bias   # class 1 features, shape=(100, 2)
y1 = torch.ones(sample_nums)    # class 1 labels, shape=(100,)

train_x = torch.cat([x0, x1], 0)  # shape=(200, 2)
train_y = torch.cat([y0, y1], 0)  # shape=(200,)

Modeling

Here we use two methods to build our logistic regression model. One is PyTorch's Sequential container. This method is simple and easy to understand: like building blocks, the model is built up layer by layer. The other is to subclass nn.Module. This way is very flexible and can build all kinds of complex networks.

"""Build the model"""
class LR(torch.nn.Module):
    def __init__(self):
        super(LR, self).__init__()
        self.features = torch.nn.Linear(2, 1)  # in_features: number of feature values per input sample; out_features: likewise for the output
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.features(x)
        x = self.sigmoid(x)

        return x

lr_net = LR()     # instantiate the logistic regression model

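The other way, using Sequential (with this version the linear layer is accessed as lr_net[0] rather than lr_net.features, which the plotting code in the training loop relies on):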

lr_net = torch.nn.Sequential(
    torch.nn.Linear(2, 1),
    torch.nn.Sigmoid()
)

Choose a loss function

"""Choose a loss function"""
loss_fn = torch.nn.BCELoss()

There are two points to note when using BCELoss:

1. It is only used for binary classification problems; the name stands for binary cross-entropy loss.

2. It expects probabilities in the range [0, 1], so the model output must pass through Sigmoid() before it is given to the loss.
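To make the second point concrete, here is a minimal sketch that compares BCELoss on sigmoid outputs with the formula written out by hand. The logits and labels are made-up values for illustration. As a side note, torch.nn.BCEWithLogitsLoss is a related loss that applies the sigmoid internally and therefore takes the raw linear output instead.

```python
import torch

# Hypothetical logits and labels, used only for illustration
logits = torch.tensor([2.0, -1.0, 0.5])
labels = torch.tensor([1.0, 0.0, 1.0])

probs = torch.sigmoid(logits)  # squash the logits into (0, 1)

# BCELoss expects probabilities, so it is applied after the sigmoid
loss_builtin = torch.nn.BCELoss()(probs, labels)

# The same quantity written out: mean of -[t*log(p) + (1-t)*log(1-p)]
loss_manual = -(labels * torch.log(probs)
                + (1 - labels) * torch.log(1 - probs)).mean()

# BCEWithLogitsLoss fuses the sigmoid into the loss and takes raw logits
loss_fused = torch.nn.BCEWithLogitsLoss()(logits, labels)

print(loss_builtin.item(), loss_manual.item(), loss_fused.item())
```

All three values should agree up to floating-point rounding.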

Choose an optimizer

"""Choose an optimizer"""
lr = 0.01
optimizer = torch.optim.SGD(lr_net.parameters(), lr=lr, momentum=0.9)

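Iterative model training

"""Train the model"""
import matplotlib.pyplot as plt
import numpy as np

for iteration in range(1000):

    # Forward pass
    y_pred = lr_net(train_x)

    # Compute the loss
    loss = loss_fn(y_pred.squeeze(), train_y)

    # Backward pass
    loss.backward()

    # Update the parameters
    optimizer.step()

    # Clear the gradients
    optimizer.zero_grad()

    # Plot every 20 iterations
    if iteration % 20 == 0:

        mask = y_pred.ge(0.5).float().squeeze()  # classify using 0.5 as the threshold
        correct = (mask == train_y).sum()  # number of correctly predicted samples
        acc = correct.item() / train_y.size(0)  # classification accuracy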

        plt.scatter(x0.data.numpy()[:, 0], x0.data.numpy()[:, 1], c='r', label='class 0')
        plt.scatter(x1.data.numpy()[:, 0], x1.data.numpy()[:, 1], c='b', label='class 1')

        w0, w1 = lr_net.features.weight[0]
        w0, w1 = float(w0.item()), float(w1.item())
        plot_b = float(lr_net.features.bias[0].item())
        plot_x = np.arange(-6, 6, 0.1)
        plot_y = (-w0 * plot_x - plot_b) / w1

        plt.xlim(-5, 7)
        plt.ylim(-7, 7)
        plt.plot(plot_x, plot_y)

        plt.text(-5, 5, 'Loss=%.4f' % loss.data.numpy(), fontdict={'size': 20, 'color': 'red'})
        plt.title("Iteration: {}\nw0:{:.2f} w1:{:.2f} b: {:.2f} accuracy:{:.2%}".format(iteration, w0, w1, plot_b, acc))
        plt.legend()

        plt.show()
        plt.pause(0.5)

        if acc > 0.99:
            break
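The straight line drawn in the training loop is the decision boundary. The model outputs exactly 0.5 where w0*x + w1*y + b = 0, and solving that for y gives the plot_y expression. The sketch below checks this with made-up weights; w0, w1, b and the helper boundary_y are hypothetical and not the trained values.

```python
import math

# Hypothetical weights, chosen only for illustration
w0, w1, b = -1.5, -2.0, 3.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def boundary_y(x):
    """Solve w0*x + w1*y + b = 0 for y (requires w1 != 0)."""
    return (-w0 * x - b) / w1

for x in (-2.0, 0.0, 2.0):
    y = boundary_y(x)
    p = sigmoid(w0 * x + w1 * y + b)
    print(f"x={x:+.1f}  boundary y={y:+.3f}  model output={p:.3f}")
```

Every point on the line gives a model output of 0.5, which is why points on either side of it are assigned to different classes.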

Some functions explained

.item()

In PyTorch training code, the .item() method is used frequently, for example loss.item(). According to its documentation:
∙ Returns the value of this tensor as a standard Python number. This only works for single-element tensors. For other cases, see tolist().
∙ This operation is not differentiable.
Calling .item() on a floating-point result gives a plain Python number, which prints with more digits than a tensor's default display (4 decimal places). So when reporting loss or accuracy, we generally use x[1,1].item() instead of simply using x[1,1].
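A minimal sketch of the difference, using a made-up 2x2 tensor (the values and variable names are only for illustration):

```python
import torch

x = torch.tensor([[0.123456789, 1.0],
                  [2.0, 3.0]])

elem = x[0, 0]        # indexing gives a zero-dimensional tensor
value = elem.item()   # .item() gives a plain Python float

print(type(elem), elem)    # a tensor, shown with 4 decimal places by default
print(type(value), value)  # a float, shown with full precision

# .item() only works on single-element tensors; use .tolist() otherwise
print(x.tolist())

try:
    x.item()
except (RuntimeError, ValueError) as err:
    print("x.item() failed:", err)
```

Because the result of .item() is an ordinary Python number, it carries no gradient information and is meant for logging and reporting only.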

.ge()

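In the training code the prediction mask is built with mask = y_pred.ge(0.5).float().squeeze(), which can be read in three steps:
a = y_pred.ge(0.5) marks every element of y_pred that is >= 0.5 as True
b = a.float() converts the boolean values to floating point (1.0 and 0.0)
mask = b.squeeze() removes the size-1 dimension, giving a 1-D sequence of predicted labels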
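The three steps can be tried on a small made-up example. The values of y_pred and train_y below are hypothetical; only the chain of calls mirrors the training loop.

```python
import torch

# Hypothetical model outputs with shape (4, 1), as produced by Linear(2, 1)
y_pred = torch.tensor([[0.20], [0.50], [0.73], [0.49]])
train_y = torch.tensor([0.0, 1.0, 1.0, 1.0])

a = y_pred.ge(0.5)    # boolean tensor, True where y_pred >= 0.5
b = a.float()         # True/False -> 1.0/0.0
mask = b.squeeze()    # shape (4, 1) -> (4,), matching train_y

correct = (mask == train_y).sum()
acc = correct.item() / train_y.size(0)
print(mask, acc)
```

The squeeze() matters: comparing a (4, 1) tensor with a (4,) tensor would broadcast to a (4, 4) result and give a wrong count of correct predictions.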

Complete code

import torch
import matplotlib.pyplot as plt
import numpy as np
"""Data generation"""
torch.manual_seed(1)

sample_nums = 100
mean_value = 1.7
bias = 1

n_data = torch.ones(sample_nums, 2)
x0 = torch.normal(mean_value*n_data, 1) + bias  # class 0 features, shape=(100, 2)
y0 = torch.zeros(sample_nums)   # class 0 labels, shape=(100,)
x1 = torch.normal(-mean_value*n_data, 1) + bias   # class 1 features, shape=(100, 2)
y1 = torch.ones(sample_nums)    # class 1 labels, shape=(100,)

train_x = torch.cat([x0, x1], 0)
train_y = torch.cat([y0, y1], 0)
"""Build the model"""


class LR(torch.nn.Module):
    def __init__(self):
        super(LR, self).__init__()
        self.features = torch.nn.Linear(2, 1)  # Linear is a subclass of Module, a parameterized module that, as its name suggests, represents a linear transformation: 2 input nodes, 1 output node
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.features(x)
        x = self.sigmoid(x)

        return x


lr_net = LR()  # instantiate the logistic regression model
"""Choose a loss function"""
loss_fn = torch.nn.BCELoss()
"""Choose an optimizer"""
lr = 0.01
optimizer = torch.optim.SGD(lr_net.parameters(), lr=lr, momentum=0.9)
#acce=[]
"""Train the model"""
for iteration in range(1000):

    # Forward pass
    y_pred = lr_net(train_x)

    # Compute the loss
    loss = loss_fn(y_pred.squeeze(), train_y)

    # Backward pass
    loss.backward()

    # Update the parameters
    optimizer.step()

    # Clear the gradients
    optimizer.zero_grad()

    # Plot every 20 iterations
    if iteration % 20 == 0:
        mask = y_pred.ge(0.5).float().squeeze()  # classify using 0.5 as the threshold
        correct = (mask == train_y).sum()  # number of correctly predicted samples
        acc = correct.item() / train_y.size(0)  # classification accuracy

        plt.scatter(x0.data.numpy()[:, 0], x0.data.numpy()[:, 1], c='r', label='class 0')
        plt.scatter(x1.data.numpy()[:, 0], x1.data.numpy()[:, 1], c='b', label='class 1')

        w0, w1 = lr_net.features.weight[0]
        w0, w1 = float(w0.item()), float(w1.item())
        plot_b = float(lr_net.features.bias[0].item())
        plot_x = np.arange(-6, 6, 0.1)
        plot_y = (-w0 * plot_x - plot_b) / w1

        plt.xlim(-5, 7)
        plt.ylim(-7, 7)
        plt.plot(plot_x, plot_y)

        plt.text(-5, 5, 'Loss=%.4f' % loss.data.numpy(), fontdict={'size': 20, 'color': 'red'})
        plt.title("Iteration: {}\nw0:{:.2f} w1:{:.2f} b: {:.2f} accuracy:{:.2%}".format(iteration, w0, w1, plot_b, acc))
        plt.legend()

        plt.show()
        plt.pause(0.5)

        if acc > 0.99:
            break
	# x=range(0,100,20)
	# plt.plot(x, acce, c='r')
	# plt.title('acc')
	# plt.ylabel("acc")
	# plt.xlabel("epoch")
	# plt.show()


Origin blog.csdn.net/fcxgfdjy/article/details/131837842