dejahu's deep learning study notes 01 - linear regression + PyTorch code implementation

In simple terms, machine learning problems can be divided into two kinds. We hope to abstract real problems into mathematical models, and according to what is being predicted, problems fall into regression problems and classification problems. If the prediction target is a continuous value, it is a regression problem; if the prediction target is a discrete value, it is a classification problem. For example, housing price prediction is a typical regression problem: given basic inputs such as the area and geographic location of a house, the model produces an approximate price for the house based on these inputs.


Principle introduction

What is a model?

I have been exposed to the concept of a "model" since my freshman year, but for a long time I was not very clear about what a model actually is. My current understanding is that a model is a mathematical formula. For example, the y = kx + b we learned in high school is a linear model. For a specific problem, we need not only the general form of the formula but also to solve for the specific values of k and b based on that problem. A model in today's deep learning can be understood in the same way, except that the formula is more complicated and there are more parameters to solve for.
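As a small illustration of this idea (the numbers below are made up, just to show the point), a linear model is simply a function whose parameters k and b are unknown until we fit them to data:

# A toy "model": the high-school linear formula y = k*x + b.
# Training means finding the values of k and b that best fit observed (x, y) pairs;
# the values 2.0 and 1.0 below are hypothetical, only for illustration.
def linear_model(x, k, b):
    return k * x + b

print(linear_model(3.0, k=2.0, b=1.0))  # 7.0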

Basic Elements of Linear Regression

In the regression problem, taking the housing price prediction just mentioned as an example, we denote the factors that affect the housing price as x and the transaction price as y; the problem can then be written as the following formula.
$y = w_1 x_1 + w_2 x_2 + w_3 x_3 + b$
The weights w and the bias b are the parameters that our machine learning model needs to learn, and they are learned from a large number of real house samples. Learning from a large amount of real house data that already has transaction prices is called supervised learning.
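As a quick worked example (all numbers here are hypothetical, just to make the formula concrete): suppose $w_1 = 0.8$, $w_2 = 1.2$, $w_3 = -0.5$ and $b = 10$, and a house has features $x_1 = 100$, $x_2 = 3$, $x_3 = 20$; then the model predicts $y = 0.8 \times 100 + 1.2 \times 3 - 0.5 \times 20 + 10 = 83.6$. Training adjusts w and b so that such predictions come close to the observed transaction prices.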

Representation of Linear Regression

We further abstract the above problem to obtain a general representation of the linear regression problem.

  • Given an n-dimensional input $x = [x_1, x_2, \dots, x_n]^T$

  • The linear model has an n-dimensional weight and a scalar bias:
    $w = [w_1, w_2, \dots, w_n]^T,\; b$

  • The output is the weighted sum of the inputs:
    $y = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b$

The vector version is $y = \langle w, x \rangle + b$.
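A minimal PyTorch sketch of this vector form (the numbers are the same hypothetical ones as in the worked example above, not real data):

import torch

x = torch.tensor([100.0, 3.0, 20.0])  # n-dimensional input (hypothetical values)
w = torch.tensor([0.8, 1.2, -0.5])    # n-dimensional weight
b = 10.0                              # scalar bias

y = torch.dot(w, x) + b               # y = <w, x> + b
print(y)                              # tensor(83.6000)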

This can also be represented graphically in the following form, which can be regarded as a single-layer neural network.

(Figure: linear regression represented as a single-layer neural network)

How to solve

The process of solving the model is like preparing for the college entrance exam: we have to do a large number of practice problems in advance and keep improving before we can handle the final exam. The model training process can be divided into the following steps:

  1. Calculate the loss

    Regression problems are relatively simple; we generally use the squared loss $l(y,\hat{y}) = \frac{1}{2}(y - \hat{y})^2$.

  2. Training data

    Collect a large amount of real house data and feed it to the model for learning. Generally, to help prevent overfitting, the data is fed in mini-batches (batch) for training.

  3. Parameter learning

    Linear regression actually has an explicit (closed-form) solution, but computing it is more troublesome, so stochastic gradient descent (SGD, via the chain rule) is generally used to solve for the parameters, as shown in the sketch after this list.
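The following sketch puts the three steps together from scratch, without nn.Module or an optimizer. It mirrors the framework version in the next section; the data shapes and update code are my own choices, not prescribed by the original post.

import torch

# Synthetic data: y = X @ true_w + true_b + noise (same true values as the code below)
true_w, true_b = torch.tensor([2.0, -3.4]), 4.2
X = torch.normal(0, 1, (1000, 2))
y = X @ true_w + true_b + torch.normal(0, 0.01, (1000,))

# Parameters to learn, with autograd enabled
w = torch.zeros(2, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

lr, batch_size, num_epochs = 0.03, 10, 5
for epoch in range(num_epochs):
    indices = torch.randperm(len(X))               # shuffle for mini-batch sampling
    for start in range(0, len(X), batch_size):
        idx = indices[start:start + batch_size]
        y_hat = X[idx] @ w + b                     # forward pass
        loss = ((y_hat - y[idx]) ** 2 / 2).mean()  # 1. squared loss on the mini-batch
        loss.backward()                            # 2. backpropagation (chain rule)
        with torch.no_grad():                      # 3. SGD parameter update
            w -= lr * w.grad
            b -= lr * b.grad
            w.grad.zero_()
            b.grad.zero_()
    print(f'epoch {epoch + 1}, w = {w.detach()}, b = {b.item():.3f}')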

Code

We generate a synthetic dataset whose inputs have 2 dimensions, and then use this data to recover w1, w2 and b. The code is implemented with the help of PyTorch. Building the model is very simple: we only need to construct a fully connected layer with input size 2 and output size 1 via nn.Linear(2, 1).

import numpy as np
import torch
from torch.utils import data
from d2l import torch as d2l
from torch import nn

# 1. Generate the dataset
true_w = torch.tensor([2, -3.4])
true_b = 4.2
features, labels = d2l.synthetic_data(true_w, true_b, 1000)


# Use the framework's API to read the dataset
def load_array(data_arrays, batch_size, is_train=True):
    """构造一个PyTorch数据迭代器。"""
    dataset = data.TensorDataset(*data_arrays)
    # PyTorch's DataLoader builds a loader over the dataset and returns batches of data as an iterator
    return data.DataLoader(dataset, batch_size, shuffle=is_train)


batch_size = 10
data_iter = load_array((features, labels), batch_size)

print(next(iter(data_iter)))

# 2. Build the model
net = nn.Sequential(nn.Linear(2, 1))
# Initialize the model parameters; PyTorch layers come with a default initialization, so skipping this step is also fine
net[0].weight.data.normal_(0, 0.01)
net[0].bias.data.fill_(0)

# 3. Set the loss function; the computed loss is a scalar
loss = nn.MSELoss()

# 4. Set the optimizer; we optimize the network's parameters with a given learning rate
trainer = torch.optim.SGD(net.parameters(), lr=0.03)

# 5. Start training
num_epochs = 5
for epoch in range(num_epochs):
    for X, y in data_iter:
        l = loss(net(X), y)
        trainer.zero_grad() # clear the gradients
        l.backward() # backpropagate to compute the gradients
        trainer.step() # gradient descent step to update the weight parameters
    l = loss(net(features), labels)
    print(f'epoch {epoch + 1}, loss {l:f}')
# Model saving: torch.save() can save the whole model and load it back directly, or save only the parameters and load them after re-initializing the model
# 6. Verify the model
w = net[0].weight.data
print('estimation error of w:', true_w - w.reshape(true_w.shape))
b = net[0].bias.data
print('estimation error of b:', true_b - b)

The output is as follows:

(Screenshot of the console output: the loss printed for each epoch and the estimation errors of w and b)

It can be seen that the loss keeps decreasing. Because the data is relatively simple, the model is basically optimal by the second epoch.
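As mentioned in the saving comment inside the code above, one common approach is to save only the parameters (state_dict) and load them back into a freshly constructed model. A minimal sketch, continuing from the trained net above (the filename 'linreg.pt' is just an example):

# Save only the parameters, then restore them into a newly built model of the same shape.
torch.save(net.state_dict(), 'linreg.pt')  # 'linreg.pt' is a hypothetical filename

net2 = nn.Sequential(nn.Linear(2, 1))
net2.load_state_dict(torch.load('linreg.pt'))
print(net2[0].weight.data, net2[0].bias.data)  # should match the trained parameters of net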

Linear regression in practice - house price prediction

If this post reaches more than 30 likes, I will follow up with the Kaggle house price prediction content. You are also welcome to follow my Bilibili account, where I post practical deep learning videos from time to time. Thank you for your support!

dejahu's personal space on Bilibili


Origin blog.csdn.net/ECHOSON/article/details/118462883