Preface
Linear regression is the foundation of regression problems.
The purpose of linear regression is to find a mathematical formula that combines all the independent variables, as a weighted sum plus a bias, so that the result is as close as possible to the target.
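To make "a weighted sum plus a bias" concrete, here is a minimal sketch. The values of `x`, `w`, `b` and `target` are made up purely for illustration and do not come from the article.

```python
import torch

# Hypothetical example values, chosen only for illustration
x = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])        # 2 samples, 2 independent variables
w = torch.tensor([0.5, -1.0])         # one weight per variable
b = torch.tensor(2.0)                 # bias

y_hat = torch.matmul(x, w) + b        # weighted sum plus bias, one prediction per sample
target = torch.tensor([0.0, 0.0])
mse = ((y_hat - target) ** 2).mean()  # how far the predictions are from the target
print(y_hat, mse)
```

Fitting a linear regression means adjusting `w` and `b` until an error measure such as `mse` is as small as possible.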
General steps
Implementation steps:
1. Construct the data set
2. Read the data set
3. Build the model: build the network, build the loss function, build the optimizer
4. Training: initialize parameters, forward propagation, backward propagation, gradient clearing, parameter optimization
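For orientation, the four steps can be sketched with PyTorch's built-in components. This is not the from-scratch approach this article takes. It swaps in `TensorDataset`, `DataLoader`, `nn.Linear`, `nn.MSELoss` and `torch.optim.SGD`, and the toy data (y = 2x + 1), learning rate and epoch count are assumptions chosen for the sketch.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # assumption: fixed seed so the sketch is repeatable

# 1. Construct a data set (toy data: y = 2x + 1 plus small noise)
x = torch.randn(200, 1)
y = 2.0 * x + 1.0 + 0.01 * torch.randn(200, 1)

# 2. Read the data set in shuffled mini-batches
loader = DataLoader(TensorDataset(x, y), batch_size=20, shuffle=True)

# 3. Build the model: network, loss function, optimizer
net = nn.Linear(1, 1)            # parameters are initialized by the constructor
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

# 4. Training
for epoch in range(20):
    for xb, yb in loader:
        pred = net(xb)           # forward propagation
        loss = loss_fn(pred, yb)
        optimizer.zero_grad()    # gradient clearing
        loss.backward()          # backward propagation
        optimizer.step()         # parameter optimization

learned_w = net.weight.item()
learned_b = net.bias.item()
print(learned_w, learned_b)
```

The learned weight and bias should end up close to 2 and 1. The rest of the article implements each of these pieces by hand.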
1. Construct the dataset
Feature dimension: 2
Number of samples: 1000
Preset model w value: [1.5, 2.6]
Preset model b value: -3.6
These settings are equivalent to the following formula:
$$y = 1.5 x_{1} + 2.6 x_{2} - 3.6$$
To mimic the measurement error present in real data, add Gaussian white noise ϵ with mean 0 and standard deviation 0.01 (the value passed to torch.normal in the code) to the result. The final model can then be written as
$$y = 1.5 x_{1} + 2.6 x_{2} - 3.6 + \epsilon$$
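import torch

def original_model(x):
    # Ground-truth model: y = 1.5 * x1 + 2.6 * x2 - 3.6
    return 1.5 * x[:, 0] + 2.6 * x[:, 1] - 3.6

def generator_data(dim, num):
    # Draw num samples with dim features each from a standard normal distribution
    x_set = torch.normal(0, 1, (num, dim))
    y = torch.matmul(x_set, torch.tensor([1.5, 2.6])) + torch.tensor(-3.6)
    # y = original_model(x_set)  # equivalent alternative
    # The second argument of torch.normal is the standard deviation
    epsilon = torch.normal(0, 0.01, y.shape)
    return x_set, y + epsilon

features, labels = generator_data(2, 1000)
features.shape, labels.shape, features[0], labels[0]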
The last line shows the structure of the data: the shapes of the features and labels, and the first sample.
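One way to sanity-check the generated data is to fit it in closed form and see whether the preset parameters come back. The sketch below is an addition, not part of the original walkthrough. It regenerates the data with a fixed seed (an assumption made for repeatability) and solves the least-squares problem with `torch.linalg.lstsq`.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed for repeatability

# Same generating process as above: y = 1.5*x1 + 2.6*x2 - 3.6 + noise
x_set = torch.normal(0, 1, (1000, 2))
y = torch.matmul(x_set, torch.tensor([1.5, 2.6])) - 3.6
y = y + torch.normal(0, 0.01, y.shape)

# Append a column of ones so the bias is estimated together with the weights
design = torch.cat([x_set, torch.ones(1000, 1)], dim=1)
solution = torch.linalg.lstsq(design, y.unsqueeze(1)).solution.squeeze(1)

w_est, b_est = solution[:2], solution[2]
print(w_est, b_est)
```

If the estimates are close to [1.5, 2.6] and -3.6, the data set is consistent with the preset model, and the gradient-based training later on should approach the same values.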
2. Read data
Reading data means fetching one small batch of samples at a time. The hyperparameter involved is batch_size, the number of samples per batch.
import random

def data_iter(x, y, batch_size):
    num = len(y)
    data_list = list(range(num))
    # Shuffle the indices so each pass visits the samples in random order
    random.shuffle(data_list)
    for i in range(0, num, batch_size):
        # The last batch may be smaller than batch_size
        batch_index = torch.LongTensor(data_list[i: min(i + batch_size, num)])
        batch_features = torch.index_select(x, dim=0, index=batch_index)
        batch_labels = torch.index_select(y, dim=0, index=batch_index)
        yield batch_features, batch_labels

for feature, label in data_iter(features, labels, 10):
    print(feature.shape, label.shape)
    break
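The same batching idea can be written more compactly with `torch.randperm` and tensor indexing. This is a variant sketch, not the article's function: the name `data_iter_randperm` and the small 25-sample data set are hypothetical. It also shows that every sample is visited exactly once per pass and that the last batch may be short.

```python
import torch

def data_iter_randperm(x, y, batch_size):
    # Hypothetical variant of data_iter: shuffle indices with torch.randperm
    num = len(y)
    perm = torch.randperm(num)
    for i in range(0, num, batch_size):
        idx = perm[i:i + batch_size]  # slicing stops at num, so the last batch may be short
        yield x[idx], y[idx]

# Toy data: 25 samples, label k belongs to feature row [2k, 2k + 1]
x = torch.arange(50, dtype=torch.float32).reshape(25, 2)
y = torch.arange(25, dtype=torch.float32)

batch_sizes = []
seen = []
for xb, yb in data_iter_randperm(x, y, 10):
    batch_sizes.append(len(yb))
    seen.extend(yb.tolist())

print(batch_sizes)
print(sorted(seen) == [float(i) for i in range(25)])
```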
3. Build the model
Define the model
# Define the model
def linear_regression(x, w, b):
    # x: (n, 2), w: (2, 1), b: (1,) -> output: (n, 1)
    out = torch.matmul(x, w)
    out = out + b
    return out
Define the loss function
def squared_loss(pred_y, y):
    # Reshape y to match pred_y, then return one squared loss per sample
    return (pred_y - y.view(pred_y.shape))**2 / 2
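The reshape inside the loss matters because the model output has shape (n, 1) while the labels have shape (n,). A small sketch with made-up numbers shows what broadcasting would do without it:

```python
import torch

pred_y = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1), like the model output
y = torch.tensor([1.0, 2.0, 5.0])             # shape (3,), like the labels

# Without reshaping, broadcasting compares every prediction with every label
wrong = (pred_y - y) ** 2 / 2
# Reshaping the labels first gives one loss value per sample
right = (pred_y - y.view(pred_y.shape)) ** 2 / 2

print(wrong.shape)  # torch.Size([3, 3])
print(right.shape)  # torch.Size([3, 1])
```

The unreshaped version silently produces a 3 x 3 matrix, so summing it would give a wrong loss without raising any error.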
Define the optimization algorithm
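Mini-batch stochastic gradient descent (SGD)
def sgd(params, lr, batch_size):
    # Update in place without recording the operations in the autograd graph
    with torch.no_grad():
        for param in params:
            # The loss is summed over the batch, so divide by batch_size to average
            param -= lr * param.grad / batch_size
            # Clear the gradient so it does not accumulate into the next step
            param.grad.zero_()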
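A single update step can be traced by hand. The sketch below repeats the `sgd` function so that it runs on its own. The one-parameter model and the two-sample batch are hypothetical values chosen so the arithmetic is easy to follow.

```python
import torch

def sgd(params, lr, batch_size):
    # Same update rule as in the text
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size
            param.grad.zero_()

w = torch.tensor([1.0], requires_grad=True)
x = torch.tensor([1.0, 2.0])      # hypothetical mini-batch of 2 samples
y = torch.tensor([2.0, 4.0])      # targets generated by y = 2x

l = ((w * x - y) ** 2 / 2).sum()  # summed squared loss over the batch
l.backward()                      # dl/dw = sum((w*x - y) * x) = -1 - 4 = -5
grad_before = w.grad.clone()
sgd([w], lr=0.1, batch_size=2)    # w <- 1 - 0.1 * (-5) / 2 = 1.25

print(grad_before, w, w.grad)
```

The parameter moves from 1 toward the true value 2, and the gradient is zero again afterwards, ready for the next batch.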
4. Train the model
# Initialize parameters
w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
# Define hyperparameters
lr = 0.021
num_epochs = 13
batch_size = 20
net = linear_regression
loss = squared_loss

for epoch in range(num_epochs):
    for x, y in data_iter(features, labels, batch_size):
        # Forward propagation: sum the per-sample losses over the mini-batch
        l = loss(net(x, w, b), y).sum()
        # Backward propagation
        l.backward()
        # Update parameters (sgd also clears the gradients)
        sgd([w, b], lr, batch_size)
    # Evaluate: average loss over the full training set after each epoch
    with torch.no_grad():
        train_l = loss(net(features, w, b), labels)
        print(epoch, ": ", train_l.sum().item() / 1000)
Prediction results:
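Besides the printed loss, the result can be judged by comparing the learned parameters with the preset ones. The following is a condensed, self-contained sketch of the whole pipeline with the same hyperparameters. The fixed seed and the use of `torch.randperm` for shuffling are assumptions made to keep it short and repeatable, and it prints parameter errors rather than reproducing the article's own output.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed for repeatability

# Data set: y = 1.5*x1 + 2.6*x2 - 3.6 + noise
true_w, true_b = torch.tensor([1.5, 2.6]), -3.6
features = torch.normal(0, 1, (1000, 2))
labels = torch.matmul(features, true_w) + true_b
labels = labels + torch.normal(0, 0.01, labels.shape)

# Parameters and hyperparameters as in the training code above
w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr, num_epochs, batch_size = 0.021, 13, 20

for epoch in range(num_epochs):
    perm = torch.randperm(1000)  # reshuffle the sample order every epoch
    for i in range(0, 1000, batch_size):
        idx = perm[i:i + batch_size]
        pred = torch.matmul(features[idx], w) + b
        l = ((pred - labels[idx].view(pred.shape)) ** 2 / 2).sum()
        l.backward()
        with torch.no_grad():
            for param in (w, b):
                param -= lr * param.grad / batch_size
                param.grad.zero_()

# Distance between the learned and the preset parameters
w_error = (w.detach().view(-1) - true_w).abs().max().item()
b_error = abs(b.item() - true_b)
print("max |w error|:", w_error, " |b error|:", b_error)
```

Small errors for both `w` and `b` indicate that training has recovered the model used to generate the data.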