Deep learning tutorials are everywhere these days, and the basic theory is relatively easy to grasp, but there are still some pitfalls when you actually implement things yourself. On the one hand, tutorials rarely cover concrete engineering problems; on the other hand, working through the English PyTorch documentation is still a bit of a chore. I am writing this down as a record, and it also serves as my homework report.
Obtaining the dataset
First, import the required packages:
import torch
import torch.nn as nn
import torch.utils.data as Data
import numpy as np
import matplotlib.pyplot as plt
The dataset contains 500 samples in total, with feature_n = 8 input features and labels_n = 4 output classes. Here are the first two rows of the data:
x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | class label |
---|---|---|---|---|---|---|---|---|
0.4812 | 0.7790 | 0.8904 | 0.7361 | 0.9552 | 0.2119 | 0.7992 | 0.2409 | 4 |
0.4472 | 0.5985 | 0.7859 | 0.5035 | 0.6912 | 0.4038 | 0.0787 | 0.2301 | 1 |
We use the first 400 samples as the training set and the remaining 100 as the test set.
Define a function that loads the dataset from a file:
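def get_data(filename):
    # Each row holds 8 features followed by a class label in 1..4
    dataset = np.loadtxt(filename)
    x_train = dataset[0:400, 0:8]
    raw_y_train = dataset[0:400, 8:]
    x_test = dataset[400:, 0:8]
    raw_y_test = dataset[400:, 8:]
    # Shift the labels from 1..4 to 0..3, as torch.nn.CrossEntropyLoss() expects
    y_train = raw_y_train - 1
    y_test = raw_y_test - 1
    # One-hot alternative, only needed for a hand-written Softmax
    # (np.int was removed from recent NumPy releases, so plain int is used):
    # y_train = np.zeros((400, 4), dtype=int)
    # y_test = np.zeros((100, 4), dtype=int)
    # for i in range(400):
    #     y_train[i, int(raw_y_train[i, 0]) - 1] = 1
    # for j in range(100):
    #     y_test[j, int(raw_y_test[j, 0]) - 1] = 1
    return x_train, y_train, x_test, y_test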
In principle, a Softmax classifier has as many outputs as there are class labels; ideally the output for the correct class is 1 and the others are 0. So when the class label is 4, the predicted value should be close to [0, 0, 0, 1], whereas the dataset stores the label as [4]. If you build the network with NumPy and write the Softmax function by hand, you need to convert y_train into a two-dimensional one-hot matrix.
But the loss function that PyTorch provides, torch.nn.CrossEntropyLoss(), expects a slightly different data format. Its first argument y_pred is a matrix of size batch_size * labels_n, and its second argument y is a one-dimensional vector of length batch_size whose values lie in the range [0, labels_n - 1]. That is why we subtract 1 from y_train, so the labels 1/2/3/4 become 0/1/2/3.
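To make the expected shapes concrete, here is a minimal sketch with made-up values (the tensors are illustrative and not taken from the dataset). With all-zero scores every class is equally likely, so the loss should come out close to ln(4).

```python
import torch
import torch.nn as nn

batch_size, labels_n = 3, 4

# Raw scores (logits), shape (batch_size, labels_n), dtype float32
y_pred = torch.zeros(batch_size, labels_n)

# Labels as stored in the file (1..4), shifted to class indices 0..3
raw_labels = torch.tensor([4, 1, 2])
y = raw_labels - 1  # shape (batch_size,), dtype int64

loss = nn.CrossEntropyLoss()(y_pred, y)
print(y_pred.shape, y.shape, loss.item())
```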
Setting up batch training
The data we read in is still in ndarray format, so we first convert it to Tensor:
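x_train, y_train, x_test, y_test = get_data('dataset.txt')
x_train = torch.from_numpy(x_train).type(torch.FloatTensor)
y_train = torch.from_numpy(y_train).type(torch.LongTensor)
# The test data needs the same conversion before it is passed to the network
x_test = torch.from_numpy(x_test).type(torch.FloatTensor)
y_test = torch.from_numpy(y_test).type(torch.LongTensor)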
Note the dtype conversion here. This is also a requirement of the loss function torch.nn.CrossEntropyLoss(): the predicted scores must be float, and the correct labels must be Long.
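The conversion matters because NumPy arrays default to float64. A small sketch with stand-in arrays (x_np and y_np are hypothetical, shaped like what np.loadtxt would return here) shows the dtypes before and after:

```python
import numpy as np
import torch

x_np = np.random.rand(5, 8)                           # NumPy defaults to float64
y_np = np.array([[3.0], [0.0], [1.0], [2.0], [0.0]])  # labels read as floats, too

x = torch.from_numpy(x_np).type(torch.FloatTensor)
y = torch.from_numpy(y_np).type(torch.LongTensor)

print(torch.from_numpy(x_np).dtype)  # dtype before the conversion
print(x.dtype, y.dtype)              # dtypes after the conversion
```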
Next, build the training set and its loader; the test set can be handled in the same way:
train_set = Data.TensorDataset(x_train, y_train)
# BATCH_SIZE is defined together with the other hyperparameters further down
train_loader = Data.DataLoader(
    dataset=train_set,
    batch_size=BATCH_SIZE,
    shuffle=True
)
The book "Dive into Deep Learning" iterates over training batches with a hand-written data iterator. Personally, I find the DataLoader tool that PyTorch provides easier to use.
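As a rough illustration of what the loader yields, the sketch below uses random stand-in tensors with the same shapes as the training data in this post (the values are not the real dataset) and a batch size of 100:

```python
import torch
import torch.utils.data as Data

# Stand-in tensors shaped like the training data: 400 samples, 8 features
x_train = torch.rand(400, 8)
y_train = torch.randint(0, 4, (400, 1))

train_set = Data.TensorDataset(x_train, y_train)
train_loader = Data.DataLoader(dataset=train_set, batch_size=100, shuffle=True)

# Collect the shape of every batch produced in one pass over the data
batch_shapes = [(x.shape, y.shape) for x, y in train_loader]
print(len(train_loader))  # number of batches per epoch
print(batch_shapes[0])
```

Each pass over train_loader is one epoch; with shuffle=True the samples are reordered at the start of every pass.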
Defining the model
We use the quick construction method that PyTorch provides to build a 2-layer network consisting of one hidden layer and one output layer. Note that the loss function torch.nn.CrossEntropyLoss() already includes the Softmax function, so the network can output the result of its last linear layer directly.
net = nn.Sequential(
    nn.Linear(8, 50),
    nn.ReLU(),
    nn.Linear(50, 4)
)
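One way to check that the Softmax is already folded into the loss is to compare it with a manual log-softmax followed by the negative log-likelihood loss. This is only a sketch with random scores and arbitrary targets:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 4)              # raw linear outputs, no Softmax applied
target = torch.tensor([0, 3, 1, 2, 3])  # arbitrary class indices in 0..3

loss_builtin = nn.CrossEntropyLoss()(logits, target)
# Manual equivalent: log-softmax over the class dimension, then NLL loss
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(loss_builtin.item(), loss_manual.item())
```

The two values should agree up to floating-point error, which is why adding an extra nn.Softmax layer at the end of the network would apply Softmax twice.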
Next, define the functions and hyperparameters used to train the model:
EPOCH = 5000
BATCH_SIZE = 100
LR = 0.01
LOSS_FUNC = nn.CrossEntropyLoss()
OPTIMIZER = torch.optim.SGD(net.parameters(), lr=LR)
When the network is created, its parameters w and b are initialized automatically. Since I do not know which initialization would be more appropriate, I do not initialize them explicitly and simply use the defaults.
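To see what the defaults look like, the parameters can be inspected directly. The sketch below rebuilds the same network and prints the value range of each parameter. The nn.Linear documentation describes the default as a uniform distribution on [-1/sqrt(in_features), 1/sqrt(in_features)]; treat that bound as an assumption that may differ between PyTorch versions.

```python
import math
import torch.nn as nn

net = nn.Sequential(nn.Linear(8, 50), nn.ReLU(), nn.Linear(50, 4))

# Print the name, shape, and value range of every default-initialized parameter
for name, param in net.named_parameters():
    print(name, tuple(param.shape), float(param.min()), float(param.max()))

# Documented bound for the first layer, which has 8 input features
bound = 1 / math.sqrt(8)
first = net[0]
print(bound, float(first.weight.abs().max()))
```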
Training
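for epoch in range(1, EPOCH + 1):
    loss_sum = 0.0
    for step, (x, y) in enumerate(train_loader):
        y_pred = net(x)
        y = y.squeeze(1)  # fix the label shape: (batch_size, 1) -> (batch_size,)
        loss = LOSS_FUNC(y_pred, y)
        loss_sum += loss.item()  # accumulate a plain float, not a graph-attached tensor
        OPTIMIZER.zero_grad()
        loss.backward()
        OPTIMIZER.step()
    # Average the per-batch losses over the number of batches in this epoch
    print("epoch: %d, loss: %f" % (epoch, loss_sum / len(train_loader)))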
As mentioned earlier, the loss function has requirements on the data format. Our labels y are still a two-dimensional matrix of size batch_size * 1, so they must be reduced to one dimension before the loss is computed.
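A short sketch of the shape change, using made-up labels. It also shows one detail worth knowing: squeeze() without an argument removes every dimension of size 1, so a batch containing a single sample would collapse to a 0-dimensional tensor, while squeeze(1) only removes the label column.

```python
import torch

y = torch.tensor([[3], [0], [1]])  # shape (batch_size, 1), as yielded by the loader
y_flat = y.squeeze(1)              # shape (batch_size,)

single = torch.tensor([[2]])       # a batch that happens to contain one sample
print(y.shape, y_flat.shape)
print(single.squeeze().shape, single.squeeze(1).shape)
```

With 400 training samples and a batch size of 100 every batch is full, so the single-sample case does not come up in this post.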
Testing the model
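acc_sum = 0.0
with torch.no_grad():  # gradients are not needed for evaluation
    # Count the test samples whose highest-scoring class matches the label
    acc_sum += (net(x_test).argmax(dim=1) == y_test.squeeze(1)).sum().item()
print("test accuracy: %f" % (acc_sum / 100))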
In my tests, the accuracy on the test set reached 93%.