"Hands-on science learning PyTorch depth version of" 1

Section I: Linear Regression

For vector operations, direct vectorized computation is far more efficient than looping over the elements one by one.
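
A quick timing sketch of this claim (the tensor names a and b are illustrative, not from the original notes):

import time
import torch

a = torch.ones(1000)
b = torch.ones(1000)

# adding the vectors element by element in a Python loop
start = time.time()
c = torch.zeros(1000)
for i in range(1000):
    c[i] = a[i] + b[i]
print('loop: %f sec' % (time.time() - start))

# direct vectorized addition
start = time.time()
d = a + b
print('vectorized: %f sec' % (time.time() - start))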

  1. Code to build a neural network in PyTorch:

Method 1: define the network as a class (subclass nn.Module)
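
The class-based code itself is not included in these notes; a minimal sketch, assuming the same single linear layer as the nn.Sequential versions below:

import torch.nn as nn

class LinearNet(nn.Module):
    def __init__(self, n_feature):
        super(LinearNet, self).__init__()
        self.linear = nn.Linear(n_feature, 1)  # single output, matching the Sequential examples

    def forward(self, x):
        return self.linear(x)

net = LinearNet(num_inputs)  # num_inputs: number of input features

Method 2: use nn.Sequential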

#ways to init a multilayer network
#method one
net = nn.Sequential(
    nn.Linear(num_inputs, 1),
    # other layers can be added here
    )

#method two
net = nn.Sequential()
net.add_module('linear', nn.Linear(num_inputs, 1))
#net.add_module ......

#method three
from collections import OrderedDict
net = nn.Sequential(OrderedDict([
          ('linear', nn.Linear(num_inputs, 1))
          # ......
        ]))

print(net)
print(net[0])
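
The notes stop at building the network; a minimal sketch of the rest of the training setup, assuming tensors named features (shape (n, num_inputs)) and labels (shape (n,)), both names being illustrative:

import torch.nn as nn
import torch.optim as optim
import torch.utils.data as Data

batch_size = 10
dataset = Data.TensorDataset(features, labels)
data_iter = Data.DataLoader(dataset, batch_size, shuffle=True)

loss = nn.MSELoss()                                # squared-error loss for regression
optimizer = optim.SGD(net.parameters(), lr=0.03)   # plain SGD on all parameters

for epoch in range(3):
    for X, y in data_iter:
        l = loss(net(X), y.view(-1, 1))
        optimizer.zero_grad()   # zero gradients so they do not accumulate
        l.backward()
        optimizer.step()
    print('epoch %d, loss: %f' % (epoch + 1, l.item()))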

Section II: Softmax and Classification

1. Softmax regression: $o = Wx + b$, $\hat{y} = \mathrm{softmax}(o)$, where $\mathrm{softmax}(o)_j = \frac{\exp(o_j)}{\sum_{i=1}^{q} \exp(o_i)}$ turns the raw outputs into a probability distribution.
2. Cross-entropy loss function: $\ell(\Theta) = \frac{1}{n} \sum_{i=1}^{n} H\left(y^{(i)}, \hat{y}^{(i)}\right)$, with $H\left(y^{(i)}, \hat{y}^{(i)}\right) = -\sum_{j=1}^{q} y_j^{(i)} \log \hat{y}_j^{(i)}$.
3. Code to train the neural network:

num_epochs, lr = 5, 0.1

# This function is saved in the d2lzh_pytorch package for later use
def train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size,
              params=None, lr=None, optimizer=None):
    for epoch in range(num_epochs):
        train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
        for X, y in train_iter:
            y_hat = net(X)
            l = loss(y_hat, y).sum()
            
            # zero the gradients
            if optimizer is not None:
                optimizer.zero_grad()
            elif params is not None and params[0].grad is not None:
                for param in params:
                    param.grad.data.zero_()
            
            l.backward()
            if optimizer is None:
                d2l.sgd(params, lr, batch_size)
            else:
                optimizer.step()

            train_l_sum += l.item()
            train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
            n += y.shape[0]
        test_acc = evaluate_accuracy(test_iter, net)
        print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
              % (epoch + 1, train_l_sum / n, train_acc_sum / n, test_acc))

train_ch3(net, train_iter, test_iter, cross_entropy, num_epochs, batch_size, [W, b], lr)

Before backpropagation computes new gradients, the existing gradients must be zeroed; otherwise PyTorch accumulates them across iterations.

In each iteration: zero the gradients first, compute them with backward(), then update the parameters; the gradients are cleared again before the next computation.
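
The cross_entropy function passed to train_ch3 above, and the evaluate_accuracy it calls, are not shown in these notes. A minimal from-scratch sketch in the style of the d2l book (illustrative, not necessarily the exact package code):

import torch

def softmax(X):
    # exponentiate, then normalize each row so it sums to 1
    X_exp = X.exp()
    partition = X_exp.sum(dim=1, keepdim=True)
    return X_exp / partition

def cross_entropy(y_hat, y):
    # negative log of the predicted probability of the true class
    return -torch.log(y_hat.gather(1, y.view(-1, 1)))

def evaluate_accuracy(data_iter, net):
    # fraction of examples whose highest-scoring class matches the label
    acc_sum, n = 0.0, 0
    for X, y in data_iter:
        acc_sum += (net(X).argmax(dim=1) == y).float().sum().item()
        n += y.shape[0]
    return acc_sum / n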

Section III: Multilayer Perceptron

1. Building an MLP in PyTorch

import torch.nn as nn
from torch.nn import init

num_inputs, num_outputs, num_hiddens = 784, 10, 256

net = nn.Sequential(
        d2l.FlattenLayer(),
        nn.Linear(num_inputs, num_hiddens),
        nn.ReLU(),
        nn.Linear(num_hiddens, num_outputs),
        )

# initialize every parameter from N(0, 0.01)
for params in net.parameters():
    init.normal_(params, mean=0, std=0.01)
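
To actually train this MLP, the train_ch3 loop from Section II can be reused with PyTorch's built-in loss and optimizer; a sketch assuming the Fashion-MNIST iterators from d2l (the batch_size and lr values here are illustrative):

import torch

batch_size, lr, num_epochs = 256, 0.5, 5
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)

loss = torch.nn.CrossEntropyLoss()                    # fuses softmax and cross-entropy
optimizer = torch.optim.SGD(net.parameters(), lr=lr)  # plain SGD on all parameters

train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size,
          None, None, optimizer)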

Origin: blog.csdn.net/xfxlesson/article/details/104319219