1. PyTorch Basics (some PyTorch fundamentals)

 
 

1 & 2. Gradients, differentiation, and a simple neural-network training pass

3. Converting between numpy and torch

4. Building an input pipeline (using the built-in CIFAR-10 dataset)

5. Building an input pipeline (a custom dataset)

6. How to use a pretrained model and then fine-tune it

7. Saving and loading models



1 & 2. Gradients, differentiation, and a simple neural-network training pass

Forward pass to get predictions --> compute the loss between predictions and ground truth --> zero the optimizer's gradients --> backward pass to compute the gradients of all parameters --> optimizer updates the parameters (a full training-loop sketch follows example 2 below).

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import numpy as np

# ================================================================== #
#                     1. Basic autograd example 1                    #
# ================================================================== #

# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

# Build a computational graph.
y = w * x + b    # y = 2 * x + 3

# Compute gradients.
y.backward()

# Print out the gradients. These are simply the partial derivatives of y with respect to each tensor.
print(x.grad)    # x.grad = 2
print(w.grad)    # w.grad = 1
print(b.grad)    # b.grad = 1


# ================================================================== #
#                    2. Basic autograd example 2                     #
# ================================================================== #

# Create tensors of shape (10, 3) and (10, 2).
x = torch.randn(10, 3)
y = torch.randn(10, 2)

# Build a fully connected layer.
linear = nn.Linear(3, 2)
print('w: ', linear.weight)
print('b: ', linear.bias)

# Build loss function and optimizer.
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)

# Forward pass.
pred = linear(x)

# Compute loss.
loss = criterion(pred, y)
print('the initial loss is : ', loss.item())

# Backward pass.
loss.backward()

# Print out the gradients.
print('dL/dw: ', linear.weight.grad)
print('dL/db: ', linear.bias.grad)

# 1-step gradient descent.
optimizer.step()


# Print out the loss after 1-step gradient descent.
pred = linear(x)
loss = criterion(pred, y)
print('loss after 1 step optimization: ', loss.item())
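
The two examples above only take a single optimization step and never call optimizer.zero_grad(). Below is a minimal sketch of the full loop described at the top of this section (forward pass --> loss --> zero gradients --> backward pass --> parameter update), reusing x, y, linear, criterion and optimizer from example 2:

# Minimal training loop illustrating the five-step pipeline.
for step in range(100):
    pred = linear(x)              # forward pass: compute predictions
    loss = criterion(pred, y)     # compute the loss against the targets

    optimizer.zero_grad()         # clear gradients left over from the previous step
    loss.backward()               # backward pass: compute gradients of all parameters
    optimizer.step()              # update the parameters using the gradients

    if (step + 1) % 20 == 0:
        print('step {}, loss {:.4f}'.format(step + 1, loss.item()))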

3. Converting between numpy and torch

# ================================================================== #
#                     3. Loading data from numpy                     #
# ================================================================== #

# Create a numpy array.
x = np.array([[1, 2], [3, 4]])

# Convert the numpy array to a torch tensor.
y = torch.from_numpy(x)

# Convert the torch tensor to a numpy array.
z = y.numpy()
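
Note that torch.from_numpy() and Tensor.numpy() share memory with the underlying array, so a modification on one side is visible on the other. A quick check, continuing from the arrays above:

# The tensor y and the array x share the same memory.
x[0, 0] = 100
print(y[0, 0])        # tensor(100): the change to the numpy array is visible in the tensor

# Use torch.tensor(x) or y.clone() when an independent copy is needed.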

4. Building an input pipeline (using the built-in CIFAR-10 dataset)

# ================================================================== #
#                         4. Input pipeline                           #
# ================================================================== #

# 1. Download and construct the CIFAR-10 dataset.
train_dataset = torchvision.datasets.CIFAR10(root='../../data/', train=True, transform=transforms.ToTensor(), download=True)

# Fetch one data pair (read data from disk).
image, label = train_dataset[0]
print(image.size())
print(label)

# 2. Data loader (this provides queues and threads in a very simple way).
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)

# 3. When iteration starts, queues and threads start to load data from files.
data_iter = iter(train_loader)

# 4. Mini-batch images and labels.
images, labels = next(data_iter)

# 5. Actual usage of the data loader is as below.
for images, labels in train_loader:
    # Training code should be written here.
    pass
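
Inside the loop, each mini-batch is already collated into tensors. A sketch of typical usage, moving each batch to the available device:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

for images, labels in train_loader:
    images = images.to(device)    # (64, 3, 32, 32) for a full CIFAR-10 batch with ToTensor()
    labels = labels.to(device)    # (64,)
    # forward pass, loss, backward pass, and optimizer step go here
    pass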

5. Building an input pipeline (a custom dataset)

Related post: Dataloader and Dataset (马鹏森的博客, CSDN)

# ================================================================== #
#                5. Input pipeline for custom dataset                #
# ================================================================== #

# You should build your custom dataset as below.
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self):
        # TODO
        # 1. Initialize file paths or a list of file names.
        pass
    def __getitem__(self, index):
        # TODO
        # 1. Read one sample from file (e.g. using numpy.fromfile or PIL.Image.open).
        # 2. Preprocess the data (e.g. using torchvision.transforms).
        # 3. Return a data pair (e.g. image and label).
        pass
    def __len__(self):
        # You should change 0 to the total size of your dataset.
        return 0

# You can then use the prebuilt data loader.
custom_dataset = CustomDataset()
train_loader = torch.utils.data.DataLoader(dataset=custom_dataset, batch_size=64, shuffle=True)
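
As a concrete illustration of the skeleton above, here is a hypothetical dataset that reads JPEG images from a folder; the folder layout and the label-in-filename rule are made up for this example:

import os
from PIL import Image

class FolderImageDataset(torch.utils.data.Dataset):
    # Hypothetical example: images stored as <root>/<label>_<name>.jpg
    def __init__(self, root, transform=None):
        self.root = root
        self.files = sorted(f for f in os.listdir(root) if f.endswith('.jpg'))
        self.transform = transform

    def __getitem__(self, index):
        fname = self.files[index]
        image = Image.open(os.path.join(self.root, fname)).convert('RGB')
        label = int(fname.split('_')[0])      # label is encoded in the file name
        if self.transform is not None:
            image = self.transform(image)
        return image, label

    def __len__(self):
        return len(self.files)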

6. How to use a pretrained model and then fine-tune it

To fine-tune a particular layer, replace that layer so that its parameters are the ones being trained, e.g. "resnet.fc = nn.Linear(resnet.fc.in_features, 100)".

The replaced layer's parameters should be freshly initialized rather than taken from the original pretrained weights, e.g. "nn.init.xavier_uniform_(resnet.fc.weight)".

More on fine-tuning: what is fine-tuning (Fine Tune), and which kind should you use when? (It depends on the amount of data and its similarity to the pretraining data.) See: 什么是微调(Fine Tune)? (马鹏森的博客, CSDN)

# ================================================================== #
#                        6. Pretrained model                         #
# ================================================================== #

# Download and load the pretrained ResNet-18.
resnet = torchvision.models.resnet18(pretrained=True)

# If you want to finetune only the top layer of the model, set as below.
for param in resnet.parameters():
    param.requires_grad = False

# Replace the top layer for finetuning. Only the parameters of the last layer of resnet are fine-tuned; all other layers keep their pretrained weights.
resnet.fc = nn.Linear(resnet.fc.in_features, 100)  # (512, 100) 100 is an example.

# Independent of the freezing loop above
# (for param in resnet.parameters(): param.requires_grad = False),
# the line below explicitly re-initializes the parameters of the new top layer.
nn.init.xavier_uniform_(resnet.fc.weight)

# Forward pass.
images = torch.randn(64, 3, 224, 224)
outputs = resnet(images)
print(outputs.size())     # (64, 100)
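
Since the other layers were frozen above, only the new fc layer still has requires_grad=True, so the optimizer only needs to be given those parameters. A minimal sketch (the learning rate and momentum are just examples):

# Optimize only the parameters of the replaced top layer.
optimizer = torch.optim.SGD(resnet.fc.parameters(), lr=0.001, momentum=0.9)

# Or, more generally, collect every parameter that still requires gradients:
# params_to_update = [p for p in resnet.parameters() if p.requires_grad]
# optimizer = torch.optim.SGD(params_to_update, lr=0.001, momentum=0.9)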

7. Saving and loading models

Related post: PyTorch: saving and restoring a model and inspecting its parameters, load_state_dict(), state_dict() (马鹏森的博客, CSDN)

# ================================================================== #
#                      7. Save and load the model                    #
# ================================================================== #

# Save and load the entire model.
torch.save(resnet, 'model.ckpt')
model = torch.load('model.ckpt')

# Save and load only the model parameters (recommended).
torch.save(resnet.state_dict(), 'params.ckpt')
resnet.load_state_dict(torch.load('params.ckpt'))
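
To restore a parameter-only checkpoint later (e.g. in a separate script), first rebuild a model with the same architecture, then load the state dict. A minimal sketch, matching the fine-tuned ResNet-18 above:

# Recreate the architecture (including the replaced fc layer), then load the weights.
model = torchvision.models.resnet18()
model.fc = nn.Linear(model.fc.in_features, 100)
model.load_state_dict(torch.load('params.ckpt'))
model.eval()    # switch to evaluation mode before inference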

Source: pytorch-tutorial/main.py at master · yunjey/pytorch-tutorial · GitHub


Reposted from blog.csdn.net/weixin_43135178/article/details/124595501