(D) PyTorch study notes

Author: chen_h
WeChat & QQ: 862251340
WeChat official account: coderpai



What is a convolutional neural network (CNN)?

Convolutional neural networks (CNNs) are a type of artificial neural network that has risen to prominence in recent years. Because they give better results in image and speech recognition, the technique has spread widely. CNNs are most commonly applied to image recognition, but thanks to constant innovation they are also used in video analysis, natural language processing, drug discovery, and more. AlphaGo, which recently drew so much attention by teaching a computer to understand Go, also uses this technique.

Convolution and neural networks

Let's look at how a convolutional neural network actually works, using image recognition as an example. A neural network is made up of a series of layers, and each layer contains many neurons. These neurons are the key to how the network recognizes things. Every neural network has input and output values. When the input is a picture, what the network actually receives is not a colorful pattern but a pile of numbers. When a network has to process this much input, a convolutional neural network shows its advantage. So what is a convolutional neural network?

Let's break the term into its two parts: "convolution" and "neural network". Convolution means that the network no longer processes the input pixel by pixel, but processes one small region of pixels at a time. This strengthens the continuity of the image information, so the network sees shapes rather than isolated points, and it deepens the network's understanding of the image. Specifically, a CNN has a batch of filters that slide across the image collecting information, each time from just one small pixel region. Once the collected information is organized, some structure emerges: at this stage, for example, the network can see edges. In the next step, a similar batch of filters sweeps over the edge information and summarizes higher-level structures; edges, for instance, can be combined into eyes and noses. After yet another round of filtering, the eyes and noses are summarized into a face. Finally, this information is passed to a few ordinary fully connected layers for classification, which tells us which class the input belongs to.
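
The idea of a filter sliding over small pixel regions can be sketched in a few lines. The 6x6 image and the hand-written vertical-edge filter below are illustrative assumptions; in a real CNN the filter values are learned during training rather than written by hand.

```python
import torch
import torch.nn.functional as F

# A 6x6 grayscale "image": dark (0) on the left half, bright (1) on the right half.
image = torch.zeros(1, 1, 6, 6)          # (batch, channels, height, width)
image[:, :, :, 3:] = 1.0

# A hand-written 3x3 vertical-edge filter (illustrative only).
kernel = torch.tensor([[-1.0, 0.0, 1.0],
                       [-1.0, 0.0, 1.0],
                       [-1.0, 0.0, 1.0]]).view(1, 1, 3, 3)

# Slide the filter over every 3x3 patch; with no padding the output is 4x4.
edges = F.conv2d(image, kernel)
print(edges.shape)    # torch.Size([1, 1, 4, 4])
print(edges[0, 0])    # every row is [0., 3., 3., 0.]
```

The output is nonzero only where a patch straddles the dark-to-bright boundary, which is the sense in which a filter "sees an edge".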

Borrowing from a Google video that introduces convolutional neural networks, let's look at how an image is convolved. Take a picture of a cat. An image has three dimensions: length, width, and height. Yes, an image has height! Here height (more commonly called depth, or the number of channels) refers to the information the computer uses to produce color. A black-and-white photo has a height of only 1; a color photo carries red, green, and blue information, so its height is 3. We take a color photo as the example. A filter keeps moving across the image, collecting one small block of pixels at a time. Once it has collected all the information, its output can be understood as a "picture" with greater height and smaller length and width. This picture contains some edge information. The same convolution is then repeated several times, compressing the length and width further and increasing the height, which gives a deeper understanding of the input. Feeding this compressed, enriched information into ordinary classification layers lets us classify the picture.
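
The change in shape described above can be checked directly. This is a minimal sketch: the 32x32 input size, the filter counts, and the use of stride 2 to shrink the length and width are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

# One fake color image: batch of 1, height (depth) 3 for RGB, 32x32 pixels.
x = torch.rand(1, 3, 32, 32)

# Each layer uses more filters (greater depth) and stride 2 (halved length and width).
conv_a = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=2, padding=1)
conv_b = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, stride=2, padding=1)

h1 = conv_a(x)
h2 = conv_b(h1)
print(x.shape)    # torch.Size([1, 3, 32, 32])
print(h1.shape)   # torch.Size([1, 8, 16, 16])
print(h2.shape)   # torch.Size([1, 16, 8, 8])
```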

Pooling

Research has found that each convolution may inadvertently lose some information, and pooling handles this problem well. Pooling is a filtering step that passes the useful information in a layer on to the next layer for analysis while also reducing the network's computational load. In other words, during convolution we do not compress the length and width, keeping as much information as possible, and leave the compression to pooling. This extra step can improve accuracy very effectively. With these techniques we can build a convolutional neural network of our own.
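
As a small illustration of pooling, the sketch below applies 2x2 max pooling to a hand-made 4x4 feature map (the values are arbitrary, chosen only so the result is easy to check by eye). Each 2x2 block is replaced by its largest value, so the length and width are halved.

```python
import torch
import torch.nn as nn

# A hand-made 4x4 feature map, shaped (batch, channels, height, width).
feature_map = torch.tensor([[1.0, 2.0, 5.0, 6.0],
                            [3.0, 4.0, 7.0, 8.0],
                            [9.0, 1.0, 2.0, 0.0],
                            [1.0, 3.0, 4.0, 2.0]]).view(1, 1, 4, 4)

pool = nn.MaxPool2d(kernel_size=2)   # stride defaults to the kernel size
pooled = pool(feature_map)
print(pooled[0, 0])
# tensor([[4., 8.],
#         [9., 4.]])
```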

Popular CNN structure

A popular way to build the network is the following, from bottom to top. First comes the input image (Image), which passes through a convolution layer (Convolution); the convolved information is then pooled (Pooling), here using max pooling. After another round of the same processing, the result is passed to fully connected layers (Fully Connected), typically two of them, and finally to a classifier (Classifier) that makes the prediction. That is a brief introduction to how a convolutional neural network processes an image.

CNN: convolutional neural network

Convolutional neural networks are now widely used in image recognition, and many applications have already emerged. Next we build, step by step, a CNN that recognizes handwritten digits.

Here is the learning process of the CNN's last layer, visualized:

[Figure: visualization of the CNN's last layer during training]

MNIST handwritten digit data

import torch
import torch.nn as nn
import torch.utils.data as Data
import torchvision      # datasets module
import matplotlib.pyplot as plt

torch.manual_seed(1)    # reproducible

# Hyper Parameters
EPOCH = 1           # number of passes over the full training set; one pass to save time
BATCH_SIZE = 50
LR = 0.001          # learning rate
DOWNLOAD_MNIST = True  # set to False if you have already downloaded MNIST


# MNIST handwritten digits
train_data = torchvision.datasets.MNIST(
    root='./mnist/',    # where to save or load the data
    train=True,  # this is training data
    transform=torchvision.transforms.ToTensor(),    # converts a PIL.Image or numpy.ndarray to a
                                                    # torch.FloatTensor (C x H x W) scaled to the range [0.0, 1.0]
    download=DOWNLOAD_MNIST,          # download only if the data is not already present
)

In these images, black areas have the value 0 and white areas have values greater than 0.

Besides the training data, MNIST also provides test data, which we use to check how well the network has been trained.

test_data = torchvision.datasets.MNIST(root='./mnist/', train=False)

# batch training: 50 samples, 1 channel, 28x28 -> (50, 1, 28, 28)
train_loader = Data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)

# to save time, we only test on the first 2000 samples
# (.data and .targets replace the deprecated .test_data and .test_labels attributes)
test_x = torch.unsqueeze(test_data.data, dim=1).type(torch.FloatTensor)[:2000]/255.   # shape from (2000, 28, 28) to (2000, 1, 28, 28), value in range(0,1)
test_y = test_data.targets[:2000]
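
The test-set preprocessing can be tried without downloading anything. The sketch below uses random uint8 values as a stand-in for the raw test images (an assumption for illustration, not real MNIST data) and applies the same unsqueeze and divide-by-255 steps. The manual scaling is needed here because the test dataset was created without a transform and its raw image tensor is accessed directly.

```python
import torch

# Stand-in for the raw test images: 5 fake 28x28 uint8 images with values 0..255.
raw = torch.randint(0, 256, (5, 28, 28), dtype=torch.uint8)

# Add the channel dimension, convert to float, and scale to the range [0, 1].
x = torch.unsqueeze(raw, dim=1).type(torch.FloatTensor) / 255.

print(raw.shape, raw.dtype)   # torch.Size([5, 28, 28]) torch.uint8
print(x.shape, x.dtype)       # torch.Size([5, 1, 28, 28]) torch.float32
```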

CNN model

As before, we use a class to build the CNN model. The overall flow of this CNN is: convolution (Conv2d) -> activation function (ReLU) -> pooling, i.e. downsampling (MaxPool2d) -> the same three steps again -> flatten the multi-dimensional feature maps -> fully connected layer (Linear) -> output.

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(  # input shape (1, 28, 28)
            nn.Conv2d(
                in_channels=1,      # input channels (the "height"; 1 for grayscale)
                out_channels=16,    # n_filters
                kernel_size=5,      # filter size
                stride=1,           # filter movement/step
                padding=2,      # to keep the Conv2d output the same width and height, use padding=(kernel_size-1)/2 when stride=1
            ),      # output shape (16, 28, 28)
            nn.ReLU(),    # activation
            nn.MaxPool2d(kernel_size=2),    # downsample over 2x2 regions, output shape (16, 14, 14)
        )
        self.conv2 = nn.Sequential(  # input shape (16, 14, 14)
            nn.Conv2d(16, 32, 5, 1, 2),  # output shape (32, 14, 14)
            nn.ReLU(),  # activation
            nn.MaxPool2d(2),  # output shape (32, 7, 7)
        )
        self.out = nn.Linear(32 * 7 * 7, 10)   # fully connected layer, output 10 classes

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)   # flatten the feature maps to (batch_size, 32 * 7 * 7)
        output = self.out(x)
        return output

cnn = CNN()
print(cnn)  # net architecture
"""
CNN (
  (conv1): Sequential (
    (0): Conv2d(1, 16, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (1): ReLU ()
    (2): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
  )
  (conv2): Sequential (
    (0): Conv2d(16, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (1): ReLU ()
    (2): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
  )
  (out): Linear (1568 -> 10)
)
"""

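The shape comments in the model code follow from the standard output-size rule for convolution and pooling layers: floor((size + 2 * padding - kernel_size) / stride) + 1. The helper below is a hypothetical function written for this note, not part of PyTorch; it traces the 28x28 input through both blocks and arrives at the 1568 inputs of the final Linear layer.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0):
    """Output length of one spatial dimension after a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 28
size = conv_out_size(size, kernel_size=5, stride=1, padding=2)   # conv1 keeps 28
size = conv_out_size(size, kernel_size=2, stride=2)              # pool  gives 14
size = conv_out_size(size, kernel_size=5, stride=1, padding=2)   # conv2 keeps 14
size = conv_out_size(size, kernel_size=2, stride=2)              # pool  gives 7

print(size)               # 7
print(32 * size * size)   # 1568
```
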
Training

Now we start training. Each batch x is fed into cnn to compute the output, and the error is then computed against the labels y. (Older versions of PyTorch required wrapping x and y in Variable first; the code below passes the tensors directly.) The code below omits the accuracy calculation.

optimizer = torch.optim.Adam(cnn.parameters(), lr=LR)   # optimize all cnn parameters
loss_func = nn.CrossEntropyLoss()   # the target label is not one-hotted

# training and testing
for epoch in range(EPOCH):
    for step, (b_x, b_y) in enumerate(train_loader):   # get one batch; x is normalized as train_loader is iterated
        output = cnn(b_x)               # cnn output
        loss = loss_func(output, b_y)   # cross entropy loss
        optimizer.zero_grad()           # clear gradients for this training step
        loss.backward()                 # backpropagation, compute gradients
        optimizer.step()                # apply gradients

"""
...
Epoch:  0 | train loss: 0.0306 | test accuracy: 0.97
Epoch:  0 | train loss: 0.0147 | test accuracy: 0.98
Epoch:  0 | train loss: 0.0427 | test accuracy: 0.98
Epoch:  0 | train loss: 0.0078 | test accuracy: 0.98
"""

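The log above reports a test accuracy that the training code leaves out. One way such a number could be computed is sketched below; the accuracy helper and the small hand-made tensors are assumptions for illustration, not the author's omitted code.

```python
import torch

def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    predictions = torch.max(logits, 1)[1]   # index of the largest score in each row
    return (predictions == labels).float().mean().item()

# Four fake samples with three classes; the third sample is misclassified.
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.9, 0.8, 0.1],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 1, 1, 2])

print(accuracy(logits, labels))   # 0.75
```

Inside the training loop, the same helper could be called every few steps on the network's output for test_x together with test_y.
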
Finally, we take 10 test samples and check whether the predictions are correct:

test_output = cnn(test_x[:10])
pred_y = torch.max(test_output, 1)[1].data.numpy().squeeze()
print(pred_y, 'prediction number')
print(test_y[:10].numpy(), 'real number')

"""
[7 2 1 0 4 1 4 9 5 9] prediction number
[7 2 1 0 4 1 4 9 5 9] real number
"""

Visualization Training

This part was added on the spur of the moment after the video was finished, because visualization helps understanding and so deserves a mention. The visualization code mainly uses matplotlib and sklearn: t-SNE dimensionality reduction is applied to the high-dimensional output of the CNN's last layer, that is, the result of x = x.view(x.size(0), -1) in the CNN's forward code.
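
For readers who want to try the dimensionality-reduction step, here is a minimal sketch using sklearn's TSNE. The random clustered features stand in for the flattened last-layer output, and the sample count, the 64 dimensions, and the perplexity are all assumptions for illustration; this is not the original visualization code.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-in for flattened last-layer features: 3 fake classes, 20 samples each, 64 dims.
# (The real features would have 32 * 7 * 7 = 1568 dims per image.)
centers = rng.normal(scale=5.0, size=(3, 64))
labels = np.repeat(np.arange(3), 20)
features = centers[labels] + rng.normal(size=(60, 64))

# Reduce to 2 dimensions; perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(features)
print(embedding.shape)   # (60, 2)
```

Each row of embedding is a 2-D point that can be drawn with matplotlib and colored by its label.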

The visualization code is not the point here, so we simply show the result.

[Figure: t-SNE visualization of the CNN's last-layer output]

Links:

https://morvanzhou.github.io/tutorials/machine-learning/torch/4-01-CNN/

https://github.com/MorvanZhou/PyTorch-Tutorial/blob/master/tutorial-contents/401_CNN.py

Origin blog.csdn.net/CoderPai/article/details/104162340