## Convolution code implementation

There are two ways to perform convolution in PyTorch: one is `torch.nn.Conv2d()`, and the other is `torch.nn.functional.conv2d()`. Both forms essentially perform the same convolution operation. The following examples illustrate these two methods.

```python
import numpy as np
import torch
from torch import nn
from torch.autograd import Variable
import torch.nn.functional as F
from PIL import Image
import matplotlib.pyplot as plt
%matplotlib inline

im = Image.open('./cat.png').convert('L')  # read in a grayscale image
im = np.array(im, dtype='float32')         # convert it to a matrix

# convert the image matrix to a pytorch tensor, reshaped to match the convolution input requirements
im = torch.from_numpy(im.reshape((1, 1, im.shape[0], im.shape[1])))
```

Next, we define an operator for edge (contour) detection:

```python
# using nn.Conv2d
conv1 = nn.Conv2d(1, 1, 3, bias=False)  # define the convolution
sobel_kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype='float32')  # define the edge-detection operator
sobel_kernel = sobel_kernel.reshape((1, 1, 3, 3))  # reshape to match the convolution's input/output format
conv1.weight.data = torch.from_numpy(sobel_kernel)  # assign the kernel of the convolution

edge1 = conv1(Variable(im))  # apply it to the image
edge1 = edge1.data.squeeze().numpy()  # convert the output back to image format
```

Below we visualize the result after edge detection.
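A minimal visualization sketch, assuming the `edge1` array computed above and that `%matplotlib inline` is active in a notebook:

```python
# show the edge-detection result from nn.Conv2d as a grayscale image
plt.imshow(edge1, cmap='gray')
```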

The same operation can also be written with the functional interface:

```python
# using F.conv2d
sobel_kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype='float32')  # define the edge-detection operator
sobel_kernel = sobel_kernel.reshape((1, 1, 3, 3))  # reshape to match the convolution's input/output format
weight = Variable(torch.from_numpy(sobel_kernel))

edge2 = F.conv2d(Variable(im), weight)  # apply it to the image
edge2 = edge2.data.squeeze().numpy()  # convert the output back to image format
plt.imshow(edge2, cmap='gray')
```

 

You can see that the two forms produce the same result, and you have probably also noticed the difference. Using `nn.Conv2d()` is equivalent to directly defining a convolutional layer of the network, while using `torch.nn.functional.conv2d()` is equivalent to defining only a convolution operation, so the latter requires an additional weight to be defined, and that weight must be a Variable. `nn.Conv2d()`, on the other hand, defines a randomly initialized weight for us by default. If we need to modify it, we take out its value and change it; if we don't want to modify it, we can simply use the default initialization, which is very convenient.
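For example, a minimal sketch of what that looks like, reusing the `sobel_kernel` array defined above:

```python
# nn.Conv2d gives the layer a randomly initialized weight by default
conv = nn.Conv2d(1, 1, 3, bias=False)
print(conv.weight.data)                            # inspect the random default weight
conv.weight.data = torch.from_numpy(sobel_kernel)  # overwrite it with our own kernel
print(conv.weight.data)                            # now holds the edge-detection operator
```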

**In practice, we basically always use the `nn.Conv2d()` form.**
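As a rough sketch of that usage, here is a small stack of `nn.Conv2d` layers; the channel counts are made up purely for illustration:

```python
# a tiny convolutional block built from nn.Conv2d layers
net = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1),   # 1 input channel -> 8 output channels, 3x3 kernel
    nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1),  # 8 -> 16 channels
)
out = net(Variable(im))              # im is the (1, 1, H, W) tensor prepared earlier
print(out.size())                    # spatial size is preserved because padding=1
```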


## Pooling layer

Another very important structure in a convolutional network is pooling. It exploits the downsampling invariance of images: when an image is shrunk, its content can still be recognized. A pooling layer therefore reduces the size of the image, which greatly improves computational efficiency, and it has no parameters. There are many kinds of pooling, such as max pooling and mean (average) pooling; max pooling is the one generally used in convolutional networks.
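To make this concrete, here is a tiny sketch (using the imports from the code above and an arbitrary made-up 4x4 matrix) of what 2x2 max pooling does:

```python
x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [0, 1, 2, 3],
              [9, 1, 4, 2]], dtype='float32').reshape((1, 1, 4, 4))
# 2x2 max pooling keeps the largest value in each 2x2 window,
# halving both spatial dimensions: 4x4 -> 2x2
out = F.max_pool2d(Variable(torch.from_numpy(x)), 2, 2)
print(out.data.squeeze())  # [[4, 8], [9, 4]]
```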

There are also two ways to do max pooling in PyTorch: one is `nn.MaxPool2d()`, the other is `torch.nn.functional.max_pool2d()`. Their input requirements for the image are the same as those of convolution, so I won't repeat them. An example follows.

```python
# using nn.MaxPool2d
pool1 = nn.MaxPool2d(2, 2)
print('before max pool, image shape: {} x {}'.format(im.shape[2], im.shape[3]))
small_im1 = pool1(Variable(im))
small_im1 = small_im1.data.squeeze().numpy()
print('after max pool, image shape: {} x {}'.format(small_im1.shape[0], small_im1.shape[1]))
```

```
before max pool, image shape: 224 x 224
after max pool, image shape: 112 x 112
```

You can see that the size of the image has been halved. But has the image itself changed? We can visualize it.
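A minimal visualization sketch, assuming `small_im1` from the block above:

```python
# show the pooled image; it should look like a smaller version of the original
plt.imshow(small_im1, cmap='gray')
```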

It can be seen that the image barely changes, which shows that the pooling layer only reduces the image size without affecting the image content.

```python
# using F.max_pool2d
print('before max pool, image shape: {} x {}'.format(im.shape[2], im.shape[3]))
small_im2 = F.max_pool2d(Variable(im), 2, 2)
small_im2 = small_im2.data.squeeze().numpy()
print('after max pool, image shape: {} x {}'.format(small_im2.shape[0], small_im2.shape[1]))
plt.imshow(small_im2, cmap='gray')
```

```
before max pool, image shape: 224 x 224
after max pool, image shape: 112 x 112
```

**As with the convolutional layer, in practice we generally use `nn.MaxPool2d()`.**


Origin blog.csdn.net/weixin_51781852/article/details/125693913