Common interpolation methods in image processing: nearest neighbor, bilinear, and bicubic interpolation (with PyTorch test code)

0. Preface

When I was learning about deformable convolution, I noticed that because the learned offset Δp_n may be fractional, the authors use bilinear interpolation to determine the final sampling position of the convolution. Interpolation lets us estimate data at unknown locations from known data, and it is used for tasks such as image scaling, rotation, and geometric correction. In this article I study and summarize three common interpolation methods: nearest neighbor, bilinear, and bicubic interpolation. Of these, bilinear interpolation is used most often; for example, it is the default of transforms.Resize in torchvision and of tf.image.resize in TensorFlow.
In digital image processing, an image is represented as a grid in which the position of every pixel is an integer, as in the figure below:
[Figure: an image as a grid of pixels at integer coordinates]
Before interpolating, we first compute where each pixel of the target image maps back to in the original image. Here dst and src denote the target image and the original image, (x, y) is a pixel coordinate, and w and h are the image width and height. This formula is particularly important:
src_x = dst_x * (src_w / dst_w)
src_y = dst_y * (src_h / dst_h)

1. Nearest Neighbor Interpolation

1. Introduction

Nearest neighbor interpolation, also known as zero-order interpolation, is the simplest interpolation method and needs very little computation: an unknown position is simply assigned the value of its nearest pixel. The gray values in the interpolated image are therefore often discontinuous, which produces obvious jagged edges, so this method is rarely used in practice.
Src and Dst denote the original image and the target image, with sizes (Src_H, Src_W) and (Dst_H, Dst_W) respectively. Our goal is to fill in the pixel values of the target image using the original image.
For each target pixel, the fill value is computed in two steps:
1) Use the formula above to compute the coordinates (src_x, src_y) in Src that correspond to the coordinates (dst_x, dst_y) in Dst;
2) Apply floor to the computed (src_x, src_y), i.e. round down, to get the nearest neighbor (floor(src_x), floor(src_y)) in the original image that (dst_x, dst_y) maps to. (Many people say ordinary rounding is used here, but the results clearly show floor.)
[Figure: PyTorch output for nearest neighbor interpolation from a (2,2) image to a (4,4) image]
For example, suppose the original image Src and the target image Dst have sizes (2,2) and (4,4), and we want to fill in the pixel values of the target image from the original image. PyTorch gives the result shown above. Taking the first row 7, 7, 2, 2 of the target image as an example, and using f(x,y) and g(x,y) to denote the gray values of pixels in the target image and the original image respectively, the nearest neighbor process is as follows:
[Figure: step-by-step calculation of the first row of the target image]
Nearest neighbor interpolation is rarely the method you want, so don't dwell on it; keep reading!

2. Code implementation

import torch
from torchvision import transforms


# Random 2x2 single-channel image with integer values in [0, 10)
img = torch.randint(10, size=(1, 2, 2), dtype=torch.float32)
print(img)
print('---' * 5)
# Upsample to 4x4 with nearest neighbor interpolation
nearest_neighbor_interpolation = transforms.Resize(
    size=(4, 4), interpolation=transforms.InterpolationMode.NEAREST)
resize_img = nearest_neighbor_interpolation(img)
print(resize_img)
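
To make the two steps above concrete, here is a minimal hand-written sketch in plain Python. It assumes the simple mapping formula from the beginning of the article followed by floor. The function name resize_nearest is my own, and the second row of the sample image is made up (only the first row, 7 and 2, comes from the example above). It is an illustration, not PyTorch's actual implementation.

```python
import math


def resize_nearest(src, dst_h, dst_w):
    """Nearest neighbor resize of a 2D list using the floor mapping."""
    src_h, src_w = len(src), len(src[0])
    dst = []
    for dst_y in range(dst_h):
        # Map the target row back to the source image and round down
        src_y = min(math.floor(dst_y * (src_h / dst_h)), src_h - 1)
        row = []
        for dst_x in range(dst_w):
            # Same mapping for the column index
            src_x = min(math.floor(dst_x * (src_w / dst_w)), src_w - 1)
            row.append(src[src_y][src_x])
        dst.append(row)
    return dst


# First row [7, 2] follows the example above; the second row is made up
src = [[7.0, 2.0],
       [4.0, 9.0]]
for row in resize_nearest(src, 4, 4):
    print(row)
```

Each source pixel is repeated twice in both directions, which is exactly the blocky look that nearest neighbor interpolation is known for.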

2. Bilinear Interpolation

1. Linear Interpolation

Before looking at bilinear interpolation, let's first understand linear interpolation. Linear interpolation applies to one-dimensional data: the value at the point to be interpolated is computed from the two known neighboring points to its left and right. In other words, given two points, we write the straight line through them in two-point form and then find y for a known x, or x for a known y. See the derivation and figure below:
[Figure: the point (x, y) on the straight line between (x0, y0) and (x1, y1)]
The goal is to find the interpolated point (x, y), given the two known points (x0, y0) and (x1, y1) on either side of it. The two-point form of the line through them is:
(y - y0) / (x - x0) = (y1 - y0) / (x1 - x0)
Solving it for x and for y gives:
x = x0 + (y - y0) * (x1 - x0) / (y1 - y0)    (1)
y = y0 + (x - x0) * (y1 - y0) / (x1 - x0)    (2)
Formula (1) is used to find x when y is known; formula (2) is used to find y when x is known.
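
As a small sketch of the two uses of the two-point form, the functions below solve the line through (x0, y0) and (x1, y1) for y and for x. The function names lerp_y and lerp_x and the sample numbers are my own choices for illustration.

```python
def lerp_y(x, x0, y0, x1, y1):
    """Formula (2): find y for a known x on the line through the two points."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


def lerp_x(y, x0, y0, x1, y1):
    """Formula (1): find x for a known y on the line through the two points."""
    return x0 + (y - y0) * (x1 - x0) / (y1 - y0)


# Halfway between (2, 10) and (3, 20)
print(lerp_y(2.5, 2, 10.0, 3, 20.0))
print(lerp_x(15.0, 2, 10.0, 3, 20.0))
```

In image interpolation only the "find y for a known x" direction is needed: x is the fractional pixel position and y is the gray value.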

2. Bilinear Interpolation

Bilinear interpolation has a smoothing effect on the image. It is simply one-dimensional linear interpolation applied along both the x and y directions of two-dimensional data, for a total of three one-dimensional interpolations.
This method is widely used. It computes the value at an unknown position from its four neighboring pixels, using the formulas below (f(Q_xy) here plays the role of y in the one-dimensional interpolation formula):
f(R1) = (x2 - x) / (x2 - x1) * f(Q11) + (x - x1) / (x2 - x1) * f(Q21), where R1 = (x, y1)
f(R2) = (x2 - x) / (x2 - x1) * f(Q12) + (x - x1) / (x2 - x1) * f(Q22), where R2 = (x, y2)
f(P) = (y2 - y) / (y2 - y1) * f(R1) + (y - y1) / (y2 - y1) * f(R2)
Here Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1), and Q22 = (x2, y2).
Assume the unknown point (dst_x, dst_y) in the target image corresponds to the point P(x,y)=P(src_x,src_y) in the original image, and that Q11, Q12, Q21, and Q22 are the four neighbors of P in the original image. Interpolating at P takes two steps:
1) Along the X direction, perform one-dimensional linear interpolation on Q11, Q21 and on Q12, Q22 (twice in total) to obtain R1 and R2;
2) Use R1 and R2 to perform one-dimensional linear interpolation along the Y direction (once in total) to obtain the final bilinear interpolation result P;

[Figure: point P, its four neighbors Q11, Q12, Q21, Q22, and the intermediate points R1 and R2]
To better understand how the four neighbors are chosen in bilinear interpolation, it helps to know the diagonal 4-neighborhood from image processing, shown in the figure below: point P has four neighbors along its diagonal directions.
[Figure: the diagonal 4-neighborhood of point P]
The filling steps are:
1) Use the formula at the beginning of the article to compute the point P(src_x, src_y) in Src that corresponds to the coordinates (dst_x, dst_y) in Dst;
2) From the computed point P(src_x, src_y), use the diagonal 4-neighborhood relationship to find its neighbors in the original image: Q11, Q12, Q21, Q22;
3) Use these neighbors to perform one-dimensional linear interpolation along the X and Y directions, which completes the bilinear interpolation;
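
The three filling steps can be sketched in plain Python as follows. This is a minimal illustration that assumes the simple mapping formula from the beginning of the article and clamps the neighbors at the image border. The function names and the sample image are made up, and the result is not guaranteed to match PyTorch's output.

```python
import math


def bilinear_at(src, x, y):
    """Bilinear interpolation of a 2D list at the fractional point (x, y)."""
    src_h, src_w = len(src), len(src[0])
    # Neighbors Q11=(x1,y1), Q21=(x2,y1), Q12=(x1,y2), Q22=(x2,y2)
    x1, y1 = math.floor(x), math.floor(y)
    x2, y2 = min(x1 + 1, src_w - 1), min(y1 + 1, src_h - 1)  # clamp at the border
    u, v = x - x1, y - y1  # fractional offsets of P from Q11
    # Two interpolations along X give R1 and R2
    r1 = (1 - u) * src[y1][x1] + u * src[y1][x2]
    r2 = (1 - u) * src[y2][x1] + u * src[y2][x2]
    # One interpolation along Y gives P
    return (1 - v) * r1 + v * r2


def resize_bilinear(src, dst_h, dst_w):
    """Resize a 2D list by mapping every target pixel back to the source."""
    src_h, src_w = len(src), len(src[0])
    return [[bilinear_at(src, dx * (src_w / dst_w), dy * (src_h / dst_h))
             for dx in range(dst_w)]
            for dy in range(dst_h)]


# Made-up 4x4 image whose value at (x, y) is x + 4*y
src = [[float(x + 4 * y) for x in range(4)] for y in range(4)]
dst = resize_bilinear(src, 8, 8)
print(dst[5][5])  # target (5,5) maps to source (2.5, 2.5)
```

Because neighboring pixels are one unit apart, the denominators x2 - x1 and y2 - y1 in the formulas are 1, so the weights reduce to the fractional offsets u and v.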

For example, suppose the original image Src and the target image Dst have sizes (4,4) and (8,8), and we want to fill in the pixel values of the target image from the original image. PyTorch gives the result shown below. Taking the point at target coordinates (5,5) as an example, and using f(x,y) and g(x,y) to denote the gray values of pixels in the original image and the target image respectively, the bilinear interpolation process is as follows:
[Figure: PyTorch output for bilinear interpolation from a (4,4) image to an (8,8) image]
The value to be filled at this point is calculated as follows:
[Figure: manual calculation of the value at target point (5,5)]
There is a slight deviation between the manually calculated value and PyTorch's result. This is most likely because PyTorch does not use the simple mapping from the beginning of the article: with align_corners=False it maps pixel centers instead, i.e. src_x = (dst_x + 0.5) * (src_w / dst_w) - 0.5.
[Figure: the manually calculated value compared with the PyTorch output]
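
One plausible source of the deviation mentioned above is the coordinate mapping itself. The sketch below compares the simple mapping from the beginning of the article with the pixel-center (half-pixel) mapping that, to my understanding, torch.nn.functional.interpolate applies when align_corners=False. The function names are mine, and clamping negative coordinates to 0 is my assumption about how the border is handled in bilinear mode.

```python
def map_simple(dst_i, src_size, dst_size):
    """Mapping used in this article: scale the target index directly."""
    return dst_i * (src_size / dst_size)


def map_half_pixel(dst_i, src_size, dst_size):
    """Pixel-center mapping (assumed to match align_corners=False)."""
    src_i = (dst_i + 0.5) * (src_size / dst_size) - 0.5
    return max(src_i, 0.0)  # assumption: negative coordinates are clamped to 0


# Target index 5 when going from size 4 to size 8
print(map_simple(5, 4, 8))
print(map_half_pixel(5, 4, 8))
```

With the simple mapping the target point (5,5) lands on (2.5, 2.5) in the source, while the pixel-center mapping puts it at (2.25, 2.25). The four neighbors are the same, but the interpolation weights differ, which would be enough to explain a small difference in the result.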

3. Code implementation

import torch
from torchvision import transforms


# Random 4x4 single-channel image with integer values in [0, 10)
img = torch.randint(10, size=(1, 4, 4), dtype=torch.float32)
print(img)
print('---' * 5)
# Upsample to 8x8 with bilinear interpolation
bilinear_interpolation = transforms.Resize(
    size=(8, 8), interpolation=transforms.InterpolationMode.BILINEAR)
resize_img = bilinear_interpolation(img)
print(resize_img)

3. Bicubic Interpolation

1. Introduction

Bicubic interpolation, also known as cubic convolution interpolation, produces smoother edges than bilinear interpolation. Its results are better and more accurate, but it is slower.
During interpolation, the value at the point (x, y) = (i+u, j+v) is a weighted average of the 16 nearest points on the rectangular grid around it. In order, these 16 points are {(i-1,j-1), (i-1,j), (i-1,j+1), (i-1,j+2), …, (i,j), …, (i+2,j-1), (i+2,j), (i+2,j+1), (i+2,j+2)}. The weights are computed separately along the X and Y directions using a cubic polynomial interpolation function.
The interpolation formula is:
B(dst_x, dst_y) = Σ_{i=0..3} Σ_{j=0..3} f_ij * w_i * w_j
Here a_ij denotes each neighboring point, f_ij is the pixel value of point a_ij in the original image, and w_i and w_j are the weights of that point along the two directions. For example, a_00 is the first point in the upper left corner of the rectangular grid, and so on. The first step is to compute all the weights with the Bicubic function; the interpolation itself follows from them.
The Bicubic function is constructed as follows:
W(x) = (a+2)|x|^3 - (a+3)|x|^2 + 1        for |x| <= 1
W(x) = a|x|^3 - 5a|x|^2 + 8a|x| - 4a      for 1 < |x| < 2
W(x) = 0                                   otherwise
When a=-0.5, the function has the following shape:
[Figure: plot of the Bicubic function W(x) for a = -0.5]
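
As a quick sketch, the Bicubic function can be written in a few lines of Python. This assumes the usual cubic convolution form of the kernel with a free parameter a, which is how I read the formula above. The function name bicubic_weight is my own.

```python
def bicubic_weight(x, a=-0.5):
    """Cubic convolution weight W(x) with free parameter a."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


# Weights of four neighbors at distances 1.5, 0.5, 0.5 and 1.5 from the point
weights = [bicubic_weight(d) for d in (1.5, 0.5, -0.5, -1.5)]
print(weights)
print(sum(weights))
```

The weight is 1 at distance 0 and 0 at distances 1 and 2, and the four weights along one direction add up to 1, so a constant image stays constant after interpolation. The outer weights are slightly negative, which is what gives bicubic interpolation its sharper edges.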
The basics are now in place. Next, let's look at the concrete steps of the interpolation. Dst and Src denote the target image and the original image. We want to know the pixel value of point A(dst_x, dst_y) in Dst, which corresponds to the point P(src_x,src_y)=P(x,y) in the original image:
[Figure: point P and its 4×4 grid of neighboring points a_00 to a_33 in the original image]

1) Use the formula at the beginning of the article to compute src_x, src_y from dst_x, dst_y. In general src_x and src_y are fractional and can be written as src_x=i+u, src_y=j+v; the point (i, j) and the points around it form a rectangular grid of 16 neighbors;
2) Because the Bicubic function is one-dimensional, the weight of each point has to be computed separately along the X and Y directions. The weight of each point is written w_ij(w_x,w_y) and is computed as follows:

  • Find the distance l_ij(l_x,l_y) between the point and P along the X and Y directions. For example, the distance between a_00(i-1,j-1) and P(i+u,j+v) is l_00(i+u-i+1,j+v-j+1)=l_00(u+1,v+1), and the distance between a_33(i+2,j+2) and P(i+u,j+v) is l_33(i+2-i-u,j+2-j-v)=l_33(2-u,2-v);
  • The weight of each neighboring point is then w(w_x,w_y)=w(W(l_x),W(l_y)), where W is the Bicubic function;

3) Once the weight of every point is known, the interpolation result B(dst_x,dst_y) follows from the interpolation formula;

It is easy to compute the distance l(l_x,l_y) from each of the 16 points a_00 to a_33 to point P:
[Figure: distances from the 16 neighboring points to P]
For example, the distance between a_00 and P is l_00(l_0,l_0)=l_00(u+1,v+1), and the distance between a_33 and P is l_33(l_3,l_3)=l_33(2-u,2-v). By now the process of bicubic interpolation should be clear!
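
The weighted sum over the 16 neighbors can be sketched directly in plain Python. The sketch takes the 4×4 grid of neighbor values and the fractional offsets u and v, computes the per-axis distances u+1, u, u-1, u-2 (and likewise for v), turns them into weights, and sums f_ij * w_i * w_j. The function names are mine, the kernel parameter a is left adjustable, and the sample grid is made up purely for illustration.

```python
def cubic_weight(x, a):
    """Cubic convolution weight for distance x and kernel parameter a."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def bicubic_at(patch, u, v, a=-0.5):
    """Weighted sum over the 4x4 neighbors a_00..a_33 of P = (i+u, j+v)."""
    offsets = [1, 0, -1, -2]  # signed distances are u+1, u, u-1, u-2
    w_x = [cubic_weight(u + k, a) for k in offsets]
    w_y = [cubic_weight(v + k, a) for k in offsets]
    return sum(patch[i][j] * w_x[i] * w_y[j]
               for i in range(4) for j in range(4))


# Made-up 4x4 neighborhood, with P exactly in the middle (u = v = 0.5)
patch = [[9.0, 0.0, 7.0, 3.0],
         [7.0, 0.0, 1.0, 8.0],
         [1.0, 8.0, 1.0, 3.0],
         [5.0, 5.0, 1.0, 1.0]]
print(bicubic_at(patch, 0.5, 0.5, a=-1.0))
print(bicubic_at(patch, 0.5, 0.5, a=-0.75))
```

Changing only the parameter a already changes the result noticeably, so any comparison against a library implementation has to use the same value of a as that library.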
--------------------------------------------------
In fact, the interpolation can also be written in matrix form, as follows:
f(i+u, j+v) = A * B * C^T
A = [S(u+1)  S(u)  S(u-1)  S(u-2)]
C = [S(v+1)  S(v)  S(v-1)  S(v-2)]
B = the 4×4 matrix of pixel values from f(i-1, j-1) to f(i+2, j+2)

Here f(i+u,j+v) is the value to be computed. A and C hold the weights, i.e. w_x and w_y in the first method, and B is the matrix formed by the 16 neighboring points in the original image. The weights are approximated with the function S(x).

2. An Example

The input Src is 5×5 and the output Dst is 10×10. Let's find the value of the point (5,5) in Dst:
[Figure: the 5×5 input and the 4×4 neighborhood used for target point (5,5)]
Below I implement the matrix method above by hand. The result always deviates somewhat from PyTorch's built-in function, which puzzled me. A likely explanation is that the S(x) used here corresponds to a = -1 in the Bicubic function, whereas PyTorch, as far as I know, uses a = -0.75 together with a pixel-center coordinate mapping:

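import math

import numpy as np

src_w = src_h = 5
dst_w = dst_h = 10

# Target pixel whose value we want
dst_x = dst_y = 5
# Map the target coordinates back to the original image
src_x = dst_x * (src_w / dst_w)
src_y = dst_y * (src_h / dst_h)

# Integer part (i, j) and fractional part (u, v) of the source position
i = math.floor(src_x)
j = math.floor(src_y)
u = src_x - i
v = src_y - j

# Offsets of the four neighbors along one axis: u+1, u, u-1, u-2
base = [1, 0, -1, -2]


def distances(r):
    """Return the 1x4 row of signed distances from P to its neighbors."""
    d = np.zeros((1, 4))
    for k in range(4):
        d[0, k] = r + base[k]
    return d


l_x, l_y = distances(u), distances(v)


def S_x(l):
    """Evaluate the weight function S(x) for a 1x4 row of distances."""
    s = np.zeros(shape=(1, 4))
    for k in range(4):
        x_abs = math.fabs(l[0, k])
        if x_abs <= 1:
            s[0, k] = 1 - 2 * math.pow(x_abs, 2) + math.pow(x_abs, 3)
        elif 1 < x_abs < 2:
            s[0, k] = 4 - 8 * x_abs + 5 * math.pow(x_abs, 2) - math.pow(x_abs, 3)
    return s


# Weight vectors along the X and Y directions
A, C = S_x(l_x), S_x(l_y)

# 4x4 neighborhood of P taken from the original image
B = np.array([
    [9., 0., 7., 3.],
    [7., 0., 1., 8.],
    [1., 8., 1., 3.],
    [5., 5., 1., 1.]])
# f(i+u, j+v) = A * B * C^T
b = np.matmul(np.matmul(A, B), np.transpose(C))
print(b)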

3. Code implementation

import torch
from torchvision import transforms


# Random 4x4 single-channel image with integer values in [0, 10)
img = torch.randint(10, size=(1, 4, 4), dtype=torch.float32)
print(img)
print('---' * 5)
# Upsample to 8x8 with bicubic interpolation
bicubic_interpolation = transforms.Resize(
    size=(8, 8), interpolation=transforms.InterpolationMode.BICUBIC)
resize_img = bicubic_interpolation(img)
print(resize_img)

4. PyTorch implementation

Note that in PyTorch the interpolation method is selected with the interpolation parameter of transforms.Resize. The three methods above correspond to the following three modes:

Interpolation method             Mode
nearest neighbor interpolation   transforms.InterpolationMode.NEAREST
bilinear interpolation           transforms.InterpolationMode.BILINEAR
bicubic interpolation            transforms.InterpolationMode.BICUBIC

Here I use a photo of Jay Chou to show the results of the three interpolation methods.
The original picture is as follows:
[Figure: original photo of Jay Chou]

import torch
from torchvision import transforms
from PIL import Image
from torchvision.utils import  save_image


img=Image.open('./Jay.png',mode='r')
img_to_tensor=transforms.ToTensor()(img)
# print(img_to_tensor.shape)

nearest_neighbor_interpolation=transforms.Resize(size=(1024,1024),
                                       interpolation=transforms.InterpolationMode.NEAREST)
nearest_resize_img=nearest_neighbor_interpolation(img_to_tensor)

bilinear_interpolation=transforms.Resize(size=(1024,1024),
                                       interpolation=transforms.InterpolationMode.BILINEAR)
bilinear_resize_img=bilinear_interpolation(img_to_tensor)
#
bicubic_interpolation=transforms.Resize(size=(1024,1024),
                                       interpolation=transforms.InterpolationMode.BICUBIC)
bicubic_resize_img=bicubic_interpolation(img_to_tensor)

save_image(nearest_resize_img,'./nearest.png')
save_image(bilinear_resize_img,'./bilinear.png')
save_image(bicubic_resize_img,'./bicubic.png')

Interpolation results:
1) Nearest neighbor interpolation
[Figure: photo resized with nearest neighbor interpolation]

2) Bilinear interpolation
[Figure: photo resized with bilinear interpolation]

3) Bicubic interpolation
[Figure: photo resized with bicubic interpolation]

I hadn't noticed this event in July and only came across it today, so I'm joining in, hehe!
Comments and corrections on this article are welcome; let's learn together.

Origin blog.csdn.net/qq_43665602/article/details/126853751