Summary

1. GCANet
Gated Context Aggregation Network for Image Dehazing and Deraining
The two most important contributions of the paper are:
smooth dilated convolution, which replaces the ordinary dilated convolution and eliminates gridding artifacts
gated fusion sub-network, which fuses features from different levels and benefits both low-level and high-level tasks

Smooth dilated convolution:

# SS convolution (separate and shared convolution)
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShareSepConv(nn.Module):
    def __init__(self, kernel_size):
        super(ShareSepConv, self).__init__()
        assert kernel_size % 2 == 1, 'kernel size should be odd'
        self.padding = (kernel_size - 1)//2
        # learnable kernel shared across all channels, initialized as an impulse at the center (identity)
        weight_tensor = torch.zeros(1, 1, kernel_size, kernel_size)
        weight_tensor[0, 0, (kernel_size-1)//2, (kernel_size-1)//2] = 1
        self.weight = nn.Parameter(weight_tensor)
        self.kernel_size = kernel_size

    def forward(self, x):
        inc = x.size(1)
        # expand the shared kernel to every input channel and apply it depthwise (groups = number of channels)
        expand_weight = self.weight.expand(inc, 1, self.kernel_size, self.kernel_size).contiguous()
        return F.conv2d(x, expand_weight, None, 1, self.padding, 1, inc)


class SmoothDilatedResidualBlock(nn.Module):
    def __init__(self, channel_num, dilation=1, group=1):
        super(SmoothDilatedResidualBlock, self).__init__()
        # SS conv with kernel size 2*dilation-1 inserted before the dilated conv to remove gridding artifacts
        self.pre_conv1 = ShareSepConv(dilation*2-1)
        self.conv1 = nn.Conv2d(channel_num, channel_num, 3, 1, padding=dilation, dilation=dilation, groups=group, bias=False)
        self.norm1 = nn.InstanceNorm2d(channel_num, affine=True)
        self.pre_conv2 = ShareSepConv(dilation*2-1)
        self.conv2 = nn.Conv2d(channel_num, channel_num, 3, 1, padding=dilation, dilation=dilation, groups=group, bias=False)
        self.norm2 = nn.InstanceNorm2d(channel_num, affine=True)

    def forward(self, x):
        y = F.relu(self.norm1(self.conv1(self.pre_conv1(x))))
        y = self.norm2(self.conv2(self.pre_conv2(y)))
        return F.relu(x+y)
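
A quick shape check of the block above (a minimal usage sketch; only the two classes defined in this post are assumed):

x = torch.randn(1, 64, 128, 128)                       # N, C, H, W
block = SmoothDilatedResidualBlock(channel_num=64, dilation=2)
print(block(x).shape)                                  # torch.Size([1, 64, 128, 128]): spatial size is preserved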

        # gated fusion sub-network (fragment of GCANet.forward): y1, y2, y3 are features from different levels
        gates = self.gate(torch.cat((y1, y2, y3), dim=1))
        # weight each level by its learned per-pixel gate map, then sum
        gated_y = y1 * gates[:, [0], :, :] + y2 * gates[:, [1], :, :] + y3 * gates[:, [2], :, :]
        y = F.relu(self.norm4(self.deconv3(gated_y)))
        y = F.relu(self.norm5(self.deconv2(y)))
        if self.only_residual:
            y = self.deconv1(y)
        else:
            y = F.relu(self.deconv1(y))
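
The fragment above uses self.gate, the deconv layers and the norm layers, which are defined in GCANet's __init__ and not shown here. A minimal sketch of the gate itself, assuming each of y1, y2, y3 has channel_num channels (my reconstruction, not necessarily the authors' exact code):

        # hypothetical reconstruction inside GCANet.__init__: a 3x3 conv maps the concatenated
        # features (3*channel_num channels) to three per-pixel gate maps, one per feature level
        self.gate = nn.Conv2d(channel_num * 3, 3, kernel_size=3, stride=1, padding=1, bias=True)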

2. MSBDN
Multi-Scale Boosted Dehazing Network with Dense Feature Fusion
**DFF:** the dense feature fusion (DFF) module simultaneously compensates for the spatial information missing from high-resolution features and exploits features from non-adjacent levels.

Contribution: based on a U-Net backbone, feature information is used at multiple scales.

# DFF module (encoder side)
class Encoder_MDCBlock1(torch.nn.Module):
    def __init__(self, num_filter, num_ft, kernel_size=4, stride=2, padding=1, bias=True, activation='prelu', norm=None, mode='iter1'):
        super(Encoder_MDCBlock1, self).__init__()
        self.mode = mode
        self.num_ft = num_ft - 1
        self.up_convs = nn.ModuleList()
        self.down_convs = nn.ModuleList()
        for i in range(self.num_ft):
            self.up_convs.append(
                DeconvBlock(num_filter//(2**i), num_filter//(2**(i+1)), kernel_size, stride, padding, bias, activation, norm=None)
            )
            self.down_convs.append(
                ConvBlock(num_filter//(2**(i+1)), num_filter//(2**i), kernel_size, stride, padding, bias, activation, norm=None)
            )
    # forward fragment: back-projection-style fusion of the current coarse feature ft_l
    # with the list of earlier, higher-resolution features ft_h_list
    def forward(self, ft_l, ft_h_list):
        if self.mode == 'iter2':
            ft_fusion = ft_l
            for i in range(len(ft_h_list)):
                ft = ft_fusion
                for j in range(self.num_ft - i):
                    ft = self.up_convs[j](ft)                          # upsample to the resolution of ft_h_list[i]
                ft = ft - ft_h_list[i]                                 # residual against the stored higher-resolution feature
                for j in range(self.num_ft - i):
                    ft = self.down_convs[self.num_ft - i - j - 1](ft)  # project the residual back down
                ft_fusion = ft_fusion + ft
        return ft_fusion
# DFF module (decoder side)
class Decoder_MDCBlock1(torch.nn.Module):
    def __init__(self, num_filter, num_ft, kernel_size=4, stride=2, padding=1, bias=True, activation='prelu', norm=None, mode='iter1'):
        super(Decoder_MDCBlock1, self).__init__()
        self.mode = mode
        self.num_ft = num_ft - 1
        self.down_convs = nn.ModuleList()
        self.up_convs = nn.ModuleList()
        for i in range(self.num_ft):
            self.down_convs.append(
                ConvBlock(num_filter*(2**i), num_filter*(2**(i+1)), kernel_size, stride, padding, bias, activation, norm=None)
            )
            self.up_convs.append(
                DeconvBlock(num_filter*(2**(i+1)), num_filter*(2**i), kernel_size, stride, padding, bias, activation, norm=None)
            )
    # forward fragment: fuse the current high-resolution feature ft_h with the list of earlier, lower-resolution features ft_l_list
    def forward(self, ft_h, ft_l_list):
        if self.mode == 'iter2':
            ft_fusion = ft_h
            for i in range(len(ft_l_list)):
                ft = ft_fusion
                for j in range(self.num_ft - i):
                    ft = self.down_convs[j](ft)
                ft = ft - ft_l_list[i]
                for j in range(self.num_ft - i):
                    ft = self.up_convs[self.num_ft - i - j - 1](ft)
                ft_fusion = ft_fusion + ft
        return ft_fusion
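
Read together, the iter2 branches implement an error-feedback (back-projection) style of fusion: the current feature is repeatedly projected to the resolution of a previously stored feature, the difference is taken, projected back, and added as a correction. A minimal call sketch for the encoder-side block (illustrative only: the forward(ft_l, ft_h_list) signature is the one reconstructed above, and ConvBlock/DeconvBlock are assumed to be the stride-2 conv / transposed-conv wrappers from the MSBDN repository):

# hypothetical usage sketch, not the authors' training code
fuse = Encoder_MDCBlock1(num_filter=64, num_ft=3, mode='iter2')
ft_h_list = [torch.randn(1, 16, 256, 256),   # earlier feature, highest resolution, 64/(2**2) channels
             torch.randn(1, 32, 128, 128)]   # earlier feature, middle resolution, 64/(2**1) channels
ft_l = torch.randn(1, 64, 64, 64)            # current feature at the coarsest resolution
fused = fuse(ft_l, ft_h_list)                # same shape as ft_l: (1, 64, 64, 64)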
# SOS boosting (two fragments of the MSBDN network)
# __init__: the refinement unit at this scale is a stack of residual blocks
        self.dense_4 = nn.Sequential(
            ResidualBlock(128),
            ResidualBlock(128),
            ResidualBlock(128)
        )
# forward: refine, keep a residual connection, and subtract the upsampled deeper feature res16x
        res8x = self.dense_4(res8x) + res8x - res16x
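
This line follows the SOS (strengthen-operate-subtract) boosting strategy that MSBDN borrows from image restoration. In its generic form the boosting update is x^{n+1} = f(y + x^n) - x^n: strengthen the input y by adding the previous estimate, operate the restoration unit f on the sum, then subtract the previous estimate. In the fragment above, res16x plays the role of the estimate that is subtracted and dense_4 acts as f; the extra "+ res8x" is a residual skip around the refinement unit (my reading of the snippet, not a statement from the paper).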

3. 4kDehazing
Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning

    # forward pass of the dehazing network (fragment): bilateral-grid coefficients are predicted
    # on a downsampled copy of the input and sliced back to full resolution under per-channel guidance
    def forward(self, x):
        
        # two downsampled copies: x_u feeds the mini U-Net, x_r feeds the coefficient-predicting U-Net
        x_u = F.interpolate(x, (320, 320), mode='bicubic', align_corners=True)
        x_r = F.interpolate(x, (256, 256), mode='bicubic', align_corners=True)
        # predict a 16x16x16 bilateral grid with 12 affine coefficients per cell
        coeff = self.downsample(self.u_net(x_r)).reshape(-1, 12, 16, 16, 16)
              
        guidance_r = self.guide_r(x[:, 0:1, :, :])
        guidance_g = self.guide_g(x[:, 1:2, :, :])
        guidance_b = self.guide_b(x[:, 2:3, :, :])
        
        slice_coeffs_r = self.slice(coeff, guidance_r)
        slice_coeffs_g = self.slice(coeff, guidance_g) 
        slice_coeffs_b = self.slice(coeff, guidance_b)   
        
        x_u = self.u_net_mini(x_u)
        x_u = F.interpolate(x_u, (x.shape[2], x.shape[3]), mode='bicubic', align_corners=True)   
        
        output_r = self.apply_coeffs(slice_coeffs_r, self.p(self.r_point(x_u)))
        output_g = self.apply_coeffs(slice_coeffs_g, self.p(self.g_point(x_u)))
        output_b = self.apply_coeffs(slice_coeffs_b, self.p(self.b_point(x_u)))
        
        output = torch.cat((output_r, output_g, output_b), dim=1)
        output = self.fusion(output)
        # final reconstruction from the fused per-channel outputs
        output = self.p(self.x_r_fusion(output) * x - output + 1)

        return output
 
# process the three colour channels separately
class ApplyCoeffs(nn.Module):
    def __init__(self):
        super(ApplyCoeffs, self).__init__()
        self.degree = 3

    def forward(self, coeff, full_res_input):
        # per-pixel affine colour transform: each output channel is a weighted sum of the input RGB plus a bias
        R = torch.sum(full_res_input * coeff[:, 0:3, :, :], dim=1, keepdim=True) + coeff[:, 3:4, :, :]
        G = torch.sum(full_res_input * coeff[:, 4:7, :, :], dim=1, keepdim=True) + coeff[:, 7:8, :, :]
        B = torch.sum(full_res_input * coeff[:, 8:11, :, :], dim=1, keepdim=True) + coeff[:, 11:12, :, :]
        result = torch.cat([R, G, B], dim=1)
        return result
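
The forward pass above also relies on self.slice, which is not shown in this post. In HDRNet-style bilateral learning, slicing looks up each full-resolution pixel in the low-resolution grid using its (x, y) position plus the guidance value as the third coordinate. A minimal sketch with F.grid_sample (my reconstruction under those assumptions, not the authors' exact module; the grid layout is assumed to be (N, 12, depth, height, width)):

import torch
import torch.nn as nn
import torch.nn.functional as F

class Slice(nn.Module):
    def forward(self, bilateral_grid, guidemap):
        # bilateral_grid: (N, 12, D, Hg, Wg) low-resolution grid of affine coefficients
        # guidemap:       (N, 1, H, W) single-channel guidance map in [0, 1]
        N, _, H, W = guidemap.shape
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, H, device=guidemap.device),
            torch.linspace(-1, 1, W, device=guidemap.device),
            indexing='ij')
        xs = xs.expand(N, 1, H, W)
        ys = ys.expand(N, 1, H, W)
        z = guidemap * 2 - 1                          # guidance value selects the grid depth
        grid = torch.stack((xs, ys, z), dim=-1)       # (N, 1, H, W, 3) in (x, y, z) order
        # trilinear lookup of the coefficients at every full-resolution pixel
        coeff = F.grid_sample(bilateral_grid, grid, mode='bilinear', align_corners=True)
        return coeff.squeeze(2)                       # (N, 12, H, W)

With such a module, slice_coeffs_r = self.slice(coeff, guidance_r) yields the 12 per-pixel coefficients that ApplyCoeffs consumes.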

Supplement:
Bilateral filtering is very slow, so the bilateral grid was introduced to speed it up.
First of all, it should be clear that the bilateral grid is essentially a data structure.


Taking a single-channel grayscale image as an example, the bilateral grid combines the image's two-dimensional spatial information with one dimension of gray-level (range) information, so it can be viewed as a 3D array.
As a simple example, suppose you have a brush used for filtering/smoothing. When you click this brush at some position (x, y), a point appears at position (x, y, E(x, y)) in the 3D bilateral grid, corresponding to the point you clicked on the 2D image E. As the brush moves, all three dimensions of the bilateral space are smoothed by a Gaussian. In flat areas, where the gray level changes little, smoothing along the two spatial dimensions is equivalent to Gaussian-filtering the image; near a boundary, the gray level changes sharply, and the Gaussian falloff of the brush ensures that values on the other side of the boundary are not affected, so the boundary is preserved.

This operation of clicking a point on the image and then projecting it into the 3D bilateral space is called splat.
After the brush has been applied in the 3D bilateral space, how is the filtered image reconstructed? Interpolation in bilateral space back to the image is termed slice.
To summarize briefly: bilateral filtering can be understood as filtering with weights that take the range (color) domain into account on top of the spatial domain. Computing those weights directly from the bilateral filtering formula is often too slow, so the idea of simulating bilateral filtering on a bilateral grid was proposed: it accounts for the range information as a third dimension while greatly speeding up the computation.
The process of fast bilateral filtering can thus be briefly described as splat / blur / slice: sample the image and project it onto the 3D grid (splat), filter on the 3D grid (blur), and then interpolate at each (x, y) to reconstruct the filtered image (slice).
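
A minimal grayscale sketch of the splat / blur / slice pipeline (illustrative only: grid resolution, blur size and the helper name are my own choices, nearest-neighbour splat/slice and a box blur stand in for the trilinear interpolation and Gaussian blur of the real algorithm, and the explicit loops make it slow):

import torch
import torch.nn.functional as F

def fast_bilateral_gray(img, grid_hw=32, grid_z=16, blur=1):
    # img: (H, W) grayscale tensor with values in [0, 1]
    H, W = img.shape
    grid = torch.zeros(grid_z, grid_hw, grid_hw)       # 3D bilateral grid: (range, y, x)
    cnt = torch.zeros_like(grid)
    ys = (torch.arange(H).float() / (H - 1) * (grid_hw - 1)).round().long()
    xs = (torch.arange(W).float() / (W - 1) * (grid_hw - 1)).round().long()
    zs = (img * (grid_z - 1)).round().long()
    # splat: accumulate each pixel into its (intensity, y, x) grid cell
    for i in range(H):
        for j in range(W):
            grid[zs[i, j], ys[i], xs[j]] += img[i, j]
            cnt[zs[i, j], ys[i], xs[j]] += 1
    # blur: smooth the 3D grid (box blur via avg_pool3d as a stand-in for a Gaussian)
    k = 2 * blur + 1
    grid = F.avg_pool3d(grid[None, None], k, stride=1, padding=blur)[0, 0]
    cnt = F.avg_pool3d(cnt[None, None], k, stride=1, padding=blur)[0, 0]
    # slice: read the blurred grid back at each pixel's (x, y, intensity) coordinate
    out = torch.empty_like(img)
    for i in range(H):
        for j in range(W):
            c = cnt[zs[i, j], ys[i], xs[j]]
            out[i, j] = grid[zs[i, j], ys[i], xs[j]] / c if c > 0 else img[i, j]
    return out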

Origin blog.csdn.net/weixin_44021553/article/details/123733650