Computer vision day 93: Learning pixel-wise dilation filtering for efficient single-image deraining

1 Introduction

Existing methods often make specific assumptions about the rain model that are hard to generalize to the diverse situations found in the real world, and they must resort to complex optimization or progressive refinement. This seriously limits the efficiency and effectiveness of these methods in many efficiency-critical applications.

To fill this gap, in this paper we regard single-image deraining as a general image-enhancement problem and propose a model-free deraining method, namely EfficientDeRain. It can process a rainy image within 10 ms (about 6 ms on average), i.e., more than 80 times faster than the state-of-the-art method RCDNet, while achieving similar deraining quality.

We first propose a novel pixel-wise dilation filtering method: the rainy image is filtered with pixel-wise kernels estimated by a kernel prediction network, which efficiently predicts suitable multi-scale kernels for each pixel.

Then, to bridge the gap between synthetic and real data, we further propose an effective data augmentation method (i.e., RainMix), which helps to train the network for real rain image processing. We perform a comprehensive evaluation on synthetic and real rain datasets to demonstrate the effectiveness and efficiency of our method.

We have released EfficientDeRain at [https://github.com/tsingqguo/efficientderain.git](https://github.com/tsingqguo/efficientderain.git).

Figure 1:

Top: Comparison results (i.e., PSNR vs. Time and SSIM vs. Time) on the challenging Rain100H dataset.

Bottom: An example of deraining a real rainy image with EfDeRain, EfDeRain without RainMix, and the state-of-the-art RCDNet (Wang et al. 2020a). Note that the time costs of all compared methods are measured one by one on the same computer.

We demonstrate the strength of our method on synthetic and real rain datasets, achieving both high deraining quality and high efficiency. As shown in Figure 1, our method (i.e., EfDeRain) runs about 88 times faster than the state-of-the-art method RCDNet (Wang et al. 2020a) with similar PSNR and SSIM. Moreover, when equipped with RainMix, EfDeRain produces better visual results than RCDNet on real rainy images.

3 Methodology

3.1 Pixel-wise image filtering and deraining

In this section, we propose a model-free deraining method based on image filtering. In general, rain can be regarded as a kind of degradation that causes effects such as occlusion, fog, and motion blur, so it is reasonable to handle it with image filtering, which can effectively address various kinds of degradation. Specifically, we process the input rainy image $\mathbf{I}_r \in \mathbb{R}^{H \times W}$ with a pixel-wise filter:

$$\hat{\mathbf{I}} = \mathbf{K} \circledast \mathbf{I}_r \tag{1}$$

where $\hat{\mathbf{I}} \in \mathbb{R}^{H \times W}$ is the estimated derained image, $\circledast$ denotes the pixel-wise filtering operation in which each pixel is processed by its own exclusive kernel, and $\mathbf{K} \in \mathbb{R}^{H \times W \times K^2}$ contains the kernels of all pixels. Specifically, when deraining the $p$-th pixel of $\mathbf{I}_r$, we take its exclusive kernel, i.e., the vector at the $p$-th position of $\mathbf{K}$, and reshape it into $\mathbf{K}_p \in \mathbb{R}^{K \times K}$, where $p$ denotes the 2D coordinate of the pixel. The derained pixel is then predicted by

$$\hat{\mathbf{I}}(p) = \sum_{t} \mathbf{K}_p(t)\,\mathbf{I}_r(p + t) \tag{2}$$

where $t$ ranges from $(-\frac{K-1}{2}, -\frac{K-1}{2})$ to $(\frac{K-1}{2}, \frac{K-1}{2})$.
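To make the pixel-wise filtering of Eq. (2) concrete, here is a minimal PyTorch sketch (not the authors' released code) that applies a distinct K×K kernel at every pixel via `unfold`; the tensor names and shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def pixelwise_filter(img, kernels):
    """Apply a distinct KxK kernel at every pixel (Eq. 2).

    img:     (B, C, H, W) rainy image
    kernels: (B, K*K, H, W) per-pixel kernels (e.g., predicted by a KPN)
    returns: (B, C, H, W) filtered estimate
    """
    B, C, H, W = img.shape
    K2 = kernels.shape[1]
    K = int(K2 ** 0.5)
    # Gather the KxK neighborhood of every pixel: (B, C*K*K, H*W)
    patches = F.unfold(img, kernel_size=K, padding=K // 2)
    patches = patches.view(B, C, K2, H, W)
    # Weight each neighborhood by the pixel's own kernel and sum over it
    return (patches * kernels.unsqueeze(1)).sum(dim=2)

# Example: 3x3 kernels for a 1x3x64x64 image
# out = pixelwise_filter(torch.rand(1, 3, 64, 64),
#                        torch.softmax(torch.rand(1, 9, 64, 64), dim=1))
```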

Next, consider two questions:

① How to effectively and efficiently estimate spatially-variant, scale-variant, and semantic-aware kernels?

② How to train a powerful deraining DNN on synthetic data while bridging the gap to real-world deraining?

Answers:

① A multi-dilation kernel prediction network (Section 3.2).

② A simple yet effective rain-augmentation method, denoted RainMix (Section 3.3).

3.2 Learnable pixel-wise dilation filtering

Kernel prediction network

We take the rainy image as input and estimate the pixel-wise kernels $\mathbf{K}$ for deraining:

$$\mathbf{K} = \mathrm{KPN}(\mathbf{I}_r) \tag{3}$$

where KPN(·) denotes a UNet-like network; its structure is shown in Figure 2.

Figure 2: structure of the kernel prediction network (KPN).
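For intuition only, the following is a toy sketch of a UNet-like kernel prediction network with a single encoder–decoder stage and a head that outputs K² values per pixel; the actual KPN in Figure 2 is deeper, and its exact layers, widths, and kernel normalization are not reproduced here.

```python
import torch
import torch.nn as nn

class TinyKPN(nn.Module):
    """Toy UNet-like KPN: rainy image -> per-pixel K*K kernels (illustrative only)."""

    def __init__(self, in_ch=3, K=3, width=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.dec = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(width, K * K, 3, padding=1)

    def forward(self, rainy):                         # rainy: (B, 3, H, W), H and W even
        feat = self.dec(self.enc(rainy))              # back to (B, width, H, W)
        # softmax is one possible per-pixel normalization of the kernel weights
        return torch.softmax(self.head(feat), dim=1)  # (B, K*K, H, W)
```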

Multi-dilated image filtering and fusion

To enable our method to handle multi-scale rain streaks without sacrificing efficiency, we dilate each predicted kernel to additional scales, following the idea of dilated convolution (Yu and Koltun 2016).

Following dilated convolution (Yu and Koltun 2016), we extend the pixel-wise filtering in Eq. (1) to pixel-wise dilation filtering:

$$\hat{\mathbf{I}}_l(p) = \sum_{t} \mathbf{K}_p(t)\,\mathbf{I}_r(p + l\,t) \tag{4}$$

where $l$ is the dilation factor controlling the spatial extent over which the same kernel is applied. In practice, we consider four scales, i.e., $l = 1, 2, 3, 4$. Through Eq. (4) we obtain four derained images $\hat{\mathbf{I}}_1, \hat{\mathbf{I}}_2, \hat{\mathbf{I}}_3, \hat{\mathbf{I}}_4$, which are then fused by a $3 \times 3$ convolutional layer to produce the final output.
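A rough sketch of how the four dilated filterings and the 3×3 fusion layer could be wired together, reusing the pixel-wise filtering idea above; dilation is realized through `unfold`'s dilation argument, and the function/module names are mine, not from the official repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pixelwise_dilated_filter(img, kernels, dilation):
    """Pixel-wise dilation filtering (Eq. 4): the KxK kernel is applied with spacing `dilation`."""
    B, C, H, W = img.shape
    K = int(kernels.shape[1] ** 0.5)
    pad = dilation * (K // 2)                      # keeps the output at HxW
    patches = F.unfold(img, kernel_size=K, dilation=dilation, padding=pad)
    patches = patches.view(B, C, K * K, H, W)
    return (patches * kernels.unsqueeze(1)).sum(dim=2)

class MultiDilationFusion(nn.Module):
    """Filter at dilations l = 1..4 with the same predicted kernels, then fuse with a 3x3 conv."""

    def __init__(self, channels=3):
        super().__init__()
        self.fuse = nn.Conv2d(4 * channels, channels, kernel_size=3, padding=1)

    def forward(self, rainy, kernels):
        outs = [pixelwise_dilated_filter(rainy, kernels, l) for l in (1, 2, 3, 4)]
        return self.fuse(torch.cat(outs, dim=1))   # final derained estimate
```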

3.3 RainMix: A bridge between real-world and synthetic data

Algorithm 1: the RainMix-based learning algorithm.

We present the RainMix-based learning algorithm in Algorithm 1.
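Since Algorithm 1 is not reproduced here, the following is only a rough sketch of the spirit of RainMix: several randomly transformed versions of a rain map are mixed with Dirichlet-sampled weights, blended with the original rain map using a Beta-sampled weight, and added to the clean image to form the training input. The exact transformation set, sampling depths, and distribution parameters should be taken from Algorithm 1 in the paper.

```python
import numpy as np

def rand_transform(rain):
    """One randomly chosen, shape-preserving geometric op; a stand-in for the
    transformation set used in the paper."""
    op = np.random.choice(["hflip", "vflip", "rot180", "identity"])
    if op == "hflip":
        return rain[:, ::-1].copy()
    if op == "vflip":
        return rain[::-1, :].copy()
    if op == "rot180":
        return np.rot90(rain, k=2).copy()
    return rain

def rain_mix(clean, rain, n_branches=3, max_depth=2):
    """Sketch of RainMix; `clean` and `rain` are float arrays in [0, 1]."""
    w = np.random.dirichlet([1.0] * n_branches)    # mixing weights across branches
    mixed = np.zeros_like(rain, dtype=np.float64)
    for i in range(n_branches):
        r = rain
        for _ in range(np.random.randint(1, max_depth + 1)):
            r = rand_transform(r)                  # chain of random transformations
        mixed += w[i] * r
    m = np.random.beta(1.0, 1.0)                   # blend with the original rain map
    rain_aug = m * rain + (1.0 - m) * mixed
    return np.clip(clean + rain_aug, 0.0, 1.0)     # synthesized rainy training input
```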

4 Experiments

4.1 Setups

**Datasets.** To comprehensively validate and evaluate our method, we conduct comparison and analysis experiments on four popular datasets: the synthetic Rain100H (Yang et al. 2017, 2019) and Rain1400 (Fu et al. 2017) datasets, the recently proposed real-rain SPA dataset (Wang et al. 2019), and the real Raindrop dataset (Qian et al. 2018).

**Metrics.** We adopt the widely used Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM) as quantitative evaluation metrics on all datasets. In general, higher PSNR and SSIM indicate better deraining quality.
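As a reference for how these metrics are typically computed (the paper's exact evaluation protocol, e.g., RGB vs. luminance SSIM, may differ), a minimal scikit-image based sketch:

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(derained, gt):
    """PSNR / SSIM for one image pair; both are float arrays in [0, 1], shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(gt, derained, data_range=1.0)
    # channel_axis requires scikit-image >= 0.19 (older versions use multichannel=True)
    ssim = structural_similarity(gt, derained, data_range=1.0, channel_axis=-1)
    return psnr, ssim
```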

**Baselines.** We compare against a total of 14 deraining methods (9 + 5) across the experiments. Note that the time costs of all compared methods are measured one by one on the same PC with an Intel Xeon E5-1650 CPU and an NVIDIA Quadro P6000 GPU.
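The paper does not detail the timing protocol beyond measuring methods one by one on the same machine; a common way to measure per-image GPU inference time (with warm-up and explicit synchronization) looks roughly like this, where `model` and `img` are placeholders:

```python
import time
import torch

@torch.no_grad()
def time_model(model, img, warmup=10, runs=100):
    """Average per-image inference time in milliseconds; GPU timing needs a sync."""
    for _ in range(warmup):
        model(img)
    if img.is_cuda:
        torch.cuda.synchronize()
    start = time.time()
    for _ in range(runs):
        model(img)
    if img.is_cuda:
        torch.cuda.synchronize()
    return (time.time() - start) * 1000.0 / runs
```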

4.2 Comparison of Rain100H&1400 datasets


We compare our method with nine baseline methods on the Rain100H and Rain1400 datasets in Fig. 4. Overall, our method EfDeRain achieves the lowest time cost among the top-performing methods on both datasets while obtaining comparable PSNR and SSIM.

Figure 5: visual comparison with RCDNet and PReNet.

We further compare the visual results of EfDeRain with the state-of-the-art baselines RCDNet and PReNet, as shown in Fig. 5.

4.3 Comparison of real SPA rain datasets

We further compare our method with 8 baseline methods on the SPA dataset (Wang et al. 2019), in which the rainy images are real and the ground truths are obtained through human labeling and multi-frame fusion. As shown in Figure 7, our method achieves almost the same PSNR and SSIM as the state-of-the-art method RCDNet and outperforms all other baselines, while running more than 71 times faster than RCDNet.

We also visually compare our method with RCDNet and SPANet in Fig. 6. The results show that our method handles rain streaks of diverse patterns better and produces better visual results than RCDNet and SPANet; the red arrows mark the main differences from the other two methods.

4.4 Comparison of real raindrop datasets

Besides the rain-streak datasets, we also evaluate our method on the raindrop-removal task to show its generalization ability. We train our network on the Raindrop dataset (Qian et al. 2018) and compare it with 6 state-of-the-art baseline methods.

In particular, the DeRaindrop method (Qian et al. 2018) is specifically designed for this problem, with an attentive recurrent network that perceives raindrop regions. Without changing any architecture or hyperparameters, our method achieves the second-best SSIM, behind only DeRaindrop, and outperforms all other baselines, demonstrating the effectiveness and generality of our method.

4.5 Ablation studies

We further validate the strength of each of our contributions with the Rain100H visualization results in Fig. 8.

Figure 8: ablation visualization results on Rain100H.

Furthermore, we perform a visual comparison on the SPA dataset in Fig. 9 to verify the effectiveness of RainMix. In all cases, when RainMix is not used there are always some heavy rain traces that cannot be handled, as indicated by the red arrows in the figure (the red arrows show the main differences between the two versions; the yellow arrows indicate raindrops that were not labeled by humans but are still removed by our method). Thus, RainMix enhances the ability of our method to remove realistic rain traces even though the rain patterns are highly diverse.

5 Conclusions

In this paper, we propose a novel model-free deraining method called EfficientDeRain. Our method not only achieves remarkably high performance but also runs more than 80 times faster than the state-of-the-art method. First, we propose and design a novel pixel-wise dilation filtering approach, in which each pixel is filtered by multi-scale kernels estimated by an offline-trained kernel prediction network. Second, we propose a simple yet effective data augmentation method for training the network, RainMix, which bridges the gap between synthetic and real data. Finally, we conduct large-scale evaluations on the popular and challenging synthetic datasets Rain100H and Rain1400 and the real-world datasets SPA and Raindrop, comprehensively validating the advantages of our method in both efficiency and deraining quality.

In the future, we will investigate the impact of deraining on other computer vision tasks, e.g., object segmentation (Guo et al. 2018, 2017c) and object tracking (Guo et al. 2020c,a, 2017a,b), using state-of-the-art DNN testing works (Xie et al. 2019a; Du et al. 2019; Xie et al. 2019b; Ma et al. 2018a,b, 2019). We also plan to study single-image deraining from the perspective of adversarial attacks, e.g., (Guo et al. 2020b; Wang et al. 2020b; Cheng et al. 2020a; Gao et al. 2020; Cheng et al. 2020b).


Source: blog.csdn.net/qq_43537420/article/details/130744300