【Paper Notes】—Low Light Image Enhancement—Supervised—URetinex-Net—2022-CVPR

【Introduction】

【Title】: URetinex-Net: Retinex-based Deep Unfolding Network for Low-light Image Enhancement
【Venue】: CVPR 2022
【Institution】: Shenzhen University
【Authors】: Wenhui Wu, Jian Weng, Pingping Zhang, Xu Wang, Wenhan Yang, Jianmin Jiang
【paper】:https://openaccess.thecvf.com/CVPR2022
【video】:https://www.youtube.com/watch?v=MJZ5HT1jGrA
【code_Pytorch】:https://github.com/AndersonYong/URetinex-Net

【Problem】

The hand-crafted priors and optimization-driven solutions commonly used by Retinex model-based methods lead to a lack of adaptability and efficiency when dealing with low-light images.

Model-based methods: rely on hand-crafted priors, and the optimization process is time-consuming.
Learning-based methods: inference is fast, but the models lack interpretability.

【Solution】

A Retinex-based deep unfolding network (URetinex-Net) is proposed, which unfolds an optimization problem into a learnable network to decompose a low-light image into reflectance and illumination layers. By formulating the decomposition problem as an implicit-prior-regularized model, three learning-based modules are carefully designed, responsible for data-dependent initialization, efficient unfolding optimization, and user-specified illumination enhancement, respectively. In particular, the proposed unfolding optimization module introduces two networks that adaptively fit implicit priors in a data-driven manner, achieving noise suppression and detail preservation in the final decomposition results.

【Innovation】

  1. URetinex-Net, proposed for the LLIE problem, includes three learnable modules: an initialization module, an unfolding optimization module, and an illumination adjustment module.
  2. Initialization module: jointly estimates the reflectance and illumination of the input in a unified, learning-based framework.
  3. Unfolding optimization module: unfolds an optimization problem into a learnable network, adaptively fits implicit priors in a data-driven manner, and achieves noise suppression and detail preservation in the final decomposition results.
  4. Illumination adjustment module: flexibly enhances the illumination with a user-defined ratio.
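The three-module flow above can be sketched in PyTorch (the framework the released code uses). The module interfaces, channel layouts, and the way the three parts are wired together here are illustrative assumptions, not the authors' exact implementation:

```python
import torch
import torch.nn as nn

class URetinexPipeline(nn.Module):
    """Hypothetical wiring of the three modules: initialization,
    T shared unfolding stages, and illumination adjustment."""
    def __init__(self, init_net, unfold_stage, adjust_net, num_stages=3):
        super().__init__()
        self.init_net = init_net          # initialization module
        self.unfold_stage = unfold_stage  # one unfolding stage, weights reused T times
        self.adjust_net = adjust_net      # illumination adjustment module
        self.num_stages = num_stages

    def forward(self, low_img, ratio):
        R, L = self.init_net(low_img)               # initial reflectance / illumination
        for _ in range(self.num_stages):            # iterative refinement
            R, L = self.unfold_stage(low_img, R, L)
        L_adj = self.adjust_net(L, ratio)           # user-specified enhancement
        return R * L_adj                            # enhanced image = R ∘ adjusted L
```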

【Formulation】

【URetinex-Net network structure】

URetinex-Net for LLIE problems consists of three learnable modules:

  1. First, the target low-light image is passed to the initialization module to produce the initial reflectance and illumination.
  2. Then, the unfolding optimization module iteratively refines the reflectance and illumination layers (β and μ denote the penalty parameters introduced by the variable splitting).
  3. Finally, the illumination adjustment module outputs the enhanced image according to the user-defined ratio w.
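A single refinement stage might look like the following hedged sketch: a gradient-style data-fidelity step on the Retinex constraint I ≈ R ∘ L, followed by the learned networks G_R and G_L acting as implicit priors. The explicit update rule and the roles given to β and μ are assumptions for illustration, not the paper's closed-form updates:

```python
import torch

def unfolding_stage(I, R, L, G_R, G_L, beta=0.05, mu=0.05):
    """One hypothetical unfolding stage (a sketch, not the paper's exact
    updates). beta and mu stand in for the penalty parameters."""
    # Data-fidelity step: pull the product R * L toward the observation I.
    R = R - beta * (R * L - I) * L
    L = L - mu * ((R * L - I) * R).mean(dim=1, keepdim=True)
    # Learned implicit priors: regularize with the networks G_R and G_L.
    R = G_R(R)
    L = G_L(L)
    return R, L
```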

Figure 2 (a) The overall framework of URetinex-Net, (b) the details of each stage in URetinex-Net, (c) the specific network structure of the denoising network GR applied in each stage.

Initialization Module

  1. Loss function: the first term is a reconstruction loss, and the second term preserves the overall structure of I.
  2. Structure: the initialization module consists of three Conv+LeakyReLU layers followed by a Conv layer and a ReLU layer. All convolution kernels are 3×3.
  3. Effect: Figure 3 shows the statistical characteristics of a patch in a low-light image. The rigid initialization (b) clearly changes the intensity statistics of the three channels {R, G, B} of the original low-light image, while the initialization module in this paper (c) preserves them well.
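Following the structure described in item 2, a minimal PyTorch sketch of the initialization module could look as follows; the channel width and the 3+1 output split into R0 and L0 are assumptions, not the authors' exact configuration:

```python
import torch
import torch.nn as nn

class InitModule(nn.Module):
    """Sketch: three Conv+LeakyReLU layers, then Conv and ReLU; 3x3 kernels.
    Output is split into a 3-channel reflectance and 1-channel illumination."""
    def __init__(self, width=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(width, width, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(width, width, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(width, 4, 3, padding=1), nn.ReLU(),
        )

    def forward(self, img):
        out = self.body(img)
        R0, L0 = out[:, :3], out[:, 3:]   # reflectance / illumination
        return R0, L0
```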

Rigid initialization: the initial illumination L0 is obtained by taking the per-pixel maximum over the three color channels, and the initial reflectance R0 is then derived as R0 = I / L0.
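The rigid initialization is straightforward to write down; a small PyTorch sketch (the epsilon clamp is added here to avoid division by zero):

```python
import torch

def rigid_init(I, eps=1e-4):
    """Rigid initialization: L0 is the per-pixel channel maximum, R0 = I / L0."""
    L0 = I.max(dim=1, keepdim=True).values.clamp(min=eps)
    R0 = I / L0
    return R0, L0
```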

Unfolding Optimization Module

Purpose: Avoid explicit regularization design and adaptively restore illumination and reflectance in a deep learning manner.

Supervised: The reflectance of the normal-light image is used as a reference. The structure-aware smoothness constraint [34] is imposed on the illumination of the normal-light image, and then the loss function of the decomposition of the normal-light image is as follows:

Loss function: the unfolding optimization module is trained in an end-to-end manner, with the parameters and network architectures of G_R and G_L shared across stages. During optimization of the unfolded network, the normal-light reflectance generated by our initialization module is used as the reference. The loss function sums the reflectance and illumination terms: the MSE loss between P_k and R_k at each stage; the MSE, structural similarity (SSIM), and perceptual losses on the final recovered reflectance R_T; the MSE loss between Q_k and L_k at each stage; and the total-variation loss on L_k at each stage. The loss function of the unfolding optimization module is as follows:

Φ(·) denotes the high-level feature extractor of a VGG19 network pre-trained on ImageNet, and R_ref denotes the reflectance of the normal-light image.
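Part of the per-stage loss can be sketched as follows. This hedged example covers only the stage-wise MSE and total-variation terms; the SSIM and VGG19 perceptual terms on R_T, and the weights balancing the terms, are omitted:

```python
import torch
import torch.nn.functional as F

def tv_loss(L):
    # Total-variation smoothness penalty on the illumination map.
    dh = (L[:, :, 1:, :] - L[:, :, :-1, :]).abs().mean()
    dw = (L[:, :, :, 1:] - L[:, :, :, :-1]).abs().mean()
    return dh + dw

def stage_losses(P_list, R_list, Q_list, L_list):
    """Hypothetical aggregation of the per-stage terms described above:
    MSE between P_k and R_k, MSE between Q_k and L_k, and TV on L_k."""
    loss = 0.0
    for P, R, Q, L in zip(P_list, R_list, Q_list, L_list):
        loss = loss + F.mse_loss(P, R) + F.mse_loss(Q, L) + tv_loss(L)
    return loss
```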

Structure: instead of introducing hand-crafted priors and manually designing a specific loss function, a learning-based approach is developed to explore implicit priors from real-world data. In other words, two networks, denoted G_L and G_R, are introduced to update L and R, respectively.

G_L is a simple fully convolutional network with five Conv layers and ReLU activations that learns an implicit prior on L, so that the prior can be learned from the training data while avoiding the design of complex regularization terms.
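A minimal sketch of such a five-layer fully convolutional network for G_L, assuming a single-channel illumination input; the channel widths and kernel size are hypothetical:

```python
import torch.nn as nn

def make_G_L(width=32):
    """Sketch of G_L as described: five Conv layers with ReLU activations,
    mapping an illumination map to a refined illumination map."""
    return nn.Sequential(
        nn.Conv2d(1, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, 1, 3, padding=1),
    )
```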

Decomposition results at each stage of the unfolding optimization module:

Illumination Adjustment Module

Takes the low-light illumination L and the user-specified enhancement ratio w as input.

Structure: the illumination adjustment module consists of three Conv+LeakyReLU layers followed by a Conv layer and a ReLU layer. All convolution kernels are 5×5.
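A hedged sketch of the illumination adjustment module; feeding the ratio w as an extra constant channel alongside L is an assumption about how the conditioning is done, not the paper's stated mechanism:

```python
import torch
import torch.nn as nn

class AdjustModule(nn.Module):
    """Sketch: three Conv+LeakyReLU layers, then Conv and ReLU; 5x5 kernels.
    The ratio w is broadcast to a constant map and concatenated with L."""
    def __init__(self, width=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(2, width, 5, padding=2), nn.LeakyReLU(0.2),
            nn.Conv2d(width, width, 5, padding=2), nn.LeakyReLU(0.2),
            nn.Conv2d(width, width, 5, padding=2), nn.LeakyReLU(0.2),
            nn.Conv2d(width, 1, 5, padding=2), nn.ReLU(),
        )

    def forward(self, L, w):
        w_map = torch.full_like(L, float(w))          # constant ratio channel
        return self.body(torch.cat([L, w_map], dim=1))
```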

【Datasets, Comparison Methods, Evaluation Metrics】

【Experimental Results】

【Ablation Research】

To illustrate the effectiveness of the unfolding optimization module, we further simply stack the G_R and G_L networks T times and discard the unfolding optimization, keeping the same network capacity as URetinex-Net. Finally, the performance of URetinex-Net under different choices of the stage number T is investigated.

Table 2: IM, UOM, and IG are the abbreviations of the proposed initialization module, unfolding optimization module, and illumination adjustment module, respectively.

Figure 6: in the visual comparison, the enhancement result at T=1 still shows poor detail preservation and color distortion, while cleaner results are obtained with the unfolding optimization. Based on the trade-off between image quality and inference time, T=3 is chosen as the default setting.


Origin blog.csdn.net/qq_39751352/article/details/126663963