【Low Light Enhancement】Zero-DCE

1. Introduction

Paper: Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation
Code: https://github.com/Li-Chongyi/Zero-DCE_extension

Paper Contributions:

  • For the first time, a low-light enhancement network that requires no paired training data is proposed, avoiding the risk of over-fitting and generalizing well under different lighting conditions;
  • A pixel-wise high-order curve is designed that can efficiently perform brightness mapping over a wide dynamic range through multiple iterations;
  • The potential of training image enhancement networks with no-reference loss functions, in the absence of reference images, is demonstrated;
  • The proposed Zero-DCE network maintains its enhancement capability while reducing the computational load, providing multiple options for balancing enhancement quality against computational overhead.

2. Algorithm understanding

2.1 Low light enhancement curve

Inspired by the brightness adjustment curves in Photoshop, the author tries to design a curve that automatically maps a low-light image to an enhanced version, with the parameters of the adaptive curve depending only on the input image. Since the task amounts to curve fitting, it is natural to solve it with a neural network and its strong fitting ability. The author argues that such a curve should have the following properties:

  • Each pixel value of the enhanced image should fall in the normalized range [0, 1] to avoid information loss caused by overflow truncation;
  • The curve should be monotonic to preserve the differences (contrast) between adjacent pixels;
  • The form should be as simple as possible and differentiable for gradient backpropagation.

Based on the above three characteristics, the author designed the following quadratic curve:
$$LE(I(\mathbf{x}); \alpha) = I(\mathbf{x}) + \alpha I(\mathbf{x})\big(1 - I(\mathbf{x})\big) \tag{1}$$
where $\mathbf{x}$ denotes pixel coordinates and $\alpha \in [-1, 1]$ is a learnable parameter that controls both the curvature of the mapping and the exposure level. Each input pixel is normalized to [0, 1], and the curve is applied to all three RGB channels rather than only to the brightness channel, which preserves the inherent color and reduces the risk of over-saturation.
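To make the behavior of formula (1) concrete, here is a minimal NumPy sketch (function and variable names are illustrative, not from the official repository):

```python
import numpy as np

def quadratic_curve(image: np.ndarray, alpha: float) -> np.ndarray:
    """Apply LE(I; alpha) = I + alpha * I * (1 - I) to an image in [0, 1]."""
    return image + alpha * image * (1.0 - image)

# For alpha in [-1, 1] the derivative 1 + alpha * (1 - 2I) stays >= 0,
# so the mapping is monotonic; 0 and 1 are fixed points, so the output
# never leaves [0, 1].
low_light = np.random.rand(4, 4, 3)          # toy RGB image in [0, 1]
enhanced = quadratic_curve(low_light, 0.8)   # one global curve; alpha > 0 brightens
assert enhanced.min() >= 0.0 and enhanced.max() <= 1.0
```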

Iterating formula (1) multiple times yields the high-order curve in formula (2), which can handle more challenging low-light scenes; $n$ is the number of iterations, set to 8 in this paper.
$$LE_n(\mathbf{x}) = LE_{n-1}(\mathbf{x}) + \alpha_n LE_{n-1}(\mathbf{x})\big(1 - LE_{n-1}(\mathbf{x})\big) \tag{2}$$
If all pixels of an image share a single curve, the mapping is global and easily causes local over- or under-exposure. Giving each pixel of the input image its own high-order curve solves this problem, and the change is simple: as shown in formula (3), the scalar coefficient $\alpha$ is replaced by a coefficient map $\mathcal{A}$ of the same size as the image, so every pixel gets its own coefficient.
$$LE_n(\mathbf{x}) = LE_{n-1}(\mathbf{x}) + \mathcal{A}_n(\mathbf{x}) LE_{n-1}(\mathbf{x})\big(1 - LE_{n-1}(\mathbf{x})\big) \tag{3}$$
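The pixel-wise iteration of formulas (2) and (3) is equally short in code; a hedged NumPy sketch (names are illustrative), with random maps standing in for the network output:

```python
import numpy as np

def enhance(image: np.ndarray, curve_maps: list) -> np.ndarray:
    """Iteratively apply LE_n = LE_{n-1} + A_n * LE_{n-1} * (1 - LE_{n-1})."""
    out = image
    for A_n in curve_maps:            # n = 8 iterations in the paper
        out = out + A_n * out * (1.0 - out)
    return out

image = np.random.rand(256, 256, 3)   # low-light input in [0, 1]
# In Zero-DCE these 8 per-pixel, per-channel maps come from DCE-Net;
# uniform random values in [-1, 1] stand in for them here.
maps = [np.random.uniform(-1.0, 1.0, image.shape) for _ in range(8)]
result = enhance(image, maps)
```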

2.2 Overall framework

The overall framework is relatively simple: an RGB image is fed into DCE-Net, which outputs $3 \times n = 3 \times 8 = 24$ coefficient maps (3 per iteration, one for each RGB channel); each coefficient map is then substituted into formula (3) to iteratively compute the final enhanced image.
[Figure: overall framework of Zero-DCE]
Below is an example of the coefficient maps $\mathcal{A}$; it can be seen that, for every RGB channel, values are relatively small in bright areas and large in dark areas.
[Figure: example coefficient maps $\mathcal{A}$ for the three RGB channels]

2.3 Network structure

The network structure is also very simple: only 7 convolutional layers in total, with skip connections. Note that the last convolution is followed by a Tanh activation to ensure the output coefficients fall in the [-1, 1] range. Because there are 8 iterations and separate curves are applied to the three RGB channels, the number of output channels is 24.
[Figure: DCE-Net structure]
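A PyTorch sketch of this 7-layer network follows; the 32 feature channels and the symmetric skip concatenations match the paper's description, but treat this as an illustrative re-implementation rather than the official code:

```python
import torch
import torch.nn as nn

class DCENet(nn.Module):
    def __init__(self, channels: int = 32, iterations: int = 8):
        super().__init__()
        self.relu = nn.ReLU(inplace=True)
        self.conv1 = nn.Conv2d(3, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv4 = nn.Conv2d(channels, channels, 3, padding=1)
        # Skip connections concatenate feature maps, doubling the input channels.
        self.conv5 = nn.Conv2d(channels * 2, channels, 3, padding=1)
        self.conv6 = nn.Conv2d(channels * 2, channels, 3, padding=1)
        # Final layer outputs 3 * n curve maps, squashed to [-1, 1] by Tanh.
        self.conv7 = nn.Conv2d(channels * 2, 3 * iterations, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f1 = self.relu(self.conv1(x))
        f2 = self.relu(self.conv2(f1))
        f3 = self.relu(self.conv3(f2))
        f4 = self.relu(self.conv4(f3))
        f5 = self.relu(self.conv5(torch.cat([f3, f4], dim=1)))
        f6 = self.relu(self.conv6(torch.cat([f2, f5], dim=1)))
        return torch.tanh(self.conv7(torch.cat([f1, f6], dim=1)))

curve_maps = DCENet()(torch.rand(1, 3, 256, 256))  # -> (1, 24, 256, 256)
```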

2.4 Loss function

Since there is no reference image, the author designs a total loss consisting of four no-reference loss functions covering four perspectives: spatial consistency, exposure control, color constancy, and illumination smoothness.
$$L_{total} = L_{spa} + L_{exp} + W_{col} L_{col} + W_{tv_{\mathcal{A}}} L_{tv_{\mathcal{A}}}$$

2.4.1 Spatial consistency

The differences between adjacent regions of the input image should be preserved in the enhanced image: the loss keeps the region-to-region contrast of the enhanced image as close as possible to that of the original, ensuring the two are spatially consistent.
$$L_{spa} = \frac{1}{K} \sum_{i=1}^{K} \sum_{j \in \Omega(i)} \big( |Y_i - Y_j| - |I_i - I_j| \big)^2$$
where $K$ is the number of local regions, $\Omega(i)$ denotes the four neighbors (up, down, left, right) of region $i$, and $Y$ and $I$ are the average intensities of regions in the enhanced and input images, respectively.
The figure below illustrates the spatial consistency loss clearly. The original image and the enhanced image are each divided into local regions of size 4×4; for each pair of neighboring regions, $I_i - I_j$ and $Y_i - Y_j$ are computed, and the loss penalizes the difference between their absolute values. (The author's open-source code is also written very clearly, using average pooling and convolution.)

[Figure: illustration of the spatial consistency loss on 4×4 regions]
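Following the average-pooling idea mentioned above, here is a simplified PyTorch sketch of this loss (not the author's exact code; in particular, `torch.roll` wraps around at the borders, whereas the official implementation uses convolutions with zero padding for the neighbor differences):

```python
import torch
import torch.nn.functional as F

def spatial_consistency_loss(enhanced: torch.Tensor,
                             original: torch.Tensor) -> torch.Tensor:
    # Region intensity: average over RGB, then 4x4 average pooling.
    y = F.avg_pool2d(enhanced.mean(dim=1, keepdim=True), 4)
    i = F.avg_pool2d(original.mean(dim=1, keepdim=True), 4)
    loss = 0.0
    # Differences to the four neighboring regions (up, down, left, right).
    for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        y_diff = y - torch.roll(y, shifts=(dy, dx), dims=(2, 3))
        i_diff = i - torch.roll(i, shifts=(dy, dx), dims=(2, 3))
        loss = loss + (y_diff.abs() - i_diff.abs()).pow(2).mean()
    return loss
```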

2.4.2 Exposure Control

To avoid over- or under-exposure in local areas, the enhanced image is divided into $M$ local regions of size 16×16, and the average brightness $Y_k$ of each region is constrained to stay near a target exposure level $E$; the paper sets $E = 0.6$.
$$L_{exp} = \frac{1}{M} \sum_{k=1}^{M} |Y_k - E|$$
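In code this reduces to a single pooling step; a minimal PyTorch sketch (illustrative names):

```python
import torch
import torch.nn.functional as F

def exposure_loss(enhanced: torch.Tensor, E: float = 0.6) -> torch.Tensor:
    gray = enhanced.mean(dim=1, keepdim=True)   # per-pixel brightness
    regions = F.avg_pool2d(gray, 16)            # mean brightness of each 16x16 region
    return (regions - E).abs().mean()           # average distance to target exposure E
```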

2.4.3 Color constancy

According to the gray-world hypothesis, for an image with abundant color variation the averages of the R, G, and B components tend toward the same gray value. The mean values of the three channels of the enhanced image should therefore be close to each other. Based on this assumption, the author sets a color constancy loss to keep the colors of the enhanced image natural. Its form is simple: the squared difference of channel means is computed for each pair of channels, then summed.
$$L_{col} = \sum_{\forall (p, q) \in \varepsilon} (J^p - J^q)^2, \quad \varepsilon = \{(R, G), (R, B), (G, B)\}$$
where $J^p$ is the mean value of channel $p$ in the enhanced image.
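A direct PyTorch sketch of the pairwise form above (illustrative, batch-averaged):

```python
import torch

def color_constancy_loss(enhanced: torch.Tensor) -> torch.Tensor:
    means = enhanced.mean(dim=(2, 3))              # (batch, 3) channel means J^R, J^G, J^B
    r, g, b = means[:, 0], means[:, 1], means[:, 2]
    return ((r - g) ** 2 + (r - b) ** 2 + (g - b) ** 2).mean()
```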

2.4.4 Illumination smoothness

To maintain the monotonic relation between adjacent pixels, an illumination smoothness loss, essentially the common total variation loss, is added on each curve parameter map $\mathcal{A}_n$; it promotes the spatial continuity of the enhanced image by penalizing horizontal and vertical gradients.
$$L_{tv_{\mathcal{A}}} = \frac{1}{N} \sum_{n=1}^{N} \sum_{c \in \xi} \big( |\nabla_x \mathcal{A}_n^c| + |\nabla_y \mathcal{A}_n^c| \big)^2, \quad \xi = \{R, G, B\}$$
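A PyTorch sketch of this term on the stacked curve maps (shape batch × 24 × H × W for Zero-DCE); note that this simplified version penalizes the mean squared gradients rather than the exact $(|\nabla_x| + |\nabla_y|)^2$ form of the formula:

```python
import torch

def illumination_smoothness_loss(A: torch.Tensor) -> torch.Tensor:
    dh = A[:, :, 1:, :] - A[:, :, :-1, :]   # vertical gradients
    dw = A[:, :, :, 1:] - A[:, :, :, :-1]   # horizontal gradients
    return dh.pow(2).mean() + dw.pow(2).mean()
```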

2.5 Zero-DCE++

Although Zero-DCE is already faster than a variety of low-light enhancement algorithms, the author proposes an even lighter version, Zero-DCE++, by adjusting the network structure; it has only about 10k parameters, and for an image of size 1200×900×3 its real-time inference reaches about 1000 FPS on a single GPU and 11 FPS on a CPU. The improvements are as follows:

  • Replace ordinary convolution with depthwise separable convolution;
  • The author finds that the curve parameter maps $\mathcal{A}_n$ of the different iteration stages are similar in most cases, so the number of parameter maps is reduced from 24 to 3 and the same 3-channel map is reused in every iteration ($\mathcal{A}_n$ becomes a single shared $\mathcal{A}$), which handles most scenes;
    [Figure: similarity of the curve parameter maps across iterations]
  • The proposed method is not sensitive to the input image size, so a downsampled image can be used as the network input, and the output curve parameter map is then upsampled back to the original resolution for enhancement (see the sketch after this list).
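The three changes fit in a short PyTorch sketch; `net` is assumed to be a curve-estimation network that outputs a single 3-channel map (e.g. a DCE-Net variant built from the separable convolution below), so everything here is illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def separable_conv(c_in: int, c_out: int) -> nn.Sequential:
    """Depthwise separable replacement for a plain 3x3 convolution."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in),  # depthwise 3x3
        nn.Conv2d(c_in, c_out, 1),                         # pointwise 1x1
    )

def enhance_pp(image: torch.Tensor, net: nn.Module,
               scale: float = 0.5, iterations: int = 8) -> torch.Tensor:
    # 1) Predict curve parameters on a downsampled copy of the input.
    small = F.interpolate(image, scale_factor=scale, mode='bilinear',
                          align_corners=False)
    A = net(small)                                   # (batch, 3, h, w)
    # 2) Upsample the single 3-channel map back to full resolution.
    A = F.interpolate(A, size=image.shape[2:], mode='bilinear',
                      align_corners=False)
    # 3) Reuse the same map A in every iteration (A_n -> A).
    out = image
    for _ in range(iterations):
        out = out + A * out * (1.0 - out)
    return out
```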

3. Effect test
