Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement
Tsinghua University, the University of Würzburg, and ETH Zurich published this transformer for low-light image enhancement at ICCV 2023, with the code open-sourced.
The paper argues that the classic Retinex model $I = R \odot L$ assumes a clean reflectance $R$ and a clean illumination $L$, but in practice both are corrupted by noise. A perturbation term is therefore added to each, changing the formula to:

$$I = (R + \hat{R}) \odot (L + \hat{L})$$
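The perturbed Retinex model can be illustrated with a tiny numpy sketch; the 4×4 arrays and perturbation magnitudes below are made up for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 4, 4

# Hypothetical clean reflectance R and dark illumination L
R = rng.uniform(0.2, 0.8, (H, W))
L = rng.uniform(0.1, 0.5, (H, W))
# Small perturbations: R_hat models noise/artifacts, L_hat models exposure error
R_hat = 0.05 * rng.standard_normal((H, W))
L_hat = 0.05 * rng.standard_normal((H, W))

# Perturbed Retinex model: I = (R + R_hat) ⊙ (L + L_hat)
I = (R + R_hat) * (L + L_hat)

# Without perturbations this reduces to the classic I = R ⊙ L
I_clean = R * L
print(np.allclose(I, I_clean))  # False: the perturbations change the capture
```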
The paper first predicts a light-up map $\overline{L}$, then applies the Retinex paradigm $I_{lu} = I \odot \overline{L}$ to obtain a preliminary enhancement. Substituting the perturbed model above and expanding gives:

$$I \odot \overline{L} = R \odot (L \odot \overline{L}) + R \odot (\hat{L} \odot \overline{L}) + \hat{R} \odot \big((L + \hat{L}) \odot \overline{L}\big)$$
Because $L \odot \overline{L} = \mathbf{1}$ is assumed, the first term is exactly the clean reflectance $R$, which is the enhanced result we want. The second term is the interference introduced by $\hat{L}$, i.e., over- or under-exposure; the third term is the interference introduced by $\hat{R}$, i.e., noise and artifacts. The second and third terms are collectively called the corruption $C$, giving:

$$I_{lu} = I \odot \overline{L} = R + C$$
Since $I_{lu}$ still contains the corruption $C$, it is not the final enhancement we want: the network first estimates $I_{lu}$, and then removes $C$ to obtain the final enhanced result.
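As a sanity check on this decomposition, a tiny numpy sketch (made-up 4×4 arrays, and the idealized exact inverse $\overline{L} = 1/L$) confirms that $I \odot \overline{L} = R + C$ holds term by term:

```python
import numpy as np

rng = np.random.default_rng(1)
H, W = 4, 4
R = rng.uniform(0.2, 0.8, (H, W))           # clean reflectance
L = rng.uniform(0.1, 0.5, (H, W))           # clean illumination
R_hat = 0.05 * rng.standard_normal((H, W))  # noise/artifact perturbation
L_hat = 0.05 * rng.standard_normal((H, W))  # exposure perturbation

I = (R + R_hat) * (L + L_hat)  # perturbed capture
L_bar = 1.0 / L                # idealized light-up map, so L ⊙ L_bar = 1

I_lu = I * L_bar               # preliminary light-up result
# Corruption C = R ⊙ (L_hat ⊙ L_bar) + R_hat ⊙ ((L + L_hat) ⊙ L_bar)
C = R * (L_hat * L_bar) + R_hat * ((L + L_hat) * L_bar)

print(np.allclose(I_lu, R + C))  # True: I_lu = R + C exactly
```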
The network structure is shown in the figure below, where $L_p$ is the channel-wise mean of the input image. The way the modules are unfolded in the figure looks a bit odd; in essence, the illumination estimator takes the concatenation of the brightness prior $L_p$ and the original image and extracts the light-up map $\overline{L}$ and a light-up feature $F_{lu}$. $F_{lu}$ then rescales the value $V$ of the transformer in the subsequent restoration stage, inside the illumination-guided attention block. This restoration stage refines the preliminary enhancement, suppresses overexposed regions, and removes noise.
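The value-rescaling idea can be sketched roughly as follows. This is a single-head toy numpy version with random projection matrices standing in for learned weights, not the paper's actual IG-MSA (which is multi-head and differs in detail); the point is only where $F_{lu}$ enters:

```python
import numpy as np

def ig_attention(x, f_lu):
    """Toy illumination-guided attention: the light-up feature f_lu
    element-wise rescales the value V before self-attention.
    x, f_lu: (N, C) arrays with N = H*W flattened pixels."""
    n, c = x.shape
    rng = np.random.default_rng(42)
    # Random stand-ins for the learned Q/K/V projections
    Wq, Wk, Wv = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    v = v * f_lu                              # illumination guidance on V
    attn = q @ k.T / np.sqrt(c)               # scaled dot-product scores
    attn = np.exp(attn - attn.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)  # softmax over keys
    return attn @ v

x = np.random.default_rng(0).standard_normal((16, 8))  # 4x4 image, 8 channels
f_lu = np.ones((16, 8))                                # neutral guidance
out = ig_attention(x, f_lu)
print(out.shape)  # (16, 8)
```

With `f_lu` all ones the block degenerates to plain self-attention; a learned, spatially varying `f_lu` lets well-lit regions guide the restoration of dark ones.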
The experimental results are shown in the figure below. Only PSNR and SSIM are reported, and LLFlow is absent from the comparison; only with that omission can a PSNR of merely 22 be called SOTA.
The paper also compares enhanced results on ExDark and reports user studies across multiple datasets.
Personally, I see no real highlights in this work: it is just a network architecture, the idea is not particularly striking, and the results are not particularly strong. It also does not report metrics such as LPIPS, NIQE, or LOE.