COMO-ViT paper reading notes

Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network


  • This is a low-light image enhancement paper from ICCV 2023, a collaboration between Meituan, Megvii, Shenzhen Science and Technology Research Institute, Huawei’s Noah’s Ark Laboratory, and the University of Electronic Science and Technology of China. However, the code has not been open-sourced.

  • One contribution of the paper is an illumination-adaptive gamma correction module that combines a Global Gamma Correction Module and a Local Gamma Correction Module; the other is the COMO-ViT network structure. The overall process is shown in the figure below.
    [Figure: overall pipeline]

  • The process is divided into three stages. In the first stage, the dark image I is passed through convolution, pooling, and fully connected layers followed by a sigmoid, which generates the parameter for global gamma correction; the correction is then applied to the image. Here the gamma correction is expanded as a Taylor series, with the stated aim of speeding up the operation.
    [Two images from the original post are not reproduced here]
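
To make the Taylor expansion mentioned above concrete: gamma correction x^γ can be rewritten as exp(γ·ln x), and the exponential can be expanded as the series Σ_k (γ·ln x)^k / k!. Below is a minimal Python sketch that compares the exact power with a truncated series. The function names and the truncation order of 3 are my own assumptions for illustration; these notes do not record how many terms the paper keeps.

```python
import math


def gamma_exact(pixel: float, gamma: float) -> float:
    """Reference gamma correction: pixel ** gamma, for pixel in (0, 1]."""
    return pixel ** gamma


def gamma_taylor(pixel: float, gamma: float, order: int = 3) -> float:
    """Truncated series for exp(gamma * ln(pixel)), keeping terms up to `order`.

    The default order is an assumption for illustration only.
    """
    z = gamma * math.log(pixel)
    term = 1.0   # k = 0 term of the series
    total = 1.0
    for k in range(1, order + 1):
        term *= z / k   # builds z**k / k! incrementally
        total += term
    return total


# Compare the two on a few normalized pixel values.
for pixel in (0.05, 0.2, 0.5):
    exact = gamma_exact(pixel, 0.5)
    approx = gamma_taylor(pixel, 0.5, order=3)
    print(f"pixel={pixel:.2f} exact={exact:.4f} taylor={approx:.4f}")
```

The truncation error grows with |γ·ln x|, so very dark pixels (x close to 0) need more terms to reach the same accuracy.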

  • In the second stage, the gamma-corrected image and the original image are sent to the second-stage network to extract features, using a spatial attention mechanism. The two sets of features are then added together, and the fused features are sent to the third-stage network.
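
As a rough picture of "spatial attention, then add", here is a small pure-Python sketch. Everything in it is an assumption for illustration: the names `spatial_attention` and `fuse` are hypothetical, the attention weight is computed from the channel mean instead of learned convolutions, and applying attention to each branch before the addition is only one possible reading of the description above, not the paper's confirmed design.

```python
import math

# A feature map is a list of channels; each channel is an H x W grid of floats.
FeatureMap = list[list[list[float]]]


def spatial_attention(features: FeatureMap) -> FeatureMap:
    """Scale every pixel by a weight in (0, 1) derived from its channel mean.

    Assumption: a real module would learn this weight with convolutions;
    the channel mean is a stand-in so the sketch runs without training.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    weights = [
        [
            1.0 / (1.0 + math.exp(-sum(features[c][y][x] for c in range(channels)) / channels))
            for x in range(width)
        ]
        for y in range(height)
    ]
    return [
        [[features[c][y][x] * weights[y][x] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


def fuse(feat_a: FeatureMap, feat_b: FeatureMap) -> FeatureMap:
    """Apply spatial attention to each branch, then add them element-wise."""
    att_a, att_b = spatial_attention(feat_a), spatial_attention(feat_b)
    return [
        [[pa + pb for pa, pb in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(att_a, att_b)
    ]
```

The point is only that each pixel location gets its own weight, shared across channels, before the two branches are summed.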

  • The third stage has two branches: a transformer branch performs self-attention inside non-overlapping windows, and a CNN branch supplements the transformer branch, because there is no interaction between windows. The CNN features and the transformer features are added together and then sent to a global transformer that performs attention between windows; its output is the feature passed to the next layer. After several such blocks, a convolution followed by a sigmoid produces the local gamma correction parameters, and local gamma correction is applied.
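
Two pieces of this stage are easy to sketch without a network: splitting a map into non-overlapping windows, and applying a per-pixel (local) gamma. The helper names below are hypothetical, and in the paper the exponent map comes from the convolution + sigmoid head; here it is simply passed in as an argument.

```python
def partition_windows(grid, window):
    """Split an H x W grid into non-overlapping window x window tiles (row-major)."""
    height, width = len(grid), len(grid[0])
    if height % window or width % window:
        raise ValueError("grid size must be a multiple of the window size")
    return [
        [row[left:left + window] for row in grid[top:top + window]]
        for top in range(0, height, window)
        for left in range(0, width, window)
    ]


def local_gamma_correction(image, gamma_map):
    """Raise each pixel to its own exponent: out[y][x] = image[y][x] ** gamma_map[y][x]."""
    return [
        [pixel ** gamma for pixel, gamma in zip(img_row, gamma_row)]
        for img_row, gamma_row in zip(image, gamma_map)
    ]


# A 4 x 4 grid split into four 2 x 2 windows.
grid = [[4 * y + x for x in range(4)] for y in range(4)]
tiles = partition_windows(grid, 2)
print(len(tiles), tiles[0])

# Each pixel gets its own exponent, unlike the single global gamma of stage one.
print(local_gamma_correction([[0.25, 0.25]], [[0.5, 1.0]]))
```

Self-attention inside each tile never sees the other tiles, which is why something (here the CNN branch and the global transformer) has to carry information across window borders.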

  • The loss function is as follows:
    [Image: loss function]

  • Finally, a PSNR of 22.2 is achieved on LOLv2-Real (which can't beat LLFlow, hahaha):
    [Image: results comparison]
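
For reference, PSNR is 10·log10(peak² / MSE), so a higher value means a smaller mean squared error against the ground truth. A minimal sketch for single-channel images stored as nested lists follows; the function name and the peak value of 1.0 are assumptions, and the paper's exact evaluation protocol (color space, value range) is not recorded in these notes.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized H x W grids."""
    squared_errors = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf   # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Every pixel is off by 0.1 on a [0, 1] scale, so the MSE is about 0.01.
print(psnr([[0.0, 1.0]], [[0.1, 0.9]]))
```

Because the scale is logarithmic, a gap of about 3 dB corresponds to roughly a factor of two in MSE.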

Summary

  • Personally, I feel that the Taylor expansion should not give any speedup. Not only is there no open-source code, there is also no ablation experiment for it. It is also unusual to report only PSNR and SSIM: LPIPS, LOE, FID, and NIQE are not provided, and even on PSNR the method can't beat LLFlow (PSNR of 25.42). Getting this into ICCV feels a bit awkward to me...

Origin blog.csdn.net/weixin_44326452/article/details/132816872