Introductory tutorial on fusion of infrared and visible light images

The blogger is currently entering the second stage of research, which can be regarded as having gotten started in the field of infrared and visible light image fusion. This post summarizes that process as a reference for newcomers to the field.

This post is a general introduction to the papers on infrared and visible light image fusion that the blogger has studied. For detailed interpretations of individual papers, you are welcome to visit the infrared and visible light image fusion column. For questions about this field, feel free to send a private message, leave a comment, or contact me via the official account.

If any content here infringes on your rights, please contact the blogger.

What is infrared and visible light image fusion

As a newcomer, you may wonder: why do we need to fuse infrared and visible light images at all?

Is it simply a matter of taking an infrared image and a visible light image and merging them without thinking?

Obviously not. Let's look at a sample image.
[Image]
First of all, we can see an obvious target in the infrared image. So what is target information?
[Image]

This picture highlights a person. We see a glowing figure running, but what scene is he running through? In the infrared image we cannot see the texture of the background. So what is texture information? Let's look at that next.

Now look at the visible light image: you cannot see the running person at all. You might even suspect it was taken at a different time than the infrared image. This is precisely the point of fusing infrared and visible light images.

Look at the picture below. What do you notice? The leaves can be seen much more clearly in the visible light image than in the infrared image. This kind of content is called texture information.
[Image]

With these concepts in place, we can now ask: what exactly is infrared and visible light image fusion?

Generally speaking, it means integrating the target information and texture information described above into a single fused image. We want the fused image to retain the target intensity information of the infrared image while also preserving the texture information of the visible light image, as shown below.
[Image]
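To make the goal concrete, here is a deliberately naive sketch (not any paper's method): a pixel-wise weighted average of two registered grayscale images, with a hypothetical blending weight `alpha`.

```python
import numpy as np

def weighted_average_fusion(ir, vis, alpha=0.5):
    """Naive pixel-wise fusion of two registered grayscale images.

    ir, vis : float arrays in [0, 1] with the same shape
    alpha   : weight given to the infrared image (hypothetical knob)
    """
    assert ir.shape == vis.shape, "images must be registered and equally sized"
    return np.clip(alpha * ir + (1.0 - alpha) * vis, 0.0, 1.0)

# Toy example: a bright IR "target" on a dark background, fused with
# a visible image carrying a fine "texture" stripe pattern.
ir = np.zeros((4, 4))
ir[1:3, 1:3] = 1.0                  # hot target
vis = np.tile([0.2, 0.6], (4, 2))   # texture stripes
fused = weighted_average_fusion(ir, vis)
```

A fixed global weight inevitably sacrifices one modality wherever the two disagree, which is exactly why the papers below learn the fusion rule instead.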

At this point, you have a preliminary understanding of image fusion. The description above reflects how most people understand the task when they first start working on infrared and visible light image fusion. After studying the field for a while and reading more papers, you will find that texture information in infrared images is also indispensable, and that there are very bright regions in visible light images that we want to retain as well.

Let’s take a look at the papers next.

Papers

The papers below are listed in the order I think is reasonable for reading; take it as a reference. The introduction of each paper begins with a link to the original text and the blogger's own interpretation.
[Image]

DeepFuse

DeepFuse paper link
DeepFuse paper interpretation
I still prefer to put this paper first. It is a very classic article, and you can start reading from here.

[Image]

DenseFuse

DenseFuse paper link
DenseFuse paper interpretation
After reading DeepFuse, you can move on to DenseFuse. Its network structure is very similar to DeepFuse's; the innovation is that DenseFuse integrates DenseNet-style dense connections into the encoder, which greatly reduces information loss during encoding.

The network structure in the figure below is an autoencoder. An autoencoder here consists of an encoder (Encoder), a decoder (Decoder), and an intermediate fusion layer (Fusion Layer). The advantage of this structure is that the encoder and decoder can be trained separately; once both are trained, an appropriate fusion layer can be chosen.

What is the point of training them separately?

The infrared/visible datasets available in the early days were very small, which led to overfitting. So how can we improve the network's generalization ability? We can first train the encoder and decoder on a large dataset, so that they acquire strong feature-extraction and image-reconstruction capabilities, and then insert our fusion layer between them; fusion performance improves considerably.
[Image]
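The encode → fuse → decode pipeline can be sketched as follows. The `encode`/`decode` functions here are trivial identity stand-ins for the trained convolutional networks, and the addition rule is one simple hand-designed fusion strategy of the kind DenseFuse experiments with.

```python
import numpy as np

def encode(img):
    # Stand-in for a trained convolutional encoder: identity "features".
    return img.astype(np.float64)

def decode(features):
    # Stand-in for a trained decoder that reconstructs an image
    # from feature maps.
    return np.clip(features, 0.0, 1.0)

def fuse_addition(f_ir, f_vis):
    # A simple addition strategy: sum the two feature maps.
    return f_ir + f_vis

def autoencoder_fusion(ir, vis):
    # Encoder and decoder are trained (separately, on a large dataset)
    # for reconstruction; at test time a fusion layer is slotted
    # in between them.
    return decode(fuse_addition(encode(ir), encode(vis)))

ir = np.array([[0.0, 0.9], [0.0, 0.0]])
vis = np.array([[0.3, 0.1], [0.3, 0.3]])
fused = autoencoder_fusion(ir, vis)
```

With real networks, only `fuse_addition` would be swapped out when trying different fusion strategies; the trained encoder and decoder stay fixed.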

RFN-Nest

RFN-Nest paper link
RFN-Nest paper interpretation
At this point the jump may feel a bit large, because this paper came out quite a while after the previous two, but you can definitely understand it. If you have questions, you can read NestFuse first; the blogger went straight to this one, and it was still fine.

Looking at the network structure, it may seem to have nothing to do with the previous papers, but it is in fact still an autoencoder. The difference is that the encoder and fusion layer use a multi-scale structure, and the fusion layer is no longer hand-designed but is learned by a neural network. For details, please refer to the original paper and the interpretation.
[Image]

FusionGAN

FusionGAN paper link
FusionGAN paper interpretation
After reading so many autoencoder-based papers, are you a little tired? Let's switch gears and look at a new idea. Here we have to mention Jiayi Ma, whose group introduced GANs to infrared and visible light image fusion for the first time. All I can say is: impressive.
[Image]
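The key intuition of FusionGAN's generator shows up in its content loss, sketched below under simplifying assumptions (`np.gradient` stands in for the gradient operator used in the paper, and the weight `xi` is arbitrary): stay close to the infrared intensities while matching the visible image's gradients.

```python
import numpy as np

def fusiongan_content_loss(fused, ir, vis, xi=1.0):
    """Sketch of the content term in FusionGAN's generator loss:
    keep the fused image close to the infrared intensities while
    matching the visible image's gradients (texture)."""
    h, w = fused.shape
    intensity = np.sum((fused - ir) ** 2)       # follow IR pixel values
    fy, fx = np.gradient(fused)
    vy, vx = np.gradient(vis)
    texture = np.sum((fy - vy) ** 2 + (fx - vx) ** 2)  # follow VIS edges
    return (intensity + xi * texture) / (h * w)

flat = np.full((4, 4), 0.5)      # featureless scene: zero loss
bumped = flat.copy()
bumped[0, 0] = 0.6               # deviating from IR raises the loss
```

On top of this content term, the generator also carries an adversarial term against the discriminator; the content loss alone is what anchors the fused image to the two inputs.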

DDcGAN

DDcGAN paper link
DDcGAN paper interpretation

FusionGAN's single discriminator leads to an imbalance between the infrared and visible information in the fused image, so the authors developed a dual discriminator, aiming to make the information in the fused image more balanced.
[Image]
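The balancing effect of two discriminators can be illustrated with a minimal sketch of the generator's adversarial term (the function name and scores are hypothetical; the real discriminators are trained networks operating on images).

```python
import math

def generator_adv_loss(d_ir_score, d_vis_score):
    """Adversarial part of a dual-discriminator generator loss (sketch).
    The generator wants BOTH discriminators to score the fused image
    as real, which pushes the fusion toward a balance of the two
    modalities. Scores are assumed to lie in (0, 1]."""
    return -math.log(d_ir_score) - math.log(d_vis_score)

# A fused image that satisfies only one discriminator is penalized
# more than one that partially satisfies both:
balanced = generator_adv_loss(0.5, 0.5)
lopsided = generator_adv_loss(0.9, 0.1)
```

With a single discriminator there is only one such term, so nothing stops the generator from drifting entirely toward the modality that discriminator watches.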

AttentionFGAN

AttentionFGAN paper link
AttentionFGAN paper interpretation

Continuing the dual-discriminator line of work, this paper introduces an attention mechanism: by comparing features in the regions we want to attend to, the discriminators judge whether the fused image contains the visible image's texture information and the infrared image's target information.
[Image]
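As intuition only (this is not AttentionFGAN's learned attention, which lives inside the networks), a hand-crafted saliency mask shows how per-pixel attention weights can route target regions from the infrared image while keeping texture from the visible image.

```python
import numpy as np

def saliency_mask(ir):
    # Hypothetical attention map: hotter infrared pixels get more weight.
    rng = ir.max() - ir.min()
    return (ir - ir.min()) / (rng + 1e-8)

def attention_fusion(ir, vis):
    # Weight each pixel by IR saliency: targets come from IR, the rest
    # of the scene keeps the visible image's texture.
    m = saliency_mask(ir)
    return m * ir + (1.0 - m) * vis
```

A learned attention module replaces this fixed heuristic with features that decide, per region, which modality matters.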

GANMcC

GANMcC paper link
GANMcC paper interpretation
A dual discriminator requires a well-designed alternating training strategy; if the strategy is set up incorrectly, it also leads to information imbalance, and designing a reasonable strategy is extremely challenging. The authors therefore developed a multi-class discriminator, achieving information balance on the basis of a single discriminator.
[Image]

Summary

Time is limited, so I will introduce only these articles for now. For more paper interpretations, please follow the image fusion column.
>> Image Fusion Column <<

If you have any questions in the field of image fusion, feel free to send a private message or contact me through the official account.

[Image]

Origin blog.csdn.net/qq_43627076/article/details/132516311