A First Look at Generative Adversarial Networks (GANs)

The Generative Adversarial Network (GAN) is a hot research direction today; it was proposed by Goodfellow in 2014.
To learn about GANs, you should first understand what a GAN does and why it is worth learning.
The original motivation of the GAN is to generate data that does not exist in the real world, which is similar to giving AI creativity or imagination. Typical application scenarios include:
1. Creative AI: AI writers, AI composers that can generate tracks in a similar style, AI painters, and other tasks that require creativity.
2. Restoring blurred images (removing rain, fog, camera shake, mosaics, etc.), which requires the AI to have a kind of imagination and to fill in missing details.
3. Data augmentation: generating new data from existing data to feed the model, which can alleviate overfitting.

Introduction to the principle of GAN

This section introduces the original GAN algorithm. Although it has some shortcomings, it opened up a new, adversarial way of generating data.
First, get to know the two key roles in a GAN: G and D.
G is the generator: it is responsible for fabricating data out of thin air.
D is the discriminator: it is responsible for judging whether data is real or fake.
The whole setup can be viewed simply as a game between two networks. In the original GAN paper, G and D are both multi-layer perceptron networks. Note first that the data a GAN works on does not have to be image data, but images make the explanation easier, so image data is used as the example below. To explain the figure above briefly: z is random noise (some randomly generated data), and it is the source from which the GAN generates images. D is trained as a binary classifier on real and fake images (real vs. fake acts as a natural label). G fabricates a "fake image" from a string of random numbers and uses these fake images to try to deceive D. D is responsible for distinguishing real images from fake ones and gives each image a score. For example, if G generates an image and D gives it a high score, that shows G is doing well; if D can still reliably distinguish real from fake, then G is not doing well yet and its parameters need further adjustment. A GAN is exactly this kind of game process.
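To make the two roles concrete, here is a minimal sketch of G and D as small multi-layer perceptrons in PyTorch, matching the description above only loosely. The noise length, layer widths, and 28x28 image size are illustrative assumptions, not values taken from the text.

```python
import torch
import torch.nn as nn

NOISE_DIM = 100      # length of the random vector z (assumed)
IMG_DIM = 28 * 28    # flattened image size (assumed, MNIST-like)

# G: maps random noise z to a fake image
class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, IMG_DIM),
            nn.Tanh(),            # pixel values in [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

# D: maps an image to a single "real vs. fake" score in (0, 1)
class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),         # score close to 1 means "looks real"
        )

    def forward(self, x):
        return self.net(x)

G, D = Generator(), Discriminator()
z = torch.randn(16, NOISE_DIM)   # a batch of random noise
fake_images = G(z)               # G fabricates images from noise
scores = D(fake_images)          # D scores them between 0 and 1
```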
So how is a GAN trained?
The GAN training algorithm is illustrated in the picture below.
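For reference, the objective that this training algorithm optimizes is the minimax value function from the original Goodfellow et al. (2014) paper; D tries to maximize it while G tries to minimize it:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$

Intuitively, D wants D(x) to be close to 1 on real images and D(G(z)) close to 0 on fake ones, while G wants D(G(z)) to be close to 1.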


Within the same round of backpropagation, GAN training can be divided into two steps: first train D, then train G. Note that this does not mean D is trained to completion before the training of G starts, because training D also needs the output of G from the previous round of backpropagation as its input.
When training D, the images generated by G in the previous round and the real images are stitched together directly as x, and the labels y are laid out accordingly: fake images correspond to 0 and real images correspond to 1. Feeding x into D then produces a score (between 0 and 1), and the loss function built from the score and y lets the gradient be backpropagated.
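Continuing the same sketch, a possible D training step under these assumptions looks like this (the optimizer settings are also assumed; x_real would come from a real-image data loader):

```python
import torch.optim as optim

bce = nn.BCELoss()
opt_D = optim.Adam(D.parameters(), lr=2e-4)

def train_D_step(x_real):
    # fake images from G; detach so gradients do not flow back into G
    # while we are only training D
    z = torch.randn(x_real.size(0), NOISE_DIM)
    x_fake = G(z).detach()

    # stitch real and fake batches together as x, with labels 1 and 0
    x = torch.cat([x_real, x_fake], dim=0)
    y = torch.cat([torch.ones(x_real.size(0), 1),
                   torch.zeros(x_fake.size(0), 1)], dim=0)

    score = D(x)              # one score in (0, 1) per image
    loss = bce(score, y)      # loss between score and the 0/1 labels

    opt_D.zero_grad()
    loss.backward()           # gradients flow only into D's parameters
    opt_D.step()
    return loss.item()
```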
When training G, you need to treat G and D as a single system whose output is still the score. A set of random vectors is fed into G to generate images, and D scores the generated images; this is the forward pass of the D(G) system. The goal the system optimizes toward is score = 1, and the difference between the score and y = 1 forms the loss function through which the gradient is backpropagated. Note that D's parameters are frozen here (not trainable), which ensures that G is trained against D's scoring standard. It is like taking an exam: you cannot expect to change the teacher's grading criteria.
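A matching sketch of the G step: only G's parameters are given to the optimizer, so the gradient flows back through D into G while D itself stays fixed, and the target score is y = 1:

```python
opt_G = optim.Adam(G.parameters(), lr=2e-4)  # note: only G's parameters

def train_G_step(batch_size):
    z = torch.randn(batch_size, NOISE_DIM)
    x_fake = G(z)                 # forward pass of the whole D(G(z)) system
    score = D(x_fake)             # D grades the fake images

    # G's target is score = 1, i.e. "fool D into calling it real"
    y = torch.ones(batch_size, 1)
    loss = bce(score, y)

    opt_G.zero_grad()
    loss.backward()               # gradients pass through D into G,
    opt_G.step()                  # but only G's parameters are updated
    return loss.item()
```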
It should be noted that the entire GAN process is unsupervised. Unsupervised here means that the real images provided are not manually labeled: you only know they are real images, for example that they are all faces, but D does not know what the images depict; it only needs to tell real from fake. Likewise, G does not know what it is generating; it simply learns what the real images look like in order to deceive D.
Precisely because a GAN is unsupervised, G will sometimes generate strange images according to its own "ideas" that D nevertheless scores very highly, for example distorted human faces. This is caused by the lack of a supervised objective. Therefore, a follow-up paper in the same year, Conditional GAN, added supervision and improved controllability, with good results.
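As a rough illustration of the Conditional GAN idea (not the paper's exact architecture), the condition y, for example a one-hot class label, is concatenated to G's noise input and to D's image input, so both networks are told what the image should depict. The class count below is an assumption:

```python
NUM_CLASSES = 10  # assumed number of condition classes

# build a batch of one-hot condition labels
labels = torch.randint(0, NUM_CLASSES, (16,))
y_onehot = torch.eye(NUM_CLASSES)[labels]

z = torch.randn(16, NOISE_DIM)
g_input = torch.cat([z, y_onehot], dim=1)   # G now sees noise + label
# the first Linear layer of G (and of D, for image + label input)
# would need to be widened by NUM_CLASSES to accept these inputs
```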

Source: blog.csdn.net/code_Joe123/article/details/89380172