GAN principle

Various divergences

Entropy

H(P) = -Σ_x P(x) log P(x)
The amount of information carried by the distribution P
/
the expected minimum code length needed to encode samples from P using a code optimal for P.

Cross entropy

H(P, Q) = -Σ_x P(x) log Q(x)
The information of the distribution P viewed from the perspective of the distribution Q
/
the expected average code length needed to encode samples from P using a code based on Q.
Why can cross-entropy be used as a loss? The entropy of the training distribution P is constant, so minimizing the cross-entropy is equivalent to minimizing the KL divergence, i.e. the amount of information lost when fitting the training data distribution with the current model distribution.
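These two quantities are easy to check numerically. A minimal NumPy sketch (the distributions p and q below are made-up examples, not from the post):

```python
import numpy as np

def entropy(p):
    """H(P) = -sum_x P(x) log P(x): expected optimal code length for P."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # 0 log 0 is taken as 0
    return float(-np.sum(p * np.log(p)))

def cross_entropy(p, q):
    """H(P, Q) = -sum_x P(x) log Q(x): expected code length for P under Q's code."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(-np.sum(p[mask] * np.log(q[mask])))

p = [0.5, 0.5]   # "training" distribution
q = [0.9, 0.1]   # "model" distribution
print(entropy(p))           # log 2 ≈ 0.693
print(cross_entropy(p, q))  # ≈ 1.204, always >= H(P)
```

Encoding P with Q's code is never cheaper than using P's own optimal code, which is exactly why the gap (the KL divergence) is non-negative.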

KL divergence

KL(P ‖ Q) = Σ_x P(x) log( P(x) / Q(x) ) = H(P, Q) - H(P)
Non-negative; asymmetric.

The amount of information lost when using the distribution Q to approximate the distribution P
/
the extra code length required to encode samples from P using a code based on Q.
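A minimal self-contained NumPy check of these properties (p and q are made-up example distributions):

```python
import numpy as np

def kl(p, q):
    """KL(P || Q) = sum_x P(x) log(P(x)/Q(x)): extra code length from using Q's code."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # 0 log 0 is taken as 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p, q = [0.5, 0.5], [0.9, 0.1]
print(kl(p, q))  # ≈ 0.511, non-negative
print(kl(q, p))  # ≈ 0.368: KL(P||Q) != KL(Q||P), hence "asymmetric"
print(kl(p, p))  # 0.0: no information is lost when Q matches P exactly
```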

JS divergence

JS(P ‖ Q) = (1/2) KL(P ‖ M) + (1/2) KL(Q ‖ M), where M = (P + Q) / 2
Symmetric; bounded between 0 and log 2 (between 0 and 1 with base-2 logarithms) — the more similar the two distributions, the smaller it is.
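A minimal NumPy sketch of these properties (example distributions are made up):

```python
import numpy as np

def kl(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js(p, q):
    """JS(P || Q) = 0.5 KL(P || M) + 0.5 KL(Q || M), with M = (P + Q) / 2."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p, q = [0.5, 0.5], [0.9, 0.1]
print(js(p, q) == js(q, p))  # True: symmetric, unlike KL
print(js(p, p))              # 0.0: identical distributions
print(js([1, 0], [0, 1]))    # log 2 ≈ 0.693: the upper bound, hit on disjoint supports
```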

GAN principle

From the loss that the original GAN defines for the discriminator, we can derive the optimal discriminator in closed form; under this optimal discriminator, the original GAN generator loss is equivalent to minimizing the JS divergence between the real distribution P_r and the generated distribution P_g.
min_G max_D V(G, D) = E_{x~P_r}[log D(x)] + E_{x~P_g}[log(1 - D(x))]
D*(x) = P_r(x) / (P_r(x) + P_g(x))
max_D V(G, D) = 2 JS(P_r ‖ P_g) - 2 log 2
With G fixed, the optimal D is determined; substituting it back into max_D V(G, D) yields the JS divergence, with minimum value -2 log 2.
Minimizing the expression above therefore optimizes the JS divergence, and at the optimum it must hold that P_g = P_r.
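The identity V(G, D*) = 2 JS(P_r ‖ P_g) − 2 log 2 can be verified numerically on a toy discrete example. A sketch, where the two histograms are arbitrary made-up distributions:

```python
import numpy as np

def kl(p, q):
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js(p, q):
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

pr = np.array([0.5, 0.5, 0.0])   # real distribution P_r
pg = np.array([0.1, 0.2, 0.7])   # generated distribution P_g

# optimal discriminator D*(x) = P_r(x) / (P_r(x) + P_g(x))
d_star = np.divide(pr, pr + pg, out=np.zeros_like(pr), where=(pr + pg) > 0)

def expect_log(w, x):
    """E_w[log x], skipping zero-probability points (0 log 0 = 0)."""
    mask = w > 0
    return float(np.sum(w[mask] * np.log(x[mask])))

# V(G, D*) = E_{P_r}[log D*] + E_{P_g}[log(1 - D*)]
v = expect_log(pr, d_star) + expect_log(pg, 1.0 - d_star)
print(v)                               # ≈ -0.689
print(2 * js(pr, pg) - 2 * np.log(2))  # same value: V(G, D*) = 2 JS - 2 log 2
```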

Training problems

  1. G and D train against each other
    After G is updated, the JS divergence does become smaller, but the update also changes the V(G, D) surface, so the next max_D V(G, D) may become larger — that is, D's ability to separate the two distributions gets worse.
    Solution: update D multiple times for each update of G.
  2. JS divergence problem (mitigated by adding noise)
    Images are generated from a low-dimensional vector into a high-dimensional space, so P_r and P_g almost never have non-negligible overlap. No matter how far apart they are, their JS divergence is the constant log 2, so the generator's gradient is (approximately) 0 — the gradient vanishes. Adding noise to both distributions forces them to overlap.
  3. The improved generator loss (-log D trick) leads to instability and mode collapse (lack of diversity)
    Under the optimal discriminator, the generator loss E_{x~P_g}[-log D*(x)]
    is equal to minimizing KL(P_g ‖ P_r) - 2 JS(P_r ‖ P_g) (plus terms independent of G),
    i.e. it minimizes the KL divergence while simultaneously maximizing the JS divergence — contradictory objectives that make the gradients unstable.

The earlier KL problem: asymmetry.
KL(P_g ‖ P_r) = Σ_x P_g(x) log( P_g(x) / P_r(x) )
The first kind of error is generating a sample that does not exist in the real data set (P_g(x) > 0 where P_r(x) → 0), which incurs a huge penalty; the second is failing to generate a sample that does exist in the real data (P_g(x) → 0 where P_r(x) > 0), which incurs almost none. The generator therefore prefers to produce safe, repetitive samples rather than diverse ones — it avoids trial and error, which leads to mode collapse.
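The asymmetry can be made concrete with a toy histogram. In the sketch below (all three distributions are invented for illustration), P_r puts almost no mass on the third bin; a generator that invents such samples pays far more KL(P_g ‖ P_r) than one that simply drops a real mode:

```python
import numpy as np

def kl(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

eps = 1e-6
pr = [0.5, 0.5 - eps, eps]            # real data: two modes; third bin ~impossible
mode_drop = [1 - 2 * eps, eps, eps]   # generator collapses onto one real mode
bad_sample = [0.35, 0.35, 0.3]        # generator puts 30% mass on the impossible bin

print(kl(mode_drop, pr))   # ≈ 0.69: dropping a real mode is barely punished
print(kl(bad_sample, pr))  # ≈ 3.5: generating "wrong" samples is punished heavily
```

Under this loss the cheap strategy is to stay on one safe mode, which is exactly the diversity shortage described above.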

WGAN

Earth-Mover (EM) distance

W(P_r, P_g) = inf_{γ ∈ Π(P_r, P_g)} E_{(x, y)~γ}[ ‖x - y‖ ]
W(P_r, P_g) is the "minimum cost" under the "optimal transport plan".
Over all possible joint distributions γ whose marginals are P_r and P_g, take the expected distance between a real sample x and a generated sample y, then take the infimum (greatest lower bound).
That is, under the optimal joint distribution, it is the minimum cost of moving P_r onto P_g.
Compared with the KL and JS divergences, the superiority of the Wasserstein distance is that even when the two distributions do not overlap, it still reflects how far apart they are.
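In one dimension on an evenly spaced grid, W_1 reduces to the L1 distance between the CDFs, which makes the contrast with JS easy to demonstrate. A sketch (point-mass distributions on a made-up 10-bin grid with spacing 1):

```python
import numpy as np

def kl(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def w1(p, q):
    """1-D Wasserstein-1 on a unit-spaced grid: L1 distance between the CDFs."""
    return float(np.sum(np.abs(np.cumsum(p) - np.cumsum(q))))

for k in [1, 5, 9]:
    pr = np.zeros(10); pr[0] = 1.0   # P_r: point mass at bin 0
    pg = np.zeros(10); pg[k] = 1.0   # P_g: point mass at bin k
    print(k, round(js(pr, pg), 4), w1(pr, pg))
    # JS is stuck at log 2 ≈ 0.6931 for every k (no gradient signal),
    # while W1 = k shrinks as P_g moves toward P_r.
```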

WGAN

By the Kantorovich–Rubinstein duality, the distance can be rewritten as
W(P_r, P_g) = (1/K) sup_{‖f‖_L ≤ K} ( E_{x~P_r}[f(x)] - E_{x~P_g}[f(x)] )
For real samples take f(x), for generated samples take -f(x); the network parameters w are restricted so that the gradient of f stays bounded.
Lipschitz continuity: |f(x_1) - f(x_2)| ≤ K |x_1 - x_2|.

Differences from the original GAN:
1. Loss function
Discriminator (critic) loss: L_D = E_{x~P_g}[f_w(x)] - E_{x~P_r}[f_w(x)]; generator loss: L_G = -E_{x~P_g}[f_w(x)]

  1. Parameters are clipped (truncated) to satisfy the Lipschitz condition
    w ← clip(w, -c, c) after every gradient update
  2. The sigmoid is removed from the discriminator
    because the original D(x) fits a probability in [0, 1], whereas the WGAN discriminator (critic) fits the Wasserstein distance — a regression task with unbounded output.
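Both differences can be sketched on a toy problem. The code below trains a linear critic f(x) = w·x + b (no sigmoid) on 1-D Gaussians with weight clipping; all names and constants are invented for illustration, not from the original post:

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(3.0, 1.0, size=256)  # samples from P_r
fake = rng.normal(0.0, 1.0, size=256)  # samples from P_g

w, b = 0.0, 0.0
c, lr = 0.01, 0.05                     # clipping threshold and learning rate
for _ in range(200):
    # ascend the critic objective E_{P_r}[f(x)] - E_{P_g}[f(x)], f(x) = w*x + b
    grad_w = real.mean() - fake.mean()          # d/dw of the objective (b cancels)
    w = float(np.clip(w + lr * grad_w, -c, c))  # clipping keeps f K-Lipschitz

estimate = (w * real + b).mean() - (w * fake + b).mean()
print(w)         # saturates at the clip threshold c = 0.01
print(estimate)  # ≈ c * (3 - 0): a scaled Wasserstein-distance estimate
```

Note that the critic's output is an unbounded score, not a probability — exactly the "regression, not classification" point above — and that clipping caps how steep f can be, so the estimate is the distance scaled by the Lipschitz constant.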

Relativistic GANs

The relativistic discriminator estimates the probability that a real sample is more realistic than a generated one, D(x_r, x_g) = sigmoid( C(x_r) - C(x_g) ), where C is the critic's unnormalized output; both the discriminator and generator losses are defined on this relative score.


Origin blog.csdn.net/qq_30776035/article/details/104694112