1, p(x) is the probability distribution of the training data set. It does not need to be known in advance, because the network learns it automatically; p(x) is introduced only to make the formulas easier to derive and prove.
2, The discriminator learns the real data distribution p_r(x), the generator produces a fake distribution p_g(x), and training minimizes the distance between the two distributions. Once the Nash equilibrium is reached, p_g(x) will be very close to p_r(x).
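To make "minimizing the distance of the two distributions" concrete, here is a minimal sketch in plain Python. It assumes the distance is the Jensen-Shannon divergence (the text above does not name a specific distance), and the two discrete distributions and the helper mix() are made-up illustrations, not part of any GAN library. It only shows that the distance shrinks to zero as p_g(x) moves toward p_r(x); it does not train a network.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities."""
    # Terms with p_i == 0 contribute nothing, so they are skipped.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and 0 only when p == q."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical "real" distribution p_r and an initial "fake" distribution p_g.
p_r = [0.1, 0.4, 0.4, 0.1]
p_g_start = [0.7, 0.1, 0.1, 0.1]

def mix(t):
    """Stand-in for training progress: t = 0 is the initial p_g, t = 1 equals p_r."""
    return [(1 - t) * g + t * r for g, r in zip(p_g_start, p_r)]

for t in (0.0, 0.5, 0.9, 1.0):
    print(f"t = {t:.1f}  JS(p_r, p_g) = {js_divergence(p_r, mix(t)):.4f}")
```

The printed distance decreases as t grows and is zero at t = 1, which is the situation point 2 describes at equilibrium.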
3, The training function