Generative network paper reading: PGGAN (1): A quick look at the paper

This is a quick overview; for a more detailed treatment of the paper, please refer to:

1. What problem does the research solve?

The paper addresses one central problem: the resolution of images generated by GANs is relatively low.

2. Solutions

2.1 Why training directly at high resolution fails

  • 1. Training difficulty. The author's point: when training starts directly on high-resolution images, the discriminator can easily tell which image is generated and which is real, so the gradients passed back to the generator are unstable. My personal take: the gap between generated and real images is so large at the start that the generator loses its direction entirely and has no idea how to produce something that looks real. (Directly generating a picture that looks real is too hard, so a gradual process is needed.)
  • 2. Small batches. Because of the high resolution, a single image takes up a lot of memory, so a batch cannot contain many images. Small batches are problematic: each optimization step may not represent the true optimization direction. (Taken to the extreme, a batch containing only one image cannot represent the situation of the other images.)
  • The author mainly solves the training-difficulty problem.
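The memory argument in point 2 can be checked with simple arithmetic. This is just back-of-the-envelope math for the raw image tensors (it ignores activations and optimizer state, which make the real cost much larger), assuming float32 RGB images:

```python
# Rough memory cost of one raw image tensor at various resolutions,
# illustrating why high-resolution training forces small batches.
# Assumes float32 (4 bytes per value) and 3 colour channels.
def image_megabytes(resolution, channels=3, bytes_per_value=4):
    return resolution * resolution * channels * bytes_per_value / 1024**2

for res in (64, 256, 1024):
    print(f"{res}x{res}: {image_megabytes(res):.2f} MB per image")
# 1024x1024 comes out to 12 MB per image, before any activations,
# so even a modest batch quickly exhausts GPU memory.
```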

2.2 How to solve the training problem

    1. Since training cannot proceed at high resolution directly, start training at a low resolution and then gradually increase it. As shown in the figure below, the author begins training at a small resolution and then progressively grows it:
      (figure: training starts at low resolution and the resolution grows progressively)
    2. How to make the transition then becomes the key. As shown in the figure below, consider going from 16×16 to 32×32 resolution: if we jump directly from (a) to (c), there is a problem. The newly added 32×32 layers are randomly initialized, so if their initial weights happen to be poor, the good results of the earlier training are all wasted. The author therefore inserts a (b) stage as a transition.
      (figure: the transition from 16×16 to 32×32 via stages (a), (b), and (c))
    3. What exactly does stage (b) do? It introduces a parameter α that controls how much the newly added layers influence the output. α starts small and gradually increases over the course of training, which makes the transition to the next layer much softer.

Other things mentioned in the article that need attention

The problem of GAN network training collapse

The article mentions that when GAN training collapses, the collapse often begins just as the gradients escalate sharply, so the escalation can be blocked (by intervening in training) when this happens.
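One simple way to act on that observation, sketched here as an assumption rather than the paper's actual mechanism, is to monitor the gradient norm and rescale the gradients when it spikes past a threshold (the idea behind gradient clipping). `grads` is just a flat list of gradient values for illustration; in a real framework it would come from the discriminator's backward pass:

```python
# Hedged sketch: block a gradient spike by clipping the global norm.
def clip_gradients(grads, max_norm):
    norm = sum(g * g for g in grads) ** 0.5
    if norm > max_norm:                 # spike detected
        scale = max_norm / norm         # rescale so the norm equals max_norm
        return [g * scale for g in grads]
    return list(grads)

# A gradient vector of norm 5 gets rescaled down to norm 1.
print(clip_gradients([3.0, 4.0], max_norm=1.0))
```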

Origin blog.csdn.net/qq_43210957/article/details/126937359