Deep Learning 7: Generative Adversarial Networks | GAN

A generative adversarial network (GAN) is an unsupervised learning algorithm that has become very popular in the past two years. It can generate very realistic photos, images, and even videos, and it is used in the photo-processing software on our phones.

Table of contents

The basic principle of generative adversarial networks (GAN)

Plain-language version

Technical version

The first stage: fix the "discriminator D" and train the "generator G"

The second stage: fix the "generator G" and train the "discriminator D"

Repeat stage 1 and stage 2

Advantages and disadvantages of GAN

Top 10 Typical GAN Algorithms

13 Practical Applications of GANs


From manual feature extraction to automatic feature extraction

The most distinctive and powerful aspect of deep learning is its ability to learn feature extraction on its own.

The enormous computing power of machines can solve many problems that humans cannot solve by hand. Once feature extraction is automated, the model learns better and adapts more readily.

From manually judging generated results to automatic judgment and optimization

A training set requires a large amount of manually labeled data, which is costly and inefficient. Judging the quality of generated results by hand has the same problems of high cost and low efficiency.

A GAN, however, can carry out this judging process automatically and keep optimizing it, which makes it a very efficient and low-cost method. How does a GAN automate this? Its principle is explained below.

The basic principle of generative adversarial networks (GAN)

Plain-language version

There is a very good explanation on Zhihu that everyone should be able to follow:

Suppose law and order in a city breaks down; before long the city is full of thieves. Some of these thieves are master criminals, and some have no skill at all. Then the city starts to restore order: a crackdown on crime is launched, and the police resume patrolling the streets. Soon a batch of unskilled thieves is caught. Only the unskilled ones are caught because the police are not very skilled yet either. After this batch of low-level thieves is caught, it is hard to say how much safer the city has become, but the average skill level of the remaining thieves has clearly risen.

The police then keep training their detective skills and start catching increasingly cunning thieves. As they arrest these professional repeat offenders, the police develop special skills of their own: they can quickly pick a suspicious person out of a crowd, question them, and finally make an arrest. Life gets harder for the thieves too: because the police have improved so much, anyone who still behaves as furtively as before will soon be caught. In this analogy, the thieves play the role of the generator and the police play the role of the discriminator.

Technical version

A generative adversarial network (GAN) consists of two main parts:

  1. Generator: a model that produces data (in most cases, images) with the goal of fooling the discriminator.
  2. Discriminator: a model that judges whether an image is real or machine-generated, with the goal of picking out the fake data made by the generator.
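The contest between these two parts is usually written as a minimax game. For reference, here is the value function from the original GAN formulation by Goodfellow et al. The symbols are not used elsewhere in this article and are introduced only for illustration: $p_{\text{data}}$ is the distribution of real data, $p_z$ is the noise distribution fed to the generator, $G(z)$ is a generated sample, and $D(x)$ is the discriminator's estimate of the probability that $x$ is real.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to make $V$ large by scoring real data near 1 and generated data near 0, while the generator tries to make $V$ small by producing samples that the discriminator scores as real.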

The process is described in detail below:

The first stage: fix the "discriminator D" and train the "generator G"

We start with a reasonably good discriminator, let "generator G" keep producing "fake data", and hand that data to "discriminator D" to judge.

At first "generator G" is weak, so its fakes are easily spotted.

With continued training, however, the skill of "generator G" keeps improving until it finally fools "discriminator D".

At this point "discriminator D" is essentially guessing blindly: it judges a sample to be fake with a probability of 50%.
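Why 50%? A short sketch can make this concrete. For a fixed generator, the GAN paper by Goodfellow et al. shows that the best possible discriminator outputs D*(x) = p_data(x) / (p_data(x) + p_g(x)). The helper below evaluates that formula on small discrete distributions; the function name and the toy probabilities are made up for this illustration. When the generator's distribution matches the real one, every output is 0.5.

```python
def optimal_discriminator(p_data, p_gen):
    """Return D*(x) = p_data(x) / (p_data(x) + p_gen(x)) for every outcome x.

    Both arguments are dicts mapping an outcome to its probability.
    """
    scores = {}
    for x in sorted(set(p_data) | set(p_gen)):
        real = p_data.get(x, 0.0)
        fake = p_gen.get(x, 0.0)
        if real + fake > 0.0:  # skip outcomes that neither distribution produces
            scores[x] = real / (real + fake)
    return scores


# Toy distributions over two kinds of image (assumed values).
real_images = {"cat": 0.5, "dog": 0.5}
weak_generator = {"cat": 0.9, "dog": 0.1}      # early in training
perfect_generator = {"cat": 0.5, "dog": 0.5}   # matches the real data

print(optimal_discriminator(real_images, weak_generator))
print(optimal_discriminator(real_images, perfect_generator))
# The second line prints {'cat': 0.5, 'dog': 0.5}: D can only guess.
```

With the weak generator, the best discriminator scores "cat" below 0.5 (the generator produces too many cats) and "dog" above 0.5, so it can still tell the two sources apart.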

The second stage: fix the "generator G" and train the "discriminator D"

Once the first stage is complete, there is no point in training "generator G" any further. We now fix "generator G" and start training "discriminator D".

Through continued training, "discriminator D" improves its ability to discriminate until it can accurately identify all the fake pictures.

By this point "generator G" can no longer fool "discriminator D".

Repeat stage 1 and stage 2

By repeating this cycle, both "generator G" and "discriminator D" become stronger and stronger.

In the end we obtain a very good "generator G", which we can use to generate the pictures we want.
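The two alternating stages can be sketched in a few dozen lines of plain Python. This is a toy under stated assumptions, not a practical GAN: the data is one-dimensional, the "generator" is the linear function G(z) = a*z + b, the "discriminator" is a logistic function D(x) = sigmoid(w*x + c), gradients are derived by hand, and the generator uses the common non-saturating loss -log D(G(z)). All class names, function names, and hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(t):
    """Logistic function, clamped to avoid overflow in math.exp."""
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


class Generator:
    """Toy generator G(z) = a * z + b (hypothetical, for illustration)."""

    def __init__(self):
        self.a, self.b = 1.0, 0.0

    def __call__(self, z):
        return self.a * z + self.b


class Discriminator:
    """Toy discriminator D(x) = sigmoid(w * x + c): probability x is real."""

    def __init__(self):
        self.w, self.c = 0.1, 0.0  # a weak but usable starting point (assumed)

    def __call__(self, x):
        return sigmoid(self.w * x + self.c)


def train_generator(G, D, rng, steps=20, batch=32, lr=0.05):
    """Stage 1: D is fixed; only G's parameters (a, b) are updated."""
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for _ in range(batch):
            z = rng.gauss(0.0, 1.0)
            # Loss -log D(G(z)); its derivative w.r.t. G(z) is -(1 - D) * w.
            dx = -(1.0 - D(G(z))) * D.w
            grad_a += dx * z / batch
            grad_b += dx / batch
        G.a -= lr * grad_a
        G.b -= lr * grad_b


def train_discriminator(G, D, rng, real_mean=4.0, steps=20, batch=32, lr=0.05):
    """Stage 2: G is fixed; only D's parameters (w, c) are updated."""
    for _ in range(steps):
        grad_w = grad_c = 0.0
        for _ in range(batch):
            real = rng.gauss(real_mean, 0.5)   # sample of "real" data
            fake = G(rng.gauss(0.0, 1.0))      # sample of generated data
            # Loss -log D(real) - log(1 - D(fake)).
            d_real = -(1.0 - D(real))  # derivative w.r.t. the real logit
            d_fake = D(fake)           # derivative w.r.t. the fake logit
            grad_w += (d_real * real + d_fake * fake) / batch
            grad_c += (d_real + d_fake) / batch
        D.w -= lr * grad_w
        D.c -= lr * grad_c


def train_gan(rounds=50, seed=0):
    """Alternate stage 1 and stage 2 for a number of rounds."""
    rng = random.Random(seed)
    G, D = Generator(), Discriminator()
    for _ in range(rounds):
        train_generator(G, D, rng)
        train_discriminator(G, D, rng)
    return G, D
```

Calling `train_gan()` returns the two toy models. No particular final values are claimed here, since alternating training of this kind can oscillate rather than settle; the point of the sketch is only that each stage updates one model while the other stays frozen.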

The following practical application section will show many "amazing" cases.

If you are interested in the detailed technical principles of GAN, you can take a look at the following two articles:

A Beginner's Guide to Generative Adversarial Networks (GAN) – With Code

"Long text explaining in detail the principles of generative adversarial networks (GAN) (20-minute read)"

Advantages and disadvantages of GAN

3 advantages

  1. Better modeling of the data distribution (sharper, clearer images).
  2. In theory, GANs can train any kind of generator network. Other frameworks require the generator network to have a specific functional form, such as a Gaussian output layer.
  3. There is no need for repeated sampling with Markov chains, no inference is required during learning, and there are no complicated variational lower bounds, which avoids the difficult problem of approximating intractable probabilities.

2 disadvantages

  1. Hard to train and unstable. The generator and the discriminator need to stay well synchronized, but in practice D often converges while G diverges, so training D and G requires careful design.
  2. Mode collapse. During training, the generator may miss some modes of the data and start to degenerate, always producing the same sample points and failing to learn any further.
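One rough way to notice mode collapse on toy data is to check how many known modes of the real data the generated samples actually reach. The sketch below assumes one-dimensional samples and known mode centers; the function name `covered_modes`, the radius, and the sample values are all invented for this illustration and are not a standard metric.

```python
def covered_modes(samples, mode_centers, radius=0.5):
    """Return the mode centers that have at least one sample within `radius`."""
    return [
        center
        for center in mode_centers
        if any(abs(sample - center) <= radius for sample in samples)
    ]


modes = [-4.0, 0.0, 4.0]                      # real data clusters (assumed)
healthy = [-3.8, 0.2, 4.1, -4.3, 3.9]         # samples spread over all modes
collapsed = [0.05, -0.02, 0.01, 0.03, 0.00]   # generator stuck on one mode

print(covered_modes(healthy, modes))    # [-4.0, 0.0, 4.0]
print(covered_modes(collapsed, modes))  # [0.0]
```

A generator whose samples cover only one of three modes, as in the second call, is showing the degenerate behavior described above.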

Top 10 Typical GAN Algorithms

There are hundreds of GAN algorithms, and research on GANs is growing exponentially. Hundreds of papers on adversarial networks are now published every month.

The figure below shows the number of papers published on GAN each month:

Papers on GANs are growing exponentially

If you are interested in GAN algorithms, you can find almost all of them in the "GANs Zoo". We have selected 10 representative algorithms; readers with a technical background can look at their papers and code.

Algorithm    Paper            Code
GAN          Paper address    Code address
DCGAN        Paper address    Code address
CGAN         Paper address    Code address
CycleGAN     Paper address    Code address
CoGAN        Paper address    Code address
ProGAN       Paper address    Code address
WGAN         Paper address    Code address
SAGAN        Paper address    Code address
BigGAN       Paper address    Code address

The above content is compiled from "Generative Adversarial Networks – The Story So Far", which gives brief descriptions of each algorithm. If you are interested, take a look.

13 Practical Applications of GANs

GANs may not seem as intuitive as "speech recognition" or "text mining", but their applications have already entered our lives. Here are some practical applications of GANs.

Generate an image dataset

Training artificial intelligence requires large datasets. Collecting and labeling all of the data by hand is very expensive. GANs can generate some of this data automatically and provide low-cost training data.

Example of vector arithmetic on GAN-generated faces

Generate face photos

Generating face photos is an application everyone is familiar with, but what to do with the generated photos needs careful thought, because such photos still sit in a legal gray area.

Examples of progress in the capabilities of GANs from 2014 to 2017

Generate photos and cartoon characters

GAN can generate not only human faces, but also other types of photos, and even comic characters.

Photos generated by GANs

Comic characters generated by GANs

Image to Image Conversion

Simply put, this converts an image in one form into an image in another form, much like applying a filter. For example:

  • Convert sketches to photos
  • Convert satellite photos to Google Maps images
  • Convert photos into oil paintings
  • Turn day into night

Example from sketch to color photo with pix2pix

GAN applications: photo to oil painting, horse to zebra, winter to summer, photo to Google Maps

Text to Image Conversion

StackGAN, in particular, generates photorealistic images from text descriptions of simple objects such as birds and flowers.

Examples of textual descriptions of birds and GAN-generated photos from StackGAN

Semantic Image to Photo Conversion

A 2017 paper titled "High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs" demonstrated the use of conditional GANs to generate realistic images from a semantic image or sketch given as input.

Examples of semantic images and GAN-generated cityscape photos

Automatically generate human models

A 2017 paper titled "Pose Guided Person Image Generation" showed that images of human models in new poses can be generated automatically.

GAN-generated models in new poses

Photos to Emojis

GANs can automatically generate matching emojis from photos of faces.

Celebrity photos and GAN-generated emoji examples

Photo editing

Specific photos can be generated using GANs, such as changing hair color, changing facial expressions, or even changing gender.

Effects of photo editing with IcGAN

Predict looks at different ages

Given a photo of a face, GAN can help you predict what you will look like at different ages.

Example of face photos generated with GAN with different apparent ages

Increase photo resolution to make photos clearer

Give a GAN a photo, and it can generate a higher-resolution version, making the photo clearer.

GANs add resolution to raw photos to make them sharper

Photo restoration

If part of a photo is damaged (for example, painted over or erased), a GAN can repair that area and restore it to its original state.

With the middle part of the photo covered, GANs can repair it well

Automatically generate 3D models

Given multiple 2D images from different angles, a 3D model can be generated.

The process of building from 2D image to 3D chair model

Generative Adversarial Networks (GANs)

A GAN is a deep learning model and one of the most promising approaches in recent years to unsupervised learning on complex distributions. The model produces quite good output by having (at least) two modules in the framework, the generative model and the discriminative model, learn by playing a game against each other. In the original GAN theory, G and D are not required to be neural networks; they only need to be able to fit the corresponding generating and discriminating functions. In practice, however, deep neural networks are generally used for both G and D. A good GAN application also needs a good training method; otherwise the flexibility of the neural network models can lead to unsatisfactory output.

Generative adversarial networks (GANs) are a class of artificial intelligence algorithms for unsupervised machine learning, implemented as a system of two neural networks competing against each other in a zero-sum game framework. They were introduced by Ian Goodfellow et al. in 2014. The technique can generate photos that look at least superficially real to human observers and have many realistic features (although in tests people can still tell real from generated images in many cases).


Origin blog.csdn.net/qq_38998213/article/details/132516284