A practical book on GANs (generative adversarial networks) has been published. Have you heard?

What is a GAN

A GAN is a machine learning technique composed of two models trained simultaneously: a generator, which is trained to produce fake data, and a discriminator, which is trained to tell fake data apart from real data.

The term generative indicates the overall goal of the model: to generate new data. What a GAN learns to generate depends on the chosen training set. For example, if we want a GAN to synthesize paintings that look like Leonardo da Vinci's work, we have to use da Vinci's paintings as the training set.

The term adversarial refers to the game-like, competitive dynamic between the two models that make up the GAN framework: the generator and the discriminator. The goal of the generator is to produce fake data that is indistinguishable from the real data in the training set; in the example above, that means paintings that look just like da Vinci's. The goal of the discriminator is to tell the real data from the training set apart from the fake data produced by the generator. In other words, the discriminator acts as an art appraiser, assessing the authenticity of works presented as da Vinci paintings. The two networks constantly try to outwit each other: the more realistic the generator's fake data becomes, the better the discriminator must become at telling real from fake.

The term network refers to the class of machine learning model most commonly used for the generator and the discriminator: neural networks. Depending on the complexity of the GAN implementation, these range from simple feedforward neural networks (Chapter 3) to convolutional neural networks (Chapter 4) and more complex variants (such as U-Net in Chapter 9).

How do GANs work

The mathematical theory underpinning GANs is fairly complex (it is the focus of the next few chapters, especially Chapters 3 and 5). Fortunately, many real-world analogies make GANs easier to understand. Earlier we discussed an art forger (the generator) trying to fool an art appraiser (the discriminator). The more convincing the forger's fake paintings, the better the appraiser must be at telling real from fake. The reverse is also true: the better the appraiser is at judging whether a painting is authentic, the more the forger has to improve to avoid being caught.

Another metaphor often used to describe GANs (one that Ian Goodfellow himself likes to use) is that of a counterfeiter (the generator) and a detective who tries to catch him (the discriminator): the more real the counterfeit money looks, the better the detective must be at spotting it, and vice versa.

In more technical terms, the goal of the generator is to produce samples that capture the characteristics of the training set so well that they are indistinguishable from the training data. The generator can be thought of as an object recognition model in reverse. An object recognition algorithm learns the patterns in images in order to recognize what an image contains; the generator, instead of recognizing these patterns, learns to create them from scratch. In fact, the input to the generator is often nothing more than a vector of random numbers.
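To make the "vector of random numbers in, sample out" idea concrete, here is a minimal sketch in plain Python. It is not the book's code: the names `make_generator` and `sample_noise`, the single tanh layer, and the sizes are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random


def make_generator(noise_dim, output_dim, seed=0):
    """Return a toy one-layer generator: sample = tanh(W z + b).

    Hypothetical illustration only. The weights are random and untrained;
    a real generator would learn them from the discriminator's feedback.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.5) for _ in range(noise_dim)]
               for _ in range(output_dim)]
    biases = [0.0] * output_dim

    def generator(z):
        if len(z) != noise_dim:
            raise ValueError("noise vector has the wrong length")
        # Each output value is a squashed weighted sum of the noise entries.
        return [math.tanh(sum(w * zi for w, zi in zip(row, z)) + b)
                for row, b in zip(weights, biases)]

    return generator


def sample_noise(noise_dim, rng):
    """Draw the random input vector z from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]


rng = random.Random(42)
generator = make_generator(noise_dim=4, output_dim=6)
fake_sample = generator(sample_noise(4, rng))
print(fake_sample)  # six values between -1 and 1
```

Feeding in a different noise vector yields a different sample, which is how a trained generator can produce many distinct outputs from one set of weights.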

The generator learns from the feedback it receives from the discriminator's classifications. The goal of the discriminator is to determine whether a particular sample is real (from the training set) or fake (produced by the generator). Accordingly, every time the discriminator is fooled into classifying a fake image as real, the generator learns that it did something well; conversely, whenever the discriminator correctly identifies a generated image as fake, the generator receives the feedback that it needs to improve.
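One common way to turn this feedback into numbers is binary cross-entropy on the discriminator's outputs. The sketch below is an illustration under assumptions, not the book's implementation: the function names and the example probabilities are hypothetical, and the generator loss uses the widely used convention of scoring fakes against the "real" label.

```python
import math


def binary_cross_entropy(predictions, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(predictions, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(predictions)


def discriminator_loss(d_on_real, d_on_fake):
    """The discriminator wants real samples scored 1 and fake samples scored 0."""
    preds = list(d_on_real) + list(d_on_fake)
    labels = [1] * len(d_on_real) + [0] * len(d_on_fake)
    return binary_cross_entropy(preds, labels)


def generator_loss(d_on_fake):
    """The generator wants the discriminator to score its fakes as 1 (real)."""
    return binary_cross_entropy(d_on_fake, [1] * len(d_on_fake))


# Hypothetical discriminator outputs: probability that each fake is real.
fooled = [0.9, 0.8, 0.95]      # discriminator mistakes the fakes for real
not_fooled = [0.1, 0.2, 0.05]  # discriminator spots the fakes

# The generator's loss is lower when the discriminator is fooled.
print(generator_loss(fooled) < generator_loss(not_fooled))  # True
# The discriminator's loss is lower when it spots the fakes.
print(discriminator_loss([0.9, 0.8], [0.1, 0.2])
      < discriminator_loss([0.9, 0.8], [0.9, 0.8]))  # True
```

In a real training loop, each network's weights would be updated to reduce its own loss while the other network is held fixed.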

The discriminator keeps improving as well. Like any other classifier, it learns from how far its predictions are from the true labels (real or fake). So, as the generator gets better at producing realistic data, the discriminator gets better at telling fake data from real data, and both networks continue to improve at the same time.
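For readers who want a preview of the formal view, this two-player competition is usually written as the minimax objective from the original GAN paper (Goodfellow et al., 2014). The symbols are not defined in this excerpt and are introduced here only for illustration: D(x) is the discriminator's estimated probability that x is real, G(z) is the generator's output for a noise vector z, p_data is the distribution of the training data, and p_z is the noise distribution.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to make this value large by scoring real samples near 1 and generated samples near 0, while the generator tries to make it small by producing samples that the discriminator scores near 1.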

Table 1.1 summarizes the key information of the two sub-networks of GAN.

Which book is it?

GANs in Action

This book is a guide for anyone interested in generative adversarial networks (GANs) who wants to learn them from the ground up. Starting from the simplest examples, it introduces some of the most innovative GAN implementations and techniques, explains these advances intuitively, and presents everything involved in full (apart from the most basic math and principles), putting cutting-edge research within reach.

The ultimate goal of this book is to give you the knowledge and tools you need not only to fully understand what GANs have achieved so far, but also to choose and develop new applications of your own. Generative adversarial models are full of potential, waiting to be explored by enterprising people like you who want to make a difference in academic research and real-world applications. Welcome to our GAN journey.

Who should read this book

This book is intended for readers who already have some experience with machine learning and neural networks. The list below shows what you should ideally know beforehand. Although the book tries to keep the content accessible, you should feel confident about at least 70% of the following.

  • The ability to run the Python programs in the book. You do not need to be a Python expert, but you should have at least two years of Python work experience (ideally with a full-time background as a data scientist or software engineer).
  • Object-oriented programming: how to work with objects and how to discover their attributes and methods; familiarity with typical Python objects (such as the Pandas DataFrame) and atypical ones (such as Keras layers).
  • The basics of machine learning theory, such as train/test splits, overfitting, weights, and hyperparameters, as well as supervised, unsupervised, and reinforcement learning; familiarity with metrics such as accuracy and mean squared error.
  • Basic statistics and calculus, such as probability, density functions, probability distributions, differentiation, and simple optimization.
  • Basic linear algebra, such as matrices and high-dimensional spaces, as well as the concept of principal component analysis.
  • The basics of deep learning, such as feedforward networks, weights and biases, activation functions, regularization, stochastic gradient descent, and backpropagation.
  • Basic familiarity with Keras, the Python-based machine learning library, or a willingness to learn it on your own.

These requirements are not meant to scare you, but to make sure you can get the most out of this book. You can of course read it regardless, but the less you know beforehand, the more you will need to look up online. If these requirements do not put you off, let's get started!

About the code

This book contains many examples of source code, both in numbered listings and inline with normal text. In both cases, source code is formatted in a monospace font. Sometimes code is also shown in bold to highlight what has changed from previously shown code, such as when a new feature is added to an existing line of code.

Most of the source code has been reformatted to fit the page layout of the book. In addition, comments in the source code have usually been removed when the code is explained in the text. Many listings are accompanied by annotations that highlight important concepts. The code for the examples in this book can be downloaded from "Supporting Resources" on the book's details page on the Asynchronous Community website.

This book uses Jupyter Notebook, a standard tool in data science education, so you should learn how to use it first. This should not be difficult for intermediate Python users. Because it is sometimes hard to get access to a GPU or to get everything working properly, especially on Windows, some chapters also provide Google Colaboratory (Colab) notebooks. Colab is Google's free platform that comes prepackaged with the essential data science tools and a free GPU for a limited time. You can run this code directly in the browser, or upload the code from the other chapters to Colab, since they are compatible.

Online resources

GANs are an active field with good (albeit fragmented) resources. Academically inclined readers can find the latest papers on arXiv, an online repository of electronic preprints of academic papers owned and operated by Cornell University.

The authors of this book are active contributors to the Medium writing platform (especially the technology-focused publications Towards Data Science and Hacker Noon), where you can find their latest writing.

Structure of this book

This book strives to strike a balance between theory and practice. The book is divided into three parts as follows.

Part 1: Introduction to GANs and generative models

This part introduces the basic concepts of generative learning and GANs, and implements several of the most canonical GAN variants.

  • Chapter 1 introduces generative adversarial networks (GANs) and explains how they work at a high level. You will learn that a GAN consists of two separate neural networks (a generator and a discriminator) that are trained through a competitive dynamic. This chapter lays the foundation for understanding the rest of the book.
  • Chapter 2 discusses autoencoders, which are in many respects considered precursors of GANs. Given how new generative learning is, this chapter helps place GANs in a broader context. It contains the book's first code tutorial, which builds an autoencoder to generate handwritten digits, the same task explored in the GAN tutorials of the following chapters. If you are already familiar with autoencoders or want to dive straight into GANs, you can skip this chapter.
  • Chapter 3 delves into the theory behind GANs and adversarial learning. It explains the main differences between GANs and conventional neural networks, namely their cost functions and training processes. In the code tutorial at the end of the chapter, we apply what we have learned to implement a GAN in Keras and train it to generate handwritten digits.
  • Chapter 4 introduces convolutional neural networks and batch normalization. It implements an advanced GAN architecture that uses convolutional networks as its generator and discriminator and uses batch normalization to stabilize training.

Part 2: Advanced topics in GANs

Building on Part 1, this part delves deeper into the theory underlying GANs and implements a series of advanced GAN architectures.

  • Chapter 5 discusses many of the theoretical and practical obstacles to training GANs and how to overcome them. It provides a comprehensive overview of best practices for training GANs, based on relevant academic papers and talks. It also covers the options for evaluating GAN performance and explains why evaluation deserves attention.
  • Chapter 6 explores the Progressively Growing Generative Adversarial Network (PGGAN), a cutting-edge training method for the generator and discriminator. By adding new layers during training, PGGAN achieves very good image quality and resolution. The chapter explains how it works both in theory and in practice, with real code examples that use TensorFlow Hub (TFHub).
  • Chapter 7 continues to explore innovations built on the core GAN model. You will learn about the enormous practical value of semi-supervised learning, which improves classification accuracy using only a small fraction of labeled training examples. In this chapter, you will implement a Semi-Supervised Generative Adversarial Network (SGAN) and see how it uses labels to turn the discriminator into a robust multiclass classifier.
  • Chapter 8 presents another GAN architecture that uses labels in training. By using labels or other conditioning information when training the generator and discriminator, the Conditional Generative Adversarial Network (CGAN) addresses one of the main shortcomings of the generator: the inability to specify explicitly which sample to synthesize. At the end of the chapter, you will implement a CGAN to see targeted data generation firsthand.
  • Chapter 9 discusses one of the most interesting GAN architectures: the Cycle-Consistent Generative Adversarial Network (CycleGAN). This technique translates one image into another, for example turning an image of a horse into an image of a zebra. The chapter walks through the CycleGAN architecture and explains its main components and innovations. In the tutorial, CycleGAN is used to convert apples into oranges (and oranges into apples).

Part 3: Where to go from here

This part discusses how and where to apply GANs and adversarial learning.

  • Chapter 10 introduces adversarial examples, inputs deliberately crafted to fool machine learning models into making mistakes. The chapter discusses their importance in both theory and practice and explores how they relate to GANs.
  • Chapter 11 covers practical applications of GANs and explores how the techniques introduced in the previous chapters apply to real-world use cases in medicine and fashion: in medicine, how GANs can be used to augment small data sets and improve classification accuracy; in fashion, how GANs drive personalization.
  • Chapter 12 summarizes the key takeaways about GANs so far, discusses the moral and ethical considerations surrounding them, and introduces some emerging GAN techniques.

Why study GANs

Since their invention, GANs have been hailed by experts in academia and industry as "one of the most important innovations in deep learning." Yann LeCun, Facebook's head of artificial intelligence research, went so far as to say that GANs and their variants are "the coolest idea in deep learning in the past 20 years." [2]

The excitement is well justified. Other advances in machine learning may be well known among researchers, but to the uninitiated they tend to raise more questions than excitement. GANs, by contrast, have drawn great interest from researchers and the general public alike, with coverage in the New York Times, the BBC, Scientific American, and many other prominent media outlets. It may even have been one of the achievements of GANs that drove you to buy this book. (Right?)

Perhaps most notable is the ability of GANs to create hyperrealistic imagery. None of the faces shown in Figure 1.4 belong to real people; they are all fake, demonstrating the ability of GANs to synthesize images with photorealistic quality. The faces were generated using a progressively growing GAN, which is covered in Chapter 6.

(Source: Progressive Growing of GANs for Improved Quality, Stability, and Variation, by Tero Karras et al., 2017.)
Figure 1.4 These realistic but fake faces were generated by a progressive GAN trained on a collection of high-resolution celebrity portrait photos.

Another remarkable achievement of GANs is image-to-image translation. Much as a sentence can be translated from Chinese to Spanish, a GAN can translate an image from one style to another. As shown in Figure 1.5, a GAN can turn an image of a horse into an image of a zebra, or a photo into a Monet-like painting, with almost no supervision and no labels. The GAN variant that makes this possible is the Cycle-Consistent Generative Adversarial Network (CycleGAN), which is covered in Chapter 9.

The more practical applications of GANs are just as fascinating. The online retail giant Amazon is experimenting with GANs to provide fashion advice: by analyzing countless outfits, the system can learn to generate new products in any given style. [3] In medical research, GANs augment data sets with synthesized samples to improve diagnostic accuracy. [4] Once we have mastered the details of training GANs and their variants, we will discuss both applications in detail in Chapter 11.

(Source: Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks, by Jun-Yan Zhu et al., 2017.)
Figure 1.5 Using a GAN variant called CycleGAN, a Monet painting can be turned into a photo, or a zebra in a picture into a horse, and vice versa.

GANs are also regarded as an important stepping stone toward artificial general intelligence [5], an artificial system capable of matching human cognitive abilities and acquiring expertise in virtually any domain, from the motor skills needed to walk, to language, to the creative skills needed to write poetry.

However, the ability to generate new data and images also makes GANs potentially dangerous. Much has already been said about the spread and dangers of fake news, and the ability of GANs to generate believable fake videos is just as disturbing. At the end of a 2018 article about GANs, aptly titled "How an A.I. 'Cat-and-Mouse Game' Generates Believable Fake Photos," New York Times reporters Cade Metz and Keith Collins discuss a worrying prospect: GANs may be used to create and spread convincing misinformation, such as fake video footage of statements by world leaders. Martin Giles, San Francisco bureau chief of MIT Technology Review, voiced similar concerns. In his 2018 article "The GANfather: The Man Who's Given Machines the Gift of Imagination," he notes that in the hands of skilled hackers, GANs could be used to probe and exploit system vulnerabilities on an unprecedented scale. These concerns are what prompted us to discuss the ethical considerations of applying GANs (Chapter 12).

GANs can bring many benefits to the world, but any technological innovation is a double-edged sword. We must keep this in mind: a technology cannot be "uninvented," so it is important to make sure that people like you understand the rapid rise of this technology and its enormous potential.

This book can only touch on some of what GANs make possible, but we hope it gives you the theoretical knowledge and practical skills you need to keep exploring whichever areas interest you most.

Without further ado, let's get started!

Origin blog.csdn.net/epubit17/article/details/114686470