DataWhale Team Learning Camp - Task 10-2: Generative Adversarial Networks (GAN)

Generative Adversarial Networks
Throughout most of this book, we have talked about how to make predictions. In some form or another, we used deep neural networks to learn mappings from data points to labels. This kind of learning is called discriminative learning; for example, we would like to be able to distinguish photos of cats from photos of dogs. Classification and regression are both examples of discriminative learning. And neural networks trained by backpropagation have upended everything we thought we knew about discriminative learning on large, complicated datasets. Classification accuracy on high-resolution images has gone from useless to human-level (with some caveats) in just 5-6 years. We will spare you another spiel about all the other discriminative tasks where deep neural networks do astoundingly well.

But there is more to machine learning than just solving discriminative tasks. For example, given a large dataset, without any labels, we might want to learn a model that concisely captures the characteristics of this data. Given such a model, we could sample synthetic data points that resemble the distribution of the training data. For example, given a large corpus of photographs of faces, we might want to be able to generate a new photorealistic image that looks like it might plausibly have come from the same dataset. This kind of learning is called generative modeling.

Until recently, we had no method that could synthesize novel photorealistic images. But the success of deep neural networks for discriminative learning opened up new possibilities. One big trend over the last three years has been the application of discriminative deep nets to overcome challenges in problems that we do not generally think of as supervised learning problems. Recurrent neural network language models are one example of using a discriminative network (trained to predict the next character) that once trained can act as a generative model.

In 2014, a breakthrough paper introduced Generative Adversarial Networks (GANs) (Goodfellow.Pouget-Abadie.Mirza.ea.2014), a clever new way to leverage the power of discriminative models to get good generative models. At their heart, GANs rely on the idea that a data generator is good if we cannot tell fake data apart from real data. In statistics, this is called a two-sample test - a test to answer the question whether datasets X = {x1, …, xn} and X' = {x'1, …, x'n} were drawn from the same distribution. The main difference between most statistics papers and GANs is that the latter use this idea in a constructive way. In other words, rather than just training a model to say "hey, these two datasets do not look like they came from the same distribution", they use the two-sample test to provide training signals to a generative model. This allows us to improve the data generator until it generates something that resembles the real data. At the very least, it needs to fool the classifier, even if our classifier is a state-of-the-art deep neural network.
[Figure: Generative Adversarial Networks architecture]
The figure above shows the GAN architecture. As you can see, there are two pieces in a GAN. First, we need a device (say, a deep network, but it really could be anything, such as a game rendering engine) that might be able to generate data that looks just like the real thing. If we are dealing with images, it needs to generate images. If we are dealing with speech, it needs to generate audio sequences, and so on. We call this the generator network. The second component is the discriminator network. It attempts to distinguish fake data from real data. The two networks compete with each other: the generator network attempts to fool the discriminator network, at which point the discriminator network adapts to the new fake data; this information, in turn, is used to improve the generator network, and so on.

The discriminator is a binary classifier that distinguishes whether the input x is real (from the real data) or fake (from the generator). Typically, the discriminator outputs a scalar prediction o ∈ R for an input, e.g., using a dense layer with hidden size 1, and then applies the sigmoid function to obtain the predicted probability D(x) = 1 / (1 + e^(-o)). Assume the label y is 1 for real data and 0 for fake data. We train the discriminator to minimize the cross-entropy loss, i.e.,
min_D { -y log D(x) - (1 - y) log(1 - D(x)) }
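For real data (y = 1) this reduces to -log D(x), and for fake data (y = 0) to -log(1 - D(x')); that is exactly the binary cross-entropy computed by the nn.BCELoss used later in this post. A minimal illustrative sketch (the probability values below are made up for illustration, not model outputs):

import torch
from torch import nn

bce = nn.BCELoss()
d_real = torch.tensor([0.9, 0.8])   # made-up D(x) for two real samples
d_fake = torch.tensor([0.2, 0.1])   # made-up D(x') for two fake samples
loss_real = bce(d_real, torch.ones_like(d_real))    # -log D(x),       y = 1
loss_fake = bce(d_fake, torch.zeros_like(d_fake))   # -log(1 - D(x')), y = 0
loss_D = (loss_real + loss_fake) / 2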
For the generator, it first draws some parameter z ∈ R^d from a source of randomness, e.g., a normal distribution z ~ N(0, 1). We often call z the latent variable. It then applies a function to generate x' = G(z). The goal of the generator is to fool the discriminator into classifying x' = G(z) as true data, i.e., we want D(G(z)) ≈ 1. In other words, for a given discriminator D, we update the parameters of the generator G to maximize the cross-entropy loss when y = 0, i.e.,
max_G { -(1 - y) log(1 - D(G(z))) } = max_G { -log(1 - D(G(z))) }
If the discriminator does a perfect job, then D(x') ≈ 0, so the above loss is near 0, which results in gradients that are too small to make good progress for the generator. So commonly we minimize the following loss instead:
min_G { -y log(D(G(z))) } = min_G { -log(D(G(z))) }
which is just feeding x' = G(z) into the discriminator but giving it the label y = 1.

To sum up, D and G are playing a "minimax" game with the comprehensive objective function:
min_D max_G { -E_{x ~ Data} log D(x) - E_{z ~ Noise} log(1 - D(G(z))) }
Many of the applications of GANs are in the context of images. For demonstration purposes, we are going to content ourselves with fitting a much simpler distribution first. We will illustrate what happens if we use GANs to build the world's most inefficient estimator of parameters for a Gaussian. Let's get started.

%matplotlib inline
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader
from torch import nn
import numpy as np
from torch.autograd import Variable
import torch

Generate some “real” data

Since this is going to be the world's lamest example, we simply generate data drawn from a Gaussian.

X=np.random.normal(size=(1000,2))
A=np.array([[1,2],[-0.1,0.5]])
b=np.array([1,2])
data=X.dot(A)+b

Let's see what we got. This should be a Gaussian shifted in some rather arbitrary way, with mean b and covariance matrix A^T A.

plt.figure(figsize=(3.5,2.5))
plt.scatter(X[:100,0],X[:100,1],color='red')
plt.show()
plt.figure(figsize=(3.5,2.5))
plt.scatter(data[:100,0],data[:100,1],color='blue')
plt.show()
print("The covariance matrix is\n%s" % np.dot(A.T, A))

[Scatter plots: the first 100 points of X (red) and of the transformed data (blue)]
The covariance matrix is
[[1.01 1.95]
[1.95 4.25]]
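
As a quick sanity check (this snippet is not part of the original code), the empirical mean and covariance of data should be close to b and A^T A, since data = XA + b with X drawn from a standard Gaussian:

print("sample mean:", data.mean(axis=0))                    # should be close to b = [1, 2]
print("sample covariance:\n", np.cov(data, rowvar=False))   # should be close to A.T @ A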

batch_size=8
data_iter=DataLoader(data,batch_size=batch_size)
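
For illustration (not part of the original post), one batch from this iterator is a float64 tensor of shape (8, 2); this is why X.float() is called in the update function below before a batch is fed to the discriminator:

first_batch = next(iter(data_iter))
print(first_batch.shape, first_batch.dtype)   # torch.Size([8, 2]) torch.float64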

Generator

Our generator network will be the simplest network possible - a single-layer linear model. This is since we will be driving that linear network with a Gaussian data generator. Hence, it literally only needs to learn the parameters to fake things perfectly.

class net_G(nn.Module):
    def __init__(self):
        super(net_G,self).__init__()
        self.model=nn.Sequential(
            nn.Linear(2,2),
        )
        self._initialize_weights()
    def forward(self,x):
        x=self.model(x)
        return x
    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m,nn.Linear):
                m.weight.data.normal_(0,0.02)
                m.bias.data.zero_()

Discriminator

For the discriminator we will be a bit more discriminating: we will use an MLP with 3 layers to make things a bit more interesting.

class net_D(nn.Module):
    def __init__(self):
        super(net_D,self).__init__()
        self.model=nn.Sequential(
            nn.Linear(2,5),
            nn.Tanh(),
            nn.Linear(5,3),
            nn.Tanh(),
            nn.Linear(3,1),
            nn.Sigmoid()
        )
        self._initialize_weights()
    def forward(self,x):
        x=self.model(x)
        return x
    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m,nn.Linear):
                m.weight.data.normal_(0,0.02)
                m.bias.data.zero_()
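
Similarly, a purely illustrative check of the discriminator: it maps a batch of 2-D points to one probability per point, squashed into (0, 1) by the final Sigmoid.

d = net_D()
x = torch.randn(8, 2)
print(d(x).shape)       # torch.Size([8, 1])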

Training

First we define a function to update the discriminator.

# Saved in the d2l package for later use
def update_D(X,Z,net_D,net_G,loss,trainer_D):
    batch_size=X.shape[0]
    Tensor=torch.FloatTensor
    ones=Variable(Tensor(np.ones(batch_size))).view(batch_size,1)
    zeros = Variable(Tensor(np.zeros(batch_size))).view(batch_size,1)
    real_Y=net_D(X.float())
    fake_X=net_G(Z)
    fake_Y=net_D(fake_X)
    loss_D=(loss(real_Y,ones)+loss(fake_Y,zeros))/2
    loss_D.backward()
    trainer_D.step()
    return float(loss_D.sum())

The generator is updated similarly. Here we reuse the cross-entropy loss but change the label of the fake data from 0 to 1.

# Saved in the d2l package for later use
def update_G(Z,net_D,net_G,loss,trainer_G):
    batch_size=Z.shape[0]
    Tensor=torch.FloatTensor
    ones=Variable(Tensor(np.ones((batch_size,)))).view(batch_size,1)
    fake_X=net_G(Z)
    fake_Y=net_D(fake_X)
    loss_G=loss(fake_Y,ones)
    loss_G.backward()
    trainer_G.step()
    return float(loss_G.sum())

Both the discriminator and the generator perform binary logistic regression with the cross-entropy loss. We use Adam to smooth the training process. In each iteration, we first update the discriminator and then the generator. We visualize both the losses and the generated examples.

def train(net_D,net_G,data_iter,num_epochs,lr_D,lr_G,latent_dim,data):
    loss=nn.BCELoss()
    Tensor=torch.FloatTensor
    trainer_D=torch.optim.Adam(net_D.parameters(),lr=lr_D)
    trainer_G=torch.optim.Adam(net_G.parameters(),lr=lr_G)
    plt.figure(figsize=(7,4))
    d_loss_point=[]
    g_loss_point=[]
    d_loss=0
    g_loss=0
    for epoch in range(1,num_epochs+1):
        d_loss_sum=0
        g_loss_sum=0
        batch=0
        for X in data_iter:
            batch+=1
            X=Variable(X)
            batch_size=X.shape[0]
            Z=Variable(Tensor(np.random.normal(0,1,(batch_size,latent_dim))))
            trainer_D.zero_grad()
            d_loss = update_D(X, Z, net_D, net_G, loss, trainer_D)
            d_loss_sum+=d_loss
            trainer_G.zero_grad()
            g_loss = update_G(Z, net_D, net_G, loss, trainer_G)
            g_loss_sum+=g_loss
        d_loss_point.append(d_loss_sum/batch)
        g_loss_point.append(g_loss_sum/batch)
    plt.ylabel('Loss', fontdict={'size': 14})
    plt.xlabel('epoch', fontdict={'size': 14})
    plt.xticks(range(0,num_epochs+1,3))
    plt.plot(range(1,num_epochs+1),d_loss_point,color='orange',label='discriminator')
    plt.plot(range(1,num_epochs+1),g_loss_point,color='blue',label='generator')
    plt.legend()
    plt.show()
    print(d_loss,g_loss)
    
    Z =Variable(Tensor( np.random.normal(0, 1, size=(100, latent_dim))))
    fake_X=net_G(Z).detach().numpy()
    plt.figure(figsize=(3.5,2.5))
    plt.scatter(data[:,0],data[:,1],color='blue',label='real')
    plt.scatter(fake_X[:,0],fake_X[:,1],color='orange',label='generated')
    plt.legend()
    plt.show()

Now we specify the hyper-parameters to fit the Gaussian distribution.

if __name__ == '__main__':
    lr_D,lr_G,latent_dim,num_epochs=0.05,0.005,2,20
    generator=net_G()
    discriminator=net_D()
    train(discriminator,generator,data_iter,num_epochs,lr_D,lr_G,latent_dim,data)

[Plot: discriminator and generator loss over training epochs]
0.6932446360588074 0.6927103996276855
[Scatter plot: real data (blue) vs. generated data (orange)]
Summary

  • Generative adversarial networks (GANs) are composed of two deep networks, the generator and the discriminator.
  • The generator generates images as close to the true images as possible to fool the discriminator, by maximizing the cross-entropy loss, i.e., max log(D(x')).
  • The discriminator tries to distinguish the generated images from the true images, by minimizing the cross-entropy loss, i.e., min -y log D(x) - (1 - y) log(1 - D(x)).

Origin blog.csdn.net/qq_44750620/article/details/104521253