A Simple Autoencoder in TensorFlow

Author: chen_h
WeChat & QQ: 862251340
WeChat official account: coderpai


Generative Adversarial Networks (GANs) have recently become a very popular network architecture. They first drew attention by imitating the artistic styles of famous painters, and more recently, through DeepFake, they can seamlessly replace facial expressions in video while maintaining high-quality output.

The most important building block of a GAN is the autoencoder. An autoencoder is a neural network architecture with two parts: an encoder that takes the input and a decoder that produces the output, where the output is trained to reproduce the input data. At first this may sound confusing and useless: why train a neural network just to copy its input? What we are really doing is learning a "compressed version" of the data: the encoder compresses the input, and then we restore it by decompressing again. In jargon, this transformation finds a "latent space" representation of our data.

In this article, we will explain how to implement an autoencoder in Python and how to use it to encode and decode data. You will need a Python 3 environment with TensorFlow installed, because we will learn by following the code.
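
The snippets below assume TensorFlow 1.x (they rely on placeholders and sessions) together with NumPy; these are the imports used throughout:

# Assumed setup: Python 3 with TensorFlow 1.x and NumPy installed.
import numpy as np
import tensorflow as tf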

For an autoencoder to be considered good, it must meet the following four requirements:

  • It compresses the data, i.e. the dimension of the latent data is smaller than that of the input data;
  • It reconstructs the input well, retaining most of the original features of the data;
  • It lets us easily access the encoded data layer;
  • It lets us easily perform the decoding operation to restore the input data;


I will show you two implementations of this functionality: one using the low-level interface and one using the high-level interface. The low-level version is written directly against the TensorFlow API; the high-level version uses the Keras API. If you want to understand the underlying model logic in more detail, I suggest following the TensorFlow API implementation.

In this part we will work directly with the TensorFlow core API, which requires some prerequisite knowledge. Let's first briefly explain three basic concepts: tf.placeholder, tf.Variable and tf.Tensor.

A tf.placeholder is simply a variable that will be fed into the model but is not part of the model itself. It basically lets us tell the model in advance what type and dimensions a certain input will have. This is very similar to a variable declaration in a strongly typed language.

A tf.Variable is almost the same as a variable in other programming languages, and its declaration looks similar to that of most strongly typed languages. The network weights, for example, are tf.Variables.

A tf.Tensor is a little more complicated. In our example, we can think of it as a symbolic object that represents an operation. For example, given a placeholder X and a weight variable W, the matrix multiplication W * X is a tensor, but its concrete value is not available until actual values for X and W are supplied and the graph is run.
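
A tiny standalone illustration of the three concepts (the shapes and values here are purely for demonstration):

# Toy snippet: placeholder vs. Variable vs. Tensor.
X = tf.placeholder(tf.float32, shape=(None, 3))    # a typed slot for data fed in later
W = tf.Variable(tf.random_normal(shape=(3, 2)))    # a value the model owns and can update
Y = tf.matmul(X, W)                                # a symbolic result: no value yet

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Only now, with concrete data for X, does Y get an actual value.
    print(sess.run(Y, feed_dict={X: np.ones((4, 3), dtype=np.float32)}).shape)  # (4, 2)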

Now that the introductions are out of the way, building the network is fairly straightforward, though it may feel a bit roundabout. We will use the MNIST dataset, a dataset of 28 * 28 handwritten digit images. We define D as the input dimension, in this case 784 once the image is flattened, and d as the length of the code after compression, which we set to 128. We then get a 3-layer network with the following dimensions: D -> d -> D.
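
As a concrete way to prepare that input (the original post does not show its data-loading code, so this is just one option), MNIST can be loaded and flattened like this:

# One way to get MNIST as flat 784-dimensional vectors with pixels in [0, 1].
(x_train, _), _ = tf.keras.datasets.mnist.load_data()
X = x_train.reshape(-1, 784).astype(np.float32) / 255.0

D = 784   # input dimension
d = 128   # code dimension after compression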

We will now implement a simple autoencoder class and explain it layer by layer.
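
Before going line by line, here is a rough skeleton of the class (the method names mirror what the rest of the article builds; the outline itself is my own summary, not the original code):

# Rough outline of the autoencoder class assembled in the rest of this section.
class AutoEncoder:
    def __init__(self, D, d):
        # D: input dimension (784), d: code dimension (128)
        self.sess = None
        self.build(D, d)

    def build(self, D, d):
        # placeholder, weights, encode/decode tensors, cost and optimizer
        pass

    def fit(self, X, epochs=10, batch_size=64):
        # the mini-batch training loop shown later
        pass

    def encode(self, X):
        # input images -> latent codes Z
        pass

    def decode(self, Z):
        # latent codes Z -> reconstructions X_hat
        pass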

First, we need a placeholder for the input data:

self.X = tf.placeholder(tf.float32, shape=(None, D))

We then begin the encoding stage by defining the weights of the first layer (plus a bias) as variables:

self.W1 = tf.Variable(tf.random_normal(shape=(D,d)))
self.b1 = tf.Variable(np.zeros(d).astype(np.float32))

Note that the weights have dimensions D * d, going from the higher dimension to the lower one. Next, we create the tensor for the encoding layer: the matrix product of the input and the weights, plus the bias, all passed through the relu activation function.

self.Z = tf.nn.relu( tf.matmul(self.X, self.W1) + self.b1 )

Then we enter the decoding stage, which mirrors the encoding stage but expands from the lower dimension back to the higher one.

self.W2 = tf.Variable(tf.random_normal(shape=(d,D)))
self.b2 = tf.Variable(np.zeros(D).astype(np.float32))

Finally, we define the output tensor, the prediction itself. For simplicity we choose the sigmoid activation function, since its output always lies in the interval [0, 1], the same range as the normalized pixels of the input images.

logits = tf.matmul(self.Z, self.W2) + self.b2 
self.X_hat = tf.nn.sigmoid(logits)

That is the whole network! We only need a loss function and an optimizer, and then we can happily start training. The loss we choose is the sigmoid cross-entropy, which means we treat the problem as pixel-level binary classification; that makes sense for a black-and-white image dataset.

As for the optimizer, the one used here is a very common choice; which optimizer produces the best output for a given problem is something worth experimenting with yourself.

self.cost = tf.reduce_sum(
    tf.nn.sigmoid_cross_entropy_with_logits(
        # Expected result (a.k.a. itself for autoencoder)
        labels=self.X,
        logits=logits
    )
)
        
self.optimizer = tf.train.RMSPropOptimizer(learning_rate=0.005).minimize(self.cost)

One last bit of technical jargon: the session. The session is the object that acts as a context manager and as the connection to the backend, and it needs to be initialized:

self.init_op = tf.global_variables_initializer()
self.sess = tf.get_default_session()
if self.sess is None:
    self.sess = tf.Session()
self.sess.run(self.init_op)

Now we need to train the model. Don't worry, it is very simple in TensorFlow:

# Prepare the batches
epochs = 10
batch_size = 64
n_batches = len(X) // batch_size

for i in range(epochs):
    # Permute the input data
    X_perm = np.random.permutation(X)
    for j in range(n_batches):
        # Load the data for the current batch
        batch = X_perm[j*batch_size:(j+1)*batch_size]
        # Run the training step on this batch!
        _, costs = self.sess.run((self.optimizer, self.cost),
                                 feed_dict={self.X: batch})

The last line is the only really interesting one. It tells TensorFlow to run a training step with the current batch as the value of the placeholder X, executing the optimizer (which performs the weight updates) and evaluating the loss function.

Let's look at some example outputs of this network:

[Figure: sample outputs of the trained network]

Now everything looks fine, but so far we have only written a network that reconstructs its own input. So how do we actually use the autoencoder? We need to define two further operations, encode and decode. This is actually very simple:

def encode(self, X):
    return self.sess.run(self.Z, feed_dict={self.X: X})

Here we tell TensorFlow to compute Z, which, if you look back, is the tensor representing the encoding. The decoding step is just as simple:

def decode(self, Z):
    return self.sess.run(self.X_hat, feed_dict={self.Z: Z})

This time we explicitly feed in the encoded tensor Z, which we computed earlier with the encode function, and we tell TensorFlow to compute the predicted output X_hat.
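
Putting the two together, a typical round trip looks like this (model stands for an instance of the class above; the variable names are my own):

# Hypothetical usage: compress a few images, then reconstruct them from their codes.
Z = model.encode(X[:16])     # latent codes, shape (16, 128)
X_rec = model.decode(Z)      # reconstructions, shape (16, 784)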

As you can now see, even for a simple network the code gets quite long. Of course we could re-parameterize it and keep the weights in lists instead of as individual variables, but what happens when we need to quickly test several architectures, or search over them automatically? It becomes very complicated. Don't worry though: with a high-level API we can do all of this easily. That interface is Keras.
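
For the curious, the list-based re-parameterization mentioned above might look roughly like this (my own sketch, not code from the article):

# Sketch: build the encoder weights for arbitrary layer sizes in a loop
# instead of hand-written W1, W2, ... attributes.
layer_sizes = [784, 256, 128]   # input -> hidden -> code
weights, biases = [], []
for m, n in zip(layer_sizes[:-1], layer_sizes[1:]):
    weights.append(tf.Variable(tf.random_normal(shape=(m, n))))
    biases.append(tf.Variable(np.zeros(n, dtype=np.float32)))

Even with this trick, wiring up the forward pass, the loss and the training loop by hand stays tedious; that is exactly the bookkeeping Keras takes care of for us.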

The whole network definition, the loss, the optimizer and the training fit can be obtained in a few lines of code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

t_model = Sequential()
t_model.add(Dense(256, input_shape=(784,)))
t_model.add(Dense(128, name='bottleneck'))
t_model.add(Dense(784, activation=tf.nn.sigmoid))
t_model.compile(optimizer=tf.train.AdamOptimizer(0.001),
                loss=tf.losses.sigmoid_cross_entropy)
t_model.fit(x, x, batch_size=32, epochs=10)

And just like that, we have a trained network. Life is sweet!

But what about the whole encoding and decoding part? Yes, it looks a bit trickier here, but don't worry, it can be handled. We need a few things:

  • A session variable;
  • The input tensor, used in feed_dict to specify the input;
  • The encoding tensor, used to retrieve the encoded values and used as a feed_dict key when decoding;
  • The decoding tensor, used to retrieve the decoded values;

Enough talk, let's get to it.

We accomplish this by grabbing the tensors of interest from the layers! Note that by naming the bottleneck layer we can retrieve it easily.

# Grab the session that holds the trained weights (the Keras backend session)
session = tf.keras.backend.get_session()

# Get the input tensor
def get_input_tensor(model):
    return model.layers[0].input

# Get the bottleneck (encoding) tensor
def get_bottleneck_tensor(model):
    return model.get_layer(name='bottleneck').output

# Get the output tensor
def get_output_tensor(model):
    return model.layers[-1].output

Now, given a trained model, you can retrieve all the values you need:

t_input = get_input_tensor(t_model)
t_enc = get_bottleneck_tensor(t_model)
t_dec = get_output_tensor(t_model)
# enc will store the actual encoded values of x
enc = session.run(t_enc, feed_dict={t_input:x})
# dec will store the actual decoded values of enc
dec = session.run(t_dec, feed_dict={t_enc:enc})
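
A quick sanity check of those arrays (the plotting below is my own addition, and x is assumed to be the same flattened MNIST array the Keras model was trained on):

print(enc.shape)   # (n_samples, 128): the bottleneck codes
print(dec.shape)   # (n_samples, 784): the reconstructed pixels

# Compare an original image with its reconstruction.
import matplotlib.pyplot as plt
plt.subplot(1, 2, 1); plt.imshow(x[0].reshape(28, 28), cmap='gray'); plt.title('input')
plt.subplot(1, 2, 2); plt.imshow(dec[0].reshape(28, 28), cmap='gray'); plt.title('reconstruction')
plt.show()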

For the full version of the code, please click here.

For the original article, please click here.

Source: blog.csdn.net/CoderPai/article/details/91499559