Deep Learning in the Browser: TensorFlow.js (5) Building a Neural Network

This time we can finally get to real deep learning, beginning with a neural network.

A neural network is the foundation of deep learning. Its basic concepts include neurons, layers, and backpropagation. Covering them in detail would take five to ten articles. Simply put, a neural network imitates the way neurons in the brain work: it combines many neurons into a network structure and uses that model to classify data.


  • A neural network of this kind is a feedforward network with a multi-layer structure: an input layer, an output layer, and hidden layers in between.
  • Each layer consists of several neurons.
  • The network learns by backpropagation: the difference between the network's output and the expected value is propagated backward through the layers to adjust the weights.
  • The network can be understood as a function, output = function(input). As the network gets deeper, it can approximate very complex nonlinear functions. The cost of learning also goes up, because the number of parameters to learn grows with the number of layers and the number of neurons in each layer.
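
To make the last point concrete, here is a minimal sketch in plain JavaScript (not TensorFlow.js code) that counts the parameters of a stack of fully connected layers. The helper name countDenseParams is my own invention for illustration. The layer sizes are the 784-input, 32 and 256 hidden unit, 10-output network built later in this article, plus a deeper variant. The sketch assumes every layer has one weight per input-output pair and one bias per unit.

```javascript
// Count trainable parameters of a stack of fully connected layers.
// layerSizes[0] is the input size; the remaining entries are units per layer.
// (countDenseParams is a hypothetical helper, not part of TensorFlow.js.)
function countDenseParams(layerSizes) {
  let total = 0;
  for (let i = 1; i < layerSizes.length; i++) {
    const weights = layerSizes[i - 1] * layerSizes[i]; // one weight per connection
    const biases = layerSizes[i];                      // one bias per neuron
    total += weights + biases;
  }
  return total;
}

// 784 inputs -> 32 -> 256 -> 10 outputs
const smallNet = countDenseParams([784, 32, 256, 10]);
// Same network with two extra hidden layers of 256 units
const deeperNet = countDenseParams([784, 32, 256, 256, 256, 10]);

console.log(smallNet, deeperNet);
```

Adding two hidden layers of 256 units multiplies the parameter count several times over, which is why deeper networks are more expensive to train.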

TensorFlow.js has good support for neural networks and deep neural networks, including models (tf.model) and layers (tf.layers).

Let's take a look at how to use TensorFlow.js to build a simple neural network that recognizes handwritten digits from the MNIST data set.

Build the network

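// Build a small fully connected network for 784-pixel MNIST images.
function nn_model() {
  const model = tf.sequential();
  // First hidden layer: 32 units; inputShape is required on the first layer.
  // No activation is given, so this layer is linear.
  model.add(tf.layers.dense({
    units: 32, inputShape: [784]
  }));
  // Second hidden layer: 256 units, also linear.
  model.add(tf.layers.dense({
    units: 256
  }));
  // Output layer: one unit per digit class; softmax turns scores into probabilities.
  model.add(tf.layers.dense(
    {units: 10, kernelInitializer: 'varianceScaling', activation: 'softmax'}));
  return model;
}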

The code above builds a neural network with two hidden layers. The first hidden layer has 32 neurons and the second has 256 neurons.

  • tf.sequential builds a sequential model: the output of each layer is connected to the input of the next layer, like a stack of layers. There are no branches or skips.

  • Use model.add to add a layer to the model.

  • tf.layers.dense creates a fully connected layer. units is the number of neurons in the layer. inputShape is the shape of the input data. The first layer in the network must specify its input shape explicitly; the remaining layers infer their input shape from the previous layer.

  • The last layer produces the classifier's result, so it uses softmax as its activation function and has 10 units, one for each of the ten digit classes 0-9.
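
To show what these pieces compute, here is a minimal sketch in plain JavaScript. It is not how TensorFlow.js implements them, and the function names dense, softmax and sequential below are my own stand-ins, not the library's. The tiny weights are arbitrary values chosen so the arithmetic is easy to follow by hand.

```javascript
// A fully connected layer: output[j] = bias[j] + sum_i input[i] * kernel[i][j].
// (Hypothetical helper for illustration; kernel is indexed [input][unit].)
function dense(kernel, bias, activation = (v) => v) {
  return (input) => activation(
      bias.map((b, j) => input.reduce((sum, x, i) => sum + x * kernel[i][j], b)));
}

// Softmax turns raw scores into probabilities that sum to 1.
function softmax(scores) {
  const max = Math.max(...scores);               // subtract max for numerical stability
  const exps = scores.map((v) => Math.exp(v - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / total);
}

// A sequential model feeds each layer's output into the next layer.
function sequential(layers) {
  return (input) => layers.reduce((x, layer) => layer(x), input);
}

// Tiny network: 3 inputs -> 2 hidden units (linear) -> 3 classes (softmax).
const tinyModel = sequential([
  dense([[1, 0], [0, 1], [1, 1]], [0, 0]),
  dense([[1, 0, -1], [0, 1, -1]], [0, 0, 0], softmax),
]);

const probs = tinyModel([1, 2, 3]);
const predictedClass = probs.indexOf(Math.max(...probs)); // index of the largest probability
console.log(probs, predictedClass);
```

The predicted class is simply the index of the largest softmax output, which is how a 10-unit output layer maps to the digits 0-9.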

Initialize the network

const model = nn_model();
const LEARNING_RATE = 0.15;
const optimizer = tf.train.sgd(LEARNING_RATE);
model.compile({
  optimizer: optimizer,
  loss: 'categoricalCrossentropy',
  metrics: ['accuracy'],
});

  • Create the model, then define the learning rate and the optimizer (here, stochastic gradient descent).
  • Call model.compile to set the optimizer, the loss function, and the metrics to report.
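
As a rough illustration of what the loss and the optimizer do, here is a minimal sketch in plain JavaScript. It is a simplified version of the idea, not the TensorFlow.js implementation. The helper names and the example numbers are made up for illustration; only the learning rate 0.15 comes from the code above.

```javascript
// Categorical cross-entropy for one example: -sum_i label[i] * log(prob[i]).
// The small EPS guard against log(0) is an assumption of this sketch.
function categoricalCrossentropy(oneHotLabel, probs) {
  const EPS = 1e-7;
  return -oneHotLabel.reduce(
      (sum, y, i) => sum + y * Math.log(Math.max(probs[i], EPS)), 0);
}

// One step of plain gradient descent: w <- w - learningRate * gradient.
function sgdStep(weights, grads, learningRate) {
  return weights.map((w, i) => w - learningRate * grads[i]);
}

const label = [0, 0, 1];                                            // true class is index 2
const goodLoss = categoricalCrossentropy(label, [0.1, 0.1, 0.8]);   // confident and right
const badLoss = categoricalCrossentropy(label, [0.7, 0.2, 0.1]);    // confident and wrong

// Illustrative weights and gradients, updated with the article's learning rate.
const updated = sgdStep([0.5, -0.2], [1.0, -2.0], 0.15);
console.log(goodLoss, badLoss, updated);
```

The loss is small when the network puts high probability on the correct class and large otherwise; the optimizer nudges each weight against its gradient, scaled by the learning rate.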

Train the network

async function train() {
  const BATCH_SIZE = 16;
  const TRAIN_BATCHES = 100;

  const TEST_BATCH_SIZE = 100;
  const TEST_ITERATION_FREQUENCY = 5;

  for (let i = 0; i < TRAIN_BATCHES; i++) {
    const batch = data.nextTrainBatch(BATCH_SIZE);

    let testBatch;
    let validationData;
    // Every few batches test the accuracy of the model.
    if (i % TEST_ITERATION_FREQUENCY === 0 && i > 0) {
      testBatch = data.nextTestBatch(TEST_BATCH_SIZE);
      validationData = [
        testBatch.xs.reshape([TEST_BATCH_SIZE, 784]), testBatch.labels
      ];
    }

    // The entire dataset doesn't fit into memory so we call fit repeatedly
    // with batches.
    const history = await model.fit(
        batch.xs.reshape([BATCH_SIZE, 784]), batch.labels,
        {batchSize: BATCH_SIZE, validationData, epochs: 1});

    batch.xs.dispose();
    batch.labels.dispose();
    if (testBatch != null) {
      testBatch.xs.dispose();
      testBatch.labels.dispose();
    }
    await tf.nextFrame();
  }
}

  • The core of training is the call to model.fit(x, y, config). x is the training data and y holds the classification labels. config is optional.
  • During training we use testBatch as validation data to calculate the accuracy. The results are stored in the return value of model.fit.
  • Call dispose to release the memory held by a tensor.
  • tf.nextFrame() returns a Promise that resolves on the next animation frame. Awaiting it between batches hands control back to the browser so the page stays responsive. It is implemented as:
    static nextFrame(): Promise<void> {
      return new Promise<void>(resolve => requestAnimationFrame(() => resolve()));
    }
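
Here is a minimal sketch of reading the training results back out of the value that model.fit resolves to. I am assuming the object has a history field that maps metric names to arrays, and that the keys are loss, acc and val_acc when the model is compiled with the accuracy metric; the key names may differ between versions, so check them in your own setup. The helper name and the numbers in the stand-in object are placeholders, not real results.

```javascript
// Pull the most recent value of each metric out of a History-like object.
// (summarizeHistory is a hypothetical helper; the key names are assumptions.)
function summarizeHistory(history) {
  const last = (values) =>
      (values && values.length ? values[values.length - 1] : undefined);
  const h = history.history;
  return {
    loss: last(h.loss),
    acc: last(h.acc),
    valAcc: last(h.val_acc), // undefined when no validationData was passed
  };
}

// Stand-in object with placeholder numbers, shaped like the assumed result.
const fakeHistory = {
  history: {loss: [0.5], acc: [0.8], val_loss: [0.6], val_acc: [0.75]},
};
const summary = summarizeHistory(fakeHistory);

// Batches trained without validation data have no val_acc entry.
const noValidation = summarizeHistory({history: {loss: [0.5], acc: [0.8]}});
console.log(summary, noValidation);
```

Inside the training loop, the same helper could be called on the history returned by each model.fit call to log progress.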

     

You can try my example on CodePen.

By changing the number of layers in the model and the number of neurons in each layer, we can compare how well different configurations work.

Batch: 16 Neurons: 32+256 Accuracy: 0.84

Batch: 64 Neurons: 32+256 Accuracy: 0.92

Batch: 16 Neurons: 32+256+256+32 Accuracy: 0.75

Batch: 16 Neurons: 32+256+256+256 Accuracy: 0.11

We found that a deeper network is not necessarily better. In the last example, with four hidden layers, the training loss is high and the result is poor: an accuracy of 0.11 is close to random guessing among ten digits.

It is really hard to choose these hyperparameters in deep learning.

In these runs a larger batchSize gave better accuracy than a smaller one, but browser memory limits how large a batch of training data we can load.

More discoveries are left for you to try.

There are many types of neural networks, and we can continue to learn about them in the future.
