The TensorFlow version of Hello World

Copyright notice: This is the blogger's original article, released under the CC 4.0 BY-SA license. When reproducing it, please include the original source link and this notice.
Original link: https://blog.csdn.net/hekaiyou/article/details/88877042

Here is the introductory code provided on the official TensorFlow website. For a machine-learning novice it is quite hard, so the only way to understand it is to go through it line by line.

import tensorflow as tf

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(
  optimizer='adam',
  loss='sparse_categorical_crossentropy',
  metrics=['accuracy'],
)

model.fit(
  x_train,
  y_train,
  epochs=5,
)

model.evaluate(x_test, y_test)

The first step imports the tensorflow library. This is routine and needs no explanation.

import tensorflow as tf

The next line is easier to understand piece by piece:

  • tf.keras: TensorFlow's implementation of the Keras API specification, used to build TensorFlow programs. So what is the Keras API? It is a high-level API for building and training models, with first-class support for TensorFlow-specific functionality.
  • datasets: the tf.keras.datasets module, which bundles a few small ready-made datasets, MNIST among them. It should not be confused with the Dataset API (tf.data), which builds a dataset from data in memory or from files on disk, applies a series of transformations to it, and hands the result to other APIs, and which scales to large datasets or multi-device training.
  • mnist: the MNIST dataset, which comes from the National Institute of Standards and Technology. The training set consists of digits handwritten by 250 different people, 50% of them high school students and 50% staff of the Census Bureau. The test set is handwritten digit data in the same proportions.

Put together, the line below says that we want the MNIST image dataset.

mnist = tf.keras.datasets.mnist

The downloaded data is not stored in a standard image format, so the code below calls the load_data method that tf.keras ships with the MNIST dataset to download the data (if necessary) and unpack it.

The image data is unpacked into a 3-dimensional array indexed as [image index, row, column], where each entry is the intensity of one pixel, an integer in the range [0, 255]. The "image index" is the number of the image within the dataset, running from 0 up to the dataset size minus one. The row and column locate the pixel within the image.

In other words, the load_data method splits the MNIST dataset into a training set and a test set. x_train holds the training images, y_train the training labels, x_test the test images, and y_test the test labels.

What does an image look like? Roughly like the picture below. Each image is 28x28 pixels and can be represented by an array of numbers; unrolling that array gives a vector of length 28x28 = 784. Each image in the MNIST dataset is therefore a point in a 784-dimensional vector space. That is quite complex: 784 dimensions, after all.

1503285601200049.png

(x_train, y_train),(x_test, y_test) = mnist.load_data()

To recap the points made above:

  • x_train holds the training images and x_test holds the test images.
  • Each image in the MNIST dataset is a point in a 784-dimensional vector space.
  • Each coordinate of that point is one pixel, with a value in the range [0, 255].

So the meaning of the following code is clear: it divides the pixel values of the images by 255 to normalize them to decimals between 0 and 1. (MNIST images are grayscale, so there is one value per pixel rather than three RGB channels.) Normalization maps data to decimals in (0, 1) or (-1, 1) and is done mainly for the convenience of data processing: once the data is mapped to the range 0 to 1, it can be processed more conveniently and quickly.

There is also an explanation on the Internet that sounds impressive even if it is hard to follow: normalization turns a dimensional expression into a dimensionless one, so that indicators with different units or magnitudes can be compared and weighted. Normalization is a way of simplifying calculation: an expression that has a dimension is transformed into a dimensionless expression, a pure number.

x_train, x_test = x_train / 255.0, x_test / 255.0
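To see what this division does without downloading anything, here is a minimal NumPy sketch. The random `images` array is a hypothetical stand-in for a tiny batch of MNIST images, assuming the same dtype (unsigned 8-bit integers) and value range as the real data.

```python
import numpy as np

# Hypothetical stand-in for two 28x28 MNIST images: unsigned 8-bit pixels
# with values in [0, 255].
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)

# Dividing by the float 255.0 promotes the integers to floating point
# and maps every pixel into the closed interval [0, 1].
scaled = images / 255.0

print(images.dtype, images.min(), images.max())
print(scaled.dtype, scaled.min(), scaled.max())
```

The shape is unchanged; only the dtype and the value range differ.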

As mentioned above, tf.keras is TensorFlow's implementation of the Keras API specification, so tf.keras can run any Keras-compatible code. The code below is Keras code.

The model is the main data structure in Keras. It defines a complete graph, and layers can be added to an existing graph as needed. In Keras, a model is built by combining layers; a model is usually a graph of layers. The most common type of model is a stack of layers: the tf.keras.Sequential model.

The Keras documentation writes tf.keras.Sequential and the TensorFlow documentation writes tf.keras.models.Sequential. There is no difference; either spelling has the same effect. The code below builds a simple fully connected network, that is, a multilayer perceptron, also called a sequential (Sequential) model.

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

The code above uses four functions. Each is explained in turn below.

tf.keras.layers.Flatten(input_shape=(28, 28)),

The tf.keras.layers.Flatten layer flattens the input tensor while preserving the batch axis (axis 0). An axis is a property of arrays with more than one dimension. Two-dimensional data has two axes: axis 0 runs vertically downward across the rows, and axis 1 runs horizontally across the columns. An operation performed along axis 0 therefore works down each column, and one performed along axis 1 works across each row. The picture below illustrates this:

1252882-20181130160009588-1393126436.png
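The two axes can also be seen in a few lines of NumPy. This is only an illustrative sketch; the array `a` and the choice of `sum` as the operation are assumptions made for the example and do not come from the original code.

```python
import numpy as np

# A small 2-D array: 2 rows, 3 columns.
a = np.array([[1, 2, 3],
              [4, 5, 6]])

# axis=0 runs down the rows, so summing along it collapses the rows
# and leaves one value per column: 5, 7, 9.
col_sums = a.sum(axis=0)

# axis=1 runs across the columns, so summing along it collapses the columns
# and leaves one value per row: 6, 15.
row_sums = a.sum(axis=1)

print(col_sums)
print(row_sums)
```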

Looking back: as said above, each image is 28x28 pixels. What this line of code does is convert each image from a 2-D array (28x28 pixels) into a 1-D array of 28x28 = 784 pixels. The basic purpose is to flatten the image data. Put simply, this is a real-life "dimensionality reduction strike".
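The effect of flattening can be imitated with a NumPy reshape. This is a sketch of the idea rather than the Keras implementation, and the all-zero `batch` array is a hypothetical placeholder for real images.

```python
import numpy as np

# Hypothetical batch of 3 blank 28x28 images (zeros stand in for real pixels).
batch = np.zeros((3, 28, 28))

# Keep axis 0 (the image index) and merge the remaining axes into one,
# which is the effect flattening has on each sample.
flat = batch.reshape(batch.shape[0], -1)

print(batch.shape)  # (3, 28, 28)
print(flat.shape)   # (3, 784)
```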

The next line of code instantiates a fully connected (tf.keras.layers.Dense) layer through its constructor arguments. A Dense layer is a densely connected, or fully connected, neural layer. After the pixels are flattened, the network consists of a sequence of two tf.keras.layers.Dense layers.

The first Dense layer has 512 neurons, or nodes. The layer implements outputs = activation(inputs * kernel + bias), where * stands for the matrix product, activation is the activation function passed in through the activation argument (applied when it is not None), kernel is the weight matrix created by the layer, and bias is the bias vector created by the layer (when use_bias is True).

In summary: this is a fully connected layer with 512 nodes, computed with the rectified linear unit function tf.nn.relu.

tf.keras.layers.Dense(512, activation=tf.nn.relu),
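The formula outputs = activation(inputs * kernel + bias) can be written out directly in NumPy. The helper names `relu` and `dense`, and the random initial weights, are assumptions made for this sketch; Keras creates and initializes its own weights.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: negative values become 0, others pass through.
    return np.maximum(x, 0.0)

def dense(inputs, kernel, bias, activation=None):
    # Matrix product of the inputs with the weight matrix, plus the bias vector.
    outputs = inputs @ kernel + bias
    # Apply the activation only when one is given.
    return outputs if activation is None else activation(outputs)

rng = np.random.default_rng(0)
inputs = rng.random((4, 784))                       # 4 flattened images
kernel = rng.normal(scale=0.05, size=(784, 512))    # hypothetical weights
bias = np.zeros(512)                                # hypothetical bias

hidden = dense(inputs, kernel, bias, activation=relu)
print(hidden.shape)  # (4, 512)
```

Each of the 4 inputs is mapped from 784 values to 512 values, none of them negative because of the ReLU.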

The tf.keras.layers.Dropout layer applies dropout to its input. During training, at each update, dropout randomly sets a fraction rate of the input units to 0, which helps to prevent overfitting. The units that are kept are scaled by 1 / (1 - rate) so that their sum is unchanged between training time and inference time.

In other words, it sets the proportion of neurons to disconnect in order to prevent overfitting. Overfitting is defined as follows: when a hypothesis is made overly strict in order to be consistent with the training data, it is said to overfit. The practical aim is to make the model more tolerant, that is, to generalize better to data it has not seen.

tf.keras.layers.Dropout(0.2),
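A rough NumPy sketch of the behaviour just described, assuming the usual "inverted dropout" formulation; the function name `dropout` and its `training` flag are hypothetical and chosen for this example only.

```python
import numpy as np

def dropout(inputs, rate, rng, training=True):
    # At inference time the input passes through unchanged.
    if not training or rate == 0.0:
        return inputs
    # Each unit is kept with probability (1 - rate) and zeroed otherwise.
    keep_mask = rng.random(inputs.shape) >= rate
    # Kept units are scaled by 1 / (1 - rate) so the expected sum is unchanged.
    return inputs * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones((1000, 512))
y = dropout(x, rate=0.2, rng=rng)

print((y == 0).mean())  # fraction of dropped units, close to 0.2
print(y.mean())         # overall mean, close to 1.0
```

With rate=0.2 every surviving unit becomes 1.25, which compensates on average for the 20% of units set to 0.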

The tf.nn.softmax function computes the softmax activation. In mathematics, especially in probability theory and related fields, the softmax function, also known as the normalized exponential function, is a generalization of the logistic function.

The line below is the second and last layer, a softmax layer with 10 nodes. It returns an array of 10 probability scores that sum to 1; each node holds a score indicating the probability that the current image belongs to one of the 10 digit classes. In short, it is a fully connected layer with 10 nodes.

tf.keras.layers.Dense(10, activation=tf.nn.softmax)
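Softmax itself is short enough to write by hand. The sketch below is a plain NumPy version, not the TensorFlow implementation, and the `logits` values are made up for illustration.

```python
import numpy as np

def softmax(logits):
    # Subtracting the row maximum does not change the result
    # but keeps np.exp from overflowing.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    # Normalize so that each row sums to 1.
    return exps / exps.sum(axis=-1, keepdims=True)

# Made-up raw scores for one image, one score per digit class 0..9.
logits = np.array([[2.0, 1.0, 0.1, 0.0, -1.0, 0.5, 0.3, 3.0, -2.0, 1.5]])
probs = softmax(logits)

print(probs.sum(axis=-1))     # sums to 1 (up to rounding)
print(probs.argmax(axis=-1))  # index of the most probable class: 7
```

The largest raw score (3.0, at index 7) becomes the largest probability, so the predicted digit would be 7.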

Now look at the model as a whole. First, each image is reduced from a 2-D array to a 1-D array, and the 1-D arrays are stored together in a 2-D array with N rows and 784 (28x28) columns, so a batch of images goes from a 3-D array to a 2-D array.

Next, two fully connected (tf.keras.layers.Dense) layers are created: a layer of 512 nodes, or neurons, computed with the rectified linear unit, and a layer of 10 nodes evaluated with the normalized exponential (tf.nn.softmax) function.

Finally, between the two fully connected (tf.keras.layers.Dense) layers, a Dropout layer guards against overfitting. The process looks like the following picture:

softmax-layer-generic.png
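As a small worked calculation, the number of trainable weights in this model can be counted by hand, on the understanding that a Dense layer has one weight per input-unit pair plus one bias per unit, and that Flatten and Dropout have no weights. The helper `dense_params` is a hypothetical name for this sketch.

```python
def dense_params(n_inputs, n_units):
    # One weight for every (input, unit) pair, plus one bias per unit.
    return n_inputs * n_units + n_units

flatten_out = 28 * 28                          # 784 values per image
hidden_params = dense_params(flatten_out, 512)  # first Dense layer
output_params = dense_params(512, 10)           # second Dense layer
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 401920 5130 407050
```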

After building the model, we call the compile method to configure the model's learning process. The method takes the following parameters:

  • optimizer: specifies the training procedure. Pass it an optimizer instance from the tf.train module (tf.keras.optimizers in TensorFlow 2.x), or the name of an optimizer as a string.
  • loss: the function to minimize during optimization, here a cross-entropy loss. The loss function is specified by name or by passing a callable from the tf.keras.losses module.
  • metrics: a list of metrics used to monitor training. They are string names or callables from the tf.keras.metrics module.

model.compile(
  optimizer='adam',
  loss='sparse_categorical_crossentropy',
  metrics=['accuracy'],
)
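What the loss and the metric measure can be sketched in NumPy. This is an illustration under simplifying assumptions, not the Keras implementation (which, for example, also guards against log(0)); the function names and the tiny 3-class example are made up. "Sparse" here means that the labels are plain integers rather than one-hot vectors.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs):
    # Pick, for each sample, the probability assigned to its true class.
    picked = probs[np.arange(len(y_true)), y_true]
    # The loss is the mean negative log of those probabilities.
    return -np.log(picked).mean()

def accuracy(y_true, probs):
    # A prediction is correct when the most probable class equals the label.
    return (probs.argmax(axis=1) == y_true).mean()

# Made-up predictions for 3 samples over 3 classes, with integer labels.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
y_true = np.array([0, 1, 2])

loss = sparse_categorical_crossentropy(y_true, probs)
acc = accuracy(y_true, probs)
print(loss, acc)
```

The third sample is misclassified (class 1 is predicted, the label is 2), so the accuracy is 2/3 and that sample contributes the largest term to the loss.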

Because the dataset is small, we train and evaluate the model with in-memory NumPy arrays. The fit method fits the model to the training data; here it trains for five epochs.

Here x_train is the input data and y_train is the array of labels. The epochs parameter is the number of training epochs; one epoch is one pass over the entire input data, carried out in smaller batches.

model.fit(
  x_train,
  y_train,
  epochs=5,
)
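The relationship between epochs and batches can be worked through with plain Python. The sketch assumes a batch size of 32, which is what Keras uses when fit is called without a batch_size argument, as it is here; the generator `iterate_minibatches` is a hypothetical helper, not part of Keras.

```python
import math

def iterate_minibatches(num_samples, batch_size):
    # Yield (start, stop) index pairs that cover all samples once.
    for start in range(0, num_samples, batch_size):
        yield start, min(start + batch_size, num_samples)

num_samples = 60000   # size of the MNIST training set
batch_size = 32       # assumed default, since fit() is not given batch_size
epochs = 5

# One epoch is one full pass over the data, split into batches.
steps_per_epoch = sum(1 for _ in iterate_minibatches(num_samples, batch_size))
total_steps = steps_per_epoch * epochs

print(steps_per_epoch)  # 1875
print(total_steps)      # 9375
```

Under that assumption, five epochs amount to 9375 weight updates.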

The tf.keras.Model.evaluate method evaluates the loss and the metrics of the model in inference mode on the data provided, which may be NumPy data. We pass in x_test and y_test to test the model after training.

model.evaluate(x_test, y_test)

Finally, we can run the code. When it finishes, the console shows the following output:

Epoch 1/5
60000/60000 [==============================] - 8s 131us/sample - loss: 0.2183 - acc: 0.9355
Epoch 2/5
60000/60000 [==============================] - 7s 119us/sample - loss: 0.0961 - acc: 0.9704
Epoch 3/5
60000/60000 [==============================] - 6s 102us/sample - loss: 0.0691 - acc: 0.9784
Epoch 4/5
60000/60000 [==============================] - 6s 97us/sample - loss: 0.0530 - acc: 0.9826
Epoch 5/5
60000/60000 [==============================] - 6s 97us/sample - loss: 0.0432 - acc: 0.9862
10000/10000 [==============================] - 0s 36us/sample - loss: 0.0801 - acc: 0.9766

# Epoch 1/5
# 60000/60000 [==============================] - 6 s 104 µs/sample - loss: 0.2170 - accuracy: 0.9354
# Epoch 2/5
# 60000/60000 [==============================] - 6 s 104 µs/sample - loss: 0.0959 - accuracy: 0.9701
# Epoch 3/5
# 60000/60000 [==============================] - 7 s 121 µs/sample - loss: 0.0668 - accuracy: 0.9790
# Epoch 4/5
# 60000/60000 [==============================] - 7 s 121 µs/sample - loss: 0.0527 - accuracy: 0.9830
# Epoch 5/5
# 60000/60000 [==============================] - 7 s 110 µs/sample - loss: 0.0424 - accuracy: 0.9860
# 10000/10000 [==============================] - 0 s 40 µs/sample - loss: 0.0718 - accuracy: 0.9798
