1. How TensorFlow runs
A TensorFlow program runs in two main steps: building the model and training it.
In the model-building phase, we construct a graph (Graph) that describes the model. A graph can be thought of as a flowchart: data is input -> intermediate processing -> output, as in the figure below.
Note that no actual computation happens at this point. Once the model is built, we move on to the training step; only then is real data fed in and are gradients and other operations computed. So how do we build the abstract model? This requires a few TensorFlow concepts: Tensor, Variable, and placeholder; the training phase additionally needs Session. These concepts are explained first.
1.1 Concepts
1.1.1 Tensor
Tensor means tensor, but as I understand it, it really refers to a matrix (more generally, a multi-dimensional array); it is how TensorFlow represents matrices. There are many ways to create a Tensor; the simplest is, for example:
    import tensorflow as tf  # omitted in all code below; assume it has already been imported
    a = tf.zeros(shape=[1, 2])
Note, however, that before training starts all data are abstract. At this point a only states that it should be a 1 * 2 zero matrix; no value has been assigned and no memory has been allocated. Printing it now gives the following:
    print(a)
    #===> Tensor("zeros:0", shape=(1, 2), dtype=float32)
Only after the training process starts can we obtain the actual value of a:
    sess = tf.InteractiveSession()
    print(sess.run(a))
    #===> [[ 0. 0.]]
The concept of a Session is involved here; it is covered later.
1.1.2 Variable
As the name suggests, a Variable is a variable. It is generally used to represent the parameters in the graph, including matrices and vectors. For example, the model in the figure above corresponds to an expression (ReLU is an activation function; see here for details) whose two parameters are the ones to be trained, so both can be represented by Variables. A Variable can be initialized in many ways, which are not covered here; simply passing in a Tensor also works:
    W = tf.Variable(tf.zeros(shape=[1, 2]))
Note that W is likewise an abstract concept. Unlike a Tensor, however, a Variable must be initialized before it has a concrete value.
    tensor = tf.zeros(shape=[1, 2])
    variable = tf.Variable(tensor)
    sess = tf.InteractiveSession()
    # print(sess.run(variable))  # raises an error: the variable is not initialized yet
    # initialize the variable (TensorFlow 0.12 and later: tf.global_variables_initializer())
    sess.run(tf.initialize_all_variables())
    print(sess.run(variable))
    #===> [[ 0. 0.]]
1.1.3 placeholder
A placeholder is, as the name says, a placeholder, and it is also an abstract concept. It describes the format of the input and output data. It tells the system: there is a value / vector / matrix here; I cannot give you the concrete numbers yet, but I will supply them when the model is actually run! Examples are x and y in the formula above. Because there are no concrete values, only the data type and the shape need to be specified:
    x = tf.placeholder(tf.float32, [1, 5], name='input')
    y = tf.placeholder(tf.float32, [None, 5], name='input')
Of the two forms above, the first, x, says that the input is a [1,5] row vector.
The second form says that the input is a [?, 5] matrix. When would that be used? When several [1,5] rows need to be fed in at once. For example, a batch of 10 rows can be expressed as a [10,5] matrix, and a batch of 5 rows is a [5,5] matrix. TensorFlow handles the batch processing automatically.
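The effect of None in a placeholder shape can be sketched in plain Python. The helper shape_matches below is a hypothetical name written only for this illustration; it is not a TensorFlow function, and it only mimics the idea that None accepts any size along that dimension.

```python
def shape_matches(declared, actual):
    """Return True if `actual` fits `declared`, where None accepts any size."""
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))


fixed = [1, 5]        # like the first placeholder: exactly one row of 5 values
batched = [None, 5]   # like the second placeholder: any number of rows of 5 values

print(shape_matches(fixed, (1, 5)))      # a single row fits the fixed shape
print(shape_matches(fixed, (10, 5)))     # a batch of 10 does not
print(shape_matches(batched, (10, 5)))   # any batch size fits [None, 5]
print(shape_matches(batched, (10, 4)))   # but the row width must still be 5
```

Declaring the batch dimension as None is what lets the same model be fed 100 rows per training step and the whole test set at evaluation time.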
1.1.4 Session
A Session is, as the name says, a session. My understanding is that the session is what executes the abstract model. Why did the earlier code already use a session several times? Because the model is abstract; only once it is executed can concrete values be obtained. Likewise, training the parameters, making predictions, and even querying the current value of a variable all go through the session, as the following sections show.
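The split between describing a model and executing it can be illustrated without TensorFlow. The following is a toy sketch in plain Python: the Node class and the placeholder, constant, add, and mul helpers are hypothetical names invented for this example, not TensorFlow APIs, and the real library is far more elaborate. The sketch only shows the idea that building the graph computes nothing, and that values appear when the graph is run with data.

```python
# Toy illustration only: these names are hypothetical and are not TensorFlow APIs.

class Node:
    """A step in the graph: remembers an operation and its inputs, computes nothing yet."""

    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = inputs

    def run(self, feed):
        # Evaluate the input nodes first, then apply this node's operation.
        values = [node.run(feed) for node in self.inputs]
        return self.op(feed, *values)


def placeholder(name):
    # The value is looked up in `feed` only when the graph is run.
    return Node(lambda feed: feed[name])


def constant(value):
    return Node(lambda feed: value)


def add(a, b):
    return Node(lambda feed, u, v: u + v, (a, b))


def mul(a, b):
    return Node(lambda feed, u, v: u * v, (a, b))


# Construction phase: only the structure y = 3x + 1 is described.
x = placeholder("x")
y = add(mul(x, constant(3.0)), constant(1.0))

# Run phase: real data is supplied and the computation actually happens.
result = y.run({"x": 2.0})
print(result)
```

Here y.run plays roughly the role that sess.run plays in the TensorFlow code, and the dictionary passed to it corresponds to feed_dict.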
1.2 Model Construction
Here we use the classification code from the official MNIST tutorial. Judging from that code, the model's formula can be written as a = softmax(xW + b).
The code describing the model is then:
    # Build the abstract model
    x = tf.placeholder(tf.float32, [None, 784])  # input placeholder
    y = tf.placeholder(tf.float32, [None, 10])   # output placeholder (expected output)
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    a = tf.nn.softmax(tf.matmul(x, W) + b)       # a is the model's actual output

    # Define the loss function and the training method
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(a), reduction_indices=[1]))  # cross-entropy loss function
    optimizer = tf.train.GradientDescentOptimizer(0.5)  # gradient descent, learning rate 0.5
    train = optimizer.minimize(cross_entropy)           # training objective: minimize the loss function
As you can see, all the elements of the model (graph structure, loss function, descent method, and training objective) are now contained in train. We can call train the training model. We also need a test model:
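To make the loss function concrete, softmax and cross-entropy can be worked through by hand for a single example. This is a minimal plain-Python sketch; the three-class logits and the one-hot label are made-up numbers, and the function names are chosen for this illustration only.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, a_pred):
    # -sum(y * log(a)) for one example; y_true is a one-hot list.
    return -sum(t * math.log(p) for t, p in zip(y_true, a_pred))


logits = [2.0, 1.0, 0.1]   # made-up scores for a 3-class problem
a = softmax(logits)        # probabilities that sum to 1
y = [1.0, 0.0, 0.0]        # one-hot label: the true class is class 0
loss = cross_entropy(y, a)

print(a)
print(loss)
```

Because y is one-hot, the sum reduces to -log of the probability assigned to the true class. In the TensorFlow code, tf.reduce_sum computes this per example and tf.reduce_mean then averages it over the batch.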
    correct_prediction = tf.equal(tf.argmax(a, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
In the two lines above, tf.argmax returns the position of the maximum value (that is, the predicted class and the actual class), and tf.equal checks whether the two agree, returning true if they do and false if they do not, which yields a boolean array. tf.cast converts the boolean array into a float32 array, and averaging it gives the classification accuracy (quite clever, isn't it?).
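The same argmax / equal / cast / mean sequence can be traced on a few rows in plain Python. The predictions and labels below are made-up values for a three-class problem, and argmax is a small helper written for this sketch.

```python
def argmax(row):
    # Index of the largest entry in a list.
    return max(range(len(row)), key=lambda i: row[i])


# Made-up model outputs (rows of class probabilities) and one-hot labels.
predictions = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
labels = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
]

# Step 1: compare predicted and true class per row -> boolean list.
correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
# Step 2: cast booleans to floats (True -> 1.0, False -> 0.0).
as_floats = [float(c) for c in correct]
# Step 3: the mean of the floats is the fraction of correct predictions.
accuracy = sum(as_floats) / len(as_floats)

print(correct)
print(accuracy)
```

Three of the four rows are classified correctly, so the mean of the 0/1 values is the accuracy.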
1.3 Actual training
With the training model and the test model in place, we can begin the actual training:
    sess = tf.InteractiveSession()        # create an interactive session
    tf.initialize_all_variables().run()   # initialize all variables
    for i in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)  # fetch a batch of 100 examples
        train.run({x: batch_xs, y: batch_ys})             # feed inputs and expected outputs to the training model
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))
As you can see, once the complete model is built, we only need to supply the inputs and the expected outputs, and the model can train and test itself. The intermediate work, such as differentiation, gradient computation, and back-propagation, is handled automatically by TensorFlow.
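To give a feel for what is being automated, here is a hand-written gradient-descent loop for the same kind of model, a = softmax(xW + b) with cross-entropy loss, in plain Python. It is a sketch under stated assumptions: the four-row dataset is made up, the function names are chosen for this example, and the code relies on the standard result that for softmax with cross-entropy the gradient with respect to the logits is (a - y). It is not how TensorFlow implements training internally.

```python
import math


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(x, W, b):
    # a = softmax(xW + b) for one input row x.
    z = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j] for j in range(len(b))]
    return softmax(z)


def mean_loss(xs, ys, W, b):
    # Cross-entropy averaged over all examples.
    total = 0.0
    for x, y in zip(xs, ys):
        a = forward(x, W, b)
        total += -sum(t * math.log(p) for t, p in zip(y, a))
    return total / len(xs)


def train_step(xs, ys, W, b, lr):
    # Accumulate the batch-averaged gradients, then update W and b in place.
    n = len(xs)
    gW = [[0.0] * len(b) for _ in W]
    gb = [0.0] * len(b)
    for x, y in zip(xs, ys):
        a = forward(x, W, b)
        for j in range(len(b)):
            d = (a[j] - y[j]) / n          # gradient w.r.t. logit j
            gb[j] += d
            for i in range(len(x)):
                gW[i][j] += x[i] * d
    for i in range(len(W)):
        for j in range(len(b)):
            W[i][j] -= lr * gW[i][j]
    for j in range(len(b)):
        b[j] -= lr * gb[j]


# Made-up data: 2 features, 2 classes, one-hot labels.
xs = [[0.0, 1.0], [1.0, 0.0], [0.2, 0.9], [0.9, 0.1]]
ys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
W = [[0.0, 0.0], [0.0, 0.0]]   # zero-initialized, like tf.zeros in the text
b = [0.0, 0.0]

loss_before = mean_loss(xs, ys, W, b)
for _ in range(100):
    train_step(xs, ys, W, b, lr=0.5)   # same learning rate as in the text
loss_after = mean_loss(xs, ys, W, b)

print(loss_before, loss_after)
```

With zero-initialized parameters every class starts with probability 0.5, so the initial loss is log 2; repeated steps then lower it. In the TensorFlow version, optimizer.minimize(cross_entropy) derives the equivalent update from the graph, so none of the gradient code has to be written by hand.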
2. The actual code
In practice, the code also needs to load the data:
    """A very simple MNIST classifier.

    See extensive documentation at
    http://tensorflow.org/tutorials/mnist/beginners/index.md
    """
    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    # Import data
    from tensorflow.examples.tutorials.mnist import input_data

    import tensorflow as tf

    flags = tf.app.flags
    FLAGS = flags.FLAGS
    flags.DEFINE_string('data_dir', '/tmp/data/', 'Directory for storing data')  # the data goes in the /tmp/data folder

    mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)  # read the data set

    # Build the abstract model
    x = tf.placeholder(tf.float32, [None, 784])  # placeholder
    y = tf.placeholder(tf.float32, [None, 10])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    a = tf.nn.softmax(tf.matmul(x, W) + b)

    # Define the loss function and the training method
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(a), reduction_indices=[1]))  # cross-entropy loss function
    optimizer = tf.train.GradientDescentOptimizer(0.5)  # gradient descent, learning rate 0.5
    train = optimizer.minimize(cross_entropy)           # training objective: minimize the loss function

    # Test trained model
    correct_prediction = tf.equal(tf.argmax(a, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # Train
    sess = tf.InteractiveSession()  # create an interactive session
    tf.initialize_all_variables().run()
    for i in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        train.run({x: batch_xs, y: batch_ys})
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))
The resulting classification accuracy is about 91%.