To use TensorFlow, you need to understand how TensorFlow:
- Uses graphs to represent computing tasks.
- Executes graphs in a context called a session (Session).
- Uses tensors to represent data.
- Maintains state through variables (Variable).
- Uses feeds and fetches to assign values to, or get data from, arbitrary operations.
Overview
TensorFlow is a programming system that uses graphs to represent computing tasks. The nodes in the graph are called ops (short for operations). An op takes zero or more tensors, performs a computation, and produces zero or more tensors. Each tensor is a typed multidimensional array. For example, you can represent a batch of images as a four-dimensional array of floating-point numbers with dimensions [batch, height, width, channels].
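As a rough illustration of the [batch, height, width, channels] layout, the sketch below builds a small batch of images from plain nested Python lists. It does not use TensorFlow, and the sizes are arbitrary values chosen for the example.

```python
# Hypothetical sizes, chosen only for illustration.
BATCH, HEIGHT, WIDTH, CHANNELS = 2, 4, 3, 3

# A batch of all-zero images as a four-dimensional nested list of floats,
# indexed as images[batch][row][column][channel].
images = [[[[0.0 for _ in range(CHANNELS)]
            for _ in range(WIDTH)]
           for _ in range(HEIGHT)]
          for _ in range(BATCH)]

# Set channel 0 of the pixel at row 1, column 2 of image 0.
images[0][1][2][0] = 1.0

# Read the dimensions back from the nesting levels.
dims = [len(images), len(images[0]), len(images[0][0]), len(images[0][0][0])]
print(dims)  # [2, 4, 3, 3]
```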
A TensorFlow graph describes a computation. To compute anything, the graph must be launched in a session. The session distributes the ops of the graph to devices such as CPUs or GPUs and provides methods to execute them. These methods return the tensors produced by the ops: in Python, the returned tensor is a numpy ndarray object; in C and C++, it is a tensorflow::Tensor instance.
Computational graph
TensorFlow programs are usually organized into a build phase and an execution phase. In the build phase, the ops and their execution steps are described as a graph; in the execution phase, a session executes the ops in the graph.
For example, it is common to create a graph that represents and trains a neural network during the build phase, and then to run the training ops in the graph repeatedly during the execution phase.
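The split between the two phases can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, not TensorFlow's implementation: Node, constant, add, multiply, and run are hypothetical names invented for this example. Building the graph only records which operation feeds which; values are computed when run() is called.

```python
class Node:
    """A toy graph node: an operation plus the nodes that feed it."""

    def __init__(self, func, inputs=()):
        self.func = func
        self.inputs = tuple(inputs)


def constant(value):
    # A source node: no inputs, always produces the same value.
    return Node(lambda: value)


def add(a, b):
    return Node(lambda x, y: x + y, (a, b))


def multiply(a, b):
    return Node(lambda x, y: x * y, (a, b))


def run(node):
    # Execution phase: evaluate the inputs first, then the node itself.
    args = [run(inp) for inp in node.inputs]
    return node.func(*args)


# Build phase: this only records structure; nothing is computed yet.
total = add(constant(2.0), constant(3.0))
scaled = multiply(total, constant(4.0))

# Execution phase: the same graph can be run as many times as needed.
print(run(scaled))  # 20.0
```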
TensorFlow supports the C, C++, and Python programming languages. Currently the Python library is easier to use: it provides a large number of helper functions that simplify building graphs, and these are not yet supported by the C and C++ libraries.
The session libraries for the three languages are equivalent.
Build the graph
The first step in building a graph is to create source ops. A source op, such as a constant, does not require any input. The output of a source op is passed to other ops, which perform the computation.
In the Python library, the return value of an op constructor represents the output of the constructed op, and these return values can be passed to other op constructors as inputs.
The TensorFlow Python library has a default graph to which op constructors add nodes. This default graph is sufficient for many programs; read the Graph class documentation to learn how to manage multiple graphs.
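import tensorflow as tf
# Create a constant op that produces a 1x2 matrix. The op is added
# as a node to the default graph.
# The value returned by the constructor represents the output of the constant op.
matrix1 = tf.constant([[3., 3.]])
# Create another constant op that produces a 2x1 matrix.
matrix2 = tf.constant([[2.], [2.]])
# Create a matmul op that takes matrix1 and matrix2 as inputs.
product = tf.matmul(matrix1, matrix2)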
The default graph now has three nodes: two constant() ops and one matmul() op. To actually get the result of the matrix multiplication, you have to launch the graph in a session.
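To see how constructors might register nodes on a shared default graph, here is a toy model in plain Python. Graph, Op, DEFAULT_GRAPH, and the graph argument are assumptions made for this sketch and are not TensorFlow's actual classes.

```python
class Graph:
    """A toy graph: just an ordered list of the nodes added to it."""

    def __init__(self):
        self.nodes = []


DEFAULT_GRAPH = Graph()


class Op:
    def __init__(self, kind, inputs=(), graph=None):
        self.kind = kind
        self.inputs = tuple(inputs)
        # With no explicit graph, register on the shared default graph.
        target = graph if graph is not None else DEFAULT_GRAPH
        target.nodes.append(self)


def constant(value, graph=None):
    op = Op("constant", graph=graph)
    op.value = value
    return op


def matmul(a, b, graph=None):
    return Op("matmul", (a, b), graph)


matrix1 = constant([[3.0, 3.0]])
matrix2 = constant([[2.0], [2.0]])
product = matmul(matrix1, matrix2)
kinds = [op.kind for op in DEFAULT_GRAPH.nodes]
print(kinds)  # ['constant', 'constant', 'matmul']

# A second graph stays separate from the default one.
other = Graph()
constant(1.0, graph=other)
print(len(DEFAULT_GRAPH.nodes), len(other.nodes))  # 3 1
```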
Start the graph in a session
The graph cannot be started until the construction phase is complete. The first step in starting a graph is to create a Session object. Without any arguments, the session constructor launches the default graph.
For the full session API, read the Session class documentation.
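# Launch the default graph.
sess = tf.Session()
# Call the session's run() method to execute the matmul op, passing 'product'
# as the argument. As noted above, 'product' represents the output of the
# matmul op; passing it tells the method that we want that output back.
#
# The whole execution is automated: the session supplies all the inputs the
# ops need. Ops are typically executed concurrently.
#
# The call run(product) triggers the execution of three ops in the graph:
# two constant ops and one matmul op.
# The return value 'result' is a numpy ndarray object.
result = sess.run(product)
print(result)
# ==> [[12.]]
# Done; close the session.
sess.close()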
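The value [[12.]] can be checked by hand: a 1x2 matrix times a 2x1 matrix gives a 1x1 matrix containing 3*2 + 3*2 = 12. A plain-Python sketch of the same arithmetic follows, where matmul is a helper written for this example rather than tf.matmul:

```python
def matmul(a, b):
    # Multiply two matrices stored as lists of rows.
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]


matrix1 = [[3.0, 3.0]]      # 1x2
matrix2 = [[2.0], [2.0]]    # 2x1
print(matmul(matrix1, matrix2))  # [[12.0]]
```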
Session objects need to be closed after use to release resources. Instead of calling close() explicitly, you can use a with block, which closes the session automatically:
with tf.Session() as sess:
    result = sess.run([product])
    print(result)
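The with statement relies on Python's context manager protocol. The sketch below uses a hypothetical ToySession class, not tf.Session, to show the general pattern of a resource closing itself when the block exits.

```python
class ToySession:
    """A stand-in for a resource that must be closed after use."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with block exits, even if the block raised an exception.
        self.close()
        return False  # do not swallow exceptions


with ToySession() as sess:
    print(sess.closed)  # False: still open inside the block
print(sess.closed)      # True: closed on exit
```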
In its implementation, TensorFlow translates the graph definition into operations that are executed in a distributed manner to take full advantage of the available computing resources (such as CPUs or GPUs). Generally, you do not need to specify explicitly whether to use the CPU or a GPU; TensorFlow detects them automatically. If a GPU is detected, TensorFlow uses the first GPU found for as many operations as possible.
If the machine has more than one available GPU, the GPUs other than the first do not take part in the computation by default. To make TensorFlow use these GPUs, you must explicitly assign ops to them. A with tf.device(...) statement assigns ops to a specific CPU or GPU:
with tf.Session() as sess:
    with tf.device("/gpu:1"):
        matrix1 = tf.constant([[3., 3.]])
        matrix2 = tf.constant([[2.], [2.]])
        product = tf.matmul(matrix1, matrix2)
Devices are identified by strings. Currently supported devices include:
- /cpu:0: the CPU of the machine.
- /gpu:0: the first GPU of the machine, if any.
- /gpu:1: the second GPU of the machine, and so on.
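Device strings follow a simple type:index pattern. As an illustration only, a hypothetical helper (not part of TensorFlow, and covering just the short forms listed above) could split them like this:

```python
def parse_device(name):
    # Split a string such as "/gpu:1" into a device type and an index.
    if not name.startswith("/") or ":" not in name:
        raise ValueError("unrecognized device string: %r" % name)
    device_type, index = name[1:].split(":", 1)
    return device_type, int(index)


for name in ["/cpu:0", "/gpu:0", "/gpu:1"]:
    print(name, "->", parse_device(name))
```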
Interactive use
The Python examples in the documentation launch the graph with a Session and call the Session.run() method to execute operations.
For ease of use in interactive Python environments such as IPython, you can use the InteractiveSession class instead of Session, and the Tensor.eval() and Operation.run() methods instead of Session.run(). This avoids having to keep a variable that holds the session:
# Enter an interactive TensorFlow session.
import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])
# Initialize 'x' using the run() method of its initializer op.
x.initializer.run()
# Add a sub op that subtracts 'a' from 'x'. Run it and print the result.
sub = tf.sub(x, a)
print(sub.eval())
# ==> [-2. -1.]
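One way to picture how eval() can work without a session argument is a default-session registry. The following plain-Python sketch is an assumption about the general pattern, with hypothetical ToySession, ToyInteractiveSession, and ToyTensor classes; it is not TensorFlow's code.

```python
class ToySession:
    default = None  # hypothetical registry slot for the default session

    def run(self, tensor):
        return tensor.compute()


class ToyInteractiveSession(ToySession):
    def __init__(self):
        # Register this session as the default as soon as it is created.
        ToySession.default = self


class ToyTensor:
    def __init__(self, compute):
        self.compute = compute

    def eval(self):
        # No session argument: fall back to the registered default.
        if ToySession.default is None:
            raise RuntimeError("no default session registered")
        return ToySession.default.run(self)


ToyInteractiveSession()  # no variable is needed to hold the session
x = [1.0, 2.0]
a = [3.0, 3.0]
diff = ToyTensor(lambda: [xi - ai for xi, ai in zip(x, a)])
print(diff.eval())  # [-2.0, -1.0]
```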
Tensor
TensorFlow programs use the tensor data structure to represent all data: in a computation graph, only tensors are passed between operations. You can think of a TensorFlow tensor as an n-dimensional array or list. A tensor has a static type, a rank, and a shape.
Rank
In the TensorFlow system, the number of dimensions of a tensor is described by its rank (also called its order). The rank of a tensor is not the same concept as the rank of a matrix: tensor rank is simply the number of dimensions of the tensor. The following tensor (defined as a Python list) has rank 2:
t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
You can think of a rank 2 tensor as what we usually call a matrix, and a rank 1 tensor as a vector. For a rank 2 tensor, you can access any element with the syntax t[i, j]. For a rank 3 tensor, you address an element with t[i, j, k]:
rank | math example | python example |
---|---|---|
0 | scalar (magnitude only) | s = 483 |
1 | vector (magnitude and direction) | v = [1.1, 2.2, 3.3] |
2 | matrix (table of numbers) | m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] |
3 | rank 3 tensor | t = [[[2], [4], [6]], [[8], [9], [10]], [[11], [12], [13]]] |
n | rank n tensor | ... |
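For tensors written as nested Python lists, as in the table above, the rank can be found by counting nesting levels. A small sketch, assuming regular non-empty lists; the rank helper is written for this example:

```python
def rank(value):
    # Count nesting levels by following the first element at each level.
    # Assumes a regular (non-ragged) nested list with no empty levels.
    depth = 0
    while isinstance(value, list):
        depth += 1
        value = value[0]
    return depth


s = 483
v = [1.1, 2.2, 3.3]
m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
t = [[[2], [4], [6]], [[8], [9], [10]], [[11], [12], [13]]]

for name, value in [("s", s), ("v", v), ("m", m), ("t", t)]:
    print(name, rank(value))
```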
Shape
The TensorFlow documentation uses three notational conventions to describe the dimensionality of a tensor: rank, shape, and dimension number. The following table shows how they relate to one another:
rank | shape | dimension number | example |
---|---|---|---|
0 | [] | 0-D | A 0-D tensor, that is, a scalar. |
1 | [D0] | 1-D | A 1-D tensor with shape [5]. |
2 | [D0, D1] | 2-D | A 2-D tensor with shape [3, 4]. |
3 | [D0, D1, D2] | 3-D | A 3-D tensor with shape [1, 4, 3]. |
n | [D0, D1, ... Dn-1] | n-D | An n-D tensor with shape [D0, D1, ... Dn-1]. |
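Shape can be recovered from a nested Python list by recording the length of each nesting level. The shape helper below is a hypothetical illustration that assumes regular, non-empty lists; the rank is then the length of the shape.

```python
def shape(value):
    # Record the length of each nesting level, outermost first.
    # Assumes a regular (non-ragged) nested list with no empty levels.
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        value = value[0]
    return dims


print(shape(483))                             # []
print(shape([1, 2, 3, 4, 5]))                 # [5]
print(shape([[0] * 4 for _ in range(3)]))     # [3, 4]
print(shape([[[0] * 3 for _ in range(4)]]))   # [1, 4, 3]

# The rank equals the number of entries in the shape.
print(len(shape([[[0] * 3 for _ in range(4)]])))  # 3
```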
Data types
In addition to dimensionality, tensors have a data type. You can assign any one of the following data types to a tensor:
data type | python type | description |
---|---|---|
DT_FLOAT | tf.float32 | 32-bit floating point. |
DT_DOUBLE | tf.float64 | 64-bit floating point. |
DT_INT64 | tf.int64 | 64-bit signed integer. |
DT_INT32 | tf.int32 | 32-bit signed integer. |
DT_INT16 | tf.int16 | 16-bit signed integer. |
DT_INT8 | tf.int8 | 8-bit signed integer. |
DT_UINT8 | tf.uint8 | 8-bit unsigned integer. |
DT_STRING | tf.string | Variable-length byte array. Each element of the tensor is a byte array. |
DT_BOOL | tf.bool | Boolean. |
DT_COMPLEX64 | tf.complex64 | Complex number made of two 32-bit floating-point numbers: the real and imaginary parts. |
DT_QINT32 | tf.qint32 | 32-bit signed integer used in quantized ops. |
DT_QINT8 | tf.qint8 | 8-bit signed integer used in quantized ops. |
DT_QUINT8 | tf.quint8 | 8-bit unsigned integer used in quantized ops. |
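The bit widths in the table can be illustrated with Python's standard struct module. The mapping from type names to struct format codes below is an assumption made for this sketch, not something TensorFlow defines.

```python
import struct

# Hypothetical mapping from the type names in the table to struct format codes.
FORMATS = {
    "tf.float32": "<f",
    "tf.float64": "<d",
    "tf.int64": "<q",
    "tf.int32": "<i",
    "tf.int16": "<h",
    "tf.int8": "<b",
    "tf.uint8": "<B",
    "tf.complex64": "<ff",  # real part, then imaginary part
}

# Size of each encoding in bits.
bits = {name: 8 * struct.calcsize(fmt) for name, fmt in FORMATS.items()}
for name in sorted(bits):
    print(name, bits[name])

# The same byte reads differently as a signed or an unsigned 8-bit integer.
print(struct.unpack("<b", b"\xff")[0])  # -1
print(struct.unpack("<B", b"\xff")[0])  # 255
```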
Variables
Variables maintain state information while the graph is executed. See Variables for more details. The following example demonstrates how to implement a simple counter using a variable:
# Create a variable initialized to the scalar value 0.
state = tf.Variable(0, name="counter")
# Create an op that adds 1 to 'state'.
one = tf.constant(1)
new_value = tf.add(state, one)
update = tf.assign(state, new_value)
# After the graph is launched, variables must first be initialized by an init op.
# First add the init op to the graph.
init_op = tf.initialize_all_variables()
# Launch the graph.
with tf.Session() as sess:
    # Run the init op.
    sess.run(init_op)
    # Print the initial value of 'state'.
    print(sess.run(state))
    # Run the op that updates 'state', then print 'state'.
    for _ in range(3):
        sess.run(update)
        print(sess.run(state))
# Output:
# 0
# 1
# 2
# 3
The assign() operation in this code is part of the expression described by the graph, just like the add() operation, so it does not actually perform the assignment until run() executes the expression.
The parameters of a statistical model are usually represented as a set of variables. For example, you can store the weights of a neural network as a tensor in a variable. During training, you update this tensor by running the training graph repeatedly.
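The deferred behaviour of assign() can be mimicked in plain Python. In this sketch, ToyVariable and assign_add are hypothetical stand-ins: building the update only describes it, and the state changes only when the update is run.

```python
class ToyVariable:
    """Holds state that survives between runs."""

    def __init__(self, initial_value):
        self.initial_value = initial_value
        self.value = None  # not usable until initialized

    def initialize(self):
        self.value = self.initial_value


def assign_add(variable, amount):
    # Build step: return a callable describing the update; nothing changes yet.
    def update():
        variable.value = variable.value + amount
        return variable.value
    return update


state = ToyVariable(0)
update = assign_add(state, 1)   # describes the update, does not perform it
state.initialize()
history = [state.value]         # still 0: building 'update' changed nothing
for _ in range(3):
    update()                    # each run performs one increment
    history.append(state.value)
print(history)  # [0, 1, 2, 3]
```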
Fetch
To fetch the outputs of operations, execute the graph with a run() call on the Session object and pass in the tensors you want to retrieve. In the previous example we fetched only the single node state, but you can also fetch multiple tensors:
input1 = tf.constant(3.0)
input2 = tf.constant(4.0)
input3 = tf.constant(5.0)
intermed = tf.add(input2, input3)
mul = tf.mul(input1, intermed)
with tf.Session() as sess:
    result = sess.run([mul, intermed])
    print(result)
# Output:
# [27.0, 9.0]
When you need the values of several tensors, fetch them together in a single run of the ops, rather than fetching the tensors one by one.
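Why a single run is preferable can be seen in a toy model. The sketch below is plain Python with hypothetical Node and run helpers, and it assumes that results shared between fetches are computed once per run; it counts how many op executions each approach needs.

```python
class Node:
    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func
        self.inputs = tuple(inputs)


def run(fetches, log):
    # Evaluate every requested node in one pass, caching shared results.
    cache = {}

    def evaluate(node):
        if node not in cache:
            args = [evaluate(inp) for inp in node.inputs]
            log.append(node.name)  # record each op execution
            cache[node] = node.func(*args)
        return cache[node]

    return [evaluate(node) for node in fetches]


input1 = Node("input1", lambda: 3.0)
input2 = Node("input2", lambda: 4.0)
input3 = Node("input3", lambda: 5.0)
intermed = Node("intermed", lambda x, y: x + y, (input2, input3))
mul = Node("mul", lambda x, y: x * y, (input1, intermed))

together = []
print(run([mul, intermed], together))  # [27.0, 9.0]
print(len(together))                   # 5 op executions

separate = []
run([mul], separate)
run([intermed], separate)
print(len(separate))                   # 8 op executions
```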
Feed
The examples above introduce tensors into the computation graph by storing them in constants and variables. TensorFlow also provides a feed mechanism that can temporarily replace the tensor in any operation in the graph: a feed patches a tensor directly into that operation.
A feed temporarily replaces the output of an operation with a tensor value. You supply feed data as an argument to a run() call. The feed is only valid within the call it is passed to; when the call ends, the feed disappears. The most common use case is to designate specific operations as feed operations, and the way to mark them is to create them with tf.placeholder().
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.mul(input1, input2)
with tf.Session() as sess:
    print(sess.run([output], feed_dict={input1: [7.], input2: [2.]}))
# Output:
# [array([ 14.], dtype=float32)]
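The feed mechanism can be sketched in plain Python as a dictionary that is consulted before a node is computed. Node, placeholder, and run below are hypothetical helpers for illustration, not TensorFlow's API.

```python
class Node:
    def __init__(self, func=None, inputs=()):
        self.func = func
        self.inputs = tuple(inputs)


def placeholder():
    # A node with no way to compute itself: its value must be fed.
    return Node()


def run(node, feed_dict=None):
    feed_dict = feed_dict or {}
    if node in feed_dict:
        # A fed value replaces the node's output for this call only.
        return feed_dict[node]
    if node.func is None:
        raise ValueError("placeholder was not fed a value")
    args = [run(inp, feed_dict) for inp in node.inputs]
    return node.func(*args)


input1 = placeholder()
input2 = placeholder()
output = Node(lambda x, y: x * y, (input1, input2))

print(run(output, feed_dict={input1: 7.0, input2: 2.0}))  # 14.0

# Any node can be fed, not only placeholders.
print(run(output, feed_dict={output: 99.0}))              # 99.0
```

Because the feed dictionary is passed per call, nothing persists afterwards: calling run(output) without feeds raises the ValueError above.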