Fun with Python AI: Hands-On Practice with TensorFlow, the Hottest Application Framework

 

As TensorFlow is used more and more widely in research and in products, many developers and researchers want to study this deep learning framework in more depth. In the framework poll that Almost Human launched yesterday, 1,441 of the 2,144 participants said they use TensorFlow, the highest usage of any framework. But TensorFlow's static computation graph carries a learning cost, and it deters many beginners who are about to take the plunge. This article introduces a series of tutorials for learning TensorFlow, designed to help beginners pick up TensorFlow programming skills step by step through simple theory and practice.

The tutorial series is divided into six parts, running from why to choose TensorFlow through to implementing a convolutional neural network, and covers the skills a beginner needs. Here Almost Human covers the strengths and weaknesses of deep learning frameworks such as PyTorch and Caffe, along with the TensorFlow basics: static computation graphs, tensors, TensorBoard visualization, saving model parameters, and so on.

 

Why TensorFlow?

In this article, we compare the most popular deep learning frameworks (including Caffe, Theano, PyTorch, TensorFlow and Keras) to help you choose the most appropriate framework for your application.

1. Caffe: the first mainstream production-grade deep learning library, started by UC Berkeley in 2014.

Advantages:

  • Fast

  • GPU support

  • Nice Matlab and Python interfaces

Disadvantages:

  • Inflexible. In Caffe, each node is treated as a layer, so if you want a new layer type you have to define the complete forward, backward and gradient-update procedures. These layers are the building blocks of the network, and you have to pick them from an endless list. (In TensorFlow, by contrast, each node is a tensor operation such as matrix addition, multiplication or convolution, and you can easily define a layer as a composition of these operations. TensorFlow's building blocks are therefore smaller, which allows more flexibility and modularity.)

  • Requires a lot of redundant code. If you want to support both CPU and GPU, you have to implement extra functions for each. You also have to define your model in a plain text editor. A real headache! Almost everyone prefers to define models programmatically, because that makes it easier to modularize the different components. Interestingly, the lead architect of Caffe now works on the TensorFlow team.

  • Specialized. It targets only computer vision (but does that very well).

  • Not written in Python! If you want to introduce new changes, you have to program in C++ and CUDA (for smaller changes, you can use its Python and Matlab interfaces).

  • Poor documentation.

  • Hard to install! There are a large number of dependencies.

  • Only a few input formats and only one output format, HDF5 (although you can always run it through its Python / C++ / Matlab interfaces and export the output data from there).

  • Not suitable for building recurrent networks.

2. Theano: built by a research team at the University of Montreal. Many open-source deep learning libraries are built on top of Theano, including Keras, Lasagne and Blocks. On September 28, 2017, Yoshua Bengio announced that Theano development would end. So Theano is effectively dead!

Advantages:

  • The computation graph is a nice abstraction (comparable to TensorFlow)

  • Optimized for both CPU and GPU

  • Well suited to numerical optimization tasks

  • High-level wrappers (Keras, Lasagne)

Disadvantages:

  • Raw Theano offers only a relatively low-level API

  • Large models may take a long time to compile

  • No multi-GPU support

  • Error messages may be unhelpful (and sometimes frustrating)

3. PyTorch: in January 2017, Facebook open-sourced a Python version of the Torch library (which is written in Lua).

Advantages:

  • Dynamic computation graph (the graph is built at runtime), which lets you handle variable-length inputs and outputs; this is very useful with RNNs, for example.

  • As another example, in PyTorch you can write a for loop using standard Python syntax:

import torch
# illustrative values; T, W, h and b are left undefined in the original
T = 3
W, h, b = torch.randn(4, 4), torch.randn(4), torch.randn(4)
for _ in range(T):
    h = torch.matmul(W, h) + b

  • A large number of pre-trained models

  • Many modular components that are easy to combine

  • Easy to write your own layer types and easy to run on the GPU

  • Where "TensorBoard" lacks some key features, "Losswise" can be used as a substitute with PyTorch

Disadvantages:

  • Limited references / resources beyond the official documentation

  • No commercial support

4. TensorFlow: combines a lower-level symbolic computation library (e.g., Theano) with higher-level network specification libraries (e.g., Blocks and Lasagne).

Advantages:

  • Developed and maintained by Google, so continued support and development can be expected.

  • Huge, active community

  • Both low-level and high-level interfaces for network training

  • "TensorBoard" is a powerful visualization suite for tracking network topology and performance, which makes debugging easier.

  • Written in Python (although some performance-critical parts are implemented in C++), which is a quite readable programming language

  • Multi-GPU support, so you can freely run code on different machines without having to stop or restart the program

  • Faster model compilation than Theano-based options

  • Shorter compile times than Theano

  • TensorFlow supports not only deep learning but also tools for reinforcement learning and other algorithms.

Disadvantages:

  • The computation graph is pure Python, and therefore slower

  • The graph structure is static, meaning the graph must be "compiled" before it can be run

5. Keras: Keras is a higher-level, highly user-friendly API with a configurable back end, written and maintained by Google Brain team member François Chollet.

Advantages:

  • Provides a high-level API for building deep learning models, which makes them easy to read and use

  • Well-written documentation

  • Large, active community

  • Sits on top of other deep learning libraries (e.g., Theano and TensorFlow, configurable)

  • Uses an object-oriented design in which everything is treated as an object (e.g., network layers, parameters, optimizers). All model parameters can be accessed as object properties.

For example:

  • model.layers[3].output gives the output of the model's third layer

  • model.layers[3].weights is a list of symbolic weight tensors

Disadvantages:

  • Because it is so general-purpose, it falls short on performance

  • Known to have performance problems with the TensorFlow back end (because it is not optimized for it), but works well with the Theano back end

  • Not as flexible as TensorFlow or PyTorch

TensorFlow Basics

TensorFlow is an open-source software library for numerical computation using data flow graphs. "Tensor" refers to the data being passed around, a tensor (multi-dimensional array), and "Flow" refers to computing with a computation graph. A data flow graph is a directed graph that describes mathematical operations using "nodes" and "edges". A node generally represents a mathematical operation, but it can also represent the point where data is fed in or pushed out, or where a persistent variable is read or written. Edges represent the input/output relationships between nodes. They carry dynamically sized multi-dimensional data arrays, that is, tensors.

Computation Graphs and Sessions

The first step in learning TensorFlow is to understand its main feature, the "computation graph" approach. Essentially all TensorFlow code consists of two important parts:

1. Create a "computation graph" that represents the data flow of the computation

2. Run a "session" that executes the operations in the graph

In effect, TensorFlow separates the definition of a computation from its execution. The two parts are explained in detail in the following sections. Until then, remember that the first step is to import TensorFlow!

import tensorflow as tf

This gives Python access to all of TensorFlow's classes, methods and symbols. The command imports the TensorFlow library under the alias "tf", so that later we can use the alias instead of typing the full name "tensorflow".

1. Computation graph

TensorFlow's big idea is to express numerical computation as a graph. Put another way, the backbone of any TensorFlow program is a computation graph. As the official TensorFlow website puts it, "a computational graph is a series of TensorFlow operations arranged into a graph of nodes."

First, what are nodes and operations? The best way to explain is with an example. Suppose we are writing code for the function "f(x, y) = x^2y + y + 2". The TensorFlow computation graph looks like this:

Figure 2: A computation graph built in TensorFlow.

As shown above, a computation graph is a series of nodes connected by edges. Each node is called an op, short for operation. So each node represents an operation, which may be a computation on tensors or something that generates a tensor. Each node takes zero or more tensors as input and produces a tensor as output.
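To make nodes and edges concrete, here is a minimal pure-Python sketch of the graph for f(x, y) = x^2y + y + 2. It does not use TensorFlow: the Node class and the evaluate function are hypothetical names invented for illustration, and real TensorFlow ops carry far more information.

```python
import operator

class Node:
    """One op in a toy graph: a function plus the nodes that feed it."""
    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func            # how to compute this node's output
        self.inputs = list(inputs)  # incoming edges (zero or more nodes)

def evaluate(node, feed):
    """Recursively evaluate a node; `feed` maps input names to values."""
    if node.name in feed:
        return feed[node.name]
    args = [evaluate(n, feed) for n in node.inputs]
    return node.func(*args)

# Nodes with zero inputs: two values fed at run time and one constant.
x = Node("x", None)
y = Node("y", None)
two = Node("two", lambda: 2)

# f(x, y) = x^2 * y + y + 2, built from small ops.
x_sq = Node("x_sq", operator.mul, [x, x])
x_sq_y = Node("x_sq_y", operator.mul, [x_sq, y])
partial = Node("partial", operator.add, [x_sq_y, y])
f = Node("f", operator.add, [partial, two])

result = evaluate(f, {"x": 3, "y": 4})
print(result)  # 3*3*4 + 4 + 2 = 42
```

Building the Node objects computes nothing; values only appear when evaluate is called, which is the same split between definition and execution that TensorFlow makes.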

Now let's build a simple computation graph.

import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
print(c)
______________________________________________________
Tensor("Add:0", shape=(), dtype=int32)

The generated computation graph and variables are:

Figure 3: Left: the graph visualized in TensorBoard; Right: the variables generated (a screenshot from the PyCharm debugger, taken while running in debug mode)

To actually evaluate a node, the computation graph must be run in a session. In short, the code written so far only builds a graph, which determines the expected sizes of the tensors and the operations to perform on them. It does not assign a value to any tensor.

Thus a TensorFlow Graph is similar to a function definition in Python. It does "not" perform any computation for you (just as a function definition has no execution result). It "only" defines the computation.
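The function-definition analogy can be checked in plain Python. In this small sketch (the names add_numbers and calls are made up for illustration), defining the function records nothing, and only the call does any work:

```python
calls = []

def add_numbers(a, b):
    # The body runs only when the function is called.
    calls.append((a, b))
    return a + b

# Defining the function executed nothing.
calls_after_definition = len(calls)

# Calling it is the analogue of running the graph in a session.
total = add_numbers(2, 3)
print(calls_after_definition, total, len(calls))
```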

2. Session

In TensorFlow, all the variables and operations are stored in the computation graph. So once we have built the graph our model needs, we also have to open a session to run the whole graph. Within a session, the computation can be distributed across all available CPU and GPU resources. As a simple example, the following runs the computation graph and obtains the value of c:

sess = tf.Session()
print(sess.run(c))
sess.close()
__________________________________________
5

This code creates a Session() object (assigned to sess) and then, on the second line, calls its run method to run enough of the computation graph to evaluate c. Once the computation is finished, the session needs to be closed to help reclaim system resources; otherwise resources may leak.
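Forgetting to call close() is an easy mistake, which is why later examples in this article open the session in a with block. The pure-Python sketch below shows the mechanism behind that pattern. ToySession is a hypothetical stand-in written for illustration, not TensorFlow's Session class.

```python
class ToySession:
    """Hypothetical stand-in for a session that holds a resource."""
    def __init__(self):
        self.closed = False

    def run(self, thunk):
        if self.closed:
            raise RuntimeError("session is closed")
        return thunk()

    def close(self):
        self.closed = True

    # Context-manager protocol: close() is called on exit,
    # even if the body of the with block raises an exception.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions

with ToySession() as sess:
    value = sess.run(lambda: 2 + 3)

print(value, sess.closed)
```

With this pattern the cleanup cannot be skipped by an early return or an exception, so there is no separate close() call to remember.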

TensorFlow Tensors

import tensorflow as tf

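The most basic units in TensorFlow are constants (Constant), variables (Variable) and placeholders (Placeholder). Once a constant is defined, neither its value nor its dimensions can change; once a variable is defined, its value can change but its dimensions cannot. In a neural network, variables typically serve as matrices that store weights and other information, while constants can hold hyperparameters or other structural information.

1. Constants

tf.constant creates a node that takes a constant value. It accepts the following arguments: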

tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)

Let's create two constants and add them together. A constant tensor can be defined simply by giving it a value:

# create graph
a = tf.constant(2)
b = tf.constant(3)
c = a + b
# launch the graph in a session
with tf.Session() as sess:
    print(sess.run(c))
____________________________________________________
5    

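Now let's look at the computation graph that was created and the data types that were generated:

2. Variables

Variables are stateful nodes that output their current value, which means they keep their values across multiple executions of a computation graph. They have a number of useful properties, for example:

They can be saved to disk during or after training. This lets people from different companies and teams save, restore and send their model parameters to others.

By default, gradient updates (which are applied in all neural networks) are applied to every variable in the computation graph. In fact, variables are the things you want to tune in order to minimize the loss function.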
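The idea that variables are the things gradient updates tune can be seen without TensorFlow. The sketch below minimizes a toy loss, (w - 3)^2, by plain gradient descent on a single stateful value. The loss, the learning rate and the step count are all illustrative assumptions, not values from this tutorial.

```python
def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of (w - 3)^2 with respect to w.
    return 2.0 * (w - 3.0)

w = 0.0               # the "variable": its value persists between steps
learning_rate = 0.1   # illustrative value
for step in range(100):
    # Each update moves w a little way against the gradient.
    w -= learning_rate * grad(w)

print(w, loss(w))
```

After the loop, w sits very close to 3, the value that minimizes the loss. A TensorFlow optimizer does the same kind of update for every trainable variable in the graph.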

To create a variable, you can use tf.Variable as follows:

# Create a variable.
w = tf.Variable(<initial-value>, name=<optional-name>)

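The following statement declares a variable matrix with 2 rows and 3 columns whose values are drawn at random from a normal distribution with a standard deviation of 1.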

w1=tf.Variable(tf.random_normal([2,3],stddev=1,seed=1))

TensorFlow also has the tf.truncated_normal() function, which draws random numbers from a truncated normal distribution: it keeps only values in the range [mean-2*stddev, mean+2*stddev].
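One way to picture truncation is as rejection sampling: draw from the ordinary normal distribution and redraw whenever the sample falls outside the two-standard-deviation window. The sketch below does this with Python's standard library. The helper name truncated_normal is reused here only for illustration, and the code is an assumption about the general technique rather than TensorFlow's actual implementation.

```python
import random

def truncated_normal(mean, stddev, rng):
    """Redraw until the sample lies within two standard deviations."""
    low, high = mean - 2 * stddev, mean + 2 * stddev
    while True:
        sample = rng.gauss(mean, stddev)
        if low <= sample <= high:
            return sample

rng = random.Random(1)  # seeded so the run is repeatable
samples = [truncated_normal(0.0, 1.0, rng) for _ in range(1000)]
print(min(samples), max(samples))
```

Every sample lands in [-2, 2], which keeps extreme initial weights out of the network.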

Calling tf.Variable to create a variable is the old way. TensorFlow recommends the wrapper tf.get_variable instead, which accepts arguments such as a name and a shape:

tf.get_variable(name,
                shape=None,
                dtype=None,
                initializer=None,
                regularizer=None,
                trainable=True,
                collections=None,
                caching_device=None,
                partitioner=None,
                validate_shape=True,
                use_resource=None,
                custom_getter=None,
                constraint=None)

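Variables must be initialized before they are used. To do that, we have to call a "variable initializer operation" and run that operation in the session.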

a = tf.get_variable(name="var_1", initializer=tf.constant(2))
b = tf.get_variable(name="var_2", initializer=tf.constant(3))
c = tf.add(a, b, name="Add1")

# launch the graph in a session
with tf.Session() as sess:
    # initialize the variables before reading them
    sess.run(tf.global_variables_initializer())
    # now let's evaluate their value
    print(sess.run(a))
    print(sess.run(b))
    print(sess.run(c))

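3. Placeholders

We have created constants and variables in various forms, but TensorFlow also supports placeholders. A placeholder has no initial value; it only allocates the necessary memory. In a session, a placeholder is fed data through feed_dict.

feed_dict is a dictionary that must give a value for every placeholder that is used. Training a neural network requires supplying a batch of training samples at each step. If the data chosen for each iteration had to be represented as constants, the TensorFlow computation graph would become enormous, because every added constant adds a node to the graph. A network trained for millions of iterations would end up with an extremely large graph. Placeholders solve this: the graph contains only the single placeholder node.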
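The point about graph size can be sketched in plain Python. Below, the graph is built once with a single placeholder node, and each run looks up that node's value in a dictionary. The Placeholder and Add classes are hypothetical toys written for this illustration and are not TensorFlow APIs.

```python
class Placeholder:
    """Hypothetical node with no value of its own."""
    def __init__(self, name):
        self.name = name

class Add:
    """Hypothetical op node that adds a fixed list to a placeholder."""
    def __init__(self, constant, placeholder):
        self.constant = constant
        self.placeholder = placeholder

    def run(self, feed_dict):
        # Look up the placeholder's value for this run only.
        fed = feed_dict[self.placeholder]
        return [c + v for c, v in zip(self.constant, fed)]

b = Placeholder("B")
c = Add([5.0, 5.0, 5.0], b)   # the graph: built once

# Feeding three different batches reuses the same two nodes.
batches = [[1, 2, 3], [10, 20, 30], [0, 0, 0]]
outputs = [c.run({b: batch}) for batch in batches]
print(outputs)
```

However many batches are fed, the graph never grows beyond the two nodes created up front.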

a = tf.constant([5, 5, 5], tf.float32, name='A')
b = tf.placeholder(tf.float32, shape=[3], name='B')
c = tf.add(a, b, name="Add")

with tf.Session() as sess:
    # create a dictionary:
    d = {b: [1, 2, 3]}
    # feed it to the placeholder
    print(sess.run(c, feed_dict=d)) 
 ___________________________________________________
 [6. 7. 8.]

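The computation graph and variables it generates are shown below:

We can now build a simple neural network. The following creates a three-layer fully connected network using randomly generated data: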

import tensorflow as tf
from numpy.random import RandomState

batch_size=10
w1=tf.Variable(tf.random_normal([2,3],stddev=1,seed=1))
w2=tf.Variable(tf.random_normal([3,1],stddev=1,seed=1))

# None lets a dimension of the shape be determined by the batch size
x=tf.placeholder(tf.float32,shape=(None,2))
y=tf.placeholder(tf.float32,shape=(None,1))

# use ReLU as the activation function
a=tf.nn.relu(tf.matmul(x,w1))
yhat=tf.nn.relu(tf.matmul(a,w2))

# define cross-entropy as the loss and minimize it with the Adam algorithm
cross_entropy=-tf.reduce_mean(y*tf.log(tf.clip_by_value(yhat,1e-10,1.0)))
train_step=tf.train.AdamOptimizer(0.001).minimize(cross_entropy)

rdm=RandomState(1)
data_size=516

# generate data_size samples with two features each
X=rdm.rand(data_size,2)
# labeling rule: samples with x1+x2<1 are positive, all others negative; in Y, 1 marks a positive sample
Y = [[int(x1+x2 < 1)] for (x1, x2) in X]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(w1))
    print(sess.run(w2))
    steps=11000
    for i in range(steps):

        # pick the start and end of each batch so that sampling stays within one epoch
        start = i * batch_size % data_size
        end = min(start + batch_size,data_size)
        sess.run(train_step,feed_dict={x:X[start:end],y:Y[start:end]})
        if i % 1000 == 0:
            training_loss= sess.run(cross_entropy,feed_dict={x:X,y:Y})
            print("After %d iterations, the training loss is %g"%(i,training_loss))

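The code above defines a simple three-layer fully connected network (the input, hidden and output layers have 2, 3 and 1 neurons respectively, as given by the shapes of w1 and w2), with ReLU as the activation function for the hidden and output layers. The model trains on 516 samples in total (data_size in the code) and reads a batch of 10 at each iteration. This simple fully connected network uses cross-entropy as its loss function and the Adam optimization algorithm to update the weights.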
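The batch arithmetic in the training loop is worth working through by hand. The sketch below reuses the same two expressions with the values from the code above (batch_size = 10, data_size = 516). The helper name batch_bounds is made up for illustration.

```python
batch_size = 10
data_size = 516

def batch_bounds(i):
    """Return the [start, end) slice used at iteration i."""
    start = i * batch_size % data_size
    end = min(start + batch_size, data_size)
    return start, end

for i in (0, 1, 51, 52):
    print(i, batch_bounds(i))
```

Iterations 0 and 1 read samples 0 to 9 and 10 to 19. Iteration 51 starts at 510, so the min() call shortens the batch to the last six samples. Iteration 52 wraps around to start at 520 % 516 = 4, so the second pass over the data does not begin exactly at sample 0.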

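A few functions are worth noting: tf.nn.relu() applies the ReLU activation function, tf.matmul() is matrix multiplication, and so on. The statement tf.clip_by_value(yhat,1e-10,1.0) clips the values of yhat. Because it is nested inside tf.log(), we need to make sure that the values of yhat cannot drive the logarithm to infinity.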
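Why the clipping matters can be shown with scalars and the standard math module. This is a pure-Python analogue, and the helper names clip and safe_log are assumptions made for illustration. Note one difference in behavior: math.log(0.0) raises an error, whereas a TensorFlow log of zero yields negative infinity, which then poisons the loss.

```python
import math

def clip(value, low, high):
    """Pure-Python analogue of clipping a scalar to [low, high]."""
    return max(low, min(value, high))

def safe_log(p):
    return math.log(clip(p, 1e-10, 1.0))

try:
    math.log(0.0)
    raw_log_failed = False
except ValueError:
    # Python raises here; TensorFlow would instead produce -inf.
    raw_log_failed = True

print(raw_log_failed, safe_log(0.0), safe_log(0.5), safe_log(2.0))
```

With the lower bound of 1e-10, the smallest value the logarithm can return is log(1e-10), roughly -23, so the loss stays finite even when the network outputs exactly zero.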

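TensorBoard Basics

TensorBoard is a visualization tool that is included in every standard TensorFlow installation. In Google's words: "The computations you'll use TensorFlow for, like training a massive deep neural network, can be complex and confusing. To make it easier to understand, debug, and optimize TensorFlow programs, we've included a suite of visualization tools called TensorBoard."

TensorFlow programs can solve problems ranging from the very simple to the very complex, and they all have two basic components: operations and tensors. As explained earlier, you create a model made up of a series of operations, feed data into the model, and tensors flow between the operations until you get an output tensor, your result.

Once fully configured, the TensorBoard window looks something like the picture below:

TensorBoard was created to help you understand how tensors flow through your model so that you can debug and optimize it. It is generally used for two tasks:

1. Graph visualization

2. Writing summaries (or visualizing learning)

In this tutorial, we cover both of these main uses of TensorBoard. Learning to use TensorBoard early makes working with TensorFlow more fun and more productive.

1. Visualizing the computation graph

Powerful as they are, TensorFlow computation graphs can become extremely complicated. Visualizing a graph helps you understand and debug it. Here is an example of the visualization at work from the TensorFlow website.

To enable TensorBoard for a TensorFlow program, you need to add a few lines of code to it. This exports the TensorFlow operations to a file called an "event file" (or event log file). TensorBoard can read this file and give insight into the model graph and its performance.

Now let's write a simple TensorFlow program and visualize its computation graph with TensorBoard. We start by creating two constants and adding them together. Constant tensors can be defined simply by giving their values: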

import tensorflow as tf

# create graph
a = tf.constant(2)
b = tf.constant(3)
c = tf.add(a, b)
# launch the graph in a session
with tf.Session() as sess:
    print(sess.run(c))
_____________________________________________

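To visualize the program with TensorBoard, we need to write the program's log files. To write event files, we first need to create a writer for those logs, using the following code: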

writer = tf.summary.FileWriter([logdir], [graph])

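Here [logdir] is the folder where you want to save the log files. You can choose something meaningful for [logdir], such as './graphs'. The second argument, [graph], is the computation graph of the program we are writing. There are two ways to get the graph:

1. Call tf.get_default_graph(), which returns the program's default computation graph

2. Use sess.graph, which returns the session's computation graph (note that this requires the session to have been created already)

The example below shows both ways, although the second is the more common. Either way, make sure you create the writer only after you have defined the computation graph. Otherwise, the graph visualized in TensorBoard will be incomplete. Let's add a writer to the first example and visualize the computation graph.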

import tensorflow as tf

# create graph
a = tf.constant(2)
b = tf.constant(3)
c = tf.add(a, b)

# creating the writer out of the session
# writer = tf.summary.FileWriter('./graphs', tf.get_default_graph())

# launch the graph in a session
with tf.Session() as sess:
    # or creating the writer inside the session
    writer = tf.summary.FileWriter('./graphs', sess.graph)
    print(sess.run(c))
    # don't forget to close the writer at the end
    writer.close()

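Next, go to the terminal and make sure the current working directory is the one where you ran the Python code. For example, here we can switch to the directory with the following command: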

$ cd ~/Desktop/tensorboard

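Then run:

$ tensorboard --logdir="./graphs" --port 6006

This generates a link for you. Ctrl + left-click the link (or copy it into your browser, or simply open your browser and go to http://localhost:6006/). The TensorBoard page is then displayed, as shown below:

Saving and Loading Parameters

The basics section ends with saving and restoring model parameters. TensorFlow models are usually persisted with tf.train.Saver(), which saves the model as files in .ckpt format. The directory typically contains three files: model.ckpt.meta, which stores the structure of the TensorFlow computation graph; model.ckpt, which stores the value of every variable in TensorFlow; and checkpoint, which stores a list of all the model files in the same directory.

To save and restore model variables, we need to call tf.train.Saver() after building the computation graph, for example: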

# create the graph
X = tf.placeholder(..)
Y = tf.placeholder(..)
w = tf.get_variable(..)
b = tf.get_variable(..)
...
loss = tf.losses.mean_squared_error(..)
optimizer = tf.train.AdamOptimizer(..).minimize(loss)
...

saver = tf.train.Saver()

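In training mode, we need to open a session, initialize the variables and run the computation graph, and then call saver.save() to save the variables when training ends: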

# TRAIN
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # train our model
    for step in range(steps):
        sess.run(optimizer)
        ...
    saved_path = saver.save(sess, './my-model', global_step=step)

In test mode, we need to restore the parameters with saver.restore():

# TEST
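with tf.Session() as sess:
    # note: saving with global_step appends "-<step>" to the checkpoint name,
    # so the path passed here must match the value returned by saver.save()
    saver.restore(sess, './my-model')
    ...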

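Of course, there is much more to model persistence, such as the computation-graph node metadata defined by the MetaGraphDef Protocol Buffer. Readers can continue with the full tutorial or other books for the details.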

Origin www.cnblogs.com/itye2/p/11653459.html