This post is translated from the official tutorial.

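1. What is a TensorFlow model?

Once a model is trained, we need to save it so that it can be reused at any time. So how does TensorFlow save a model? A saved model consists of two main parts:

a) Meta graph:

This file stores the complete TensorFlow graph: all variables, operations, collections and so on. It has the .meta extension.

b) Checkpoint file:

This is a binary file that stores the values of the variables and tensors described in the .meta file. It used to be a single file with the .ckpt extension, but newer TensorFlow versions write three files instead: two files with the .index and .data extensions, plus a file named checkpoint. The .data file holds the values of the trained variables, and the checkpoint file only records which saved model is the most recent.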

In the end there are four main files: the .meta file, the .index file, the .data file and the checkpoint file.

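2. How to save a model

Training a model takes a long time, and we cannot retrain it every time we want to use it, so we save it instead. To save the graph and the parameter values in TensorFlow we use the tf.train.Saver() class. First, create an instance of it: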

saver = tf.train.Saver()

Keep in mind that TensorFlow variables only hold values inside a session, so the model has to be saved from within a session. We do that by calling the save method:

saver.save(sess, 'my-test-model')

Here sess is the session object and my-test-model is the name you give the model. A complete example:

import tensorflow as tf
w1 = tf.Variable(tf.random_normal(shape=[2]), name='w1')
w2 = tf.Variable(tf.random_normal(shape=[5]), name='w2')
saver = tf.train.Saver()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.save(sess, 'my_test_model')

# This will save following files in Tensorflow v >= 0.11
# my_test_model.data-00000-of-00001
# my_test_model.index
# my_test_model.meta
# checkpoint

To save the model after 1000 iterations, pass the step count as global_step:

saver.save(sess, 'my_test_model',global_step=1000)

This appends -1000 to the model name and produces the following four files:

my_test_model-1000.index

my_test_model-1000.meta

my_test_model-1000.data-00000-of-00001

checkpoint
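The naming pattern in the two file listings above can be sketched in plain Python. This is only an illustration of the pattern, not TensorFlow code: `expected_checkpoint_files` and its `num_shards` parameter are hypothetical names, and a single data shard is assumed, as in the listings.

```python
def expected_checkpoint_files(prefix, global_step=None, num_shards=1):
    """Return the file names a save with this prefix is expected to produce.

    Hypothetical helper: it mirrors the naming pattern shown in the listings
    and never calls TensorFlow.
    """
    # With global_step set, "-<step>" is appended to the model name.
    base = prefix if global_step is None else "{}-{}".format(prefix, global_step)
    data_files = [
        "{}.data-{:05d}-of-{:05d}".format(base, shard, num_shards)
        for shard in range(num_shards)
    ]
    # The file named "checkpoint" is shared by all saves in the directory.
    return data_files + [base + ".index", base + ".meta", "checkpoint"]


print(expected_checkpoint_files("my_test_model", global_step=1000))
```

Calling it without global_step gives the names from the first example (my_test_model.index and so on).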

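Suppose we save the model every 1000 iterations during training. The .meta file is written at the first save (iteration 1000), and because the graph does not change afterwards, there is no need to write it again at later saves (iterations 2000, 3000 and so on). We can skip it like this: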

saver.save(sess, 'my-model', global_step=step,write_meta_graph=False)

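If we want to keep only the 4 most recent checkpoints and, in addition, keep one checkpoint for every 2 hours of training, we can do this: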

saver = tf.train.Saver(max_to_keep=4, keep_checkpoint_every_n_hours=2)
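The effect of keeping only the 4 newest checkpoints can be pictured with a small plain-Python simulation. This is a sketch of the rotation idea only, not how TensorFlow implements it: `simulate_max_to_keep` is a hypothetical helper, and the extra checkpoints preserved by keep_checkpoint_every_n_hours are deliberately ignored.

```python
from collections import deque


def simulate_max_to_keep(steps, max_to_keep=4):
    """Return the checkpoint names that remain after saving at each step."""
    kept = deque()
    for step in steps:
        kept.append("my-model-{}".format(step))
        if len(kept) > max_to_keep:
            kept.popleft()  # the oldest checkpoint would be deleted here
    return list(kept)


# Six saves, one every 1000 steps; only the last four survive.
print(simulate_max_to_keep(range(1000, 7000, 1000)))
# ['my-model-3000', 'my-model-4000', 'my-model-5000', 'my-model-6000']
```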

Note that when tf.train.Saver() is created without arguments, it saves all variables. To save only some of them, pass it the list of variables you want. For example:

import tensorflow as tf
w1 = tf.Variable(tf.random_normal(shape=[2]), name='w1')
w2 = tf.Variable(tf.random_normal(shape=[5]), name='w2')
saver = tf.train.Saver([w1, w2])
sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.save(sess, 'my_test_model', global_step=1000)

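3. Importing a pre-trained model

You need to do two things:

a) Create the network:

There are two ways to do this:

1: Write the exact same network again yourself in code.

2: Import the graph saved in the .meta file directly, like this: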

saver = tf.train.import_meta_graph('my_test_model-1000.meta')

b) Load the parameters:

For example:

with tf.Session() as sess:
    new_saver = tf.train.import_meta_graph('my_test_model-1000.meta')
    new_saver.restore(sess, tf.train.latest_checkpoint('./'))

This loads the values of variables such as w1 and w2.
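To make the role of the checkpoint file in this lookup more concrete, here is a simplified plain-Python sketch. It assumes the checkpoint file is plain text with a model_checkpoint_path entry, and `read_latest_checkpoint` is a hypothetical helper, not the TensorFlow function; the real tf.train.latest_checkpoint handles more cases.

```python
import os
import re
import tempfile


def read_latest_checkpoint(checkpoint_dir):
    """Return the path recorded under model_checkpoint_path, or None."""
    state_file = os.path.join(checkpoint_dir, "checkpoint")
    if not os.path.exists(state_file):
        return None
    with open(state_file) as f:
        for line in f:
            # Matches only the "latest" entry, not all_model_checkpoint_paths.
            match = re.match(r'\s*model_checkpoint_path:\s*"(.*)"\s*$', line)
            if match:
                return os.path.join(checkpoint_dir, match.group(1))
    return None


# Demo with a hand-written state file in a temporary directory.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "checkpoint"), "w") as f:
    f.write('model_checkpoint_path: "my_test_model-1000"\n')
    f.write('all_model_checkpoint_paths: "my_test_model-1000"\n')

print(read_latest_checkpoint(demo_dir))
```

The returned prefix (without .index or .data) is the kind of value that restore expects as its second argument.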

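4. Working with the restored model

Now we can save and restore a model. In the example below, the network is built with placeholders. Note that the values fed to placeholders are not saved with the model.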

import tensorflow as tf

# Prepare to feed input, i.e. feed_dict and placeholders
w1 = tf.placeholder("float", name="w1")
w2 = tf.placeholder("float", name="w2")
b1 = tf.Variable(2.0, name="bias")
feed_dict = {w1: 4, w2: 8}

# Define a test operation that we will restore
w3 = tf.add(w1, w2)
w4 = tf.multiply(w3, b1, name="op_to_restore")
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Create a saver object which will save all the variables
saver = tf.train.Saver()

# Run the operation by feeding input
print(sess.run(w4, feed_dict))
# Prints 24.0, which is (w1 + w2) * b1

# Now, save the graph
saver.save(sess, 'my_test_model', global_step=1000)

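When we restore this model, we need more than the graph and the weights: we also have to prepare a new feed_dict to feed into the restored network. The graph.get_tensor_by_name() method gives us references to the saved placeholders and operations.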


graph = tf.get_default_graph()

# How to access a saved variable/tensor/placeholder
w1 = graph.get_tensor_by_name("w1:0")

# How to access the output tensor of a saved operation
op_to_restore = graph.get_tensor_by_name("op_to_restore:0")
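The names passed to get_tensor_by_name, such as "w1:0", combine the name of an operation with the index of one of its outputs. A small plain-Python sketch of that convention follows; `split_tensor_name` is a hypothetical helper for illustration and is not part of TensorFlow.

```python
def split_tensor_name(tensor_name):
    """Split "op_name:index" into (op_name, index)."""
    op_name, sep, index = tensor_name.rpartition(":")
    if not sep or not index.isdigit():
        raise ValueError(
            "expected a name of the form 'op_name:index', got %r" % tensor_name
        )
    return op_name, int(index)


print(split_tensor_name("w1:0"))             # ('w1', 0)
print(split_tensor_name("op_to_restore:0"))  # ('op_to_restore', 0)
```

This is why the name="w1" given when the placeholder was created is looked up as "w1:0": it refers to the first output of the operation named w1.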

To run the same network on new data, do the following:

import tensorflow as tf

sess = tf.Session()
# First, load the meta graph and restore the weights
saver = tf.train.import_meta_graph('my_test_model-1000.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))

# Now, access the saved placeholders and
# create a feed_dict to feed new data
graph = tf.get_default_graph()
w1 = graph.get_tensor_by_name("w1:0")
w2 = graph.get_tensor_by_name("w2:0")
feed_dict = {w1: 13.0, w2: 17.0}

# Now, access the op that you want to run
op_to_restore = graph.get_tensor_by_name("op_to_restore:0")

print(sess.run(op_to_restore, feed_dict))
# This will print 60.0, which is calculated
# using the new values of w1 and w2 and the saved value of b1.

To add new layers or operations on top of the restored graph:

import tensorflow as tf

sess = tf.Session()
# First, load the meta graph and restore the weights
saver = tf.train.import_meta_graph('my_test_model-1000.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))

# Now, access the saved placeholders and
# create a feed_dict to feed new data
graph = tf.get_default_graph()
w1 = graph.get_tensor_by_name("w1:0")
w2 = graph.get_tensor_by_name("w2:0")
feed_dict = {w1: 13.0, w2: 17.0}

# Now, access the op that you want to run
op_to_restore = graph.get_tensor_by_name("op_to_restore:0")

# Add more to the current graph
add_on_op = tf.multiply(op_to_restore, 2)

print(sess.run(add_on_op, feed_dict))
# This will print 120.0
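The numbers quoted in the comments of these examples can be checked by hand. The plain-Python sketch below mirrors the saved graph, (w1 + w2) * b1 with the saved bias b1 = 2.0; `restored_graph` is a hypothetical stand-in for the TensorFlow ops, used only to verify the arithmetic.

```python
def restored_graph(w1, w2, b1=2.0):
    """Mirror of the saved op_to_restore: (w1 + w2) * b1."""
    return (w1 + w2) * b1


print(restored_graph(4, 8))             # 24.0, the value before saving
print(restored_graph(13.0, 17.0))       # 60.0, new inputs with the saved b1
print(restored_graph(13.0, 17.0) * 2)   # 120.0, with the added multiply op
```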

Want to restore only part of the old graph and build on top of it? No problem!

......

......

saver = tf.train.import_meta_graph('vgg.meta')
# Access the graph
graph = tf.get_default_graph()
# Prepare the feed_dict for feeding data for fine-tuning

# Access the appropriate output for fine-tuning
fc7 = graph.get_tensor_by_name('fc7:0')

# Use this if you only want to train the new last layer:
# stop_gradient is an identity in the forward pass and blocks gradients
fc7 = tf.stop_gradient(fc7)
fc7_shape = fc7.get_shape().as_list()

num_outputs = 2
# Assumes fc7 is 2-D ([batch, features]), so the last dimension is the feature size
weights = tf.Variable(tf.truncated_normal([fc7_shape[-1], num_outputs], stddev=0.05))
biases = tf.Variable(tf.constant(0.05, shape=[num_outputs]))
output = tf.matmul(fc7, weights) + biases
pred = tf.nn.softmax(output)

# Now, you run this with fine-tuning data in sess.run()

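This is my first blog post and I think it went fairly well. The only thing I could not figure out is how to change the background color of the code blocks; I wanted them on a white background.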

TensorFlow: saving and restoring models