Converting parameters trained with TensorFlow into a .caffemodel for the Caffe framework

Reposted from https://my.csdn.net/jiongnima

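1. Goal: explain what information a caffemodel records. Broadly, it stores the model's parameters.

For a model trained with TensorFlow, let us look at the files the training run produces: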

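1. A checkpoint file. This is an index of the training results: it records the models saved at different stages of training. Keep it in mind, because it is very useful.

2. A .meta file. This records the Graph. In TensorFlow the Graph describes how all data flows and therefore defines the structure of the whole model.

3. A file named like data-00000-of-00001. This holds the values obtained from training, stored in a compact binary form.

4. An index file. This records the index into the data: to extract a parameter, its name is looked up in the meta file, and the index is then used to read the actual values from the data file.

5. An events file, which holds the other logs written during training (the kind TensorBoard reads).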
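
As a rough illustration of how these files relate, the sketch below parses the text of a checkpoint index file and lists the files that belong to the latest saved model. The sample contents and the helper names parse_checkpoint_index and companion_files are made up for this example, and the data-00000-of-00001 name assumes the values were saved in a single shard.

```python
import re


def parse_checkpoint_index(text):
    """Return (latest_prefix, all_prefixes) from the text of a `checkpoint` file."""
    latest = None
    all_prefixes = []
    for line in text.splitlines():
        # Each line looks like: key: "value"
        match = re.match(r'\s*(\w+):\s*"(.*)"\s*$', line)
        if not match:
            continue
        key, value = match.groups()
        if key == 'model_checkpoint_path':
            latest = value
        elif key == 'all_model_checkpoint_paths':
            all_prefixes.append(value)
    return latest, all_prefixes


def companion_files(prefix):
    """File names that belong to one checkpoint prefix (single shard assumed)."""
    return [prefix + '.meta', prefix + '.index', prefix + '.data-00000-of-00001']


# Hypothetical contents of a `checkpoint` file, for illustration only.
EXAMPLE = '''model_checkpoint_path: "model.ckpt-189200"
all_model_checkpoint_paths: "model.ckpt-189000"
all_model_checkpoint_paths: "model.ckpt-189200"
'''

latest, history = parse_checkpoint_index(EXAMPLE)
print(latest)
print(companion_files(latest))
```

In a real project the text would come from reading the checkpoint file in the model directory rather than from a string constant.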

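How to extract the trained parameters from TensorFlow

(1) Load the data-flow graph.

(2) Use the checkpoint file to find the most recently saved training result.

(3) Extract all trained parameters.

The code is as follows: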

#!/usr/bin/python
# Written against the TensorFlow 1.x API (tf.Session, tf.train.import_meta_graph).
import tensorflow as tf
import numpy as np

with tf.Session() as sess:
    # Load the training data-flow graph from the .meta file.
    new_saver = tf.train.import_meta_graph('model.ckpt-189200.meta')
    # Print the name of every trainable variable.
    for var in tf.trainable_variables():
        print(var.name)
    # Restore the most recent checkpoint found in the current directory.
    new_saver.restore(sess, tf.train.latest_checkpoint('./'))
    all_vars = tf.trainable_variables()
    for v in all_vars:
        v_4d = np.array(sess.run(v))  # fetch the actual parameter values

1. Problem: InvalidArgumentError (see above for traceback): Cannot assign a device to node

Fix: some tf.Variable() nodes cannot be placed on the GPU and have to run on the CPU. Set allow_soft_placement=True in the session config; with this option TensorFlow falls back to the CPU whenever a node is not allowed to run on the GPU.

config = tf.ConfigProto(allow_soft_placement=True)
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    # import the graph and restore the checkpoint here, as in the code above

2. new_saver.restore(sess, tf.train.latest_checkpoint('./')) looks up the most recent training result. You can also pass the path of the saved result directly:

new_saver.restore(sess, './models/c3d_ucf_model-2990')

Note: './models/c3d_ucf_model-2990' refers to the file './models/c3d_ucf_model-2990.data-00000-of-00001'. Drop the suffix when restoring the parameters.
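
The suffix rule in the note can be sketched as a small helper. The function name checkpoint_prefix is hypothetical, and the pattern only covers the three suffixes mentioned in this post (.meta, .index and .data-xxxxx-of-xxxxx).

```python
import re

# Suffixes that appear after the checkpoint prefix in the file names.
_SUFFIX = re.compile(r'\.(meta|index|data-\d{5}-of-\d{5})$')


def checkpoint_prefix(path):
    """Strip a .meta/.index/.data-xxxxx-of-xxxxx suffix, if one is present."""
    return _SUFFIX.sub('', path)


print(checkpoint_prefix('./models/c3d_ucf_model-2990.data-00000-of-00001'))
print(checkpoint_prefix('./models/c3d_ucf_model-2990.meta'))
```

Both calls print the same prefix, which is the string to hand to new_saver.restore.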

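Once we have the parameters, the next step is to convert them from the format TensorFlow uses into the format Caffe expects.

What we already have:

1. Our own TensorFlow training program, which means we know the network architecture used for training.

2. The parameters obtained from training in TensorFlow, and a clear goal: to produce a caffemodel.

What is needed to test the model:

First, a caffemodel. Second, a .prototxt file that records the logical order of the network's forward pass.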

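A short digression: how are train.prototxt, the trained caffemodel and test.prototxt tied together?

The caffemodel contains most of what is in train.prototxt. Besides fixing the training architecture and the parameter configuration, train.prototxt more importantly defines the key names: the "name" field of each layer, which is also recorded in the caffemodel. When we test a trained model with test.prototxt plus the caffemodel, the "name" of each layer in test.prototxt is taken as a key, the key is used to fetch the parameters from the caffemodel, and only then can the forward pass run. Because test.prototxt looks parameters up by key name, a name that cannot be found in the caffemodel yields no values. To read a caffemodel file, we used ReadProtoFromBinaryFile to load the parameters from binary into a proto, together with WriteProtoToTextFile to write them from the proto into a text file.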
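
The name-keyed lookup described above can be pictured with a toy model. This is only a sketch with plain dictionaries, not Caffe's actual loading code; the function load_by_name, the layer names and the tiny weight lists are all invented for illustration.

```python
def load_by_name(net_layer_names, trained_blobs):
    """Match layers of a test net to saved blobs, using the layer name as key.

    Returns (loaded, missing): the blobs found for each layer name, and the
    names that have no entry in the saved model.
    """
    loaded = {}
    missing = []
    for name in net_layer_names:
        if name in trained_blobs:
            loaded[name] = trained_blobs[name]
        else:
            missing.append(name)
    return loaded, missing


# Hypothetical saved model: layer name -> stand-in weights.
saved = {'conv1': [0.1, 0.2], 'fc8': [0.3]}
loaded, missing = load_by_name(['conv1', 'conv2', 'fc8'], saved)
print(sorted(loaded))
print(missing)
```

Here conv2 ends up in the missing list because no blob was saved under that name, which is why the layer names in test.prototxt have to match the names stored with the parameters.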

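From the discussion above, it is clear what we are still missing.

(1) We need a test.prototxt.

(2) We need to convert the parameters trained in TensorFlow into text and write them into test.prototxt.

For (1), write test.prototxt as the Caffe version of the network architecture used in the TensorFlow training program.

To write test.prototxt you need to know the training network architecture on the TensorFlow side (the counterpart of a train.prototxt) well, and to understand how the conventions of TensorFlow and Caffe differ when defining a network.

Link: https://blog.csdn.net/jiongnima/article/details/78382972

The result is a complete test.prototxt.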
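
One example of such a difference in conventions: TensorFlow convolutions are often declared with padding='SAME', while a Caffe convolution layer takes an explicit pad value. The sketch below assumes stride 1 and an odd kernel size, and it assumes the TensorFlow network uses 'SAME' padding, which this post does not state. The function names are made up for this example.

```python
import math


def tf_same_output(size, stride):
    """Output length of a TensorFlow 'SAME' convolution along one axis."""
    return int(math.ceil(size / float(stride)))


def caffe_conv_output(size, kernel, pad, stride):
    """Output length of a Caffe convolution along one axis (no dilation)."""
    return (size + 2 * pad - kernel) // stride + 1


def pad_for_same(kernel):
    """Caffe pad that reproduces 'SAME' for stride 1 and an odd kernel."""
    if kernel % 2 == 0:
        raise ValueError('only odd kernel sizes have a symmetric pad')
    return (kernel - 1) // 2


# Check that both formulas agree for a few odd kernel sizes at stride 1.
for k in (1, 3, 5):
    assert caffe_conv_output(112, k, pad_for_same(k), 1) == tf_same_output(112, 1)
print(pad_for_same(3))
```

For other strides or even kernel sizes the two frameworks can pad differently, so those cases need to be checked layer by layer.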

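(2) We can print the names of the weights trained in TensorFlow, which means we can fetch the weights themselves. In TensorFlow a convolution layer's weights have shape [kernel_height, kernel_width, input_channels, output_channels]; in Caffe they have shape [output_channels, input_channels, kernel_height, kernel_width].

Take out the weights of each layer one by one, create a .prototxt file for each, and write the weights into it in the parameter format used by caffemodel.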
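
For the 4-D case just described, the reordering can be sketched with a single NumPy transpose. The helper name tf_conv_to_caffe and the kernel sizes are made up for illustration.

```python
import numpy as np


def tf_conv_to_caffe(weights):
    """Reorder [H, W, I, O] (TensorFlow) into [O, I, H, W] (Caffe)."""
    if weights.ndim != 4:
        raise ValueError('expected a 4-D convolution kernel')
    return np.transpose(weights, (3, 2, 0, 1))


# Made-up kernel: 3x3 window, 2 input channels, 4 output channels.
tf_w = np.arange(3 * 3 * 2 * 4, dtype=np.float32).reshape(3, 3, 2, 4)
caffe_w = tf_conv_to_caffe(tf_w)
print(caffe_w.shape)

# The same weight is reached with the indices in the new order.
assert caffe_w[1, 0, 2, 1] == tf_w[2, 1, 0, 1]

# Flatten in row-major order, the order in which the values are written out.
flat = caffe_w.reshape(-1)
print(flat.shape)
```

Using sizes that are not all equal (here 3, 3, 2, 4) makes a wrong axis order visible in the printed shape.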
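The model extracted here is a C3D model trained in TensorFlow, whose convolution kernels are 5-dimensional: [kernel_time, kernel_height, kernel_width, input_channels, output_channels]. They have to be rearranged into the Caffe layout [output_channels, input_channels, kernel_time, kernel_height, kernel_width].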
#!/usr/bin/python
# Written against the TensorFlow 1.x API.
import tensorflow as tf
import numpy as np

config = tf.ConfigProto(allow_soft_placement=True)
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    # Load the graph, then restore the chosen training result.
    new_saver = tf.train.import_meta_graph('/home/shajie/PycharmProjects/C3D-tensorflow-master/models/c3d_ucf_model-2990.meta')
    new_saver.restore(sess, '/home/shajie/PycharmProjects/C3D-tensorflow-master/models/c3d_ucf_model-2990')
    all_vars = tf.trainable_variables()
    for var in all_vars:
        print(var.name)
    for v in all_vars:
        # One output file per variable; '/' in the variable name becomes '_'.
        fname = (v.name + '.prototxt').replace('/', '_')
        print(fname)
        v_4d = np.array(sess.run(v))
        print(v_4d.shape)
        if v_4d.ndim == 5:
            # TensorFlow layout [T, H, W, I, O] -> Caffe layout [O, I, T, H, W].
            # np.swapaxes(X, 0, 2) exchanges the first and third axes of X.
            v_4d = np.swapaxes(v_4d, 0, 4)  # swap T and O -> [O, H, W, I, T]
            v_4d = np.swapaxes(v_4d, 1, 3)  # swap H and I -> [O, I, W, H, T]
            v_4d = np.swapaxes(v_4d, 2, 4)  # swap W and T -> [O, I, T, H, W]
            print(v_4d.shape)  # e.g. (64, 3, 3, 3, 3)
        elif v_4d.ndim != 1:
            # Only 5-D kernels and 1-D biases are handled in this script.
            print('skipping ' + fname)
            continue
        # 1-D variables (biases) are written as they are, without any swap.
        with open(fname, 'w') as f:
            f.write('  blobs {\n')
            for vv in v_4d.reshape(-1):  # flattened, e.g. shape (5184,)
                f.write('    data: %.8f\n' % vv)
            f.write('    shape {\n')
            for s in v_4d.shape:
                f.write('      dim: ' + str(s) + '\n')
            f.write('    }\n')
            f.write('  }\n')
Running this produces one Caffe-format .prototxt weight file for every W and b.
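
The three np.swapaxes calls in the script can be checked against a single np.transpose on a small dummy array. This is a sketch with made-up sizes; swap_chain and single_transpose are names chosen only for this example.

```python
import numpy as np


def swap_chain(w):
    """The three swaps used in the script: [T, H, W, I, O] -> [O, I, T, H, W]."""
    w = np.swapaxes(w, 0, 4)
    w = np.swapaxes(w, 1, 3)
    w = np.swapaxes(w, 2, 4)
    return w


def single_transpose(w):
    """The same reordering expressed as one permutation of the axes."""
    return np.transpose(w, (4, 3, 0, 1, 2))


# Made-up sizes T=2, H=3, W=4, I=5, O=6, all different so mistakes show up.
tf_w = np.arange(2 * 3 * 4 * 5 * 6).reshape(2, 3, 4, 5, 6)
a = swap_chain(tf_w)
b = single_transpose(tf_w)
print(a.shape)
assert np.array_equal(a, b)
```

Either form can be used in the extraction script; the single transpose states the target layout in one line.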

The weights now have the Caffe shape [output_channels, input_channels, kernel_time, kernel_height, kernel_width]. Each .prototxt file ends with a shape block that lists these dimensions.

We now have a test.prototxt file and the parameters of every layer, so all that remains is to write the converted parameters into test.prototxt.

layer {
  name: "conv_layer_name"
  type: "Convolution"
  bottom: "bottom_blob"
  top: "top_blob"
  param { lr_mult: ... }
  convolution_param {
    num_output: output_dims
    kernel_size: kernel_size
    pad: padding_size
    stride: stride
    bias_term: false
  }
  # add params
  blobs: {
    data: ...
    ...
    shape {
      dim: ...
    }
  }
}

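This yields the model.prototxt file, which is quite large. I strongly recommend using a script to splice the files together into the final model file. (How to do the splicing will be added later: https://blog.csdn.net/jiongnima/article/details/78382972).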
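
A splicing script could look roughly like the sketch below. It is not the author's script: the function add_blobs_to_layer and the sample layer and blob text are invented for illustration, and it assumes each layer block is handled as a separate string that ends with its closing brace.

```python
def add_blobs_to_layer(layer_text, blob_texts):
    """Insert blob blocks just before the closing brace of one layer block."""
    body = layer_text.rstrip()
    if not body.endswith('}'):
        raise ValueError('layer block must end with a closing brace')
    head = body[:-1].rstrip()  # everything before the final brace
    blobs = ''.join(
        text if text.endswith('\n') else text + '\n' for text in blob_texts
    )
    return head + '\n' + blobs + '}\n'


# Hypothetical minimal layer and weight blob, for illustration only.
layer = 'layer {\n  name: "conv1"\n  type: "Convolution"\n}\n'
blob = '  blobs {\n    data: 0.50000000\n    shape {\n      dim: 1\n    }\n  }\n'

merged = add_blobs_to_layer(layer, [blob])
print(merged)
```

In practice the blob text would be read from the per-variable files written by the extraction script. For a convolution layer with a bias, pass the weight blob first and the bias blob second, since Caffe stores them in that order.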

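The model.prototxt file is then converted into a .caffemodel file, which the test program loads.

model.prototxt mainly records the network architecture together with the parameters of each layer. To run a forward pass in Caffe, however, we need a caffemodel file. So how do we generate a caffemodel from a parameter file in model.prototxt format? (Recall that the caffemodel contains most of what is in train.prototxt, and that train.prototxt, besides fixing the training architecture and the parameter configuration, more importantly defines the key names.)

To generate the caffemodel, first use ReadProtoFromTextFile to read the parameters from the file into a proto, then use WriteProtoToBinaryFile to write the parameters in the proto out as a caffemodel.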

#include <caffe/caffe.hpp>
#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/text_format.h>
#include <algorithm>
#include <iosfwd>
#include <memory>
#include <string>
#include <utility>
#include <vector>
#include <iostream>
#include "caffe/common.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/io.hpp"

using namespace caffe;
using namespace std;
using google::protobuf::io::FileInputStream;
using google::protobuf::io::FileOutputStream;
using google::protobuf::io::ZeroCopyInputStream;
using google::protobuf::io::CodedInputStream;
using google::protobuf::io::ZeroCopyOutputStream;
using google::protobuf::io::CodedOutputStream;
using google::protobuf::Message;

int main()
{
    NetParameter proto;
    // Read the proto that was saved as a text file.
    if (!ReadProtoFromTextFile("/home/cvlab/model/model.prototxt", &proto)) {
        cerr << "failed to read model.prototxt" << endl;
        return 1;
    }
    // Save the proto to a file, serialized in binary format.
    WriteProtoToBinaryFile(proto, "/home/cvlab/model/model.caffemodel");
    return 0;
}

Save the code above as wm.cpp. The corresponding CMakeLists.txt is:

cmake_minimum_required (VERSION 2.8)

project (write_model)

add_executable(write_model wm.cpp)

include_directories ( /home/cvlab/caffe-master/include
                      /usr/local/include
                      /usr/local/cuda/include
                      /usr/include )

target_link_libraries(write_model
    /home/cvlab/caffe-master/build/lib/libcaffe.so
    /usr/lib/x86_64-linux-gnu/libglog.so
    /usr/lib/x86_64-linux-gnu/libboost_system.so
)

To run the program, first build it:

(1) cmake .

(2) make

Then run it:

(3) ./write_model

A brand-new model.caffemodel file is generated.

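In summary, these are the steps the author worked out for converting a model from TensorFlow to Caffe:

(1) First understand what a caffemodel actually contains. tensorflow2caffe(1)

(2) Extract the model parameters from TensorFlow. tensorflow2caffe(2)

(3) Reproduce the network structure in .prototxt format, rewrite layers in Caffe where needed, and move the parameters extracted from TensorFlow over to Caffe to obtain the network parameter file. tensorflow2caffe(3)

(4) Generate the caffemodel, as described in the first half of this post.

(5) Run a forward pass with the network structure file and the caffemodel to get results. See "Calling a Caffe-trained model from a C++ program for classification".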
