Compressing a Neural Network: Experiment Log (Pruning + Rebirth + MobileNet)

This post is reposted from: https://blog.csdn.net/jason19966/article/details/79137666

Experiment Plan

  1. Design a baseline neural network; record its running time, model size, and accuracy. (A sketch of how these three metrics can be measured is given right after this list.)
  2. Prune that network; record its running time, model size, and accuracy.
  3. Apply rebirth to each of the pruned networks; record their running time, model size, and accuracy.
  4. Build a MobileNet version of all of the networks above; record their running time, model size, and accuracy.
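
Every step records the same three metrics. Below is a minimal sketch of how they can be measured on a frozen .pb graph; the tensor names input_x, label_y, keep_prob and accuracy/accuracy come from the training script at the end of this post, while the helper itself is only an illustration, not the original measurement code:

import os
import time
import tensorflow as tf

def measure_frozen_model(pb_path, mnist, batch_size=50):
    # Model size is simply the size of the frozen .pb file on disk.
    size_mb = os.path.getsize(pb_path) / 1024.0 / 1024.0

    # Load the frozen graph.
    graph = tf.Graph()
    with graph.as_default():
        graph_def = tf.GraphDef()
        with tf.gfile.GFile(pb_path, 'rb') as f:
            graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')

    with tf.Session(graph=graph) as sess:
        acc_t = graph.get_tensor_by_name('accuracy/accuracy:0')
        x_t = graph.get_tensor_by_name('input_x:0')
        y_t = graph.get_tensor_by_name('label_y:0')
        keep_t = graph.get_tensor_by_name('keep_prob:0')

        # Accuracy and total prediction time over the whole test set.
        n_batches = mnist.test.images.shape[0] // batch_size
        acc, start = 0.0, time.time()
        for i in range(n_batches):
            acc += sess.run(acc_t, feed_dict={
                x_t: mnist.test.images[i * batch_size:(i + 1) * batch_size],
                y_t: mnist.test.labels[i * batch_size:(i + 1) * batch_size],
                keep_t: 1.0})
        elapsed_ms = (time.time() - start) * 1000.0
    return size_mb, elapsed_ms, acc / n_batches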

Experiment Setup

  1. The original network architecture (later abandoned) was:
    [Figure: original network architecture]
    In testing, this architecture proved very hard to train: the original model only passed 50% accuracy after 10 epochs, and the MobileNet variant derived from it could not be trained at all, its accuracy stayed around 10%, no better than random guessing.

  2. I therefore switched back to the architecture I had used before:
    [Figure: revised model architecture]
    Switching architectures fixed the problem above, but I still cannot fully explain it: the parallel branches at the front of the first network were modeled on GoogLeNet, so why did it perform so badly? Later, through ...

  3. Dataset: the MNIST handwritten digit dataset.

Experiment Details

Training Difficulty Comparison

  1. The accuracy curve of the original network is shown below; in under 2k steps the accuracy is already in the 0.9x range.
    [Figure: training curve of the original network]
  2. The accuracy curve of the MobileNet version is shown below; accuracy stays very low at first, only starts to climb after about 6k steps, and does not reach 90%+ until after roughly 10k steps.
    [Figure: training curve of the MobileNet version]
    Comparing the two curves, the network rebuilt with MobileNet-style convolutions is clearly much harder to train. My machine runs slowly, and early on I nearly gave up.

Accuracy, Time, and Model Size

Steps and Data
1. Train the original model with the architecture above, saved as t001.pb; record its size, prediction speed, and accuracy.
2. Iteratively prune the original model for 7 rounds, producing the seven models t101.pb, t102.pb, ... t107.pb; record the size, prediction speed, and accuracy of each. (conv2, conv3, fc1 denote the outputs of the corresponding layers; a minimal channel-pruning sketch follows this list.)
[Figure: pruning results]
3. Apply rebirth to the original model and to the pruned models. Since rebirth does not change the model size, mainly record speed and accuracy. (Time 1 and Time 2 are simply two separate runs of the same model; they show that prediction speed is still noticeably affected by the state of the machine.)
[Figure: rebirth results]
4. Prune the rebirthed models again (pruning starts from the rebirth results, e.g. t501 corresponds to t401 and t502 to t402).
[Figure: results of pruning the rebirthed models]
5. From the original model t001.pb and the once-pruned model t101.pb, build the corresponding MobileNet models t201.pb (from the original model) and 201.pb (from the once-pruned model).
[Figure: MobileNet results]
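
The pruning code itself is not included in this post. As a rough illustration of what pruning the conv2/conv3/fc1 outputs can look like, here is a minimal magnitude-based channel-pruning sketch; the L1 ranking, the keep_ratio, and the helper name are my own assumptions for illustration, not the author's method:

import numpy as np

def prune_output_channels(conv_w, conv_b, next_w, keep_ratio=0.75):
    """Drop the weakest output channels of a conv layer and the matching
    input channels of the layer that follows it.

    conv_w: [kh, kw, in_c, out_c] kernel of the layer being pruned
    conv_b: [out_c] bias of that layer
    next_w: [kh, kw, out_c, next_out] kernel of the next layer
    """
    out_c = conv_w.shape[-1]
    # Rank output channels by the L1 norm of their kernels.
    scores = np.sum(np.abs(conv_w.reshape(-1, out_c)), axis=0)
    n_keep = max(1, int(round(out_c * keep_ratio)))
    keep = np.sort(np.argsort(scores)[-n_keep:])
    # Keep only the selected channels, in both the pruned layer and
    # the input side of the following layer.
    return conv_w[..., keep], conv_b[keep], next_w[:, :, keep, :]

After slicing, the smaller network is typically rebuilt with the reduced channel counts and fine-tuned before the next pruning round, which is how a chain like t101.pb through t107.pb would be produced.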
Analysis of Results
1. Pruning works very well for both shrinking the model and speeding it up.
2. Rebirth works very well for speeding up prediction.
3. Combining pruning and rebirth and iterating them yields a model that is very small and very fast, with essentially no loss of accuracy.
4. MobileNet speeds up prediction, but the resulting network is considerably harder to train.
Open Issues
1. The MobileNet model actually came out larger than the model it was built from. (This may well be a problem in my code; my MobileNet code is below, and I would appreciate anyone pointing out the issue.)
The code implements the MobileNet structure by replacing the original convolutions with a depthwise-separable convolution helper (depseparable_conv in the network below; depseparable_conv3v3 is a similar helper that is defined but not actually used).
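
For reference, counting weights directly from the shapes used in the script gives a sense of where the size should come from. The numbers below follow only from the weight shapes that appear in the code, and they show that the separable layers are far smaller than plain convolutions would be, while fc1 dominates the parameter count either way:

# Parameter counts derived from the weight shapes in the script below.
conv2_separable = 5*5*1*1 + 1*1*1*10    # depthwise + pointwise = 35
conv2_regular   = 5*5*1*10              # plain 5x5 conv        = 250
conv3_separable = 5*5*10*1 + 1*1*10*27  # = 520
conv3_regular   = 5*5*10*27             # = 6750
fc1             = 7*7*27*496            # = 656208
print(conv2_separable, conv2_regular, conv3_separable, conv3_regular, fc1)

So the separable convolutions cannot by themselves explain a larger .pb; if the frozen file really grows, the difference more likely comes from the rest of the network or from how the graph was frozen.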

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import tensorflow.contrib.slim as slim
import time

def depseparable_conv3v3(input_data, depthwise_filter, pointwise_filter, name):
    # Depthwise-separable convolution: a depthwise filter followed by a 1x1 pointwise filter.
    # Example shapes: input [1, 9, 9, 3], depthwise_filter [2, 2, 3, 4], pointwise_filter [1, 1, 12, 20].
    y = tf.nn.separable_conv2d(input_data, depthwise_filter, pointwise_filter,
                               strides=[1, 1, 1, 1], padding='SAME', name=name)
    return y

log_dir = 'F:'
modlefile = '201.pb'

def variable_summaries(var):
    """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
    with tf.name_scope('summaries'):
      # Record the mean of the variable with tf.summary.scalar
      mean = tf.reduce_mean(var)
      tf.summary.scalar('mean', mean)
      # Compute the standard deviation
      with tf.name_scope('stddev'):
        stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
      # Record the standard deviation, maximum and minimum with tf.summary.scalar
      tf.summary.scalar('stddev', stddev)
      tf.summary.scalar('max', tf.reduce_max(var))
      tf.summary.scalar('min', tf.reduce_min(var))
      # Record the distribution of the variable with a histogram
      tf.summary.histogram('histogram', var)

def load_data():
    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets('/tmp/data/', one_hot=True)
    return mnist


def weight_variable(shape,name):
    initial=tf.truncated_normal(shape,stddev=0.1)
    return tf.Variable(initial,name =name)
def bias_variable(shape,name):
    initial=tf.constant(0.1,shape=shape)
    return tf.Variable(initial,name=name)
def conv2d(x,W,name):
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='VALID',name=name)
def conv2d_same(x,W,name):
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='SAME',name=name)
def depseparable_conv(input_data,depthwise_filter, pointwise_filter,name):
    # input_data = tf.Variable(np.random.rand(1, 9, 9, 3), dtype=np.float32)
    # depthwise_filter = tf.Variable(np.random.rand(2, 2, 3, 1), dtype=np.float32)
    # pointwise_filter = tf.Variable(np.random.rand(1, 1, 3, 20), dtype=np.float32)
    y = tf.nn.separable_conv2d(input_data, depthwise_filter, pointwise_filter, strides=[1, 1, 1, 1], padding='SAME',name = name+"_separa")
    return y
def max_pool_2x2(x,name):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding='VALID',name = name)
def max_pool_2x2_same(x,name):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,1,1,1],padding='SAME',name = name)
def variable_weight_loss(shape,stddev,w1):
    var=tf.Variable(tf.truncated_normal(shape,stddev=stddev))
    if w1 is not None:
        weight_loss=tf.multiply(tf.nn.l2_loss(var),w1,name="weight_loss")
        tf.add_to_collection("losses",weight_loss)
    return var

def evaluate_pictures(n_epochs=50,batch_size=50):
    def loss(logits, labels):
        # Labels are one-hot float vectors, the dtype that
        # softmax_cross_entropy_with_logits expects.
        labels = tf.cast(labels, tf.float32)
        cross_entropy = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits), name='cross_entropy')  # softmax is applied internally
        tf.add_to_collection('losses', cross_entropy)
        return tf.add_n(tf.get_collection('losses'), name='total_loss')

    mnist = load_data()
    train_set_x  = mnist.train.images
    train_set_y = mnist.train.labels
    test_set_x = mnist.test.images
    test_set_y = mnist.test.labels



    # Number of mini-batches in each dataset
    n_train_batches = train_set_x.shape[0]
    n_test_batches  = test_set_x.shape[0]
    n_train_batches = int(n_train_batches / batch_size)
    n_test_batches  = int(n_test_batches / batch_size)
    print("... building the model")

    # Build the network
    x = tf.placeholder(tf.float32, shape=[None, 784], name = 'input_x')
    y = tf.placeholder(tf.float32, shape=[None, 10], name = 'label_y')
    keep_prob = tf.placeholder(tf.float32,name = 'keep_prob')
    x_images = tf.reshape(x, [-1, 28, 28, 1], name = 'x_tensor')
    tf.summary.image('input', x_images, 10)


    with tf.name_scope("conv2"):
        wd_cov2 = weight_variable([5, 5, 1, 1],name = 'wd_conv2')
        wp_cov2 = weight_variable([1, 1, 1, 10],name = 'wp_conv2')
        b_cov2 = bias_variable([10],name = 'b_conv2')
        h_cov2 = tf.nn.relu(depseparable_conv(x_images, wd_cov2,wp_cov2, name = 'conv2') + b_cov2,name = 'relu_conv2')
        h_pool2 = max_pool_2x2(h_cov2,name = 'maxpooling_conv2')
    with tf.name_scope("conv3"):
        wd_cov3 = weight_variable([5, 5, 10, 1],name = 'wd_conv3')
        wp_cov3 = weight_variable([1, 1, 10, 27],name = 'wp_conv3')
        b_cov3 = bias_variable([27],name = 'b_conv3')
        h_cov3 = tf.nn.relu(depseparable_conv(h_pool2, wd_cov3,wp_cov3, name = 'conv3') + b_cov3,name = 'relu_conv3')
        h_pool3 = max_pool_2x2(h_cov3,name = 'maxpooling_conv3')
    with tf.name_scope("fc1"):
        h_pool2_reshape = tf.reshape(h_pool3, [-1, 7*7*27], name='cnn_fc_convert')
        w_fc1 = weight_variable([7*7*27,496],name = 'w_fc1')
        b_fc1 = bias_variable([496],name = 'b_fc1')
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_reshape, w_fc1) + b_fc1,name= 'relu_fc1')
    with tf.name_scope("dropout"):
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob,name= 'dropout')
    with tf.name_scope("fc2"):
        w_fc2 = weight_variable([496, 10],name = 'w_fc2')
        b_fc2 = bias_variable([10],name = 'b_dc2')
        y_conv = tf.nn.bias_add(tf.matmul(h_fc1_drop, w_fc2), b_fc2,name = 'y')
    with tf.name_scope("loss"):
        loss=loss(labels=y, logits=y_conv)
    train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss)
    with tf.name_scope("accuracy"):
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32),name='accuracy')
        tf.summary.scalar('accuracy', accuracy)
    # Start the session
    sess=tf.Session()
    sess.run(tf.global_variables_initializer()) # initialize the graph variables

    best_validation_acc = 0
    epoch = 0

    print("... training")

    print(tf.get_default_graph().get_collection(tf.GraphKeys.GLOBAL_VARIABLES))

    # Merge all summaries
    merged = tf.summary.merge_all()
    # Write them to the given paths on disk
    train_writer = tf.summary.FileWriter('F:/sum/train_mo', sess.graph)
    test_writer = tf.summary.FileWriter(log_dir + '/test')
    while (epoch < n_epochs):
        epoch = epoch + 1
        for minibatch_index in range(n_train_batches):
            iter = (epoch - 1) * n_train_batches + minibatch_index
            # Trace each training step so TensorBoard can show per-step timing.
            run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
            run_metadata = tf.RunMetadata()
            summary, acc, _ = sess.run([merged, accuracy, train_step],
                feed_dict={x: train_set_x[minibatch_index * batch_size: (minibatch_index + 1) * batch_size],
                           y: train_set_y[minibatch_index * batch_size: (minibatch_index + 1) * batch_size],
                           keep_prob: 0.5},
                options=run_options, run_metadata=run_metadata)
            print('epoch %i, step %d, minibatch %i / %i, train acc %f' % (epoch, iter, minibatch_index + 1, n_train_batches, acc))
            train_writer.add_run_metadata(run_metadata, 'step%03d' % iter)
            train_writer.add_summary(summary, iter)

            if (iter + 1) % 100 == 0:
                valid_acc=0
                for i in range(n_test_batches):
                    acc=sess.run([accuracy],feed_dict={x: test_set_x[i*batch_size:(i+1)*batch_size], y: test_set_y[i*batch_size:(i+1)*batch_size], keep_prob:1})
                    valid_acc =valid_acc+ acc[0]
                valid_acc=valid_acc/n_test_batches
                print('                         validation acc %g' %(valid_acc ))
                if valid_acc>best_validation_acc:
                    best_validation_acc=valid_acc
                    output_graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, output_node_names=["accuracy/accuracy"])
                    with tf.gfile.FastGFile(modlefile, mode = 'wb') as f:
                        f.write(output_graph_def.SerializeToString())

    train_writer.close()
    print('Optimization complete.')
    # Final evaluation: accuracy and total prediction time over the test set.
    test_acc = 0
    start_time=time.time()
    valid_acc=0
    print(start_time)
    for i in range(n_test_batches):
        valid_acc =valid_acc+ sess.run(accuracy,feed_dict={x: test_set_x[i*batch_size:(i+1)*batch_size], y: test_set_y[i*batch_size:(i+1)*batch_size], keep_prob:1})
    end_time=time.time()
    test_acc=valid_acc/n_test_batches
    print("test accuracy %g" % test_acc)
    print((end_time - start_time)*1000/60)

if __name__ == '__main__':
    evaluate_pictures()
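
One way to chase the model-size question further is to count the weights that actually end up frozen into each .pb. convert_variables_to_constants stores them as Const nodes, so a small sketch like the following (the file name is just an example) can add up their element counts and show which layers account for the size:

import numpy as np
import tensorflow as tf

def count_frozen_parameters(pb_path):
    # Sum the number of elements stored in every Const node of a frozen graph.
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(pb_path, 'rb') as f:
        graph_def.ParseFromString(f.read())
    total = 0
    for node in graph_def.node:
        if node.op == 'Const':
            shape = [d.size for d in node.attr['value'].tensor.tensor_shape.dim]
            n = int(np.prod(shape)) if shape else 1
            print(node.name, shape, n)
            total += n
    return total

print(count_frozen_parameters('201.pb'))

Comparing these per-node counts between, say, t001.pb and 201.pb should show whether the extra size comes from the weights themselves or from graph overhead.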

Related papers:
Rebirth: paper link
MobileNet: paper link
Pruning: not found.

