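I have been working on quantization recently, and on my advisor's recommendation I read this paper. It is a 2018 paper from Google and it is genuinely well done: the explanation is very detailed, and it comes with code that actually works.

1. Reference paper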

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko

(Submitted on 15 Dec 2017)

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
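The scheme in the paper maps a real value r to an integer q through r = S(q - Z), where S is a real-valued scale and Z is an integer zero point. Below is a minimal pure-Python sketch of that mapping for 8-bit values. The function names and the example range [-1, 3] are illustrative assumptions, not taken from the paper's code.

```python
def choose_params(rmin, rmax, qmin=0, qmax=255):
    """Pick scale S and zero point Z so that [rmin, rmax] maps onto [qmin, qmax]."""
    # Widen the range to contain 0 so that real zero is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate range [0, 0]; any positive scale works
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(r, scale, zero_point, qmin=0, qmax=255):
    """Real value -> integer in [qmin, qmax], clamping out-of-range inputs."""
    q = int(round(r / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Integer -> real value, r = S * (q - Z)."""
    return scale * (q - zero_point)


scale, zero_point = choose_params(-1.0, 3.0)
q = quantize(0.5, scale, zero_point)
r = dequantize(q, scale, zero_point)
print(scale, zero_point, q, r)
```

The round trip returns a value within half a quantization step of the input, and real zero maps to the zero point exactly, which matters for zero padding.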

Article walkthrough:

Google CVPR 2018 paper: CNN quantization techniques

Additionally, the minimum and maximum values for activations are determined during training. This allows a model trained with quantization in the loop to be converted to a fixed point inference model with little effort, eliminating the need for a separate calibration step.

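In other words, the minimum and maximum values of the activations are determined during training, so a model trained with quantization in the loop can be converted to a fixed-point inference model with almost no effort and needs no separate calibration step. (Calibration is the step that obtains the value ranges of the parameters.)

2. Implementation

GitHub code: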
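The paper collects the activation ranges seen during training and smooths them with an exponential moving average (EMA). The following is a rough pure-Python sketch of that bookkeeping only. The class name `EmaRange`, the decay of 0.9, and the toy batches are assumptions for illustration; the paper uses a smoothing parameter close to 1.

```python
class EmaRange:
    """Tracks a smoothed [min, max] range over successive activation batches."""

    def __init__(self, decay=0.9):
        self.decay = decay
        self.min = None
        self.max = None

    def update(self, batch):
        lo, hi = min(batch), max(batch)
        if self.min is None:
            # The first batch initializes the range directly.
            self.min, self.max = lo, hi
        else:
            d = self.decay
            self.min = d * self.min + (1 - d) * lo
            self.max = d * self.max + (1 - d) * hi
        return self.min, self.max


tracker = EmaRange(decay=0.9)
for batch in ([0.0, 2.0, 4.0], [-1.0, 3.0, 6.0], [0.5, 1.0, 5.0]):
    tracker.update(batch)
print(tracker.min, tracker.max)
```

At conversion time the tracked range is what fixes the scale and zero point of each activation tensor, which is why no separate calibration pass over data is needed.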

https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

The linked model tar files contain the following:

Trained model checkpoints:

mobilenet_v1_1.0_224.ckpt.data-00000-of-00001 (stores the variables and their values)

mobilenet_v1_1.0_224.ckpt.index

mobilenet_v1_1.0_224.ckpt.meta (stores the graph structure)

Eval graph text protos (to be easily viewed): mobilenet_v1_1.0_224_eval.pbtxt
Frozen trained models: mobilenet_v1_1.0_224_frozen.pb (model size: 17173742 bytes)
Info file containing input and output information: mobilenet_v1_1.0_224_info.txt
Converted TensorFlow Lite flatbuffer model: mobilenet_v1_1.0_224.tflite (model size: 4276000 bytes)

Note that quantized model GraphDefs (the .pb files) are still float models; they just have FakeQuantization operations embedded to simulate quantization. These are converted by TensorFlow Lite to be fully quantized. The final effect of quantization can be seen by comparing the size of the frozen fake-quantized graph to the size of the TFLite flatbuffer, i.e. the TFLite flatbuffer is about 1/4 the size. For more information on the quantization techniques used here, see here.
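The "about 1/4" figure can be checked against the two file sizes listed above. This small calculation assumes those sizes are in bytes; the ratio matches what one would expect when each weight is stored as one 8-bit integer instead of a 4-byte float32.

```python
frozen_pb_size = 17173742  # mobilenet_v1_1.0_224_frozen.pb (size as listed above)
tflite_size = 4276000      # mobilenet_v1_1.0_224.tflite (size as listed above)

ratio = tflite_size / frozen_pb_size
print(f"TFLite / frozen pb = {ratio:.3f}")
```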

MobileNet V1 scripts (path: models/research/slim/nets/mobilenet_v1.md; this describes the MobileNet scripts in the nets folder)

This package contains scripts for training floating point and eight-bit fixed point TensorFlow models.

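Quantization tools used are described in contrib/quantize. (the quantization tools)

Conversion to fully quantized models for mobile can be done through TensorFlow Lite. (As noted above, the files at this stage only contain fake quantization, so full quantization requires converting them with TensorFlow Lite.)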
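One possible way to do this conversion from Python is sketched below. It assumes a TensorFlow release that exposes `tf.compat.v1.lite.TFLiteConverter` (newer than the TF 1.7 environment used in this post), and the tensor names, input shape, and the input mean/std of 128/127 are assumptions that should be checked against the model's `_info.txt` file. The function name is hypothetical.

```python
def convert_to_quantized_tflite(frozen_pb_path, tflite_path,
                                input_name="input",
                                output_name="MobilenetV1/Predictions/Reshape_1",
                                input_shape=(1, 224, 224, 3),
                                input_mean=128.0, input_std=127.0):
    """Convert a frozen fake-quantized GraphDef into a uint8 TFLite flatbuffer."""
    # Imported lazily so the module can be loaded without TensorFlow installed.
    import tensorflow as tf

    converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
        frozen_pb_path,
        input_arrays=[input_name],
        output_arrays=[output_name],
        input_shapes={input_name: list(input_shape)},
    )
    # Ask for fully quantized uint8 inference; the ranges come from the
    # FakeQuantization ops already embedded in the graph.
    converter.inference_type = tf.uint8
    # (mean, std) describing how uint8 input pixels map to real values.
    converter.quantized_input_stats = {input_name: (input_mean, input_std)}

    tflite_model = converter.convert()
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
    return len(tflite_model)  # size of the flatbuffer in bytes
```

A call would look like `convert_to_quantized_tflite("/tmp/mobilenet_v1_1.0_224_frozen.pb", "/tmp/mobilenet_v1_1.0_224.tflite")`, where both paths are placeholders.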

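Accuracies were computed by evaluating using a single image crop. (That is, no multi-scale evaluation was used; these accuracies come from evaluating a single crop per image. Some academic papers report higher accuracy by evaluating multiple crops at multiple scales.)

I don't use Bazel; running the scripts directly works fine. (Bazel relies on the BUILD and WORKSPACE files, which are worth reading for the corresponding rules and dependencies.)

Results: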

######### float ##########

(tf17-2) fuhao@user-ubuntu:~/workspace/projects/tf-models/research/slim$ CUDA_VISIBLE_DEVICES=0 python nets/mobilenet_v1_eval.py \
    --dataset_dir "/home/fuhao/workspace/data/ILSVRC2012" \
    --checkpoint_dir "/tmp/checkpoints/mobilenet_v1_1.0_224.ckpt" \
    --eval_dir "/tmp/mobilenet/imagenet/eval-float"

eval/Recall_5[0.89988]
eval/Accuracy[0.7102]

######### quantization ##########

(tf17-2) fuhao@user-ubuntu:~/workspace/projects/tf-models/research/slim$ CUDA_VISIBLE_DEVICES=0 python nets/mobilenet_v1_eval.py \
    --dataset_dir "/home/fuhao/workspace/data/ILSVRC2012" \
    --checkpoint_dir "/tmp/checkpoints/mobilenet_v1_1.0_224_quant.ckpt" \
    --eval_dir "/tmp/mobilenet/imagenet/eval-22-true" \
    --quantize=True

eval/Recall_5[0.89226]
eval/Accuracy[0.69914]
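To put the two runs side by side, the accuracy cost of quantization can be computed from the numbers above. Here eval/Accuracy is read as top-1 accuracy and eval/Recall_5 as top-5; the variable names are only for illustration.

```python
float_top1, float_top5 = 0.7102, 0.89988   # float run above
quant_top1, quant_top5 = 0.69914, 0.89226  # quantized run above

top1_drop = float_top1 - quant_top1
top5_drop = float_top5 - quant_top5
print(f"top-1 drop: {top1_drop * 100:.2f} points")
print(f"top-5 drop: {top5_drop * 100:.2f} points")
```

Both drops are on the order of one percentage point, which is the trade-off accepted in exchange for the smaller 8-bit model.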

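Next steps:

Use TFLite to fully quantize the model, then look at the contents of the quantized model (how can the contents of a .tflite file be opened and inspected?)

How is a .tflite file actually used?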
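One way to approach both questions is the TFLite Python interpreter, which can list the tensors stored in a flatbuffer and run inference on it. The sketch below assumes a TensorFlow release that provides `tf.lite.Interpreter` (newer than TF 1.7) and feeds an all-zero dummy input; the function names are hypothetical.

```python
def dequantize(q, scale, zero_point):
    """Map quantized values back to real values: r = scale * (q - zero_point)."""
    return scale * (q - zero_point)


def inspect_and_run(tflite_path):
    """Print every tensor in a .tflite file, then run one dummy inference."""
    # Imported lazily so the helper above works without these packages.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()

    # Each entry describes one tensor: name, shape, dtype, (scale, zero_point).
    for t in interpreter.get_tensor_details():
        print(t["name"], t["shape"], t["dtype"], t["quantization"])

    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Dummy input with the shape and dtype the model expects.
    dummy = np.zeros(tuple(inp["shape"]), dtype=inp["dtype"])
    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()
    raw = interpreter.get_tensor(out["index"])

    scale, zero_point = out["quantization"]
    if scale == 0:
        return raw  # float model: the output is already real-valued
    return dequantize(raw.astype(np.float32), scale, zero_point)
```

For a fully quantized model, the printed dtypes should mostly be uint8 for weights and activations, each with its own scale and zero point.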

Visualizing a .pb file: convert the .pb to .pbtxt, see: TensorFlow GraphDef pb files, reading and writing (binary format, text format)

3. Further reading

TensorFlow MobileNet mobile transfer learning guide, part 2
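A binary GraphDef can be turned into a readable text proto with a few lines of Python. This is a sketch assuming a TensorFlow release that provides `tf.compat.v1.GraphDef` and `tf.io.write_graph`; the helper names are hypothetical, and the output is written next to the input file.

```python
import os


def pbtxt_path_for(pb_path):
    """Return the input path with its extension replaced by .pbtxt."""
    root, _ = os.path.splitext(pb_path)
    return root + ".pbtxt"


def pb_to_pbtxt(pb_path):
    """Read a binary GraphDef (.pb) and write it back as a text proto (.pbtxt)."""
    # Imported lazily so pbtxt_path_for can be used without TensorFlow.
    import tensorflow as tf

    graph_def = tf.compat.v1.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())

    out_path = pbtxt_path_for(pb_path)
    tf.io.write_graph(graph_def,
                      os.path.dirname(out_path) or ".",
                      os.path.basename(out_path),
                      as_text=True)
    return out_path
```

The resulting .pbtxt lists every node with its op type and inputs, so the FakeQuantization ops in a quantized graph can be found with a plain text search.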

https://blog.csdn.net/gubenpeiyuan/article/details/79671558

TensorFlow For Poets

https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#3

Developer guide

https://www.tensorflow.org/mobile/tflite/devguide

[Google quantization] MobileNet TensorFlow-Slim