TensorFlow Lite -- camera demo

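Introduction

TF Lite is TensorFlow's official solution for running machine learning models on mobile devices.

Main advantages:

Performance (no significant loss of accuracy)
Low latency
Small model size
Compatibility (Android, iOS)
Acceleration
Tooling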

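The camera demo is an official example. It takes live frames from the camera and uses a pre-trained model to classify the object currently in view.

Reference: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets-2-ios/#2

The key points are listed below.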

Data

.tflite file: mobilenet_v1_1.0_224.tflite (a MobileNet model trained on ImageNet, in FlatBuffers format)
labels: labels.txt (all the corresponding labels)

A .tflite file is a FlatBuffers file. For the advantages of FlatBuffers and a comparison with Protocol Buffers, see: https://blog.csdn.net/chosen0ne/article/details/43033575

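A graph model is usually saved as a .pb file (Protocol Buffers), and the tflite file is converted from that pb file. The conversion also applies some optimizations:

Unused graph nodes are removed (TFLite only runs inference, so nodes needed only for training can be dropped).
Operations are fused into more efficient composite operations where possible, which improves performance.

Reference: https://www.tensorflow.org/lite/convert/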

The TF source tree includes a Python script dedicated to the conversion:

tensorflow/tensorflow/lite/python/tflite_convert.py

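Usage:

graph_def_file: the input pb file
output_file: the output tflite file
input_format: format of the input file
- TENSORFLOW_GRAPHDEF: a GraphDef
input_shape: shape of the input tensor
input_array: name of the input tensor
inference_type: numeric type used for inference
input_data_type: data type of the input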

IMAGE_SIZE=224
tflite_convert \
  --graph_def_file=tf_files/retrained_graph.pb \
  --output_file=tf_files/optimized_graph.lite \
  --input_format=TENSORFLOW_GRAPHDEF \
  --output_format=TFLITE \
  --input_shape=1,${IMAGE_SIZE},${IMAGE_SIZE},3 \
  --input_array=input \
  --output_array=final_result \
  --inference_type=FLOAT \
  --input_data_type=FLOAT

Key methods

1. Load the tflite file

tflite::FlatBufferModel::BuildFromFile() reads the tflite file and returns a std::unique_ptr<tflite::FlatBufferModel>.

2. Load the label file

Read it line by line into a vector.

3. Run the model

Build an Interpreter from the FlatBufferModel

tflite::ops::builtin::BuiltinOpResolver resolver;
std::unique_ptr<tflite::Interpreter> interpreter;
tflite::InterpreterBuilder(*model, resolver)(&interpreter); // model is the FlatBufferModel

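Resize the input tensors with Interpreter->ResizeInputTensor(). After resizing, AllocateTensors() must be called to update the tensors. This is relatively expensive, so do not repeat it when the size has not changed.

int input = interpreter->inputs()[0];
std::vector<int> sizes = {1, 224, 224, 3};
interpreter->ResizeInputTensor(input, sizes);
if (interpreter->AllocateTensors() != kTfLiteOk) {
    LOG(FATAL) << "Failed to allocate tensors!";
}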

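Set the values of the input tensor

// Get a pointer to the input tensor's data
float* out = interpreter->typed_tensor<float>(input);
ProcessInputWithFloatModel(in, out, image_width, image_height, image_channels);

// Fill the input tensor's raw buffer with normalized pixel values.
// wanted_input_width, wanted_input_height, wanted_input_channels, input_mean
// and input_std are constants defined elsewhere in the demo.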
void ProcessInputWithFloatModel(
    uint8_t* input, float* buffer, int image_width, int image_height, int image_channels) {
  for (int y = 0; y < wanted_input_height; ++y) {
    float* out_row = buffer + (y * wanted_input_width * wanted_input_channels);
    for (int x = 0; x < wanted_input_width; ++x) {
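      // Note: in_x is computed from y and in_y from x, so the sampled image is
      // transposed relative to the source frame. Swap x and y here if the
      // frame should not be transposed.
      const int in_x = (y * image_width) / wanted_input_width;
      const int in_y = (x * image_height) / wanted_input_height;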
      uint8_t* input_pixel =
          input + (in_y * image_width * image_channels) + (in_x * image_channels);
      float* out_pixel = out_row + (x * wanted_input_channels);
      for (int c = 0; c < wanted_input_channels; ++c) {
        out_pixel[c] = (input_pixel[c] - input_mean) / input_std;
      }
    }
  }
}
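
The resize-and-normalize step can also be written as a self-contained function, which makes it easier to try outside the demo. The sketch below is not the demo's code: ResizeAndNormalize and its parameter names are hypothetical, the sizes and the mean/std are passed in instead of being constants, and it samples the source without swapping x and y. It assumes the output has no more channels than the input (for example RGBA in, RGB out).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: nearest-neighbour resize followed by
// (pixel - mean) / std_dev normalization.
// Assumes out_channels <= image_channels; extra input channels are ignored.
std::vector<float> ResizeAndNormalize(const uint8_t* input, int image_width,
                                      int image_height, int image_channels,
                                      int out_width, int out_height,
                                      int out_channels, float mean,
                                      float std_dev) {
  std::vector<float> out(static_cast<std::size_t>(out_width) * out_height *
                         out_channels);
  for (int y = 0; y < out_height; ++y) {
    // Map the output row to the nearest source row.
    const int in_y = (y * image_height) / out_height;
    for (int x = 0; x < out_width; ++x) {
      // Map the output column to the nearest source column.
      const int in_x = (x * image_width) / out_width;
      const uint8_t* in_pixel =
          input + (in_y * image_width + in_x) * image_channels;
      float* out_pixel = out.data() + (y * out_width + x) * out_channels;
      for (int c = 0; c < out_channels; ++c) {
        out_pixel[c] = (in_pixel[c] - mean) / std_dev;
      }
    }
  }
  return out;
}
```

As an example, with an assumed mean and std_dev of 127.5 each, a pixel value of 0 maps to -1.0 and a value of 255 maps to 1.0. The right constants depend on how the model was trained.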

Invoke runs the model; the results are stored in the output tensor.

if (interpreter->Invoke() != kTfLiteOk) {
    LOG(FATAL) << "Failed to invoke!";
}

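Read the values of the output tensor

// The model converted above is a float model, so read the output as float.
// A quantized model would expose uint8_t values via typed_output_tensor<uint8_t>(0) instead.
float* output = interpreter->typed_output_tensor<float>(0);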
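
When the model is quantized, the output tensor holds uint8_t values that have to be converted back to floats before scores are compared against a float threshold. The sketch below shows the usual affine mapping, real = scale * (q - zero_point). Dequantize is a hypothetical helper, and the scale and zero point are passed in as plain parameters; in a real program they would come from the output tensor's quantization parameters.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts quantized uint8 values to floats using
// real = scale * (q - zero_point).
std::vector<float> Dequantize(const uint8_t* quantized, std::size_t count,
                              float scale, int zero_point) {
  std::vector<float> out(count);
  for (std::size_t i = 0; i < count; ++i) {
    out[i] = scale * static_cast<float>(static_cast<int>(quantized[i]) -
                                        zero_point);
  }
  return out;
}
```

A scale of 1/255 with a zero point of 0 simply maps the range 0..255 onto 0..1.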

Take the top N predictions that exceed the threshold

GetTopN(output, output_size, kNumResults, kThreshold, &top_results);
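
The body of GetTopN is not shown here, so the following is only a sketch of one way to implement it, with the parameter order inferred from the call above. It keeps a min-heap of at most num_results candidates, so the weakest kept score is always the one that gets evicted, and it returns (score, index) pairs sorted from highest to lowest score.

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Sketch of a top-N selection; the signature is assumed from the call site.
// Each result is a (score, class index) pair.
void GetTopN(const float* prediction, int prediction_size, int num_results,
             float threshold,
             std::vector<std::pair<float, int>>* top_results) {
  using Entry = std::pair<float, int>;
  // Min-heap: the lowest score among the kept entries sits on top.
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
  for (int i = 0; i < prediction_size; ++i) {
    const float value = prediction[i];
    if (value < threshold) {
      continue;  // Skip scores below the threshold.
    }
    heap.emplace(value, i);
    if (static_cast<int>(heap.size()) > num_results) {
      heap.pop();  // Evict the weakest candidate.
    }
  }
  top_results->clear();
  while (!heap.empty()) {
    top_results->push_back(heap.top());
    heap.pop();
  }
  // The heap yields ascending scores; reverse to get the best first.
  std::reverse(top_results->begin(), top_results->end());
}
```

The index in each pair can then be used to look up the label name in the vector read from labels.txt.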

Key classes

FlatBufferModel: the model class
Interpreter: the interpreter class
TfLiteTensor: the tensor class

Methods of the Interpreter class related to input and output tensors:

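// Tensors are referred to by integer indices.
const std::vector<int>& inputs() const; // indices of all input tensors
const std::vector<int>& outputs() const; // indices of all output tensors
TfLiteTensor* tensor(int tensor_index); // returns the tensor at the given index

template <class T>
T* typed_tensor(int tensor_index); // pointer to the raw data of the given tensor
template <class T>
T* typed_input_tensor(int index); // pointer to the data of the index-th input tensor
template <class T>
T* typed_output_tensor(int index); // pointer to the data of the index-th output tensor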

For the steps to train the model, see:

https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#8
