Baidu Brain EdgeBoard Compute Card Performance Evaluation: ResNet50 / MobileNet-SSD Models

ResNet model

Introduction
In the last test, we trained from scratch a model with three convolutional layers followed by one fully connected layer as a cat-vs-dog classifier. This time we train our own ResNet model and compare its performance in the following three environments:

  • AIStudio CPU: 2 Cores 8GB Memory
  • AIStudio GPU: V100 16GB VMem
  • Edgeboard

Training the model

The model is trained on AIStudio; the training and prediction code is available here:

RESNET: https://aistudio.baidu.com/aistudio/projectdetail/67775
MOBILE: https://aistudio.baidu.com/aistudio/projectdetail/67776
Following our previous practice, we export the model file and the param file.
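
In PaddlePaddle 1.x Fluid this export step can be done with fluid.io.save_inference_model, roughly as below (a minimal sketch built around a tiny stand-in network; the directory and file names are placeholders and not taken from the linked projects):

import paddle.fluid as fluid

# Tiny stand-in network only to make the export call concrete;
# in practice `image`, `out` and `exe` come from the training script.
image = fluid.layers.data(name='image', shape=[3, 224, 224], dtype='float32')
out = fluid.layers.fc(input=image, size=2, act='softmax')

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())

# Writes a combined `model` file and `params` file under ./inference_model
fluid.io.save_inference_model(dirname='inference_model',
                              feeded_var_names=[image.name],
                              target_vars=[out],
                              executor=exe,
                              model_filename='model',
                              params_filename='params')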

The test results
We run the prediction, excluding the preprocessing; only the model's forward pass is timed.

For the AIStudio platform, we time the following code:

label = exe.run(inference_program, feed={feed_target_names[0]: tensor_img}, fetch_list=fetch_targets)
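
The measured time is simply the wall-clock time around this call; a minimal sketch of how it can be collected is below (the warm-up run and the repetition count are my own choices, and the model directory and file names are placeholders, not taken from the original projects):

import time
import paddle.fluid as fluid

exe = fluid.Executor(fluid.CUDAPlace(0))   # fluid.CPUPlace() on the CPU environment

# Load the exported model/params files; `tensor_img` is the preprocessed input image.
[inference_program, feed_target_names, fetch_targets] = fluid.io.load_inference_model(
    'inference_model', exe, model_filename='model', params_filename='params')

# One warm-up run so one-off initialization is not counted, then average many runs.
exe.run(inference_program, feed={feed_target_names[0]: tensor_img}, fetch_list=fetch_targets)
runs = 100
start = time.time()
for _ in range(runs):
    exe.run(inference_program, feed={feed_target_names[0]: tensor_img}, fetch_list=fetch_targets)
print('average forward time: %.2f ms' % ((time.time() - start) / runs * 1000))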

For Paddle-mobile on EdgeBoard, we time the following code:

// Build the input tensor (batch size 1) from the preprocessed image buffer.
PaddleTensor tensor;
tensor.shape = std::vector<int>({1, input_channel, input_width, input_height});
tensor.data = PaddleBuf(data, sizeof(data));
tensor.dtype = PaddleDType::FLOAT32;
std::vector<PaddleTensor> paddle_tensor_feeds(1, tensor);

// Output tensor; its shape and buffer are filled in by the predictor.
PaddleTensor tensor_out;
tensor_out.shape = std::vector<int>({});
tensor_out.data = PaddleBuf();
tensor_out.dtype = PaddleDType::FLOAT32;
std::vector<PaddleTensor> outputs(1, tensor_out);

// Only this forward call is timed.
predictor->Run(paddle_tensor_feeds, &outputs);

The following are the evaluation results for the two models.

ResNet

Edgeboard:

CPU:

GPU:


MobileNet
Edgeboard:

GPU:


CPU:


Summary:
The following table compares the prediction speed of the two models. From this point of view, EdgeBoard's speed even has certain advantages over the V100 GPU, which was hard to believe at first. My personal analysis attributes this to several reasons:

  • Paddle-mobile is a lightweight inference library, so it starts up more efficiently than the full PaddlePaddle on AIStudio; prediction on AIStudio may therefore be slow to start.
  • The batch size throughout the prediction is 1, which does not exploit the GPU's strengths.
  • Calculated on a three-year deployment budget, a V100 GPU costs roughly 100,000 RMB, the CPU roughly 10,000 RMB, and EdgeBoard roughly 5,000 RMB, so its cost-effectiveness is still quite high.

  

While the model was running predictions, I made a rough estimate of the power draw with a meter (conditions were limited); the readings varied between 0.6 A and 0.8 A. Combined with the 12 V adapter, I estimate EdgeBoard's power consumption at roughly 8 W (12 V × ~0.7 A ≈ 8.4 W).

At only 8 W of power consumption, EdgeBoard leads both the GPU and the CPU, whose power draw is many times higher, in single-image prediction speed. EdgeBoard's performance was quite a pleasant surprise to me. Some time ago I had wanted to go on and port a large-scale segmentation network, U-Net, to see how large a model it could run, but it seems EdgeBoard currently does not support segmentation, which is somewhat of a pity.

In addition, while debugging I found that several firmware releases were not very stable, with problems in some ops. I also found that EdgeBoard's network connection to my two laptops was not very stable: the two sides often could not ping each other, and things worked normally after switching to a different PC; I could not figure out why.

EdgeBoard is the first embedded neural-network acceleration device I have used. Paddle-mobile is the first mobile neural-network framework I have used, and also the first FPGA-based acceleration framework I have encountered. In the less than six months since I first learned of this framework, it has released several model conversion tools, reducing the development effort, and it also supports deployment via EasyDL. Although there are still some rough edges to be worked out, I believe that with further software iteration it can become a good embedded prototyping platform.

MobileNet-SSD model

This time we train our own MobileNet-SSD model, increase the input to different dimensions, and compare the model's runtime efficiency in the following three environments:

  • AIStudio CPU: 2 Cores 8GB Memory
  • AIStudio GPU: V100 16GB VMem
  • Edgeboard

Training the model
The model is trained using the official project provided on AIStudio; the training and prediction code is available here:

SSD-MobileNet: https://aistudio.baidu.com/aistudio/projectdetail/41752
Following our previous practice, we export the model file and the param file.

Running the prediction
We run the prediction, excluding the preprocessing; only the model's forward pass is timed.
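
For reference, the tensor_img fed to the predictor below can be built roughly as follows for a given input dimension (a minimal sketch; the mean/scale values are the commonly used MobileNet-SSD ones and the file name is a placeholder, neither taken from the original project):

import cv2
import numpy as np

def preprocess(image_path, input_size):
    # Resize to input_size x input_size and normalize to roughly [-1, 1].
    img = cv2.imread(image_path).astype('float32')     # HWC, BGR
    img = cv2.resize(img, (input_size, input_size))
    img = (img - 127.5) * 0.007843                     # assumed MobileNet-SSD mean/scale
    img = img.transpose((2, 0, 1))                     # HWC -> CHW
    return img[np.newaxis, :]                          # add batch dim -> NCHW, batch size 1

tensor_img = preprocess('test.jpg', 300)               # e.g. a 300*300 input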

For the AIStudio platform, we time the following code:

label = exe.run(inference_program, feed={feed_target_names[0]: tensor_img}, fetch_list=fetch_targets)

For Paddle-mobile on EdgeBoard, we time the following code:

// Build the input tensor (batch size 1) from the preprocessed image buffer.
PaddleTensor tensor;
tensor.shape = std::vector<int>({1, input_channel, input_width, input_height});
tensor.data = PaddleBuf(data, sizeof(data));
tensor.dtype = PaddleDType::FLOAT32;
std::vector<PaddleTensor> paddle_tensor_feeds(1, tensor);

// Output tensor; its shape and buffer are filled in by the predictor.
PaddleTensor tensor_out;
tensor_out.shape = std::vector<int>({});
tensor_out.data = PaddleBuf();
tensor_out.dtype = PaddleDType::FLOAT32;
std::vector<PaddleTensor> outputs(1, tensor_out);

// Only this forward call is timed.
predictor->Run(paddle_tensor_feeds, &outputs);


The following picture shows the prediction results. Due to time constraints, the model was not trained very thoroughly; we only compare the running speed of the model.

The following table compares the model's prediction speed at different input dimensions. From it, EdgeBoard's speed is roughly on the same order of magnitude as the V100 GPU, and far ahead of the CPU.

As mentioned in a previous article, we had originally wanted to go on and port U-Net, a large-scale segmentation network from a while back, to see how large a model it could run, but EdgeBoard does not seem to support segmentation yet, so we switched to an object detection network instead. With the MobileNet-SSD model, EdgeBoard can run input dimensions up to 700*700 while staying above 16 fps (not counting the input image preprocessing), which is essentially real-time.
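
The frame-rate figure follows directly from the measured per-image forward time; a minimal sketch of the arithmetic (the 60 ms latency is only an illustrative value, not a measured number):

# fps is the reciprocal of the per-image forward time (preprocessing excluded).
forward_time_ms = 60.0                  # illustrative per-image latency at 700*700
fps = 1000.0 / forward_time_ms
print('%.1f fps' % fps)                 # ~16.7 fps, i.e. above 16 fps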

As I mentioned earlier, the network connection with my two laptops was not very stable and the two sides often could not ping each other. After some experiments, I found the cause: when the board's NIC talks to a NIC that does not support gigabit, the two fail to negotiate correctly and the board still uses gigabit mode. Forcing the link to 100 Mbps with the following command restores a normal connection:

ethtool -s eth0 speed 100 duplex full 



Author: Litchll
