AoE: How to Manage Models Well

Author: Ding Chao

Foreword

More and more businesses use AI-related technologies. Most AI models are deployed in the cloud because servers compute faster and are easier to manage. As the performance of terminal devices improves, running AI models on the device becomes more valuable, since it better meets business requirements for real-time response and data privacy. DiDi Chuxing's bank card recognition feature is also planned for on-device deployment, but this raises a number of problems:

  1. Model upgrades are difficult. On-device models generally ship with the application, and users choose whether to update the application, so model versions become fragmented.
  2. Hardware adaptation is a problem. Because vendors customize their devices deeply, different terminal devices have compatibility issues.
  3. Different models run on different inference frameworks, which is not friendly enough to client engineers.

To solve these problems, DiDi's terminal intelligence team launched AoE as a solution. From the start of its design it was meant to provide infrastructure features such as multi-model management, upgradable models, multi-framework support, and model encryption.

How AoE manages models

Focusing on the problems described above, we did three main things:

  1. Cover as many device models as possible when testing and validating a model
  2. Load models through a configured runtime environment
  3. Upgrade models through dynamic updates

Each of the three is described below.

Runtime environment configuration

The AoE SDK summarizes the work of an inference framework as five processes: initialization, pre-processing, inference execution, post-processing, and resource release. For the runtime environments that AoE integrates, these basic inference operations are abstracted. Through a dependency inversion design, the business layer depends only on AoE's upper-level abstraction and does not need to care how a specific inference framework is integrated. The biggest advantage of this design is that developers can add a new inference framework at any time without modifying the existing framework implementations, so business development and AoE SDK development are fully decoupled.
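
The five processes and the dependency inversion described above can be sketched roughly as follows. This is a minimal illustration in Python, not the AoE SDK's actual API: the names `Interpreter`, `EchoInterpreter`, and `run_once` are hypothetical, and the stand-in implementation does no real inference.

```python
from abc import ABC, abstractmethod


class Interpreter(ABC):
    """Hypothetical abstraction over one inference framework (not the AoE API)."""

    @abstractmethod
    def init(self, model_path: str) -> bool: ...

    @abstractmethod
    def preprocess(self, raw): ...

    @abstractmethod
    def infer(self, tensor): ...

    @abstractmethod
    def postprocess(self, output): ...

    @abstractmethod
    def release(self) -> None: ...


class EchoInterpreter(Interpreter):
    """Stand-in for a framework-specific implementation; it runs no real model."""

    def __init__(self):
        self.calls = []  # records the order in which the five steps run

    def init(self, model_path):
        self.calls.append("init")
        return True

    def preprocess(self, raw):
        self.calls.append("preprocess")
        return [float(x) for x in raw]

    def infer(self, tensor):
        self.calls.append("infer")
        return [x * 2 for x in tensor]

    def postprocess(self, output):
        self.calls.append("postprocess")
        return max(output)

    def release(self):
        self.calls.append("release")


def run_once(interpreter: Interpreter, model_path: str, raw):
    """Business-layer code: depends only on the abstraction, never on a framework."""
    if not interpreter.init(model_path):
        raise RuntimeError("interpreter failed to initialize")
    try:
        tensor = interpreter.preprocess(raw)
        return interpreter.postprocess(interpreter.infer(tensor))
    finally:
        interpreter.release()  # free resources even if inference raises
```

Supporting another framework in this sketch means adding one more `Interpreter` subclass; `run_once` stays unchanged.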

Users only need a simple JSON description file to configure the runtime environment, which keeps the integration process simple and efficient.

A simple configuration looks like this:

{
      "version": "1.0.0",           // version number
      "tag": "tag_mnist",           // distinguishes the business scenario
      "runtime": "tensorflow",      // runtime type
      "source": "installed",            // installation source
      "modelDir": "mnist",              // directory containing the model
      "modelName": "mnist_cnn_keras",   // model file name
      "updateURL": "https://www.didiglobal.com"   // upgrade configuration URL
}
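
To illustrate how such a description file might be consumed, here is a minimal Python sketch. It is an assumption for illustration, not AoE's loader: `ModelConfig`, `parse_model_config`, and the choice of which fields are required are hypothetical. The comments are dropped from the example text because standard JSON does not allow them.

```python
import json
from dataclasses import dataclass, fields

# Assumption: these fields are treated as mandatory in this sketch.
REQUIRED_FIELDS = ("version", "tag", "runtime", "source", "modelDir", "modelName")


@dataclass
class ModelConfig:
    version: str
    tag: str
    runtime: str
    source: str
    modelDir: str
    modelName: str
    updateURL: str = ""  # assumed optional here


def parse_model_config(text: str) -> ModelConfig:
    """Parse a JSON description and fail early if a required field is missing."""
    data = json.loads(text)
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Ignore unknown keys so that newer config files still load.
    known = {f.name: data[f.name] for f in fields(ModelConfig) if f.name in data}
    return ModelConfig(**known)


EXAMPLE = """
{
  "version": "1.0.0",
  "tag": "tag_mnist",
  "runtime": "tensorflow",
  "source": "installed",
  "modelDir": "mnist",
  "modelName": "mnist_cnn_keras",
  "updateURL": "https://www.didiglobal.com"
}
"""

config = parse_model_config(EXAMPLE)
```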

Device coverage testing for models

To address hardware differences, we try to test across as many device models as possible during model validation. How the model performs on each device is recorded and fed back to the model production team to help them keep upgrading and fixing the model.

Part of the comparison data collected during testing is as follows:

The device models differ and the instructions they use may differ, but the figures still give a rough picture of each device's performance; the values are for reference only. Along the way we built up a benchmark tool to help verify coverage tests across device models. In the future this tool will be part of the open source release, to help you verify that a model is usable and to make meaningful comparisons between models.
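
As a rough idea of what such a benchmark involves, the following Python sketch times an arbitrary inference callable and ranks devices by median latency. It is not the AoE benchmark tool: `benchmark`, `compare`, and the result field names are hypothetical, and the workload is left to the caller.

```python
import statistics
import time


def benchmark(run_inference, warmup: int = 2, rounds: int = 10) -> dict:
    """Time a zero-argument callable and summarize latency in milliseconds."""
    for _ in range(warmup):
        run_inference()  # warm-up runs are not measured
    samples_ms = []
    for _ in range(rounds):
        start = time.perf_counter()
        run_inference()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "rounds": rounds,
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }


def compare(results_by_device: dict) -> list:
    """Return device names sorted from fastest to slowest median latency."""
    return sorted(results_by_device, key=lambda d: results_by_device[d]["median_ms"])
```

Running `benchmark` on each device with the same model and input, then passing the collected results to `compare`, gives the kind of per-device comparison described above.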

Dynamic Update

AoE's model management module divides models into two types according to how they are distributed:

  1. Local models: models bundled with the application
  2. Remote models: models that are matched through policy configuration and downloaded from the server to the device

The biggest difference between the two is that a local model cannot be changed and can only be updated along with the application, whereas a remote model is updated by comparing its version with the local one and using whichever is newer. Local and remote models can coexist, or either can exist alone. In the latest version of DiDi Chuxing, to reduce package size, there is not even a local model; all models are downloaded remotely.
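
The version comparison between a local and a remote model might look like the following Python sketch. It assumes dotted numeric version strings such as "1.0.0" (as in the configuration example) and a simple "newer wins" rule; `parse_version` and `pick_model` are hypothetical names, not AoE functions.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "1.10.0" into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_model(local_version: Optional[str], remote_version: Optional[str]) -> Optional[str]:
    """Return "local", "remote", or None when no model is available."""
    if local_version is None and remote_version is None:
        return None
    if local_version is None:
        return "remote"  # remote-only deployment, no bundled model
    if remote_version is None:
        return "local"   # nothing downloaded yet
    # Both exist: use the remote model only if it is strictly newer.
    if parse_version(remote_version) > parse_version(local_version):
        return "remote"
    return "local"
```

Comparing parsed tuples rather than raw strings matters: as strings, "1.10.0" would sort before "1.9.9".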

Models are divided into these two types to make sure a model is always available and reliable. Why? A local model usually goes through a long period of testing before it ships with the app as a stable version, so it can serve both as the latest version at release time and as a stable fallback later. Even if a remote model downloaded afterwards turns out to be an unsatisfactory upgrade, the remote model can be stopped through gray (staged rollout) testing, which keeps the model highly available.

Remote models give the business the ability to update models dynamically, which makes product iteration easier because it no longer depends on the client release cycle. With the help of dynamic switches, it is even possible to load one precisely specified model version.
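
One way to picture the dynamic switch and the version pinning is the Python sketch below. It is an assumption for illustration only: `LoadPolicy`, its fields, and `choose` are hypothetical and do not describe AoE's actual policy format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LoadPolicy:
    remote_enabled: bool = True           # hypothetical switch delivered by the server
    pinned_version: Optional[str] = None  # hypothetical: load exactly this version if available


def _key(version: str) -> Tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


def choose(policy: LoadPolicy, local_version: Optional[str],
           remote_versions: List[str]) -> Optional[Tuple[str, str]]:
    """Return (source, version) for the model to load, or None if nothing is usable."""
    # Local comes first so that it wins a tie against a remote model of the same version.
    candidates = [("local", local_version)] if local_version else []
    if policy.remote_enabled:
        candidates += [("remote", v) for v in remote_versions]
    if policy.pinned_version is not None:
        for source, version in candidates:
            if version == policy.pinned_version:
                return (source, version)
        # The pinned version is unavailable: fall back to the newest candidate.
    if not candidates:
        return None
    return max(candidates, key=lambda item: _key(item[1]))
```

Turning `remote_enabled` off in this sketch makes every client fall back to its bundled model, which mirrors the stable fallback role of the local model.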

The overall structure of model management is shown below:

How is the model loader used?

The model manager is a basic component of AoE. Taking iOS as an example, the component is implemented under the Loader directory. By default, model configuration files are supported in JSON format; the code in the configuration section above describes the configuration of the mnist demo.

The name and format of the model configuration file, as well as the storage address of remote versions, can all be changed by subclassing AoEModelConfig. For specific usage, see the squeezenet example.

The open source version of AoE also supports a single feature that uses multiple models. Take bank card recognition as an example. The whole process has two steps: the first finds the card region and the digit region on the card, and the second recognizes the card number from the image of the digit region, so the process requires two models. In the open source project's model configuration, the tag field is mainly used to identify the feature a model belongs to; combined with the dir field, it locates a specific model.
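
The idea of locating a model by tag plus dir, and chaining two models for one feature, can be sketched in Python as follows. All names here are hypothetical (`ModelRegistry`, `recognize_card`, and the tag and dir values "bankcard", "detect", "recognize"), and the "models" are plain callables standing in for real ones.

```python
from typing import Callable, Dict, Tuple

ModelKey = Tuple[str, str]  # (tag, dir): the feature, then the specific model


class ModelRegistry:
    """Hypothetical lookup table keyed by (tag, dir); not the AoE model manager."""

    def __init__(self):
        self._models: Dict[ModelKey, Callable] = {}

    def register(self, tag: str, model_dir: str, model: Callable) -> None:
        self._models[(tag, model_dir)] = model

    def get(self, tag: str, model_dir: str) -> Callable:
        try:
            return self._models[(tag, model_dir)]
        except KeyError:
            raise LookupError(f"no model for tag={tag!r}, dir={model_dir!r}") from None


def recognize_card(registry: ModelRegistry, image):
    """Two-step pipeline: locate the digit region, then read the digits from it."""
    detect = registry.get("bankcard", "detect")          # step 1 model
    read_digits = registry.get("bankcard", "recognize")  # step 2 model
    return read_digits(detect(image))
```

A usage example with stand-in callables in place of real models:

```python
registry = ModelRegistry()
registry.register("bankcard", "detect", lambda img: img["digit_region"])
registry.register("bankcard", "recognize", lambda region: "".join(str(d) for d in region))
card_number = recognize_card(registry, {"digit_region": [6, 2, 2, 2]})
```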

Closing remarks

Remote loading and multi-dimensional gray testing configuration exist to keep models running stably and safely. Remote model loading is not yet available in the open source version, but it is on the schedule and is expected to go live in late September. If you have ideas about on-device AI runtime environments, have questions, or are interested in using this project, we invite you to join us.

Github Address:

Stars are welcome ~

QQ chat group (QQ group number: 815 254 379):

You are welcome to join the group chat ~


Origin www.cnblogs.com/puhuichanpin/p/11491982.html