As a world-leading smart terminal company, OPPO has long been committed to delivering the best possible experience to its users. To this end, we constantly look for better ways to apply the latest technologies, including cloud computing and artificial intelligence. A typical example is OPPO's Andes Brain strategy, which aims to make terminal devices more intelligent.
Artificial intelligence helps unleash the potential of mobile devices. On the one hand, running AI models on the device keeps user data on the mobile hardware instead of sending it to the cloud, which better protects user privacy. On the other hand, the computing power of mobile chips is improving rapidly and can support increasingly complex AI models. By combining cloud platforms with mobile chips for AI model training, we can use cloud computing resources to develop high-performance machine learning models that adapt to different mobile hardware.
In 2022, we began implementing our AI engineering strategy through StarFire, a machine learning platform developed in-house. The platform brings together cloud services, computing power, and terminal devices, and is one of the six core capabilities of Andes Smart Cloud. Algorithm engineers can use the advanced cloud technologies provided by StarFire to develop and verify device-cloud AI models.
Developing the device-side AI model is a key step that the engineering pipeline must address. The StarFire device-cloud integrated workbench (hereinafter referred to as AI Workbench) is the main vehicle through which OPPO algorithm engineers develop and verify device-side models.
During device-side model development, because of the particular constraints of on-device scenarios, algorithm engineers must not only ensure model quality and keep the model fast, stable, and economical, but also solve many engineering problems, especially device-cloud development collaboration. Our investigation found that this engineering work consumes a large share of algorithm engineers' time. Without a complete toolchain, each AI development team has to build its own tools, deploy independently, and compete for resources, which introduces many manual, non-standard operations. The result is poor security, low reusability, and inefficient communication and collaboration, which greatly hinders algorithm development and testing. In summary, the main pain points are as follows:
- Device-side models generally face stringent requirements to increase inference speed while reducing latency and power consumption, and therefore need a rich set of lightweighting methods.
- The quantization and compilation process is cumbersome, and in-depth tuning through methods such as USI Search is not possible.
- Adapting and optimizing models for different inference engines and chip platforms is repeated frequently, with a high manual cost.
- Device and cloud resource utilization during iterative model development and deployment is low, which limits iteration and deployment efficiency.
In response to these pain points and challenges, we built StarFire AI Workbench to carry the device-cloud collaborative development pipeline for device-side models, covering frequently used pipeline functions such as model compression, conversion and compilation, power consumption testing, performance testing, and x86 cloud-side simulation.
Architecture Diagram of StarFire AI Workbench
Relying on Andes Smart Cloud, StarFire has built a fairly complete pipeline for model development and deployment on the cloud. For device-side scenarios, StarFire AI Workbench deeply integrates the existing cloud workflow with end devices by connecting the cloud side, real test devices, and power-measurement machines. In the Workbench, engineers can perform one-click quantization and compilation, device-side matching, model distribution, and batch verification and testing, then optimize and move on to the next round of verification. The Workbench eliminates a large number of repetitive operations and tedious steps such as environment and device management, and device-side equipment can be shared effectively through the platform. The following sections introduce some important features of AI Workbench in relation to the business pain points above.
During device-side model development, the strict size and power constraints of terminal devices mean that mobile models must have a small footprint, low computational complexity, and low battery consumption, while remaining flexible to update and deploy. How to compress the original model without significantly reducing accuracy has long been a research focus in both academia and industry.
By integrating open-source and in-house techniques, including common methods such as model quantization, pruning, and distillation, StarFire AI Workbench supports model compression for a variety of mainstream deep learning frameworks, with hardware-specific optimizations that suit many business scenarios. From the perspective of convenient use by algorithm engineers, we also built an automated compression workflow that forms a one-stop pipeline on the platform, greatly improving the efficiency of model compression work and reducing the inference latency of deployed device-side models.
Model compression techniques in StarFire
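Model compression itself relies largely on open-source building blocks. As a rough illustration of the kind of step the automated workflow wraps, the sketch below applies post-training dynamic quantization with standard PyTorch APIs to a placeholder network; it is not StarFire code, just a minimal example of one compression technique.

```python
import os

import torch
import torch.nn as nn

# Placeholder two-layer network standing in for a real device-side model.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(128, 256)
        self.fc2 = nn.Linear(256, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyNet().eval()

# Post-training dynamic quantization: Linear weights are stored as int8,
# which shrinks the model and can speed up inference on supported backends.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    """Serialize the state dict to measure the on-disk footprint."""
    torch.save(m.state_dict(), "tmp_model.pt")
    size = os.path.getsize("tmp_model.pt") / 1e6
    os.remove("tmp_model.pt")
    return size

print(f"fp32: {size_mb(model):.2f} MB -> int8: {size_mb(quantized):.2f} MB")
```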
Once the model has been compressed, it needs to be quantized and compiled for the target Qualcomm or MTK chip platform. Algorithm engineers have to learn the quantization and compilation workflow of both platforms and master numerous parameters and configuration files; the standalone quantization and compilation tools are limited in functionality (for example, in quantization noise optimization and accuracy preservation); and they still have to set up quantization and compilation environments for different platform versions. The cost of learning and practice is therefore high.
The model conversion feature of StarFire AI Workbench minimizes the cost of using the quantization and compilation tools of different platforms through efficient, well-designed service encapsulation and a simple, clear interface.
- Ease of use: the configuration environments of all tool versions are unified. Algorithm engineers do not need to worry about SDK versions or environments; they simply select and configure parameters on the page to complete the quantization-compilation conversion.
- Comprehensiveness: multiple quantization noise analysis and optimization functions improve the accuracy of quantized models.
- Flexibility: required parameters can be configured by point-and-click, and optional extended parameters are also available.
The AI Workbench interface
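To make the "point-and-click" idea concrete, the sketch below shows what a single conversion request to such a service might look like if driven from Python; every field and function name is a hypothetical placeholder, not the actual Workbench or vendor toolchain API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical job description: the field names are illustrative only and do
# not correspond to actual StarFire, Qualcomm, or MTK toolchain options.
@dataclass
class ConversionJob:
    model_path: str                          # exported model from the compression step
    target_platform: str                     # e.g. "qualcomm" or "mtk"
    sdk_version: str = "latest"              # resolved by the service, not the engineer
    quantize: bool = True
    calibration_data: Optional[str] = None   # representative inputs for quantization
    extra_args: Dict[str, object] = field(default_factory=dict)  # optional parameters

def submit_conversion(job: ConversionJob) -> str:
    """Stand-in for the conversion service: pick the right SDK container,
    run quantization and compilation, and return a task id."""
    print(f"converting {job.model_path} for {job.target_platform} "
          f"(sdk={job.sdk_version}, quantize={job.quantize})")
    return "convert-task-0001"

task_id = submit_conversion(ConversionJob(
    model_path="models/detector.onnx",
    target_platform="qualcomm",
    calibration_data="datasets/calib_200.npz",
    extra_args={"per_channel": True},
))
```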
Power consumption test architecture
- The algorithm engineer submits a task through AI Workbench;
- The relevant inference engine environment and configuration information is obtained;
- The quantization and compilation task is scheduled;
- The resulting model is stored in CubeFS, our self-developed storage;
- Based on the test task's requirements and device availability, the relevant configuration is retrieved and the task is scheduled to the appropriate power-measurement machine;
- The configuration files needed for the power test are pushed to the device, and the resulting metrics are written back to object storage and the database.
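Read end to end, the flow amounts to a submit-schedule-measure-store loop. The following is a minimal runnable sketch of that loop under assumed names; the scheduler, device agent, chip identifiers, and measurements are all placeholders rather than StarFire internals.

```python
import json
import random
from typing import Dict, List

# Every function below is a local stand-in for a StarFire component
# (scheduler, device agent, storage); none of the names are real APIs,
# and the measurements are randomly generated placeholders.

def schedule_power_machine(chips: List[str]) -> str:
    """Pick a power-measurement machine whose attached phone matches the chip."""
    pool = {"sm8550": "power-rig-03", "mt6989": "power-rig-07"}
    return next(pool[c] for c in chips if c in pool)

def run_on_device(machine: str, model_artifact: str) -> Dict[str, float]:
    """Stand-in for pushing the config and model to the device and measuring."""
    return {"avg_current_mA": round(random.uniform(80, 120), 1),
            "energy_per_inference_mJ": round(random.uniform(3, 6), 2)}

def store_results(task_id: str, metrics: Dict[str, float]) -> None:
    """Stand-in for writing metrics back to object storage / the database."""
    print(f"[{task_id}] {json.dumps(metrics)}")

def run_power_test(model_artifact: str, chips: List[str]) -> None:
    task_id = "power-task-001"               # issued when the engineer submits
    machine = schedule_power_machine(chips)  # scheduling step
    metrics = run_on_device(machine, model_artifact)
    store_results(task_id, metrics)

run_power_test("models/detector_quantized.bin", ["sm8550"])
```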
Given their application scenarios, device-side models demand extreme performance. The StarFire platform has built its own device-cloud integrated model development and testing pipeline that lets local real devices connect to the platform quickly. The platform also includes fully decoupled inference engine libraries, script libraries, model libraries, and runtime environment images, which algorithm engineers can freely combine to convert, compile, optimize, and quantize models from the model library or local storage, run performance tests for on-device inference latency and memory usage, and compare device and cloud performance.
The overall performance test architecture is shown in the figure below:
- Supports connecting industrial PCs and local real devices to the platform, so customized device-cloud collaborative development and testing pipelines can be built quickly;
- Supports device-cloud performance testing of multiple models on a single chip environment plus inference engine;
- Supports model conversion and compilation, analysis of on-device inference results, and device-cloud performance comparison;
- Maintains the model libraries and engine SDKs commonly used by OPPO's device-side model development teams;
- Supports registering the Qualcomm/MTK phone chip types connected to each industrial PC;
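As a rough idea of what a device-cloud performance comparison boils down to once the devices and engines are wired up, here is a minimal sketch of a latency measurement loop; the inference calls are simulated with sleeps and do not represent the platform's actual test harness.

```python
import statistics
import time
from typing import Callable, List

# Illustrative device-vs-cloud latency comparison. The two runner callables
# are placeholders for real inference on a connected phone (through the test
# rig) and on a cloud instance; only the measurement pattern is the point.

def measure_latency_ms(run_once: Callable[[], None], warmup: int = 5,
                       repeats: int = 50) -> List[float]:
    for _ in range(warmup):          # warm-up runs are excluded from statistics
        run_once()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples

def fake_device_inference() -> None:   # placeholder for on-device inference
    time.sleep(0.004)

def fake_cloud_inference() -> None:    # placeholder for cloud-side inference
    time.sleep(0.002)

device = measure_latency_ms(fake_device_inference)
cloud = measure_latency_ms(fake_cloud_inference)
print(f"device p50 = {statistics.median(device):.2f} ms, "
      f"cloud p50 = {statistics.median(cloud):.2f} ms")
```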
- Multiple access methods: both UI and API access are supported, for visual, quick execution of single simulation tasks and for free invocation within toolchain pipelines respectively.
- Multi-task concurrency: the high scalability and multi-threaded services of the cloud-side computing cluster are fully utilized to support concurrent tasks through the API, and a Python SDK is provided for easy pipeline integration (illustrated in the sketch after this list).
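A pipeline built on the Python SDK would typically submit tasks concurrently. The sketch below shows only that concurrency pattern; submit_task is a hypothetical stand-in, not the actual SDK call.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical client function standing in for the Workbench Python SDK; in
# practice it would issue an API request to the simulation/testing service.
def submit_task(model_path: str, device: str) -> dict:
    return {"model": model_path, "device": device, "status": "queued"}

jobs = [("models/a.onnx", "sm8550"),
        ("models/b.onnx", "mt6989"),
        ("models/c.onnx", "sm8550")]

# Fan submissions out concurrently, mirroring the multi-task API usage
# described above; the cloud side scales workers out behind the API.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(submit_task, m, d) for m, d in jobs]
    for fut in as_completed(futures):
        print(fut.result())
```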
- The Workbench uploads the simulation input to file storage;
- A resident service is built on OPPO virtual machines to interact with the X86 imaging-debugging simulation program running on them;
- The resident service invokes the X86 simulation program to run the imaging simulation and sends the results back;
- The Workbench downloads the results to the mounted CubeFS;
- Simulation records are stored in RDS, recording each simulation task's ID and status for the resident service to query and use.
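Functionally, the resident service is a small polling loop over the task records. Here is a minimal sketch of one iteration, using in-memory dictionaries as stand-ins for the file storage, the X86 simulator, CubeFS, and the RDS task table; it illustrates the flow only, not the actual service.

```python
from typing import Dict

# In-memory stand-ins for the components in the flow above: a task table
# (playing the role of RDS), an input file store, and an output store
# (playing the role of CubeFS). None of this is the real implementation.
task_table: Dict[str, str] = {"sim-001": "pending"}        # task id -> status
input_store: Dict[str, str] = {"sim-001": "raw_frames.bin"}
output_store: Dict[str, str] = {}

def run_x86_simulator(input_blob: str) -> str:
    """Placeholder for invoking the X86 imaging-simulation program."""
    return f"simulated({input_blob})"

def resident_service_tick() -> None:
    """One polling iteration of the resident service running on the VM."""
    for task_id, status in task_table.items():
        if status != "pending":
            continue
        task_table[task_id] = "running"
        result = run_x86_simulator(input_store[task_id])  # call the simulator
        output_store[task_id] = result                    # write back to "CubeFS"
        task_table[task_id] = "done"                      # update the "RDS" record

resident_service_tick()
print(task_table, output_store)
```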
Compared with an offline server setup, cloud-based simulation can make full use of the scalable pool of compute nodes on the cloud and provide a more efficient camera-tuning simulation service:
- Higher simulation efficiency: virtual machines can be quickly scheduled through Andes Smart Cloud to supplement simulation compute power and speed up tasks;
- Lower simulation cost: resources are released during off-peak periods, keeping only a minimal resource pool and using capacity on demand;
- Underlying operations and technical support: support at the node, network, system, and application levels keeps simulation tasks running efficiently and smoothly.
As the main vehicle through which Andes Smart Cloud carries out OPPO's AI engineering strategy, StarFire will continue to be refined and extended for device-cloud collaborative AI development, including a federated learning framework, on-device smart plugins, and model management and monitoring. We will also continue to share more of StarFire's practices in AI engineering, such as compute resource utilization optimization, inference capability building, and data-related infrastructure.
OPPO Andes Smart Cloud (AndesBrain) is a pan-device intelligent cloud serving individuals, families, and developers, dedicated to "making devices more intelligent". As one of OPPO's three core technologies, Andes Smart Cloud provides device-cloud collaborative data storage and intelligent computing services and serves as the "digital-intelligence brain" behind the convergence of all things.
This article is shared from the WeChat public account Andes Smart Cloud (OPPO_tech).