KubeEdge Sedna v0.6 & Ianvs v0.2 Major Release: A Comprehensive Upgrade of Edge-Cloud Collaborative Lifelong Learning

This article is shared from the Huawei Cloud Community post "KubeEdge Sedna v0.6 & Ianvs v0.2 Big Release: Comprehensive Upgrade of Edge-Cloud Collaborative Lifelong Learning", author: Cloud container big future.

This article discusses the challenges that open world problems pose for edge intelligence applications and the corresponding solutions, focusing on KubeEdge Sedna v0.6 and Ianvs v0.2, released by KubeEdge SIG AI. These two open source projects comprehensively improve the functionality and performance of edge-cloud collaborative lifelong learning. Edge-cloud collaborative lifelong learning imitates the way humans learn: it combines a cloud knowledge base with edge data to achieve multi-task transfer learning, identify and process unknown tasks, and prevent catastrophic forgetting.

This article highlights the main content of this feature upgrade:
  1. Support for open world edge-cloud collaborative lifelong learning in unstructured data scenarios
  2. A complete test suite of open source datasets, baseline algorithms, and evaluation metrics
  3. New capabilities for recognizing and processing unknown tasks in scenarios such as robot inspection and autonomous driving, including new sample recognition, training data generation, and multi-model joint inference

1. Background

Over the past decade, machine learning has weathered the ups and downs of the capital market while keeping up its momentum of technological innovation. For example, AlphaGo has defeated world Go champions many times, AlphaFold has predicted the structures of 98.5% of human proteins, and the chatbot ChatGPT shows remarkable conversational ability and has even been applied to professional consulting and research and development. It is undeniable that machine learning has demonstrated a level of intelligence beyond that of humans in closed environments with clear rules, such as games, where the risk of errors is relatively low. This decade-long technological feast has had its intermissions, but it is still going on and is no passing fad.

However, behind this dazzling feast lie many challenges. As machine learning applications are gradually deployed to the edge, the limitations of machine learning become increasingly apparent in edge intelligence scenarios, which are closer to users and face more open environments. Related public reports have appeared frequently over the past five years, as shown in Figure 1.

  • In 2017, Atlas, a benchmark among biped robots, fell off the stage during a demonstration at an international conference
  • In 2018, the autonomous driving systems of leading companies such as U Company, T Company and G Company caused multiple casualties
  • In 2020, a shopping guide robot at Fuzhou Zhongfang Wanbao City fell down an escalator several meters high and knocked over two customers in front of it
  • In 2021, the biped robot Walker X accidentally fell during a demonstration at the World Artificial Intelligence Conference
  • In 2022, the wheeled robot Xiao Man Donkey drove onto a wet concrete surface on the Henan University campus, got stuck, and could not move forward
  • In 2022, a video of the quadruped robot Go 1 repeatedly falling over while delivering drinks attracted wide attention on the Internet

Figure 1 Failure cases of edge intelligent devices in China and abroad over the past five years

These cases show that edge intelligence in the open world faces many long-tail applications and corner cases: users find that a deployed machine learning model encounters inputs that differ from or do not match its training data, and the model then performs poorly or makes errors. For example, an object recognition model may fail to recognize obstacles in complex road conditions, a speech recognition model may fail to handle noisy speech, and an image generation model may fail to generate reasonable images from complex text.

A model deployed in a changeable edge environment is usually prone to the open world problem. Users can quantitatively judge whether an edge model faces open world problems with the following benchmarking methods:

 
  • Test set evaluation: Use a test set that differs from the training data, especially one that contains anomalies or rare cases, to evaluate the accuracy, robustness, and generalization of the model. For example, users can test an image classification model on an open world image classification dataset.
  • Online feedback: After deploying the model, collect user feedback and comments, together with the model's error rate and failure rate, to monitor the model's real-world effectiveness and user satisfaction. For example, users can apply online evaluation to update the model's performance indicators in real time.
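The two benchmarking methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`accuracy`, `open_world_gap`, `rolling_error_rate`) and the toy threshold model are hypothetical and are not part of Sedna or Ianvs. A large accuracy gap between the in-distribution test set and the open world test set, or a rising rolling error rate after deployment, suggests that the edge model is facing an open world problem.

```python
from collections import deque
from typing import Callable, Iterable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (features, label)


def accuracy(predict: Callable[[Sequence[float]], int],
             samples: Sequence[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        raise ValueError("samples must not be empty")
    correct = sum(1 for features, label in samples if predict(features) == label)
    return correct / len(samples)


def open_world_gap(predict: Callable[[Sequence[float]], int],
                   in_distribution: Sequence[Sample],
                   open_world: Sequence[Sample]) -> float:
    """Accuracy drop from the in-distribution test set to the open world test set."""
    return accuracy(predict, in_distribution) - accuracy(predict, open_world)


def rolling_error_rate(errors: Iterable[bool], window: int) -> List[float]:
    """Error rate over the last `window` online feedback events, after each event."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # old events drop out automatically
    rates = []
    for is_error in errors:
        recent.append(bool(is_error))
        rates.append(sum(recent) / len(recent))
    return rates


if __name__ == "__main__":
    # Hypothetical toy model: predicts class 1 when the first feature exceeds 0.5.
    def toy_model(features: Sequence[float]) -> int:
        return int(features[0] > 0.5)

    in_dist = [([0.9], 1), ([0.1], 0)]
    open_w = [([0.9], 1), ([0.6], 0), ([0.4], 1), ([0.1], 0)]
    print("accuracy gap:", open_world_gap(toy_model, in_dist, open_w))
    print("rolling error rate:", rolling_error_rate([False, True, True, False], window=2))
```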

The cases in Figure 1 also show that the open world is a general problem for edge intelligence applications, not just a design flaw in one specific product or service. The open world problem will remain persistent and widespread in the field of edge intelligence for two main reasons:

  • Edge intelligence faces a forest of fragmented demands: edge requirements are complex and diverse because applications, hardware, and environments differ.
  • The field of edge intelligence is still exploring solutions, and the collaborative ecosystem of business models and platforms is still growing.

Long-tail applications and corner cases in the open world have two characteristics: diverse forms and scarce data. These affect both the system performance and the algorithm performance of edge intelligence:

  • System performance: data is generated at the edge, while computing power is concentrated in the cloud. If machine learning services are processed entirely in the cloud, it is difficult to guarantee data security, offline service autonomy, and real-time performance. If they are processed entirely at the edge, they cannot benefit from the cloud's computing power, development environment, and product ecosystem, which reduces the quality of intelligent services and increases the cost of research and development, maintenance, and sales.
  • Algorithm performance: the data heterogeneity and small samples brought by the open world undermine the stable operation of edge intelligence services and further cause catastrophic forgetting.

Therefore, the open world problem is a common and persistent challenge faced by edge intelligence applications, which needs to be solved from both system and algorithm levels.

2. Comprehensive Version Upgrade

▍1. The community's previous solution

To solve the edge intelligence problem in the open world, we can borrow from the human learning process. Humans can work and live normally in the open world because everyone continuously accumulates knowledge, both their own past knowledge and that of others, and uses it to learn more [3]. Based on this learning mechanism, KubeEdge SIG AI has published a formal definition of edge-cloud collaborative lifelong learning at an international academic conference [4,5]: given N historical training tasks in the cloud-side knowledge base, perform inference for the M current and future edge tasks while continuously updating the cloud-side knowledge base. M may grow without bound, and the M edge inference tasks may differ from the N historical training tasks in the cloud-side knowledge base. Specifically, edge-cloud collaborative lifelong learning uses the following technologies to address open world problems at both the system and algorithm levels.

  • System: the edge-cloud collaborative architecture ensures edge data security and compliance as well as offline autonomy of edge AI services, while still making use of cloud resources.
  • Algorithm:
    • Multi-task transfer learning builds different tasks for different data distributions to deliver accurate, personalized predictions ("a thousand faces for a thousand people") and cope with data heterogeneity.
    • Incremental processing of unknown tasks starts from small samples and keeps learning and improving through data collection, data generation, and other means, gradually making AI engineering automated and addressing sample scarcity.
    • The cloud-side knowledge base stores historical knowledge and is updated with new knowledge from the scenario, avoiding catastrophic forgetting.
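To make the definition above more concrete, here is a minimal sketch of a knowledge base that holds N known tasks, routes each edge sample to the nearest known task, flags distant samples as unknown, and grows a new task once enough unknown samples have been collected. Everything in it is an assumption made for illustration: the `KnowledgeBase` class, the centroid-distance rule, and the thresholds are hypothetical and far simpler than the task definition, unknown task recognition, and knowledge update modules in Sedna. Note that existing tasks are never overwritten when a new task is added, which is the intuition behind avoiding catastrophic forgetting.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Task:
    name: str
    centroid: List[float]
    samples: List[List[float]] = field(default_factory=list)


class KnowledgeBase:
    """Toy cloud-side knowledge base (hypothetical, for illustration only)."""

    def __init__(self, unknown_distance: float, min_samples: int) -> None:
        self.tasks: List[Task] = []
        self.unknown_distance = unknown_distance  # beyond this, a sample is "unknown"
        self.min_samples = min_samples            # samples needed to create a new task
        self._pending: List[List[float]] = []     # collected unknown samples

    def add_task(self, name: str, samples: Sequence[Sequence[float]]) -> Task:
        if not samples:
            raise ValueError("a task needs at least one sample")
        centroid = [sum(column) / len(samples) for column in zip(*samples)]
        task = Task(name, centroid, [list(s) for s in samples])
        self.tasks.append(task)  # historical tasks are kept, never replaced
        return task

    def assign(self, x: Sequence[float]) -> Optional[Task]:
        """Return the nearest known task, or None if the sample looks unknown."""
        if not self.tasks:
            return None
        nearest = min(self.tasks, key=lambda t: math.dist(t.centroid, x))
        if math.dist(nearest.centroid, x) <= self.unknown_distance:
            return nearest
        return None

    def observe(self, x: Sequence[float]) -> str:
        """Route one edge sample; grow the knowledge base when needed."""
        task = self.assign(x)
        if task is not None:
            return task.name
        self._pending.append(list(x))
        if len(self._pending) >= self.min_samples:
            new_task = self.add_task(f"task-{len(self.tasks)}", self._pending)
            self._pending = []
            return new_task.name
        return "unknown"


if __name__ == "__main__":
    kb = KnowledgeBase(unknown_distance=1.0, min_samples=2)
    kb.add_task("indoor", [[0.0, 0.0], [0.2, 0.0]])
    for sample in ([0.1, 0.1], [5.0, 5.0], [5.2, 5.0], [5.1, 5.0]):
        print(sample, "->", kb.observe(sample))
```

This sketch pools all unknown samples into one pending buffer, so it assumes that consecutive unknown samples belong to the same new task; a realistic system would cluster or otherwise separate them first.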

KubeEdge SIG AI keeps open-sourcing its research and development results on edge-cloud collaborative lifelong learning. In 2021, KubeEdge-Sedna v0.3 released the industry's first open source edge-cloud collaborative lifelong learning feature. In 2022, KubeEdge-Ianvs v0.1 released the industry's first distributed collaborative AI benchmark, with support for incremental learning. As shown in Figure 2, the existing lifelong learning implementation in Sedna and Ianvs has the following advantages at the architecture and engineering levels:

  • Modular system
    • The integrated system process is decomposed into multiple modules with clearly defined functions, which makes edge-cloud scheduling easier
    • Module configuration is open: every module can be hot-plugged and skipped
    • Algorithm configuration is open: every module allows algorithms to be switched behind the module interface
  • Pluggable models
    • The Estimator interface is open: any type of model that satisfies the interface can be plugged in and gain lifelong learning capabilities
    • Modules such as unknown task recognition have built-in meta-models and sample transfer, so they can adaptively learn and adapt to model behavior
  • Extensible scenarios
    • The platform core is decoupled from applications, so different applications do not interfere with each other
    • Hyperparameter selection is open and cloud-native: different applications can be customized in a cloud-native way through K8S CRDs
  • Cloud-native edge computing
    • KubeEdge's cloud-native edge computing capabilities facilitate edge-cloud scheduling, migration, and communication of applications
    • The open and flexible interfaces of K8S and CRD make it easy to integrate with the feature-rich cloud-native ecosystem

Figure 2 KubeEdge SIG AI edge-cloud collaborative lifelong learning architecture
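The pluggable-model idea listed above can be illustrated with a structural interface. The sketch below is an assumption for illustration only: the `Estimator` protocol with `train` and `predict` methods, the `MajorityClassEstimator` toy model, and the `run_lifelong_round` helper are hypothetical names and do not reproduce the actual Estimator signatures in Sedna. The point is simply that the platform code depends on the interface, so any model that satisfies it can be swapped in.

```python
from collections import Counter
from typing import Any, List, Protocol, Sequence, runtime_checkable


@runtime_checkable
class Estimator(Protocol):
    """Hypothetical minimal interface a pluggable model has to satisfy."""

    def train(self, samples: Sequence[Any], labels: Sequence[Any]) -> None: ...

    def predict(self, samples: Sequence[Any]) -> List[Any]: ...


class MajorityClassEstimator:
    """Toy model: always predicts the most frequent training label."""

    def __init__(self) -> None:
        self._majority: Any = None

    def train(self, samples: Sequence[Any], labels: Sequence[Any]) -> None:
        self._majority = Counter(labels).most_common(1)[0][0]

    def predict(self, samples: Sequence[Any]) -> List[Any]:
        return [self._majority for _ in samples]


def run_lifelong_round(estimator: Estimator,
                       train_x: Sequence[Any],
                       train_y: Sequence[Any],
                       infer_x: Sequence[Any]) -> List[Any]:
    """Platform-side code: only relies on the interface, not on the model type."""
    if not isinstance(estimator, Estimator):
        raise TypeError("estimator must provide train() and predict()")
    estimator.train(train_x, train_y)
    return estimator.predict(infer_x)


if __name__ == "__main__":
    result = run_lifelong_round(
        MajorityClassEstimator(),
        train_x=[1, 2, 3],
        train_y=["road", "road", "curb"],
        infer_x=[4, 5],
    )
    print(result)
```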

▍2. Comprehensive upgrade of edge-cloud collaborative lifelong learning

This time, KubeEdge SIG AI fully upgrades edge-cloud collaborative lifelong learning to cope with open world application scenarios. The latest Sedna v0.6 and Ianvs v0.2 released by KubeEdge SIG AI provide the following enhancements:

  • Edge-cloud collaborative lifelong learning supports not only structured data scenarios but also unstructured data scenarios such as images and videos
  • Edge-cloud collaborative lifelong learning provides a comprehensive benchmark test suite
  • Edge-cloud collaborative lifelong learning gains advanced capabilities for unknown task recognition and processing

Figure 3 KubeEdge SIG AI Edge-Cloud Collaborative Lifelong Learning Algorithm Process

The updated edge-cloud collaborative lifelong learning process is shown in Figure 3. The following sections introduce the three new features in turn.

2.1 Upgrade Feature 1: Support Unstructured Data Scenarios

Edge-cloud collaborative lifelong learning needs to adapt to the varied scenarios of the open world, which contain not only structured data such as machine control data but also unstructured data such as audio and video. Handling the open world also runs through the entire edge-cloud collaborative lifelong learning process: an edge model may encounter unknown tasks at runtime, and these need to be recognized early, then processed and updated in a timely manner, to keep the service robust and reliable. A related case is shared below.

Figure 4 shows a robot intelligent navigation case, which can be applied to robot delivery or industrial inspection. In this case, semantic segmentation based on KubeEdge-Sedna lifelong learning was deployed on the Huawei campus. Lifelong learning detects new categories such as slopes to handle unknown situations, for example getting over low obstacles, and ultimately achieves intelligent navigation. The demonstration video has been presented at Open Source Summit Japan 2022, the first cloud-native edge computing academic seminar (KEAW'22), and the KubeEdge community open class [12-14]. The verification results show that the model's accuracy on corner cases increased by 1.78 times and the time per delivery decreased by 28.04%.

Figure 4 KubeEdge-Sedna cloud robot lifelong learning case

2.2 Upgrade Feature 2: Publish a Comprehensive Benchmark Test Suite
 
 

Lifelong learning algorithms, such as lifelong SLAM and lifelong object detection, have attracted attention in recent years because they can handle edge data heterogeneity and small-sample problems. In real-world practice, however, edge-cloud collaboration must also be taken into account. To accelerate research and its translation into practice, KubeEdge SIG AI has open-sourced an edge-cloud collaborative lifelong learning benchmark suite that helps AI application developers verify and select the most suitable edge-cloud collaborative lifelong learning algorithm. The features released this time also include semantic segmentation application examples, which can be used in scenarios such as robot navigation, inspection, cleaning, and delivery.

In this release, KubeEdge-Ianvs also provides out-of-the-box real-world datasets (name to be determined), baseline algorithms, and key metrics for developers to explore and use, as shown in Figure 5.

Figure 5 Edge-Cloud Collaborative Lifelong Learning Benchmark Test Suite: New Dataset
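As an example of what an evaluation metric for lifelong learning can look like, the sketch below computes average accuracy and backward transfer (BWT), two metrics widely used in the continual learning literature. Whether the benchmark suite uses exactly these definitions is an assumption here; the function names and the accuracy matrix are hypothetical. `acc[i][j]` is the accuracy on task j measured after the model has been trained on tasks 0 to i; a negative BWT indicates forgetting of earlier tasks.

```python
from typing import List, Sequence


def average_accuracy(acc: Sequence[Sequence[float]]) -> float:
    """Mean accuracy over all tasks after training on the last task."""
    if not acc:
        raise ValueError("accuracy matrix must not be empty")
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def backward_transfer(acc: Sequence[Sequence[float]]) -> float:
    """Average change in accuracy on earlier tasks after learning all tasks.

    BWT = 1 / (T - 1) * sum_{j < T - 1} (acc[T - 1][j] - acc[j][j])
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        raise ValueError("backward transfer needs at least two tasks")
    last = num_tasks - 1
    return sum(acc[last][j] - acc[j][j] for j in range(last)) / last


if __name__ == "__main__":
    # Hypothetical accuracy matrix for three sequentially learned tasks.
    acc: List[List[float]] = [
        [0.90, 0.00, 0.00],
        [0.80, 0.85, 0.00],
        [0.70, 0.80, 0.90],
    ]
    print("average accuracy:", average_accuracy(acc))
    print("backward transfer:", backward_transfer(acc))
```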

2.3 Upgrade Feature 3: New Advanced Capabilities for Unknown Task Recognition and Processing
 

(1) Robot inspection scenario: Unknown task recognition based on placeholders

Unknown task recognition is a key issue for lifelong learning in robot anomaly detection and inspection scenarios, where failing to detect an unknown situation can cause severe economic losses. Traditional machine learning methods train on a limited set of known samples and then run inference on a test set. They cannot effectively identify unknown samples from new categories and instead treat them as known samples. How to identify and handle unknown samples or unknown tasks is therefore becoming an important research direction for artificial intelligence. KubeEdge Ianvs reproduced the CVPR 2021 paper "Learning Placeholders for Open-Set Recognition" [15] in the edge-cloud collaborative lifelong learning scenario, as shown in Figure 6. The paper proposes a placeholder method that mimics the emergence of new classes and thereby converts closed-set training into open-set training. This work can advance research on unknown task recognition and help explore multiple solutions to the problem.

Figure 6 Robot inspection scenario: Unknown task recognition based on placeholders
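The placeholder idea can be illustrated at inference time with a classifier that has one extra output reserved for "something new". The sketch below is a simplified assumption, not the implementation from the paper or from Ianvs: it skips the training procedure entirely and only shows the decision rule, where a sample is flagged as unknown when the placeholder output wins over all known classes. The names `open_set_predict` and `UNKNOWN` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

UNKNOWN = -1  # hypothetical label for samples that match no known class


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def open_set_predict(known_logits: Sequence[float],
                     placeholder_logit: float) -> Tuple[int, float]:
    """Return (label, probability); label is UNKNOWN if the placeholder wins.

    known_logits: outputs for the K known classes.
    placeholder_logit: output of the extra placeholder class, which acts
    as a per-sample threshold between known and unknown.
    """
    probs = softmax(list(known_logits) + [placeholder_logit])
    best = max(range(len(probs)), key=probs.__getitem__)
    if best == len(known_logits):
        return UNKNOWN, probs[best]
    return best, probs[best]


if __name__ == "__main__":
    # Confident known sample: class 0 clearly beats the placeholder.
    print(open_set_predict([2.0, 0.5, 0.1], placeholder_logit=1.0))
    # No known class fits well: the placeholder wins, so the sample is unknown.
    print(open_set_predict([0.3, 0.2, 0.1], placeholder_logit=1.5))
```

Compared with a fixed confidence threshold, the placeholder output varies per sample, so the boundary between known and unknown can adapt to each input.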

(2) Robot inspection scenario: Unknown task processing based on generative adversarial networks

In robot anomaly detection and inspection scenarios, lifelong learning needs to process the unknown tasks it has recognized. The data of unknown tasks may be heterogeneous and limited to small samples. A Generative Adversarial Network (GAN) is an advanced generative model that can generate data following the distribution of real data. KubeEdge SIG AI released a new feature on Ianvs that tries to use a GAN to address the small-sample problem; however, the data generated by a GAN is unlabeled. Self-taught learning improves classification performance by applying sparse coding to unlabeled data to construct higher-level features. Community members therefore combine GAN with self-taught learning to help lifelong learning handle unknown tasks, as shown in Figure 7.

Figure 7 Robot inspection scenario: Unknown task processing based on generative adversarial networks

(3) Autonomous driving scenario: Unknown task processing based on multi-task joint inference

Autonomous driving is one of the important application areas of edge AI, and it raises the question of how to coordinate edge and cloud resources to support autonomous driving applications. Autonomous driving places high demands on edge AI inference performance. On the one hand, because vehicles move, the scenarios an autonomous vehicle faces are complex and changeable and the applicable tasks are uncertain, so the joint inference method must be updated dynamically according to task relationships. On the other hand, autonomous driving has strict real-time requirements, which forces a trade-off between accuracy and latency. It is therefore very difficult for edge devices to support such applications.

In autonomous driving perception, many factors can affect the performance of a model trained for a particular task, and for some tasks a suboptimal model has to be used for inference, which greatly reduces inference performance. Joint inference can improve perception performance, and the approach has already been applied successfully in earlier Sedna projects, such as the helmet detection example. The algorithm published on Ianvs this time adds multi-task joint inference to the lifelong learning feature. It enables edge devices, typified by self-driving cars, to complete high-accuracy neural network inference locally while meeting real-time requirements. The research is mainly based on heterogeneous multi-task autonomous driving perception datasets such as BDD100k.

Figure 8 Autonomous driving scenario: Unknown task processing based on multi-task joint inference
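The accuracy versus latency trade-off described above can be sketched as a small selection-and-fusion routine. This is a hypothetical illustration, not the algorithm published on Ianvs: the `TaskModel` type, the relevance scores, the greedy selection, and the weighted averaging are all assumptions. Given several task models (for example, models trained for different weather or lighting conditions), it picks the most relevant ones whose combined latency fits the real-time budget, assuming they run one after another on the edge device, and fuses their class probabilities.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class TaskModel:
    name: str
    predict: Callable[[Sequence[float]], List[float]]  # returns class probabilities
    latency_ms: float


def joint_inference(sample: Sequence[float],
                    models: Sequence[TaskModel],
                    relevance: Dict[str, float],
                    budget_ms: float) -> Tuple[int, List[float], List[str]]:
    """Fuse the most relevant task models that fit within the latency budget.

    relevance maps a model name to a non-negative score describing how closely
    its task relates to the current sample (how to obtain it is out of scope).
    """
    ranked = sorted(models, key=lambda m: relevance.get(m.name, 0.0), reverse=True)
    chosen: List[TaskModel] = []
    spent = 0.0
    for model in ranked:
        if relevance.get(model.name, 0.0) <= 0.0:
            continue  # unrelated task, skip
        if spent + model.latency_ms <= budget_ms:
            chosen.append(model)
            spent += model.latency_ms
    if not chosen:
        raise ValueError("no relevant model fits within the latency budget")

    total = sum(relevance[m.name] for m in chosen)
    fused: List[float] = []
    for model in chosen:
        weight = relevance[model.name] / total
        probs = model.predict(sample)
        if not fused:
            fused = [weight * p for p in probs]
        else:
            fused = [f + weight * p for f, p in zip(fused, probs)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused, [m.name for m in chosen]


if __name__ == "__main__":
    models = [
        TaskModel("day", lambda x: [0.9, 0.1], latency_ms=10.0),
        TaskModel("night", lambda x: [0.2, 0.8], latency_ms=10.0),
        TaskModel("rain", lambda x: [0.5, 0.5], latency_ms=30.0),
    ]
    relevance = {"day": 0.25, "night": 0.75, "rain": 0.5}
    print(joint_inference([0.0], models, relevance, budget_ms=25.0))
```

In the usage example, the "rain" model is relevant but too slow for the 25 ms budget, so only "night" and "day" are fused. Raising the budget would include it, trading latency for additional evidence.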

3. Release Notes

Readers interested in more details about this release can refer to the Sedna v0.6 and Ianvs v0.2 release notes:

https://github.com/kubeedge/sedna/releases/tag/v0.6.0

https://github.com/kubeedge/ianvs/releases/tag/v0.2.0

KubeEdge SIG AI will publish a series of follow-up articles introducing the features of this comprehensive upgrade in detail. Readers are welcome to keep following the community's updates.

References

[1] Zheng, Z., Li, Y., Song, H., Wang, L., & Xia, F. (2022, October). Towards Edge-Cloud Collaborative Machine Learning: A Quality-aware Task Partition Framework. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management (ACM CIKM’22), pp. 3705-3714.

[2] Zheng, Z., Luo, P., Li, Y., Luo, S., Jian, J., & Huang, Z. (2022, June). Towards lifelong thermal comfort prediction with KubeEdge-Sedna: online multi-task learning with metaknowledge base. In Proceedings of ACM e-Energy’22, pp. 263-276.

[3] Liu, B. (2017). Lifelong machine learning: a paradigm for continuous learning. Frontiers of Computer Science, 11(3), 359-361.

[4] Huawei Cloud Developers. Support edge-cloud collaborative lifelong learning features, KubeEdge sub-project Sedna 0.3.0 version released! [EB/OL]. 2021-06-07. https://segmentfault.com/a/1190000040132422/en.

[5] Zheng Zimu. KubeEdge-Sedna v0.3: Towards an AI Engineering Paradigm for Automatic Customization in the Next Era [J]. Automation Expo, vol. 39, no. 344 (2022.02), pp. 72-75.

[6] Zheng, Zimu, Jie Pu, Linghui Liu, Dan Wang, Xiangming Mei, Sen Zhang, and Quanyu Dai. "Contextual anomaly detection in solder paste inspection with multi-task learning." ACM Transactions on Intelligent Systems and Technology (TIST) 11, no. 6 (2020): 1-17.

[7] Zheng, Z., Xie, D., Pu, J., & Wang, F. (2020, June). Melody: Adaptive task definition of cop prediction with metadata for hvac control and electricity saving. In Proceedings of ACM e-Energy’20. pp. 47-56.

[8] Zheng, Z., Wang Y., Dai Q., Zheng H., Wang, D. "Metadata-driven task relation discovery for multi-task learning." In Proceedings of IJCAI (CCF-A), 2019.

[9] Zheng, Z., Chen, Q., Fan, C., Guan, N., Vishwanath, A., Wang, D., & Liu, F. "Data Driven Chiller Sequencing for Reducing HVAC Electricity Consumption in Commercial Buildings." In Proceedings of ACM e-Energy, 2018. Best Paper Award.

[10] Zheng, Z., Chen, Q., Hu, C., Wang, D., & Liu, F. "On-edge Multi-task Transfer Learning: Model and Practice with Data-driven Task Allocation." In Proceedings of IEEE TPDS (CCF-A), 2019.

[11] Chen, Q., Zheng, Z., Hu, C., Wang, D., & Liu, F. "Data-driven task allocation for multi-task transfer learning on the edge." In Proceedings of IEEE ICDCS (CCF-B), 2019.

[12] Siqi Luo. From Ground to Space: Cloud-Native Edge Machine-Learning Case Studies with KubeEdge-Sedna [EB/OL]. 2022-12-05. https://www.youtube.com/watch?v=bIaeWGelsJE 

[13] Zheng Zimu. Edge-Cloud Collaborative Lifelong Learning Innovation Exploration and Implementation in Smart Parks and Industrial Fields [EB/OL]. KEAW'22. 2022-11-17. https://www.bilibili.com/video/BV1Me411N7gA/

[14] Zheng Zimu, Yang Haojin. KubeEdge Cloud Native Edge Computing Open Course 12 - Advanced Edge Intelligence: Adapting to Various Scenarios and Dealing with Distributed Systems [EB/OL]. 2022-12-27. https://www.bilibili.com/video/BV1W44y1R7uB

[15] Zhou, D. W., Ye, H. J., & Zhan, D. C. (2021). Learning placeholders for open-set recognition. In Proceedings of CVPR (pp. 4401-4410). 

