Refreshed and Upgraded! A New Generation of Cloud-Native Observability Platform

This article is shared from the Huawei Cloud Community post "Refreshed and Upgraded! A New Generation of Cloud-Native Observability Platform", author: The future of cloud containers.

Cloud native has become the prevailing approach to enterprise application modernization and digital transformation. A cloud-native architecture gives enterprise applications faster iteration, lower development complexity, and better scalability. However, constantly changing conditions, such as application deployment locations and instance counts that cannot be fixed in advance, have greatly increased both the complexity of operations and maintenance and the workload of the personnel responsible for it.

Compared with traditional operations and maintenance, operations under a cloud-native architecture place more emphasis on the automated collection, visual presentation, and intelligent analysis of monitoring data, logs, events, alarms, and other data. To improve the operations experience in cloud-native scenarios, Huawei Cloud CCE (Cloud Container Engine) introduces a new generation of cloud-native observability platform, focusing on the following four capabilities:

Monitoring Center

To make monitoring systems easier for cloud-native users to adopt, CCE has been optimized for complex scenarios that combine multiple services. It supports one-click activation of the Monitoring Center and provides a new one-stop visual monitoring experience from a container perspective, with monitoring views across multiple dimensions such as clusters, nodes, workloads, and Pods.

Figure 1  Monitoring Center

Alarm Center

Prometheus alarm statements are complex, different categories of alarm sources have separate configuration entries, and the large number of basic alarm items makes configuration inefficient. To address these problems, an Alarm Center capability has been added to CCE clusters, providing template-based one-click configuration of container alarms. The default alarm rules cover common failure scenarios of clusters and containers.

Figure 2  Alarm Center
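
As a rough illustration of what template-based alarm configuration can look like, the following Python sketch renders Prometheus-style alerting rules from a few parameters. It is not CCE's implementation: the template keys, rule names, thresholds, and the `render_rule` helper are hypothetical, and the metric names assume that kube-state-metrics and node_exporter are being scraped.

```python
import json
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class AlarmTemplate:
    """A reusable alarm definition with placeholders in the PromQL expression."""
    name: str
    expr: Template   # PromQL with $placeholders to be filled in per cluster
    duration: str    # how long the condition must hold before the alarm fires
    severity: str


# Hypothetical built-in templates covering two common failure scenarios.
TEMPLATES = {
    "pod_restarts": AlarmTemplate(
        name="PodRestartingTooOften",
        expr=Template(
            'increase(kube_pod_container_status_restarts_total'
            '{namespace="$namespace"}[10m]) > $threshold'
        ),
        duration="5m",
        severity="warning",
    ),
    "node_cpu_high": AlarmTemplate(
        name="NodeCpuHigh",
        expr=Template(
            '100 - avg by (instance) '
            '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100 > $threshold'
        ),
        duration="10m",
        severity="critical",
    ),
}


def render_rule(key: str, **params) -> dict:
    """Fill a template's placeholders and return a Prometheus-style rule dict.

    Raises KeyError if the template key is unknown or a placeholder is missing.
    """
    template = TEMPLATES[key]
    return {
        "alert": template.name,
        "expr": template.expr.substitute(**params),
        "for": template.duration,
        "labels": {"severity": template.severity},
    }


if __name__ == "__main__":
    rules = [
        render_rule("pod_restarts", namespace="prod", threshold=3),
        render_rule("node_cpu_high", threshold=85),
    ]
    print(json.dumps({"groups": [{"name": "defaults", "rules": rules}]}, indent=2))
```

The point of the sketch is that the user supplies only a template key and a couple of values, while the PromQL itself stays inside the template.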

Log Center

In cloud-native scenarios, traditional log management systems suffer from a fragmented user experience, complex collection configuration, and log retrieval and viewing that do not fit the cloud-native conceptual model. To solve these problems, CCE has deeply integrated the capabilities of the LTS log service and launched a cloud-native Log Center, which simplifies log collection configuration and provides a log management view from a cloud-native perspective.

Figure 3  Log Center

Health Center

The rich monitoring metrics, events, and logs available in cloud-native scenarios make it easier to locate problems, but they also raise the technical bar for operations and maintenance personnel. To help more of them locate problems quickly, CCE provides a Health Center capability. Drawing on the experience of Huawei Cloud container operations experts, it performs a comprehensive inspection of cluster health, discovers cluster faults and potential risks, and provides repair suggestions.

Figure 4  Health Center

These are the four major capabilities of the new generation of the CCE cloud-native observability platform. The next article will delve into the challenges customers face in cloud-native monitoring and focus on how the CCE Monitoring Center addresses them, so stay tuned.

To try the service, visit:

  • https://www.huaweicloud.com/product/cce.html

Related Links

  • https://support.huaweicloud.com/bulletin-cce/cce_bulletin_0066.html

  • https://bbs.huaweicloud.com/blogs/413722

Origin my.oschina.net/u/4526289/blog/10123223