EasyStack productizes operations and maintenance experience from cloud platforms with tens of thousands of nodes | Lightweight O&M: Intelligent Monitoring

Editor's note:

EasyStack ECS, the new-generation private cloud from EasyStack, productizes the operations and maintenance (O&M) experience gained from 1,000+ large and medium-sized enterprise customers and from cloud platforms with tens of thousands of nodes, and delivers lightweight O&M. It is built on a secure, stable, and efficient new-generation distributed cloud operating system for data centers. Its integrated, scenario-based design separates the platform from the services, which makes the whole platform evolvable and keeps O&M lightweight. On the O&M side, it supports intelligent, unified O&M of ultra-large-scale cloud computing centers. It visualizes and automates logs, monitoring, and alarms, and it autonomously detects changes in system topology and service status, enabling fault pre-diagnosis and rapid self-healing based on intelligent perception.

This article is the intelligent monitoring installment of the EasyStack lightweight O&M series.

 

As enterprise digital business keeps growing, the number of online business systems increases, and the stable operation of IT systems becomes ever more important. Faced with increasingly complex and fast-changing IT systems, companies need a unified monitoring platform that covers infrastructure, application performance, and user experience management, provides unified monitoring, logging, and alarm services, and forms a three-dimensional IT monitoring and O&M management system. Such a system achieves fault self-healing without manual back-end operations, improves the overall efficiency and service level of IT O&M, and keeps business systems running continuously and stably.

Unified monitoring, logging, and alarm services for unattended intelligent O&M

In a traditional private cloud, the monitoring, inspection, and logging systems are built separately. The monitoring tools require data to be integrated and analyzed by hand, so the IT operations team can only deal with problems ad hoc as they arise. EasyStack ECS, by contrast, supports integrated, unified O&M of ultra-large-scale cloud computing centers and productizes the O&M experience of 1,000+ large and medium-sized enterprise customers and cloud platforms with tens of thousands of nodes. Handling for common problems is built into the product, and the alarm knowledge base is updated continuously so that monitoring keeps evolving.

EasyStack ECS provides an intelligent O&M monitoring service. In addition to per-project cloud resource monitoring, it gives O&M staff a global view that tracks all platform runtime indicators in real time, so they learn of resource usage and service status immediately. The entire process, from early warning and discovery through diagnosis to handling, is automated, which greatly reduces the O&M workload.

EasyStack ECS intelligent O&M monitoring

EasyStack ECS has a complete monitoring and alarm mechanism and exposes full monitoring, logging, and alarm APIs, which makes it easy to integrate with an enterprise's existing systems. Alarms can be set on individual indicators so that administrators are notified promptly of system failures and potential risks. A log management service is also provided so that O&M staff can audit the platform's historical operating status and troubleshoot problems.
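To make "setting an alarm on an indicator" concrete, here is a minimal sketch of threshold-based alarm evaluation. It does not use the ECS alarm API, which the article does not describe; `AlarmRule`, `is_firing`, and the `cpu_util` metric name are hypothetical names chosen for illustration. The rule fires only when several consecutive samples breach the threshold, a common way to avoid alarming on a single spike.

```python
from dataclasses import dataclass
import operator

# Comparison operators a (hypothetical) alarm rule may use.
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class AlarmRule:
    metric: str       # indicator name, e.g. "cpu_util" (illustrative)
    op: str           # one of the keys in _OPS
    threshold: float  # value the samples are compared against
    periods: int      # consecutive breaching samples required to fire


def is_firing(rule: AlarmRule, samples: list[float]) -> bool:
    """Return True when the last `rule.periods` samples all breach the threshold."""
    if len(samples) < rule.periods:
        return False
    compare = _OPS[rule.op]
    return all(compare(value, rule.threshold) for value in samples[-rule.periods:])


rule = AlarmRule(metric="cpu_util", op=">", threshold=90.0, periods=3)
print(is_firing(rule, [70.0, 95.0, 96.0, 97.0]))  # True: last three samples exceed 90
print(is_firing(rule, [95.0, 96.0, 80.0, 97.0]))  # False: the run is broken by 80.0
```

In a real deployment the samples would come from the platform's monitoring API and a firing rule would trigger a notification; both steps are left out here.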

Full-stack resource coverage, intelligent fault handling

The EasyStack ECS monitoring service provides unified monitoring and management of the cloud platform's physical resources, cloud service resources, distributed storage clusters, and control-plane service status. It offers a rich set of large-screen visual dashboards and covers a wide range of monitoring indicators to meet users' requirements for system stability and reliability.

Real-time view of the overall cloud platform situation: A unified interface provides multi-dimensional, comprehensive monitoring of data center resources. Situational awareness of the underlying resource data is presented in intuitive, friendly visualizations that show the overall health of applications, infrastructure, and alarms together with the key data of each monitored object, so that O&M staff can keep all business applications and IT operations under control as a whole.

Multi-dimensional comprehensive monitoring of cloud resources

Support for O&M decision-making and capacity planning: Cloud monitoring works out of the box. Users log in to the cloud monitoring console to view cloud service monitoring reports and fine-grained indicators of performance, capacity, and operating status, which inform O&M decisions and capacity planning. Alarm notifications and automatic inspection reports can be pushed by email, so that warnings arrive quickly when the infrastructure behaves abnormally.

Real-time warning of infrastructure abnormalities
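As an illustration of how capacity indicators can feed capacity planning, the sketch below fits a straight line to daily usage samples and estimates how many days remain before a capacity limit is reached. This is a generic least-squares trend, not a method the article attributes to ECS; `days_until_full` and the sample figures are hypothetical.

```python
from typing import Optional


def days_until_full(usage: list[float], capacity: float) -> Optional[float]:
    """Estimate days until `capacity` is reached, from one usage sample per day.

    Returns None when there are too few samples or usage is not growing.
    """
    n = len(usage)
    if n < 2:
        return None
    mean_x = (n - 1) / 2  # mean of day indices 0..n-1
    mean_y = sum(usage) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(usage))
    var = sum((i - mean_x) ** 2 for i in range(n))
    slope = cov / var  # fitted growth per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Day index where the fitted line hits capacity, relative to the last sample.
    return max(0.0, (capacity - intercept) / slope - (n - 1))


# Five daily samples of used storage in TB against a 60 TB pool (made-up numbers).
print(days_until_full([40.0, 42.0, 44.0, 46.0, 48.0], capacity=60.0))  # 6.0
```

A linear trend is the simplest possible model; real usage is often seasonal or bursty, so the estimate is best treated as a rough planning signal.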

Fault pre-diagnosis and rapid self-healing: The platform tracks the operating status of every business application system accurately and in real time, autonomously detects changes in system topology and service status, and uses this intelligent perception to pre-diagnose faults and heal them quickly.

Efficient fault location, fast self-healing
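Detecting changes in topology and service status can be pictured as comparing two consecutive snapshots of the platform. The sketch below is one simple way to do that and is not taken from ECS; `diff_status` is a hypothetical helper, and the service and node names are illustrative.

```python
def diff_status(previous: dict[str, str], current: dict[str, str]) -> dict[str, list]:
    """Compare two {component: status} snapshots and report what changed."""
    added = sorted(current.keys() - previous.keys())    # new components (topology change)
    removed = sorted(previous.keys() - current.keys())  # vanished components
    changed = sorted(
        (name, previous[name], current[name])
        for name in previous.keys() & current.keys()
        if previous[name] != current[name]              # status transition
    )
    return {"added": added, "removed": removed, "changed": changed}


before = {"nova-api": "up", "cinder-volume": "up", "node-17": "up"}
after = {"nova-api": "up", "cinder-volume": "down", "node-18": "up"}

report = diff_status(before, after)
print(report["added"])    # ['node-18']
print(report["removed"])  # ['node-17']
print(report["changed"])  # [('cinder-volume', 'up', 'down')]
```

Each entry in the report is a candidate event for diagnosis: a removed node may indicate a host failure, and an `up` to `down` transition may warrant an automatic restart.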

Building a three-dimensional IT monitoring and O&M management system for heterogeneous multi-cloud

As customers increasingly adopt multi-cloud, more and more applications in their IT estates run on a mix of x86 and non-x86 platforms. These applications need a cloud platform that supports heterogeneous multi-cloud and provides unified service monitoring.

EasyStack builds a heterogeneous multi-cloud service platform on the new-generation private cloud ECS. The platform provides x86 and non-x86 heterogeneous computing capabilities for business applications and hides the technical differences between the underlying heterogeneous resources, meeting enterprise demand for "diversified computing power and multi-cloud form". On the same basis, EasyStack provides unified service monitoring across heterogeneous clouds, breaking down data silos and building a three-dimensional IT monitoring and O&M management system.

Case: A large state-owned bank implements intelligent monitoring on EasyStack's new-generation private cloud ECS

A large state-owned bank has total assets of more than 10 trillion yuan. The bank has made financial technology a strategic priority, actively promoted the construction of an Internet finance platform, adopted EasyStack's new-generation private cloud ECS, and built a financial production cloud based on OpenStack. The bank's cloud platform spans two sites and three data centers and runs on thousands of nodes, and many of its clouds were built project by project. As a result there are multiple control planes and monitoring systems, and their combined equipment and resource footprint has become a problem. The need for unified resource management, unified deployment, and unified O&M is increasingly urgent.

EasyStack ECS brings monitoring and O&M together under centralized, visual management: the cloud services and resource pools provided by the cloud systems across the two sites and three data centers are monitored, managed, and built out in a unified way to maximize platform availability. In addition, with the help of AIOps, accumulated O&M experience is productized into intelligent fault-event scheduling. When a given type of failure occurs, the cloud platform automatically triggers the corresponding handling mechanism. This platform-wide self-healing design safeguards stable operation and improves the quality of platform management and O&M services.
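The idea of a failure type automatically triggering a handling mechanism can be sketched as a dispatch table that maps fault types to handlers. The example below is a simplified illustration under assumed names, not the ECS or AIOps implementation: the fault types, the `handles` decorator, and the returned action strings are all hypothetical, and the handlers only describe an action rather than perform it.

```python
from typing import Callable

Handler = Callable[[dict], str]

# Registry of fault type -> handler; stands in for productized O&M experience.
_HANDLERS: dict[str, Handler] = {}


def handles(fault_type: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one fault type."""
    def register(fn: Handler) -> Handler:
        _HANDLERS[fault_type] = fn
        return fn
    return register


@handles("service_down")
def restart_service(event: dict) -> str:
    return f"restart {event['target']}"


@handles("disk_full")
def rotate_logs(event: dict) -> str:
    return f"rotate logs on {event['target']}"


def dispatch(event: dict) -> str:
    """Pick the handler for the event's fault type, or escalate to a human."""
    handler = _HANDLERS.get(event["type"])
    if handler is None:
        return f"escalate {event['type']} on {event['target']} to operator"
    return handler(event)


print(dispatch({"type": "service_down", "target": "nova-api"}))  # restart nova-api
print(dispatch({"type": "fan_failure", "target": "node-17"}))
# escalate fan_failure on node-17 to operator
```

Keeping the mapping in a registry means new handling experience can be added as a new handler without touching the dispatch logic, and unknown fault types fall back to a human operator.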

With the intelligent O&M monitoring service of EasyStack ECS, the bank can collect monitoring indicators for private cloud resources, as well as user-defined indicators, in real time, probe service availability, and set alarms on indicators. This gives it full visibility into the service status and business support capability of its core systems, lays a data foundation for business and system performance analysis and for IT O&M management decisions, and keeps cloud applications running smoothly with a simple, efficient, lightweight O&M experience.


Origin blog.csdn.net/k8scaptain/article/details/107390360