Concepts related to full-link automated monitoring and microservice application monitoring

Four parts of full link monitoring: link collection, indicator collection, log collection, and in-depth analysis

  • Link collection covers call chains and service topology, and is the thread that ties full-link analysis together.

  • Metric collection is built into the service link, giving the entire link basic monitoring capability.

  • Log collection gathers the log data that also serves as a data source for full-link analysis.

  • In-depth analysis includes offline and online modules that together cover troubleshooting across the whole link.
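
To make the relationship between call chains and service topology concrete, here is a minimal sketch. It assumes each collected record is a span carrying a trace ID, its own ID, and its parent's ID; the `Span` fields, the helper names, and the sample services are all hypothetical and not taken from any particular tracing system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """One hop of a request; every field name here is illustrative."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str
    duration_ms: float


def call_chain(spans, trace_id):
    """Return the spans of one trace, parents before children (depth-first)."""
    children = defaultdict(list)
    for s in spans:
        if s.trace_id == trace_id:
            children[s.parent_id].append(s)
    ordered = []
    stack = list(reversed(children[None]))  # root spans have no parent
    while stack:
        span = stack.pop()
        ordered.append(span)
        stack.extend(reversed(children[span.span_id]))
    return ordered


def service_topology(spans):
    """Derive caller -> callee service edges from parent/child span pairs."""
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    edges = set()
    for s in spans:
        parent = by_id.get((s.trace_id, s.parent_id))
        if parent is not None and parent.service != s.service:
            edges.add((parent.service, s.service))
    return edges


# Hypothetical spans for a single request.
spans = [
    Span("t1", "a", None, "gateway", 120.0),
    Span("t1", "b", "a", "order", 80.0),
    Span("t1", "c", "b", "inventory", 30.0),
    Span("t1", "d", "a", "user", 15.0),
]
chain = [s.service for s in call_chain(spans, "t1")]
topology = service_topology(spans)
```

The call chain is a per-request view, while the topology is what remains once the spans of many requests are reduced to service-to-service edges.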

In a microservice architecture, each monitoring dimension calls for its own monitoring method.

(1) Health check. A health check monitors the health of the application itself and verifies that the service is still alive.

(2) Logs. Logs are the main way to troubleshoot problems, and they provide rich information for locating and resolving issues.

(3) Call chain monitoring. Call chain monitoring presents the complete picture of a request, including the service call path and the time spent at each step.

(4) Metric monitoring. Metrics are discrete time-series data points that, once aggregated and computed, reflect the trend of important measurements.
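
As an illustration of how discrete time-series points become a trend after aggregation, the sketch below averages samples over fixed windows. The function name, the 60-second window, and the sample values are assumptions made for this example only.

```python
from collections import defaultdict


def aggregate_by_window(samples, window_s=60):
    """Group (timestamp_s, value) samples into fixed windows and average each."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % window_s].append(value)  # key = window start time
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical response-time samples in milliseconds: (seconds, value).
samples = [(0, 100.0), (20, 110.0), (40, 120.0), (60, 200.0), (90, 220.0), (120, 400.0)]
trend = aggregate_by_window(samples)
```

Individual samples say little on their own; the per-window averages show response time climbing from one minute to the next.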

Of these four monitoring methods, health checks are a capability provided by infrastructure such as the cloud platform. Logs generally have a dedicated log center that handles collection, storage, computation, and querying. Call chain monitoring generally has its own independent solution for instrumenting, collecting, computing, and querying service calls.
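
For readers unfamiliar with how a platform-side health check works, the following is a minimal sketch: the service exposes a liveness endpoint and a probe requests it. The `/healthz` path, the handler, and the `probe` helper are assumptions for illustration; real platforms define their own probe configuration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":  # endpoint path is an assumption
            status, body = 200, b"ok"
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def probe(host, port, path="/healthz", timeout_s=2.0):
    """Return True if the service answers the health check with HTTP 200."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
        conn.request("GET", path)
        status = conn.getresponse().status
        conn.close()
        return status == 200
    except OSError:
        return False  # connection refused or timed out: treat as not alive


server = HTTPServer(("127.0.0.1", 0), HealthHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

alive = probe("127.0.0.1", port)
missing = probe("127.0.0.1", port, path="/other")

server.shutdown()
server.server_close()
```

A probe like this only answers "is the process reachable and responding", which is why the other three methods are still needed for diagnosis.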

Main reasons for choosing Prometheus for indicator monitoring:

(1) Mature community support. Prometheus is open source monitoring software with an active community, and it works well in cloud-native environments.

(2) Easy to deploy and maintain. The Prometheus core is a single binary with no third-party dependencies, which makes it very convenient to deploy and maintain.

(3) Pull model. Prometheus pulls monitoring data from each monitoring target over HTTP. A Push model, by contrast, generally relies on an Agent that collects information and pushes it to the collector. Each service's Agent must be configured with the monitored data items and the monitoring server information, which makes operation and maintenance harder as the number of services grows. In addition, under a Push model the monitoring server receives a large number of requests and data at once during traffic peaks, which puts it under heavy pressure and, in severe cases, can make it unavailable.

(4) Powerful data model. The monitoring data Prometheus collects is stored in its built-in time series database as metrics. Besides the basic metric name, custom labels are supported; labels define rich dimensions that make monitoring data easy to aggregate and compute over.

(5) Powerful query language, PromQL. PromQL supports querying and aggregating monitoring data, and it drives visualization and alerting.

(6) Complete ecosystem. Prometheus offers integrations for common operating systems, databases, middleware, libraries, and programming languages, and provides client SDKs for Java, Golang, Ruby, Python, and other languages, so custom monitoring logic can be implemented quickly.

(7) High performance. A single Prometheus instance can handle millions of time series and ingest hundreds of thousands of data points per second, and it performs very well in both data collection and querying.
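
The Pull model and the label-based data model described above can be sketched together. The toy `Counter` class below is not the real Prometheus client SDK; it is a stand-in, written under the assumption that a service keeps labeled counters in memory and renders them as text for the server to scrape over HTTP (conventionally at a `/metrics` path). The metric name and label values are hypothetical.

```python
from collections import defaultdict


class Counter:
    """A toy labeled counter; NOT the real Prometheus client API."""

    def __init__(self, name, help_text, label_names):
        self.name = name
        self.help_text = help_text
        self.label_names = tuple(label_names)
        self._values = defaultdict(float)  # label values tuple -> count

    def inc(self, amount=1.0, **labels):
        key = tuple(labels[n] for n in self.label_names)
        self._values[key] += amount

    def expose(self):
        """Render the samples in the style of the Prometheus text format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            labels = ",".join(f'{n}="{v}"' for n, v in zip(self.label_names, key))
            lines.append(f"{self.name}{{{labels}}} {value:g}")
        return "\n".join(lines) + "\n"

    def sum_by(self, label):
        """Aggregate over one label, similar in spirit to PromQL `sum by (label)`."""
        idx = self.label_names.index(label)
        totals = defaultdict(float)
        for key, value in self._values.items():
            totals[key[idx]] += value
        return dict(totals)


# Hypothetical metric with two label dimensions.
requests = Counter("http_requests_total", "Total HTTP requests.", ["method", "status"])
requests.inc(method="GET", status="200")
requests.inc(method="GET", status="200")
requests.inc(method="GET", status="500")
requests.inc(method="POST", status="200")

text = requests.expose()               # what a scrape of the target would return
by_method = requests.sum_by("method")  # label-based aggregation
```

In Prometheus itself the aggregation would be expressed in PromQL rather than in application code, for example `sum by (method) (http_requests_total)`. The service only needs to expose its current values, and the server decides when to pull them.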

Exploration and practice of iQIYI full-link automatic monitoring platform

iQIYI Microservice Application Monitoring Practice Based on Prometheus

Origin blog.csdn.net/qq_35240226/article/details/108096049