1. Background
1.1 What is a microservice application
Microservice applications are composed of autonomous services, each of which provides a single capability; together these services deliver complex business functions. Compared with traditional monolithic applications, microservice applications are highly distributed. The figure below shows a typical microservice application:
As shown in the figure above, microservices generally have the following major characteristics:
Autonomy
Robustness (Resilience)
Transparency
Automation
Alignment
In this article, we focus on transparency. In a microservice application, a single request spans multiple microservices, and those microservices may be developed by different teams, so engineers need to be notified promptly when errors occur.
This requires each microservice to be transparent and observable, so that engineers can inspect and diagnose problems at runtime. In practice, keeping microservices healthy generally requires collecting a large amount of data, which typically falls into three categories:
(1) Business, operational, and infrastructure metrics
(2) Application logs
(3) Request traces
1.2 Observability
The four pillars of an observability platform are metrics, tracing, logging, and visualization.
1.2.1 Metrics
Metrics are numeric data collected at regular intervals. We usually care about the minimum, maximum, average, and percentile values. Metrics generally reflect the performance of systems and applications. For example, when a microservice's resource utilization exceeds a threshold, an alert can be triggered to help DevOps staff diagnose the problem and take further action.
1.2.2 Tracing
A trace is a sequence of related distributed events. Generally, we generate a unique request ID at the gateway layer, and this ID is propagated to every participant in the request, so a trace is effectively the set of log records that share a request ID. Each record carries trace and debug information, such as the entry time and latency.
1.2.3 Logging
A log is a timestamped event record. It can contain all kinds of information, such as the acknowledgment of a successful request or the error produced by a failed one.
1.2.4 Visualization
Visualization helps you better understand the collected data and maximizes your insight into the system. Although engineering teams rely on automation to improve operational efficiency, many problems still require human intervention to discover, so dashboards that display the data in real time are a very useful feature.
In general, building an observability platform for a microservice application is challenging: we need global information, assembled from each service's independent data into a single view that lets operations staff grasp the overall situation. This article uses Honghu, a Chinese big data analytics platform, to show step by step how to build such a platform.
2. Honghu Introduction
Honghu provides a series of out-of-the-box services covering data import, storage, analysis and computation, and visualization. Its capabilities at each stage of the data pipeline are fairly complete and powerful.
The collection and import layer connects easily to data sources such as Vector and Kafka, and can also receive data pushed through standard REST and HTTP APIs.
The analysis layer uses standard SQL, which most engineers already know. The platform has more than 100 built-in scalar and table functions, plus views and lookup tables, and supports joins across databases and heterogeneous data, covering a wide range of business analysis needs.
The visualization layer provides a large number of out-of-the-box chart types, rich input selection and drill-down functions, and a collaborative chart editing experience.
It is worth mentioning Honghu's self-developed schema-on-read engine, which can quickly ingest and store heterogeneous data and supports adjusting the data model and analysis parameters dynamically, without freezing the model or the analysis pipeline. When the business analysis scenario changes, you only need to adjust the SQL statements to respond quickly.
After preliminary research, Honghu basically meets our needs for building an observability platform.
3. Solutions
3.1 System Architecture
Based on Honghu's simple and easy-to-use data collection features, building the log collection system above is straightforward. The solution works as follows:
Each microservice is deployed dynamically on Kubernetes nodes, and each service's output is stored on its node in kubernetes_logs format.
A Vector agent is deployed on every node as a DaemonSet.
Honghu has a built-in Vector endpoint; it only needs to be enabled in the configuration.
After the Vector agent parses, enriches, and transforms the logs, it continuously ships them to Honghu.
3.2 Data access
Honghu offers a variety of data ingestion options; its built-in Vector and Kafka integrations make it much easier for enterprises to collect data and bring it into the Honghu analysis platform to mine further value. Based on the collection system above, the specific steps are as follows:
3.2.1 Enable Honghu Data Collection Interface
3.2.1.1 Go to: Honghu -> Data Import -> Import from External Data Sources
3.2.1.2 Configure the Vector interface and select the dataset scope (this article creates a dataset named datalog)
3.2.1.3 Select the dataset and data source type, then generate and download the Vector configuration template for configuring Vector later
3.2.2 Configuring and installing Vector
According to Vector's design (shown above), the data pipeline is divided into three stages: sources, transforms, and sinks. Below is the configuration file we actually run (partially redacted for information security). It defines the data sources to collect, how the data is processed and transformed (multi-line handling, further parsing and field extraction, and enrichment to meet Honghu's requirements), and finally the sink that delivers the data to Honghu. It should not be hard to follow.
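Since the actual file is redacted, the following is only a minimal sketch of what such a Vector configuration can look like. The Honghu endpoint URI, the `service` field, and the dataset routing are placeholders and assumptions, not the real values; consult the template downloaded in step 3.2.1.3 for the correct sink settings.

```toml
# Source: collect container logs from the Kubernetes node (DaemonSet agent)
[sources.k8s]
type = "kubernetes_logs"

# Transform: parse and enrich each event with VRL
[transforms.parse]
type = "remap"
inputs = ["k8s"]
source = '''
# Try to parse the container's JSON log line; keep the raw text otherwise
parsed, err = parse_json(.message)
if err == null {
  . = merge!(., parsed)
}
# Attach a service label for later analysis (assumed field name)
.service = .kubernetes.container_name
'''

# Sink: ship events to Honghu's Vector-compatible HTTP endpoint
[sinks.honghu]
type = "http"
inputs = ["parse"]
# Placeholder: use the URI from the generated configuration template
uri = "http://HONGHU_HOST:PORT/..."
encoding.codec = "json"
```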
With Vector's configuration in place, it is time to install Vector. The solution in this article collects logs from microservices running on Kubernetes, so Vector is installed with Helm. The commands are as follows:
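As the exact commands are not shown here, the following is a typical installation using the official Vector Helm chart; the values file name `values.yaml` (which would carry the customized pipeline configuration) is an assumption.

```shell
# Add the official Vector Helm repository and refresh the local index
helm repo add vector https://helm.vector.dev
helm repo update

# Install Vector as a per-node agent (DaemonSet), overriding the
# default configuration with our own values file
helm install vector vector/vector \
  --namespace vector --create-namespace \
  --set role=Agent \
  --values values.yaml
```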
If you need to know more about Vector Helm chart, please refer to Vector Helm Chart (https://github.com/vectordotdev/helm-charts/tree/develop/charts/vector)
3.2.3 Verify data
After the data enters Honghu, as shown in the figure below, the system indexes it and stores the raw data; users can then build dashboards and analyze it in real time.
Once our data is flowing into Honghu, we can open Honghu to confirm it with a query.
Open Honghu -> Query -> Advanced Query, execute a simple SQL statement, and you can see the data arriving in real time:
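A quick sanity check might look like the following sketch. The dataset name datalog comes from section 3.2.1; the timestamp column name `_time` is an assumption about Honghu's schema and may need adapting.

```sql
-- Peek at the most recent events in the datalog dataset
SELECT *
FROM datalog
ORDER BY _time DESC
LIMIT 10;
```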
3.3 Application scenarios
3.3.1 Statistics of all events that occur in each service
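Counting events per service can be a simple aggregation, sketched below. datalog is the dataset created earlier; the service column is whatever label the Vector transform attached, so its name here is an assumption.

```sql
-- Event volume per microservice, busiest first
SELECT service, COUNT(*) AS event_count
FROM datalog
GROUP BY service
ORDER BY event_count DESC;
```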
3.3.2 Statistics of external requests
When computing such metrics, it is often necessary to bucket the statistics by time period. Honghu's handling here is very convenient: truncate the timestamp to the desired granularity, then aggregate directly on the truncated value.
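As a sketch of that pattern, bucketing external requests per minute could look like the following; the `DATE_TRUNC` function, the `_time` column, and the gateway filter are assumptions to be adapted to Honghu's actual SQL dialect and your field names.

```sql
-- Requests per minute, using the truncated timestamp as the bucket key
SELECT DATE_TRUNC('minute', _time) AS minute_bucket,
       COUNT(*) AS request_count
FROM datalog
WHERE service = 'gateway'   -- assumed label for externally-facing requests
GROUP BY minute_bucket
ORDER BY minute_bucket;
```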
3.3.3 Statistics request processing time
In a microservice application, each request spans one or more containers, so the request's total processing time accumulates across them. Our application accounted for this from the beginning: when a request enters the application gateway, it is assigned a request_id, and that request_id is then recorded in the logs of every application the request touches. Based on this, using the window functions provided by Honghu, we can group events by request_id and roughly compute the processing time of each request.
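A sketch of that idea follows; the column names are assumptions, and the same span can also be computed with window functions such as MIN(_time) OVER (PARTITION BY request_id) if you need it per event rather than per group.

```sql
-- Approximate per-request latency: span between the first and last
-- log event carrying the same gateway-assigned request_id
SELECT request_id,
       MAX(_time) - MIN(_time) AS approx_duration
FROM datalog
WHERE request_id IS NOT NULL
GROUP BY request_id
ORDER BY approx_duration DESC
LIMIT 20;
```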
3.3.4 API Gateway log output
In a microservice system, the gateway is critical. Mainstream gateways today are implementations based on Nginx or Envoy, and our deployments involve both. This article covers only monitoring the Nginx gateway; monitoring an Envoy-based gateway with Honghu will be covered in a future article. For Nginx, Honghu already ships a ready-made example: to build the dashboard, you only need to import the Nginx dashboard configuration file.
Go to: Honghu -> Dashboard -> New Dashboard -> Import configuration file
3.3.5 Final results
3.3.5.1 Nginx gateway monitoring
3.3.5.2 Application Monitoring
4. Usage Experience
After walking through the whole process of installation, data import, data analysis, and data visualization, Honghu feels refreshing among domestic big data products. Compared with similar products, its advantages are:
The KISS principle: the Honghu big data system is simple in design and easy to use. From data import through analysis to visualization, the system is well integrated; in most cases, users can start analyzing data without consulting the documentation.
Powerful analysis functions: Honghu uses extended SQL to process structured, semi-structured, and unstructured data, and its scalar and table functions can enrich and transform raw data, helping users dig deep into its value.
Fit for data analysis workflows: Honghu's schema-on-read capability lets a single copy of the raw data serve the analysis needs of different business departments, each from its own business perspective.
As the community edition is released and the user base and ecosystem continue to grow, the Honghu big data analytics platform's features will surely become richer and richer. May Honghu stay true to its original aspiration and achieve its grand ambition.