Using Vector and Honghu to build an observability platform for microservice applications

1. Background

1.1 What is a microservice application

A microservice application is composed of a set of autonomous services, each responsible for a single capability, that work together to deliver complex business functions. Compared with traditional monolithic applications, microservice applications are highly distributed. The figure below shows a typical microservice application:

picture

As shown in the figure above, microservices generally have the following major characteristics:

Autonomy

Robustness (Resilience)

Transparency

Automation

Alignment

This article focuses on transparency. In a microservice application, a single request spans multiple microservices, which may be developed by different teams, and engineers need to be notified promptly when errors occur.

This requires each microservice to be transparent and observable, so that engineers can observe and diagnose problems at runtime. In practice, developing and operating such a system means collecting a large amount of data to keep the microservices healthy. This data generally includes:

(1) Business, operational, and infrastructure metrics

(2) Application logs

(3) Request traces

1.2 Observability

An observability platform rests on four pillars: metrics, tracing, logging, and visualization.

1.2.1 Metrics

Metrics are numeric data collected at regular intervals. We usually look at the minimum, maximum, average, and percentile values. Metrics typically reflect the performance of systems and applications. For example, when the resource utilization of a microservice exceeds a threshold, an alert can be triggered to help DevOps engineers diagnose the problem and take further action.
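As a rough illustration of the values mentioned above, the following Python sketch summarizes a batch of samples and checks them against a threshold. The sample values, the 95th percentile, the nearest-rank method, and the 90% threshold are all assumptions made for this example; they do not come from Honghu or Vector.

```python
import math
import statistics


def summarize(samples):
    """Return min, max, average, and 95th percentile of numeric samples."""
    ordered = sorted(samples)
    # Nearest-rank percentile: the smallest sample such that at least
    # 95% of all samples are less than or equal to it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "avg": statistics.fmean(ordered),
        "p95": ordered[rank - 1],
    }


def should_alert(summary, threshold):
    """Alert when the 95th percentile exceeds the threshold (assumed rule)."""
    return summary["p95"] > threshold


# Hypothetical CPU utilization samples (percent) for one microservice.
cpu_percent = [12.0, 15.5, 11.0, 93.0, 14.2, 13.8, 97.5, 12.9, 16.1, 15.0]
summary = summarize(cpu_percent)
alert = should_alert(summary, threshold=90.0)
```

Alerting on a percentile instead of the average is one possible choice here: two spikes barely move the average of these samples, but they show up clearly in the 95th percentile.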

1.2.2 Tracing

A trace is a sequence of related distributed events. Typically, a unique request ID is generated at the gateway layer and passed along to every service that takes part in the request, so a trace is effectively the set of log records that share a request ID. Each record carries trace and debug information, such as entry time and latency.
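The idea that a trace is just the log records sharing one request ID can be sketched in a few lines of Python. The field names (`request_id`, `service`, `ts_ms`) and the sample records are hypothetical and only serve the illustration.

```python
from collections import defaultdict


def build_traces(records):
    """Group log records by request ID and order each group by time."""
    traces = defaultdict(list)
    for record in records:
        traces[record["request_id"]].append(record)
    for spans in traces.values():
        spans.sort(key=lambda r: r["ts_ms"])
    return dict(traces)


# Hypothetical log records emitted by three services, arriving out of order.
records = [
    {"request_id": "req-1", "service": "orders", "ts_ms": 1020},
    {"request_id": "req-2", "service": "gateway", "ts_ms": 1005},
    {"request_id": "req-1", "service": "gateway", "ts_ms": 1000},
    {"request_id": "req-1", "service": "payments", "ts_ms": 1045},
]
traces = build_traces(records)
path = [r["service"] for r in traces["req-1"]]
```

Here `path` lists the services that handled `req-1` in the order they logged it, which is the minimal information a trace view needs.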

1.2.3 Logs

A log is a time-stamped event. It can contain various kinds of information, such as the confirmation of a request or the error produced by a failed request.

1.2.4 Visualization

Visualization helps you understand the collected data and gain as much insight into the system as possible. Automation improves an engineering team's operational efficiency, but many problems are still discovered by people, so dashboards that display the relevant data in real time are very useful.

In general, building an observability platform for microservice applications is a challenging task. We need a global view, assembled from independent pieces of information, to help operations staff grasp the overall situation. This article uses Honghu, a domestically developed big data analysis platform, to show step by step how to build an observability platform.

2. Introduction to Honghu

Honghu provides a series of out-of-the-box services, from data import, parsing, and storage through analysis and computation to data visualization. Its capabilities at each stage of data processing are fairly complete and powerful.

The data collection and import part connects easily to data sources such as Vector and Kafka, and can also receive data pushed through interfaces such as a standard REST API or HTTP.

The data analysis part uses the standard SQL query language, which most engineers already know, so it is easy to pick up. The platform has more than 100 built-in scalar functions and table functions, plus views, lookup tables, and so on. It also supports joins across databases and across heterogeneous data, which covers a wide range of business analysis needs.

The visualization part provides a large number of out-of-the-box chart types, rich input controls and drill-down functions, and collaborative chart editing.

It is worth mentioning that Honghu's self-developed schema-on-read modeling engine can quickly import and store heterogeneous data, and supports dynamic adjustment of data models and analysis parameters without fixing the model or the analysis process in advance. When the business analysis scenario changes, you only need to adjust the SQL statement to respond quickly.

After a preliminary evaluation, we found that Honghu largely meets our needs for building an observability platform.

3. Solution

3.1 System Architecture

picture

With Honghu's simple data collection function, the log collection system shown above is easy to build. The solution works as follows:

Each microservice is dynamically deployed on a Kubernetes node, and its output is stored on the node where it runs, where Vector's kubernetes_logs source can read it.

The Vector agent is deployed on each node as a DaemonSet.

Honghu has a built-in Vector interface; it only needs to be enabled in the configuration.

After the Vector agent parses, enriches, and transforms the logs, it continuously pushes them to Honghu.

3.2 Data access

Honghu offers a variety of data access functions. The built-in Vector and Kafka interfaces make it easy for enterprises to collect data and import it into the Honghu analysis platform to extract further value from it. Based on the collection system above, the specific steps are as follows:

3.2.1 Enable Honghu Data Collection Interface

3.2.1.1 Go to: Honghu -> Data Import -> Import from External Data Sources

picture

3.2.1.2 Configure the Vector interface and select the data set range (this article creates a data set named datalog)

picture

3.2.1.3 Select the data set and data source type, generate and download the Vector configuration template for subsequent configuration of Vector

picture

3.2.2 Configuring and installing Vector

picture

According to the design of Vector (as shown above), a data pipeline has three stages: sources (where the data comes from), transforms (how the data is processed), and sinks (where the data is delivered). The following is the configuration file we actually run (for information security, some values have been redacted). It defines the data source to collect from, how the data is processed and transformed (multi-line handling, further parsing and field extraction, and enrichment to meet Honghu's requirements), and finally the delivery of the data to Honghu. It should be straightforward to follow.

picture
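For readers who prefer code to diagrams, the three stages can be mimicked in plain Python. This is only a conceptual sketch: it is not Vector's configuration syntax or API, and the log line format, the `service` field, and the function names are assumptions made for the example.

```python
import re

# Assumption: every new event starts with a date such as "2023-07-27";
# any other line (for example a stack trace) continues the previous event.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def source(lines):
    """Stage 1, source: yield raw lines without trailing newlines."""
    for line in lines:
        yield line.rstrip("\n")


def merge_multiline(lines):
    """Stage 2a, transform: merge continuation lines into one event."""
    buffer = None
    for line in lines:
        if EVENT_START.match(line):
            if buffer is not None:
                yield buffer
            buffer = line
        elif buffer is not None:
            buffer += "\n" + line
        # Lines seen before the first event start are dropped.
    if buffer is not None:
        yield buffer


def enrich(messages, service):
    """Stage 2b, transform: extract the level and attach the service name."""
    for message in messages:
        first_line = message.split("\n", 1)[0]
        parts = first_line.split(" ", 3)  # date, time, level, rest
        level = parts[2] if len(parts) > 2 else "UNKNOWN"
        yield {"service": service, "level": level, "message": message}


def sink(events, destination):
    """Stage 3, sink: deliver events (a list stands in for Honghu here)."""
    for event in events:
        destination.append(event)


raw = [
    "2023-07-27 10:00:00 INFO started\n",
    "2023-07-27 10:00:01 ERROR request failed\n",
    "Traceback (most recent call last):\n",
    "  File \"app.py\", line 10, in handle\n",
    "2023-07-27 10:00:02 INFO recovered\n",
]
delivered = []
sink(enrich(merge_multiline(source(raw)), service="orders"), delivered)
```

After the run, `delivered` holds three events, and the stack trace lines are part of the ERROR event instead of appearing as separate records.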

With Vector's configuration file in place, it is time to install Vector. The solution in this article collects the logs of microservices running on Kubernetes, so Vector is installed with Helm. The specific commands are as follows:

picture

For more about the Vector Helm chart, see Vector Helm Chart (https://github.com/vectordotdev/helm-charts/tree/develop/charts/vector).

3.2.3 Verify data

After the data enters Honghu, as shown in the figure below, the system indexes it and stores the original data. Users can then create dashboards and analyze the data in real time.

picture

Once the data has arrived, we can confirm it with a query in Honghu.

Open Honghu -> Query -> Advanced Query and execute a simple SQL statement to see the real-time data:

picture

picture

3.3 Application scenarios

3.3.1 Statistics of all events that occur in each service

picture

3.3.2 Statistics of external requests

When computing metrics, it is often necessary to break statistics down by time period. Honghu makes this convenient: convert the timestamp to the desired granularity, then group and aggregate on the converted time.

picture
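The same "convert the time, then group" idea can be sketched in Python. This illustrates the technique only, not Honghu's SQL; the one-minute bucket size and the sample timestamps are assumptions.

```python
from collections import Counter
from datetime import datetime, timezone


def bucket_start(ts, bucket_seconds):
    """Round a timezone-aware timestamp down to the start of its bucket."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % bucket_seconds, tz=timezone.utc)


def count_per_bucket(timestamps, bucket_seconds=60):
    """Count events per time bucket."""
    return Counter(bucket_start(ts, bucket_seconds) for ts in timestamps)


# Hypothetical arrival times of three external requests.
requests = [
    datetime(2023, 7, 27, 10, 0, 5, tzinfo=timezone.utc),
    datetime(2023, 7, 27, 10, 0, 59, tzinfo=timezone.utc),
    datetime(2023, 7, 27, 10, 1, 10, tzinfo=timezone.utc),
]
per_minute = count_per_bucket(requests, bucket_seconds=60)
```

Changing `bucket_seconds` to 3600 would give hourly counts without touching the grouping logic, which mirrors how only the time conversion changes in the query.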

3.3.3 Statistics of request processing time

In a microservice application, each request spans one or more containers, so the processing time of a request has to be added up across those containers. Our application took this into account from the beginning: when a request enters the application gateway it is assigned a request_id, which is then recorded in the logs of every application that handles it. On that basis, using the window functions provided by Honghu and grouping events by request_id, the processing time of each request can be roughly calculated.

picture
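One way to approximate this outside of SQL is to take, for each request_id, the span between its first and last logged event. The Python sketch below does exactly that; treating this span as the processing time is an assumption of the example (it ignores work done before the first or after the last log line), and the `ts_ms` field name and sample events are hypothetical.

```python
def request_durations(events):
    """Return the first-to-last event span in milliseconds per request ID."""
    first_seen = {}
    last_seen = {}
    for event in events:
        rid, ts = event["request_id"], event["ts_ms"]
        first_seen[rid] = min(ts, first_seen.get(rid, ts))
        last_seen[rid] = max(ts, last_seen.get(rid, ts))
    return {rid: last_seen[rid] - first_seen[rid] for rid in first_seen}


# Hypothetical events from several containers, in arrival order.
events = [
    {"request_id": "req-1", "ts_ms": 1000},
    {"request_id": "req-2", "ts_ms": 1010},
    {"request_id": "req-1", "ts_ms": 1045},
    {"request_id": "req-1", "ts_ms": 1020},
    {"request_id": "req-2", "ts_ms": 1310},
]
durations = request_durations(events)
```

A request with a single logged event gets a duration of 0, which is a useful signal that its logs are incomplete.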

3.3.4 API Gateway log output

In a microservice system, the gateway is very important. Today's mainstream gateways are mostly built on Nginx or Envoy, and our deployment involves both types. This article only covers monitoring the Nginx gateway; monitoring an Envoy-based gateway with Honghu is left for a future article. For Nginx, Honghu already ships a ready-made example: to create the dashboard, you only need to import the Nginx dashboard configuration file.

Go to Honghu -> Dashboard -> New Dashboard -> Import configuration file

picture

3.3.5 Final result

3.3.5.1 Nginx gateway monitoring

picture

3.3.5.2 Application Monitoring

picture

4. User experience

After going through the whole process of installation, data import, data analysis, and data visualization, we found Honghu refreshing among domestically developed big data products. Compared with similar products, its advantages are:

The KISS principle: Honghu is simple in design and easy to use. Data import, analysis, and visualization fit together well, and in most cases users can start analyzing data without reading the documentation.

Powerful analysis functions: Honghu uses extended SQL to process structured, semi-structured, and unstructured data, and can use scalar functions and table functions to enrich and transform the original data, helping users dig deeper into its value.

Adapts to the needs of data analysis: Honghu provides schema-on-read modeling, so a single copy of the original data can serve the analysis needs of different business departments.

With the release of the community edition and the continued growth of its users and ecosystem, the Honghu big data analysis platform will no doubt keep gaining features. Stay true to the original aspiration and achieve the ambition of Honghu.

Origin blog.csdn.net/Yhpdata888/article/details/131962861