Monitoring Istio Ingress Gateways with Honghu Observability

1. Requirements description

In the previous article, "Using Vector and Honghu to Build an Observability Platform for Microservice Applications", I explained the basic concepts and advantages of microservices and how to use Honghu to process the logs of distributed applications. This article goes further into the problems faced by microservice architectures, the service mesh, and Honghu's particular strengths in handling Istio Gateway logs.

1.1 Challenges faced by microservice architecture

1.1.1 Cloud infrastructure is not always reliable

Whether public or private, a cloud is built from thousands of hardware and software components, so at any given moment some of them may be unavailable. Since microservices are generally deployed on the cloud, engineers building them should assume that the underlying infrastructure is ephemeral and that parts of it can fail at any time. The architecture must therefore account for the non-permanent nature of cloud infrastructure from the start.

1.1.2 Communication between services must be resilient

Because cloud infrastructure is unreliable, a microservice system must be designed with resilience in mind, so that services remain available even when some infrastructure fails. The industry generally applies the following patterns:

  • Client-side load balancing: provide multiple service endpoints and let the client decide which one to call

  • Service discovery: periodically refresh the list of healthy service endpoints

  • Circuit breaking: isolate a misbehaving service for a period of time

  • Limiting measures: cap the number of connections, threads, sessions, etc.

  • Timeouts: bound the duration of each API request

  • Retries with retry control: retry failed calls, but cap the maximum number of retries or the number of retries within a time window

  • Request validity period: if a request has already timed out, discard it rather than process it further
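As a concrete illustration, the timeout, retry, and request-validity patterns above can be sketched in a few lines; the function and parameter names below are mine, not from any particular library:

```python
import time

def call_with_retries(fn, max_retries=3, backoff=0.05, deadline=2.0):
    """Call fn(); on failure, retry with exponential backoff.

    A request that has already exceeded its overall deadline is
    discarded instead of retried (the "request validity period" idea),
    and retries are capped at max_retries (retry control).
    """
    start = time.monotonic()
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            expired = time.monotonic() - start > deadline
            if attempt == max_retries or expired:
                raise
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
```

Real systems layer circuit breaking, load balancing, and service discovery on top of this; the application libraries discussed next packaged exactly these patterns.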

1.1.3 Real-time understanding of system status

We need to know, in real time, the call relationships between services, the current load on any given service, whether failures are within expectations, and how the system behaves when a service goes down. In general, operating a microservice platform requires not just the microservices themselves but also metrics, logs, and traces to grasp the overall state of the system.

1.2 Traditional Solution: Application Libraries

In the early days, the industry generally solved the problems above with application libraries (listed below) that helped developers quickly implement these non-functional requirements. The drawback is that the application becomes bound to a specific language:

  • Hystrix: circuit breaking and rate limiting

  • Ribbon: used for client load balancing

  • Eureka: Service Registration and Discovery

  • Zuul: Dynamic Service Proxy

1.3 The Modern Solution: Service Mesh

A service mesh is a transparent, language-agnostic, distributed infrastructure layer for handling network communication between services. It consists of a data plane and a control plane, as shown in the following figure:

[figure]

After a microservice architecture adopts a service mesh, it brings further requirements for log collection and monitoring: to analyze the topology shown above, we need to collect the corresponding Istio Proxy logs and perform further correlation analysis on them. This is exactly what the Honghu platform is good at.

2. Solution

2.1 System Architecture

[figure]

Compared with the previous article, "Using Vector and Honghu to Build an Observability Platform for Microservice Applications", this solution introduces the Istio service mesh. In addition to collecting the application's own logs, we must also collect Istio Ingress Gateway and Istio Proxy logs to gain insight into the whole service mesh.

2.2 Data access

Honghu offers a variety of data access functions; the built-in Vector and Kafka interfaces greatly simplify data collection for enterprises and make it convenient to import data into the Honghu analysis platform to further mine its value. Based on the collection architecture above, the specific steps are as follows:

2.2.1 Enable Honghu Data Collection Interface

2.2.1.1 Enter: Honghu -> Data Import -> Import from External Data Sources

[figure]

2.2.1.2 Configure the Vector interface and select the data set range

[figure]

2.2.1.3 Select the data set and data source type, generate and download the Vector configuration template for subsequent configuration of Vector

[figure]

2.2.2 Configuring and installing Vector

[figure]

According to Vector's design (shown above), a data pipeline is divided into three stages: sources (where the data comes from), transforms (how it is reshaped), and sinks (where it is delivered). Below is the actual configuration file used in this solution (partially redacted for information security).

This configuration mainly covers which data sources are collected, how the data is processed and transformed (multi-line handling, further parsing and extraction, and enrichment to meet Honghu's requirements), and finally how the data is delivered to Honghu.

[figure]
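Because the real file above is redacted, here is a minimal, illustrative sketch of what such a Vector pipeline can look like; the label selector, endpoint URL, and enrichment values are assumptions, not the article's actual configuration:

```toml
# Source: tail Istio ingress gateway pod logs from Kubernetes
[sources.istio_gateway]
type = "kubernetes_logs"
extra_label_selector = "app=istio-ingressgateway"  # assumed selector

# Transform: enrich each event before shipping it
[transforms.enrich]
type = "remap"
inputs = ["istio_gateway"]
source = '''
.cluster = "demo"       # static metadata for later filtering (assumption)
.collector = "vector"
'''

# Sink: deliver JSON events over HTTP (placeholder endpoint)
[sinks.honghu]
type = "http"
inputs = ["enrich"]
uri = "https://honghu.example.com/api/ingest"
encoding.codec = "json"
```

The template generated by Honghu in step 2.2.1.3 fills in the real sink settings.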

With Vector's configuration file in place, it is time to install Vector. Since this solution collects logs from microservices running on Kubernetes, Vector is installed with Helm. The specific commands are as follows:

[figure]

If you want to learn more about the Vector Helm chart, please refer to the Vector Helm Chart repository (https://github.com/vectordotdev/helm-charts/tree/develop/charts/vector).
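For reference, an install with the official chart typically looks like the commands below; the release name, namespace, and values file name are assumptions, not the redacted commands from the figure:

```shell
# Add the official Vector Helm repository and refresh the index
helm repo add vector https://helm.vector.dev
helm repo update

# Install Vector (agent role) with the configuration generated earlier
helm install vector vector/vector \
  --namespace vector --create-namespace \
  --values vector-values.yaml  # assumed file holding role: Agent + customConfig
```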

2.3 Data processing

2.3.1 Create a data source

The Honghu system provides commonly used data source types, such as json, csv, nginx, and syslog, so users can access data out of the box. The Istio Ingress Gateway discussed in this article is itself implemented on top of Envoy, so before the next steps, let us first introduce the format of Istio Gateway logs.

According to the document Envoy Access Log (https://istio.io/latest/docs/tasks/observability/logs/access-log/), the default format is as follows:

[figure]

[figure]

As the above shows, Envoy's logs contain a wealth of information. To facilitate subsequent analysis and processing, a separate data source type needs to be created, as shown in the figure below.

[figure]

2.3.2 Field processing

Field processing is a core function of Honghu: fields are further extracted and enriched at search time (schema-on-read), so that different business departments can analyze the same data from different perspectives. Let us take the Envoy access log as an example to see how Honghu does this.

2.3.2.1 Open the advanced search window

[figure]

As the search window shows, this data set contains 266,279 events; since no fields are extracted at search time yet, the query takes only 0.9 s.

2.3.2.2 Click "Extract new field" to enter the field extraction wizard page

[figure]

2.3.2.3 Edit the regular expression to extract the required fields

Select regular expression as the extraction rule type; the regular expression used is shown in the figure below.

[figure]

[figure]
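Outside of Honghu, the same extraction idea can be prototyped with an ordinary regular expression. The sketch below parses a simplified Envoy-style access log line into named fields; the sample line and group names are illustrative, not the exact rule shown in the figure:

```python
import re

# Named-group pattern for a simplified Envoy access log line
LOG_RE = re.compile(
    r'\[(?P<start_time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<response_code>\d{3}) (?P<response_flags>\S+) '
    r'(?P<bytes_received>\d+) (?P<bytes_sent>\d+) '
    r'(?P<duration>\d+) (?P<upstream_time>\S+) '
    r'"(?P<x_forwarded_for>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'"(?P<request_id>[^"]*)" "(?P<authority>[^"]*)" '
    r'"(?P<upstream_host>[^"]*)"'
)

line = ('[2023-07-01T12:00:00.000Z] "GET /productpage HTTP/1.1" 200 - 0 5183 '
        '12 10 "10.1.2.3" "Mozilla/5.0" "req-abc" "example.com" '
        '"10.0.0.5:9080"')

# Each named group becomes a field, much like the wizard's extraction rule
fields = LOG_RE.match(line).groupdict()
```

In Honghu the equivalent named-group pattern is entered once in the wizard and then applied at search time to every event.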

Save the extraction rule through the wizard, and you can see that a search-time extraction rule has been added to the data source type.

[figure]

2.3.2.4 Support for chained extraction

The Honghu system supports chained extraction: newly extracted fields can themselves be extracted again to generate further fields. Taking the UserAgent field as an example, it can be extracted again to obtain new fields such as user_agent_os, user_agent_os_version, user_agent_name, and user_agent_version. Honghu appears to be unique in supporting this kind of repeated search-time extraction.
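To make the idea concrete outside of Honghu: a second-stage extraction over an already-extracted UserAgent field could look like the toy parser below. The regexes are deliberately simplistic assumptions; production systems use a maintained user-agent database:

```python
import re

def parse_user_agent(ua: str) -> dict:
    """Toy second-stage extraction: derive browser and OS fields
    from a UserAgent string that was itself extracted earlier."""
    fields = {}
    m = re.search(r"(?P<user_agent_name>Firefox|Chrome|Safari)/"
                  r"(?P<user_agent_version>[\d.]+)", ua)
    if m:
        fields.update(m.groupdict())
    m = re.search(r"\((?P<user_agent_os>Windows NT|Mac OS X|Android|Linux)\s*"
                  r"(?P<user_agent_os_version>[\d._]*)", ua)
    if m:
        fields.update(m.groupdict())
    return fields

ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36")
# "Chrome/114..." appears before "Safari/537...", so the first regex picks Chrome.
```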

[figure]

2.3.2.5 Verify the search-time extraction rules

Opening advanced search again, we can see that on the same data set the search time has increased somewhat, but the expected fields have been extracted successfully and can now be analyzed further and built into the required dashboards.

[figure]

2.4 Data display

2.4.1 Query Acceleration

In the data processing section we described how to use Honghu's schema-on-read mechanism to extract fields dynamically. During data display, the same data source is likely to be searched repeatedly, which can degrade page rendering. The Honghu system takes this into account and provides several mechanisms for accelerating searches; this article describes the pre-stored query function in more detail.

Create stored queries

Go to advanced query --> Query --> click "Save As" --> select pre-stored query (note the red box)

[figure]

Search from stored queries

In advanced query, use the saved_search table function to query from the stored query.

Judging from the results, using stored queries improves performance by a factor of 4 to 5.

[figure]

2.4.2 Display effect

[figure]

[figure]

[figure]

3. Honghu Value

The Honghu platform provides a complete solution covering data import, secondary processing, and rapid dashboard creation, so users can quickly build a monitoring platform. The core functions discussed in this article are highly encapsulated and abstracted around the daily work of data engineers (data collection, data exploration, and data display), lowering the barrier for users.

3.1 Technical advantages

Custom data source types: the big data field involves all kinds of semi-structured and unstructured data, each requiring different processing logic. Honghu lets users conveniently define their own data source types, with flexible ingest-time and search-time extraction rules for each type.

Schema-on-read: it satisfies the need to view the same data source from different perspectives and greatly reduces the complexity of the collection side. Functionally, Honghu supports extraction rules of several kinds, from simple JSON and key-value rules to complex regular expression rules.

Query acceleration: during data display, many users are likely to query the same data at the same time, which would otherwise sacrifice performance. Honghu provides a series of query acceleration mechanisms to keep pages responsive.

3.2 Application value

Accurately locating problems and improving development efficiency: for example, during this work we analyzed browser types and found that one browser type's name was wrong. After we reported it, the developers used Honghu's logs to quickly locate the problem and promptly fixed the client type name error.

By monitoring the Envoy gateway in real time, the DevOps team can accurately track inbound and outbound traffic metrics and access-error information.

4. Suggested improvements

This article exercised several of Honghu's advanced core functions, which comfortably met the business requirements. From a personal point of view, the following areas could be improved for better user friendliness:

  • Regex-based field extraction workflow: automatic field extraction and manual input could be combined on a single page, so users would not need to switch between operation pages.

  • Regex performance for data extraction: the Honghu community provides documentation on efficient regular expression matching to guide users toward efficient data processing and away from regex performance pitfalls. We hope for more: ideally, a future version of Honghu could analyze regular expressions itself and help users correct low-performance expressions, further improving user friendliness.

  • More best-practice cases for query acceleration: materialized views and pre-stored queries are two advanced query acceleration features provided by the Honghu system. When first using the system it is hard to decide which one to use; hopefully, as the community develops, more and more practical cases will guide users toward best practices here.

Origin blog.csdn.net/Yhpdata888/article/details/132068942