Spring Cloud Link Tracking with Sleuth + Zipkin: Study Notes

Learning objectives

insert image description here
With the popularity of microservice architecture, services are split along different dimensions, and a single request often involves multiple services. Internet applications are built from many software modules, possibly developed by different teams, possibly implemented in different programming languages, and possibly deployed on thousands of servers across multiple data centers. We therefore need tools that help us understand system behavior and analyze performance problems, so that when a fault occurs it can be located and resolved quickly. In a complex microservice architecture, almost every front-end request forms a complex distributed service call link.
A complete call chain of a request may be as shown in the figure below:
insert image description here

These notes follow the video limit_2021 Link Tracking Teaching - SpringCloud Family Bucket Sleuth+Zipkin Tutorial - Intensive Lectures on Core Knowledge Points. Please support the original video; this article is a distillation of the notes taken while studying it.

1. What is link tracking

The term "link tracking" dates back to 2010, when Google published the Dapper paper, Dapper, a Large-Scale Distributed Systems Tracing Infrastructure, which describes the implementation principles of Google's in-house distributed tracing system and how they kept it transparent to applications at low cost.
The paper (Chinese translation)
Put simply, link tracking means that from the start of a task to its end, every system called along the way and the time spent in each (the time span) are fully recorded.
Dapper actually started as a standalone call-link tracing system and gradually evolved into a monitoring platform, on top of which many tools were built, such as real-time alerting, overload protection, and metric queries.
Besides Google's Dapper there are other well-known products, such as Alibaba's Eagle Eye, Dianping's CAT, Twitter's Zipkin, Pinpoint from Naver (the parent company of the messaging app LINE), and the open-source SkyWalking (contributed to Apache), among others.

2. What is Sleuth

Spring Cloud Sleuth implements a distributed tracing solution for Spring Cloud. It is compatible with Zipkin and HTrace, and with log-based systems such as ELK (Elasticsearch, Logstash, Kibana).
Spring Cloud Sleuth provides the following features:

  • Link tracing: with Sleuth you can clearly see which services a request passed through and easily sort out the calling relationships between services.
  • Performance analysis: with Sleuth you can easily see how long each sampled request took and which service calls are expensive; when call times grow with request volume, this can serve as a hint that a service needs to be scaled out.
  • Data analysis and link optimization: for services that are called frequently, or called in parallel, targeted business optimizations can be made.
  • Error visualization: exceptions not caught by the program can be inspected in Zipkin.

3. Terminology

1. Span

A span is the basic unit of work; a single call can be regarded as a span. Dapper records each span's name, its ID, and its parent ID so that the relationships between spans within one trace can be reconstructed. In the figure, each rectangular box is a span; for the front end, the span runs from sending the request to receiving the reply.

insert image description here

The initial span that starts a trace is called the root span; its span ID is equal to the trace ID.

The figure above is rather abstract, so here is an easier-to-understand version:
insert image description here

Dapper records span names, along with each span's ID and parent span ID, to reconstruct the relationships between spans within a trace. A span with no parent ID is called a root span. All spans belonging to a specific trace share the same trace ID.
The figure shows the detailed structure of the Help.Call span when expanded.
insert image description here

2. Trace

A trace is a tree structure made up of spans: one trace is regarded as a complete call link and contains many spans. A trace has a one-to-many relationship with its spans, and spans have parent-child relationships with one another.
For example: the client invokes service A, service B, service C, and service F, and each service call, such as the call to C, is a span. If a thread in service C calls D, then D's span is a child of C's; if a thread in service D then calls E, E's span is a child of D's, and the chain C -> D -> E belongs to one trace. Once the tracing system is in place and the link data is collected, front-end analysis and rendering tools can achieve the effect shown in the figure below:
insert image description here
The visual interface of Zipkin, which we will study later, looks like this:
insert image description here

3. Annotation

An annotation records that an event occurred at a point in time; a few core annotations are used to mark the start and end of a request.

  • cs - Client Sent: the client initiates a request; this annotation marks the start of the span.
  • sr - Server Received: the server receives the request and is about to process it; sr minus the cs timestamp gives the network delay.
  • ss - Server Sent: processing is complete and the response is returned to the client; ss minus the sr timestamp gives the time the server needed to process the request.
  • cr - Client Received: the client receives the server's reply, marking the end of the span; cr minus the cs timestamp gives the total time the client spent waiting for the reply.

Formula:
Network delay = sr - cs
The time required for the server to process a request = ss - sr
All the time required for the client to obtain a reply from the server = cr - cs
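
As a quick worked example with made-up timestamps (all in milliseconds): suppose cs = 1,000, sr = 1,030, ss = 1,130, and cr = 1,170. Then:
Network delay = sr - cs = 1,030 - 1,000 = 30 ms
Time required for the server to process the request = ss - sr = 1,130 - 1,030 = 100 ms
Total time for the client to obtain the reply = cr - cs = 1,170 - 1,000 = 170 ms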

4. Implementation principle

If you want to know which link of an interface has a problem, you must first know which services the interface calls and in what order. Stringing these services together looks like a chain, which we call a call chain.

insert image description here
To build the call chain, each call needs an identifier; the services can then be ordered by this identifier so that the calling sequence is clear. For the time being, let's call this identifier the spanid.

insert image description here
In practice we need to see what happened for one particular request, so a spanid alone is not enough: each request must also get a unique identifier, so that all the services called by that request can be found from it. We call this identifier the traceid.

insert image description here
With the spanid we can easily tell the order in which services were called, but not the hierarchy of the calls. As shown in the figure below, several services may be called one after another along a chain, or they may be called in parallel by the same service.

insert image description here
So each call should also record who made it; we call this identifier the parentid.
insert image description here

By now we know the call order and the hierarchy, but when an interface misbehaves we still cannot find the problematic link. A service with a problem usually shows up as a slow call, and to measure that cost the three identifiers above are not enough: we also need timestamps, preferably fine-grained ones, down to the microsecond.
Record the timestamp at which the request was initiated.
insert image description here
Recording only the timestamp at which the call was initiated does not give the elapsed time; the timestamp at which the service returns must be recorded as well, because only the difference between start and end gives the cost. Since the return is also recorded, it must carry the same three identifiers, otherwise we cannot tell which call a timestamp belongs to.

insert image description here

Although the total time from invoking a service to its return can now be calculated, that total includes both service execution time and network delay. Sometimes we need to separate the two for targeted optimization. So how do we calculate the network delay? The call-and-return process can be divided into the following four events.

  • Client Sent, abbreviated cs: the client sends a call request to the server.
  • Server Received, abbreviated sr: the server has received the client's call request.
  • Server Sent, abbreviated ss: the server has finished processing and is about to return information to the client.
  • Client Received, abbreviated cr: the client has received the information returned by the server.

insert image description here

If the timestamps of these four events are recorded, the various costs are easy to calculate, for example:

sr minus cs is the network delay when calling
ss minus sr is the service execution time
cr minus ss is the service response delay
cr minus cs is the execution time of the entire service call

insert image description here
In fact, besides these timestamps, other information can be recorded in a span as well, such as the name of the calling service, the name of the called service, the return result, and the IP address. Finally, spans that share the same parentid are assembled into one larger span block, and the complete call chain is done.

The prototype diagrams above are by Zhang Yinuo.
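
To make the bookkeeping above concrete, here is a minimal, hypothetical sketch of the kind of record a tracing system keeps per span. The class and field names are illustrative only; they are not Sleuth's, Brave's, or Zipkin's actual model classes.

// Hypothetical illustration of the data recorded per span; names are
// illustrative and do not match Sleuth/Brave's or Zipkin's real classes.
public class SpanRecord {
    String traceId;     // shared by every span of one request
    String spanId;      // unique id of this unit of work
    String parentId;    // spanId of the caller; null for the root span
    String serviceName; // which service produced this span
    long cs;            // Client Sent timestamp (microseconds)
    long sr;            // Server Received timestamp
    long ss;            // Server Sent timestamp
    long cr;            // Client Received timestamp

    long networkDelay()     { return sr - cs; } // request network delay
    long serverProcessing() { return ss - sr; } // server execution time
    long totalClientTime()  { return cr - cs; } // whole call as seen by the client
}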

5. Environmental preparation

  • eureka-server: registration center
  • eureka-server02: registration center
  • gateway-server: Spring Cloud Gateway service gateway
  • product-service: product service, which exposes http://localhost:7870/product/{id} to query a product by primary key and http://localhost:7870/product/listByIds to query products by several primary keys
  • order-service: order service, which exposes http://localhost:9090/order/{id} to query an order by primary key; the order service calls the product service (a rough sketch of this call follows the list).
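
For orientation only, the order-to-product call looks roughly like the sketch below. The class name, the RestTemplate usage, and the returned fields are assumptions about the courseware project, not its actual code.

import java.util.HashMap;
import java.util.Map;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

// Hypothetical sketch of how order-service might call product-service;
// the courseware code will differ in names and details.
@RestController
@RequestMapping("/order")
public class OrderController {

    @Autowired
    private RestTemplate restTemplate; // assumed to be a @LoadBalanced RestTemplate

    @GetMapping("/{id}")
    public Map<String, Object> getOrder(@PathVariable Long id) {
        // One incoming request flows gateway -> order-service -> product-service,
        // so Sleuth records all three hops under the same trace id.
        Object products = restTemplate.getForObject(
                "http://product-service/product/listByIds?id=1&id=2", Object.class);
        Map<String, Object> order = new HashMap<>();
        order.put("orderId", id);
        order.put("products", products);
        return order;
    }
}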

Courseware information:
Baidu Cloud-Courseware
Extraction code: c8cp
insert image description here
Open in IDEA
insert image description here
and import the project as follows:
insert image description here
Then start in order:
eureka-server
eureka-server2
gateway
order-service
product-service

Add the following configuration when starting eureka-server and eureka-server2:

    register-with-eureka: false # whether to register this server itself with Eureka
    fetch-registry: false       # whether to fetch the registry information from Eureka

insert image description here
Note: if @SpringBootApplication is marked in red in a subproject
insert image description here
the solution is to right-click the project -> Maven -> Reload Project
insert image description here
To verify that Eureka started successfully, visit:

http://localhost:8761/

insert image description here

To verify that product-service started successfully, visit:

http://localhost:7070/product/listByIds?id=1&id=2

insert image description here
To verify that order-service started successfully, visit:

http://localhost:9090/order/1

insert image description here
To verify that the gateway started successfully, visit:

http://localhost:9000/order-service/order/1

insert image description here
Then visit:

http://localhost:9000/product-service/product/listByIds?id=1&id=2

insert image description here
Everything works as expected.

6. Introductory case

Add spring-cloud-starter-sleuth dependencies to projects that require link tracking (service gateway, product service, order service).

6.1. Add dependencies

<!-- Spring Cloud Sleuth dependency -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
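
With the starter on the classpath, no tracing-specific code is needed: Sleuth puts the trace and span ids into the logging MDC, which is exactly what the %X{X-B3-TraceId:-} and %X{X-B3-SpanId:-} placeholders in the log pattern of the next subsection pick up. A minimal sketch (the controller and log message are purely illustrative):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// Illustrative only: any existing controller behaves the same way.
@RestController
public class DemoController {

    private static final Logger log = LoggerFactory.getLogger(DemoController.class);

    @GetMapping("/demo/{id}")
    public String demo(@PathVariable Long id) {
        // Sleuth adds X-B3-TraceId / X-B3-SpanId to the MDC, so once the
        // logback pattern below is applied this line carries both ids.
        log.info("handling demo request, id={}", id);
        return "ok";
    }
}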

6.2 Logging

Add a logback.xml log file to each project that needs link tracking (service gateway, product service, order service); the content is shown below (the logback output level must be DEBUG). Note that the project name in

< property name="log.path" value="${catalina.base}/gateway-server/logs"/ >

must be changed to match each project.

Log core configuration:

%d{yyyy-MM-dd HH:mm:ss.SSS} [${applicationName},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-}] [%thread] %-5level %logger{50} - %msg%n

Note: the logback file provided in the courseware has encoding problems; just copy the version given here.
logback.xml

<?xml version="1.0" encoding="UTF-8"?>
<!-- Log levels from low to high: TRACE < DEBUG < INFO < WARN < ERROR < FATAL.
If set to WARN, messages below WARN are not output. -->
<!-- scan: when true, the configuration file is reloaded if it changes; default is true -->
<!-- scanPeriod: interval for checking whether the configuration file has changed; if no time unit
is given, the default unit is milliseconds. Effective only when scan is true. Default is 1 minute. -->
<!-- debug: when true, logback's internal log messages are printed so its status can be watched
in real time. Default is false. -->
<configuration scan="true" scanPeriod="10 seconds">
    <!-- logging context name -->
    <contextName>my_logback</contextName>
    <!-- name is the variable's name and value its value; the value is inserted into the logger
context, after which the variable can be referenced with "${}". -->
    <property name="log.path" value="${catalina.base}/gateway-server/logs"/>
    <!-- load values from the Spring configuration -->
    <springProperty scope="context" name="applicationName"
                    source="spring.application.name" defaultValue="localhost"/>
    <!-- log output pattern -->
    <property name="LOG_PATTERN" value="%d{yyyy-MM-dd HH:mm:ss.SSS} [${applicationName},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-}] [%thread] %-5level %logger{50} - %msg%n"/>
    <!-- console output -->
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <!-- this appender is for development; only the lowest level is set, and the console
prints log messages at or above this level -->
        <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
            <level>DEBUG</level>
        </filter>
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
            <!-- character set -->
            <charset>UTF-8</charset>
        </encoder>
    </appender>
    <!-- file output -->
    <!-- time-based rolling output for DEBUG-level logs -->
    <appender name="DEBUG_FILE"
              class="ch.qos.logback.core.rolling.RollingFileAppender">
        <!-- path and name of the active log file -->
        <file>${log.path}/log_debug.log</file>
        <!-- log file output pattern -->
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
            <charset>UTF-8</charset> <!-- character set -->
        </encoder>
        <!-- rolling policy: roll by date and by size -->
        <rollingPolicy
                class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <!-- log archiving -->
            <fileNamePattern>${log.path}/debug/log-debug-%d{yyyy-MM-dd}.%i.log</fileNamePattern>
            <timeBasedFileNamingAndTriggeringPolicy
                    class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                <maxFileSize>100MB</maxFileSize>
            </timeBasedFileNamingAndTriggeringPolicy>
            <!-- days to keep log files -->
            <maxHistory>15</maxHistory>
        </rollingPolicy>
        <!-- this file records DEBUG level only -->
        <filter class="ch.qos.logback.classic.filter.LevelFilter">
            <level>DEBUG</level>
            <onMatch>ACCEPT</onMatch>
            <onMismatch>DENY</onMismatch>
        </filter>
    </appender>
    <!-- time-based rolling output for INFO-level logs -->
    <appender name="INFO_FILE"
              class="ch.qos.logback.core.rolling.RollingFileAppender">
        <!-- path and name of the active log file -->
        <file>${log.path}/log_info.log</file>
        <!-- log file output pattern -->
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
            <charset>UTF-8</charset>
        </encoder>
        <!-- rolling policy: roll by date and by size -->
        <rollingPolicy
                class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <!-- daily archive path and name pattern -->
            <fileNamePattern>${log.path}/info/log-info-%d{yyyy-MM-dd}.%i.log</fileNamePattern>
            <timeBasedFileNamingAndTriggeringPolicy
                    class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                <maxFileSize>100MB</maxFileSize>
            </timeBasedFileNamingAndTriggeringPolicy>
            <!-- days to keep log files -->
            <maxHistory>15</maxHistory>
        </rollingPolicy>
        <!-- this file records INFO level only -->
        <filter class="ch.qos.logback.classic.filter.LevelFilter">
            <level>INFO</level>
            <onMatch>ACCEPT</onMatch>
            <onMismatch>DENY</onMismatch>
        </filter>
    </appender>
    <!-- time-based rolling output for WARN-level logs -->
    <appender name="WARN_FILE"
              class="ch.qos.logback.core.rolling.RollingFileAppender">
        <!-- path and name of the active log file -->
        <file>${log.path}/log_warn.log</file>
        <!-- log file output pattern -->
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
            <charset>UTF-8</charset> <!-- character set -->
        </encoder>
        <!-- rolling policy: roll by date and by size -->
        <rollingPolicy
                class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <fileNamePattern>${log.path}/warn/log-warn-%d{yyyy-MM-dd}.%i.log</fileNamePattern>
            <!-- each log file at most 100MB -->
            <timeBasedFileNamingAndTriggeringPolicy
                    class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                <maxFileSize>100MB</maxFileSize>
            </timeBasedFileNamingAndTriggeringPolicy>
            <!-- days to keep log files -->
            <maxHistory>15</maxHistory>
        </rollingPolicy>
        <!-- this file records WARN level only -->
        <filter class="ch.qos.logback.classic.filter.LevelFilter">
            <level>WARN</level>
            <onMatch>ACCEPT</onMatch>
            <onMismatch>DENY</onMismatch>
        </filter>
    </appender>
    <!-- time-based rolling output for ERROR-level logs -->
    <appender name="ERROR_FILE"
              class="ch.qos.logback.core.rolling.RollingFileAppender">
        <!-- path and name of the active log file -->
        <file>${log.path}/log_error.log</file>
        <!-- log file output pattern -->
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
            <charset>UTF-8</charset> <!-- character set -->
        </encoder>
        <!-- rolling policy: roll by date and by size -->
        <rollingPolicy
                class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <fileNamePattern>${log.path}/error/log-error-%d{yyyy-MM-dd}.%i.log</fileNamePattern>
            <timeBasedFileNamingAndTriggeringPolicy
                    class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                <maxFileSize>100MB</maxFileSize>
            </timeBasedFileNamingAndTriggeringPolicy>
            <!-- days to keep log files -->
            <maxHistory>15</maxHistory>
            <!-- total log size capped at 10 GB -->
            <totalSizeCap>10GB</totalSizeCap>
        </rollingPolicy>
        <!-- this file records ERROR level only -->
        <filter class="ch.qos.logback.classic.filter.LevelFilter">
            <level>ERROR</level>
            <onMatch>ACCEPT</onMatch>
            <onMismatch>DENY</onMismatch>
        </filter>
    </appender>
    <!-- For loggers whose name starts with com.example.logback, set the level to warn and
output only to the console -->
    <!-- A logger without its own appenders inherits the appenders defined on the root node -->
    <!-- <logger name="com.example.logback" level="warn"/> -->
    <!-- This logger can be obtained via LoggerFactory.getLogger("myLog") -->
    <!-- Because it automatically inherits root's appenders, and root already has a console
appender, adding another console appender here -->
    <!-- would print each log line to the console twice unless additivity="false" is set -->
    <!-- additivity controls whether the appenders configured on the root logger are also used -->
    <logger name="myLog" level="INFO" additivity="false">
        <appender-ref ref="CONSOLE"/>
    </logger>
    <!-- log output level and appenders -->
    <root level="DEBUG">
        <appender-ref ref="CONSOLE"/>
        <appender-ref ref="DEBUG_FILE"/>
        <appender-ref ref="INFO_FILE"/>
        <appender-ref ref="WARN_FILE"/>
        <appender-ref ref="ERROR_FILE"/>
    </root>
</configuration>

Note again that the log path must be kept consistent with each project's name.
insert image description here
insert image description here

insert image description here

6.3 Access

Before testing, restart the projects to which the log configuration was added.
insert image description here
After restarting, the log files are generated:
insert image description here

Visit:

http://localhost:9000/order-service/order/1

The results are as follows.
The service gateway prints:
insert image description here

[gateway-server,f82de2d68493f0cb,f82de2d68493f0cb]

The product service prints:
insert image description here

[product-service,f82de2d68493f0cb,93fc5e739328a597]

The order service prints:

[order-service,f82de2d68493f0cb,a94f977efaf08f05]

insert image description here
From the printed information we can see that
the traceId of the entire link is f82de2d68493f0cb (the gateway's root span uses the same value as its spanId), and
the spanIds of the downstream services are 93fc5e739328a597 and a94f977efaf08f05.
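
If the ids are needed programmatically (for example, to return the trace id to a caller for troubleshooting), the underlying Brave tracer can be injected. A small sketch, assuming Sleuth's default Brave-based setup; the endpoint itself is made up for illustration:

import brave.Tracer;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Sketch only: expose the current request's trace id, assuming Brave-based Sleuth.
@RestController
public class TraceIdController {

    private final Tracer tracer;

    public TraceIdController(Tracer tracer) {
        this.tracer = tracer;
    }

    @GetMapping("/trace-id")
    public String currentTraceId() {
        // Guard against requests that are not being traced.
        return tracer.currentSpan() == null
                ? "no active span"
                : tracer.currentSpan().context().traceIdString();
    }
}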

Viewing log files is not a good method. When there are more and more microservices, there will be more and more log files, and the query work will become more and more troublesome. Spring officially recommends using Zipkin for link tracking. Zipkin can aggregate logs and perform visual display and full-text search.

7. Use Zipkin for link tracking

7.1. What is Zipkin

insert image description here
ZIPKIN official website
Zipkin is an open-source distributed real-time data tracking system developed and contributed by Twitter, designed on the basis of the Google Dapper paper. Its main function is to gather real-time monitoring data from various heterogeneous systems.

It collects the tracing data of request links on each server and lets us query that data through a REST API, giving real-time monitoring of a distributed system so that growing latencies are noticed in time and the root of performance bottlenecks can be found. Besides the developer-facing API it also provides a convenient UI: each service reports timing data to Zipkin, Zipkin builds a dependency graph from the call relationships, and this helps us search tracking information intuitively and analyze the details of request links. Zipkin offers pluggable storage: in-memory, MySQL, Cassandra, and Elasticsearch.

There are other more mature implementations of distributed tracking systems, such as Naver's PinPoint, Apache's HTrace, Ali's Hawkeye Tracing, JD's Hydra, Sina's Watchman, Meituan Dianping's CAT, Apache's SkyWalking, etc.
insert image description here

7.2 Working principle

insert image description here
There are four components that make up Zipkin:

  • Collector: the collector component receives tracing information sent by external systems and converts it into Zipkin's internal Span format for later storage, analysis, and display.
  • Storage: the storage component handles the tracing information received by the collector; by default it keeps the data in memory, and the storage strategy can be changed to other backends such as MySQL or Elasticsearch.
  • Web UI: the UI component, an application built on top of the API component, provides the web pages that display call chains and system dependencies in Zipkin.
  • RESTful API: the API component provides the interface the web UI uses to query data in storage (a small query sketch follows this list).
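
As an illustration of that API, recent traces can be fetched over plain HTTP. The sketch below uses java.net.HttpURLConnection against Zipkin's v2 endpoint and assumes the server is running locally on the default port 9411:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Sketch: query the latest traces of order-service through Zipkin's v2 REST API.
public class ZipkinApiDemo {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://localhost:9411/api/v2/traces?serviceName=order-service&limit=10");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // a JSON array of traces, each trace a list of spans
            }
        }
    }
}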

Zipkin has two sides: the Zipkin server and the Zipkin client, the client being the microservice applications themselves. Each client is configured with the server's URL; whenever one service calls another, Sleuth's instrumentation inside the microservice picks it up, generates the corresponding trace and span information, and sends it to the server. Transmission can happen in two ways: over a message bus such as RabbitMQ, or directly as HTTP messages.

7.3 Server Deployment

The server is a standalone executable jar; the official download address is: ZIPKIN download address. Start it with java -jar zipkin.jar; the default port is 9411. The jar used here is zipkin-server-2.20.1-exec.jar.
The jar is included in the courseware:
insert image description here

The startup command is as follows:

java -jar zipkin-server-2.20.1-exec.jar

insert image description here
After startup it looks like this:
insert image description here

Visit:

http://localhost:9411/ 

The results are as follows.
The latest version of the UI:
insert image description here
The previous version of the UI:
insert image description here

7.4 Client Deployment

(1). Add dependencies

Add the spring-cloud-starter-zipkin dependency to each project that requires link tracking (service gateway, product service, order service).

<!-- Spring Cloud Zipkin dependency -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>

(2). Configuration file

Configure the Zipkin server address and data transmission method in projects that require link tracking (service gateway, product service, order service). The default configuration is as follows.

spring:
  zipkin:
    base-url: http://localhost:9411/ # Zipkin server address
    sender:
      type: web # transport type; web means data is sent to the server as HTTP messages
  sleuth:
    sampler:
      probability: 1.0 # percentage of requests to sample; default 0.1 (10%)

Here type: web means data is sent as HTTP messages; we will switch to another transport later.
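
The sampling rate can also be set in code instead of YAML. The sketch below is an alternative added here for illustration; it relies on Brave's Sampler, which Sleuth 2.x uses underneath, and forces every request to be sampled, equivalent to probability: 1.0 (fine for a demo, too aggressive for production):

import brave.sampler.Sampler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Sketch: Java alternative to `spring.sleuth.sampler.probability: 1.0`.
@Configuration
public class SamplerConfig {

    @Bean
    public Sampler defaultSampler() {
        // Sample every request; keep the rate much lower in production.
        return Sampler.ALWAYS_SAMPLE;
    }
}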
insert image description here
Restart the project after making changes.
The current call link is as follows:
client -> gateway -> order-service -> product-service

(3). Access

access:

http://localhost:9000/order-service/order/1 

The result is as follows:
insert image description here

The steps in the new version of the UI are as follows. Visit:

http://localhost:9411/ 

Filter by time, then click the search button:
insert image description here
After clicking, the results are as follows:

insert image description here
If we hit the order or product endpoints a few more times, Zipkin generates a corresponding number of entries.
insert image description here
Click the corresponding tracking information to view the details of the request link.
insert image description here
View the timestamp
insert image description here
The trace data can also be downloaded as JSON.
insert image description here

Conditional queries are also supported:
insert image description here

Through the dependencies view you can see the dependencies between the services in the link. There is also an animated display of the calls (a small dot travels between services), which is quite intuitive.
insert image description here

The steps in the old version of the UI are as follows.
Visit:

http://localhost:9411/ 

Click the search button; the results are as follows:
insert image description here
Click on the corresponding tracking information to view the details of the request link.
insert image description here
Through dependencies, you can view the dependencies of services in the link.
insert image description here

At the moment, all trace data disappears as soon as the Zipkin server restarts, so the next step is to give the tracing data persistent storage.

7.5 Storing trace data

Zipkin Server stores tracking data in memory by default. This method is not suitable for production environments. Once the server is shut down and restarted or the service crashes, the historical data will disappear. Zipkin supports modifying the storage strategy to use other storage components, such as MySQL, Elasticsearch, etc.

7.5.1 MySQL

(1). Database script

Open the MySQL database, create the zipkin library, and execute the following SQL script.
Official website address: official website address

The script is included in the courseware; just run it in MySQL.
insert image description here
First create the database:

CREATE DATABASE zipkin;
USE zipkin;

After executing the script, three tables appear (the official schema creates zipkin_spans, zipkin_annotations, and zipkin_dependencies):
insert image description here

(2). Deploy Zipkin server

This time the server cannot be started with the bare jar command as before; startup parameters must be added and the server redeployed:
Official website address: Zipkin server for MySQL

java -jar zipkin-server-2.20.1-exec.jar --STORAGE_TYPE=mysql --MYSQL_HOST=localhost --MYSQL_TCP_PORT=3306 --MYSQL_USER=root --MYSQL_PASS=root --MYSQL_DB=zipkin

insert image description here

(3). Test

Visit:

http://localhost:9000/order-service/order/1 

View the database results as follows:
insert image description here

In MySQL mode, every time the server is started, the server will load the link information from the database and display it to the web interface.
Stop Zipkin with Ctrl + C; if you try to reach the Zipkin server now, it is unreachable. Is the earlier data lost? Restart the Zipkin server and see.
insert image description here
Visit:

http://localhost:9411/zipkin/

The earlier trace data was not lost, because it is saved in MySQL.
insert image description here

7.5.2 RabbitMQ

In the previous course, we have learned the detailed use of RabbitMQ, so we won't go into details here, just start and use it directly.

The current link mode is:
client

(1). Start the RabbitMQ server

Start the virtual machine and start the RabbitMQ server with the following command.

systemctl start docker
docker start mq

Visit:

ip:15672

In the RabbitMQ management UI there is no queue yet.
insert image description here

(2). Deploy Zipkin server

Add startup parameters and redeploy the server:
official website address: official website deployment-mq

java -jar zipkin-server-2.20.1-exec.jar --STORAGE_TYPE=mysql --MYSQL_HOST=localhost --MYSQL_TCP_PORT=3306 --MYSQL_USER=root --MYSQL_PASS=root --MYSQL_DB=zipkin --RABBIT_ADDRESSES=192.168.10.101:5672 --RABBIT_USER=guest --RABBIT_PASSWORD=guest --RABBIT_VIRTUAL_HOST=/ --RABBIT_QUEUE=zipkin

The startup parameters now include both the MySQL and the RabbitMQ configuration: link information arrives via MQ and is stored in MySQL, as shown in the figure below:

insert image description here

(3). Check the queue

Visit: http://192.168.10.101:15672/#/queues You can see that the zipkin queue has been created.
insert image description here
Here, to watch the messages being consumed, we stop the Zipkin server and clear the tables of the zipkin database in MySQL.
insert image description here
insert image description here

(4). Clients add dependencies

Official website documentation: Official documents
Add the dependency to the gateway, the order service, and the product service:

<!-- common message queue dependency -->
<dependency>
    <groupId>org.springframework.amqp</groupId>
    <artifactId>spring-rabbit</artifactId>
</dependency>

(5). Client configuration file

application.yml

spring:
  application:
    name: gateway-server # application name
  cloud:
    gateway:
      discovery:
        locator:
          # integrate with service discovery and forward to concrete instances by serviceId
          enabled: true                  # enable routing rules based on service discovery
          lower-case-service-id: true    # convert service names to lower case
  zipkin:
    base-url: http://localhost:9411/ # Zipkin server address
    sender:
      # type: web # transport type; web means data is sent to the server as HTTP messages
      type: rabbit # transport type; rabbit means data is sent to the server via RabbitMQ
  sleuth:
    sampler:
      probability: 1.0 # percentage of requests to sample; default 0.1 (10%)
  rabbitmq:
    host: 192.168.10.101
    port: 5672
    username: guest
    password: guest
    virtual-host: /
    listener:
      direct:
        retry:
          enabled: true
          max-attempts: 5
          initial-interval: 5000
      simple:
        retry:
          enabled: true
          max-attempts: 5
          initial-interval: 5000

insert image description here
Restart the service after configuration

(6). Test

Close the Zipkin server first, visit:

http://localhost:9000/order-service/order/1 

The clients keep writing link tracking data into the queue:

While the Zipkin server is down, the messages pile up in the queue.
insert image description here
Now we start the Zipkin server again.

java -jar zipkin-server-2.20.1-exec.jar --STORAGE_TYPE=mysql --MYSQL_HOST=localhost --MYSQL_TCP_PORT=3306 --MYSQL_USER=root --MYSQL_PASS=root --MYSQL_DB=zipkin --RABBIT_ADDRESSES=192.168.10.101:5672 --RABBIT_USER=guest --RABBIT_PASSWORD=guest --RABBIT_VIRTUAL_HOST=/ --RABBIT_QUEUE=zipkin

Messages in the queue are consumed after startup
insert image description here

Link tracking data is stored in MySQL.
insert image description here

7.5.3 Elasticsearch

We have learned the detailed use of Elasticsearch in the previous course, so we won’t go into details here, just start and use it directly.

(1). Start the Elasticsearch cluster

Start the cluster, then visit:

http://192.168.10.101:9200/ 

The result is as follows:
insert image description here

Start the elasticsearch-head plugin and visit: http://192.168.10.101:9100/ The results are as follows:
insert image description here

(2). Deploy Zipkin server

Add startup parameters and redeploy the server:
Official website address: Zipkin server

java -jar zipkin-server-2.20.1-exec.jar --STORAGE_TYPE=elasticsearch --ES_HOSTS=http://192.168.10.101:9200/,http://192.168.10.102:9200/,http://192.168.10.103:9200/ --RABBIT_ADDRESSES=192.168.10.101:5672 --RABBIT_USER=guest --RABBIT_PASSWORD=guest --RABBIT_QUEUE=zipkin

The startup parameters include both the Elasticsearch and the RabbitMQ configuration: link information arrives via MQ and is stored in Elasticsearch.

(3). View the index library

Visit: http://192.168.10.101:9100 You can see that the zipkin index library has been created.
insert image description here

(4). Clients add dependencies

Official website documents: official documents

<!-- common message queue dependency -->
<dependency>
    <groupId>org.springframework.amqp</groupId>
    <artifactId>spring-rabbit</artifactId>
</dependency>

(5). Client configuration file

spring:
  zipkin:
    base-url: http://localhost:9411/ # Zipkin server address
    sender:
      type: rabbit
    rabbitmq:
      queue: zipkin                  # queue name
  rabbitmq:
    host: 192.168.10.101             # server IP
    port: 5672                       # server port
    username: guest                  # username
    password: guest                  # password
    virtual-host: /                  # virtual host
    listener:
      direct:
        retry:
          enabled: true              # enable publisher retries
          max-attempts: 5            # maximum number of retries
          initial-interval: 5000     # retry interval (milliseconds)
      simple:
        retry:
          enabled: true              # enable consumer retries
          max-attempts: 5            # maximum number of retries
          initial-interval: 5000     # retry interval (milliseconds)
  sleuth:
    sampler:
      probability: 1.0               # percentage of requests to sample; default 0.1 (10%)

(6). Test

Visit:

http://localhost:9000/order-service/order/1 

View the index library results as follows:
insert image description here

8. Use ELK to analyze tracking data

ELK is a complete log collection and display solution provided by Elastic; the name is an acronym of three products: Elasticsearch, Logstash, and Kibana.

  • Elasticsearch (ES for short): a real-time distributed search and analytics engine that can be used for full-text search, structured search, and analysis. It is built on the full-text search library Apache Lucene and written in Java.
  • Logstash: a data collection engine with real-time pipelining capability; it collects and parses various kinds of data and sends them to ES. Written in Ruby.
  • Kibana: an analysis and visualization web platform for Elasticsearch. It can search the Elasticsearch indices, interact with the data, and generate tables and charts of various dimensions.
  • Beats: the collective name for a family of lightweight collectors written in Go. Below are the five Beats officially supported by Elastic; the open-source community has in fact produced dozens more, covering almost anything you can think of:
    • Filebeat: collects files and directories, mainly used for log data.
    • Winlogbeat: collects data specifically from the Windows event log.
    • Metricbeat: collects metrics from the system or from many middleware products, mainly used to monitor the performance of systems and software.
    • Packetbeat: monitors and collects data on request/response communication through network packet capture and protocol analysis, gathering a lot of information that conventional methods cannot.
    • Heartbeat: connectivity checks between systems, e.g. ICMP, TCP, and HTTP connectivity monitoring.

8.1 Environment preparation

We have already learned the detailed use of ELK in the previous course, so we won't go into details here, just start and use it directly. The ELK version used in this article is uniformly 7.5.2.

  • The Elasticsearch cluster addresses used in this article are:
    192.168.10.101:9200
    192.168.10.102:9200
    192.168.10.103:9200
  • The address of Logstash used in this article is:
    192.168.10.101:9250
  • The address of Kibana used in this article is:
    192.168.10.101:5601

The content of the configuration file log-to-es.conf specified when Logstash is running is as follows:

# data input
input {
  tcp {
    mode => "server"
    host => "192.168.10.101"
    port => 9250
  }
}
# data processing
filter {
  # Take @timestamp, add 8*60*60 (Beijing time is 8 hours ahead of Logstash's @timestamp)
  # and assign the result to the variable timestamp.
  ruby {
    code => "event.set('timestamp', event.get('@timestamp').time.localtime + 8*60*60)"
  }
  # Assign the timestamp value back to @timestamp
  ruby {
    code => "event.set('@timestamp', event.get('timestamp'))"
  }
  # Remove the temporary timestamp field
  mutate {
    remove_field => ["timestamp"]
  }
}
# data output
output {
  elasticsearch {
    hosts => ["192.168.10.101:9200", "192.168.10.102:9200", "192.168.10.103:9200"]
    index => "applog"
  }
}

8.2 Add dependencies

Add logstash-logback-encoder dependencies to projects that require link tracking (service gateway, product service, order service).

<!-- logstash encoder dependency -->
<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>6.3</version>
</dependency>

8.3 Log configuration

In each project that requires link tracking (service gateway, product service, order service), add an appender that outputs JSON-formatted data to Logstash.
logback.xml

<?xml version="1.0" encoding="UTF-8"?>
<configuration scan="true" scanPeriod="10 seconds">

   ...
    <!-- Output JSON-formatted data for Logstash -->
    <appender name="LOGSTASH_PATTERN"
              class="net.logstash.logback.appender.LogstashTcpSocketAppender">
        <!-- destination of the output -->
        <destination>192.168.10.101:9250</destination>
        <!-- log output encoder -->
        <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
            <providers>
                <pattern>
                    <pattern>
                       {
                       "severity": "%level",
                       "service": "${springAppName:-}",
                       "trace": "%X{X-B3-TraceId:-}",
                       "span": "%X{X-B3-SpanId:-}",
                       "exportable": "%X{X-Span-Export:-}",
                       "pid": "${PID:-}",
                       "thread": "%thread",
                       "class": "%logger{40}",
                       "rest": "%message"
                       }
                    </pattern>
                </pattern>
            </providers>
        </encoder>
    </appender>
    <!-- log output level and appenders -->
    <root level="DEBUG">
        <appender-ref ref="CONSOLE"/>
        <appender-ref ref="LOGSTASH_PATTERN"/>
        <appender-ref ref="DEBUG_FILE"/>
        <appender-ref ref="INFO_FILE"/>
        <appender-ref ref="WARN_FILE"/>
        <appender-ref ref="ERROR_FILE"/>
    </root>

   ...
</configuration>

8.4 View index library

Visit: http://192.168.10.101:9100 You can see that the applog index library has been created.
insert image description here

8.5 Testing

Visit http://localhost:9000/order-service/order/1 and check the index library; the results are as follows:
insert image description here
Visit http://192.168.10.101:5601/ to open the Kibana home page:
insert image description here
Add applog index library.
insert image description here
No time filter is used.
insert image description here
Searching for gateway gives the following results:
insert image description here

Origin: blog.csdn.net/sinat_38316216/article/details/129926266