How to implement a simple distributed link tracking function?

Why you need link tracking

In a microservice environment, services call each other, and the interactions can be complex, for example A->B->C->D->C. Without a way to record the complete path of a request, troubleshooting is hard, because the logs that belong to one request cannot be strung together across services.

How to implement link tracking

Suppose we start from the user-facing interface. Each request needs a unique request id (we will call it traceId for now) that identifies it. When the interface receives the request and calls downstream services or MQ, the traceId is passed along and printed in every log line, which gives us a simple link tracking function.

How to generate a unique requestId for each request

In a microservice environment, user requests generally pass through a gateway first, and the gateway forwards them to the individual services. There are various microservice gateways, such as Nginx, Zuul, Spring Cloud Gateway, Kong, and Traefik. Assume the link is: user request -> nginx -> zuul -> service-a, service-b, etc. (here Eureka is the service registry, Feign handles calls between microservices, and Zuul is the front-end gateway).

In this setup, the traceId has to be assigned when the request reaches nginx and then passed all the way down to the service layer. Given such a link, how do we design a link tracking tool?

Nginx

Since version 1.11.0, nginx has a built-in variable $request_id. It is generated from 16 random bytes and written as 32 hexadecimal characters. It is not a formatted UUID, but the probability of a collision is very small, so it can be used like one. nginx generates one $request_id per user request, which we can use as our traceId.
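To make the format concrete, here is a minimal Java sketch that builds an id of the same shape as $request_id (16 random bytes written as 32 lowercase hexadecimal characters). It is only an illustration, for example for a component that has to create a traceId itself when none arrives. The class and method names are hypothetical, and this is not a description of how nginx is implemented internally.

```java
import java.security.SecureRandom;

// Hypothetical helper for illustration; not part of nginx or of the original article.
final class RequestIdSketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    private RequestIdSketch() {
    }

    /** Returns 16 random bytes rendered as 32 lowercase hexadecimal characters. */
    static String newRequestId() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        char[] out = new char[32];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;      // treat the byte as unsigned
            out[2 * i] = HEX[v >>> 4];    // high nibble
            out[2 * i + 1] = HEX[v & 0x0F]; // low nibble
        }
        return new String(out);
    }
}
```

Calling RequestIdSketch.newRequestId() twice should give two different 32-character strings.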

To set this up, first add $request_id to the nginx log format:

log_format access '$remote_addr $request_time $body_bytes_sent $http_user_agent $request $status $request_id';

The commonly used built-in variables of nginx and their meanings are as follows:

  • $remote_addr: client address, e.g. 172.16.11.1
  • $remote_user: user name supplied with Basic authentication
  • $time_local: local access time with time zone, e.g. 20/Dec/2022:10:47:58 +0800
  • $request: the full original request line, e.g. "GET / HTTP/1.1"
  • $status: response status code, e.g. 304
  • $body_bytes_sent: number of bytes sent to the client, not counting the response header
  • $request_time: total processing time of the request in seconds, with millisecond resolution
  • $request_id: unique id of the current request
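With the log_format shown earlier, $request_id is the last whitespace-separated field of each access-log line. A small Java sketch of reading it back, for instance to match an nginx line with the service logs of the same request, might look like this. The class name is hypothetical, and the sketch assumes the id stays in the last position of the format.

```java
// Hypothetical helper for illustration; assumes $request_id is the last field of the line.
final class AccessLogSketch {
    private AccessLogSketch() {
    }

    /** Returns the last whitespace-separated field of an access-log line. */
    static String requestIdOf(String line) {
        String trimmed = line.trim();
        int lastSpace = trimmed.lastIndexOf(' ');
        // No space at all means the whole line is a single field
        return lastSpace < 0 ? trimmed : trimmed.substring(lastSpace + 1);
    }
}
```

For a made-up line such as "172.16.11.1 0.012 512 curl/7.68.0 GET / HTTP/1.1 200 0123456789abcdef0123456789abcdef", the method returns the final 32-character field.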

Second, add the traceId header when nginx forwards the request:

location / {
    proxy_set_header traceId $request_id;
}

Zuul

After nginx forwards the traceId to zuul, the header can be lost when zuul routes the request onward. We can write a custom zuul pre-filter that passes the header along. The code is relatively simple:

import javax.servlet.http.HttpServletRequest;

import com.netflix.zuul.ZuulFilter;
import com.netflix.zuul.context.RequestContext;
import org.springframework.stereotype.Component;

@Component
public class TraceIdPreFilter extends ZuulFilter {
    private static final String TRACE_ID = "traceId";

    @Override
    public String filterType() {
        return "pre"; // run before the request is routed
    }

    @Override
    public int filterOrder() {
        return 0;
    }

    @Override
    public boolean shouldFilter() {
        return true; // apply to every request
    }

    @Override
    public Object run() {
        RequestContext requestContext = RequestContext.getCurrentContext();
        HttpServletRequest request = requestContext.getRequest();
        // Copy the inbound traceId header onto the request that zuul sends downstream
        requestContext.addZuulRequestHeader(TRACE_ID, request.getHeader(TRACE_ID));
        return null;
    }
}

Service layer

The service layer has several things to do:

  • Receive and temporarily store the traceId forwarded by zuul
  • Configure the log file so that log lines output the traceId
  • Pass the traceId on when the service calls other services

First, at the code level: how do we receive and store the traceId forwarded by zuul? We use a servlet filter and MDC (a key placed in the MDC can be output in the log). The filter reads the traceId forwarded by zuul and puts it into the MDC so that our log file can output it:

 
 
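import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

public class TraceIdFilter implements Filter {
    private static final String TRACE_ID = "traceId";

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain filterChain) throws IOException, ServletException {
        try {
            HttpServletRequest httpRequest = (HttpServletRequest) request;
            String traceId = httpRequest.getHeader(TRACE_ID);
            TraceIdHelper.setTraceId(traceId);
            filterChain.doFilter(request, response);
        } finally {
            // Clear the traceId from the MDC so that this request cannot affect other requests
            TraceIdHelper.clearTraceId();
        }
    }

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
    }

    @Override
    public void destroy() {
    }
}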
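
import java.util.UUID;

import lombok.experimental.UtilityClass;
import org.apache.commons.lang3.StringUtils; // assumed: Apache Commons Lang 3, which provides isBlank
import org.slf4j.MDC;

@UtilityClass
public class TraceIdHelper {
    public static final String TRACE_ID = "traceId";
    private static final ThreadLocal<String> TRACE_ID_THREAD_LOCAL = new ThreadLocal<>();

    /**
     * Sets the traceId; generates a new one when the given value is blank.
     *
     * @param traceId the traceId received from upstream, may be blank
     */
    public void setTraceId(String traceId) {
        if (StringUtils.isBlank(traceId)) {
            traceId = UUID.randomUUID().toString();
        }
        TRACE_ID_THREAD_LOCAL.set(traceId);
        MDC.put(TRACE_ID, traceId);
    }

    /**
     * Clears the traceId from the ThreadLocal and the MDC.
     */
    public void clearTraceId() {
        TRACE_ID_THREAD_LOCAL.remove();
        MDC.remove(TRACE_ID);
    }

    /**
     * Returns the traceId of the current thread.
     *
     * @return the current traceId, or null if none has been set
     */
    public String getTraceId() {
        return TRACE_ID_THREAD_LOCAL.get();
    }
}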
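
The set-then-clear pattern used by the filter and the helper can be tried out without a servlet container. The following is a minimal sketch in plain Java, assuming a ThreadLocal stands in for the MDC; the class TraceScopeSketch and its methods are hypothetical names, not part of the article's code.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical stand-in for TraceIdFilter + TraceIdHelper, using only the JDK.
final class TraceScopeSketch {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TraceScopeSketch() {
    }

    /** Returns the traceId bound to the current thread, or null. */
    static String current() {
        return CURRENT.get();
    }

    /** Handles one "request": binds the inbound id (or a fresh one), runs the handler, then clears. */
    static <T> T withTraceId(String inbound, Supplier<T> handler) {
        String id = (inbound == null || inbound.trim().isEmpty())
                ? UUID.randomUUID().toString()
                : inbound;
        CURRENT.set(id);
        try {
            return handler.get();
        } finally {
            // Pooled threads are reused, so the id must not leak into the next request
            CURRENT.remove();
        }
    }
}
```

Inside the handler, current() returns the bound id; once withTraceId returns, current() is null again on that thread.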

Filter registration:

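import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TraceIdConfig {
    @Bean
    public FilterRegistrationBean<TraceIdFilter> loggingFilter() {
        FilterRegistrationBean<TraceIdFilter> registrationBean = new FilterRegistrationBean<>();
        registrationBean.setFilter(new TraceIdFilter());
        // URL pattern the filter applies to
        registrationBean.addUrlPatterns("/*");
        return registrationBean;
    }
}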

Let’s take a look at our log file configuration (logback.xml):

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <!-- %X{traceId} prints the traceId stored in the MDC -->
    <property name="LOG_PATTERN"
              value="%d{yyyy-MM-dd} %d{HH:mm:ss.SSS} [%highlight(%-5level)] [%boldYellow(%X{traceId})] [%boldYellow(%thread)] %boldGreen(%logger{36} %F.%L) %msg%n">
    </property>
    <property name="FILE_LOG_PATTERN"
              value="%d{yyyy-MM-dd} %d{HH:mm:ss.SSS} [%-5level] [%X{traceId}] [%thread] %logger{36} %F.%L %msg%n">
    </property>
    <property name="FILE_PATH" value="/wls/app/applogs/service-a.%d{yyyy-MM-dd}.%i.log" />
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
        </encoder>
    </appender>
    <appender name="FILE"
              class="ch.qos.logback.core.rolling.RollingFileAppender">
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <fileNamePattern>${FILE_PATH}</fileNamePattern>
            <!-- keep 15 days' worth of history -->
            <maxHistory>15</maxHistory>
            <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                <!-- maximum size of a single log file -->
                <maxFileSize>10MB</maxFileSize>
            </timeBasedFileNamingAndTriggeringPolicy>
        </rollingPolicy>
        <encoder>
            <pattern>${FILE_LOG_PATTERN}</pattern>
        </encoder>
    </appender>
    <logger name="com.example.service.controller" level="debug"></logger>
    <root level="info">
        <appender-ref ref="STDOUT"/>
        <appender-ref ref="FILE"/>
    </root>
</configuration>

Because both patterns in the XML configuration include %X{traceId}, the traceId appears in every log line that is output.

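Finally, when one service calls another, the traceId has to be passed to the next microservice. This is done with a Feign interceptor:

import feign.RequestInterceptor;
import feign.RequestTemplate;
import org.springframework.stereotype.Component;
import org.springframework.web.context.request.RequestContextHolder;
import org.springframework.web.context.request.ServletRequestAttributes;

@Component
public class TraceIdFeignInterceptor implements RequestInterceptor {
    @Override
    public void apply(RequestTemplate requestTemplate) {
        // Spring's request context for the current thread
        ServletRequestAttributes requestAttributes = (ServletRequestAttributes) RequestContextHolder.getRequestAttributes();
        if (requestAttributes != null) {
            // The servlet filter has already stored the traceId, so it can be read directly.
            // TraceIdHelper.TRACE_ID is used because TraceIdFilter.TRACE_ID is private.
            requestTemplate.header(TraceIdHelper.TRACE_ID, TraceIdHelper.getTraceId());
        }
    }
}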
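
The same idea applies to any HTTP client, not only Feign. As an illustration, this sketch uses the JDK's java.net.http API (Java 11 or later) to attach the header to an outgoing request. The class name OutboundTraceSketch and the service-b URL in the usage note are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for illustration; mirrors what the Feign interceptor does.
final class OutboundTraceSketch {
    static final String TRACE_ID = "traceId";

    private OutboundTraceSketch() {
    }

    /** Builds a GET request that carries the traceId header when an id is available. */
    static HttpRequest withTraceId(URI uri, String traceId) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(uri).GET();
        if (traceId != null && !traceId.isEmpty()) {
            builder.header(TRACE_ID, traceId);
        }
        return builder.build();
    }
}
```

For example, OutboundTraceSketch.withTraceId(URI.create("http://service-b.example/orders"), id) yields a request whose traceId header holds the id; building the request does not open a connection.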

Message queue layer

Suppose service-a sends an MQ message and service-b consumes it. How do we link the consuming side into the trace as well? Take RocketMQ as an example: RocketMQ lets you attach user properties to a message, so the traceId can be carried in a user property. For example, when sending the message:

Message msg = new Message("SequenceTopicTest", // topic
        "TagA", // tag
        ("Hello RocketMQ " + i).getBytes("utf-8") // body
);
msg.putUserProperty("traceId", TraceIdHelper.getTraceId()); // attach the traceId

When consuming the message:

String traceId = msgs.get(0).getUserProperty("traceId");
TraceIdHelper.setTraceId(traceId);
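
Consumer threads are usually pooled as well, so the traceId should be cleared once the message has been handled. A minimal sketch of both sides, assuming a plain Map stands in for the message's user properties (this is not the RocketMQ API, and MqTraceSketch is a hypothetical name):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map plays the role of the message's user properties.
final class MqTraceSketch {
    static final String TRACE_ID = "traceId";
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private MqTraceSketch() {
    }

    static void set(String traceId) {
        CURRENT.set(traceId);
    }

    static String current() {
        return CURRENT.get();
    }

    static void clear() {
        CURRENT.remove();
    }

    /** Producer side: returns a copy of the properties with the current traceId added. */
    static Map<String, String> stamp(Map<String, String> userProperties) {
        Map<String, String> copy = new HashMap<>(userProperties);
        String id = CURRENT.get();
        if (id != null) {
            copy.put(TRACE_ID, id);
        }
        return copy;
    }

    /** Consumer side: restores the traceId, runs the handler, and always clears afterwards. */
    static void consume(Map<String, String> userProperties, Runnable handler) {
        CURRENT.set(userProperties.get(TRACE_ID));
        try {
            handler.run();
        } finally {
            CURRENT.remove();
        }
    }
}
```

The try/finally in consume plays the same role as the finally block in TraceIdFilter.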

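How to pass the traceId with dubbo

dubbo's SPI mechanism makes it easy to implement all kinds of extensions. For example, dubbo supports provider-side and consumer-side filters, so we can implement one filter for the provider and one for the consumer.

On the service consumer side, we can define a custom consumer filter that first obtains the traceId with TraceIdHelper.getTraceId() and then passes it on with dubbo's setAttachment("traceId", TraceIdHelper.getTraceId()).

Similarly, on the service provider side, we can define a custom provider filter that first obtains the traceId with dubbo's getAttachment and then stores it with our TraceIdHelper.setTraceId. The code is omitted here.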
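
The shape of the two filters can still be sketched with JDK-only stand-ins. Nothing below is the dubbo API: the Invocation class, the two filter methods, and the name RpcTraceSketch are hypothetical, and the attachments map only imitates what setAttachment and getAttachment would carry across the wire.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-ins; a real implementation would use dubbo's own filter extension point.
final class RpcTraceSketch {
    static final String TRACE_ID = "traceId";
    static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private RpcTraceSketch() {
    }

    /** Stand-in for an RPC invocation: it only carries string attachments. */
    static final class Invocation {
        final Map<String, String> attachments = new HashMap<>();
    }

    /** Consumer-side filter: attach the caller's traceId before the call leaves. */
    static <R> R consumerFilter(Invocation invocation, Function<Invocation, R> next) {
        String id = CURRENT.get();
        if (id != null) {
            invocation.attachments.put(TRACE_ID, id);
        }
        return next.apply(invocation);
    }

    /** Provider-side filter: restore the traceId before the service runs, clear it afterwards. */
    static <R> R providerFilter(Invocation invocation, Function<Invocation, R> next) {
        CURRENT.set(invocation.attachments.get(TRACE_ID));
        try {
            return next.apply(invocation);
        } finally {
            CURRENT.remove();
        }
    }
}
```

In this sketch the consumer filter writes the attachment and the provider filter reads it back, which is the same round trip the two paragraphs above describe.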

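How to keep passing the traceId across threads

Note that our TraceIdHelper utility stores the traceId in a ThreadLocal. In a multi-threaded environment a child thread therefore cannot get the traceId, which means its log lines are printed without one. There are several ways to solve this:

  • Define a custom ThreadPoolTaskExecutor that sets the traceId before each task's run executes; the drawback is that this is rather cumbersome
  • Use TransmittableThreadLocal, an open-source library from Alibaba (it transmits ThreadLocal values when using thread pools and other components that pool and reuse threads, which solves context propagation for asynchronous execution)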
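
The first approach boils down to wrapping each task so that it carries the submitting thread's traceId. Below is a minimal sketch using only the JDK, without Spring's ThreadPoolTaskExecutor or TransmittableThreadLocal; the class TraceAwareTasks and its methods are hypothetical names.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;

// Hypothetical helper for illustration; a ThreadLocal stands in for TraceIdHelper.
final class TraceAwareTasks {
    static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TraceAwareTasks() {
    }

    /** Captures the submitting thread's traceId and restores it inside the worker thread. */
    static Runnable wrap(Runnable task) {
        String captured = CURRENT.get(); // read on the submitting thread
        return () -> {
            String previous = CURRENT.get(); // whatever the pooled thread held before
            CURRENT.set(captured);
            try {
                task.run();
            } finally {
                // Put the worker back into its previous state so nothing leaks
                if (previous == null) {
                    CURRENT.remove();
                } else {
                    CURRENT.set(previous);
                }
            }
        };
    }

    /** Demo helper: returns the traceId a pooled worker observes for a task submitted from this thread. */
    static String observedInWorker(ExecutorService pool, boolean wrapped) {
        String[] seen = new String[1];
        Runnable task = () -> seen[0] = CURRENT.get();
        try {
            pool.submit(wrapped ? wrap(task) : task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

With a traceId set on the calling thread, observedInWorker(pool, false) returns null while observedInWorker(pool, true) returns the caller's traceId. A custom executor would apply wrap to every submitted task in one place.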

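Summary

Implementing the link tracking tool involves several components, each of which needs its own configuration:

  • nginx: add $request_id to the log format, and set the $request_id header when forwarding
  • zuul: configure a pre-filter that passes the traceId downstream
  • Service layer: configure the log file and use MDC to print the traceId; use a filter and a Feign interceptor to pass the traceId along
  • mq: to pass the traceId through a message queue, RocketMQ for example needs a user property
  • Multithreading: child threads may not be able to get the traceId; define a custom ThreadPoolTaskExecutor or use Alibaba's open-source TransmittableThreadLocal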

Origin blog.csdn.net/qq_41221596/article/details/133440935