How to achieve cross-language, non-intrusive traffic recording based on eBPF?

Testing is an important part of releasing and launching a product. However, as business scale and complexity keep growing, more and more functionality has to be regression tested before each release, which puts tremendous pressure on testing. Against this backdrop, more and more teams are starting to use traffic playback to run regression tests on their services.

Before building traffic playback capability, we must first record the traffic of online services. The implementation method is usually chosen by weighing traffic characteristics, implementation cost, and intrusiveness to the service.


For Java and PHP, the industry already has relatively mature solutions such as jvm-sandbox-repeater and rdebug, which achieve low-cost, non-intrusive traffic recording. Go, however, lacks an intermediate layer such as the JVM or libc. The existing Go solution, sharingan, requires modifying the official Go source code and intruding into business code, which carries a significant stability risk; it also needs continuous maintenance as the official Go version is upgraded, so the cost of using and maintaining it is high.

Given Didi's multi-language technology stack, our research showed that a cross-language, non-intrusive traffic recording solution can be built on eBPF, which greatly reduces the cost of using and maintaining traffic recording.

Principle of Traffic Recording

Recorded content

During traffic playback, downstream dependent services need to be mocked, so a complete recorded traffic entry must include not only the request/response of the inbound call, but also the request/response of every dependent service invoked while handling that request.

[Figure: a complete traffic entry consists of the inbound request/response plus the request/response of each downstream dependent call]
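To make this concrete, the sketch below shows one possible in-memory layout for a recorded traffic entry. It is illustrative only, assuming a C user-space agent; the struct and field names are hypothetical rather than the actual implementation.

#include <stddef.h>

/* Illustrative sketch: how a complete traffic entry could be modeled. */
struct recorded_call {
    char  *request;          /* raw request bytes  */
    size_t request_len;
    char  *response;         /* raw response bytes */
    size_t response_len;
};

struct recorded_traffic {
    struct recorded_call  inbound;         /* request/response of the inbound call     */
    struct recorded_call *outbound;        /* request/response of each downstream call */
    size_t                outbound_count;  /* number of dependent services invoked     */
};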

Implementation ideas

Before introducing the traffic recording solution, let's look at a simplified request processing flow:

[Figure: simplified flow of the target service handling a request]

Looking at the flow above, the target service generally handles a request as follows (a minimal code sketch of this call sequence follows the list):

  • First, call accept to obtain a connection from the caller;

  • Second, call recv on that connection to read the request data, and parse the request;

  • Third, the target service executes its business logic. During this process it may need to invoke one or more dependent services. For each dependent call, the target service establishes a connection to the dependent service via connect, sends the request data via send on that connection, and receives the dependent service's response via recv;

  • Finally, the target service returns the response data to the caller via send.
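The sketch below strings these steps together with plain socket calls. It is a rough illustration rather than the actual service code; error handling and protocol parsing are omitted, and connect_to_dependency() is a hypothetical helper that returns a socket connected (via connect) to the downstream service.

#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

int connect_to_dependency(void);   /* assumed helper, defined elsewhere */

void handle_one_request(int listen_fd)
{
    char req[4096], dep_resp[4096];

    /* 1. accept a connection from the caller */
    int conn = accept(listen_fd, NULL, NULL);

    /* 2. recv the request data on that connection, then parse it */
    ssize_t req_len = recv(conn, req, sizeof(req), 0);

    /* 3. for each dependent service: connect, send the request, recv the response */
    int dep = connect_to_dependency();
    send(dep, req, (size_t)req_len, 0);                         /* request to the dependency    */
    ssize_t dep_len = recv(dep, dep_resp, sizeof(dep_resp), 0); /* response from the dependency */
    close(dep);

    /* 4. send the response back to the caller */
    send(conn, dep_resp, (size_t)dep_len, 0);
    close(conn);
}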

To record traffic, we need to save all of the request and response data shown in the diagram. Traditional traffic recording approaches have to instrument every method that sends or receives data, such as the service framework, RPC framework, and dependent-service SDKs, in order to collect and save the data. Because frameworks and SDKs vary widely, this requires a lot of code modification and development work, and the cost is hard to control.

Here we consider a more general approach: tracing socket-related operations such as accept, connect, send, and recv. This gives us a general traffic recording method that does not care about the application-layer protocols, frameworks, or SDKs used by the business.

However, since the recording point sits at a lower level, less context is available, and the raw data sent and received on each socket is not enough on its own. We need to combine the raw data with other information to assemble a complete traffic entry.

Distinguishing different requests

Most requests processed by an online service are concurrent, so multiple requests are interleaved at any moment and the raw data we record is scattered. How do we merge the data belonging to the same request and separate the data of different requests? Analyzing the actual request handling flow, two observations stand out:

1. Normally, each request is processed in a separate thread.   

[Figure: each request is handled in its own thread]

2. To improve processing speed, the handler thread may create sub-threads to call dependent services concurrently.

[Figure: sub-threads created to call dependent services concurrently]

In fact, sub-threads may also create sub-threads of their own, forming the thread relationships shown in the following figure:

[Figure: parent-child relationships between the request handling thread and its sub-threads]

For scenarios involving sub-threads, we only need to merge the sub-threads' data into the request handling thread. Each request then corresponds to one request handling thread plus a set of sub-threads, so we can distinguish different requests by thread ID.
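As a minimal sketch of this merging rule, assume the user-space side maintains a child-to-parent thread map fed by thread-creation events; the array-based map below is a toy simplification, not the real data structure.

#include <stdint.h>

#define MAX_TID 65536                 /* toy upper bound for the sketch                 */
static uint32_t parent_of[MAX_TID];   /* child tid -> creating tid; 0 means "no parent" */

/* Called when a thread spawned during request handling is observed. */
void on_thread_created(uint32_t parent_tid, uint32_t child_tid)
{
    if (child_tid < MAX_TID)
        parent_of[child_tid] = parent_tid;
}

/* Walk up the parent chain; only threads spawned while handling a request are
 * registered, so the walk stops at the request handling thread itself.  All
 * captured data is then tagged with that tid, merging it into one request. */
uint32_t resolve_request_tid(uint32_t tid)
{
    while (tid < MAX_TID && parent_of[tid] != 0)
        tid = parent_of[tid];
    return tid;
}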

Distinguishing data types

Each traffic entry contains two kinds of data: the request and response of the inbound call, and the requests and responses of downstream dependent calls. We need to tell them apart while recording. Observing the request handling flow, the rules are easy to spot:

1. The request and response of the inbound call are received and sent on the socket obtained from accept: the data read by recv is the request, and the data written by send is the response.

2. The requests and responses of downstream dependent calls are sent and received on sockets obtained from connect: the data written by send is the request, and the data read by recv is the response; different sockets correspond to different downstream calls.

Therefore, we can distinguish the different data types, and tell downstream dependent calls apart, by the socket's type and identity.
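A tiny sketch of this rule, assuming the agent tags every socket when it first sees accept or connect; the enum and function names here are illustrative.

/* Classify a captured data event by the socket's origin (accept vs connect)
 * and the direction of the operation (send vs recv). */
enum sock_origin { FROM_ACCEPT, FROM_CONNECT };
enum data_dir    { DIR_SEND, DIR_RECV };
enum data_kind   { INGRESS_REQUEST, INGRESS_RESPONSE,
                   DOWNSTREAM_REQUEST, DOWNSTREAM_RESPONSE };

enum data_kind classify(enum sock_origin origin, enum data_dir dir)
{
    if (origin == FROM_ACCEPT)
        return dir == DIR_RECV ? INGRESS_REQUEST : INGRESS_RESPONSE;
    /* sockets created by connect belong to downstream dependent calls;
     * each distinct socket identifies a distinct downstream call */
    return dir == DIR_SEND ? DOWNSTREAM_REQUEST : DOWNSTREAM_RESPONSE;
}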

Implementing Traffic Recording

Since most services already run in the cloud, the solution must support containerized deployment. The eBPF program runs in the kernel, and all containers on a host share the same kernel, so the eBPF program only needs to be loaded once to record the data of all processes. The overall design is as follows:

[Figure: overall architecture showing the recording agent, recording server, and eBPF program, with interaction steps numbered (1) through (8)]

  • Recording agent: deployed in the same container as the target process; it finds the pid of the target process by process name, (1) tells the recording server to turn recording on or off, (7) receives raw data from the recording server and parses it into complete traffic entries, and (8) saves them to a log file.

  • Recording server: deployed on the host machine; responsible for (2, 3) loading and attaching the eBPF program and (6) reading raw data from the eBPF map.

  • eBPF program: responsible for (5) reading the raw data from the hooked functions and writing it into the eBPF map whenever the target process (4) sends or receives data.

Selecting attach points

According to the previous discussion, the socket operations we need to track include:

  • accept and connect are used to distinguish socket types.

  • send and recv are used to capture sent and received data.

  • close is used to identify the end of the call.

For Go, we also need to obtain the goroutine id performing each socket operation and track the parent-child relationships between goroutines.

Before developing an eBPF program, we need to pick suitable attach points. Different eBPF program types provide different contexts and can call different bpf-helper functions. Since we only need to record TCP and UDP data, we can attach kprobes to the following kernel functions:

  • inet_accept

  • inet_stream_connect

  • inet_sendmsg

  • inet_recvmsg

  • inet_release

To track the relationships between goroutines, we can attach a uprobe to the Go runtime's runtime.newproc1 function and obtain the corresponding goroutine information from callergp and newg.
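For reference, a minimal user-space loader based on libbpf might look like the sketch below. The object file name is an assumption, and the real recording server additionally sets up the maps, the uprobe on runtime.newproc1, and pid filtering, which are omitted here.

/* Open the compiled eBPF object, load it into the kernel, and attach every
 * program according to its SEC() annotation (kprobe/kretprobe on
 * inet_sendmsg, inet_recvmsg, ...). */
#include <bpf/libbpf.h>
#include <stdio.h>

int main(void)
{
    struct bpf_object *obj;
    struct bpf_program *prog;

    obj = bpf_object__open_file("traffic_record.bpf.o", NULL);  /* assumed file name */
    if (!obj) {
        fprintf(stderr, "failed to open eBPF object\n");
        return 1;
    }

    if (bpf_object__load(obj)) {
        fprintf(stderr, "failed to load eBPF object\n");
        return 1;
    }

    bpf_object__for_each_program(prog, obj) {
        if (!bpf_program__attach(prog)) {   /* attach based on the SEC() name */
            fprintf(stderr, "failed to attach %s\n", bpf_program__name(prog));
            return 1;
        }
    }

    /* ... poll the eBPF map for raw events and hand them to the recording agent ... */
    return 0;
}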

Develop eBPF programs

Although traffic recording involves multiple kernel functions, they are all handled in essentially the same way. The following uses the recording of socket send data as a detailed example.

Function signature:

int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)

Parameter Description:

  • sock: pointer to the socket

  • msg: the data to send

  • size: length of the data to send

Return value:

  • On success, the number of bytes sent; on failure, an error code.

Since the number of bytes actually sent is only known when the function returns, we need two eBPF programs:

  • Log function arguments and context at function entry

  • Record the actual sent data content when the function returns

eBPF program at function entry:

SEC("kprobe/inet_sendmsg")
int BPF_KPROBE(inet_sendmsg_entry, struct socket *sock, struct msghdr *msg)
{
    struct probe_ctx pctx = {
        .bpf_ctx = ctx,
        .version = EVENT_VERSION,
        .source = EVENT_SOURCE_SOCKET,
        .type = EVENT_SOCK_SENDMSG,
        .sr.sock = sock,
    };
    int err;




    // 过滤掉不需要录制的进程
    if (pid_filter(&pctx)) {
        return 0;
    }




    // 读取 socket 类型信息
    err = read_socket_info(&pctx, &pctx.sr.sockinfo, sock);
    if (err) {
        tm_err2(&pctx, ERROR_READ_SOCKET_INFO, __LINE__, err);
        return 0;
    }




    // 记录 msg 中的数据信息
    err = bpf_probe_read(&pctx.sr.iter, sizeof(pctx.sr.iter), &msg->msg_iter);
    if (err) {
        tm_err2(&pctx, ERROR_BPF_PROBE_READ, __LINE__, err);
        return 0;
    }




    // 将相关上下文信息保存到 map 中
    pctx.id = bpf_ktime_get_ns();
    err = save_context(pctx.pid, &pctx);
    if (err) {
        tm_err2(&pctx, ERROR_SAVE_CONTEXT, __LINE__, err);
    }
    return 0;
}

eBPF program at function return:

SEC("kretprobe/inet_sendmsg")
int BPF_KRETPROBE(inet_sendmsg_exit, int retval)
{
    struct probe_ctx pctx = {
        .bpf_ctx = ctx,
        .version = EVENT_VERSION,
        .source = EVENT_SOURCE_SOCKET,
        .type = EVENT_SOCK_SENDMSG,
    };
    struct sock_send_recv_event event = {};
    int err;




    // 过滤掉不需要录制的进程
    if (pid_filter(&pctx)) {
        return 0;
    }




    // 如果发送失败, 跳过录制数据
    if (retval <= 0) {
        goto out;
    }




    // 从 map 中读取提前保存的上下文信息
    err = read_context(pctx.pid, &pctx);
    if (err) {
        tm_err2(&pctx, ERROR_READ_CONTEXT, __LINE__, err);
        goto out;
    }




    // 构造 sendmsg 报文
    event.version = pctx.version;
    event.source = pctx.source;
    event.type = pctx.type;
    event.tgid = pctx.tgid;
    event.pid = pctx.pid;
    event.id = pctx.id;
    event.sock = (u64)pctx.sr.s;
    event.sock_family = pctx.sr.sockinfo.sock_family;
    event.sock_type = pctx.sr.sockinfo.sock_type;




    // 从 msg 中读取数据填充到 event 报文, 并通过 map 传递到用户空间
    sock_data_output(&pctx, &event, &pctx.sr.iter);




out:
    // 清理上下文信息
    err = delete_context(pctx.pid);
    if (err) {
        tm_err2(&pctx, ERROR_DELETE_CONTEXT, __LINE__, err);
    }
    return 0;
}
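The save_context / read_context / delete_context helpers above pass the entry-probe context to the return probe. One minimal way to implement them is sketched below, assuming struct probe_ctx is defined elsewhere and using the thread id as the key; this is a simplified illustration, not necessarily identical to our implementation.

/* Hash map keyed by thread id, holding the context saved at function entry. */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, u32);                 /* thread id (pctx.pid) */
    __type(value, struct probe_ctx);  /* saved entry context  */
} context_map SEC(".maps");

static __always_inline int save_context(u32 pid, struct probe_ctx *pctx)
{
    return bpf_map_update_elem(&context_map, &pid, pctx, BPF_ANY);
}

static __always_inline int read_context(u32 pid, struct probe_ctx *pctx)
{
    struct probe_ctx *saved = bpf_map_lookup_elem(&context_map, &pid);
    void *bpf_ctx = pctx->bpf_ctx;

    if (!saved)
        return -1;
    __builtin_memcpy(pctx, saved, sizeof(*pctx));
    pctx->bpf_ctx = bpf_ctx;   /* keep the return probe's own bpf context */
    return 0;
}

static __always_inline int delete_context(u32 pid)
{
    return bpf_map_delete_elem(&context_map, &pid);
}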

Getting the goid

For Go, we need to correlate data by goroutine id when sending and receiving. How do we obtain it inside the eBPF program? By analyzing the Go source code, we found that the goroutine id is stored in struct g, and a pointer to the current g can be obtained via getg().

getg function:

// getg returns the pointer to the current g.
// The compiler rewrites calls to this function into instructions
// that fetch the g directly (from TLS or from the dedicated register).
func getg() *g

According to the comment, the pointer to the current g lives in thread-local storage (TLS), and calls to getg() are rewritten by the compiler. To see a concrete implementation, we looked at the runtime.newg function, which calls getg; disassembling it shows that the pointer to g is stored at the address held in the fs register minus 8:

[Figure: disassembly showing the g pointer loaded from the address in the fs register minus 8]

Next, we find the goid field in struct g (located in runtime/runtime2.go):

type g struct {
    // ... many fields omitted ...
    goid         int64
    // ... many fields omitted ...
}

Once we have the pointer to g, we just add the offset of the goid field to read the goid. Taking into account that the goid offset may differ between Go versions, the eBPF code that obtains the current goid finally looks like this:

static __always_inline
u64 get_goid()
{
      struct task_struct *task = (struct task_struct *)bpf_get_current_task();
      unsigned long fsbase = 0;
      void *g = NULL;
      u64 goid = 0;
      bpf_probe_read(&fsbase, sizeof(fsbase), &task->thread.fsbase);
      bpf_probe_read(&g, sizeof(g), (void*)fsbase-8);
      bpf_probe_read(&goid, sizeof(goid), (void*)g+GOID_OFFSET);
      return goid;
}
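GOID_OFFSET here is a compile-time constant (for example passed via -DGOID_OFFSET=... for the target Go version). As a side note, an alternative sketch, not necessarily what we did, is to expose the offset as a libbpf read-only global that the loader fills in per Go version before the program is loaded:

/* Declared in the eBPF C code; lands in the .rodata section.  With a libbpf
 * skeleton the loader can set it before load, e.g.
 *   skel->rodata->goid_offset = <offset read from the target Go version>;
 * get_goid() would then use goid_offset instead of the GOID_OFFSET macro. */
const volatile u64 goid_offset = 0;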

Problems encountered

Although eBPF programs can be written in C, the process differs considerably from ordinary C development and comes with many restrictions.

The following are the more notable problems we encountered during development, along with their workarounds:

  • Global variables, constant strings, and arrays are not allowed; such data can be stored in a map instead.

  • Function calls are not supported; this can be worked around by inlining.

  • The stack cannot exceed 512 bytes; when necessary, an array-type map can be used as a buffer (see the sketch after this list).

  • User-space and kernel-space memory cannot be accessed directly; it must be read through bpf-helper functions.

  • A single program cannot exceed 1,000,000 instructions. Keep the eBPF program logic as simple as possible and do complex processing in the user-space program.

  • Loops must have a clear upper bound on the number of iterations; the bound cannot be determined only at runtime.

  • Structure members must be memory-aligned, otherwise some memory may be left uninitialized and the verifier will report an error.

  • After compiler optimization, the verifier may falsely report out-of-bounds memory access. Adding an if check can help the verifier; if necessary, inline assembly can be used.

  • ....
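For the 512-byte stack limit in particular, a common workaround (a generic sketch, not the project's exact code) is a single-entry per-CPU array map used as scratch space:

/* A per-CPU array with one large entry provides a buffer that does not count
 * against the 512-byte eBPF stack limit. */
struct big_buffer {
    char payload[4096];
};

struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, u32);
    __type(value, struct big_buffer);
} scratch_map SEC(".maps");

static __always_inline struct big_buffer *get_scratch(void)
{
    u32 zero = 0;
    /* always entry 0 on the current CPU; callers must NULL-check the result
     * to satisfy the verifier */
    return bpf_map_lookup_elem(&scratch_map, &zero);
}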

As clang and kernel support for eBPF gradually improve, many of these problems are being solved, and the development experience will keep getting smoother.

Security Mechanism

To keep the traffic data secure and reduce the performance impact of data desensitization on online machines, we chose to encrypt the data during the traffic collection phase:

[Figure: traffic data is encrypted during the collection phase before being stored]

Summary

This article introduced the application of eBPF to traffic recording, in the hope of helping you reduce the cost of implementing and adopting traffic recording and quickly build traffic playback capabilities. Due to space constraints, many details of traffic recording could not be shared; we plan to open source the solution in the future, so please keep an eye on Didi's open source projects. For more eBPF application scenarios, interested readers can also read "EBPF Kernel Technology in Didi Cloud's Native Landing Practice".

Given the limits of the author's expertise, mistakes and omissions in the article are inevitable. Corrections are welcome in the comments, and we look forward to further exchanges and discussions.


Author and department introduction 

The author of this article, Wang Chaofeng, comes from Didi's ride-hailing Travel Technology team. As the R&D team for the ride-hailing business, Travel Technology has built the end-user experience platform, the C-end user product ecosystem, the B-end transportation capacity supply ecosystem, the travel safety ecosystem, the service governance ecosystem, and the core security system, aiming to create a travel platform that is safe, reliable, efficient, convenient, and trusted by users.

Job Offers

The team is hiring for backend development and test development roles. Interested candidates are welcome to scan the QR code below to submit a resume directly. We look forward to your joining!

R&D Engineer

Job description:

1. Responsible for backend R&D of related business systems, including business architecture design, development, complexity control, and improving system performance and R&D efficiency;

2. Business-minded; through continuous technical research and innovation, iterate on core business metrics together with product and operations.

[QR code for submitting a resume]

Test Development Engineer

Job description: 

1. Build a quality assurance system applicable to the online car-hailing business, formulate and promote the implementation of relevant quality technical solutions, and continue to ensure business quality;

2. In-depth understanding of the business, establish communication with various roles in the business, summarize business problems and pain points, create value for the business in an all-round way, and work without fixed boundaries;

3. Improve business code quality and delivery efficiency by applying relevant quality infrastructure;

4. Distill efficient testing solutions and generalize them so they can be applied in other business lines;

5. Solve difficult problems and complex technical problems in business quality assurance;

6. Forward-looking exploration in the field of quality technology.

[QR code for submitting a resume]
