[Activity Review] Understanding eBPF starts with these examples

The eighth Elixir Meetup, sponsored by Tubi, wrapped up successfully at the end of May. Three senior Elixir users, Horvo, Scott, and Yang Miao, shared their Elixir applications and practices with 660+ functional programming enthusiasts online and offline. This article reviews Yang Miao's talk, "Understanding eBPF starts with these examples".

Follow the Bitu Technology official account for the latest Elixir news and events.


What is eBPF?

eBPF is a technology in the Linux kernel; the name stands for Extended Berkeley Packet Filter. The classic BPF (Berkeley Packet Filter) was originally designed for filtering network traffic (for example, the tcpdump command uses BPF for packet filtering). Classic BPF appeared in 1992 and was merged into Linux in 1997, and it saw no significant development for many years after that.

In early 2014, Alexei Starovoitov implemented eBPF. Compared with classic BPF, eBPF extends the instruction set and adds high-level language support, safety mechanisms, JIT compilation, Maps, and more. As a result, eBPF covers a much wider range of scenarios, such as performance analysis, security auditing, and application tracing.

Principles of eBPF

eBPF works by providing a virtual machine inside the kernel that executes safe, restricted code. Such programs can filter and modify network packets, monitor and analyze system performance, control application behavior, and more.

eBPF programs are event-driven: they run when the kernel or an application passes a certain hook point. In other words, they are loaded through user-space tools and APIs and are then triggered automatically by kernel events, network traffic, and so on.

If that definition is not intuitive enough, think of it this way: you are given an entry point where you can place a piece of code that runs safely in the kernel and does some (limited) processing. Readers who have used Java may be reminded of AOP (Aspect Oriented Programming), and the idea behind eBPF is indeed similar. AOP in Java lets us declare a "pointcut" in normal business code (usually a method, marked with an annotation) and then attach "aspect logic" to that pointcut.
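To make the analogy concrete, here is a minimal pure-Python sketch of the hook-point idea. It is only an analogy and involves no eBPF: the names hook_point, attach, and open_file are hypothetical and chosen for illustration. A function declares a hook point, and logic attached later runs every time that point is passed.

```python
import functools
from collections import defaultdict

# hook name -> handlers attached to it (hypothetical registry, analogy only)
_handlers = defaultdict(list)


def hook_point(name):
    """Mark a function as a hook point (the "pointcut" in AOP terms)."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for handler in _handlers[name]:
                handler(name, args)          # attached "aspect logic" runs first
            return func(*args, **kwargs)     # then the original function runs
        return wrapper
    return decorate


def attach(name, handler):
    """Attach a handler to an existing hook point without touching its code."""
    _handlers[name].append(handler)


@hook_point("open_file")
def open_file(path):
    return f"opened {path}"


seen = []
attach("open_file", lambda name, args: seen.append((name, args)))
result = open_file("/etc/hosts")
print(result, seen)
```

The business function never changes after the hook point is declared; only the attached handlers do. That is roughly the relationship between a kernel hook and the eBPF program loaded onto it.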

An example of USDT

USDT stands for Userland Statically Defined Tracing, that is, statically defined tracing in user space. A USDT probe is a static trace point defined in a user-space program, used to collect and analyze performance data and behavior while the program is running. Many programs have added USDT support, such as Erlang, Ruby, Java, and MySQL. Let's look at an example that monitors Erlang garbage collection (the program is run with bpftrace, a high-level language built on eBPF that makes eBPF programs easy to write):

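// Minor GC starts: store the start time, keyed by the string in arg0
usdt:beam.smp:gc_minor__start
{
    @start[str(arg0)] = nsecs;
}
// Minor GC ends: add the elapsed microseconds to the histogram, then drop the start entry
usdt:beam.smp:gc_minor__end
{
    @usecs = hist((nsecs - @start[str(arg0)]) / 1000);
    delete(@start[str(arg0)]);
}
// On exit: empty @start so that only the @usecs histogram is printed
END
{
    clear(@start);
}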

What the code does:

· usdt:beam.smp:gc_minor__start: the probe that fires when a minor garbage collection starts. Its handler records the current timestamp (in nanoseconds).

· usdt:beam.smp:gc_minor__end: the probe that fires when a minor garbage collection ends. Its handler calculates how long the collection took. @start and @usecs are BPF Maps used for storage and statistics. @start is an associative array keyed by arg0 converted to a string, and it holds the timestamp at which each garbage collection started. @usecs is a histogram that records the duration of each collection (in microseconds), grouped into intervals for aggregation.

· In the end probe, the difference between the two timestamps gives the duration of the collection, which is recorded in @usecs. The delete call then removes that collection's start timestamp from @start.

· The END block runs when bpftrace exits. The clear function empties @start so that leftover start timestamps are not printed alongside the histogram.
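The bookkeeping described above can be mimicked in plain Python. This is a rough sketch, not how bpftrace is implemented: GcTimer and log2_bucket are hypothetical names, the process ids and timestamps are made up, and the power-of-two buckets only approximate what hist() prints.

```python
from collections import defaultdict


def log2_bucket(value):
    """Return the power-of-two interval [low, high) that holds a non-negative int."""
    if value < 1:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)
    return (low, low * 2)


class GcTimer:
    def __init__(self):
        self.start = {}                  # plays the role of @start
        self.usecs = defaultdict(int)    # plays the role of @usecs: bucket -> count

    def gc_minor_start(self, key, now_ns):
        self.start[key] = now_ns

    def gc_minor_end(self, key, now_ns):
        begin = self.start.pop(key, None)    # lookup plus delete(@start[...])
        if begin is None:
            return                           # extra guard that the script does not have
        self.usecs[log2_bucket((now_ns - begin) // 1000)] += 1


timer = GcTimer()
timer.gc_minor_start("<0.80.0>", 1_000)
timer.gc_minor_end("<0.80.0>", 6_000)         # took 5 microseconds
timer.gc_minor_start("<0.81.0>", 10_000)
timer.gc_minor_end("<0.81.0>", 2_010_000)     # took 2000 microseconds

for (low, high), count in sorted(timer.usecs.items()):
    print(f"[{low}, {high}): {count}")
```

The two sample collections land in the [4, 8) and [1024, 2048) buckets. Keying the start map by process matters because collections in different Erlang processes can overlap in time.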

Introduction to uprobes

The USDT example showed that the user program must already contain a hook before we can do anything, that is, the probe points are predefined. If the program does not provide them, we can use uprobes for tracing instead.

Uprobes provide user-level dynamic tracing: they let developers insert probes at the entry and exit of arbitrary functions in user-space programs and collect information about calls and returns at runtime. These probes can be used for debugging, performance profiling, and security auditing. Under the hood, a breakpoint is inserted at the target instruction. When that instruction executes, control transfers to the handler we specified, where operations such as reading and modifying arguments can be performed. In the overall eBPF tracing architecture, uprobes sit in the following position:

Auto-Instrumentation for Go is a uprobes-based project for tracing Go applications. The following code demonstrates reading a Go request:

Introduction to Traffic control (TC) Hooks

Traffic Control (TC) is the network traffic control mechanism in the Linux kernel, used for rate limiting, delaying, and dropping network traffic. TC Hooks, as the name suggests, are hooks in TC: we can attach eBPF programs where network data passes through the TC chain, for purposes such as monitoring.

The TC framework contains many components, including the classifier, the queue, and the queueing discipline (the queue scheduler). TC Hooks can act on any node in these components; the most commonly used is the queueing discipline, as shown in the figure below:

A software application usually consists of many modules, and a single user request passes through multiple service calls. How do we reconstruct the request flow? The most common method is a TraceId that is passed along with every call. However, this approach requires code changes, and when there are many applications, possibly in several languages, the cost of modification and maintenance is high. If we go one level down and do the work in a TC Hook, the cost drops sharply (roughly from N times M to M). The logic is shown in the following figure:

The following code demonstrates how to hook HTTP requests with eBPF (the program is run with bcc, a toolkit that provides an easier-to-use API for writing eBPF programs):

As the code above shows, the eBPF code is attached to the kernel through TC (sniff.c runs in the kernel), reads the data in the kernel, and passes it to the user program, which can then parse the raw data. The output of the above program is as follows:
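The sniff.c listing and its output are not included in this text, so the following is only a sketch of the user-space half: parsing the raw bytes handed over by the kernel. It assumes the kernel side delivers complete Ethernet frames carrying IPv4 and TCP. The function names extract_http_request_line and build_sample_frame are hypothetical, and VLAN tags, IPv6, and TCP reassembly are ignored.

```python
import struct


def extract_http_request_line(frame: bytes):
    """Return (method, path, version) from an Ethernet/IPv4/TCP frame, or None."""
    if len(frame) < 14 + 20 + 20:
        return None                                   # too short for all three headers
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:                           # handle IPv4 only
        return None
    ip = frame[14:]
    ip_header_len = (ip[0] & 0x0F) * 4                # IHL counts 32-bit words
    if ip[0] >> 4 != 4 or ip_header_len < 20 or ip[9] != 6:   # protocol 6 is TCP
        return None
    tcp = ip[ip_header_len:]
    if len(tcp) < 20:
        return None
    tcp_header_len = (tcp[12] >> 4) * 4               # data offset counts 32-bit words
    if tcp_header_len < 20:
        return None
    payload = tcp[tcp_header_len:]
    line, sep, _ = payload.partition(b"\r\n")
    if not sep:
        return None                                   # no complete first line
    parts = line.decode("ascii", errors="replace").split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return None
    return tuple(parts)


def build_sample_frame(payload: bytes) -> bytes:
    """Build a synthetic frame with minimal headers, for demonstration only."""
    ethernet = bytes(12) + struct.pack("!H", 0x0800)
    ipv4 = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(payload), 0, 0, 64, 6, 0,
                       bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    tcp = struct.pack("!HHIIBBHHH", 40000, 80, 0, 0, 5 << 4, 0x18, 65535, 0, 0)
    return ethernet + ipv4 + tcp + payload


frame = build_sample_frame(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n")
print(extract_http_request_line(frame))
```

Because the parsing happens outside the applications, none of the services has to be modified, which is the point of doing this work at the TC layer.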

Highlights from the audience Q&A

Security concerns about eBPF

At the meetup we saw how powerful eBPF is: it can execute code inside the kernel. Naturally, that raises security concerns. How does eBPF ensure safety, and what happens if a program panics?

First, loading most eBPF programs requires privileges, such as root or the CAP_BPF capability. Second, the kernel checks every program with the verifier, which ensures that the program is memory safe, that it always terminates, and that it calls only the permitted functions, as shown in the following figure:
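The figure is not included in this text. As a toy illustration of two of these guarantees (termination and the restriction to permitted calls), the sketch below checks a made-up instruction list. It is a deliberately simplified model and not the kernel's algorithm: the instruction format and the verify function are hypothetical, memory-safety checks are left out entirely, and the real verifier can also accept loops that it is able to prove bounded.

```python
# Real helper names, but the allow-list itself is an arbitrary example.
ALLOWED_HELPERS = {"bpf_ktime_get_ns", "bpf_map_lookup_elem", "bpf_trace_printk"}


def successors(prog, i):
    """Indices that may execute right after instruction i."""
    op = prog[i][0]
    if op == "exit":
        return []
    if op == "jmp":                      # unconditional jump
        return [prog[i][1]]
    if op == "jeq":                      # conditional jump: taken or fall through
        return [prog[i][1], i + 1]
    return [i + 1]                       # "mov" and "call" fall through


def verify(prog):
    """Return (ok, reason) for a toy program given as a list of tuples."""
    n = len(prog)
    if n == 0:
        return False, "empty program"
    for i, insn in enumerate(prog):
        if insn[0] == "call" and insn[1] not in ALLOWED_HELPERS:
            return False, f"insn {i}: helper {insn[1]!r} is not allowed"
        if any(not 0 <= t < n for t in successors(prog, i)):
            return False, f"insn {i}: control flow leaves the program"

    state = [0] * n                      # 0 = unvisited, 1 = on current path, 2 = done

    def has_cycle(i):
        state[i] = 1
        for t in successors(prog, i):
            if state[t] == 1 or (state[t] == 0 and has_cycle(t)):
                return True
        state[i] = 2
        return False

    if has_cycle(0):
        return False, "loop detected: termination cannot be guaranteed"
    return True, "ok"


good = [("mov",), ("call", "bpf_ktime_get_ns"), ("jeq", 4), ("mov",), ("exit",)]
looping = [("mov",), ("jmp", 0), ("exit",)]
print(verify(good))
print(verify(looping))
```

A program is rejected before it ever runs, which is why a verified eBPF program cannot hang the kernel the way a buggy kernel module could.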

Some open source projects about eBPF

eBPF has been called the biggest change to operating systems in the past few decades. Many companies already use eBPF in practice, and the community has produced many open source eBPF projects. Here are some:

· Katran [Network - Layer 4 Load Balancer]: Facebook's open source high-performance layer 4 load balancer. Katran is a C++ library and eBPF program for building a high-performance layer 4 load-balancing forwarding plane

· Cilium [Network/Security/Observability]: an open source project from Isovalent that provides eBPF-driven networking, security, and observability. It is specifically designed to bring the benefits of BPF to the Kubernetes world and to address the new scalability, security, and visibility requirements of container workloads

· BCC [Development Tools]: an efficient eBPF-based toolkit for kernel tracing and manipulation, which includes several useful command-line tools and examples. BCC simplifies writing BPF programs in C, includes a wrapper around LLVM, and provides front-end bindings for Python and Lua

· bmc-cache [Memcached + eBPF]: uses eBPF to increase Memcached throughput by up to 18 times. In short, bmc-cache uses eBPF to serve requests from a cache inside the kernel before they reach the standard network stack, and it lets Memcached performance scale better as the number of CPU cores grows

If you are interested in eBPF and want to study it in depth, I also recommend the technical blog www.ebpf.top/ .

This write-up was contributed by Deming, who attended the meetup in person. Thank you for your contribution to the Elixir community!

Other application cases of Elixir in Tubi

Using Elixir/OTP to build a multimedia E2E processing platform
Application of Ruby ideas in Elixir projects
A performance problem hidden in the Elixir code base for 7 years

Join Tubi to grow and become stronger together!

Hot jobs: https://tubi.tech/careers/


Origin blog.csdn.net/weixin_49193714/article/details/131576224