1. Introduction

I recently took over some eBPF-related projects, so I went back over the basics of eBPF.

The main advantage of eBPF is that it spares us painful kernel development work. Without eBPF, observing what the kernel is doing means writing a kernel driver (module). With eBPF, the program we write is well isolated from the kernel and is checked by the verifier before it runs, so a bug in it does not crash the kernel.

Earlier posts worth reviewing first:

ebpf c learning: Preface (Lei's Blog, CSDN)

Related information:

https://ebpf.io/summit-2020-slides/eBPF_Summit_2020-Lightning-Lorenzo_Fontana-Debugging_the_BPF_Virtual_Machine.pdf

https://qmo.fr/docs/talk_20200202_debugging_ebpf.pdf

What you need to know to develop eBPF programs:

1. Knowledge of the Linux kernel

2. Command of the common eBPF tools

3. Familiarity with libbpf

4. Reasonable C development skills

An eBPF project consists of two programs: a kernel-side program, which is compiled to BPF bytecode with clang, and a user-space program, which loads that bytecode into the kernel.

Compilation uses clang with the LLVM BPF backend. The resulting bytecode is loaded into the kernel, checked by the verifier, and executed by the in-kernel BPF virtual machine.

This note explores two questions:

1. Where can eBPF attach probes?

2. How do user-space programs communicate with BPF programs?

Follow-up posts will dig deeper step by step and look at how eBPF is applied in areas such as cloud native.

2. Where can we use eBPF to attach probes

Here we use kprobe as an example:

bpftrace -l "kprobe:*"

This lists every available kprobe attach point. For example, to probe sys_write (whose kernel symbol on x86_64 is __x64_sys_write), use the following section annotation in the C program:

SEC("kprobe/__x64_sys_write")
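Before hard-coding a symbol into SEC(...), it can help to check that the symbol actually exists on the running kernel. The following is only a sketch under my own assumptions: it scans a stream in the /proc/kallsyms layout ("address type symbol [module]"), and the helper names kallsyms_has_symbol and kprobe_section_name are made up for illustration. A symbol being listed does not guarantee that a kprobe can attach to it, so bpftrace -l remains the better check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if `name` appears as a symbol in a kallsyms-style stream
 * ("<address> <type> <symbol> [module]" per line), 0 otherwise.
 * Hypothetical helper, not part of libbpf or bpftrace.
 */
int kallsyms_has_symbol(FILE *f, const char *name)
{
    char line[512];
    char sym[256];

    assert(f != NULL && name != NULL);
    while (fgets(line, sizeof(line), f)) {
        /* Skip the address and the type column, keep the symbol name. */
        if (sscanf(line, "%*s %*s %255s", sym) == 1 && strcmp(sym, name) == 0)
            return 1;
    }
    return 0;
}

/*
 * Build the section name used in SEC(...) for a kprobe on `name`,
 * for example "kprobe/__x64_sys_write". Returns 0 on success and
 * -1 if the buffer is too small.
 */
int kprobe_section_name(char *out, size_t len, const char *name)
{
    int n;

    assert(out != NULL && name != NULL);
    n = snprintf(out, len, "kprobe/%s", name);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

In a real run the stream would come from fopen("/proc/kallsyms", "r"). The symbol names are readable without root on most systems, although the addresses may be zeroed.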

3. How user-space and kernel-space programs communicate

1. File communication

2. Use a ring buffer

libbpf supports ring buffer communication, and its API even exposes an epoll file descriptor, so it can be multiplexed with epoll. I will study it when I need it. At work I mainly pair Go in user space with eBPF C in the kernel, so for now I have only skimmed the API:

LIBBPF_API struct ring_buffer *
ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx,
		 const struct ring_buffer_opts *opts);
LIBBPF_API void ring_buffer__free(struct ring_buffer *rb);
LIBBPF_API int ring_buffer__add(struct ring_buffer *rb, int map_fd,
				ring_buffer_sample_fn sample_cb, void *ctx);
LIBBPF_API int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms);
LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
LIBBPF_API int ring_buffer__epoll_fd(const struct ring_buffer *rb);

Note ring_buffer__epoll_fd in the list above. How to use these APIs for communication between user space and the kernel is left for a follow-up study.
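To get a feel for the consumer side, here is a minimal sketch of a callback with the same shape as ring_buffer_sample_fn, that is, int (*)(void *ctx, void *data, size_t size). The struct event layout and the struct stats context are assumptions of mine, not something defined by libbpf. The kernel side would have to submit records with a matching layout through a BPF_MAP_TYPE_RINGBUF map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout shared with the kernel-side program. */
struct event {
    int pid;
    unsigned long long ts;
};

/* Hypothetical consumer state, passed to the callback through `ctx`. */
struct stats {
    unsigned long seen;
    int last_pid;
};

/*
 * Callback with the ring_buffer_sample_fn signature. A negative
 * return value would stop consumption and be reported to the caller.
 */
int handle_event(void *ctx, void *data, size_t size)
{
    struct stats *st = ctx;
    struct event ev;

    assert(st != NULL && data != NULL);
    if (size < sizeof(ev))
        return 0;               /* ignore records that are too short */

    memcpy(&ev, data, sizeof(ev));
    st->seen++;
    st->last_pid = ev.pid;
    printf("pid=%d ts=%llu\n", ev.pid, ev.ts);
    return 0;
}

/*
 * With libbpf linked in, the wiring would look roughly like this
 * (map_fd is the fd of the ring buffer map):
 *
 *     struct stats st = {0};
 *     struct ring_buffer *rb = ring_buffer__new(map_fd, handle_event, &st, NULL);
 *     while (ring_buffer__poll(rb, 100) >= 0)
 *         ;
 *     ring_buffer__free(rb);
 */
```

Keeping the callback free of libbpf types makes it easy to exercise on its own: call handle_event directly with a hand-built struct event before wiring it into ring_buffer__new.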

3. Use maps

A map is similar to shared memory: both the kernel-side program and the user-space program can read and write it.

Example code:

Kernel-side eBPF program:

// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
/* Copyright (c) 2020 Facebook */
#include <unistd.h>
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "Dual BSD/GPL";

int my_pid = 0;

struct{
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 8192);
    __type(key, pid_t);
    __type(value, __u64);
}mapTest SEC(".maps");

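SEC("kprobe/__x64_sys_write")
int handle_tp(void *ctx)
{
    /* The upper 32 bits hold the tgid, which is the PID seen from user space. */
    pid_t pid = bpf_get_current_pid_tgid() >> 32;
    /* Nanoseconds since boot. */
    __u64 ts = bpf_ktime_get_ns();
    /* BPF_ANY: create the entry or overwrite an existing one. */
    int ret = bpf_map_update_elem(&mapTest, &pid, &ts, BPF_ANY);
    char msg[] = "hello:%d;ret:%d;pidpr:%p\n";

    /* The formatted message goes to the kernel trace buffer. */
    bpf_trace_printk(msg, sizeof(msg), pid, ret, &pid);
    return 0;
}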

Here the handler runs whenever a process makes a write system call. It takes the process ID and a kernel timestamp (bpf_ktime_get_ns returns nanoseconds since boot, not wall-clock time) and stores them in the map, keyed by PID. The map definition:

struct{
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 8192);
    __type(key, pid_t);
    __type(value, __u64);
}mapTest SEC(".maps");

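Besides updating the map, the handler calls bpf_trace_printk, whose output lands in the kernel trace buffer and can be read from this file:

#define DEBUGFS "/sys/kernel/debug/tracing/trace"

The user-space program keeps reading the trace output (it uses trace_pipe in the same directory) and traverses the map each time it reads something.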
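The user-space program below only prints the raw trace text. If the values are needed as numbers, each line can be parsed. This is a sketch that assumes the message format used by the handler above ("hello:%d;ret:%d;pidpr:%p") and searches for the "hello:" marker instead of relying on the exact layout of the trace prefix, which differs between kernel versions. The function name parse_hello_line is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract pid and ret from one line of trace output that carries the
 * "hello:%d;ret:%d;pidpr:%p" message. Returns 0 on success and -1 if
 * the line does not contain a well-formed message.
 */
int parse_hello_line(const char *line, int *pid, int *ret)
{
    const char *msg;

    assert(line != NULL && pid != NULL && ret != NULL);
    msg = strstr(line, "hello:");
    if (!msg)
        return -1;
    return sscanf(msg, "hello:%d;ret:%d;", pid, ret) == 2 ? 0 : -1;
}
```

A read from trace_pipe can return several lines at once, so the buffer would need to be split on '\n' before each line is passed to this function.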

Compile the c program to bytecode:

/usr/bin/clang-14 -g -O2 -target bpf  -D__TARGET_ARCH_x86_64 -c data.c -o kernel_write.o

User-space program:

It loads the bytecode, then reads what the kernel-side eBPF program wrote to the trace file and to the map.

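/* SPDX-License-Identifier: GPL-2.0 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/resource.h>
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include "common.h"

static const char *__doc__ = "Load a kprobe program and dump its trace output and map\n";


#define DEBUGFS "/sys/kernel/debug/tracing/"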

static void bump_memlock_rlimit(void)
{
    struct rlimit rlim_new = {
            .rlim_cur	= RLIM_INFINITY,
            .rlim_max	= RLIM_INFINITY,
    };


    if (setrlimit(RLIMIT_MEMLOCK, &rlim_new)) {
        fprintf(stderr, "Failed to increase RLIMIT_MEMLOCK limit!%s\n", strerror(errno));
        exit(1);
    }
}


void read_trace_pipe(struct bpf_object *obj)
{
    int trace_fd;

    trace_fd = open(DEBUGFS "trace_pipe", O_RDONLY, 0);
    printf("%s\n", DEBUGFS "trace_pipe");
    if (trace_fd < 0) {
        printf("%s\n", strerror(errno));
        return;
    }
    struct bpf_map *map = bpf_object__find_map_by_name(obj, "mapTest");
    if (!map) {
        close(trace_fd);
        return;
    }

    /* Validate the fd before entering the loop, which never returns. */
    int map_fd = bpf_map__fd(map);
    if (map_fd < 0) {
        close(trace_fd);
        return;
    }

    while (1) {
        static char buf[4096];
        ssize_t sz;

        sz = read(trace_fd, buf, sizeof(buf) - 1);
        if (sz > 0) {
            buf[sz] = 0;
            puts(buf);

            /* Restart from the first entry on every pass: a NULL key means "first". */
            pid_t *look_key = NULL;
            pid_t cur_key = 0;
            pid_t next_key = 0;
            __u64 value = 0;

            while (bpf_map_get_next_key(map_fd, look_key, &next_key) == 0) {
                /* The value is the timestamp stored by the kernel-side handler. */
                if (bpf_map_lookup_elem(map_fd, &next_key, &value) == 0) {
                    printf("pid:%d ts:%llu\n", next_key, (unsigned long long)value);
                }
                cur_key = next_key;
                look_key = &cur_key;
            }
            sleep(4);
        }
    }
}


int main(int argc, char **argv)
{
    char msg[255];

    bump_memlock_rlimit();


    struct bpf_object * obj = bpf_object__open_file("./kernel_write.o", NULL);
    if (libbpf_get_error(obj)) {
        fprintf(stderr, "open object file error!\n");
        return -1;
    }

    struct bpf_program *prog = bpf_object__find_program_by_name(obj, "handle_tp");
    if (!prog) {
        fprintf(stderr, "ERROR: finding a prog in obj file failed\n");
        return -1;
    }

    int ret = bpf_object__load(obj);
    if (ret != 0) {
        /* libbpf_strerror fills msg and returns an int, so print msg itself. */
        libbpf_strerror(ret, msg, sizeof(msg));
        fprintf(stderr, "load object file error!%s\n", msg);
        return -1;
    }

    struct bpf_link* link = bpf_program__attach(prog);
    if (libbpf_get_error(link)) {
        fprintf(stderr, "ERROR: bpf_program__attach failed\n");
        return -1;
    }


    read_trace_pipe(obj);

    return 0;
}

How to use c language to develop ebpf program