Network TCP Watermarking Based on Rust and BPF

Network watermarks are used in security scenarios such as DDoS mitigation and network traffic processing. The principle is to add identifying information to packets. Fields added as TCP options are typically not stripped by firewalls or by other devices that modify packets in transit. This article describes an efficient implementation using BPF and Rust. See netoken [1] for the source code. Image watermarking for intellectual property protection is not covered here.

  • The BPF driver is written in C
  • The userland configuration tool and loader are written in Rust with libbpf-rs to speed up development
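To make the watermark idea concrete, here is a minimal user-space sketch of how a token could be laid out as a TCP option. It assumes the standard kind/length/value option layout and uses option kind 253, one of the two kinds reserved for experiments. The function name `encode_token_option` and the NOP padding are illustrative assumptions and are not taken from the netoken source.

```rust
/// TCP option kind reserved for experiments (253 and 254 are set aside for this).
const TCPOPT_EXPERIMENT: u8 = 253;
/// No-operation option, used here only as padding.
const TCPOPT_NOP: u8 = 1;

/// Encode `token` as a kind/length/value TCP option, padded with NOPs so the
/// result is a multiple of 4 bytes (the TCP data offset counts 32-bit words).
fn encode_token_option(token: u64) -> Vec<u8> {
    let value = token.to_be_bytes(); // network byte order
    let len = 2 + value.len(); // kind byte + length byte + value
    let mut opt = Vec::with_capacity(12);
    opt.push(TCPOPT_EXPERIMENT);
    opt.push(len as u8);
    opt.extend_from_slice(&value);
    while opt.len() % 4 != 0 {
        opt.push(TCPOPT_NOP);
    }
    opt
}

fn main() {
    let opt = encode_token_option(0x0102_0304_0506_0708);
    println!("{:02x?}", opt);
    println!("extra data-offset words: {}", opt.len() / 4);
}
```

An 8-byte token occupies 10 bytes as an option and 12 bytes after padding, so the TCP data offset grows by three 32-bit words.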

Dependency encapsulation

The project is built on libbpf-rs, which provides API abstractions for development, including resource abstractions for the driver object and its programs (prog), communication channels (map), and attached programs (link).

libbpf-sys wraps the unsafe libbpf, libelf, and zlib. Its core is a statically linked library built from libbpf, which is maintained as part of the kernel. libelf and zlib, which are used to load and parse the driver's ELF file, are linked dynamically.

The project directory is generated from the libbpf-cargo scaffolding, and the build is completed by its gen and make commands.

Project directory structure

Directory layout, with the skeleton code generated automatically by libbpf-cargo:

netoken\
  src\
    bpf\                  // driver
      .output\            // generated by the scaffolding
        netoken.skel.rs  //=>libbpf_rs
      netoken.c              ^
      vmlinux.h               |
    main.rs  //=>libbpf_rs    |
build.rs     //=>libbpf_cargo-+
Cargo.toml
libbpf-rs\
  .git
  libbpf-rs\
  libbpf-cargo\

The specific location of libbpf-xxx can be configured in Cargo.toml

[dependencies]
libbpf-rs = { path = "../libbpf-rs/libbpf-rs" }
[build-dependencies]
libbpf-cargo = { path = "../libbpf-rs/libbpf-cargo" }

Skeleton code generation process

  1. The user project's build.rs calls libbpf-cargo's SkeletonBuilder(), which turns the .bpf.c file into .output/*.skel.rs
  2. What the generated *.skel.rs contains:
    1. obj: DATA [u8]
    2. SkelBuilder()->OpenSkel()->*Skel(), progs/maps/links
    3. build_skel_config()->ObjectSkeletonConfigBuilder::new(DATA)
    4. builder.name().map("").prog("handle_").build()
  3. libbpf-rs provides common tools for users
    1. object/skeleton
    2. prog/map/link
    3. iter/print/query/util
    4. perf/ringbuf
  4. libbpf-cargo generates the skeleton automatically during cargo build, and also provides the gen and make commands
  5. Runtime flow of the generated *.skel.rs
    1. TcSkelBuilder.open()->OpenTcSkel(obj, config)
    2. OpenTcSkel.load() >> bpf_object__load_skeleton(config)
    3. OpenTcSkel.load()->TcSkel(obj,config, Tclinks())
    4. OpenTcSkel.progs()->OpenTcProgs(obj)
    5. OpenTcSkel.maps()->OpenTcMaps()
    6. OpenTcSkel.data()->ffi::c_void()
    7. TcSkel.attach() >> bpf_object__attach_skeleton(config)
    8. TcSkel.links = TcLinks(handle_tc)
  6. libbpf.h (bos below abbreviates struct bpf_object_skeleton)
    • libbpf wraps the kernel's bpf_xxx interfaces in 3 structs and 4 API phases, hiding many kernel details
    • error/print/open_opts/
    • bpf_object_open_[buffer/mem/xattr]xxx
    • bpf_object__load/__next/__set/pin/unpin
    • bpf_[program/map/link]__set/load/fd/xxx;__attach_xxx
    • bpf_map__[set/get/find_map]
    • bpf_perf/kprobe/uprobe/tracepoint/link_xdp/tc_hook
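The open, load, and attach phases in step 5 map naturally onto Rust's typestate pattern: each phase is a distinct type, so calling attach before load does not compile. The sketch below models only that idea with hypothetical stand-in types (`DemoSkelBuilder`, `OpenDemoSkel`, `DemoSkel`). It does not call libbpf and is not the generated skeleton code.

```rust
/// Phase 0: nothing opened yet (stands in for a generated `*SkelBuilder`).
#[derive(Default)]
struct DemoSkelBuilder;

/// Phase 1: object opened but not loaded; settings could still be changed here.
struct OpenDemoSkel {
    name: String,
}

/// Phase 2: object loaded; programs can now be attached.
struct DemoSkel {
    name: String,
    attached: bool,
}

impl DemoSkelBuilder {
    fn open(self) -> Result<OpenDemoSkel, String> {
        Ok(OpenDemoSkel { name: "demo".to_string() })
    }
}

impl OpenDemoSkel {
    /// Consumes the open object, so it cannot be reused after loading.
    fn load(self) -> Result<DemoSkel, String> {
        Ok(DemoSkel { name: self.name, attached: false })
    }
}

impl DemoSkel {
    fn attach(&mut self) -> Result<(), String> {
        self.attached = true;
        Ok(())
    }
}

fn main() -> Result<(), String> {
    let open = DemoSkelBuilder::default().open()?;
    let mut skel = open.load()?;
    skel.attach()?;
    println!("{} attached: {}", skel.name, skel.attached);
    Ok(())
}
```

Because `load` takes `self` by value, the open-phase handle is gone once the object is loaded, which mirrors how OpenTcSkel turns into TcSkel in the flow above.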

User Mode Loading Process

  1. Parse the network interface name from the command-line arguments and look up its interface index in the system
  2. Use the generated SkelBuilder to open and load the driver object step by step and obtain the program's file descriptor (fd)
  3. Use TcHookBuilder to create a hook on the TC ingress or egress path. This program uses only egress
  4. Attach the egress hook; the driver is now in the kernel and starts executing
  5. Query the TC egress hook for information about the attached program

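use std::time::Duration;

use anyhow::{bail, Result};
use clap::Parser;
use libbpf_rs::{PerfBufferBuilder, TcHookBuilder, TC_EGRESS};

// Command, bump_memlock_rlimit, handle_event, handle_lost_events and the
// generated skeleton module are defined elsewhere in the project.

fn main() -> Result<()> {
    // Parse the command-line arguments
    let opts = Command::parse();

    // Remove the system's locked-memory limit
    bump_memlock_rlimit()?;

    // Get the program fd from the skeleton and the ifindex from opts
    let builder = netokenSkelBuilder::default();
    let open = builder.open()?;
    // `mut` is required because maps_mut() is called below
    let mut skel = open.load()?;
    let fd = skel.progs().handle_tc().fd();
    let ifidx = nix::net::if_::if_nametoindex(opts.iface.as_str())? as i32;

    let mut tc_builder = TcHookBuilder::new();
    tc_builder
        .fd(fd)
        .ifindex(ifidx)
        .replace(true)
        .handle(1)
        .priority(1);

    // Create the hook for the driver on the TC egress path
    let mut egress = tc_builder.hook(TC_EGRESS);

    // Attach: the driver starts working
    if opts.attach {
        if let Err(e) = egress.attach() {
            bail!("failed to attach egress hook {}", e);
        }
    }

    // Destroy: detach and remove the driver
    if opts.destroy {
        if let Err(e) = egress.detach() {
            println!("failed to detach egress hook {}", e);
        }

        if let Err(e) = egress.destroy() {
            println!("failed to destroy {}", e);
        }
    }

    // Query: look up the id of the attached driver program
    match egress.query() {
        Err(e) => println!("failed to find egress hook: {}", e),
        Ok(prog_id) => println!("found egress hook prog_id: {}", prog_id),
    }

    // Listen on the perf buffer: samples go to handle_event,
    // lost-sample notifications go to handle_lost_events
    let perf = PerfBufferBuilder::new(skel.maps_mut().events())
        .sample_cb(handle_event)
        .lost_cb(handle_lost_events)
        .build()?;

    // Poll the driver every 100 ms; events are dispatched to the callbacks above.
    // The loop never returns, so no trailing Ok(()) is needed.
    loop {
        perf.poll(Duration::from_millis(100))?;
    }
}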

This is similar to what the tc tool does:

$ tc qdisc add dev xxx 
$ tc filter [add|change|replace] dev xxx 
$ tc qdisc show dev xxx

Driver process

  1. Check whether the input struct __sk_buff holds a TCP packet by parsing and validating layer 2 through layer 4 in turn
  2. Check whether the TCP packet is a handshake SYN packet. The SYN packet carries the options that the two sides use to negotiate features
  3. Read the policy map to obtain the token for the current policy
  4. Add the token option to the SYN packet
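The first two checks can be prototyped in user space before touching BPF. The sketch below walks a raw Ethernet frame held in a byte slice and reports whether it is an initial TCP SYN over IPv4. The names `is_tcp_syn` and `demo_frame` are illustrative, and the slice bounds checks play the role that the data_end comparisons play in the driver.

```rust
const ETH_HLEN: usize = 14;
const ETH_P_IP: u16 = 0x0800;
const IPPROTO_TCP: u8 = 6;
const TCP_FLAG_SYN: u8 = 0x02;
const TCP_FLAG_ACK: u8 = 0x10;

/// Returns true if `frame` is an Ethernet/IPv4/TCP packet with SYN set and ACK clear.
fn is_tcp_syn(frame: &[u8]) -> bool {
    // Layer 2: need a full Ethernet header plus a minimal IPv4 header.
    if frame.len() < ETH_HLEN + 20 {
        return false;
    }
    let ether_type = u16::from_be_bytes([frame[12], frame[13]]);
    if ether_type != ETH_P_IP {
        return false;
    }
    // Layer 3: the IPv4 header length comes from the IHL field (32-bit words).
    let ip = &frame[ETH_HLEN..];
    let ihl = ((ip[0] & 0x0f) as usize) * 4;
    if ihl < 20 || ip[9] != IPPROTO_TCP {
        return false;
    }
    // Layer 4: the flags byte sits at offset 13 of the TCP header.
    let tcp = match ip.get(ihl..ihl + 20) {
        Some(t) => t,
        None => return false,
    };
    let flags = tcp[13];
    flags & TCP_FLAG_SYN != 0 && flags & TCP_FLAG_ACK == 0
}

/// Build a minimal 54-byte frame with the given TCP flags (demo helper).
fn demo_frame(tcp_flags: u8) -> Vec<u8> {
    let mut f = vec![0u8; ETH_HLEN + 20 + 20];
    f[12..14].copy_from_slice(&ETH_P_IP.to_be_bytes());
    f[ETH_HLEN] = 0x45; // version 4, IHL 5
    f[ETH_HLEN + 9] = IPPROTO_TCP;
    f[ETH_HLEN + 20 + 12] = 0x50; // data offset 5
    f[ETH_HLEN + 20 + 13] = tcp_flags;
    f
}

fn main() {
    println!("SYN: {}", is_tcp_syn(&demo_frame(TCP_FLAG_SYN)));
    println!("SYN+ACK: {}", is_tcp_syn(&demo_frame(TCP_FLAG_SYN | TCP_FLAG_ACK)));
}
```

Only the client's first SYN passes the check; the SYN+ACK reply from the server is rejected, which matches the syn == 1 and ack == 0 condition used by the driver.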

// Driver entry point; the kernel has already assembled the data into __sk_buff
SEC("tc")
int handle_tc(struct __sk_buff* ctx) {
    struct pkthdr pkt;

    RET_IF(pkt_check(ctx, &pkt) != RET_OK);
    RET_IF(pkt.tcp->syn != 1 || pkt.tcp->ack != 0);
    update_token_by_policy();
    RET_IF(extend_options_token(ctx, &pkt, epp_token) != RET_OK);

    return TC_ACT_OK;
}

// Check whether this is a TCP packet
BPF_INLNE int pkt_check(struct __sk_buff* ctx, struct pkthdr* pkt) {
    pkt->data = (void*)(long)ctx->data;
    pkt->data_end = (void*)(long)ctx->data_end;
    pkt->eth = pkt->data;
    pkt->ipv4 = pkt->data + sizeof(struct ethhdr);

    RET_ERR_IF(pkt->eth + 1 > (struct ethhdr*)(pkt->data_end));
    RET_ERR_IF(pkt->eth->h_proto != bpf_constant_htons(ETH_P_IP));
    RET_ERR_IF(pkt->ipv4 + 1 > (struct iphdr*)(pkt->data_end));
    RET_ERR_IF(pkt->ipv4->protocol != IPPROTO_TCP);
    pkt->tcp = pkt->data + sizeof(struct ethhdr) + (pkt->ipv4->ihl * 4);
    RET_ERR_IF(pkt->tcp + 1 > (struct tcphdr*)(pkt->data_end));

    return RET_OK;
}

// Append the token option to the TCP options. The L4 checksum updates are commented out
// to speed up execution on NICs that compute the checksum via offload
BPF_INLNE int extend_options_token(struct __sk_buff* ctx, struct pkthdr* pkt, u64 token) {
    u32 data_end = ctx->len; // total packet length, including non-linear data
    u16 sz = sizeof(token);
    pkt->ipv4->tot_len = bpf_htons(pkt->ipv4->ihl * 4 + pkt->tcp->doff * 4 + sz);
    pkt->tcp->doff = pkt->tcp->doff + sz / 4;

    RET_IF(bpf_skb_change_tail(ctx, ctx->len + sz, 0));
    RET_IF(bpf_skb_store_bytes(ctx, data_end, &token, sizeof(token), 0));

    RET_IF(bpf_l3_csum_replace(ctx, IP_CSUM_OFFSET, 0, bpf_constant_htons(sz), 0));
    // RET_IF(bpf_l4_csum_replace(ctx, TCP_CSUM_OFFSET, 0, sz / 4, BPF_F_PSEUDO_HDR | sizeof(u8)))

    u16 csum = bpf_csum_diff(0, 0, (u32*)&token, sizeof(token), 0); // 2 tcp pseudo
    // RET_IF(bpf_l4_csum_replace(ctx, TCP_CSUM_OFFSET, 0, csum, 0));

    update_metrics();

    return RET_OK;
}
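Growing the packet changes tot_len, so the IPv4 header checksum has to be patched. The bpf_l3_csum_replace helper used above does this incrementally in the kernel; the arithmetic behind it can be sketched in user space as follows. The function names are illustrative, the sample header bytes are made up, and the update rule is the standard one's-complement incremental form HC' = ~(~HC + ~m + m').

```rust
/// One's-complement sum of 16-bit words, folded back into 16 bits.
fn ones_sum(words: impl Iterator<Item = u16>) -> u16 {
    let mut sum: u32 = words.map(u32::from).sum();
    while sum >> 16 != 0 {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    sum as u16
}

/// Full Internet checksum over a header whose checksum field is zeroed.
fn checksum(header: &[u8]) -> u16 {
    let words = header
        .chunks(2)
        .map(|c| u16::from_be_bytes([c[0], *c.get(1).unwrap_or(&0)]));
    !ones_sum(words)
}

/// Incremental update when one 16-bit field changes from `old` to `new`.
fn checksum_update(old_csum: u16, old: u16, new: u16) -> u16 {
    !ones_sum([!old_csum, !old, new].iter().copied())
}

fn main() {
    // 20-byte IPv4 header with tot_len = 40 and the checksum field zeroed.
    let mut hdr: [u8; 20] = [
        0x45, 0, 0, 40, 0x1c, 0x46, 0x40, 0, 64, 6, 0, 0, 10, 0, 0, 1, 10, 0, 0, 2,
    ];
    let old_csum = checksum(&hdr);

    // Appending an 8-byte token grows tot_len from 40 to 48.
    let (old_len, new_len) = (40u16, 48u16);
    hdr[2..4].copy_from_slice(&new_len.to_be_bytes());

    let incremental = checksum_update(old_csum, old_len, new_len);
    let recomputed = checksum(&hdr);
    println!("incremental {:#06x}, recomputed {:#06x}", incremental, recomputed);
    assert_eq!(incremental, recomputed);
}
```

The incremental form touches only the changed field, which is why it is cheap enough to run per packet.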

Extending to XDP

libbpf-rs provides only an attach_xdp interface; other XDP functionality is implemented independently in the libbpf project.

If you need the advanced features of AF_XDP, you can try libxdp-rs, which is developed by Tencent employees. It is mainly a Rust binding for xdp-tools, which includes the independently developed libxdp.

A brief introduction to other Rust BPF projects such as aya

libbpf-rs has low complexity and a low barrier to entry, and it contains little unsafe code. aya and redbpf aim higher, with greater capabilities and ambitions, so they are naturally harder to use.

First, they write the driver itself in Rust: there is no std, and there is more unsafe code and MaybeUninit, which makes me uneasy.

In addition, there are not many successful production cases yet. They are worth considering if the business logic is complex. There is no rush, though: Rust support was merged into the mainline kernel in Linux 6.1 (the release once expected to be numbered 5.20 became 6.0), which is a reasonable point to start.

The driver shows off the expressiveness of Rust:

#![no_std] // the driver cannot use the standard library
#![no_main] // there is no conventional main entry point

use aya_bpf::{ macros::{map, xdp}, bindings::xdp_action, programs::XdpContext,
    maps::PerfEventArray, };
use aya_log_ebpf::info;
use myapp_common::PacketLog;

#[map(name = "EVENTS")]  // map macro
static mut EVENTS: PerfEventArray<PacketLog> =
    PerfEventArray::<PacketLog>::with_max_entries(1024, 0);

#[xdp(name="myapp")] // the hook point is declared with a macro, very Rust-like
pub fn myapp(ctx: XdpContext) -> u32 {
    // matching on the result is also intuitive
    match unsafe { try_myapp(ctx) } {
        Ok(ret) => ret,
        Err(_) => xdp_action::XDP_ABORTED,
    }
}

unsafe fn try_myapp(ctx: XdpContext) -> Result<u32, u32> {
    // convenient printk-style logging macro
    info!(&ctx, "received a packet");

    unsafe {
        // building log_entry (a PacketLog) from the packet is omitted here...
        EVENTS.output(&ctx, &log_entry, 0);
    }
    Ok(xdp_action::XDP_PASS)
}

#[panic_handler] // handler invoked on panic, required under no_std
fn panic(_info: &core::panic::PanicInfo) -> ! {
    unsafe { core::hint::unreachable_unchecked() }
}

User space

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let mut bpf = Bpf::load(include_bytes_aligned!(
        "../../target/bpfel-unknown-none/release/myapp"
    ))?;
    // look the program up by the name given in #[xdp(name="myapp")]
    let program: &mut Xdp = bpf.program_mut("myapp").unwrap().try_into()?;
    program.load()?;
    program.attach(&opt.iface, XdpFlags::default())
        .context("failed to attach the XDP program with default flags")?;

    ...

    let mut perf_array = AsyncPerfEventArray::try_from(bpf.map_mut("EVENTS")?)?;

    // the code below could be wrapped up and simplified further
    for cpu_id in online_cpus()? { // iterate over the Vec<u32> of online CPUs
        let mut buf = perf_array.open(cpu_id, None)?;

        task::spawn(async move {
            let mut buffers = (0..10)
                .map(|_| BytesMut::with_capacity(1024))
                .collect::<Vec<_>>();

            loop {
                let events = buf.read_events(&mut buffers).await.unwrap();
                for i in 0..events.read {
                    let buf = &mut buffers[i];
                    let ptr = buf.as_ptr() as *const PacketLog;
                    let data = unsafe { ptr.read_unaligned() };
                    let src_addr = net::Ipv4Addr::from(data.ip_src);
                    println!("LOG: SRC {}, ACTION {}", src_addr, data.action);
                }
            }
        });
    }
    signal::ctrl_c().await.expect("failed to listen for event");
    Ok::<_, anyhow::Error>(())
}
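The read_unaligned step in the loop above is the part that is easiest to get wrong, so here it is in isolation. The sketch assumes a `#[repr(C)]` event type with two u32 fields, matching the ip_src and action fields printed above. The real layout of the shared type is not shown in this article, so `PacketLog` and `decode_event` here are stand-ins.

```rust
use std::mem::size_of;
use std::net::Ipv4Addr;

/// Stand-in for the event type shared between the driver and user space.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct PacketLog {
    ip_src: u32,
    action: u32,
}

/// Decode one event from a raw buffer, checking the length first.
fn decode_event(buf: &[u8]) -> Option<PacketLog> {
    if buf.len() < size_of::<PacketLog>() {
        return None;
    }
    // SAFETY: the buffer holds at least size_of::<PacketLog>() bytes, every bit
    // pattern is a valid PacketLog, and read_unaligned needs no alignment.
    Some(unsafe { (buf.as_ptr() as *const PacketLog).read_unaligned() })
}

fn main() {
    // Simulate what the kernel side would write: two native-endian u32 values.
    let mut raw = Vec::new();
    raw.extend_from_slice(&u32::from(Ipv4Addr::new(10, 0, 0, 1)).to_ne_bytes());
    raw.extend_from_slice(&2u32.to_ne_bytes());

    let ev = decode_event(&raw).expect("buffer too short");
    println!("LOG: SRC {}, ACTION {}", Ipv4Addr::from(ev.ip_src), ev.action);
}
```

Checking the length before the cast keeps a short or truncated sample from being read out of bounds.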

The above is only a small part; aya supports more program types:

  • Probes
  • Tracepoints
  • Socket Programs
  • Classifiers
  • Cgroups
  • XDP
  • LSM

The rest of this article serves as an API reference.

APIs of libbpf-rs

    struct bpf_map_skeleton { *name, **map }
    struct bpf_prog_skeleton { *name, **prog }
    struct bpf_object_skeleton { **obj, *maps(skel), *progs(skel) }
   
    int bpf_object__open_skeleton(bos *s, *opts);
    int bpf_object__load_skeleton(bos *s);
    int bpf_object__attach_skeleton(bos *s);
    void bpf_object__detach_skeleton(bos *s);
    void bpf_object__destroy_skeleton(bos *s);

build.rs automatically generates the code template tc.skel.rs

What is finally returned to the user is TcSkel, which contains progs, maps, and data; its fields are generated from the maps in the .bpf.c file. The generation call is libbpf-cargo lib.rs: SkeletonBuilder().build_and_generate(&skel)

pub struct 
    TcSkelBuilder.ObjectBuilder,
    OpenTcSkel.OpenObject,
    TcSkel.Object,

    OpenTcProgs.OpenObject,
    TcProgs.Object,

    OpenTcMaps.OpenObject,
    TcMaps.Object,

    TcLinks.Option<Link>,

TcSkelBuilder.open()->OpenTcSkel(obj, config)
OpenTcSkel.load() >> bpf_object__load_skeleton(config)
OpenTcSkel.load()->TcSkel(obj,config, Tclinks())
    OpenTcSkel.progs()->OpenTcProgs(obj)
    OpenTcSkel.maps()->OpenTcMaps()
    OpenTcSkel.data()->ffi::c_void()

TcSkel.attach() >> bpf_object__attach_skeleton(config)
TcSkel.links = TcLinks(handle_tc)

TcProgs.handle_tc
TcMaps.[ports, data, rodata]

bindings.rs automatically generated by rust-bindgen

bindings.rs is generated from 10 header files according to the rules in build.rs, which also runs process::Command::new("make"): bpf.h, libbpf.h, btf.h, xsk.h, bpf_helpers.h, bpf_helper_defs.h, bpf_tracing.h, bpf_endian.h, bpf_core_read.h, libbpf_common.h

libbpf-rs functions

Provides a range of tools

tc.rs

TcHookBuilder-> TcHook
    tc_builder
        .fd(fd)
        .ifindex(ifidx)
        .replace(true)
        .handle(1)
        .priority(1);

TcHook
    tc_hook

skeleton.rs

SkelConfig wraps map and prog.

ObjectSkeletonConfig is a wrapper around libbpf_sys::bpf_object_skeleton.

_data and _string_pool hold memory so that it lives as long as the object.

The progs and maps layouts are kept so the memory can be freed on drop.

ObjectSkeletonConfigBuilder.build()->ObjectSkeletonConfig()
    libbpf_sys::bpf_object_skeleton()
    .build_maps(s, string_pool)->maps_layout
    .build_progs(s, string_pool)->progs_layout

/// Wraps libbpf_sys::bpf_object_skeleton. This struct will:
/// * ensure lifetimes are valid for dependencies (pointers, data buffer)
/// * free any allocated memory on drop
pub struct ObjectSkeletonConfig<'a> {
    inner: bpf_object_skeleton,
    obj: Box<*mut bpf_object>,
    maps: Vec<MapSkelConfig>,
    progs: Vec<ProgSkelConfig>,
    /// Layout necessary to `dealloc` memory
    maps_layout: Option<Layout>,
    /// Same as above
    progs_layout: Option<Layout>,
    /// Hold this reference so that compiler guarantees buffer lives as long as us
    _data: &'a [u8],
    /// Hold strings alive so pointers to them stay valid
    _string_pool: Vec<CString>,
}
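The _string_pool field deserves a closer look: the C struct stores raw name pointers, so something on the Rust side must own the strings for as long as those pointers are in use. Below is a minimal sketch of that ownership trick with hypothetical names (`NameTable`, `add`) and no libbpf involved.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Owns the strings and hands out raw pointers that a C struct could store.
struct NameTable {
    /// Keeps every CString alive; dropping the table frees them all.
    string_pool: Vec<CString>,
    /// Raw views into the pool, valid only while `string_pool` is alive.
    ptrs: Vec<*const c_char>,
}

impl NameTable {
    fn new() -> Self {
        NameTable { string_pool: Vec::new(), ptrs: Vec::new() }
    }

    /// Store `name` and return a pointer to its NUL-terminated bytes.
    fn add(&mut self, name: &str) -> *const c_char {
        let s = CString::new(name).expect("name contains an interior NUL");
        // The pointer targets the CString's heap buffer, which does not move
        // when the CString value itself is moved into the Vec.
        let p = s.as_ptr();
        self.string_pool.push(s);
        self.ptrs.push(p);
        p
    }
}

fn main() {
    let mut table = NameTable::new();
    for name in ["handle_tc", "events", "ports"] {
        table.add(name);
    }
    for &p in &table.ptrs {
        // SAFETY: `table`, and therefore the pool, is still alive here.
        let name = unsafe { CStr::from_ptr(p) };
        println!("{}", name.to_str().unwrap());
    }
}
```

Keeping the pool and the pointers in the same struct ties their lifetimes together, which is the same reason the config struct above holds _string_pool next to the raw skeleton.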

query.rs

for prog in ProgInfoIter::default() {
    println!("{}", prog.name);
}
[Program/Map/Btf/Link]Info

maps.rs

OpenMap.[set_[map_ifindex/fd/...]]
Map.[name/section/fd/key_size/value_size/lookup/delete/update/pin/unpin]

program.rs

OpenProgram.[set_[prog_type/attach_type/ifindex/flags]]
Program.[name/section/fd/pin/unpin/attach[cgroup/perf/uprobe/trace/xdp]]
Link.[open/update_prog/disconnect/pin/fd/detach]
bpf_link_type.[xdp/perf_event/cgroup/raw/trace]

object.rs

OpenObject: opened but not yet loaded [bpf_object/maps/progs/name/map/prog/load]; populates obj.maps/obj.progs
Object: open and loaded object
ObjectBuilder.[name/debug/opts/open_file/open_mem]->OpenObject::new()

ringbuf.rs

RingBuffer.[ring_buffer/poll/consume]
RingBufferBuilder.[RingBufferCallback/add/build]->RingBuffer()

util.rs

str_to_cstring/path_to_cstring/c_ptr_to_string
roundup/num_possible_cpus
parse_ret/parse_ret_i32/parse_ret_usize
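To show the kind of helper these names refer to, here is a rough sketch of a round-up function and of turning a C-style return code into a Result. The signatures are assumptions made for illustration and are not copied from the libbpf-rs source.

```rust
use std::io;

/// Round `num` up to the next multiple of `r` (`r` must be non-zero).
fn roundup(num: usize, r: usize) -> usize {
    ((num + (r - 1)) / r) * r
}

/// Turn a C-style return code (negative errno on failure) into a Result.
fn parse_ret(ret: i32) -> Result<i32, io::Error> {
    if ret < 0 {
        Err(io::Error::from_raw_os_error(-ret))
    } else {
        Ok(ret)
    }
}

fn main() {
    println!("roundup(10, 8) = {}", roundup(10, 8));
    match parse_ret(-2) {
        Ok(v) => println!("ok: {}", v),
        Err(e) => println!("error: {}", e),
    }
}
```

Wrapping return codes this way lets callers use the `?` operator instead of checking for negative values after every FFI call.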

libbpf-cargo functions

main.rs

The clap Command variants [Build/Gen/Make] correspond to the three files below; main.rs is only the entry point

lib.rs

Provide automatic build and gen for user project build.rs

    SkeletonBuilder::new().source(SRC).build_and_generate(&skel)
    build()->build::build_single()
    generate()->gen::gen_single()

make.rs

  • batch build and gen
  • build::build()
  • gen::gen()
  • Finally Command::new("cargo").arg("build")

build.rs

build_single() for user project->compile_one()->Command

build() for cargo ->compile()->compile_one()
    extract_libbpf_headers_to_disk()
    check_progs/check_clang()/

gen.rs

    gen->gen_single->gen_skel(debug, name, obj_file, output, rustfmt_path)->
    gen_skel_contents()
        open_bpf_object()
        gen_skel_c_skel_constructor()->libbpf_rs::skeleton::ObjectSkeletonConfigBuilder::new(DATA); // see skeleton.rs
        map/prog/datasec
            gen_skel_xxx_defs()?; gen_skel_xxx_getter()?; gen_skel_link_getter()
            gen_skel_attach()->libbpf_sys::bpf_object__attach_skeleton()

metadata.rs

to_compile, used when running under cargo
get()->target_dir, metadata.target_directory.into_std_path_buf()
    after iterating over all packages, if id == &package.id
    get_package()
[1] netoken


Origin blog.csdn.net/zmule/article/details/126549532