Upgrading Audit in CWPP

The technical background is the Cloud Workload Protection Platform (CWPP). A CWPP is usually based on agents that run permanently on hosts, collect security-related data and events, and send them to a cloud-based service for analysis, so that users can be notified of potential security threats.

One of its subsystems, audit, is a passive-defense security audit system that collects and records behavior events occurring in the system, the kernel, and user processes. It can reliably collect information on security-related events such as log file access, network access, user commands, system calls, and system security events.

As time has passed and the kernel has evolved, the existing problems of audit have become more and more prominent:

  1. High performance overhead: with the payment card security standard (PCI DSS) rules enabled for the service, system throughput drops by 30%, system call performance drops by 50%, context switching performance drops by 10%, and other metrics drop by about 2-10%.
  2. Performance cannot be adjusted flexibly: even when the service is merely started and not actively used, system throughput drops by 20%, system call performance drops by 35%, and other metrics drop by about 2-10%.
  3. Support for Docker containers is insufficient: monitoring of Docker and its services does not achieve effective security discovery.

Components of the audit system

  • auditd: the audit daemon. It writes to disk the audit messages that are generated through the audit kernel interface and triggered by application and system activity.
  • auditctl: controls log generation parameters and kernel settings for the audit interface, as well as the ruleset that determines which events are tracked.
  • aureport: creates custom reports from the audit log.
  • ausearch: searches for specific events in the audit log.
  • audispd: the audit dispatcher daemon. It forwards event notifications to other applications instead of writing them to the audit log on disk.
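
To make the role of a tool like ausearch more concrete, here is a minimal sketch that parses one record in the key=value text format that auditd writes to its log. The sample line (a shortened SYSCALL record) and the name parse_audit_record are illustrative assumptions and do not come from this article or from the audit tools themselves.

```python
import re
import shlex

# Header of an audit record: type=<TYPE> msg=audit(<epoch>.<ms>:<serial>):
HEADER = re.compile(
    r'^type=(?P<type>\S+) msg=audit\((?P<ts>\d+\.\d+):(?P<serial>\d+)\):\s*(?P<body>.*)$'
)

# Hypothetical sample line, shortened to a subset of typical SYSCALL fields.
SAMPLE = (
    'type=SYSCALL msg=audit(1364481363.243:24287): arch=c000003e syscall=2 '
    'success=no exit=-13 pid=3538 uid=1000 comm="cat" exe="/bin/cat" '
    'key="sshd_config"'
)


def parse_audit_record(line):
    """Split one audit log line into its header and key=value fields."""
    match = HEADER.match(line.strip())
    if match is None:
        raise ValueError("not an audit record: %r" % line)
    fields = {}
    # shlex.split keeps quoted values together and strips the quotes.
    for token in shlex.split(match.group("body")):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return {
        "type": match.group("type"),
        "timestamp": float(match.group("ts")),
        "serial": int(match.group("serial")),
        "fields": fields,
    }
```

A search such as "all records whose key is sshd_config" then becomes a filter over parse_audit_record(line)["fields"].get("key") for each line of the log.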

Comparison of 5 alternative technologies

Release year  Collection framework  Performance  Programmable control  Stability  Built into the kernel
2004          Linux Audit           poor         no                    high       yes
2009          SystemTap             good         yes                   low        no
2006          LTTng                 good         no                    low        no
2009          Perf/ftrace           excellent    no                    high       yes
2014          BPF                   excellent    yes                   high       yes

Reasons for low performance

Audit is a data source of its own, separate from kprobes and tracepoints. It implements monitoring through custom hook functions inserted into kernel source code paths such as syscall handling and file operations. It was added to the kernel in 2004, and its performance is poor compared with the other technologies.

Advantages of BPF technology

  • Stable: programs must pass the verifier, which prevents kernel panics caused by bugs
  • Installation-free: nothing needs to be installed, and programs are loaded and unloaded dynamically
  • Programmable control: developers can insert custom code logic

Disadvantages of BPF

  • A newer Linux kernel is required: 3.x kernels have only partial BPF functionality, and 2.x kernels have no BPF support.
  • Older Linux versions have to keep using the original audit or perf implementation.
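
The version-based choice can be sketched as a small helper. It assumes the threshold this article proposes in its solution (4.x and above use BPF, older kernels fall back); the function name choose_backend and the returned labels are hypothetical, and a real agent would probably probe for specific BPF features rather than trust the version string alone.

```python
import re


def choose_backend(kernel_release):
    """Pick a collection backend from a kernel release string.

    kernel_release is a string such as "5.15.0-91-generic", for example
    the value returned by platform.release() on Linux.
    """
    match = re.match(r"(\d+)\.(\d+)", kernel_release)
    if match is None:
        # Unknown format: fall back conservatively to the legacy path.
        return "audit"
    major = int(match.group(1))
    # Assumption from this article: 4.x and above use BPF, older kernels
    # keep the original audit (or perf) implementation.
    return "bpf" if major >= 4 else "audit"
```

For example, choose_backend("3.10.0-1160.el7.x86_64") selects the legacy path, while choose_backend("5.15.0-91-generic") selects BPF.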

A solution to replace Audit with BPF

  1. Use BPF to implement the functions of the existing Linux audit subsystem. See the next section for implementation details.
  2. Choose what to install according to the system version: old systems use the original audit or perf, and systems with 4.x kernels and above use BPF.
  3. Provide a self-monitoring system that monitors configured performance indicators and executes the corresponding mitigation measures when an indicator is exceeded.
  4. Provide a real-time notification mechanism: when a configured item is encountered, notify the other connected systems so that they can coordinate and complete the corresponding mitigation measures.
  5. Provide monitoring inside Docker containers and the ability to aggregate events across multiple containers, especially cgroup and network events, to enhance auditing of Docker.
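
As an illustration of the cross-container aggregation in item 5, the sketch below groups events by container and by event kind. The event layout (a dict with "container" and "kind" keys) and both function names are assumptions made for this example; the article does not define an event schema.

```python
from collections import Counter, defaultdict


def aggregate_by_container(events):
    """Count events per container and per event kind."""
    per_container = defaultdict(Counter)
    for event in events:
        per_container[event["container"]][event["kind"]] += 1
    # Convert to plain dicts so the result is easy to serialize.
    return {name: dict(counts) for name, counts in per_container.items()}


def totals_across_containers(events):
    """Count events per kind over all containers together."""
    return dict(Counter(event["kind"] for event in events))
```

Given three events, two "network" events from container "web" and one "cgroup" event from container "db", aggregate_by_container returns the per-container counts and totals_across_containers returns the host-wide view.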

Implementation details of the Linux audit subsystem with BPF

                   Agent
Policy delivery | Event aggregation | Notification coordination | Performance monitoring
                    BPF
    Ring Buffer ↑    |     ↓ BPF Map
         System call | syscall hook points
     VFS/File system | file operation hook points
               Hardware drivers

  1. BPF attaches monitoring at the same access points that audit monitors: syscalls and file operations.
  2. The agent passes the configured policy to the BPF driver through a BPF map and so controls the monitoring behavior dynamically.
  3. The BPF driver sends events to the monitoring agent in real time through a ring buffer.
  4. The agent is divided into 4 parts: policy delivery, event aggregation, notification coordination, and performance monitoring.
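
The interaction in items 2 and 3 can be sketched as a pure user-space simulation. This is not real BPF code: a dict stands in for the BPF map, a bounded queue stands in for the ring buffer, and the class names, hook names, and drop-on-full behavior are all assumptions made for illustration.

```python
from collections import deque


class SimulatedBpfDriver:
    """User-space stand-in for the kernel-side BPF driver (not real BPF)."""

    def __init__(self, ring_capacity=1024):
        self.policy_map = {}      # plays the role of the BPF map: hook -> enabled
        self.ring = deque()       # plays the role of the ring buffer
        self.ring_capacity = ring_capacity
        self.dropped = 0          # events discarded because the buffer was full

    def on_event(self, hook, detail):
        """Called when a monitored access point fires."""
        if not self.policy_map.get(hook, False):
            return                # hook disabled by policy: record nothing
        if len(self.ring) >= self.ring_capacity:
            self.dropped += 1     # buffer full: count the loss, keep older events
            return
        self.ring.append({"hook": hook, "detail": detail})


class Agent:
    """Stand-in for the host agent that talks to the driver."""

    def __init__(self, driver):
        self.driver = driver

    def deliver_policy(self, policy):
        """Write policy entries into the (simulated) BPF map."""
        self.driver.policy_map.update(policy)

    def poll(self):
        """Drain all pending events from the (simulated) ring buffer."""
        events = list(self.driver.ring)
        self.driver.ring.clear()
        return events
```

With a policy of {"syscall:execve": True, "vfs:open": False}, only execve events reach the agent; flipping "vfs:open" to True in a later deliver_policy call enables file-open events without restarting anything, which is the dynamic control the list describes.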

Host audit system workflow

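Policy delivery
In the backend service, the administrator designs and configures a policy and delivers it
                  ↓
    The host agent receives the policy and writes it to the BPF map
                  ↓
        The host BPF driver executes the policy
                  ↓
The BPF driver collects system call (syscall) and file system operation events

Event aggregation (read from bottom to top)
In the backend service, these events are aggregated and classified
                  ↑
In the backend service, the audit events reported by multiple hosts are collected
                  ↑
        The agent reports the events to the backend service
                  ↑
            The event content is written to the Ring Buffer
                  ↑
The BPF driver collects system call (syscall) and file system operation events
                  ↑
        The host agent follows the requirements of the delivered policy

Notification coordination
The aggregated and classified security events are sent to other security services, such as threat hunting analysis
                  ↓
        The other security services generate corresponding security policies
                  ↓
        The security policies are sent to the system audit service
                  ↓
    The host audit system optimizes and updates its own security policy

Performance monitoring
    The host agent collects the host's own performance metrics
                  ↓
    The monitored hook points are expanded or reduced according to the delivered policy
                  ↓
    The load the audit system places on the host is controlled dynamically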
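
The performance monitoring loop above can be sketched as a single adjustment step. The thresholds, the priority ordering of hook points, and the name adjust_hooks are hypothetical; the article does not specify how hook points are chosen, only that they are expanded or reduced according to policy.

```python
def adjust_hooks(enabled, priority, overhead_pct, high=10.0, low=5.0, min_hooks=1):
    """Return the new set of enabled hook points after one adjustment step.

    enabled:      hook names that are currently active
    priority:     all hook names, most important first
    overhead_pct: measured overhead of the audit system, in percent
    high / low:   hypothetical thresholds for shrinking / expanding
    """
    enabled = set(enabled)
    if overhead_pct > high and len(enabled) > min_hooks:
        # Too expensive: switch off the least important active hook.
        for hook in reversed(priority):
            if hook in enabled:
                enabled.discard(hook)
                break
    elif overhead_pct < low:
        # Headroom available: switch on the most important inactive hook.
        for hook in priority:
            if hook not in enabled:
                enabled.add(hook)
                break
    return enabled
```

Calling this once per measurement interval changes at most one hook point at a time, and the gap between the low and high thresholds keeps the set from flapping when the overhead hovers near a single limit.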

Benefits of the upgrade and replacement

  1. Security information is collected with high performance.
  2. Probes can be loaded dynamically, so system overhead can be allocated dynamically.
  3. Both old and new technology stacks are supported: old systems that cannot run the new technology keep using the old one.
  4. Security threats are detected in real time and are reported or handled promptly.
  5. The audit system can cooperate with the system's other security services.

Origin blog.csdn.net/zmule/article/details/126574104