Expert sharing | How to build an Alluxio audit log analysis system

Authors: Geng Yuntao, Ge Dali

Big data technology evolves rapidly, and data construction concepts such as the data lake, data middle platform, logical data lake, data fabric, and data orchestration keep emerging and maturing. All of them aim to sort scattered data into a unified view with unified standards, so that data governance can be achieved at the business level.

Whatever the data construction concept, data security is an unavoidable topic, and the ability to guarantee data security is the entry threshold that determines whether a data platform can truly be delivered.

Data security is generally addressed along four dimensions: authentication, authorization, auditing, and encryption. Authentication verifies the user's identity; strict authorization protects data from unauthorized access; auditing records data access so that potentially illegal access can be discovered; and encryption secures both persisted data and the data transmission links.

Each of these dimensions is a broad topic in its own right. From the auditing perspective, this article shares some hands-on practices around Alluxio auditing and the points to watch out for.

As a cloud-native data orchestration platform, Alluxio provides efficient, unified data management and orchestration across heterogeneous infrastructure environments (on-premises, hybrid cloud, public cloud) and serves big data and AI applications. Alluxio is widely used in the construction of data platforms such as data lakes, data middle platforms, logical data lakes, and data fabrics, and it is a great help to enterprises and organizations dealing with problems such as multi-platform hybrid architectures, data fragmentation, and the adaptation complexity brought about by technological change.

In these new data platform architectures, Alluxio usually serves as the unified data access service layer: its orchestration capability presents storage platforms scattered across regions and technologies to upper-layer compute and applications through a unified interface and unified metadata. On top of Alluxio, unified data governance can be built further, effectively solving data fragmentation and guaranteeing a unified view and unified standards from the application's perspective.

As the unified access layer of the data platform, Alluxio carries all data reads and writes. It therefore becomes particularly important that Alluxio, as the data access portal, records users' data access behavior and supports subsequent auditing of that behavior.

Before explaining how to build an Alluxio audit log analysis system, consider two questions and see whether you can answer them right away:
1. Is the 3 PB of data in the data lake accessed frequently? Has the access frequency changed over the past week?
2. Which users are using this data? Is that usage legitimate? If the data's content, format, or retention period were changed, who would be affected?

The answers to both questions are hidden in the audit log, which faithfully records every request made by every user. These records are facts that have actually occurred, and so they are an effective verification of, and complement to, the access-control configuration. The audit log alone is not enough, though: to really answer the questions above, an analysis system is needed to compute and present the metrics behind them. The overall architecture of the audit log analysis system is as follows:

1. Alluxio Audit Log

In this system, the Alluxio audit log plays the role of the data source. Alluxio supports audit logging: the leader master records user behavior in the audit log, which serves auditing, problem retrospection, user behavior analysis, and hot/cold data analysis. Although only the leader master writes the audit log at any given time, leadership may switch between masters, so any master can produce audit logs and all of them need to be collected and analyzed.

2. Audit log analysis system

The audit log analysis system mainly consists of log collection, log storage, log analysis and display, and, where needed, a log management module. Log collection must be adapted to the data source; since the Alluxio audit log is a log file, collection is usually implemented by deploying a collection agent next to each Alluxio master. Log storage can be implemented with various technologies, but it must meet both the detailed log viewing and the statistical analysis (storage and compute) requirements. Log analysis and display provides visualization, triggering queries and presenting results according to user needs. Log management mainly handles possible data quality problems through cleaning, transformation, and so on; if the logs have no quality problems, this module can be omitted.

There are many possible technology stacks for the audit log analysis system; two are most commonly used:

  • Flume + Hive/HDFS + Impala/Presto + a custom analysis and display system

Flume implements the log collection agent and ships audit logs to HDFS; Hive/HDFS plus Impala/Presto handles log storage and query computation; the custom analysis and display system implements the various analysis and presentation requirements on demand.

The advantage of this solution is its flexibility and openness: Hive/HDFS storage can accommodate more customized data structures and storage formats, as well as more flexible data governance and computation/analysis needs.

Because the data has to be written to Hive/HDFS and converted accordingly, the latency of this solution is relatively high; it is typically run as scheduled batch jobs, for example hourly analysis.

  • ELK (Elasticsearch + Logstash + Kibana)

ELK is a widely used open-source log management stack. Logstash implements the log collection agent and ships audit logs to Elasticsearch; Elasticsearch, as an excellent search engine, covers query and computation needs such as search and analysis over the ingested logs; Kibana provides visual log analysis, calling the Elasticsearch APIs to implement log search and dashboard display.

The advantage of the ELK solution is that all components are open source and no additional system-level development is required; it only needs to be configured according to the log format and analysis needs. ELK also offers better timeliness: Elasticsearch is also a capable OLAP engine, so the various analysis needs can be served as soon as the data is written.

On the other hand, the ELK solution is constrained by its fixed component choices. Elasticsearch supports search and OLAP-style analysis well, but it cannot cover some batch analysis and algorithmic analysis, and changing the data structure or rebuilding indexes is relatively troublesome.
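For reference, a minimal Logstash pipeline for the Alluxio audit log could look like the sketch below. This is only an illustrative assumption rather than a configuration from this article: the file path, Elasticsearch host, and index name are placeholders, and the built-in kv filter is used to split the key=value pairs in each audit record.

# Minimal Logstash pipeline sketch (paths, hosts, and index name are assumptions)
input {
  file {
    path => "/mnt/alluxio/log/master_audit.log"    # Alluxio master audit log, adjust to your deployment
    start_position => "beginning"
  }
}

filter {
  kv {
    field_split_pattern => "\s+"                   # audit records are whitespace-separated key=value pairs
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "alluxio-audit-%{+YYYY.MM.dd}"        # one index per day simplifies retention management
  }
}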

To sum up, both solutions have their pros and cons. The choice of technology for the audit log analysis system usually depends on the state of the existing platform and the requirements of the audit log analysis scenario. The following principles are recommended:

  • Make full use of existing technical components and introduce new ones as sparingly as possible. Every new component brings a series of new workloads, from resources to operations: there is a lot of work between deployment and operations monitoring. Reusing the existing technology stack wherever possible effectively reduces resource consumption, operational cost, and complexity.
  • Integrate audit log storage and analysis with the overall construction of the data platform. Audit logs are important data for an enterprise or organization; in particular, audit log analysis is not merely about archiving and playback, but about a data asset with rich analytical value. It is therefore recommended to build it as part of the enterprise's or organization's data lake and data warehouse construction.

Deployment details

1. Enable the Alluxio audit log function

Set the following property in conf/alluxio-site.properties on the Alluxio masters:

alluxio.master.audit.logging.enabled=true

2. Log file format

### Log path
alluxio_home/logs/master_audit.log

2022-09-21 06:24:11,736 INFO  AUDIT_LOG - succeeded=true    allowed=true    ugi=root,root (AUTH=SIMPLE)    ip=/172.31.19.156:54274    cmd=getFileInfo    src=/default_tests_files/BASIC_NON_BYTE_BUFFER_CACHE_ASYNC_THROUGH    dst=null    perm=root:root:rw-r--r--    executionTimeUs=90
2022-09-21 06:24:11,737 INFO  AUDIT_LOG - succeeded=true    allowed=true    ugi=root,root (AUTH=SIMPLE)    ip=/172.31.19.156:54274    cmd=getFileInfo    src=/default_tests_files/BASIC_CACHE_ASYNC_THROUGH    dst=null    perm=root:root:rw-r--r--    executionTimeUs=82

3. Download and configure Flume

The Flume agent configuration (the agent name is pro) is as follows:
pro.sources = s1
pro.channels = c1
pro.sinks = k1

pro.sources.s1.type = exec
pro.sources.s1.command = tail -F -c +0 /mnt/alluxio/log/master_audit.log

pro.sources.s1.interceptors = i1
pro.sources.s1.interceptors.i1.type = search_replace

# Replace runs of whitespace with | to unify the field delimiter
pro.sources.s1.interceptors.i1.searchPattern = \\s+
pro.sources.s1.interceptors.i1.replaceString = |
pro.sources.s1.interceptors.i1.charset = UTF-8

pro.channels.c1.type = memory
pro.channels.c1.capacity = 1000
pro.channels.c1.transactionCapacity = 100

pro.sinks.k1.hdfs.useLocalTimeStamp = true
pro.sinks.k1.type = hdfs
pro.sinks.k1.hdfs.path = hdfs://ip-172-31-25-105.us-west-2.compute.internal:8020/flume/daytime=%Y-%m-%d
pro.sinks.k1.hdfs.filePrefix = events-
pro.sinks.k1.hdfs.fileType = DataStream
pro.sinks.k1.hdfs.round = true
pro.sinks.k1.hdfs.roundValue = 10
pro.sinks.k1.hdfs.minBlockReplicas = 1
pro.sinks.k1.hdfs.roundUnit = minute

pro.sources.s1.channels = c1
pro.sinks.k1.channel = c1
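
Assuming the configuration above is saved as, for example, $FLUME_HOME/conf/alluxio-audit.conf (an illustrative path, not one from this article), the agent named pro can be started with a command along these lines:

# Start the Flume agent "pro" with the configuration above (paths are illustrative)
flume-ng agent \
  --conf $FLUME_HOME/conf \
  --conf-file $FLUME_HOME/conf/alluxio-audit.conf \
  --name pro \
  -Dflume.root.logger=INFO,console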

4. Create the Hive external partitioned table

CREATE EXTERNAL TABLE IF NOT EXISTS auditlogs(
day string,
time string,
loglevel string,
logtype string,
reserved string,
succeeded string,
allowed string,
ugi string,
ugitype string,
ip string,
cmd string,
src string,
dst string,
perm string,
exetime string
)PARTITIONED BY (daytime string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' 
STORED AS TEXTFILE;

5. Copy data into the Hive external table

hdfs dfs -cp /flume/daytime=2022-09-21 /user/hive/warehouse/logsdb.db/auditlogs
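
Copying the files into the table directory does not by itself register the partition in the Hive metastore. If the new data does not show up in queries, add the partition explicitly; the statement below follows the table name and date used in this example:

-- Run in the logsdb database implied by the warehouse path above
USE logsdb;
ALTER TABLE auditlogs ADD IF NOT EXISTS PARTITION (daytime='2022-09-21');
-- Alternatively: MSCK REPAIR TABLE auditlogs;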

6. Hive query results
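
As an illustration of the kind of analysis this table enables (for example, the access-frequency question raised at the beginning of the article), a query along the following lines can be used. The date range and path prefix are illustrative assumptions; note that, because only the search_replace interceptor is applied, the column values still carry their key= prefixes, which regexp_extract strips off here:

-- Daily access counts per operation and user for a given path prefix (illustrative)
SELECT
  daytime,
  regexp_extract(cmd, 'cmd=(.*)', 1)    AS operation,
  regexp_extract(ugi, 'ugi=([^,]+)', 1) AS user_name,
  COUNT(*)                              AS access_count
FROM auditlogs
WHERE daytime >= '2022-09-15'
  AND regexp_extract(src, 'src=(.*)', 1) LIKE '/default_tests_files/%'
GROUP BY
  daytime,
  regexp_extract(cmd, 'cmd=(.*)', 1),
  regexp_extract(ugi, 'ugi=([^,]+)', 1)
ORDER BY daytime, access_count DESC;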

 

Precautions:

The Alluxio audit log uses a key=value format, and the field separator is not uniform (sometimes a single space, sometimes several), so Flume needs to process the audit log with interceptors to remove redundant fields and unify the separator. The example in this article only uses Flume's built-in search_replace interceptor, so the values loaded into Hive still carry their key= prefixes.
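
If you prefer not to write a custom interceptor, the key= prefixes can also be stripped at query time. A possible (hypothetical) cleaning view over the table defined above:

-- Hypothetical view that strips the key= prefixes from the raw audit columns
CREATE VIEW IF NOT EXISTS auditlogs_clean AS
SELECT
  daytime,
  `day`,
  `time`,
  regexp_extract(succeeded, 'succeeded=(.*)', 1)       AS succeeded,
  regexp_extract(allowed,   'allowed=(.*)', 1)         AS allowed,
  regexp_extract(ugi,       'ugi=(.*)', 1)             AS ugi,
  regexp_extract(ip,        'ip=(.*)', 1)              AS ip,
  regexp_extract(cmd,       'cmd=(.*)', 1)             AS cmd,
  regexp_extract(src,       'src=(.*)', 1)             AS src,
  regexp_extract(dst,       'dst=(.*)', 1)             AS dst,
  regexp_extract(perm,      'perm=(.*)', 1)            AS perm,
  regexp_extract(exetime,   'executionTimeUs=(.*)', 1) AS execution_time_us
FROM auditlogs;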

Further reading:
Alluxio audit log format: https://docs.alluxio.io/os/user/stable/cn/operation/Security.html
Flume HDFS Sink: https://flume.apache.org/FlumeUserGuide.html#hdfs-sink
Flume Interceptors: https://flume.apache.org/FlumeUserGuide.html#flume-interceptors

For more in-depth Alluxio articles, popular events, and expert sharing, visit the [Alluxio Think Tank] community.
