Logstash Aggregate Filter: Scrambled Aggregation Problem

Problem: when aggregating with the Logstash aggregate filter, entries in the aggregated arrays get scrambled and overwrite each other.

Cause: Logstash filters run with multiple worker threads by default, so events belonging to the same task can be processed by different workers and the aggregated data gets corrupted.

Quoted from the official documentation: https://www.elastic.co/guide/en/logstash/current/plugins-filters-aggregate.html#plugins-filters-aggregate-description

Description
The aim of this filter is to aggregate information available among several events (typically log 
lines) belonging to a same task, and finally push aggregated information into final task event.

You should be very careful to set Logstash filter workers to 1 (-w 1 flag) for this filter to work 
correctly otherwise events may be processed out of sequence and unexpected results will occur.


The official documentation explicitly says the filter worker count must be set to 1; otherwise this filter does not work correctly.

Solution:

Add the -w 1 flag when starting Logstash.

Example:

logstash -w 1 -f logstash.conf
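Equivalently, the worker count can be pinned in `logstash.yml` via the standard `pipeline.workers` setting, so the flag does not have to be passed on every start:

```yaml
# logstash.yml
# Force a single filter/output worker so the aggregate filter
# sees all events of a task in order.
pipeline.workers: 1
```

Note that this limits throughput of the whole pipeline, since every filter then runs single-threaded.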

Aggregation example:

filter {
    # Aggregate all rows that share the same id into one event
    aggregate {
        task_id => "%{id}"
        code => "
            map['id'] = event.get('id')
            # 'type' field from the input, used for routing downstream
            map['type'] = event.get('type')
            map['name'] = event.get('name')
            map['test_list'] ||= []
            map['tests'] ||= []
            # Skip rows where the joined side is null
            if (event.get('test_id') != nil)
                # Deduplicate; this could also be done in the SQL query
                if !(map['test_list'].include? event.get('test_id'))
                    map['test_list'] << event.get('test_id')
                    map['tests'] << {
                        'test_id' => event.get('test_id'),
                        'test_name' => event.get('test_name')
                    }
                end
            end
            event.cancel()
        "

        push_previous_map_as_event => true
        timeout => 5
    }
}
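The Ruby inside `code` can be checked on its own. The sketch below simulates it over hypothetical one-to-many rows (the `Event` struct and the sample data are stand-ins for Logstash's event API, not part of the plugin):

```ruby
# Minimal stand-in for a Logstash event: exposes get(field).
Event = Struct.new(:fields) do
  def get(key)
    fields[key]
  end
end

# Hypothetical rows from a one-to-many SQL join; the last row is a duplicate.
rows = [
  { 'id' => 1, 'type' => 'user', 'name' => 'Alice', 'test_id' => 10, 'test_name' => 'math' },
  { 'id' => 1, 'type' => 'user', 'name' => 'Alice', 'test_id' => 11, 'test_name' => 'english' },
  { 'id' => 1, 'type' => 'user', 'name' => 'Alice', 'test_id' => 10, 'test_name' => 'math' }
]

map = {}  # the per-task map the aggregate filter would provide
rows.each do |fields|
  event = Event.new(fields)
  map['id']   = event.get('id')
  map['type'] = event.get('type')
  map['name'] = event.get('name')
  map['test_list'] ||= []
  map['tests']     ||= []
  if event.get('test_id') != nil
    # test_list acts as a seen-set, so duplicate join rows collapse
    unless map['test_list'].include?(event.get('test_id'))
      map['test_list'] << event.get('test_id')
      map['tests'] << {
        'test_id'   => event.get('test_id'),
        'test_name' => event.get('test_name')
      }
    end
  end
end

puts map['tests'].length  # => 2 (duplicate test_id 10 is collapsed)
```

Running this with a single thread always yields `test_list == [10, 11]`; with multiple filter workers sharing the real map, the ordering and contents of these arrays are exactly what becomes nondeterministic.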

Reference:
Syncing one-to-many MySQL data to ES with Logstash (pitfall-diary series):

https://blog.csdn.net/menglinjie/article/details/102984845

Reposted from blog.csdn.net/u011974797/article/details/105529448