ELK Unified Log Management Platform, Part 3: The logstash grok Plugin
This post covers the following topics, drawn from hands-on experience, for your reference:
1. Log content standards for Java applications
2. How to use the logstash grok plugin to split the message field
3. Regularly deleting ES indices
1. Log content standards for Java applications
Recently the company has been pushing the ELK project hard, and I am the operations engineer responsible for it, so I will be writing up a fair amount of hands-on ELK experience. Since our business systems are developed mainly in Java, especially on frameworks such as Spring Cloud and Spring Boot, how to standardize logging is a question the architects and developers had to settle. Our current log specification for ELK is defined as follows:
<pattern>[%date{ISO8601}][%level] %logger{80} [%thread] Line:%-3L [%X{TRACE_ID}] ${dev-group-name} ${app-name} - %msg%n</pattern>
|Time|Log level|Class file|Thread name|Line number|Global trace ID|Dev team|System name|Log message
Time: when the log entry was produced;
Log level: ERROR, WARN, INFO, DEBUG;
Class file: name of the class that printed the log;
Thread name: name of the thread performing the operation;
Line number: where in the code the log event occurred;
Global trace ID: a trace ID that spans one whole business flow;
Dev team: name of the team that developed the system;
System name: name of the project/component;
Log message: the detailed log message
For example, a business system's log output in this standard format looks like:
[2019-06-24 09:32:14,262] [ERROR] com.bqjr.cmm.aps.job.ApsAlarmJob [scheduling-1] []
tstteam tst Line:157 - ApsAlarmJob.execute, '[Test system alert] three consecutive metric-check anomalies' alert failed: nested
exception is org.apache.ibatis.exceptions.PersistenceException: ### Error
querying database. Cause: java.lang.NullPointerException ### Cause:
java.lang.NullPointerException org.mybatis.spring.MyBatisSystemException:
nested exception is
2. How to use the logstash grok plugin to split the message field
All our logs are now output according to this field standard, but in the Kibana interface they still appear as a single message field. The message must be decomposed so that each field can be searched individually.
Our ELK platform architecture: every business system has the filebeat log shipper installed; filebeat forwards the logs unmodified to a Kafka cluster, which sends them on to a logstash cluster; logstash then outputs to the ES cluster, and Kibana displays and searches the data held in ES. We put logstash in the middle mainly because it has powerful text-processing capabilities, such as the grok plugin, which can turn raw text into structured output.
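As a sketch of the first hop in this pipeline, a minimal filebeat configuration shipping logs to Kafka might look like the following; the log path is a hypothetical example, and the topic name is assumed to match the `topics_pattern` used later in the logstash input:

```yaml
# filebeat.yml (sketch; the log path is an illustrative assumption)
filebeat.inputs:
  - type: log
    paths:
      - /app/logs/*.log

output.kafka:
  hosts: ["192.168.1.12:9092", "192.168.1.14:9092", "192.168.1.15:9092"]
  topic: "elk-tst-tst-info"
```

Filebeat does not parse the lines; it ships them as-is, which is why all field splitting happens later in logstash.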
logstash ships with many built-in regular-expression templates that can match nginx, httpd, syslog and other log formats:
#Default path of the grok pattern templates shipped with logstash:
/usr/local/logstash-6.2.4/vendor/bundle/jruby/2.3.0/gems/logstash-patterns-core-4.1.2/patterns
#Grok pattern templates bundled with logstash:
[root@SZ1PRDELK00AP005 patterns]# ll
total 116
-rw-r--r-- 1 root root 271 Jun 24 16:05 application
-rw-r--r-- 1 root root 1831 Apr 13 2018 aws
-rw-r--r-- 1 root root 4831 Apr 13 2018 bacula
-rw-r--r-- 1 root root 260 Apr 13 2018 bind
-rw-r--r-- 1 root root 2154 Apr 13 2018 bro
-rw-r--r-- 1 root root 879 Apr 13 2018 exim
-rw-r--r-- 1 root root 10095 Apr 13 2018 firewalls
-rw-r--r-- 1 root root 5338 Apr 13 2018 grok-patterns
-rw-r--r-- 1 root root 3251 Apr 13 2018 haproxy
-rw-r--r-- 1 root root 987 Apr 13 2018 httpd
-rw-r--r-- 1 root root 1265 Apr 13 2018 java
-rw-r--r-- 1 root root 1087 Apr 13 2018 junos
-rw-r--r-- 1 root root 1037 Apr 13 2018 linux-syslog
-rw-r--r-- 1 root root 74 Apr 13 2018 maven
-rw-r--r-- 1 root root 49 Apr 13 2018 mcollective
-rw-r--r-- 1 root root 190 Apr 13 2018 mcollective-patterns
-rw-r--r-- 1 root root 614 Apr 13 2018 mongodb
-rw-r--r-- 1 root root 9597 Apr 13 2018 nagios
-rw-r--r-- 1 root root 142 Apr 13 2018 postgresql
-rw-r--r-- 1 root root 845 Apr 13 2018 rails
-rw-r--r-- 1 root root 224 Apr 13 2018 redis
-rw-r--r-- 1 root root 188 Apr 13 2018 ruby
-rw-r--r-- 1 root root 404 Apr 13 2018 squid
#Among them is a java template, with many built-in patterns for Java classes, timestamps, and so on
[root@SZ1PRDELK00AP005 patterns]# cat java
JAVACLASS (?:[a-zA-Z$_][a-zA-Z$_0-9]*\.)*[a-zA-Z$_][a-zA-Z$_0-9]*
#Space is an allowed character to match special cases like 'Native Method' or 'Unknown Source'
JAVAFILE (?:[A-Za-z0-9_. -]+)
#Allow special <init>, <clinit> methods
JAVAMETHOD (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)
#Line number is optional in special cases 'Native method' or 'Unknown source'
JAVASTACKTRACEPART %{SPACE}at %{JAVACLASS:class}\.%{JAVAMETHOD:method}\(%{JAVAFILE:file}(?::%{NUMBER:line})?\)
# Java Logs
JAVATHREAD (?:[A-Z]{2}-Processor[\d]+)
JAVACLASS (?:[a-zA-Z0-9-]+\.)+[A-Za-z0-9$]+
JAVAFILE (?:[A-Za-z0-9_.-]+)
JAVALOGMESSAGE (.*)
# MMM dd, yyyy HH:mm:ss eg: Jan 9, 2014 7:13:13 AM
CATALINA_DATESTAMP %{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) (?:AM|PM)
# yyyy-MM-dd HH:mm:ss,SSS ZZZ eg: 2014-01-09 17:32:25,527 -0800
TOMCAT_DATESTAMP 20%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) %{ISO8601_TIMEZONE}
CATALINALOG %{CATALINA_DATESTAMP:timestamp} %{JAVACLASS:class} %{JAVALOGMESSAGE:logmessage}
# 2014-01-09 20:03:28,269 -0800 | ERROR | com.example.service.ExampleService - something compeletely unexpected happened...
TOMCATLOG %{TOMCAT_DATESTAMP:timestamp} \| %{LOGLEVEL:level} \| %{JAVACLASS:class} - %{JAVALOGMESSAGE:logmessage}
[root@SZ1PRDELK00AP005 patterns]#
#The default templates alone cannot match our company's custom log format, though, so I also wrote my own:
[root@SZ1PRDELK00AP005 patterns]# cat application
APP_DATESTAMP 20%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:?%{MINUTE}(?::?%{SECOND})
THREADS_NUMBER (?:[a-zA-Z0-9-]+)
GLOBAL_PIPELINE_NUMBER (?:[a-zA-Z0-9-]+)
DEV_TEAM (?:[a-zA-Z0-9-]+)
SYSTEM_NAME (?:[a-zA-Z0-9-]+)
LINE_NUMBER (Line:[0-9]+)
JAVALOGMESSAGE (.*)
APPLOG \[%{APP_DATESTAMP:timestamp}\] \[%{LOGLEVEL:loglevel}\] %{JAVACLASS:class} \[%{THREADS_NUMBER:threads_number}\] \[%{GLOBAL_PIPELINE_NUMBER:global_pipeline_number}\] %{DEV_TEAM:team} %{SYSTEM_NAME:system_name} %{LINE_NUMBER:linenumber} %{JAVALOGMESSAGE:logmessage}
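As a quick sanity check (my own sketch, not part of the original setup), the custom sub-patterns above are plain regular expressions, so they can be tested with grep against a fragment of the sample log line before loading them into logstash:

```shell
# Sanity-check the plain-regex expansions of two custom sub-patterns
# against a fragment of the sample log line above (illustrative only).
sample='tstteam tst Line:157 - ApsAlarmJob'

# LINE_NUMBER expands to the plain regex: Line:[0-9]+
echo "$sample" | grep -Eo 'Line:[0-9]+'
# prints: Line:157

# DEV_TEAM and SYSTEM_NAME both expand to: [a-zA-Z0-9-]+
echo "$sample" | grep -Eq '^[a-zA-Z0-9-]+ [a-zA-Z0-9-]+ ' && echo "team/system fields match"
# prints: team/system fields match
```

This catches simple pattern mistakes early, before restarting logstash for every iteration.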
# Then configure logstash:
[root@SZ1PRDELK00AP005 patterns]# cat /usr/local/logstash/config/yunyan.conf
input {
  kafka {
    bootstrap_servers => "192.168.1.12:9092,192.168.1.14:9092,192.168.1.15:9092"
    topics_pattern => "elk-tst-tst-info.*"
    group_id => "test-consumer-group"
    codec => json
    consumer_threads => 3
    decorate_events => true
    auto_offset_reset => "latest"
  }
}
filter {
  grok {
    # Note: APPLOG is the custom pattern name defined above
    match => { "message" => ["%{APPLOG}", "%{JAVALOGMESSAGE:message}"] }
    overwrite => ["message"]
  }
}
output {
  elasticsearch {
    hosts => ["192.168.1.19:9200","192.168.1.24:9200"]
    user => "elastic"
    password => "111111"
    index => "%{[@metadata][kafka][topic]}-%{+YYYY-MM-dd}"
    workers => 1
  }
}
#output {
#  stdout {
#    codec => "rubydebug"
#  }
#}
#When debugging, it is generally best to output to stdout first rather than straight to ES; once stdout confirms that all the formatted fields are parsed correctly, switch the output to ES.
#For help writing grok regular expressions, there is an online grok tester: http://grokdebug.herokuapp.com/
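Putting that debugging advice into practice, one approach (a sketch; `debug.conf` is a hypothetical file name) is a throwaway logstash config that reads from stdin and prints the parsed event with rubydebug, so you can paste a sample log line and inspect the extracted fields:

```
# debug.conf (sketch for interactive testing; not the production pipeline)
input { stdin { } }
filter {
  grok {
    match => { "message" => ["%{APPLOG}", "%{JAVALOGMESSAGE:message}"] }
    overwrite => ["message"]
  }
}
output { stdout { codec => "rubydebug" } }
# Run with: /usr/local/logstash/bin/logstash -f debug.conf
```

If the APPLOG pattern matches, the rubydebug output shows each captured field (loglevel, class, team, and so on) as a separate key.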
Once the logs are output in the standard format, you can search in key:value form; for example, entering loglevel:ERROR in the search bar returns only the ERROR-level log entries.
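With the fields split out, searches can also combine them; the field names below come from the APPLOG pattern defined earlier, and the values are taken from the sample log line:

```
loglevel:ERROR
loglevel:ERROR AND system_name:tst
team:tstteam AND linenumber:"Line:157"
```

Note that linenumber captures the whole "Line:157" token, including the "Line:" prefix, because the LINE_NUMBER pattern includes it.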
3. Regularly deleting ES indices
The index is defined in the logstash elasticsearch output plugin. For a daily index, the index name ends in -%{+YYYY-MM-dd}; to index by month instead, change it to -%{+YYYY-MM}. Different kinds of content should be indexed differently: operating-system logs, for example, do not change much from day to day and can be split by month, while business-system logs are produced in much larger daily volumes and are better split by day. For elasticsearch, an index that is too large hurts performance, and so does having too many indices; elasticsearch's main performance bottleneck is the CPU.
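Concretely, the two index schemes differ only in the date format of the elasticsearch output (a sketch; the example index names assume the Kafka topic used earlier):

```
# Daily index, e.g. elk-tst-tst-info-2019-06-24:
index => "%{[@metadata][kafka][topic]}-%{+YYYY-MM-dd}"
# Monthly index, e.g. elk-tst-tst-info-2019-06:
index => "%{[@metadata][kafka][topic]}-%{+YYYY-MM}"
```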
While operating the ELK project I found that oversized index files and too many indices, combined with the low CPU spec of our ES data nodes, brought down the ES cluster. There are several ways to tackle this: first, delete useless indices on a schedule; second, tune the ES index parameters. I have not yet put the second into practice and will write it up in a later post; here I cover the scheduled-deletion method and how to delete indices manually.
#!/bin/bash
#Target date (7 days ago)
DATA=`date -d "1 week ago" +%Y-%m-%d`
#Current date
time=`date`
#Delete the indices dated 7 days ago
curl -s -u elastic:654321 -XGET "http://192.168.1.19:9200/_cat/indices/?v" | grep $DATA
if [ $? == 0 ];then
    curl -u elastic:654321 -XDELETE "http://127.0.0.1:9200/*-${DATA}"
    echo "Cleaned the $DATA indices at $time!"
fi
#Manual deletion: dump the index names to a text file, then delete them in a loop
curl -u elastic:654321 -XGET "http://192.168.1.19:9200/_cat/indices/?v" | awk '{print $3}' | grep elk >> /tmp/es.txt
for i in `cat /tmp/es.txt`; do curl -u elastic:654321 -X DELETE "192.168.1.19:9200/$i"; done
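To make the cleanup truly scheduled rather than manual, the script above can be run from cron; the script path and schedule below are illustrative assumptions:

```
# crontab entry (sketch): run the index cleanup at 02:30 every day
30 2 * * * /usr/local/scripts/es-index-clean.sh >> /var/log/es-index-clean.log 2>&1
```

Redirecting the output to a log file keeps a record of which indices were removed and when.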
That is all for today. Work has been particularly busy lately, so it is hard to find time to update this technical blog; most posts get written late at night after overtime or early in the morning. Thank you for your continued attention.