I am using Logstash's grok plugin to match log lines in the following format:
2018-04-17 23:42:10.335 INFO [main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat initialized with port(s): 8080 (http)
With this expression:
%{DATA:day} %{DATA:time} %{DATA:level} \[%{DATA:thread}\] %{DATA:className} \: %{GREEDYDATA:msg}
the match result is:
{ "day": [ [ "2018-04-17" ] ], "time": [ [ "23:42:10.335" ] ], "level": [ [ " INFO 7304 ---" ] ], "thread": [ [ "main" ] ], "className": [ [ "s.b.c.e.t.TomcatEmbeddedServletContainer" ] ], "msg": [ [ "Tomcat initialized with port(s): 8080" ] ] }
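To see why the date and time end up in separate fields, here is a rough Python sketch of how the expression expands. It assumes the stock definitions DATA = `.*?` (lazy) and GREEDYDATA = `.*`; the helper `grok_to_regex` is hypothetical, written only for illustration, and is not part of Logstash. Python's `re` module is used as a stand-in for grok's Oniguruma engine, so treat it as an approximation.

```python
import re

# Assumed regex equivalents of the two stock grok patterns used in the expression.
GROK_PATTERNS = {"DATA": r".*?", "GREEDYDATA": r".*"}

def grok_to_regex(expr: str) -> str:
    """Expand %{PATTERN:field} tokens into Python named groups (illustrative helper)."""
    def expand(m):
        return "(?P<%s>%s)" % (m.group(2), GROK_PATTERNS[m.group(1)])
    return re.sub(r"%\{(\w+):(\w+)\}", expand, expr)

LINE = ("2018-04-17 23:42:10.335 INFO [main] "
        "s.b.c.e.t.TomcatEmbeddedServletContainer : "
        "Tomcat initialized with port(s): 8080 (http)")
EXPR = (r"%{DATA:day} %{DATA:time} %{DATA:level} \[%{DATA:thread}\] "
        r"%{DATA:className} \: %{GREEDYDATA:msg}")

# Each lazy DATA group stops at the first literal space that lets the rest match,
# so the date and the time are captured as two separate fields.
fields = re.search(grok_to_regex(EXPR), LINE).groupdict()
print(fields["day"], "|", fields["time"])
```

Because every DATA token is lazy and the expression separates tokens with single spaces, the space inside the timestamp is treated as a field boundary like any other.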
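As you can see, the timestamp is split into two fields. I did not find a stock pattern that captured this date-time layout as a single field (the built-in TIMESTAMP_ISO8601 pattern does accept a space between date and time, so it may be worth trying as well), so I wondered whether I could write my own regular expression, and eventually found that grok allows custom named captures. Here is my improved expression: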
(?<fullTime>\S{10} \S{12}) %{DATA:level} \[%{DATA:thread}\] %{DATA:className} \: %{GREEDYDATA:msg}
Now the timestamp is parsed into a single field:
{ "fullTime": [ [ "2018-04-17 23:42:10.335" ] ], "level": [ [ " INFO" ] ], "thread": [ [ "main" ] ], "className": [ [ "s.b.c.e.t.TomcatEmbeddedServletContainer" ] ], "msg": [ [ "Tomcat initialized with port(s): 8080" ] ] }
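The custom capture can be checked outside Logstash with a small Python sketch. This is only an approximation: Python writes named groups as `(?P<name>...)`, whereas grok's Oniguruma syntax is `(?<name>...)`, and DATA and GREEDYDATA are replaced here by their assumed equivalents `.*?` and `.*`.

```python
import re

# Hand-translated version of the improved expression (an approximation, not grok itself).
PATTERN = re.compile(
    r"(?P<fullTime>\S{10} \S{12}) "          # 10 non-space chars (date), a space, 12 non-space chars (time)
    r"(?P<level>.*?) \[(?P<thread>.*?)\] "   # level and thread, both lazy like DATA
    r"(?P<className>.*?) : (?P<msg>.*)"      # class name (lazy), then the rest of the line
)

LINE = ("2018-04-17 23:42:10.335 INFO [main] "
        "s.b.c.e.t.TomcatEmbeddedServletContainer : "
        "Tomcat initialized with port(s): 8080 (http)")

fields = PATTERN.search(LINE).groupdict()
print(fields["fullTime"])
```

Note that `\S{10} \S{12}` only fixes the lengths of the two parts, so it depends on the timestamp always having exactly millisecond precision; a log format with a different number of fractional digits would need a different count.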