Elasticsearch Learning (Five): Distributed Log Collection with ELK (ElasticSearch + Logstash + Kibana)

1. ELK Introduction

1.1 Problems with Traditional Log Collection

Note: do not store ordinary logs in the relational database; the database is for data that must be kept permanently, and most logs are not that important, so they can go to MongoDB or Redis instead. Important logs, such as payment orders, callback parameters, and documents, can still be stored in the database.

-> When you need to locate a bug from the logs, the logs are usually scattered across different storage devices. If you manage dozens or hundreds of servers, the traditional way is to log on to each machine in turn just to check its logs.

-> The next step up is centralized log management, but retrieving logs and producing statistics is still troublesome. Linux commands such as grep, awk, and wc can handle simple retrieval and counting, but for more demanding queries, sorting, and statistics across many machines they are slow and inconvenient.

-> Finally, ELK emerged as a distributed log collection solution.


1.2 What is the ELK distributed log collection system

ELK is composed of ElasticSearch + Logstash + Kibana.

(1) ElasticSearch: an open-source, distributed search and analysis service based on Lucene. Its notable features are distribution, zero configuration, automatic discovery, automatic index sharding, and a RESTful-style interface;

(2) Logstash: an open-source data processing pipeline, usually used as the data pipeline in front of ES; it can collect, filter, and analyze logs and load the data into ElasticSearch. Logstash also has a web interface for searching and displaying all of the logs. It uses a client/server architecture: the client is installed on each host whose logs need to be collected, and the server filters and modifies the logs received from the nodes and forwards them to ElasticSearch;

(3) Kibana: a browser-based front-end presentation tool for ES that provides a friendly interface on top of ES + Logstash.

1.3 How ELK Distributed Log Collection Works

(1) Install the Logstash log collection plugin on every server node in the cluster;

(2) Each server node feeds its logs into Logstash;

(3) Logstash formats the logs as JSON and outputs them to ElasticSearch, creating a separate index per day (see the sketch after this list);

(4) Kibana is used in the browser to query the log information.
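As a concrete illustration of step (3), the per-day index is normally produced with a date pattern in the ElasticSearch output's index name. This is only a minimal sketch; the host address and the applog- index prefix are placeholder assumptions, not values from this article:

output {
  elasticsearch {
    hosts => ["http://127.0.0.1:9200"]   # address of the ES node (placeholder)
    index => "applog-%{+YYYY.MM.dd}"     # one index per day, e.g. applog-2020.01.09
  }
}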

Schematic:

1.4 ELK Setup Steps

The components are set up in this order: ES -> Logstash -> Kibana

2. Logstash Principles and Environment Setup

2.1 How Logstash Works

Logstash event processing has three stages: inputs (receive) -> filters (process) -> outputs (forward the logs). It supports any log type that an application can emit: system logs, error logs, application logs, web server logs, and so on.
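A Logstash pipeline configuration file mirrors these three stages directly. The following is only a minimal sketch (console in, console out, no filtering), not the configuration file used later in this article:

input {
  stdin { }                       # receive events typed on the console
}
filter {
  # processing plugins (grok, mutate, ...) would go here
}
output {
  stdout { codec => rubydebug }   # print each event in a readable form
}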

2.2 Understanding the Three Stages

(1) Input: feeds data into Logstash. Some commonly used inputs are listed below (see the sketch after this list):

  • file: reads from files on the file system, similar to the tail -f command
  • syslog: listens for syslog messages on port 514 and parses them according to RFC 3164
  • redis: reads from a Redis service
  • beats: reads from Filebeat
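A minimal sketch of an input section combining two of the plugins above; the log file path and the Beats port are illustrative assumptions, not values taken from this article:

input {
  file {
    path => "/var/log/messages"     # assumed path of the log file to tail
    start_position => "beginning"   # read the whole file on the first run
  }
  beats {
    port => 5044                    # port that Filebeat ships events to
  }
}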


(2) Filters: intermediate processing; operations are applied to the data. Some commonly used filters are listed below (see the sketch after this list):

  • grok: parses arbitrary text. Grok is the most important Logstash plugin; its main job is to turn plain-text strings into structured data, working together with regular expressions. More than 120 parsing patterns are built in.
  • mutate: transforms fields, for example deleting, replacing, modifying, or renaming them.
  • drop: discards some events without processing them.
  • clone: copies an event; fields can be added or removed during the copy.
  • geoip: adds geographic information (used by Kibana for graphical display on the front end).
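A minimal sketch of a filter section using grok and mutate; COMBINEDAPACHELOG is one of Grok's built-in patterns for web server access logs, and the field names below come from that pattern:

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }   # parse a combined-format access log line into fields
  }
  mutate {
    rename => { "clientip" => "client_ip" }            # rename a field produced by grok
    remove_field => ["@version"]                       # drop a field that is not needed
  }
}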

(3) Outputs: outputs are the final component of the Logstash processing pipeline. An event can pass through multiple outputs while being processed, but once all outputs have executed, the event's life cycle is complete. Some common outputs are:

  • elasticsearch: stores the data efficiently and makes it easy to query.
  • file: saves event data to a file.
  • graphite: sends event data to Graphite, a popular open-source component for storing and graphing metrics.

Codecs: codecs are stream-based filters that can be configured as part of an input or an output. They make it easy to split and decode data that arrives already serialized. Some common codecs are listed below (see the sketch after this list):

  • json: encodes/decodes data in JSON format.
  • multiline: merges data spread over multiple lines into a single event, for example a Java exception message and its stack trace.
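A minimal sketch showing codecs on both ends of a pipeline: a multiline codec on a file input so that a Java stack trace is merged into one event, and a json_lines codec on a file output. The paths and the line-start pattern are illustrative assumptions:

input {
  file {
    path => "/usr/local/app/logs/app.log"   # assumed application log path
    codec => multiline {
      pattern => "^\d{4}-\d{2}-\d{2}"        # a new event starts with a date
      negate => true
      what => "previous"                     # other lines (e.g. a stack trace) are appended to the previous event
    }
  }
}
output {
  file {
    path => "/tmp/app-events.json"           # assumed output file
    codec => json_lines                      # write one JSON document per line
  }
}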

2.3 Logstash Environment Setup

Installing the Logstash environment:

  • 1. Upload the Logstash installation package (provided with the course materials)
  • 2. Extract it: tar -zxvf logstash-6.4.3.tar.gz
  • 3. Put mayikt01.conf into the config directory so that log input is read in and written back out (a sketch of such a file follows)
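The content of mayikt01.conf is not reproduced in the original article; a minimal sketch of a configuration that reads log lines in and writes them back out, optionally forwarding them to ElasticSearch, might look like this:

input {
  stdin { }                                 # read log lines from the console
}
output {
  stdout { codec => rubydebug }             # echo each parsed event back out
  # elasticsearch { hosts => ["http://127.0.0.1:9200"] }   # optionally forward to ES (assumed address)
}

It can then be started from the Logstash directory with: bin/logstash -f config/mayikt01.conf

For reference, the request below is a plain ElasticSearch _search call (a paginated match query against the mymayikt index) and is independent of the Logstash configuration: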
GET /mymayikt/user/_search
{
  "from": 0,
  "size": 2,
  "query": {
    "match": {
      "car": "奥迪"
    }
  }
}

 

 

 
