Introduction and construction of ELK log analysis system (super detailed)

Table of contents

1. Introduction to ELK log analysis system

2. Introduction to Elasticsearch

2.1 Overview of Elasticsearch

3. Introduction to Logstash

4. Introduction to Kibana

5. ELK working principle

6. Deploy the ELK log analysis system

6.1 ELK Elasticsearch cluster deployment (performed on the Node1 and Node2 nodes)

6.2 Deploy Elasticsearch software

6.3 Create the data storage path and set ownership

6.4 Verify that Elasticsearch started successfully

6.5 View node information

6.6 Install the Elasticsearch-head plugin

6.6.1 Compile and install node

6.6.2 Install phantomjs (front-end framework)

6.6.3 Install Elasticsearch-head data visualization tool

6.6.4 Modify the Elasticsearch main configuration file

6.6.5 Start elasticsearch-head service

6.6.6 View Elasticsearch information through Elasticsearch-head

6.6.7 Insert Index

6.7 ELK Logstash deployment (operating on Apache nodes)

6.7.1 Changing the hostname

6.7.2 Install Apache service (httpd)

6.7.3 Install Java environment

6.7.4 Install logstash

6.7.5 Testing Logstash

6.7.6 Define the logstash configuration file


1. Introduction to ELK log analysis system

ELK is the combination of three open source projects: Elasticsearch (ES for short), Logstash, and Kibana. Together they form a complete solution for data collection (Logstash), search and analysis (Elasticsearch), and data visualization (Kibana), which is why it is also called the ELK stack.

ELK is the general term for this free and open source log analysis technology stack; the official website is https://www.elastic.co/cn. It consists of three basic components: Elasticsearch, Logstash, and Kibana.

2. Introduction to Elasticsearch

2.1 Overview of Elasticsearch

Elasticsearch provides a distributed full-text search engine.

ElasticSearch is a distributed storage and retrieval engine built on Lucene (a full-text search engine library) and is used to store logs of all kinds. It is developed in Java, and clients, including browsers, communicate with it through its RESTful web interface. Elasticsearch is a real-time, distributed, scalable search engine that supports full-text and structured search. It is typically used to index and search large volumes of log data, and can also be used to search many other types of documents.

3. Introduction to Logstash

Logstash serves as the data collection engine. It supports dynamically collecting data from a variety of sources, performing operations such as filtering, parsing, enriching, and normalizing the data, and then storing it in a destination specified by the user, most commonly Elasticsearch. Logstash is written in Ruby and runs on the Java Virtual Machine (JVM). It is a powerful data processing tool that provides data transport, format processing, and formatted output, and its rich plugin ecosystem makes it a common choice for log processing.

Its pipeline has three stages: input (data collection), filter (data filtering), and output (data emission).

4. Introduction to Kibana

Kibana is usually deployed together with Elasticsearch. It is a powerful data visualization dashboard for Elasticsearch, providing a graphical web interface for browsing the log data in Elasticsearch and for summarizing, analyzing, and searching important data.

5. ELK working principle

(1) Deploy Logstash on every server whose logs need to be collected; alternatively, first centralize the logs on a dedicated log server and deploy Logstash there.

(2) Logstash collects logs, formats and outputs logs to the Elasticsearch cluster.

(3) Elasticsearch indexes and stores the formatted data.

(4) Kibana queries the data from the ES cluster to generate charts and displays the front-end data.

Summary: Logstash, acting as the log collector, gathers data from the data sources, filters and formats it, and stores it in Elasticsearch; Kibana then visualizes the logs.

6. Deploy the ELK log analysis system

Node1 node (2C/4G): node1/192.168.237.21 Elasticsearch Kibana

Node2 node (2C/4G): node2/192.168.237.22 Elasticsearch

Apache node: apache/192.168.237.23 Logstash Apache

6.1 ELK Elasticsearch cluster deployment (performed on the Node1 and Node2 nodes)

#Perform the following on both node machines
vim /etc/hosts
192.168.237.21 node1
192.168.237.22 node2
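A quick optional check that the names now resolve:

ping -c 1 node1
ping -c 1 node2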

Note: Java version

java -version    #if Java is not installed: yum -y install java
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)

Using the official JDK is recommended.

6.2 Deploy Elasticsearch software

(1) Install the Elasticsearch RPM package
#Upload elasticsearch-5.5.0.rpm to the /opt directory
cd /opt
rpm -ivh elasticsearch-5.5.0.rpm

(2) Register the system service
systemctl daemon-reload
systemctl enable elasticsearch.service

(3) Modify the Elasticsearch main configuration file
cp /etc/elasticsearch/elasticsearch.yml /etc/elasticsearch/elasticsearch.yml.bak
vim /etc/elasticsearch/elasticsearch.yml
--line 17--uncomment and specify the cluster name
cluster.name: my-elk-cluster
--line 23--uncomment and specify the node name: node1 on the Node1 node, node2 on the Node2 node
node.name: node1
--line 33--uncomment and specify the data storage path
path.data: /data/elk_data
--line 37--uncomment and specify the log storage path
path.logs: /var/log/elasticsearch/
--line 43--uncomment and disable locking memory at startup
bootstrap.memory_lock: false
--line 55--uncomment and set the listen address; 0.0.0.0 means all addresses
network.host: 0.0.0.0
--line 59--uncomment; the default listen port of the ES service is 9200
http.port: 9200
--line 68--uncomment; cluster discovery is done via unicast, so specify the nodes to discover: node1, node2
discovery.zen.ping.unicast.hosts: ["node1", "node2"]

grep -v "^#" /etc/elasticsearch/elasticsearch.yml

6.3 Create the data storage path and set ownership

mkdir -p /data/elk_data
chown elasticsearch:elasticsearch /data/elk_data/

6.4 Verify that Elasticsearch started successfully

systemctl start elasticsearch.service
netstat -antp | grep 9200

6.5 View node information  

browser access

http://192.168.237.21:9200/

http://192.168.237.22:9200/

Check the cluster health on Node1 and Node2:

http://192.168.237.21:9200/_cluster/health?pretty

http://192.168.237.22:9200/_cluster/health?pretty

These endpoints show the health status of the cluster; a status value of "green" indicates that the nodes are running healthily.
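The response looks roughly like this (illustrative; the node and shard counters depend on your cluster state):

{
  "cluster_name" : "my-elk-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "active_shards_percent_as_number" : 100.0
}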

#Checking the cluster status this way is not very user-friendly; you can manage the cluster more conveniently by installing the Elasticsearch-head plugin.

6.6 Install the Elasticsearch-head plugin

Since Elasticsearch version 5.0, the Elasticsearch-head plugin must be installed as an independent service, using the npm tool (the package manager for Node.js).
Installing Elasticsearch-head requires two dependencies to be installed first: node and phantomjs.
node: a JavaScript runtime environment based on the Chrome V8 engine.
phantomjs: a WebKit-based, scriptable JavaScript API, best understood as an invisible browser; it can do anything a WebKit-based browser can do.

6.6.1 Compile and install node

#Upload the package node-v8.2.1.tar.gz to /opt
yum install gcc gcc-c++ make -y

cd /opt
tar zxvf node-v8.2.1.tar.gz

cd node-v8.2.1/
./configure
make && make install
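An optional quick check that the toolchain is installed:

node -v     #should print v8.2.1
npm -v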

6.6.2 Install phantomjs (front-end framework)

#Upload the package phantomjs-2.1.1-linux-x86_64.tar.bz2 to /opt
cd /opt
tar jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2 -C /usr/local/src/
cd /usr/local/src/phantomjs-2.1.1-linux-x86_64/bin
cp phantomjs /usr/local/bin
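An optional check that phantomjs is on the PATH:

phantomjs --version    #should print 2.1.1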

6.6.3 Install Elasticsearch-head data visualization tool

#Upload the package elasticsearch-head.tar.gz to /opt
cd /opt
tar zxvf elasticsearch-head.tar.gz -C /usr/local/src/
cd /usr/local/src/elasticsearch-head/
npm install

6.6.4 Modify the Elasticsearch main configuration file

vim /etc/elasticsearch/elasticsearch.yml
......
--add the following at the end of the file--
http.cors.enabled: true                  #enable cross-origin (CORS) support; the default is false
http.cors.allow-origin: "*"              #allow cross-origin access from any domain

systemctl restart elasticsearch
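An optional way to confirm that CORS is active after the restart (the Origin value here is arbitrary):

curl -i -H "Origin: http://127.0.0.1:9100" http://localhost:9200/ | grep -i access-control-allow-origin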

6.6.5 Start elasticsearch-head service

#The service must be started from the extracted elasticsearch-head directory; the process reads the gruntfile.js file in that directory, otherwise it may fail to start.
cd /usr/local/src/elasticsearch-head/
npm run start &


netstat -natp |grep 9100

6.6.6 View Elasticsearch information through Elasticsearch-head
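Visit http://192.168.237.21:9100/ in a browser and point the connection field at http://192.168.237.21:9200/; the cluster health and both nodes should be displayed.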

6.6.7 Insert Index

#Insert a test index from the command line; the index is index-demo and the type is test.

curl -X PUT 'localhost:9200/index-demo/test/1?pretty&pretty' -H 'content-Type: application/json' -d '{"user":"zhangsan","mesg":"hello world"}'
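If the document is created successfully, the response looks roughly like this (illustrative; the shard counts depend on the cluster):

{
  "_index" : "index-demo",
  "_type" : "test",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "created" : true
}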

Visit http://192.168.237.21:9100/ in the browser to view the index information; you can see that the index is split into 5 shards by default, with 1 replica. Click "Data Browse" and you will find the index-demo index created on node1, together with the details of the test type document.

6.7 ELK Logstash deployment (operating on Apache nodes)

6.7.1 Changing the hostname

hostnamectl set-hostname apache

6.7.2 Install Apache service (httpd)

yum -y install httpd
systemctl start httpd

6.7.3 Install Java environment

yum -y install java
java -version

6.7.4 Install logstash

#Upload the package logstash-5.5.1.rpm to the /opt directory
cd /opt
rpm -ivh logstash-5.5.1.rpm                           
systemctl start logstash.service                      
systemctl enable logstash.service

ln -s /usr/share/logstash/bin/logstash /usr/local/bin/
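An optional check that the symlink works:

logstash --version    #should print logstash 5.5.1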

6.7.5 Testing Logstash

Common options of the logstash command:
-f: specify a Logstash configuration file; Logstash configures its input and output streams according to this file.
-e: take the configuration as a string on the command line directly after the option (if the string is empty, stdin is used as input and stdout as output by default).
-t: test whether the configuration file is correct, then exit.

Define the input and output streams: the input uses standard input and the output uses standard output (similar to a pipe).

logstash -e 'input { stdin{} } output { stdout{} }'
......
www.baidu.com										#typed content (standard input)
2020-12-22T03:58:47.799Z node1 www.baidu.com		#result (standard output)
www.sina.com.cn										#typed content (standard input)
2020-12-22T03:59:02.908Z node1 www.sina.com.cn		#result (standard output)

//press Ctrl+C to exit

#Display the output in detailed format using the rubydebug codec (a codec is an encoder/decoder)
logstash -e 'input { stdin{} } output { stdout{ codec=>rubydebug } }'
......
www.baidu.com										#typed content (standard input)
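An illustrative rubydebug-style result (an assumption of the typical 5.x output format; actual host and timestamp values will differ):

{
    "@timestamp" => 2020-12-22T03:58:47.799Z,
      "@version" => "1",
          "host" => "apache",
       "message" => "www.baidu.com"
}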


//This time the result is not displayed on standard output but sent to Elasticsearch; visit http://192.168.237.21:9100/ in a browser to view the index information and browse the data.

#Use Logstash to write the information into Elasticsearch
logstash -e 'input { stdin{} } output { elasticsearch { hosts=>["192.168.237.21:9200"] } }'
			 input				output			connect
......
www.sina.com.cn										#typed content (standard input)

 

6.7.6 Define the logstash configuration file

The Logstash configuration file basically consists of three parts: input, output, and filter (optional, used as needed).

input: collects data from data sources; common sources include Kafka, log files, and so on.

filter: the data processing layer, covering data formatting, data type conversion, data filtering, and so on, with support for regular expressions.

output: outputs the data collected by Logstash and processed by the filter to Elasticsearch.

#The format is as follows:

input {...}

filter {...}

output {...}
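For example, a filter stage that uses the grok plugin to parse Apache access logs might look like the following sketch (COMBINEDAPACHELOG is a pattern shipped with the grok plugin; the date filter shown reuses the log's own timestamp):

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }     #split the raw line into named fields
    }
    date {
        match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]   #use the log's timestamp as the event time
    }
}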

#In each section, multiple access methods can also be specified. For example, to specify two log source files, the format is as follows:

input {
    file { path =>"/var/log/messages" type =>"syslog" }
    file { path =>"/var/log/httpd/access.log" type =>"apache" }
}
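The type tag assigned by each input can then be used to route events in the output section, for instance (a sketch; the index names are illustrative):

output {
    if [type] == "syslog" {
        elasticsearch { hosts => ["192.168.237.21:9200"] index => "syslog-%{+YYYY.MM.dd}" }
    }
    if [type] == "apache" {
        elasticsearch { hosts => ["192.168.237.21:9200"] index => "apache-%{+YYYY.MM.dd}" }
    }
}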

Modify the Logstash configuration file to collect the system log /var/log/messages and output it to Elasticsearch.

chmod +r /var/log/messages					#make the log readable by Logstash

vim /etc/logstash/conf.d/system.conf
input {
    file{
        path =>"/var/log/messages"						#location of the log to collect
        type =>"system"									#custom log type tag
        start_position =>"beginning"					#collect from the beginning of the file
    }
}
output {
    elasticsearch {										#output to Elasticsearch
        hosts => ["192.168.237.21:9200"]				#address and port of the Elasticsearch server
        index =>"system-%{+YYYY.MM.dd}"					#index name pattern used in Elasticsearch
    }
}

systemctl restart logstash 
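An optional check that the new index is being created (run from any machine that can reach the cluster):

curl 'http://192.168.237.21:9200/_cat/indices?v'    #a system-YYYY.MM.dd index should be listed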

Visit http://192.168.237.22:9100/ in a browser to view the index information.
