ELK + Filebeat enterprise-level log analysis system

1. Overview of ELK log analysis system

1. Introduction to ELK

ELK is an acronym for three open-source projects: Elasticsearch, Logstash, and Kibana. A fourth component, Filebeat, has been added to the stack; it is a lightweight log collection and shipping agent. Filebeat consumes few resources, which makes it well suited for collecting logs on each server and forwarding them to Logstash, and this is also the approach recommended by the official documentation.

Elasticsearch is an open-source, distributed search engine that provides data collection, analysis, and storage.

Its features include: distributed operation, zero configuration, automatic discovery, automatic index sharding, an index replica mechanism, a RESTful interface, multiple data sources, and automatic search load balancing. It is mainly responsible for indexing and storing logs so that business teams can retrieve and query them.

Logstash is a tool for collecting, parsing, and filtering logs, and it supports a large number of data acquisition methods.

It generally works in a client/server architecture: the client side is installed on the hosts whose logs need to be collected, while the server side filters and transforms the logs received from each node and sends them to Elasticsearch.

In other words, it is middleware for log collection, filtering, and forwarding. It is mainly responsible for collecting and filtering the logs of the various business lines in a unified way and forwarding them to Elasticsearch for further processing.

Kibana is also an open-source and free tool. It provides a log-analysis-friendly web interface for Logstash and Elasticsearch, and helps summarize, analyze, and search important log data.

2. Reasons for using ELK

Logs mainly include system logs, application logs, and security logs. Operations staff and developers can use logs to understand server software and hardware status, and to check configuration mistakes and the causes of errors. Analyzing logs regularly helps you understand server load, performance, and security, so that timely action can be taken to correct problems.

Tools such as grep and awk are usually enough to analyze the logs of a single machine, but not when logs are scattered across many devices. If you manage dozens or hundreds of servers, logging into each machine in turn to view its logs is cumbersome and inefficient. The obvious first step is centralized log management, for example using open-source syslog to collect and aggregate the logs of all servers. Once the logs are centralized, however, searching them and producing statistics becomes the next problem. Linux commands such as grep, awk, and wc can still handle retrieval and simple statistics, but for more demanding queries, sorting, and aggregation, and with a huge number of machines, this approach falls short.
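For instance, on a single host an administrator might answer simple questions with one-liners like these (the log path and status code are illustrative only):

# count how many 404 responses appear in an Apache access log
grep ' 404 ' /var/log/httpd/access_log | wc -l
# top 10 client IPs by request count
awk '{print $1}' /var/log/httpd/access_log | sort | uniq -c | sort -rn | head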

Large systems are generally deployed as distributed architectures, with different service modules running on different servers. When a problem occurs, in most cases you need to locate the specific server and service module based on the key information the problem exposes, so building a centralized logging system greatly improves troubleshooting efficiency.

3. Basic features of a complete log system

Collection: able to collect log data from multiple sources

Transmission: able to parse and filter log data and transmit it reliably to the storage system

Storage: stores the log data

Analysis: supports analysis through a UI

Alerting: provides error reporting and monitoring mechanisms

4. Working principle of ELK

(1) Deploy Logstash on all servers whose logs need to be collected; or first aggregate the logs on a central log server and deploy Logstash on that log server.

(2) Logstash collects the logs, formats them, and outputs them to the Elasticsearch cluster.

(3) Elasticsearch indexes and stores the formatted data.

(4) Kibana queries the data from the Elasticsearch cluster, generates charts, and presents them to the front end.
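As a minimal sketch of steps (1)-(4), a one-off Logstash pipeline that reads a log file and ships it to Elasticsearch might look like the following; es-host is a placeholder for your own Elasticsearch node, and the real, file-based configuration is covered later in this article:

# (1)+(2) Logstash collects /var/log/messages and forwards it, formatted, to Elasticsearch
logstash -e 'input { file { path => "/var/log/messages" } } output { elasticsearch { hosts => ["es-host:9200"] } }'
# (3) Elasticsearch indexes and stores the data; list the indices to confirm
curl 'http://es-host:9200/_cat/indices?v'
# (4) Kibana reads from the same cluster and renders charts on its web UI (port 5601 by default)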

2. Operation steps of ELK log analysis system cluster deployment

Environment preparation:
Node1: 192.168.229.90 (Elasticsearch, Kibana), Node2: 192.168.229.80 (Elasticsearch), Apache node: 192.168.229.70 (Logstash, Apache httpd)

Disable the firewall and SELinux on all servers:

systemctl stop firewalld
setenforce 0  
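If you also want these settings to persist after a reboot, something along these lines is commonly used (adapt to your own security policy):

systemctl disable firewalld                                           #do not start the firewall at boot
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config   #disable SELinux permanently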

1. ELK Elasticsearch cluster deployment (operated on Node1 and Node2 nodes)

1.1. Change the host name, configure domain name resolution, and view the Java environment

On Node1: hostnamectl set-hostname node1
On Node2: hostnamectl set-hostname node2

vim /etc/hosts
192.168.229.90 node1
192.168.229.80 node2

java -version   #If Java is not installed: yum -y install java
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)

1.2. Deploy Elasticsearch software
(1) Install the elasticsearch RPM package
#Upload elasticsearch-5.5.0.rpm to the /opt directory

cd /opt
rpm -ivh elasticsearch-5.5.0.rpm 

(2) Loading system services

systemctl daemon-reload
systemctl enable elasticsearch.service 

(3) Modify the elasticsearch main configuration file

cp /etc/elasticsearch/elasticsearch.yml /etc/elasticsearch/elasticsearch.yml.bak
vim /etc/elasticsearch/elasticsearch.yml
--17-- Uncomment and specify the cluster name
cluster.name: my-elk-cluster
--23-- Uncomment and specify the node name: node1 on the Node1 node, node2 on the Node2 node
node.name: node1
--33-- Uncomment and specify the data storage path
path.data: /data/elk_data
--37-- Uncomment and specify the log storage path
path.logs: /var/log/elasticsearch/
--43-- Uncomment and change so that memory is not locked at startup
bootstrap.memory_lock: false
--55-- Uncomment and set the listen address; 0.0.0.0 means all addresses
network.host: 0.0.0.0
--59-- Uncomment; the default listening port of the ES service is 9200
http.port: 9200
--68-- Uncomment; cluster discovery is done via unicast, specify the nodes to discover: node1, node2
discovery.zen.ping.unicast.hosts: ["node1", "node2"]
 
grep -v "^#" /etc/elasticsearch/elasticsearch.yml  

(4) Create a data storage path and authorize it

mkdir -p /data/elk_data
chown elasticsearch:elasticsearch /data/elk_data/  

(5) Start elasticsearch and check whether it started successfully

systemctl start elasticsearch.service
netstat -antp | grep 9200 

(6) View node information

Visit http://192.168.229.90:9200 and http://192.168.229.80:9200 in a browser to view the information of the Node1 and Node2 nodes.

Visit http://192.168.229.90:9200/_cluster/health?pretty and http://192.168.229.80:9200/_cluster/health?pretty in a browser to check the health of the cluster; a status value of green means the nodes are running healthily.

Visit http://192.168.229.90:9200/_cluster/state?pretty in a browser to check the cluster state information.
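If you prefer the command line, the same health check can be run with curl from any machine that can reach the nodes; the IP below is the Node1 address used in this article:

curl 'http://192.168.229.90:9200/_cluster/health?pretty'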

#Checking the cluster state with the URLs above is not very user friendly; installing the Elasticsearch-head plugin makes managing the cluster much more convenient.

1.3. Install the Elasticsearch-head plugin
Since Elasticsearch version 5.0, the Elasticsearch-head plugin must be installed as a standalone service, using the npm tool (the Node.js package manager).
Installing Elasticsearch-head requires the dependencies node and phantomjs to be installed in advance.
node: a JavaScript runtime environment based on the Chrome V8 engine.
phantomjs: a WebKit-based JavaScript API, which can be thought of as an invisible (headless) browser; it can do anything a WebKit-based browser can do.
(1) Compile and install node
#Upload software package node-v8.2.1.tar.gz to /opt

yum install gcc gcc-c++ make -y
 
cd /opt
tar zxvf node-v8.2.1.tar.gz
 
cd node-v8.2.1/
./configure
make && make install  
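Once make && make install completes, you can optionally confirm that node and npm are on the PATH (the exact version strings will differ):

node -v
npm -v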

(2) Install phantomjs
#Upload the package phantomjs-2.1.1-linux-x86_64.tar.bz2 to /opt

cd /opt
tar jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2 -C /usr/local/src/
cd /usr/local/src/phantomjs-2.1.1-linux-x86_64/bin
cp phantomjs /usr/local/bin 

(3) Install the Elasticsearch-head data visualization tool
#Upload the software package elasticsearch-head.tar.gz to /opt

cd /opt
tar zxvf elasticsearch-head.tar.gz -C /usr/local/src/
cd /usr/local/src/elasticsearch-head/
npm install  

(4) Modify the Elasticsearch main configuration file

vim /etc/elasticsearch/elasticsearch.yml
......
--Add the following at the end of the file--
http.cors.enabled: true         #Enable cross-origin access support; the default is false
http.cors.allow-origin: "*"     #Allow cross-origin access from any domain
 
systemctl restart elasticsearch 
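To verify that cross-origin access is enabled after the restart, you can send a request carrying an Origin header and inspect the response; with the settings above, Elasticsearch should include an Access-Control-Allow-Origin header (the Origin value here is arbitrary):

curl -I -H 'Origin: http://example.com' http://192.168.229.90:9200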

(5) Start elasticsearch-head service
#The service must be started from inside the extracted elasticsearch-head directory, because the process reads the gruntfile.js file in that directory; otherwise it may fail to start.

cd /usr/local/src/elasticsearch-head/
npm run start &
 
> elasticsearch-head@0.0.0 start /usr/local/src/elasticsearch-head
> grunt server
 
Running "connect:server" (connect) task
Waiting forever...
Started connect web server on http://localhost:9100
 
#elasticsearch-head listens on port 9100
netstat -natp |grep 9100  

(6) View Elasticsearch information through Elasticsearch-head
Access the http://192.168.229.90:9100/ address through a browser and connect to the cluster. If you see the cluster health value is green, it means the cluster is healthy.

(7) Insert index
#Insert a test document with the following command; the index is index-demo and the type is test.

curl -X PUT 'localhost:9200/index-demo/test/1?pretty&pretty' -H 'content-Type: application/json' -d '{"user":"zhangsan","mesg":"hello world"}'
//The output is as follows:
{
    "_index" : "index-demo",
    "_type" : "test",
    "_id" : "1",
    "_version" : 1,
    "result" : "created",
    "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
    },
    "created" : true
}

Visit http://192.168.229.90:9100/ in a browser to view the index information; you can see that the index is split into 5 shards by default and has one replica.
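The same document can also be read back from the command line, which is a quick way to confirm it was indexed (this assumes the test document created above):

curl -X GET 'localhost:9200/index-demo/test/1?pretty'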

Click "Data Browse", you will find the index created on node1 as index-demo, and the related information of type test.

2. Instance operation: ELK Elasticsearch cluster deployment (operated on Node1 and Node2 nodes)

Close firewall
2.1. Change the host name, configure domain name resolution, view the Java environment
2.2. Deploy Elasticsearch software
(1) Install the elasticsearch RPM package
(2) Load system services
(3) Modify the elasticsearch main configuration file
(4) Create the data storage path and authorize it
(5) Check whether elasticsearch started successfully
The above steps need to be performed on both node1 and node2. Only the Node1 configuration is shown here; the two configurations are identical, and you only need to change the node name in the configuration file.
(6) Check node information
View the node1 and node2 node information.
2.3. Install the Elasticsearch-head plugin (on node1 and node2)
(1) Compile and install node
(2) Install phantomjs
(3) Install the Elasticsearch-head data visualization tool
(4) Modify the Elasticsearch main configuration file
(5) Start the elasticsearch-head service
(6) View Elasticsearch information through Elasticsearch-head
(7) Insert an index

3. ELK Logstash deployment (operated on Apache nodes)

Logstash is generally deployed on the servers whose logs need to be monitored. In this example, Logstash is deployed on the Apache server to collect the Apache server's log information and send it to Elasticsearch.
3.1. Change the hostname

hostnamectl set-hostname apache 

3.2. Install Apache service (httpd)

yum -y install httpd
systemctl start httpd  

3.3. Install the Java environment

yum -y install java
java -version  

3.4. Install Logstash
#Upload the software package logstash-5.5.1.rpm to the /opt directory

cd /opt
rpm -ivh logstash-5.5.1.rpm
systemctl start logstash.service
systemctl enable logstash.service
 
ln -s /usr/share/logstash/bin/logstash /usr/local/bin/ 

3.5. Test Logstash
Common Logstash command-line options:

-f: specify the Logstash configuration file; Logstash configures its input and output streams according to that file.

-e: take the configuration from the command line as the string that follows the option (if the string is empty, stdin is used as the input and stdout as the output by default).

-t: test whether the configuration file is correct, then exit.
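For example, to syntax-check the configuration file created later in this article before (re)starting the service, something like the following can be used; on an RPM install of Logstash 5.x you may also need to pass --path.settings /etc/logstash:

logstash -f /etc/logstash/conf.d/system.conf -t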

Define input and output streams:

#Input uses standard input, output uses standard output (similar to pipes)

logstash -e 'input { stdin{} } output { stdout{} }'
......
www.baidu.com   #typed content (standard input)
2020-12-22T03:58:47.799Z node1 www.baidu.com    #result (standard output)
www.sina.com.cn #typed content (standard input)
2017-12-22T03:59:02.908Z node1 www.sina.com.cn  #result (standard output)

//Press Ctrl+C to exit

#Use rubydebug to display the output in a detailed format; codec is a type of codec
logstash -e 'input { stdin{} } output { stdout{ codec=>rubydebug } }'
......
www.baidu.com   #typed content (standard input)
{
    "@timestamp" => 2020-12-22T02:15:39.136Z,    #result (after processing)
      "@version" => "1",
          "host" => "apache",
       "message" => "www.baidu.com"
}

#Use Logstash to write information into Elasticsearch
logstash -e 'input { stdin{} } output { elasticsearch { hosts=>["192.168.229.90:9200"] } }'
            input           output: connect to Elasticsearch
......
www.baidu.com   #typed content (standard input)
www.sina.com.cn #typed content (standard input)
www.google.com  #typed content (standard input)

//The results are not displayed on standard output but are sent to Elasticsearch; visit http://192.168.229.90:9100/ in a browser to view the index information and browse the data.

3.6. Define the logstash configuration file
The Logstash configuration file basically consists of three parts: input, output, and filter (optional, use as needed).

#The format is as follows:

input {...}
filter {...}
output {...}  

#Within each section, multiple sources can also be specified. For example, to specify two log source files, the format is as follows:

input {
    file { path =>"/var/log/messages" type =>"syslog"}
    file { path =>"/var/log/httpd/access.log" type =>"apache"}
}

#Modify the Logstash configuration file to collect system logs /var/log/messages and output them to elasticsearch.

chmod +r /var/log/messages  #make the log file readable by Logstash

vim /etc/logstash/conf.d/system.conf
input {
    file{
        path =>"/var/log/messages"    #the location of the logs to collect
        type =>"system"               #a custom log type tag
        start_position =>"beginning"  #collect from the beginning of the file
    }
}
output {
    elasticsearch {                          #output to elasticsearch
        hosts => ["192.168.229.90:9200"]     #the address and port of the elasticsearch server
        index =>"system-%{+YYYY.MM.dd}"      #the index name format to use in elasticsearch
    }
}
 
systemctl restart logstash

Visit http://192.168.229.90:9100/ in a browser to view the index information.
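Alternatively, list the indices from the command line; a system-YYYY.MM.dd index should now appear, for example:

curl 'http://192.168.229.90:9200/_cat/indices?v'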

4. Instance operation: ELK Logstash deployment (operated on Apache nodes)

4.1. Change the hostname
4.2. Install the Apache service (httpd)
4.3. Install the Java environment
4.4. Install Logstash
4.5. Test Logstash
Define input and output streams: the input uses standard input and the output uses standard output (similar to pipes).
Define input and output streams: use rubydebug to display the output in a detailed format; codec is a type of codec.
4.6. Define the Logstash configuration file

5. ELK Kibana deployment (operated on the Node1 node)

5.1. Install Kibana
#Upload the package kibana-5.5.1-x86_64.rpm to the /opt directory

cd /opt
rpm -ivh kibana-5.5.1-x86_64.rpm

5.2. Set up Kibana's main configuration file

vim /etc/kibana/kibana.yml
--2-- Uncomment; the default listening port of the Kibana service is 5601
server.port: 5601
--7-- Uncomment and set Kibana's listen address; 0.0.0.0 means all addresses
server.host: "0.0.0.0"
--21-- Uncomment and set the address and port used to connect to Elasticsearch
elasticsearch.url: "http://192.168.229.90:9200"
--30-- Uncomment; add the .kibana index in elasticsearch
kibana.index: ".kibana"

5.3. Start the Kibana service

systemctl start kibana.service
systemctl enable kibana.service
 
netstat -natp | grep 5601  
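As an extra sanity check besides netstat, you can request the Kibana port with curl; any HTTP response means the service is answering:

curl -I http://192.168.229.90:5601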

Visit http://192.168.229.90:5601 in a browser.
On first login you need to add an Elasticsearch index:
Index name or pattern
//Enter: system-*   #enter the output prefix "system" configured earlier as the index name

Click the "create" button to create it, then click the "Discover" button to view the chart and log information.
The data can also be displayed by field: under "Available Fields" click "host" and then the "add" button, and you will see the results filtered by "host".

5.4. Add the Apache server logs (access, error) to Elasticsearch and display them via Kibana

vim /etc/logstash/conf.d/apache_log.conf
input {
    file{
        path => "/etc/httpd/logs/access_log"
        type => "access"
        start_position => "beginning"
    }
    file{
        path => "/etc/httpd/logs/error_log"
        type => "error"
        start_position => "beginning"
    }
}
output {
    if [type] == "access" {
        elasticsearch {
            hosts => ["192.168.229.90:9200"]
            index => "apache_access-%{+YYYY.MM.dd}"
        }
    }
    if [type] == "error" {
        elasticsearch {
            hosts => ["192.168.229.90:9200"]
            index => "apache_error-%{+YYYY.MM.dd}"
        }
    }
}
 
cd /etc/logstash/conf.d/
/usr/share/logstash/bin/logstash -f apache_log.conf

Visit http://192.168.229.90:9100 in a browser to check whether the indices have been created.
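You can also confirm this from the command line; once Apache has been accessed and Logstash has shipped the logs, the apache_access-* and apache_error-* indices should be listed, for example:

curl -s 'http://192.168.229.90:9200/_cat/indices?v' | grep apache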

Visit http://192.168.229.90:5601 in a browser to log in to Kibana, click the "Create Index Pattern" button to add an index, enter the previously configured output prefix apache_access- as the index name, and click the "Create" button. Add the apache_error- index in the same way.
Select the "Discover" tab and choose the newly added apache_access- and apache_error- indexes in the middle drop-down list; you can then view the corresponding charts and log information.

6. ELK Kibana deployment (operated on the Node1 node)

6.1. Install Kibana
6.2. Set Kibana's main configuration file and start the kibana service
6.3. Verify Kibana
① Add the system index
② Click the "Discover" button to view the chart and log information
6.4. Add the Apache server logs (access, error) to Elasticsearch and display them through Kibana
Start by adding them to Elasticsearch, then access Kibana in a browser to test.
Access the Apache server at 192.168.229.70 in a browser in order to generate access logs.
Create the apache_access index pattern.
Create the apache_error index pattern.

7. Filebeat+ELK deployment (follow the experiment above)

//Operate on the Node1 node
7.1. Install Filebeat
#Upload the software package filebeat-6.2.4-linux-x86_64.tar.gz to the /opt directory

tar zxvf filebeat-6.2.4-linux-x86_64.tar.gz
mv filebeat-6.2.4-linux-x86_64/ /usr/local/filebeat

7.2. Set up filebeat's main configuration file

cd /usr/local/filebeat
 
vim filebeat.yml
filebeat.prospectors:
- type: log                      #specify the log type: read messages from log files
  enabled: true
  paths:
    - /var/log/messages          #specify the log files to monitor
    - /var/log/*.log
  fields:                        #the fields option can be used to add extra fields to the output
    service_name: filebeat
    log_type: log
    service_id: 192.168.229.90

--------------Elasticsearch output-------------------
(comment out this entire section)

----------------Logstash output---------------------
output.logstash:
  hosts: ["192.168.229.70:5044"]      #specify the IP and port of logstash

#Start filebeat
./filebeat -e -c filebeat.yml
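Before starting Filebeat you can optionally validate the configuration and, once Logstash is listening on port 5044, the output connection; the test subcommand exists in Filebeat 6.x:

./filebeat test config -c filebeat.yml
./filebeat test output -c filebeat.yml   #requires the Logstash side (192.168.229.70:5044) to be up already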

7.3. Create a new Logstash configuration file on the node where the Logstash component is located

cd /etc/logstash/conf.d
 
vim logstash.conf
input {
    beats {
        port => "5044"
    }
}
output {
    elasticsearch {
        hosts => ["192.168.229.90:9200"]
        index => "%{[fields][service_name]}-%{+YYYY.MM.dd}"
    }
    stdout {
        codec => rubydebug
    }
}

#Start logstash
logstash -f logstash.conf

7.4. Browser access test
Visit http://192.168.229.90:5601 in a browser and log in to Kibana.

Click the "Create Index Pattern" button to add the index "filebeat-*", click the "create" button to create it, then click the "Discover" button to view the chart and log information.

8. Filebeat+ELK deployment (continuing from the experiment above; I am operating on Node1 here, but you can choose any of the virtual machines)

8.1. Install Filebeat
8.2. Set up filebeat's main configuration file
Create a new Logstash configuration file on the node where the Logstash component is located (192.168.229.70) and start it with the logstash -f logstash.conf command.
8.3. When operating on Node1, Logstash must be started before Filebeat; otherwise Filebeat will report that it cannot connect to Logstash: Failed to connect: dial tcp 192.168.229.70:5044: getsockopt: connection refused
8.4. Visit http://192.168.229.90:5601 in a browser and log in to Kibana to test:
① Click the "Create Index Pattern" button to add the index "filebeat-*", then click the "create" button to create it.
② Click the "Discover" button to view the chart and log information.


Origin blog.csdn.net/weixin_59325762/article/details/130041597