Article Directory
- 1. Overview of ELK log analysis system
- 2. Operation steps of ELK log analysis system cluster deployment
- 1. ELK Elasticsearch cluster deployment (operated on Node1 and Node2 nodes)
- 2. Instance operation: ELK Elasticsearch cluster deployment (operated on Node1 and Node2 nodes)
- 3. ELK Logstash deployment (operated on Apache nodes)
- 4. Instance operation: ELK Logstash deployment (operated on Apache nodes)
- 5. ELK Kibana deployment (operate on Node1 node)
- 6. Instance operation: ELK Kibana deployment (operate on Node1 node)
- 7. Filebeat+ELK deployment (follow the experiment above)
- 8. Filebeat+ELK deployment (follow the above experiment, I am operating on Node1 here, you can choose any virtual machine to operate)
1. Overview of ELK log analysis system
1. Introduction to ELK
ELK is the abbreviation of three open source projects: Elasticsearch, Logstash, and Kibana. A newer addition is FileBeat, a lightweight log collection and processing tool (agent). Filebeat uses few resources and is well suited to collecting logs on each server and forwarding them to Logstash; it is also the officially recommended tool for this.
Elasticsearch is an open source distributed search engine that provides data collection, analysis, and storage.
Its features include: distributed, zero configuration, automatic discovery, automatic index sharding, an index replica mechanism, a RESTful interface, multiple data sources, and automatic search load balancing. It is mainly responsible for indexing and storing logs so that business teams can retrieve and query them.
Logstash is mainly a tool for collecting, parsing, and filtering logs, and it supports a large number of data acquisition methods.
It generally works in a client/server architecture: the client is installed on the hosts whose logs need collecting, and the server parses, filters, and modifies the logs received from each node, then sends them to Elasticsearch.
It is middleware for log collection, filtering, and forwarding: it collects and filters the various logs of each business line in a unified way and forwards them to Elasticsearch for further processing.
Kibana is also an open source and free tool. It provides a friendly web interface for log analysis on top of Logstash and Elasticsearch, helping to summarize, analyze, and search important log data.
2. Reasons for using ELK
Logs mainly include system logs, application logs, and security logs. Operations staff and developers can use logs to understand server software and hardware information, and to check configuration errors and their causes. Analyzing logs regularly helps you understand the server's load and performance security, so you can take timely measures to correct errors.
We can usually analyze the logs of a single machine with tools such as grep and awk, but this breaks down when logs are scattered across many devices. If you manage tens or hundreds of servers, logging into each machine in turn to view logs is cumbersome and inefficient. The urgent first step is centralized log management, for example using open source syslog to collect and aggregate logs from all servers. Once logs are centralized, however, searching and computing statistics over them becomes the next problem. Linux commands such as grep, awk, and wc can handle simple retrieval and counting, but for more demanding queries, sorting, and statistics, and across a huge number of machines, this approach falls short.
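For example, the traditional single-machine approach described here can be sketched with grep, awk, and wc over a few sample log lines (hypothetical entries standing in for a real /var/log/messages):

```shell
# Count failed logins and distinct users in a small sample log with
# classic command-line tools -- workable on one machine, painful on many.
log=$(mktemp)
cat > "$log" <<'EOF'
Jan 01 10:00:01 node1 sshd[100]: Failed password for root
Jan 01 10:00:05 node1 sshd[101]: Accepted password for admin
Jan 01 10:00:09 node1 sshd[102]: Failed password for root
EOF
failed=$(grep -c 'Failed password' "$log")                        # lines with failures
users=$(awk '{print $NF}' "$log" | sort -u | wc -l | tr -d ' ')   # distinct users
echo "failed=$failed users=$users"
```

This works for one box; ELK replaces exactly this per-machine loop with centralized collection and search.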
A large-scale system is generally deployed as a distributed architecture, with different service modules on different servers. When a problem occurs, in most cases you must locate the specific server and service module based on the key information the problem exposes. Building a centralized log system improves the efficiency of locating problems.
3. Basic features of complete log system
Collection: able to collect log data from multiple sources
Transmission: able to parse and filter log data and stably transmit it to the storage system
Storage: store log data
Analysis: support UI analysis
Warning: provide error reporting and monitoring mechanisms
4. Working principle of ELK
(1) Deploy Logstash on all servers whose logs need to be collected; or first centralize log management on a log server and deploy Logstash on that log server.
(2) Logstash collects logs, formats and outputs logs to the Elasticsearch cluster.
(3) Elasticsearch indexes and stores the formatted data.
(4) Kibana queries the data from the ES cluster to generate charts and displays the front-end data.
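As a sketch only (the real configuration is built step by step in the deployment below, and the file path and ES address here are illustrative), the four steps map onto the skeleton of a Logstash pipeline definition:

```conf
input {
    file { path => "/var/log/messages" }        # (1) Logstash collects logs on the monitored host
}
output {
    elasticsearch {                             # (2) formatted events are shipped to the ES cluster,
        hosts => ["192.168.229.90:9200"]        # (3) which indexes and stores them
    }
}
# (4) Kibana then queries the resulting indices from ES to build charts.
```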
2. Operation steps of ELK log analysis system cluster deployment
Environment preparation:
Close the firewall and SELinux on all servers
systemctl stop firewalld
setenforce 0
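Note that setenforce 0 only disables SELinux until the next reboot; to persist the change, /etc/selinux/config must be edited as well. A sketch, demonstrated against a temporary sample copy of the file rather than the real one:

```shell
# Demonstrate the persistent SELinux change on a sample copy of
# /etc/selinux/config (on an actual server, edit the real file).
conf=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$conf"
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' "$conf"
line=$(grep '^SELINUX=' "$conf")    # the SELINUXTYPE= line is not matched
echo "$line"
```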
1. ELK Elasticsearch cluster deployment (operated on Node1 and Node2 nodes)
1.1. Change the host name, configure domain name resolution, and view the Java environment
On Node1: hostnamectl set-hostname node1
On Node2: hostnamectl set-hostname node2
vim /etc/hosts
192.168.229.90 node1
192.168.229.80 node2
java -version       # if not installed: yum -y install java
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)
1.2. Deploy Elasticsearch software
(1) Install the elasticsearch rpm package
#Upload elasticsearch-5.5.0.rpm to the /opt directory
cd /opt
rpm -ivh elasticsearch-5.5.0.rpm
(2) Loading system services
systemctl daemon-reload
systemctl enable elasticsearch.service
(3) Modify the elasticsearch main configuration file
cp /etc/elasticsearch/elasticsearch.yml /etc/elasticsearch/elasticsearch.yml.bak
vim /etc/elasticsearch/elasticsearch.yml
--17--uncomment and specify the cluster name
cluster.name: my-elk-cluster
--23--uncomment and specify the node name: node1 on the Node1 node, node2 on the Node2 node
node.name: node1
--33--uncomment and specify the data storage path
path.data: /data/elk_data
--37--uncomment and specify the log storage path
path.logs: /var/log/elasticsearch/
--43--uncomment and change to not lock memory at startup
bootstrap.memory_lock: false
--55--uncomment and set the listen address; 0.0.0.0 means all addresses
network.host: 0.0.0.0
--59--uncomment; the default listening port of the ES service is 9200
http.port: 9200
--68--uncomment; cluster discovery is implemented via unicast, specify the nodes to discover: node1, node2
discovery.zen.ping.unicast.hosts: ["node1", "node2"]
grep -v "^#" /etc/elasticsearch/elasticsearch.yml
(4) Create a data storage path and authorize it
mkdir -p /data/elk_data
chown elasticsearch:elasticsearch /data/elk_data/
(5) Check whether elasticsearch started successfully
systemctl start elasticsearch.service
netstat -antp | grep 9200
(6) View node information
Visit http://192.168.229.90:9200 and http://192.168.229.80:9200 in a browser to view the information of nodes Node1 and Node2.
Visit http://192.168.229.90:9200/_cluster/health?pretty and http://192.168.229.80:9200/_cluster/health?pretty in a browser to check cluster health; a status value of green means the nodes are running healthily.
Visit http://192.168.229.90:9200/_cluster/state?pretty in a browser to check the cluster state information.
#Viewing the cluster state this way is not user-friendly; installing the Elasticsearch-head plugin makes managing the cluster much more convenient.
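The browser health check above can also be scripted. The sketch below parses the status field from a sample health response; against the live cluster, the commented curl line would produce the real JSON:

```shell
# health=$(curl -s http://192.168.229.90:9200/_cluster/health)   # live cluster
health='{"cluster_name":"my-elk-cluster","status":"green","number_of_nodes":2}'
status=$(echo "$health" | sed 's/.*"status":"\([a-z]*\)".*/\1/')
echo "cluster status: $status"
```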
1.3. Install the Elasticsearch-head plugin
Since Elasticsearch version 5.0, the Elasticsearch-head plugin must be installed as an independent service, using the npm tool (a package management tool for NodeJS).
To install Elasticsearch-head, the dependencies node and phantomjs must be installed in advance.
node: a JavaScript runtime environment based on the Chrome V8 engine.
phantomjs: a WebKit-based JavaScript API, which can be understood as an invisible browser that can do anything a WebKit-based browser can do.
(1) Compile and install node
#Upload software package node-v8.2.1.tar.gz to /opt
yum install gcc gcc-c++ make -y
cd /opt
tar zxvf node-v8.2.1.tar.gz
cd node-v8.2.1/
./configure
make && make install
(2) Install phantomjs
#Upload the package phantomjs-2.1.1-linux-x86_64.tar.bz2 to the /opt directory
cd /opt
tar jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2 -C /usr/local/src/
cd /usr/local/src/phantomjs-2.1.1-linux-x86_64/bin
cp phantomjs /usr/local/bin
(3) Install the Elasticsearch-head data visualization tool
#Upload the software package elasticsearch-head.tar.gz to /opt
cd /opt
tar zxvf elasticsearch-head.tar.gz -C /usr/local/src/
cd /usr/local/src/elasticsearch-head/
npm install
(4) Modify the Elasticsearch main configuration file
vim /etc/elasticsearch/elasticsearch.yml
......
--add the following at the end--
http.cors.enabled: true                 #enable cross-origin access support; default is false
http.cors.allow-origin: "*"             #allow cross-origin access from all domains
systemctl restart elasticsearch
(5) Start elasticsearch-head service
#The service must be started in the decompressed elasticsearch-head directory, and the process will read the gruntfile.js file in this directory, otherwise it may fail to start.
cd /usr/local/src/elasticsearch-head/
npm run start &
> [email protected] start /usr/local/src/elasticsearch-head
> grunt server
Running "connect:server" (connect) task
Waiting forever...
Started connect web server on http://localhost:9100
#elasticsearch-head listens on port 9100
netstat -natp |grep 9100
(6) View Elasticsearch information through Elasticsearch-head
Access the http://192.168.229.90:9100/ address through a browser and connect to the cluster. If you see the cluster health value is green, it means the cluster is healthy.
(7) Insert index
#Insert a test index with a command; the index is index-demo and the type is test.
curl -X PUT 'localhost:9200/index-demo/test/1?pretty&pretty' -H 'content-Type: application/json' -d '{"user":"zhangsan","mesg":"hello world"}'
//The output is as follows:
{
"_index" : "index-demo",
"_type" : "test",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"created" : true
}
Visit http://192.168.229.90:9100/ in a browser to view the index information; you can see that the index is split into 5 shards by default, with one replica.
Click "Data Browse" and you will find the index index-demo created on node1, with the related information of type test.
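The inserted test document can also be read back with curl. A sketch, with a sample response standing in for the live cluster (the real command is in the comment):

```shell
# doc=$(curl -s 'localhost:9200/index-demo/test/1')   # live cluster
doc='{"_index":"index-demo","_type":"test","_id":"1","found":true,"_source":{"user":"zhangsan","mesg":"hello world"}}'
user=$(echo "$doc" | sed 's/.*"user":"\([^"]*\)".*/\1/')
echo "user=$user"
```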
2. Instance operation: ELK Elasticsearch cluster deployment (operated on Node1 and Node2 nodes)
Close firewall
2.1. Change host name, configure domain name resolution, view Java environment
2.2. Deploy Elasticsearch software
(1) Install the elasticsearch rpm package
(2) Load system service
(3) Modify elasticsearch main configuration file
(4) Create data storage path and authorize
(5) Check whether elasticsearch started successfully
The above steps need to be performed on both node1 and node2. Only Node1's configuration is listed here; the two configurations are the same except for the node name in the configuration file.
(6) View node information
View the node information of node1 and node2
2.3. Install the Elasticsearch-head plugin (node1 node2)
(1) Compile and install node
(2) Install phantomjs
(3) Install Elasticsearch-head data visualization tool
(4) Modify Elasticsearch main configuration file
(5) Start elasticsearch-head service
(6) View Elasticsearch information through Elasticsearch-head
(7) Insert index
3. ELK Logstash deployment (operated on Apache nodes)
Logstash is generally deployed on servers whose logs need to be monitored. In this case, Logstash is deployed on the Apache server to collect the log information of the Apache server and send it to Elasticsearch.
3.1. Change hostname
hostnamectl set-hostname apache
3.2. Install Apache service (httpd)
yum -y install httpd
systemctl start httpd
3.3. Install the Java environment
yum -y install java
java -version
3.4. Install logstash
#Upload the software package logstash-5.5.1.rpm to the /opt directory
cd /opt
rpm -ivh logstash-5.5.1.rpm
systemctl start logstash.service
systemctl enable logstash.service
ln -s /usr/share/logstash/bin/logstash /usr/local/bin/
3.5. Test Logstash
Logstash command common options:
-f: specify the Logstash configuration file; Logstash configures its input and output streams according to it.
-e: take the configuration from a string on the command line (if empty, stdin is used as input and stdout as output by default).
-t: test whether the configuration file is correct, then exit.
Define input and output streams:
#Input uses standard input, output uses standard output (similar to pipes)
logstash -e 'input { stdin{} } output { stdout{} }'
......
www.baidu.com                                           #typed content (stdin)
2020-12-22T03:58:47.799Z node1 www.baidu.com            #output result (stdout)
www.sina.com.cn                                         #typed content (stdin)
2017-12-22T03:59:02.908Z node1 www.sina.com.cn          #output result (stdout)
//press ctrl+c to exit
#Use rubydebug to display detailed output; codec is a type of codec (encoder/decoder)
logstash -e 'input { stdin{} } output { stdout{ codec=>rubydebug } }'
......
www.baidu.com                                           #typed content (stdin)
{
"@timestamp" => 2020-12-22T02:15:39.136Z,               #output result (processed)
"@version" => "1",
"host" => "apache",
"message" => "www.baidu.com"
}
#Use Logstash to write input into Elasticsearch
logstash -e 'input { stdin{} } output { elasticsearch { hosts=>["192.168.229.90:9200"] } }'
(the input is connected to the Elasticsearch output)
......
www.baidu.com                                           #typed content (stdin)
www.sina.com.cn                                         #typed content (stdin)
www.google.com                                          #typed content (stdin)
//The result is not displayed on standard output but sent to Elasticsearch; visit http://192.168.229.90:9100/ in a browser to view the index information and browse the data.
3.6. Define the logstash configuration file
The Logstash configuration file basically consists of three parts: input, output, and filter (optional, use as needed).
#The format is as follows:
input {...}
filter {...}
output {...}
#In each section, multiple access methods can also be specified. For example, to specify two log source files, the format is as follows:
input {
file { path =>"/var/log/messages" type =>"syslog"}
file { path =>"/var/log/httpd/access.log" type =>"apache"}
}
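The optional filter section sits between input and output. For example (a sketch only, not used in this deployment), Apache access logs could be parsed with Logstash's built-in grok pattern COMBINEDAPACHELOG:

```conf
filter {
    if [type] == "apache" {
        grok {
            # split each raw access-log line into structured fields
            match => { "message" => "%{COMBINEDAPACHELOG}" }
        }
    }
}
```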
#Modify the Logstash configuration file to collect system logs /var/log/messages and output them to elasticsearch.
chmod +r /var/log/messages              #make the log readable by Logstash
vim /etc/logstash/conf.d/system.conf
input {
    file{
        path =>"/var/log/messages"      #location of the log to collect
        type =>"system"                 #custom log type identifier
        start_position =>"beginning"    #collect from the beginning of the file
    }
}
output {
    elasticsearch {                     #output to elasticsearch
        hosts => ["192.168.229.90:9200"]        #address and port of the elasticsearch server
        index =>"system-%{+YYYY.MM.dd}"         #index format to output to elasticsearch
    }
}
systemctl restart logstash
Browser access http://192.168.229.90:9100/ to view index information
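The index name system-%{+YYYY.MM.dd} produces one index per day; the name Logstash would write today can be reproduced with date(1):

```shell
# Expand the daily index pattern the way Logstash does for today's date.
index_name="system-$(date +%Y.%m.%d)"
echo "$index_name"
```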
4. Instance operation: ELK Logstash deployment (operated on Apache nodes)
4.1. Change hostname
4.2. Install Apache service (httpd)
4.3. Install the Java environment
4.4. Install logstash
4.5. Test Logstash
Define input and output streams: input uses standard input, output uses standard output (similar to a pipe)
Define input and output streams: use rubydebug to display detailed output; codec is a type of codec
4.6. Define the logstash configuration file
5. ELK Kibana deployment (operate on Node1 node)
5.1. Install Kibana
#Upload the package kibana-5.5.1-x86_64.rpm to the /opt directory
cd /opt
rpm -ivh kibana-5.5.1-x86_64.rpm
5.2. Set up Kibana's main configuration file
vim /etc/kibana/kibana.yml
--2--uncomment; the default listening port of the Kibana service is 5601
server.port: 5601
--7--uncomment; set Kibana's listening address, 0.0.0.0 means all addresses
server.host: "0.0.0.0"
--21--uncomment; set the address and port for connecting to Elasticsearch
elasticsearch.url: "http://192.168.229.90:9200"
--30--uncomment; add the .kibana index in elasticsearch
kibana.index: ".kibana"
5.3. Start the Kibana service
systemctl start kibana.service
systemctl enable kibana.service
netstat -natp | grep 5601
Browser access http://192.168.229.90:5601
For the first login, you need to add an Elasticsearch index:
Index name or pattern
//Enter: system-*           #enter the previously configured Output prefix "system" as the index name
Click the "create" button to create it, then click the "Discover" button to view chart and log information.
Data can be displayed by category: under "Available Fields", find "host" and click the "add" button to see the results filtered by "host".
5.4. Add Apache server logs (access, error) to Elasticsearch and display them via Kibana
vim /etc/logstash/conf.d/apache_log.conf
input {
file{
path => "/etc/httpd/logs/access_log"
type => "access"
start_position => "beginning"
}
file{
path => "/etc/httpd/logs/error_log"
type => "error"
start_position => "beginning"
}
}
output {
if [type] == "access" {
elasticsearch {
        hosts => ["192.168.229.90:9200"]
index => "apache_access-%{+YYYY.MM.dd}"
}
}
if [type] == "error" {
elasticsearch {
        hosts => ["192.168.229.90:9200"]
index => "apache_error-%{+YYYY.MM.dd}"
}
}
}
cd /etc/logstash/conf.d/
/usr/share/logstash/bin/logstash -f apache_log.conf
Browser access http://192.168.229.90:9100 to check whether the index is created
Visit http://192.168.229.90:5601 in a browser to log in to Kibana, click the "Create Index Pattern" button to add an index, enter the previously configured Output prefix apache_access- as the index name, and click the "Create" button. Add the apache_error- index in the same way.
Select the "Discover" tab, select the newly added apache_access- and apache_error- indexes in the drop-down list in the middle, and you can view the corresponding charts and log information.
6. Instance operation: ELK Kibana deployment (operate on Node1 node)
6.1. Install Kibana
6.2. Set the main configuration file of Kibana and start the kibana service
6.3. Verify Kibana
① Add system index
② Click the "Discover" button to view chart information and log information
6.4. Add Apache server logs (access, error) to Elasticsearch and display them through Kibana
Start adding to Elasticsearch
Access Kibana in a browser to test
Access the Apache server 192.168.229.70 in a browser to generate access logs
Create the apache_access index log
Create the apache_error index log
7. Filebeat+ELK deployment (follow the experiment above)
//Operate on Node1 node
7.1. Install Filebeat
#Upload the software package filebeat-6.2.4-linux-x86_64.tar.gz to the /opt directory
tar zxvf filebeat-6.2.4-linux-x86_64.tar.gz
mv filebeat-6.2.4-linux-x86_64/ /usr/local/filebeat
7.2. Set up filebeat's main configuration file
cd /usr/local/filebeat
vim filebeat.yml
filebeat.prospectors:
- type: log                     #specify log type: read messages from log files
  enabled: true
  paths:
    - /var/log/messages         #specify the log files to monitor
    - /var/log/*.log
  fields:                       #the fields option adds custom parameter fields to the output
    service_name: filebeat
    log_type: log
    service_id: 192.168.229.90
--------------Elasticsearch output-------------------
(comment it all out)
----------------Logstash output---------------------
output.logstash:
  hosts: ["192.168.229.70:5044"]        #specify the IP and port of logstash
#start filebeat
./filebeat -e -c filebeat.yml
7.3. Create a new Logstash configuration file on the node where the Logstash component is located
cd /etc/logstash/conf.d
vim logstash.conf
input {
beats {
port => "5044"
}
}
output {
elasticsearch {
        hosts => ["192.168.229.90:9200"]
index => "%{[fields][service_name]}-%{+YYYY.MM.dd}"
}
stdout {
codec => rubydebug
}
}
#start logstash
logstash -f logstash.conf
7.4. Browser access test
Browser access http://192.168.229.90:5601 Log in to Kibana
Click the "Create Index Pattern" button to add the index "filebeat-*", click the "create" button to create, click the "Discover" button to view the chart information and log information.
8. Filebeat+ELK deployment (follow the above experiment, I am operating on Node1 here, you can choose any virtual machine to operate)
8.1. Install Filebeat
8.2. Set up filebeat's main configuration file
Create a new Logstash configuration file on the node where the Logstash component is located (192.168.229.70), and start it with the logstash -f logstash.conf command.
8.3. When operating on Node1, logstash must be started first, before filebeat; otherwise filebeat will report that it cannot connect to logstash: Failed to connect: dial tcp 192.168.229.70:5044: getsockopt: connection refused
8.4. Browser access http://192.168.229.90:5601 to log in to Kibana and test:
① Click the "Create Index Pattern" button to add the index "filebeat-*", and click the "create" button to create it
② Click the "Discover" button to view the chart information and log information.