Table of contents
1. Introduction to the ELK log analysis system
2. Introduction to Elasticsearch
3. Introduction to Logstash
4. Introduction to Kibana
5. How ELK works
6. Deploy the ELK log analysis system
6.1 ELK Elasticsearch cluster deployment (operate on the Node1 and Node2 nodes)
6.2 Deploy the Elasticsearch software
6.3 Create the data storage path and set ownership
6.4 Start Elasticsearch and verify
6.5 View node information
6.6 Install the Elasticsearch-head plugin
6.6.1 Compile and install node
6.6.2 Install phantomjs (front-end framework)
6.6.3 Install the Elasticsearch-head data visualization tool
6.6.4 Modify the Elasticsearch main configuration file
6.6.5 Start the elasticsearch-head service
6.6.6 View Elasticsearch information through Elasticsearch-head
6.6.7 Insert an index
6.7 ELK Logstash deployment (operate on the Apache node)
6.7.1 Change the hostname
6.7.2 Install the Apache service (httpd)
6.7.3 Install the Java environment
6.7.4 Install Logstash
6.7.5 Test Logstash
6.7.6 Define the Logstash configuration file
1. Introduction to the ELK log analysis system
ELK is a free and open-source log analysis technology stack made up of three components: Elasticsearch (often abbreviated "ES"), Logstash, and Kibana (official website: https://www.elastic.co/cn). Together they form a complete solution for data collection (Logstash), search and analysis (Elasticsearch), and data visualization (Kibana), which is why it is also called the ELK stack.
2. Introduction to Elasticsearch
2.1 Overview of Elasticsearch
Elasticsearch provides a distributed full-text search engine.
Elasticsearch is a distributed storage and retrieval engine built on Lucene (a full-text search engine library) and is commonly used to store logs of all kinds. It is written in Java, and clients communicate with it over a RESTful web interface (for example, from a browser or with curl). Elasticsearch is a real-time, distributed, scalable search engine that supports both full-text and structured search. It is typically used to index and search large volumes of log data, but it can also search many other types of documents.
3. Introduction to Logstash
Logstash serves as the data collection engine. It can dynamically collect data from a variety of sources, then filter, analyze, enrich, and normalize the data before storing it in a destination chosen by the user, most commonly Elasticsearch. Logstash is written in Ruby and runs on the Java Virtual Machine (JVM). It is a powerful data processing tool that handles data transport, format processing, and formatted output, and its rich plugin ecosystem makes it a common choice for log processing.
A Logstash pipeline has three corresponding stages: input (data collection), filter (data filtering), and output (data delivery).
4. Introduction to Kibana
Kibana is usually deployed together with Elasticsearch. Kibana is a powerful data visualization Dashboard for Elasticsearch. Kibana provides a graphical web interface to browse Elasticsearch log data, which can be used to summarize, analyze and search for important data.
5. How ELK works
(1) Deploy Logstash on all servers that need to collect logs; or centralize the management of logs on the log server first, and deploy Logstash on the log server.
(2) Logstash collects logs, formats and outputs logs to the Elasticsearch cluster.
(3) Elasticsearch indexes and stores the formatted data.
(4) Kibana queries data from the ES cluster, generates charts, and displays them to users.
Summary: Logstash, as the log collector, gathers data from the data sources, filters and formats it, and stores it in Elasticsearch; Kibana then visualizes the logs.
6. Deploy the ELK log analysis system
Node1 node (2C/4G): node1/192.168.237.21 Elasticsearch Kibana
Node2 node (2C/4G): node2/192.168.237.22 Elasticsearch
Apache node: apache/192.168.237.23 Logstash Apache
6.1 ELK Elasticsearch cluster deployment (operate on Node1, Node2 nodes)
Perform the following on both node machines:
vim /etc/hosts
192.168.237.21 node1
192.168.237.22 node2
Note: Java version
java -version #Check first; if Java is not installed:
yum -y install java
java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-b12)
OpenJDK 64-Bit Server VM (build 25.131-b12, mixed mode)
Using the official JDK is recommended.
6.2 Deploy Elasticsearch software
(1) Install the elasticsearch RPM package
#Upload elasticsearch-5.5.0.rpm to the /opt directory
cd /opt
rpm -ivh elasticsearch-5.5.0.rpm
(2) Load the system service
systemctl daemon-reload
systemctl enable elasticsearch.service
(3) Modify the Elasticsearch main configuration file
cp /etc/elasticsearch/elasticsearch.yml /etc/elasticsearch/elasticsearch.yml.bak
vim /etc/elasticsearch/elasticsearch.yml
--17--Uncomment and set the cluster name
cluster.name: my-elk-cluster
--23--Uncomment and set the node name: node1 on the Node1 node, node2 on the Node2 node
node.name: node1
--33--Uncomment and set the data storage path
path.data: /data/elk_data
--37--Uncomment and set the log storage path
path.logs: /var/log/elasticsearch/
--43--Uncomment and disable locking memory at startup
bootstrap.memory_lock: false
--55--Uncomment and set the listen address; 0.0.0.0 means all addresses
network.host: 0.0.0.0
--59--Uncomment; the default listen port of the ES service is 9200
http.port: 9200
--68--Uncomment; cluster discovery is done via unicast, so specify the nodes to discover: node1 and node2
discovery.zen.ping.unicast.hosts: ["node1", "node2"]
grep -v "^#" /etc/elasticsearch/elasticsearch.yml
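If the file was edited correctly, the grep above should print only the active (non-comment) settings. As a sketch, the expected output on node1 looks roughly like this (your paths and names may differ):

```yaml
cluster.name: my-elk-cluster
node.name: node1
path.data: /data/elk_data
path.logs: /var/log/elasticsearch/
bootstrap.memory_lock: false
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["node1", "node2"]
```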
6.3 Create the data storage path and set ownership
mkdir -p /data/elk_data
chown elasticsearch:elasticsearch /data/elk_data/
6.4 Start Elasticsearch and verify
systemctl start elasticsearch.service
netstat -antp | grep 9200
6.5 View node information
Open a browser to check the Node1 and Node2 nodes:
http://192.168.237.21:9200/_cluster/health?pretty
http://192.168.237.22:9200/_cluster/health?pretty
Check the health status of the cluster; a status value of "green" means the cluster is running healthily.
#Checking cluster status this way is not very user-friendly; installing the Elasticsearch-head plugin makes managing the cluster more convenient.
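Alternatively, the "status" field can be checked from the shell. This is only a sketch: the JSON below is a hypothetical, abbreviated _cluster/health response (on a live cluster you would capture it with `response=$(curl -s http://192.168.237.21:9200/_cluster/health)`), and the sed extraction avoids a jq dependency:

```shell
# Hypothetical abbreviated response from _cluster/health, flattened to one line
response='{"cluster_name":"my-elk-cluster","status":"green","number_of_nodes":2}'

# Pull out the status field with sed; "green" means all primary and replica shards are allocated
status=$(echo "$response" | sed 's/.*"status":"\([a-z]*\)".*/\1/')
echo "cluster status: $status"
```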
6.6 Install the Elasticsearch-head plugin
Since Elasticsearch version 5.0, the Elasticsearch-head plugin must be installed as an independent service, using the npm tool (the package manager for Node.js).
Installing Elasticsearch-head requires two dependencies to be installed first: node and phantomjs.
node: a JavaScript runtime environment based on the Chrome V8 engine.
phantomjs: a WebKit-based, scriptable headless browser; it can do anything a WebKit-based browser can do, just without a visible window.
6.6.1 Compile and install node
#Upload the node-v8.2.1.tar.gz package to /opt
yum install gcc gcc-c++ make -y
cd /opt
tar zxvf node-v8.2.1.tar.gz
cd node-v8.2.1/
./configure
make && make install
6.6.2 Install phantomjs (front-end framework)
#Upload the phantomjs-2.1.1-linux-x86_64.tar.bz2 package to
cd /opt
tar jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2 -C /usr/local/src/
cd /usr/local/src/phantomjs-2.1.1-linux-x86_64/bin
cp phantomjs /usr/local/bin
6.6.3 Install Elasticsearch-head data visualization tool
#Upload the elasticsearch-head.tar.gz package to /opt
cd /opt
tar zxvf elasticsearch-head.tar.gz -C /usr/local/src/
cd /usr/local/src/elasticsearch-head/
npm install
6.6.4 Modify the Elasticsearch main configuration file
vim /etc/elasticsearch/elasticsearch.yml
...... --add the following at the end of the file--
http.cors.enabled: true #Enable cross-origin (CORS) access support; the default is false
http.cors.allow-origin: "*" #Allow cross-origin requests from any domain
systemctl restart elasticsearch
6.6.5 Start elasticsearch-head service
#The service must be started from inside the unpacked elasticsearch-head directory; the process reads the Gruntfile.js in that directory, and startup may fail otherwise.
cd /usr/local/src/elasticsearch-head/
npm run start &
netstat -natp |grep 9100
6.6.6 View Elasticsearch information through Elasticsearch-head
6.6.7 Insert an index
#Insert a test index named index-demo, with type test, using the command below:
curl -X PUT 'localhost:9200/index-demo/test/1?pretty&pretty' -H 'content-Type: application/json' -d '{"user":"zhangsan","mesg":"hello world"}'
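If the document is created successfully, Elasticsearch 5.x returns JSON along these lines (a sketch of the expected response; the exact _shards counts depend on replica allocation):

```json
{
    "_index" : "index-demo",
    "_type" : "test",
    "_id" : "1",
    "_version" : 1,
    "result" : "created",
    "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
    },
    "created" : true
}
```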
Visit http://192.168.237.21:9100/ in a browser to view the index information. You will see that the index is split into 5 shards by default, each with one replica. Click "Data Browse" and you will find the index-demo index created on node1, along with the details of its type test documents.
6.7 ELK Logstash deployment (operating on Apache nodes)
6.7.1 Changing the hostname
hostnamectl set-hostname apache
6.7.2 Install Apache service (httpd)
yum -y install httpd
systemctl start httpd
6.7.3 Install Java environment
yum -y install java
java -version
6.7.4 Install logstash
#Upload logstash-5.5.1.rpm to the /opt directory
cd /opt
rpm -ivh logstash-5.5.1.rpm
systemctl start logstash.service
systemctl enable logstash.service
ln -s /usr/share/logstash/bin/logstash /usr/local/bin/
6.7.5 Testing Logstash
Common options for the logstash command:
-f: specify a Logstash configuration file; the input and output streams are configured according to that file.
-e: take the configuration as a string directly from the command line (if empty, stdin is used as input and stdout as output by default).
-t: test whether the configuration file is correct, then exit.
Define input and output streams:
#Input uses standard input; output uses standard output (similar to a pipe)
logstash -e 'input { stdin{} } output { stdout{} }'
......
www.baidu.com #typed input (stdin)
2020-12-22T03:58:47.799Z node1 www.baidu.com #result (stdout)
www.sina.com.cn #typed input (stdin)
2020-12-22T03:59:02.908Z node1 www.sina.com.cn #result (stdout)
//Press Ctrl+C to exit
#Use rubydebug to print output in a detailed format; codec specifies an encoder/decoder
logstash -e 'input { stdin{} } output { stdout{ codec=>rubydebug } }'
......
www.baidu.com #typed input (stdin)
#Use Logstash to write information into Elasticsearch
logstash -e 'input { stdin{} } output { elasticsearch { hosts=>["192.168.237.21:9200"] } }'
(input -> output -> connect to Elasticsearch)
......
www.sina.com.cn #typed input (stdin)
//This time the results are not shown on standard output; they are sent to Elasticsearch instead. Visit http://192.168.237.21:9100/ in a browser to view the index information and browse the data.
6.7.6 Define the logstash configuration file
The Logstash configuration file basically consists of three parts: input, filter (optional; use as needed), and output.
input: collect data from data sources; common sources include Kafka and log files.
filter: the data processing layer, covering data formatting, type conversion, and filtering; regular expressions are supported.
output: send the data collected by Logstash, after filter processing, to Elasticsearch.
#The format is as follows:
input {...}
filter {...}
output {...}
#In each section, multiple access methods can also be specified. For example, to specify two log source files, the format is as follows:
input { file { path =>"/var/log/messages" type =>"syslog"}
file { path =>"/var/log/httpd/access.log" type =>"apache"} }
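A filter stage can sit between input and output. As an illustrative sketch only (not part of this deployment), a hypothetical grok filter could parse the Apache access log into structured fields before the events reach Elasticsearch:

```
input { file { path =>"/var/log/httpd/access.log" type =>"apache" } }
filter {
    # COMBINEDAPACHELOG is a grok pattern shipped with Logstash that splits an
    # Apache combined-format log line into fields (clientip, verb, request, response, ...)
    grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
}
output { elasticsearch { hosts => ["192.168.237.21:9200"] } }
```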
Modify the Logstash configuration file so that it collects the system log /var/log/messages and outputs it to Elasticsearch.
chmod +r /var/log/messages #give Logstash read access to the log
vim /etc/logstash/conf.d/system.conf
input {
file{
path =>"/var/log/messages" #location of the log to collect
type =>"system" #custom log type label
start_position =>"beginning" #collect from the beginning of the file
}
}
output {
elasticsearch { #output to Elasticsearch
hosts => ["192.168.237.21:9200"] #address and port of the Elasticsearch server
index =>"system-%{+YYYY.MM.dd}" #format of the index name in Elasticsearch
}
}
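The %{+YYYY.MM.dd} in the index name is a Logstash date pattern, so a new index is created each day (for example, system-2020.12.22). Purely as an illustration of the naming shape, the equivalent can be printed with date:

```shell
# Illustrative only: print today's index name in the same shape Logstash produces
echo "system-$(date +%Y.%m.%d)"
```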
systemctl restart logstash
Open http://192.168.237.21:9100/ in a browser to view the index information.