ELK log collection system cluster experiment (version 5.5.0)

Table of contents

Preface

1. Overview

2. Component introduction

1、elasticsearch

2、logstash

3、kibana

3. Architecture type

4. ELK log collection cluster experiment

1. Experimental topology

2. Install elasticsearch on node1 and node2 nodes

3. Start the elasticsearch service

4. Install the elasticsearch-head plug-in on node1

5. Test input

6. Install logstash on node1 server

7. Logstash log collection file format (stored in /etc/logstash/conf.d by default)

8. Install kibana on node1 node

5. Configure http node



Preface

ELK refers to the combination of Elasticsearch, Logstash and Kibana. They are a set of open source log collection, storage, search and visualization systems that are often used to centrally manage and analyze log data.

1. Elasticsearch: A distributed real-time search and analysis engine. It can handle large-scale data and provide fast search, aggregation and data analysis capabilities.

2. Logstash: A tool for log collection, processing and transmission. It supports collecting log data from multiple sources, can perform data cleaning, transformation and filtering, and send the data to target storage such as Elasticsearch.

3. Kibana: A tool for data visualization and analysis. It can visually display data in Elasticsearch through charts, dashboards, and reports to help users understand and analyze logs.

The workflow of the ELK log collection system is as follows:
1. Logstash configuration: Configure the input plug-in in Logstash and specify the source of log data, such as files, networks, or message queues.
2. Data processing: Use the Logstash filter plug-in to clean, convert and filter the log data to make it meet the needs, and then send the processed data to Elasticsearch.
3. Data storage: Elasticsearch indexes and stores the received log data for fast search and analysis.
4. Data visualization: Use Kibana to create visual components such as charts, dashboards, and reports, and display statistical information and trends of log data by searching and aggregating data.
5. Real-time search and analysis: Through the search function provided by Kibana, log data can be searched and analyzed in real time to help discover problems, monitor the system and optimize performance.
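
A minimal sketch of this pipeline, using the same file source, Elasticsearch address and index naming as the experiment later in this article (the grok pattern shown here is just one illustrative choice):

input {
  file { path => "/var/log/messages" type => "system" start_position => "beginning" }
}
filter {
  grok { match => { "message" => "%{SYSLOGLINE}" } }   # parse each syslog line into structured fields
}
output {
  elasticsearch { hosts => ["192.168.115.131:9200"] index => "system-%{+YYYY.MM.dd}" }
}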

The advantages of the ELK log collection system include:
- Efficient processing of large-scale log data: Elasticsearch, as a storage and search engine, can handle large-scale log data.
- Flexible data processing and filtering: Logstash provides a wealth of plug-ins and filters to flexibly process and transform log data.
- Intuitive data visualization: Kibana provides a graphical interface to display log data statistics and trends in an intuitive way.
- Real-time search and analysis: The ELK system supports real-time search and analysis, which can help quickly locate and solve problems.

In short, the ELK log collection system is a powerful combination of tools that can help enterprises centrally manage, store, search and visualize large amounts of log data, and improve system monitoring and troubleshooting capabilities.


1. Overview

1. ELK consists of three components: Elasticsearch, Logstash and Kibana.

Log collection: Logstash
Log analysis: Elasticsearch
Log visualization: Kibana

   
2. Why use it?
    Logs are essential for analyzing the state of systems and applications, but the volume of logs is usually large and they are scattered across many hosts.
    If only a few servers or programs need to be managed, we can still log in to each server one by one to view and analyze the logs. But once the number of servers or programs grows, this approach no longer scales. For that reason, centralized logging systems came into use; the better-known and more mature ones include Splunk (commercial), Facebook's Scribe, Apache's Chukwa, Cloudera's Flume, Fluentd, and ELK.
    

2. Component introduction

1、elasticsearch

Function: log analysis; an open-source program for log collection, analysis and storage

Features:

Distributed
Zero configuration
Automatic discovery
Automatic index sharding
Index replica mechanism
RESTful-style interface
Multiple data sources
Automatic search load balancing
2、logstash

Function: a tool for log collection, analysis and filtering

Working process:

The general working mode is a client/server (C/S) architecture: the client is installed on the servers whose logs need to be collected, and the server side filters and modifies the logs received from each node and then forwards them to Elasticsearch. The processing pipeline is Inputs → Filters → Outputs (input → filter → output).

inputs
File: Read from a file in the file system, similar to the tail -f command
Syslog: Listen to system log messages on port 514 and parse them according to the RFC3164 standard
Redis: read from redis service
Beats: read from filebeat
FILTERS (filter)
Grok: parses arbitrary text data. Grok is the most important Logstash plug-in; it converts unstructured text strings into structured data using regular-expression-based patterns (see the example after this list).
Officially provided grok patterns: the patterns directory of the logstash-plugins/logstash-patterns-core repository on GitHub
Grok online debugging: Grok Debugger
Mutate: convert fields, e.g. delete, replace, modify or rename fields.
Drop: drop some events without processing them.
Clone: copy the event; fields can also be added or removed during this process.
Geoip: add geographical information (used for graphical display in Kibana)
OUTPUTS
Elasticsearch: It can save data efficiently and query it conveniently and simply.
File: Save Event data to a file.
Graphite: send event data to Graphite, a currently popular open-source component for storing and graphing metrics.
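
For example, a minimal grok filter that parses Apache-style access-log lines into structured fields could look like this (a sketch; %{COMBINEDAPACHELOG} is one of the officially provided patterns):

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }   # parse the raw line into named fields
  }
  mutate {
    remove_field => [ "message" ]                      # drop the raw line once it is parsed
  }
}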

3、kibana

Function: Log visualization

A friendly web interface for Logstash and Elasticsearch that analyzes the logs they collect and store; it helps summarize, analyze and search important log data.

3. Architecture type

1、ELK
    es
    logstash
    kibana

2、ELKK
    es
    logstash
    kafka
    kibana

3. ELFK
    es
    logstash (heavyweight, uses more system resources)
    filebeat (lightweight, uses fewer system resources)
    kibana

4、ELFKK
    es
    logstash
    filebeat
    kafka
    kibana

4. ELK log collection cluster experiment

1. Experimental topology

Give each host at least 2 CPU cores and 4 GB of RAM for this experiment; otherwise, well... you know what happens.

Download address https://elasticsearch.cn/download/

Set hostnames on node1 and node2

#### Set the hostname on each node ####
### node1
hostnamectl set-hostname node1
echo "192.168.115.131 node1" >> /etc/hosts
echo "192.168.115.136 node2" >> /etc/hosts
scp /etc/hosts 192.168.115.136:/etc/hosts
bash
###node2
hostnamectl set-hostname node2
bash
###### Test connectivity ######
###node1
ping node2
###node2
ping node1

2. Install elasticsearch on node1 and node2 nodes

2.1. First check the Java environment with java -version; if Java is not installed, install it with yum install -y java-1.8.0-openjdk.
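
For reference, the two commands are:

java -version                        # check the installed Java runtime
yum install -y java-1.8.0-openjdk    # install OpenJDK 8 if it is missing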

2.2. Install elasticsearch

As shown below, this is the installation package used in this experiment.

## Install elasticsearch
rpm -ivh elasticsearch-5.5.0.rpm
## Configure
vim /etc/elasticsearch/elasticsearch.yml
### Uncomment and edit the following lines
cluster.name: my-elk-cluster          # cluster name
node.name: node1                      # node name
path.data: /var/lib/elasticsearch     # data directory
path.logs: /var/log/elasticsearch/    # log directory
bootstrap.memory_lock: false          # do not lock memory at startup
network.host: 192.168.115.131         # IP address the service binds to; 0.0.0.0 means all addresses
http.port: 9200                       # listening port 9200
discovery.zen.ping.unicast.hosts: ["node1","node2"]   # cluster discovery via unicast
### Install elasticsearch on node2 in the same way
### Copy this configuration file to node2 and just change the node name and IP
scp /etc/elasticsearch/elasticsearch.yml 192.168.115.136:/etc/elasticsearch/elasticsearch.yml
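
On node2, the copied file only needs the node name and the bind address changed, for example:

node.name: node2                      # node name on the second host
network.host: 192.168.115.136         # bind address of node2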

3. Start the elasticsearch service

3.1. Start the service on both node1 and node2: systemctl start elasticsearch.service

3.2. Browser access to view node information

192.168.115.131:9200
192.168.115.136:9200

Check the cluster health status:
192.168.115.131:9200/_cluster/health
192.168.115.136:9200/_cluster/health

Green = healthy; yellow = warning; red = cluster unavailable, a serious error.
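
The health status can also be queried from the command line, for example:

curl http://192.168.115.131:9200/_cluster/health?pretty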

4. Install the elasticsearch-head plug-in on node1

#### Build and install node.js
cd elk
tar xf node-v8.2.1.tar.gz
cd node-v8.2.1
./configure && make && make install
### Wait for the build to finish; once installed, the npm command becomes available
### Copy the phantomjs binary
cd ~/elk
tar xf phantomjs-2.1.1-linux-x86_64.tar.bz2
cd phantomjs-2.1.1-linux-x86_64/bin
cp phantomjs  /usr/local/bin
## Install elasticsearch-head
cd ~/elk
tar xf elasticsearch-head.tar.gz
cd elasticsearch-head
npm install

### Modify the elasticsearch configuration file (on both node1 and node2)
vim /etc/elasticsearch/elasticsearch.yml
 # Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
http.cors.enabled: true          # enable cross-origin access support, default is false
http.cors.allow-origin: "*"      # domains allowed for cross-origin access

4.1. Restart the service: systemctl restart elasticsearch. Check whether port 9200 is up on both nodes.

### Start elasticsearch-head
cd /root/elk/elasticsearch-head
npm run start &
## Check the listener: netstat -anput | grep :9100

4.2. Visit 192.168.115.131:9100 (the node where elasticsearch-head is running)

5. Test input

5.1、curl  -XPUT  '192.168.115.131:9200/index-demo/test/1?pretty&pretty' -H  'Content-Type: application/json' -d '{"user":"hy","mesg":"hello"}'

Visit 192.168.115.131:9100 to see our test data "hy hello"
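
To read the test document back directly from Elasticsearch (same index, type and id as the PUT above):

curl -XGET '192.168.115.131:9200/index-demo/test/1?pretty'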

6. Install logstash on node1 server

cd  elk
rpm -ivh logstash-5.5.1.rpm
systemctl start logstash.service
ln -s /usr/share/logstash/bin/logstash  /usr/local/bin/

Test 1: Standard input and output logstash -e 'input{ stdin{} }output { stdout{} }'

Test 2: Use rubydebug to decode logstash -e 'input { stdin{} } output { stdout{ codec=>rubydebug }}'

Test 3: Output to elasticsearch

logstash -e 'input { stdin{} } output { elasticsearch{ hosts=>["192.168.115.131:9200"]} }'

OK, the test is over, let’s check at 192.168.115.131:9100

Elasticsearch has already indexed the test input; the test content is there and a new index has appeared.

7. Logstash log collection file format (stored in /etc/logstash/conf.d by default)

7.1. Introduction

A Logstash configuration file consists of up to three sections: input, filter (added as needed) and output. The standard format is:
input {...}    input section
filter {...}   filter section
output {...}   output section
Multiple sources can be specified within each section. For example, to read from two log files:
input {
file{path =>"/var/log/messages" type =>"syslog"}
file { path =>"/var/log/apache/access.log"  type =>"apache"}
}

7.2. Configuration

Collect system information logs through logstash

## The system log is readable only by root; other users have no permission.
## Grant read permission to others, otherwise the log cannot be collected.
chmod o+r /var/log/messages

vim /etc/logstash/conf.d/system.conf
## system.conf is a custom name; since the system log is collected here, name the file after the application being collected
## Insert the following
input {
file{                          ## input type: file
path =>"/var/log/messages"     ## path of the system log file
type => "system"               ## custom type
start_position => "beginning"
}
}
output {
elasticsearch{
hosts =>["192.168.115.131:9200"]   ## Elasticsearch node that receives the data
index => "system-%{+YYYY.MM.dd}" ## custom index name
}
}

Restart the log service: systemctl restart logstash

Browser view 192.168.115.131:9100
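
A quick command-line check that the new index exists (the _cat API lists all indices; a system-* entry should appear):

curl '192.168.115.131:9200/_cat/indices?v'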

8. Install kibana on node1 node

8.1. Installation and configuration

#### Install kibana
cd elk
rpm -ivh kibana-5.5.1-x86_64.rpm
## Configure kibana
vim /etc/kibana/kibana.yml
server.port: 5601                         # port Kibana listens on
server.host: "0.0.0.0"                    # address Kibana listens on
elasticsearch.url: "http://192.168.115.131:9200"  # connection to Elasticsearch
kibana.index: ".kibana"  # the .kibana index added in Elasticsearch

## Start kibana
systemctl start kibana
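
To confirm that Kibana came up, check its listening port the same way as before:

netstat -anput | grep :5601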

8.2. Access kibana

Visit 192.168.115.131:9100

Visit 192.168.115.131:5601 to add indexes; the indexes listed in elasticsearch-head above can be added here.

Here I add the system-* index

5. Configure http node

1. Configure the http service of 192.168.115.140

yum -y install httpd
systemctl start httpd
netstat -anput | grep 80

Visit httpd

2. Configure Logstash on this node to collect the access log of the http server

The httpd access log path is /var/log/httpd/access_log

## Install logstash
rpm -ivh logstash-5.5.1.rpm

3. Modify the configuration file

vim /etc/logstash/conf.d/httpd.conf
### Insert the following
input {
file{                                  ## input type: file
path =>"/var/log/httpd/access_log"     ## path of the httpd access log
type => "access"                       ## custom type
start_position => "beginning"
}
}
output {
elasticsearch{
hosts =>["192.168.115.131:9200"]   ## Elasticsearch node that receives the data
index => "httpd-%{+YYYY.MM.dd}" ## custom index name
}
}

Start log collection systemctl start logstash.service

Use the logstash command to import the configuration:

Create a soft link: ln -s /usr/share/logstash/bin/logstash /usr/local/bin/

Import the configuration: logstash -f /etc/logstash/conf.d/httpd.conf
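
To generate a few access-log entries and confirm that the new index appears (192.168.115.140 is the http node in this experiment):

curl http://192.168.115.140/
curl '192.168.115.131:9200/_cat/indices?v' | grep httpd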

4. Visit 192.168.115.131:9200 to see the new data.

Open Kibana at 192.168.115.131:5601, add the httpd-* index, and check the access log of httpd.

OK, this is the end of the experiment
