Building a real-time log analysis platform with ELK (Elasticsearch + Logstash + Kibana)

1. Introduction

      ELK consists of three parts: Elasticsearch, Logstash, and Kibana. Elasticsearch is a near real-time search platform that lets you process big data at unprecedented speed.

      None of the technologies involved in Elasticsearch are innovative or revolutionary on their own; full-text search, analytics systems, and distributed databases already exist. What is revolutionary is the integration of these independent, useful technologies into a single real-time application. Elasticsearch is document-oriented, which means it can store entire objects or documents. It does not just store them; it indexes the content of each document so that it can be searched. In Elasticsearch you index, search, sort, and filter documents rather than rows and columns of data. This way of modeling data is completely different from the traditional approach, and it is one of the reasons Elasticsearch can perform complex full-text searches.
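
To make the document-oriented model concrete, here is a minimal sketch of indexing and searching a document over the REST API. It assumes a local single-node Elasticsearch 2.x instance on port 9200; the index name "blog" and the document fields are arbitrary examples, not part of the setup described later in this article.

# Index (store and make searchable) a sample JSON document
curl -XPUT 'http://localhost:9200/blog/post/1' -d '{"title": "hello elk", "body": "a first test document"}'
# Full-text search across all documents in the index
curl -XGET 'http://localhost:9200/blog/_search?q=body:test&pretty'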

      Most application logs are written to log files on the server. These logs are mostly needed by developers, yet developers usually do not have permission to log in to the servers; if they need a log, they must ask operations staff to fetch it from the server and hand it over. Imagine a company with 10 developers, each asking operations for a log every day: that is a lot of work for the operations staff and greatly hurts their efficiency. With the ELK stack, developers can log in to Kibana and view the logs directly, without going through operations, which reduces the operations workload.

      Logs come in many types, are scattered across different locations, and are hard to find. If an access failure occurs on a LAMP/LNMP website, you may need to query the logs to analyze the cause: to view the Apache error log you must log in to the Apache server, and to view the database error log you must log in to the database server. Now imagine a cluster environment with dozens of hosts. With the ELK stack deployed, you can simply log in to the Kibana page to view the logs; to view a different type of log, you only need to switch the index with a mouse click.

Logstash: A log collection tool that can collect various logs from local disks, network services (listen on ports, accept user logs), and message queues, then filter and analyze them, and output the logs to Elasticsearch.

Elasticsearch: A distributed log storage/search tool that natively supports clustering. It can generate time-based indexes for logs to speed up log query and access.

Kibana: A visual log web display tool that displays logs stored in Elasticsearch, and can also generate dazzling dashboards.

 

2. Installation and deployment (because this is a test environment, I installed Elasticsearch, Logstash, and Kibana on a single virtual machine)

Install the JDK

rpm -ivh jdk-8u92-linux-x64.rpm
vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_92/
export PATH=$JAVA_HOME/bin:$PATH

source /etc/profile

echo $JAVA_HOME   
/usr/java/jdk1.8.0_92/

java -version
java version "1.8.0_92"
Java(TM) SE Runtime Environment (build 1.8.0_92-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)

Install elasticsearch

rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
Add the yum repository file
echo "
[elasticsearch-2.x]
name=Elasticsearch repository for 2.x packages
baseurl=http://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1" >> /etc/yum.repos.d/elasticsearch.repo
yum install elasticsearch -y

mkdir -p /data/elk/{data,logs}

Type      Description                              Location
home      Elasticsearch installation directory     {extract.path}
bin       Elasticsearch binary script directory    {extract.path}/bin
conf      configuration file directory             {extract.path}/config
data      data directory                           {extract.path}/data
logs      log directory                            {extract.path}/logs
plugins   plugin directory                         {extract.path}/plugins

Configuration instructions:
vi /etc/elasticsearch/elasticsearch.yml
cluster.name: es
path.data: /data/elk/data
path.logs: /data/elk/logs
bootstrap.mlockall: true
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.2.215", "host2"]
start:
/etc/init.d/elasticsearch start

http://192.168.2.215:9200/
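
If the service started correctly, port 9200 answers HTTP requests. A quick check from the shell (the IP matches the test machine used throughout this article):

# Basic node and version info
curl http://192.168.2.215:9200/?pretty
# Cluster health; it should report the cluster name "es" configured above
curl http://192.168.2.215:9200/_cluster/health?pretty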

There are two configuration files in Elasticsearch's config folder: elasticsearch.yml and logging.yml.

The first is the basic ES configuration file; the second is the log configuration file. ES uses log4j for logging, so the settings in logging.yml follow ordinary log4j conventions. The following mainly explains the options configurable in elasticsearch.yml.
cluster.name: es
Sets the cluster name of ES; the default is elasticsearch. ES automatically discovers other ES nodes on the same network segment, so if there are multiple clusters on the same segment, this attribute can be used to distinguish them.
node.name: "FranzKafka"
node name. By default, a name in the name list is randomly assigned. The list is in the name.txt file in the config folder of the es jar package, and there are many interesting names added by the author.
node.master: true
Specifies whether the node is eligible to be elected as the master node; the default is true. By default the first machine in the cluster becomes the master; if that machine goes down, a new master is elected.
node.data:true
specifies whether the node stores index data, the default is true.
index.number_of_shards: 5
sets the default number of index shards, which is 5 by default.
index.number_of_replicas: 1
sets the default number of index replicas, the default is 1 replica.
path.conf:/path/to/conf
sets the storage path of the configuration file. The default is the config folder in the es root directory.
path.data:/path/to/data
Set the storage path of the index data. The default is the data folder in the es root directory. You can set multiple storage paths, separated by commas, for example:
path.data:/path/to/data1,/path/to/data2
path.work: /path/to/work
Sets the storage path of temporary files. The default is the work folder in the ES root directory.
path.logs:/path/to/logs
sets the storage path of log files, the default is the logs folder in the es root directory
path.plugins:/path/to/plugins
sets the storage path of the plug-ins, the default is in the es root directory The plugins folder
bootstrap.mlockall: true
Set to true to lock the memory. ES efficiency drops when the JVM starts swapping; to ensure it never swaps, set the two environment variables ES_MIN_MEM and ES_MAX_MEM to the same value and make sure the machine has enough memory allocated to ES. The elasticsearch process must also be allowed to lock memory; under Linux this can be done with `ulimit -l unlimited` (see the sketch after this list).
network.bind_host: 192.168.0.1
Sets the bound IP address; it can be IPv4 or IPv6, and the default is 0.0.0.0.
network.publish_host: 192.168.0.1
Sets the IP address this node publishes to other nodes for interaction. If not set, it is determined automatically; the value must be a real IP address.
network.host: 192.168.0.1
Sets both of the above parameters, bind_host and publish_host, at the same time.
transport.tcp.port: 9300
Sets the tcp port for interaction between nodes, the default is 9300.
transport.tcp.compress:true
Set whether to compress the data during tcp transmission, the default is false, no compression.
http.port:9200
Sets the http port for external services, the default is 9200.
http.max_content_length: 100mb
sets the maximum capacity of the content, the default is 100mb
http.enabled: false
whether to use the http protocol to provide services to the outside world, the default is true, enabled.
gateway.type: local
The type of gateway; the default is local, i.e. the local file system. It can also be set to a distributed file system, Hadoop's HDFS, or Amazon's S3. The configuration of the other gateway types is not covered here.
gateway.recover_after_nodes: 1
Sets data recovery when N nodes in the cluster are started, the default is 1.
gateway.recover_after_time: 5m
Sets the timeout time for initializing the data recovery process. The default is 5 minutes.
gateway.expected_nodes: 2
sets the number of nodes in this cluster, the default is 2, once the N nodes are started, data recovery will be performed immediately.
cluster.routing.allocation.node_initial_primaries_recoveries: 4
When initializing data recovery, the number of concurrent recovery threads, the default is 4.
cluster.routing.allocation.node_concurrent_recoveries: 2
The number of concurrent recovery threads when adding and deleting nodes or load balancing, the default is 4.
indices.recovery.max_size_per_sec:0
sets the limited bandwidth during data recovery, such as 100mb, the default is 0, that is, unlimited.
indices.recovery.concurrent_streams:5
Set this parameter to limit the maximum number of concurrent streams opened at the same time when recovering data from other shards. The default is 5.
discovery.zen.minimum_master_nodes: 1
Ensures that a node in the cluster can see at least N other master-eligible nodes; the default is 1. For large clusters, a larger value (2-4) can be set.
discovery.zen.ping.timeout: 3s
Sets the ping timeout used when automatically discovering other nodes in the cluster; the default is 3 seconds. Poor network environments can use higher values to prevent errors during automatic discovery.
discovery.zen.ping.multicast.enabled:false
Sets whether to enable multicast discovery nodes, the default is true.
discovery.zen.ping.unicast.hosts:["host1","host2:port","host3[portX-portY]"]
Sets the initial list of master nodes in the cluster, and these nodes can be used to automatically discover the newly added cluster node
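
Relating to bootstrap.mlockall above, here is a sketch of the memory-related OS and service settings for an RPM install. The file paths and variable names below assume the Elasticsearch 2.x RPM layout and should be checked against your installation:

# /etc/security/limits.conf — allow the elasticsearch user to lock memory
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited

# /etc/sysconfig/elasticsearch — fix the heap size and allow unlimited locked memory
ES_HEAP_SIZE=1g
MAX_LOCKED_MEMORY=unlimited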

Install the head plugin (cluster management plugin)

cd /usr/share/elasticsearch/bin/
./plugin install mobz/elasticsearch-head
ll /usr/share/elasticsearch/plugins/head
http://192.168.2.215:9200/_plugin/head/


Install kopf plugin (cluster resource view and query plugin)
/usr/share/elasticsearch/bin/plugin install lmenezes/elasticsearch-kopf
http://192.168.2.215:9200/_plugin/kopf

Start Elasticsearch
/etc/init.d/elasticsearch start

Installing kibana
Kibana is essentially the web client for Elasticsearch: an analysis and visualization platform that lets you search, view, and interact with the data stored in Elasticsearch indexes. It makes it easy to perform advanced data analysis and to visualize data in multiple formats such as charts, tables, and maps.

Discover page: interactive browsing of the data. Every document in every index that matches the index pattern can be accessed. You can submit search queries, filter the results, and view document data; you can also see statistics on the field values of the documents matching a query, and select the time range and refresh frequency.
https://download.elastic.co/kibana/kibana/kibana-4.5.1-linux-x64.tar.gz
tar zxvf kibana-4.5.1-linux-x64.tar.gz
mv kibana-4.5.1-linux-x64 /usr/local/
vi /etc/rc.local
/usr/local/kibana-4.5.1-linux-x64/bin/kibana > /var/log/kibana.log 2>&1 &
vi /usr/local/kibana-4.5.1-linux-x64/config/kibana.yml
server.port: 5601
server.host: "192.168.2.215"
elasticsearch.url: "http://192.168.2.215:9200"

Convert nginx logs to json

vim /usr/local/nginx/conf/nginx.conf
log_format access1 '{"@timestamp":"$time_iso8601",'
        '"host":"$server_addr",'
        '"clientip":"$remote_addr",'
        '"size":$body_bytes_sent,'
        '"responsetime":$request_time,'
        '"upstreamtime":"$upstream_response_time",'
        '"upstreamhost":"$upstream_addr",'
        '"http_host":"$host",'
        '"url":"$uri",'
        '"domain":"$host",'
        '"xff":"$http_x_forwarded_for",'
        '"referer":"$http_referer",'
        '"status":"$status"}';
    access_log  /var/log/nginx/access.log  access1;

reload nginx

/usr/local/nginx/sbin/nginx -s reload

Installing logstash
Processing in Logstash includes three stages:
input --> filter (processing, optional) --> output

rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
echo "
[logstash-2.1]
name=Logstash repository for 2.1.x packages
baseurl=http://packages.elastic.co/logstash/2.1/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1" >> /etc/yum.repos.d/logstash.repo
yum install logstash -y
Verify logstash by configuring an input and an output
vim /etc/logstash/conf.d/stdout.conf
input {
        stdin {}
}

output {
        stdout {
                codec => "rubydebug"
        }
}
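
To run this test configuration interactively, invoke Logstash against the file; the binary path below assumes the default location used by the Logstash 2.x RPM package:

/opt/logstash/bin/logstash -f /etc/logstash/conf.d/stdout.conf
# Type something on stdin; it should be echoed back as a structured rubydebug event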

vim /etc/logstash/conf.d/logstash.conf
input {
        stdin {}
}
output {
        elasticsearch {
                hosts => ["192.168.2.215:9200"]
                index => "test"
        }
}
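
After running this configuration and typing a few lines on stdin, you can confirm that the events reached Elasticsearch, either on the head plugin page shown below or with a quick check via the search API:

curl http://192.168.2.215:9200/test/_search?pretty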

http://192.168.2.215:9200/_plugin/head/

vim /etc/logstash/conf.d/logstash.conf
input {
        file {
          type => "messagelog"
          path => "/var/log/messages"
          start_position => "beginning"
        }
}
output {
        file {
          path => "/tmp/123.txt"
        }
        elasticsearch {
                hosts => ["192.168.2.215:9200"]
                index => "system-messages-%{+yyyy.MM.dd}"
        }
}
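
The nginx access log converted to JSON earlier can be collected in the same way. A sketch follows (the type and index names are arbitrary choices; the json codec parses each line of the JSON-formatted access log into fields):

input {
        file {
          type => "nginx-access"
          path => "/var/log/nginx/access.log"
          start_position => "beginning"
          codec => "json"
        }
}
output {
        elasticsearch {
                hosts => ["192.168.2.215:9200"]
                index => "nginx-access-%{+yyyy.MM.dd}"
        }
}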

Check configuration file syntax
/etc/init.d/logstash configtest

Since /var/log/messages is readable only by root, change the user the Logstash service runs as:
vim /etc/init.d/logstash
LS_USER=root
LS_GROUP=root
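
After editing the init script, restart the service so the new settings take effect, then confirm that the system-messages index appears (in the head plugin or via curl):

/etc/init.d/logstash restart
curl http://192.168.2.215:9200/_cat/indices?v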
