Filebeat installation and deployment

A brief overview

  We have recently been covering log collection in the ELK stack, and this article focuses on collecting logs with Filebeat. There are many log collection tools, such as Fluentd, Flume, Logstash, Beats, and so on. The first question is why use Filebeat at all: Logstash runs on the JVM and its resource consumption is comparatively high, as starting a single Logstash instance takes roughly 500 MB of memory, while Filebeat needs only on the order of 10 MB. In a common ELK log collection architecture, Filebeat on each node sends log content to a Kafka message queue, a Logstash cluster then reads the messages from the queue and filters the content according to its configuration files, and the filtered data is finally shipped to Elasticsearch and displayed by Kibana.
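  As a point of reference, a minimal sketch of what the Filebeat side of that kind of pipeline could look like is shown below. The broker addresses and topic name are placeholders for illustration only, not values taken from this article:

# Sketch: ship collected log lines to a Kafka topic instead of Elasticsearch
# (broker addresses and topic name are assumed placeholders)
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/*.log
output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092"]
  topic: "app-logs"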

Filebeat introduction

  Filebeat is composed of two main components: prospectors and harvesters. These components work together to read files and send event data to the output you specify.

What is a harvester?
  A harvester is responsible for reading the contents of a single file. It reads the file line by line and sends the content to the output. One harvester is started for each file, and the harvester is responsible for opening and closing that file, which means the file stays open while the harvester is running. If you delete or rename a file while it is being collected, Filebeat will still continue to read it, and the disk space occupied by the file will not be released until the harvester is closed. By default, Filebeat keeps the file open until it has been inactive for longer than the configured close_inactive value, at which point the harvester is closed.

Closing a harvester has the following effects:
  The file handler is closed; if the file was deleted or renamed while the harvester was still reading it, the disk space it occupied is released at this point.
  Collection of the file's contents will only start again after the interval configured by scan_frequency has elapsed.
  If the file is moved or deleted after the harvester is closed, its data will not be collected when the harvester would otherwise start again.
  When you need to control the shutdown of harvesters, use the close_* configuration options, as in the sketch below.
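
  For illustration, a prospector section that tunes these options might look like the following minimal sketch (the path and values are placeholders, not recommendations from this article):

filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/app/*.log    # placeholder path
  close_inactive: 5m        # close the harvester after 5 minutes without new data
  scan_frequency: 10s       # how often the prospector checks the paths for new or changed files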

What is a prospector?

  A prospector is responsible for managing harvesters and finding all the data sources that need to be read. If the configured input type is log, the prospector finds every file on disk that matches the configured paths and creates a harvester for each one. Each prospector runs in its own Go routine.

  Filebeat currently supports two prospector types: log and stdin. Each type can be defined multiple times in the configuration file. The log prospector checks each file to determine whether a harvester needs to be started, whether one is already running, or whether the file can be ignored (files can be ignored via the ignore_older option). A file that was created while Filebeat is running, or whose size has changed since its harvester was closed, will be picked up by the prospector again.
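
  As a sketch of how such ignoring could be configured (the path and value are placeholders for illustration only):

filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/app/*.log   # placeholder path
  ignore_older: 24h        # skip files that have not been modified in the last 24 hours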

How Filebeat works

  Filebeat keeps the state of every file it reads and frequently flushes this state from memory to the registry file on disk. The state records the position the harvester had reached in the file, which ensures that all log data is read and then sent to the output. If at some point an output such as Elasticsearch or Logstash becomes unavailable, Filebeat remembers the last position read in each file and resumes reading from there once the output becomes available again. While Filebeat is running, the state information of each prospector is also kept in memory. If Filebeat is restarted, it restores the state from the registry file written before the restart, so it continues reading each file from the last known position.
  A prospector maintains state information for every file it finds. Because files can be renamed or moved, the file name and path are not enough to identify a file; Filebeat therefore stores a unique identifier for every file so it can determine whether the file has been collected before.
  If your usage scenario generates a lot of new files every day, you will find that the Filebeat registry file grows very large. In that case, refer to the section called "Registry file is too large?" in the documentation to solve this problem.
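
  As a point of reference, with the default paths of the RPM package (an assumption; your layout may differ) you can inspect the registry directly with cat, and relocate it via the registry_file option in filebeat.yml:

cat /var/lib/filebeat/registry   # assumed default data path for the RPM install

# in filebeat.yml (optional): keep the registry at an explicit location
filebeat.registry_file: /var/lib/filebeat/registry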

Installing the Filebeat service

Download and install the public signing key

rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch

Create the yum repository file

[root@localhost ~]# vim /etc/yum.repos.d/elk-elasticsearch.repo
[elastic-5.x]
name=Elastic repository for 5.x packages
baseurl=https://artifacts.elastic.co/packages/5.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Start the installation

yum install filebeat

Start Service

systemctl start filebeat
systemctl status filebeat

Collecting logs

Here we take collecting Docker logs as an example to briefly explain how to write the Filebeat configuration file. The details are as follows:

[root@localhost ~]# grep "^\s*[^# \t].*$" /etc/filebeat/filebeat.yml 
filebeat.prospectors:
- input_type: log
  paths:
    - /var/lib/docker/containers/*/*.log
output.elasticsearch:
  hosts: ["192.168.58.128:9200"]

As you can see, there is not much to it. We collect /var/lib/docker/containers/*/*.log, that is, the logs of all containers on the node where Filebeat runs. The output points at the address of our Elasticsearch service; here we deliver the logs directly to ES rather than relaying them through Logstash.
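
If you did want to relay the logs through Logstash instead, only the output block would change, roughly along these lines (a sketch; port 5044 is the conventional beats input port and is an assumption here):

output.logstash:
  hosts: ["192.168.58.128:5044"]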

Before restarting, we also need to submit the Filebeat index template to ES, so that Elasticsearch knows which properties and fields the log data output by Filebeat contains. The file filebeat.template.json is already present after installation, so you do not need to write it yourself; if you cannot find it, locate it with find. Load the template into Elasticsearch:

[root@localhost ~]# curl -XPUT 'http://192.168.58.128:9200/_template/filebeat?pretty' -d@/etc/filebeat/filebeat.template.json
{
  "acknowledged" : true
}
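
To double-check that the template was stored, you can read it back from the same address:

curl 'http://192.168.58.128:9200/_template/filebeat?pretty'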

Restart Service

systemctl restart filebeat

Tip: If you run Filebeat itself as a container, you need to mount the /var/lib/docker/containers directory into that container.
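
A rough sketch of how such a containerized Filebeat could be started is shown below; the image name, tag, and in-container config path are assumptions for illustration, not part of this article's setup:

docker run -d --name filebeat \
  -v /var/lib/docker/containers:/var/lib/docker/containers:ro \
  -v /etc/filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro \
  docker.elastic.co/beats/filebeat:5.6.16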

Kibana configuration

If the above configuration is correct, you can now access Kibana, but you still need to add a new index pattern there. Following the index naming used for the logs shipped by Filebeat, the index name or pattern should be filled in as "filebeat-*".

 
