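Preface: Elastic Stack

Elastic Stack is the official name for what used to be called ELK (https://www.elastic.co/cn/products). In Elastic's own words: "Built on an open source foundation, the Elastic Stack lets you reliably and securely take data from any source, in any format, and search, analyze, and visualize it in real time."

It consists of three main components: Beats + Logstash, Elasticsearch, and Kibana.

Beats + Logstash: collect data in any format from any source.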

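    Beats: Beats is a platform of lightweight shippers that send data from edge machines to Logstash and Elasticsearch.

    Beats are handy tools for data collection. Install them on your servers and they forward data to Elasticsearch. If you need more processing power, Beats can also ship data to Logstash for transformation and parsing. Elastic provides several ready-made Beats for different kinds of data:

        Filebeat: log files

        Metricbeat: metrics

        Packetbeat: network data

        Winlogbeat: Windows event logs

        Auditbeat: audit data

        Heartbeat: uptime monitoring

    Beats are customizable: every open source shipper is built on libbeat, a common library for forwarding data. Need to monitor a proprietary protocol? You can build your own shipper; libbeat provides the building blocks.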

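    Logstash: Logstash is a dynamic data collection pipeline with an extensible plugin ecosystem and strong synergy with Elasticsearch.

    Logstash is an open source, server-side data processing pipeline that ingests data from many sources simultaneously, transforms it, and then sends it to your favorite "stash". (Ours is Elasticsearch, naturally.)

    Inputs: data is often scattered or siloed across many systems in many formats. Logstash supports a variety of inputs that pull in events from many common sources at the same time. It can ingest data from your logs, metrics, web applications, data stores, and various AWS services, all in a continuous, streaming fashion.

    Outputs: Elasticsearch is the preferred output and opens up many possibilities for search and analytics, but it is not the only option. Logstash offers a variety of outputs that let you route data wherever you want, which gives you the flexibility to serve many downstream use cases.

    Filters: as data travels from source to store, Logstash filters parse each event, identify named fields to build structure, and transform them into a common format for easier, faster analysis and business value.

    Extensible: Logstash has a pluggable framework with more than 200 plugins. You can mix, match, and orchestrate different inputs, filters, and outputs so that they work together in one pipeline. Ingesting from a custom application? Don't see a plugin you need? Logstash plugins are easy to build: there is a plugin development API and a plugin generator to help you get started and share what you create.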

Elasticsearch

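    Elasticsearch is a distributed, RESTful search and analytics engine capable of addressing a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so that you can discover both the expected and the unexpected.

    Scalability: prototype and production environments are interchangeable. Whether Elasticsearch runs on a single node or on a 300-node cluster, you talk to it in the same way.

    Speed: because every piece of data is indexed, you no longer have to worry about data that was left unindexed. You can use and access all of your data at very high speed.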

Kibana

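    Kibana presents your data as charts and provides an extensible user interface for configuring and managing every aspect of the Elastic Stack.

    Visualize and explore: Kibana gives you the freedom to choose how to present your data. You may not know what you are looking for at first. With Kibana's interactive visualizations you can start from a single question and see where it leads.

    Built-in chart types: the Kibana core ships with the classics: histograms, line graphs, pie charts, sunburst charts, and more. They take full advantage of the aggregation capabilities of Elasticsearch.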

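Deployment preparation

Operating system: CentOS 6.9, 64-bit (CentOS 7 and other Linux distributions also work)
Elasticsearch version: 6.2.2, download: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.2.tar.gz

Kibana version: 6.2.2, download: https://artifacts.elastic.co/downloads/kibana/kibana-6.2.2-linux-x86_64.tar.gz

Filebeat version: 6.2.2, download: https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.2.2-linux-x86_64.tar.gz

Logstash version: 6.2.2, download: https://artifacts.elastic.co/downloads/logstash/logstash-6.2.2.tar.gz
This guide assumes JDK 1.8 is already installed and configured; if it is not, install it first.
Note: unless stated otherwise, all operations below are performed as root. (When I first started working with big data I used a regular user with sudo, ran into many permission problems, and eventually switched to root for everything.)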
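
The JDK 1.8 prerequisite can be checked before going any further. The following is a minimal sketch, not part of the original setup: the function names is_java8 and java_banner are made up for illustration, and the check only looks for the "1.8." prefix that Java 8 runtimes print in their version banner.

```shell
#!/bin/sh
# Return 0 if the given `java -version` banner line reports a 1.8 (Java 8) runtime.
# (is_java8 is a hypothetical helper name.)
is_java8() {
    case "$1" in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the first line of `java -version` (which java writes to stderr),
# or nothing when no java binary is on the PATH.
java_banner() {
    if command -v java >/dev/null 2>&1; then
        java -version 2>&1 | head -n 1
    fi
}

banner=$(java_banner)
if is_java8 "$banner"; then
    echo "OK: found $banner"
else
    echo "WARNING: JDK 1.8 not detected (got: ${banner:-no java on PATH})"
fi
```

If the warning appears, install JDK 1.8 and make sure its bin directory comes first on the PATH of the user that will run Elasticsearch and Logstash.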

Part 1: Install Elasticsearch

Step 1: Download Elasticsearch 6.2.2

[root@netcloud01 ~]# cd /data
[root@netcloud01 data]# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.2.2.tar.gz

Step 2: Extract Elasticsearch

[root@netcloud01 data]# tar -zxvf elasticsearch-6.2.2.tar.gz

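Step 3: Create the Elasticsearch user and group

By default Elasticsearch refuses to run as root, in both Elasticsearch 5 and Elasticsearch 6; it reports an error at startup if you try.

1. Create the es group with group ID 503. (In my own environment all big-data users and applications share a bigdata group; this example uses a dedicated es group so that it matches the useradd command and the id output below.)

[root@netcloud01 data]# groupadd -g 503 es

2. Create the es user and assign it to the es group

[root@netcloud01 data]# useradd -g es es

Check that the user was created: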

[root@netcloud01 data]# id es
uid=500(es) gid=503(es) groups=503(es)

Step 4: Change the owner and group of the extracted Elasticsearch directory

[root@netcloud01 data]# chown -R es:es elasticsearch-6.2.2

Now look at the Elasticsearch bin directory.
It contains a number of .bat files, which are scripts for Windows.
They are not needed on Linux, so delete them.
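
The cleanup can be scripted. This is a small sketch under stated assumptions: ES_HOME defaults to the /data/elasticsearch-6.2.2 path used in this guide, and remove_bat_files is a hypothetical helper name. It only touches files ending in .bat directly inside the given directory.

```shell
#!/bin/sh
# ES_HOME is an assumption taken from the paths used in this guide.
ES_HOME=${ES_HOME:-/data/elasticsearch-6.2.2}

# Delete Windows batch scripts from a directory; all other files are left untouched.
# (remove_bat_files is a hypothetical helper name.)
remove_bat_files() {
    find "$1" -maxdepth 1 -type f -name '*.bat' -exec rm -f {} +
}

# Only act when the directory actually exists.
if [ -d "$ES_HOME/bin" ]; then
    remove_bat_files "$ES_HOME/bin"
fi
```

Removing the files is purely cosmetic; Elasticsearch on Linux never reads them.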

Step 5: Make the scripts executable

[root@netcloud01 bin]# chmod u+x *

Step 6: Modify the configuration files

1. Edit limits.conf (raises the maximum number of open files and processes per user)

vim /etc/security/limits.conf

# End of file
# Add the following lines

* soft nofile 1026420
* hard nofile 1026420
* soft nproc unlimited
* hard nproc unlimited

2. Edit 90-nproc.conf

vim /etc/security/limits.d/90-nproc.conf
# Change the value to 4096 or higher
*          soft    nproc     4096
root       soft    nproc     unlimited

3. Edit sysctl.conf

vim /etc/sysctl.conf

# Append the following setting at the end of the file
vm.max_map_count=655360

# Then apply it
sysctl -p
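
Whether the new limits are in effect can be verified before starting Elasticsearch. The sketch below compares the current values with the minimums enforced by the Elasticsearch bootstrap checks (65536 open files, 4096 threads, vm.max_map_count of 262144); check_min is a hypothetical helper name, and the values configured above are well beyond these minimums.

```shell
#!/bin/sh
# Compare one limit against a required minimum; "unlimited" always passes.
# (check_min is a hypothetical helper name.)
check_min() {
    name=$1; actual=$2; required=$3
    if [ "$actual" = "unlimited" ] || [ "$actual" -ge "$required" ] 2>/dev/null; then
        echo "OK   $name = $actual (need >= $required)"
        return 0
    fi
    echo "LOW  $name = $actual (need >= $required)"
    return 1
}

# Run this as the user that will start Elasticsearch (es in this guide),
# in a fresh login session so that the new limits have been applied.
check_min "max open files (ulimit -n)" "$(ulimit -n)" 65536 || true
check_min "max user processes (ulimit -u)" "$(ulimit -u 2>/dev/null)" 4096 || true
if [ -r /proc/sys/vm/max_map_count ]; then
    check_min "vm.max_map_count" "$(cat /proc/sys/vm/max_map_count)" 262144 || true
fi
```

Any line that starts with LOW points at a setting that will make Elasticsearch fail its bootstrap checks once it is bound to a non-loopback address.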

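4. Edit elasticsearch.yml

# Cluster name
cluster.name: BSDSJ

# Name of this node (the hostname is a common choice)
node.name: node-1

# CentOS 6 does not support seccomp, so the system call filter must be disabled
bootstrap.memory_lock: false
bootstrap.system_call_filter: false

# By default only the local machine can connect; binding to the server's IP allows remote access
network.host: 172.16.240.103

http.port: 9200
transport.tcp.port: 9500

bootstrap.system_call_filter defaults to true, which makes Elasticsearch try to install seccomp filters at startup. CentOS 6 does not support seccomp, so the bootstrap check fails unless the setting is disabled. bootstrap.memory_lock already defaults to false and is set explicitly here for clarity.
Apply all of the settings above up front; otherwise you will end up fixing them one at a time as each startup error appears.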

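Step 7: Start Elasticsearch

Switch from root to the es user,
then run the following from the Elasticsearch root directory:

bin/elasticsearch

When the startup output reports that the node has started, Elasticsearch is installed and running.
It is still worth testing, though,
for example with curl on Linux.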

You can also write a startup script that runs Elasticsearch in the background. Run it as the es user, since Elasticsearch will not start as root:

[es@netcloud01 bin]$ vi startup.sh

#!/bin/bash
ES_HOME=/data/elasticsearch-6.2.2
nohup $ES_HOME/bin/elasticsearch  -v  >> $ES_HOME/logs/elasticsearch.log 2>&1 &

[es@netcloud01 bin]$ chmod u+x startup.sh
[es@netcloud01 bin]$ sh startup.sh

Test:

[root@netcloud01 elasticsearch-6.2.2]# curl 172.16.240.103:9200
{
  "name" : "node-1",
  "cluster_name" : "BSDSJ",
  "cluster_uuid" : "DX0_WhyITd23l2dc_nmlUw",
  "version" : {
    "number" : "6.2.2",
    "build_hash" : "10b1edd",
    "build_date" : "2018-02-16T19:01:30.685723Z",
    "build_snapshot" : false,
    "lucene_version" : "7.2.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

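curl ships with CentOS; if it is missing on your system, install it first.
The request above uses the IP address and port.
If the hostname is mapped to that IP address (as it is on my machine), hostname plus port works as well.
Next, try it from a browser
by opening http://172.16.240.103:9200/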
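
Elasticsearch takes a while to come up, so a single curl right after starting it may fail even though nothing is wrong. The sketch below polls the HTTP port until it answers or a timeout expires; wait_for_url is a hypothetical helper name, and the URL in the usage comment is the address used in this guide.

```shell
#!/bin/sh
# Poll a URL until it answers with an HTTP success status or the timeout
# (in seconds, default 60) expires. Returns 0 on success, 1 on timeout.
# (wait_for_url is a hypothetical helper name.)
wait_for_url() {
    url=$1; remaining=${2:-60}
    while [ "$remaining" -gt 0 ]; do
        # -s: quiet, -f: treat HTTP 4xx/5xx as failure, -m 2: per-request timeout
        if curl -s -f -m 2 -o /dev/null "$url"; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}

# Usage (not executed here):
#   wait_for_url http://172.16.240.103:9200/ 60 && echo "Elasticsearch is up"
```

Note that once security is enabled with X-Pack, unauthenticated requests are rejected and this check reports a failure unless credentials are added to the curl call with -u.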

Part 2: Install Kibana

1. Download the Kibana package, extract it, and enter the directory

[root@netcloud01 data]# wget https://artifacts.elastic.co/downloads/kibana/kibana-6.2.2-linux-x86_64.tar.gz
[root@netcloud01 data]# tar -xzf kibana-6.2.2-linux-x86_64.tar.gz
[root@netcloud01 data]# cd kibana-6.2.2-linux-x86_64/

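2. Edit the kibana.yml configuration file

Add the following settings:
server.port: 5601                                # port
server.host: "172.16.240.103"                    # IP address to bind to
server.name: "knode01"                           # server name
elasticsearch.url: "http://172.16.240.103:9200"  # Elasticsearch URL
elasticsearch.username: "kibana"                 # user that Kibana uses to connect to Elasticsearch
elasticsearch.password: "kibana"                 # password for that user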

3. Start and verify

./bin/kibana

Note: without the X-Pack plugin Kibana offers only basic functionality; features such as Monitoring and Graph are unavailable.

4. Install X-Pack

Online installation:
[root@netcloud01 kibana-6.2.2-linux-x86_64]#  ./bin/kibana-plugin install x-pack
Offline installation (download the archive to /data so that it matches the file:// path in the install command):
[root@netcloud01 kibana-6.2.2-linux-x86_64]#  wget -P /data https://artifacts.elastic.co/downloads/packs/x-pack/x-pack-6.2.2.zip
[root@netcloud01 kibana-6.2.2-linux-x86_64]#  ./bin/kibana-plugin install file:///data/x-pack-6.2.2.zip

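5. Restart and verify (Kibana now reports an error)

[root@netcloud01 kibana-6.2.2-linux-x86_64]#  ./bin/kibana

Solution:

Before Elasticsearch 6.3, authentication requires installing the X-Pack plugin in Elasticsearch as well. From 6.3 onward X-Pack is bundled with the default distribution, so no separate installation is needed.

1) Install X-Pack from the Elasticsearch bin directory

[root@netcloud01 bin]# ./elasticsearch-plugin install file:///data/x-pack-6.2.2.zip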

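2) Set the passwords:

This sets login passwords for the built-in Elasticsearch, Logstash, and Kibana users (elastic, logstash_system, and kibana respectively). Restart Elasticsearch after installing the plugin; it must be running when the command is executed.

[es@netcloud01 bin]$  cd /data/elasticsearch-6.2.2/bin/x-pack
[es@netcloud01 x-pack]$ ./setup-passwords interactive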

6. Restart Kibana; the service now runs normally

[root@netcloud01 kibana-6.2.2-linux-x86_64]#  ./bin/kibana >> /data/kibana-6.2.2-linux-x86_64/logs/kibana.log 2>&1 &

7. Verify: http://172.16.240.103:5601

Log in with the username and password: username: kibana, password: kibana

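Part 3: Install Filebeat

1. Why use Filebeat?

When people talk about ELK and log shipping, Logstash is the first thing that comes to mind. Its main strength is flexibility, which comes largely from its many plugins. A rich set of input plugins collects data of all shapes, sizes, and sources. A rich set of filter plugins parses and transforms data on the fly, regardless of format or complexity. Logstash also offers many outputs, so you can route data wherever you want and serve many downstream use cases.

The critical weakness of Logstash is its performance and resource consumption (the default heap size is 1 GB). Although its performance has improved considerably in recent years, it is still much slower than its alternatives. On servers with limited resources it is not advisable to install Logstash on every machine. What is needed instead is a lightweight log shipper that moves data from each server, through one or more central Logstash servers, to Elasticsearch.

About Filebeat

Filebeat is a lightweight log shipper that makes up for exactly this weakness of Logstash: it pushes logs to Logstash or to a buffering layer.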

2. Install Filebeat

[root@netcloud01 ~]# cd /data

[root@netcloud01 data]# wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.2.2-linux-x86_64.tar.gz

[root@netcloud01 data]# tar -zxvf filebeat-6.2.2-linux-x86_64.tar.gz
[root@netcloud01 data]# cd filebeat-6.2.2-linux-x86_64
[root@netcloud01 filebeat-6.2.2-linux-x86_64]# ll
-rwxr-xr-x 1 root root 11354200 Sep 15  2016 filebeat
-rw-r--r-- 1 root root      814 Sep 15  2016 filebeat.template.json
-rw-r--r-- 1 root root    17335 Aug 13  2018 filebeat.yml
drwxr-xr-x 2 root root     4096 Dec 18 21:43 logs
-rw-r--r-- 1 root root      153 Sep  1  2017 startup.sh

Filebeat is simple to configure: you only need to specify the full path of the log files to monitor and the address of the output. The default configuration file, filebeat.yml, outputs to Elasticsearch. Output to Kafka is also possible; here we output to Logstash.

3. Configuration

Input:

#=========================== Filebeat prospectors =============================

filebeat.prospectors:

# Each - is a prospector. Most options can be set at the prospector level, so
# you can use different prospectors for various configurations.
# Below are the prospector specific configurations.

- type: log

  # Change to true to enable this prospector configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  # To collect several log files, add one list entry per path.
  paths:
    #- /var/log/*.log
    - /data/bigdata_web.log

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

Output: comment out the default Elasticsearch output and enable the Logstash output

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["172.16.240.103:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"
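
Before starting Filebeat it is worth validating the edited file. Filebeat 6.x has test config and test output subcommands for this; the wrapper below is only a sketch, and filebeat_preflight is a hypothetical helper name. The output test needs Logstash to be listening on the configured port already.

```shell
#!/bin/sh
# Validate a Filebeat configuration and its output connection before starting.
# Returns 2 if the binary is missing; otherwise the exit status of the checks.
# (filebeat_preflight is a hypothetical helper name.)
filebeat_preflight() {
    home=$1
    if [ ! -x "$home/filebeat" ]; then
        echo "filebeat binary not found in $home" >&2
        return 2
    fi
    # First check the YAML syntax and options, then try to reach the output.
    "$home/filebeat" test config -c "$home/filebeat.yml" &&
        "$home/filebeat" test output -c "$home/filebeat.yml"
}

# Usage (not executed here):
#   filebeat_preflight /data/filebeat-6.2.2-linux-x86_64
```

YAML indentation mistakes are the most common reason for Filebeat failing at startup, and the config test reports them without sending any data.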

4. Start

Write a startup script, startup.sh:

vi startup.sh

#!/bin/bash
FILEBEAT_HOME=/data/filebeat-6.2.2-linux-x86_64
nohup $FILEBEAT_HOME/filebeat -c $FILEBEAT_HOME/filebeat.yml > $FILEBEAT_HOME/logs/filebeat.log 2>&1 &

chmod u+x startup.sh

sh startup.sh

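To integrate with Kafka instead, configure Filebeat as follows.

In this variant Filebeat does not write to Elasticsearch directly but to Kafka, which requires its own configuration file, beat-kafka.yml, with an input section and an output section.

Output configuration

Because the output goes to Kafka, you need to specify the Kafka topic.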
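
A beat-kafka.yml along these lines might look as follows. This is a hypothetical sketch rather than the author's file: the broker address localhost:9092 and the topic test are assumptions chosen to match the Kafka console commands in this guide, and the log path is the one from the prospector configuration above.

```shell
#!/bin/sh
# Write a hypothetical beat-kafka.yml; adjust the path, hosts and topic to your setup.
CONF=${CONF:-./beat-kafka.yml}

cat > "$CONF" <<'EOF'
filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /data/bigdata_web.log

output.kafka:
  # Kafka broker list (assumption: a single local broker on the default port)
  hosts: ["localhost:9092"]
  # Topic that receives the events (assumption: the "test" topic)
  topic: "test"
  # Wait for the partition leader to acknowledge each batch
  required_acks: 1
  compression: gzip
EOF

echo "wrote $CONF"
```

Only one output may be enabled per Filebeat instance, which is why the Kafka variant lives in its own file instead of being added to filebeat.yml.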

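Installing Kafka

Filebeat is already configured to output to Kafka, so Kafka itself needs no further configuration here: extract it and use it as is. Below are the essential commands for getting started with Kafka; see the official Kafka documentation for details.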

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning

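Now start Filebeat:

./filebeat -e -c beat-kafka.yml -d "publish"

Note that each output can have its own .yml file; beat-kafka.yml is the configuration for the Kafka output.
If the command above runs without errors and its publish debug output shows the fields of the events read from the monitored log file, everything is working.
At this point the Kafka consumer prints the same events;
the message field holds the actual content of each log line.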

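Part 4: Install Logstash

1. Installation

[root@netcloud01 ~]# cd /data
[root@netcloud01 data]# wget https://artifacts.elastic.co/downloads/logstash/logstash-6.2.2.tar.gz
[root@netcloud01 data]# tar -zxvf logstash-6.2.2.tar.gz
[root@netcloud01 data]# cd logstash-6.2.2

2. Configuration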

[root@netcloud01 logstash-6.2.2]# vi bigdata-web-log-pipeline.conf

input {
    beats {
        port => "5044"
    }
}
filter {
    grok {
        remove_field => [ "@version", "@timestamp", "beat", "offset", "type", "source", "input_type", "count", "fields", "host", "tags" ]
        patterns_dir => ["./patterns"]
        match => {
            "message" => "(?<request_time>.*) INFO \[IP\[%{IP:client_ip}\], Account\[username:(?<username>.*), code:(?<code>.*), name:(?<name>.*)\], URL\[(?<request_url>.*)\], Request\[method:(?<request_method>.*), headers:(?<request_header>.*), params:(?<request_param>.*)\], Response\[headers:(?<response_header>.*), code:(?<response_code>.*)\], Cost\[(?<cost_millis>.*)ms\]\].*"
        }
    }
}
output {
    elasticsearch {
        hosts => [ "172.16.240.103:9200" ]
        index => "bigdata_web_log"
    }
}
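
The grok pattern above only matches log lines with a very specific layout. The sketch below shows a hypothetical line in that layout and pulls a few fields out of it with sed, which is a quick way to check application logs before wiring them into Logstash. The sample line and the extract_fields name are made up for illustration, and the sed expression is a simplification of the grok pattern: it accepts only digits and dots for the IP address and captures six of the fields.

```shell
#!/bin/sh
# Hypothetical log line in the layout the grok pattern expects.
line='2018-03-01 10:15:30,123 INFO [IP[172.16.240.50], Account[username:alice, code:1001, name:Alice], URL[/api/report], Request[method:GET, headers:{}, params:{}], Response[headers:{}, code:200], Cost[35ms]]'

# Extract client_ip, username, request_url, request_method, response_code and
# cost_millis, joined with "|". Prints nothing when the line does not match.
# (extract_fields is a hypothetical helper name.)
extract_fields() {
    printf '%s\n' "$1" | sed -E -n 's/^.* INFO \[IP\[([0-9.]+)\], Account\[username:(.*), code:.*, name:.*\], URL\[(.*)\], Request\[method:(.*), headers:.*, params:.*\], Response\[headers:.*, code:(.*)\], Cost\[(.*)ms\]\].*$/\1|\2|\3|\4|\5|\6/p'
}

extract_fields "$line"
# prints: 172.16.240.50|alice|/api/report|GET|200|35
```

Lines that do not follow the layout produce no output here; in Logstash the corresponding events are tagged with _grokparsefailure instead of being parsed into fields.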

3. Write a startup script

[root@netcloud01 bin]# vi startup.sh

#!/bin/bash
LOGSTASH_HOME=/data/logstash-6.2.2
nohup $LOGSTASH_HOME/bin/logstash -f $LOGSTASH_HOME/bigdata-web-log-pipeline.conf -v >> $LOGSTASH_HOME/logs/logstash.log 2>&1 &

Make it executable:

chmod u+x startup.sh

4. Start

 [root@netcloud01 bin]# sh startup.sh

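The configuration below sends the log files to Kafka, then to Logstash, and finally to Elasticsearch.

Logstash does not need much configuration either. Setting up Filebeat and Kafka already gets the data flowing; what remains is to have the data in Kafka flow on to Logstash. Logstash also needs an output, and Elasticsearch serves as that output.

For different kinds of data in Kafka, it is advisable to have Logstash consume them through separate pipelines.
The setup uses two such pipelines.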
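
A configuration with two Kafka inputs could be sketched as follows. This is an illustration under stated assumptions, not the author's file: the topic names web-log and app-log, the consumer group IDs, the tags, and the index names are all hypothetical, and the broker is assumed to be local.

```shell
#!/bin/sh
# Write a hypothetical first-pipeline.conf; topics, group IDs, tags and
# index names are assumptions and must be adapted to the real setup.
PIPELINE=${PIPELINE:-./first-pipeline.conf}

cat > "$PIPELINE" <<'EOF'
input {
    # One kafka input per kind of data, each tagged so the output can tell them apart
    kafka {
        bootstrap_servers => "localhost:9092"
        topics => ["web-log"]
        group_id => "logstash-web"
        codec => "json"          # Filebeat publishes events to Kafka as JSON
        tags => ["web"]
    }
    kafka {
        bootstrap_servers => "localhost:9092"
        topics => ["app-log"]
        group_id => "logstash-app"
        codec => "json"
        tags => ["app"]
    }
}
output {
    # With X-Pack security enabled, add user and password to each elasticsearch block
    if "web" in [tags] {
        elasticsearch {
            hosts => [ "172.16.240.103:9200" ]
            index => "web_log"
        }
    } else {
        elasticsearch {
            hosts => [ "172.16.240.103:9200" ]
            index => "app_log"
        }
    }
}
EOF

echo "wrote $PIPELINE"
```

The syntax of the generated file can be checked without starting the pipeline by running bin/logstash -f first-pipeline.conf --config.test_and_exit.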

Start Logstash

bin/logstash -f first-pipeline.conf --config.reload.automatic --path.data=/home/hadoop/logstash

--path.data=/home/hadoop/logstash specifies the directory where Logstash stores its data.

小飞猪666


ELK 6.2.2 (Logstash + Elasticsearch + Kibana + Filebeat): Principles and Detailed Setup