K8S+DevOps Architect Practical Course | Practice using EFK to realize business log collection

Video source: Bilibili (Station B), "Docker & k8s Tutorial Ceiling — the best one on Bilibili; all the core knowledge of Docker and k8s is here"

These notes organize the teacher's course content together with my own test notes taken while studying, and are shared with everyone. Any infringement will be removed on request. Thank you for your support!

Attach a summary post: K8S+DevOps Architect Practical Course | Summary


EFK Architecture Workflow

  • Elasticsearch is an open-source, distributed, RESTful search and analytics engine whose underlying layer is the open-source library Apache Lucene. It can be accurately described as: a distributed, real-time document store in which every field can be indexed and searched; a distributed, real-time analytics and search engine; able to scale out to hundreds of nodes and handle PB-level structured or unstructured data.
  • Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch. Kibana can be used to search, view, and interact with data stored in Elasticsearch indices. It also makes advanced data analysis easy and can visualize data in a variety of charts, tables, and maps.
  • Fluentd is a log collection, processing, and forwarding system. Through its rich plugin ecosystem it can collect logs from various systems or applications, convert them into a user-specified format, and forward them to a user-specified log storage backend. Fluentd grabs log data from a given set of sources, processes it (converting it into a structured data format), and forwards it to other services such as Elasticsearch, object storage, or Kafka. Fluentd supports more than 300 log storage and analysis services, so it is very flexible in this regard. The main steps are: Fluentd reads data from multiple log sources and tags it, then routes the data to one or more target services according to matching tags.

Fluentd in Depth

Fluentd Architecture

Why is fluentd recommended as the log collection tool for a k8s cluster?

  • Pluggable architecture design

  • Very small resource footprint: implemented in C and Ruby, roughly 30-40 MB of memory, about 13,000 events/second/core

  • Strong reliability: in-memory and local-file-based buffering, plus robust failover

Fluentd event lifecycle and directive configuration

Input -> filter 1 -> ... -> filter N -> Buffer -> Output

start command

$ fluentd -c fluent.conf
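
Before starting the service, the configuration can also be sanity-checked without launching the full pipeline; fluentd provides a --dry-run flag for this (shown here as a quick, optional check):

$ fluentd -c fluent.conf --dry-run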

Directive introduction:

  • source, the data source, corresponding to Input. Use the source directive to select and configure the required input plugin and enable a Fluentd input source; the source submits events to fluentd's routing engine. @type distinguishes the different types of data sources. The following configuration tails the specified file and picks up newly appended lines:
<source>
  @type tail
  path /var/log/httpd-access.log
  pos_file /var/log/td-agent/httpd-access.log.pos
  tag myapp.access
  format apache2
</source>
  • filter, the event processing pipeline. Filters can be chained in series to form a pipeline that processes events step by step before handing them over to match for output. Event content can be processed like this:
<source>
  @type http
  port 9880
</source>
<filter myapp.access>
  @type record_transformer
  <record>
    host_param "#{Socket.gethostname}"
  </record>
</filter>
After receiving the data, the filter calls the built-in record_transformer plugin, inserts a new field host_param into the event record, and then hands the event over to match for output.
  • label, the label directive. A source can specify @label; events emitted by that source are routed only to the tasks inside the corresponding <label> block and are not picked up by the subsequent filter and match sections outside it:
<source>
  @type forward
</source>
<source>
  ### this source specifies the label @SYSTEM,
  ### so its events are sent to <label @SYSTEM>
  ### and not to the filter and match sections below
  @type tail
  @label @SYSTEM
  path /var/log/httpd-access.log
  pos_file /var/log/td-agent/httpd-access.log.pos
  tag myapp.access
  format apache2
</source>
<filter access.*>
  @type record_transformer
  <record>
    # ...
  </record>
</filter>
<match **>
  @type elasticsearch
  # ...
</match>
<label @SYSTEM>
  ### receives the events of the @type tail source above
  <filter var.log.middleware.*>
    @type grep
    # ...
  </filter>
  <match **>
    @type s3
    # ...
  </match>
</label>
  • match, the output. match looks for events whose tags match its pattern and processes them. The most common use of the match directive is to output events to other systems (for this reason, the plugins corresponding to match are called output plugins):
<source>
  @type http
  port 9880
</source>
<filter myapp.access>
  @type record_transformer
  <record>
    host_param "#{Socket.gethostname}"
  </record>
</filter>
<match myapp.access>
  @type file
  path /var/log/fluent/access
</match>

The structure of the event:

time: the processing time of the event

tag: the source of the event, configured in fluentd.conf

record: real log content, json object

For example, the following original log:

192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777

After being processed by the fluentd engine, it may look like this:

2020-07-16 08:40:35 +0000 apache.access: {"user":"-","method":"GET","code":200,"size":777,"host":"192.168.0.1","path":"/"}

fluentd's buffer event buffer model

Input -> filter 1 -> ... -> filter N -> Buffer -> Output

Because each individual event is usually very small, and for the sake of transmission efficiency and stability, events are generally not written to the output immediately after being processed. Instead, fluentd uses a buffering model built around two main concepts:

  • buffer_chunk: The event buffer block, which is used to store the events that have been processed locally and are to be sent to the destination. The size of each block can be set.
  • buffer_queue: the queue to store chunks, the length can be set

The parameters that can be set mainly include:

  • buffer_type, buffer type, can be set to file or memory
  • buffer_chunk_limit, the size of each chunk block, the default is 8MB
  • buffer_queue_limit, the maximum length of the chunk block queue, the default is 256
  • flush_interval, the time interval for flushing a chunk
  • retry_limit, the number of retries when sending a chunk fails; the default is 17, after which the chunk's data is discarded
  • retry_wait, the wait time before retrying to send chunk data; the default is 1s, and the interval doubles on each subsequent failure (2s, then 4s, and so on)

The general process is:

As fluentd events are continuously generated and written into the current chunk, the chunk keeps growing. When the chunk reaches buffer_chunk_limit in size, or more than flush_interval has passed since the chunk was created, it is pushed to the tail of the buffer queue. The queue length is bounded by buffer_queue_limit.

Every time a new chunk is enqueued, the chunk block at the front of the queue will be written to the configured storage backend immediately. For example, if Kafka is configured, the data will be pushed into Kafka immediately.

The ideal situation is that every time a new cache block enters the cache queue, it will be written to the backend immediately. At the same time, the new cache block will continue to be enqueued, but the enqueue speed will not be faster than the dequeue speed. In this way, the cache queue is basically empty, and there is at most one cache block in the queue.

In practice, however, network and other factors often cause delays or failures when writing chunks to the backend storage. When a chunk fails to be written, it stays in the queue and is retried after retry_wait; once the number of retries reaches retry_limit, the chunk is destroyed and its data discarded.

Meanwhile, new chunks keep entering the queue. If many chunks cannot be written to the backend in time and the queue length reaches buffer_queue_limit, new events are rejected and fluentd reports an error such as error_class=Fluent::Plugin::Buffer::BufferOverflowError error="buffer space has too many data".

Another situation is slow network transmission. If a new chunk is produced every 3 seconds but writing one to the backend takes 30 seconds, and the queue length is 100, then roughly 10 new chunks arrive for every chunk that is dequeued; the queue fills up quickly and the same exception occurs.
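
To make the parameters above concrete, here is a minimal sketch of how they would appear on an output section, assuming the classic (pre-v1) buffer parameter names described in this section and a hypothetical forward destination; paths and values are illustrative only:

<match myapp.**>
  @type forward
  # file- or memory-backed buffer
  buffer_type file
  # where file-based chunks are persisted
  buffer_path /var/log/fluent/myapp.*.buffer
  # maximum size of a single chunk
  buffer_chunk_limit 8m
  # maximum number of chunks waiting in the queue
  buffer_queue_limit 256
  # flush (enqueue) a chunk at least this often
  flush_interval 10s
  # retries before a failed chunk is discarded
  retry_limit 17
  # first retry interval; doubled after each failure
  retry_wait 1s
  <server>
    host 127.0.0.1
    port 24224
  </server>
</match>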

Practice 1: Collect business application logs and parse their fields

Objective: Collect the access.log logs of the nginx application in the container, and parse the log fields into JSON format. The original log format is:

$ tail -f access.log
...
53.49.146.149 1561620585.973 0.005 502 [27/Jun/2019:15:29:45 +0800] 178.73.215.171 33337 GET https

Collect and process into:

{
    "serverIp": "53.49.146.149",
    "timestamp": "1561620585.973",
    "respondTime": "0.005",
    "httpCode": "502",
    "eventTime": "27/Jun/2019:15:29:45 +0800",
    "clientIp": "178.73.215.171",
    "clientPort": "33337",
    "method": "GET",
    "protocol": "https"
}

Ideas:

  • Configure fluent.conf: use the tail input plugin to monitor the access.log file, and use a filter to parse the nginx log fields
  • Start the fluentd service
  • Manually append content to the access.log file
  • Observe whether the local output meets expectations

fluent.conf

<source>
      @type tail
      @label @nginx_access
      path /fluentd/access.log
      pos_file /fluentd/nginx_access.posg
      tag nginx_access
      format none
      @log_level trace
</source>
<label @nginx_access>
   <filter  nginx_access>
      @type parser
      key_name message
      format  /(?<serverIp>[^ ]*) (?<timestamp>[^ ]*) (?<respondTime>[^ ]*) (?<httpCode>[^ ]*) \[(?<eventTime>[^\]]*)\] (?<clientIp>[^ ]*) (?<clientPort>[^ ]*) (?<method>[^ ]*) (?<protocol>[^ ]*)/
   </filter>
   <match  nginx_access>
     @type stdout
   </match>
</label>

Start the service and append the file content:

$ docker run -u root --rm -ti 172.21.51.67:5000/fluentd_elasticsearch/fluentd:v2.5.2 sh
/ # cd /fluentd/
/ # touch access.log
/ # fluentd -c /fluentd/etc/fluent.conf
# fluentd now runs in the foreground; append the test line from a second shell in the same container (e.g. via docker exec)
/ # echo '53.49.146.149 1561620585.973 0.005 502 [27/Jun/2019:15:29:45 +0800] 178.73.215.171 33337 GET https' >>/fluentd/access.log
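
If the regular expression matches, fluentd's stdout should print something roughly like the line below. Here <read-time> stands for the event time, which is the moment the line was read (no time_key is configured), and exact formatting depends on the fluentd version:

<read-time> nginx_access: {"serverIp":"53.49.146.149","timestamp":"1561620585.973","respondTime":"0.005","httpCode":"502","eventTime":"27/Jun/2019:15:29:45 +0800","clientIp":"178.73.215.171","clientPort":"33337","method":"GET","protocol":"https"}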

Use this website for regex validation:  http://fluent.herokuapp.com

Practice 2: Use Ruby to transform and customize log fields

<source>
      @type tail
      @label @nginx_access
      path /fluentd/access.log
      pos_file /fluentd/nginx_access.posg
      tag nginx_access
      format none
      @log_level trace
</source>
<label @nginx_access>
   <filter  nginx_access>
      @type parser
      key_name message
      format  /(?<serverIp>[^ ]*) (?<timestamp>[^ ]*) (?<respondTime>[^ ]*) (?<httpCode>[^ ]*) \[(?<eventTime>[^\]]*)\] (?<clientIp>[^ ]*) (?<clientPort>[^ ]*) (?<method>[^ ]*) (?<protocol>[^ ]*)/
   </filter>
   <filter  nginx_access>   
      @type record_transformer
      enable_ruby
      <record>
       host_name "#{Socket.gethostname}"
       my_key  "my_val"
       tls ${record["protocol"].index("https") ? "true" : "false"}
      </record>
   </filter>
   <match  nginx_access>
     @type stdout
   </match>
</label>
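
With enable_ruby turned on, the record_transformer filter adds three fields to each parsed record. For the sample log line, the stdout output should contain a record roughly like the following; host_name will be the container's actual hostname (shown as a placeholder), and tls is "true" because record["protocol"].index("https") returns 0, which is truthy in Ruby:

<read-time> nginx_access: {"serverIp":"53.49.146.149","timestamp":"1561620585.973","respondTime":"0.005","httpCode":"502","eventTime":"27/Jun/2019:15:29:45 +0800","clientIp":"178.73.215.171","clientPort":"33337","method":"GET","protocol":"https","host_name":"<container-hostname>","my_key":"my_val","tls":"true"}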

Common scenarios for mounting configuration files with ConfigMap

Before we start, let's review the common mounting scenarios of configmap.

Scenario 1: Mount a single file to an empty directory

Suppose the business application has a configuration file named application-1.conf and we want to mount it into the /etc/application/ directory of the pod.

The content of application-1.conf is:

$ cat application-1.conf
name: "application"
platform: "linux"
purpose: "demo"
company: "luffy"
version: "v2.1.0"

The configuration file can be managed through configmap in k8s. Usually we have the following two ways to manage the configuration file:

  • Generate configmap through kubectl command line
# create the configmap directly from the file
$ kubectl -n default create configmap application-config --from-file=application-1.conf

# a configmap is generated; view its content, the configmap key is the file name
$ kubectl -n default get cm application-config -oyaml
  • Create directly through the yaml file
$ cat application-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: application-config
  namespace: default
data:
  application-1.conf: |
    name: "application"
    platform: "linux"
    purpose: "demo"
    company: "luffy"
    version: "v2.1.0"

# create the configmap
$ kubectl apply -f application-config.yaml

Prepare a demo-deployment.yaml file and mount the above configmap to /etc/application/

$ cat demo-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
  namespace: default
spec:
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      volumes:
      - configMap:
          name: application-config
        name: config
      containers:
      - name: nginx
        image: nginx:alpine
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: "/etc/application"
          name: config

Create and view:

$ kubectl apply -f demo-deployment.yaml

Modify the content of the configmap file and observe whether the pod automatically senses the change:

$ kubectl edit cm application-config
Because the entire ConfigMap is mounted as a volume, the pod eventually sees the change automatically: the updated content is synced into the mounted files.
However, the process inside the pod is not restarted automatically, so many services implement an internal reload interface to load the latest configuration into the running process.
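
A quick way to check that the update has propagated (the pod name is whatever kubectl get po reports for this demo deployment; note the sync can take up to about a minute):

$ kubectl -n default get po -l app=demo
$ kubectl -n default exec <demo-pod-name> -- cat /etc/application/application-1.conf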

Scenario 2: Mount multiple files

Suppose there are multiple configuration files and all of them need to be mounted inside the pod, in the same directory:

$ cat application-1.conf
name: "application-1"
platform: "linux"
purpose: "demo"
company: "luffy"
version: "v2.1.0"
$ cat application-2.conf
name: "application-2"
platform: "linux"
purpose: "demo"
company: "luffy"
version: "v2.1.0"

It can also be created in two ways:

$ kubectl delete cm application-config

$ kubectl create cm application-config --from-file=application-1.conf --from-file=application-2.conf

$ kubectl get cm application-config -oyaml

Observe that the Pod has automatically picked up the latest changes

$ kubectl exec demo-55c649865b-gpkgk ls /etc/application/
application-1.conf
application-2.conf

At this point the files are mounted into an empty directory in the pod, /etc/application. What if you want to mount them into a directory that already exists in the pod, for example:

$  kubectl exec   demo-55c649865b-gpkgk ls /etc/profile.d
color_prompt
locale

Change the mount directory of the deployment:

$ cat demo-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
  namespace: default
spec:
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      volumes:
      - configMap:
          name: application-config
        name: config
      containers:
      - name: nginx
        image: nginx:alpine
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: "/etc/profile.d"
          name: config

Rebuild the pod:

$ kubectl apply -f demo-deployment.yaml

# check the /etc/profile.d directory inside the pod: the original files have been overridden (hidden) by the mount
$ kubectl exec demo-77d685b9f7-68qz7 ls /etc/profile.d
application-1.conf
application-2.conf

Scenario 3: Mount with subPath

Multiple configuration files can be mounted into different directories in the pod. For example:

  • application-1.conf is mounted to /etc/application/
  • application-2.conf is mounted to /etc/profile.d

The configmap remains unchanged; modify the deployment file:

$ cat demo-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
  namespace: default
spec:
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      volumes:
      - name: config
        configMap:
          name: application-config
          items:
          - key: application-1.conf
            path: application1
          - key: application-2.conf
            path: application2
      containers:
      - name: nginx
        image: nginx:alpine
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: "/etc/application/application-1.conf"
          name: config
          subPath: application1
        - mountPath: "/etc/profile.d/application-2.conf"
          name: config
          subPath: application2

Test mount:

$ kubectl apply -f demo-deployment.yaml

$ kubectl exec demo-78489c754-shjhz ls /etc/application
application-1.conf

$ kubectl exec demo-78489c754-shjhz ls /etc/profile.d/
application-2.conf
color_prompt
locale


Files mounted into the Pod via subPath will not automatically pick up changes to the underlying ConfigMap

Deploy es service

deployment analysis

  1. In production, es is deployed as a cluster, usually managed with a StatefulSet
  2. By default, es runs its process as the elasticsearch user. The es data directory is mounted from a host path, so its permissions are those of the host directory; therefore an initContainer is used to fix the directory permissions before the es process starts. Note that the init container must run in privileged mode.
  3. If you want to use helm deployment, refer to  https://github.com/helm/charts/tree/master/stable/elasticsearch

Use StatefulSet to manage stateful services

When using Deployment to create multiple replica pods:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: default
  labels:
    app: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-deployment
  template:
    metadata:
      labels:
        app: nginx-deployment
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80

When using StatefulSet to create a multi-replica pod:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-statefulset
  namespace: default
  labels:
    app: nginx-sts
spec:
  replicas: 3
  serviceName: "nginx"
  selector:
    matchLabels:
      app: nginx-sts
  template:
    metadata:
      labels:
        app: nginx-sts
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
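
One visible difference is in the pod names: Deployment pods get a random suffix, while StatefulSet pods get stable ordinal names. The listing below is only an illustration (the Deployment hash suffixes are made up), but the StatefulSet naming pattern is fixed:

$ kubectl get po
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-7rt2x   1/1     Running   0          1m
nginx-deployment-5c689d88bb-jl4cp   1/1     Running   0          1m
nginx-deployment-5c689d88bb-zv9qk   1/1     Running   0          1m
nginx-statefulset-0                 1/1     Running   0          1m
nginx-statefulset-1                 1/1     Running   0          1m
nginx-statefulset-2                 1/1     Running   0          1m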

Headless Service

kind: Service
apiVersion: v1
metadata:
  name: nginx
  namespace: default
spec:
  selector:
    app: nginx-sts
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  clusterIP: None
$ kubectl -n default exec  -ti nginx-statefulset-0 sh
/ # curl nginx-statefulset-2.nginx
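
Behind the headless Service, each StatefulSet pod gets a stable DNS record of the form <pod-name>.<service-name>.<namespace>.svc.<cluster-domain>, so the short name used above can also be written out in full (assuming the default cluster domain cluster.local):

/ # curl nginx-statefulset-2.nginx.default.svc.cluster.local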

deploy and validate

es-config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: es-config
  namespace: logging
data:
  elasticsearch.yml: |
    cluster.name: "luffy-elasticsearch"
    node.name: "${POD_NAME}"
    network.host: 0.0.0.0
    discovery.seed_hosts: "es-svc-headless"
    cluster.initial_master_nodes: "elasticsearch-0,elasticsearch-1,elasticsearch-2"

es-svc-headless.yaml

apiVersion: v1
kind: Service
metadata:
  name: es-svc-headless
  namespace: logging
  labels:
    k8s-app: elasticsearch
spec:
  selector:
    k8s-app: elasticsearch
  clusterIP: None
  ports:
  - name: in
    port: 9300
    protocol: TCP

es-statefulset.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: logging
  labels:
    k8s-app: elasticsearch
spec:
  replicas: 3
  serviceName: es-svc-headless
  selector:
    matchLabels:
      k8s-app: elasticsearch
  template:
    metadata:
      labels:
        k8s-app: elasticsearch
    spec:
      initContainers:
      - command:
        - /sbin/sysctl
        - -w
        - vm.max_map_count=262144
        image: alpine:3.6
        imagePullPolicy: IfNotPresent
        name: elasticsearch-logging-init
        resources: {}
        securityContext:
          privileged: true
      - name: fix-permissions
        image: alpine:3.6
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: es-data-volume
          mountPath: /usr/share/elasticsearch/data
      containers:
      - name: elasticsearch
        image: 172.21.51.67:5000/elasticsearch/elasticsearch:7.4.2
        env:
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
        resources:
          limits:
            cpu: '1'
            memory: 2Gi
          requests:
            cpu: '1'
            memory: 2Gi
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
          - name: es-config-volume
            mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
            subPath: elasticsearch.yml
          - name: es-data-volume
            mountPath: /usr/share/elasticsearch/data
      volumes:
        - name: es-config-volume
          configMap:
            name: es-config
            items:
            - key: elasticsearch.yml
              path: elasticsearch.yml
  volumeClaimTemplates:
  - metadata:
      name: es-data-volume
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "nfs"
      resources:
        requests:
          storage: 5Gi

es-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: es-svc
  namespace: logging
  labels:
    k8s-app: elasticsearch
spec:
  selector:
    k8s-app: elasticsearch
  ports:
  - name: out
    port: 9200
    protocol: TCP
$ kubectl create namespace logging

## deploy the services
$ kubectl apply -f es-config.yaml
$ kubectl apply -f es-svc-headless.yaml
$ kubectl apply -f es-statefulset.yaml
$ kubectl apply -f es-svc.yaml

## wait a while, then check that the es pods are scheduled (here to the k8s-slave1 node) and their status becomes Running
$ kubectl -n logging get po -o wide  
NAME              READY   STATUS    RESTARTS   AGE   IP  
elasticsearch-0   1/1     Running   0          15m   10.244.0.126 
elasticsearch-1   1/1     Running   0          15m   10.244.0.127
elasticsearch-2   1/1     Running   0          15m   10.244.0.128
# then access the service with curl to verify that es has been deployed successfully
$ kubectl -n logging get svc  
es-svc            ClusterIP   10.104.226.175   <none>        9200/TCP   2s
es-svc-headless   ClusterIP   None             <none>        9300/TCP   32m 
$ curl 10.104.226.175:9200
{
  "name" : "elasticsearch-2",
  "cluster_name" : "luffy-elasticsearch",
  "cluster_uuid" : "7FDIACx9T-2ajYcB5qp4hQ",
  "version" : {
    "number" : "7.4.2",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "2f90bbf7b93631e52bafb59b3b049cb44ec25e96",
    "build_date" : "2019-10-28T20:40:44.881551Z",
    "build_snapshot" : false,
    "lucene_version" : "8.2.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"

Deploy kibana

deployment analysis

  1. Kibana needs to expose its web UI to users, so an ingress with a domain name is configured to access kibana
  2. Kibana is a stateless application and can be deployed directly with a Deployment
  3. Kibana needs to access es; with k8s service discovery it can simply use the address http://es-svc:9200

deploy and validate

efk/kibana.yaml

apiVersion: apps/v1
kind: Deployment
metadata: 
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  selector:
    matchLabels:
      app: "kibana"
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: 172.21.51.67:5000/kibana/kibana:7.4.2
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: ELASTICSEARCH_HOSTS
            value: http://es-svc:9200
          - name: SERVER_NAME
            value: kibana-logging
          - name: SERVER_REWRITEBASEPATH
            value: "false"
        ports:
        - containerPort: 5601
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: 5601
  type: ClusterIP
  selector:
    app: kibana
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kibana
  namespace: logging
spec:
  rules:
  - host: kibana.luffy.com
    http:
      paths:
      - path: /
        backend:
          serviceName: kibana
          servicePort: 5601
$ kubectl apply -f kibana.yaml  
deployment.apps/kibana created
service/kibana created  
ingress/kibana created

## configure DNS resolution for kibana.luffy.com and open the page to verify; if it loads, the connection to es is working
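
If DNS is not set up yet, a rough way to smoke-test the ingress from the command line is to send the Host header straight to the ingress controller; the controller address below is a placeholder for your environment, and /api/status is Kibana's status endpoint:

$ curl -H "Host: kibana.luffy.com" http://<ingress-controller-ip>/api/status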

Fluentd service deployment

deployment analysis

  1. Fluentd is the log collection service. Every business node in the kubernetes cluster produces logs, so fluentd is deployed as a DaemonSet
  2. To keep resource usage under control, the DaemonSet uses a node selector label, fluentd=true; only nodes carrying this label will run fluentd
  3. Log collection needs to specify which directories to collect and where to send the logs afterwards (the es side), so there is quite a lot to configure; the entire configuration file is therefore mounted from a configmap

deploy the service

efk/fluentd-es-config-main.yaml

apiVersion: v1
data:
  fluent.conf: |-
    # This is the root config file, which only includes components of the actual configuration
    #
    #  Do not collect fluentd's own logs to avoid infinite loops.
    <match fluent.**>
    @type null
    </match>

    @include /fluentd/etc/config.d/*.conf
kind: ConfigMap
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
  name: fluentd-es-config-main
  namespace: logging

Configuration file, fluentd-config.yaml, note:

  1. For the source configuration: by default, the containers' stdout and stderr logs end up on the host under /var/log/containers/
  2. The fluentd image integrates the kubernetes_metadata_filter plugin, which parses the logs and attaches k8s-related metadata; the source tags events with the raw.kubernetes prefix
  3. The match section outputs to es, including the buffer/flush configuration for that output

efk/fluentd-configmap.yaml

kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  containers.input.conf: |-
    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      localtime
      tag raw.kubernetes.*
      format json
      read_from_head false
    </source>
    # Detect exceptions in the log output and forward them as one log entry.
    # https://github.com/GoogleCloudPlatform/fluent-plugin-detect-exceptions 
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>
  output.conf: |-
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      include_tag_key true
      hosts elasticsearch-0.es-svc-headless:9200,elasticsearch-1.es-svc-headless:9200,elasticsearch-2.es-svc-headless:9200
      #port 9200
      logstash_format true
      #index_name kubernetes-%Y.%m.%d
      request_timeout    30s
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>

daemonset definition file, fluentd.yaml, note:

  1. RBAC rules need to be configured, because fluentd accesses the k8s API to look up metadata for the logs
  2. The /var/log/containers/ directory needs to be mounted into the container
  3. The configuration file in fluentd's configmap needs to be mounted into the container
  4. Nodes that should run fluentd must be labeled with fluentd=true

efk/fluentd.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd-es
  namespace: logging
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "namespaces"
  - "pods"
  verbs:
  - "get"
  - "watch"
  - "list"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: fluentd-es
  namespace: logging
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: fluentd-es
  apiGroup: ""
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    k8s-app: fluentd-es
  name: fluentd-es
  namespace: logging
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-es
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
    spec:
      containers:
      - env:
        - name: FLUENTD_ARGS
          value: --no-supervisor -q
        image: 172.21.51.67:5000/fluentd_elasticsearch/fluentd:v2.5.2
        imagePullPolicy: IfNotPresent
        name: fluentd-es
        resources:
          limits:
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - mountPath: /var/log
          name: varlog
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
          readOnly: true
        - mountPath: /fluentd/etc/config.d
          name: config-volume
        - mountPath: /fluentd/etc/fluent.conf
          name: config-volume-main
          subPath: fluent.conf
      nodeSelector:
        fluentd: "true"
      securityContext: {}
      serviceAccount: fluentd-es
      serviceAccountName: fluentd-es
      volumes:
      - hostPath:
          path: /var/log
          type: ""
        name: varlog
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: varlibdockercontainers
      - configMap:
          defaultMode: 420
          name: fluentd-config
        name: config-volume
      - configMap:
          defaultMode: 420
          items:
          - key: fluent.conf
            path: fluent.conf
          name: fluentd-es-config-main
        name: config-volume-main
## label the slave nodes so that the fluentd log collection service is deployed on them
$ kubectl label node k8s-slave1 fluentd=true  
$ kubectl label node k8s-slave2 fluentd=true

# create the services
$ kubectl apply -f fluentd-es-config-main.yaml  
configmap/fluentd-es-config-main created  
$ kubectl apply -f fluentd-configmap.yaml  
configmap/fluentd-config created  
$ kubectl apply -f fluentd.yaml  
serviceaccount/fluentd-es created  
clusterrole.rbac.authorization.k8s.io/fluentd-es created  
clusterrolebinding.rbac.authorization.k8s.io/fluentd-es created  
daemonset.extensions/fluentd-es created 

## then check whether the pod is already running on k8s-slave1
$ kubectl -n logging get po -o wide
NAME                      READY   STATUS    RESTARTS   AGE  
elasticsearch-logging-0   1/1     Running   0          123m  
fluentd-es-246pl          1/1     Running   0          2m2s  
kibana-944c57766-ftlcw    1/1     Running   0          50m
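
Once fluentd is shipping logs, a simple way to confirm that they are reaching Elasticsearch is to list the indices through the es-svc ClusterIP shown earlier (yours will differ); with logstash_format true, daily logstash-YYYY.MM.DD indices should start to appear:

$ curl 10.104.226.175:9200/_cat/indices?v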

The above is a simplified version of the log collection setup for k8s. The full version can be found at https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch .

EFK functional verification

Verification ideas

Start a service on a labeled slave node that prints test logs to standard output, then check whether the logs can be seen in kibana.

Create a test container

efk/test-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  nodeSelector:
    fluentd: "true"
  containers:
  - name: count
    image: alpine:3.6
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']
$ kubectl get po  
NAME                          READY   STATUS    RESTARTS   AGE  
counter                       1/1     Running   0          6s
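
Before switching to kibana, the counter's stdout can be checked directly; these are exactly the lines that fluentd tails from /var/log/containers/ on the node:

$ kubectl logs counter --tail=3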

configure kibana

Log in to the kibana web UI and follow the steps from the course screenshots: create an index pattern matching the collected indices (e.g. logstash-*), then open Discover to view the logs.

Log data can also be filtered by other metadata. For example, click on any log entry to view metadata such as the container name, Kubernetes node, and namespace, and filter on it, e.g. kubernetes.pod_name : counter

So far, we have successfully deployed EFK on the Kubernetes cluster. To learn how to use Kibana for log data analysis, you can refer to the Kibana user guide document: https://www.elastic.co/guide/en/kib

Origin blog.csdn.net/guolianggsta/article/details/131609826