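1. Overall goals

Looking at what the business needs from a logging platform, we want it to provide at least the following logs:

Platform component logs (for operations)
- Native k8s component logs, e.g. kube-apiserver, kubelet
- Logs of in-house components
Application logs (for application owners)
- Logs the application writes to stdout/stderr
- Logs the application writes to files
Audit logs
- Operation logs of users logged in to the PaaS platform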

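2. Approach and key points

2.1 Collecting platform component logs

2.1.1 How platform components are started

How a component's logs are collected depends on how the component is deployed. Platform components are currently started in one of two ways:

Components started by systemd: mainly etcd and kubelet.
Components running as containers on k8s: the remaining native k8s components (via hyperkube) and the in-house components.

Note:
My k8s cluster was deployed with kargo (now called kubespray). etcd and kubelet are containers started by systemd, so systemd is responsible for keeping them running. The other native k8s components run on k8s as static pods, so k8s is responsible for keeping them running.

2.1.2 Collection approach for platform component logs

Logs of components running inside the k8s cluster can be collected by a fluentd started on every node through a DaemonSet, which reads the container logs' JSON files on the host.
Logs of components started by systemd, such as etcd and kubelet, can be collected from the journal.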

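2.2 Collecting application logs

2.2.1 How applications output logs

Every application here is a container running in the k8s cluster, but how its logs are collected depends on how it outputs them. The platform currently supports two kinds of collection:

Logs written to stdout/stderr (the standard way)
Logs written to files (the customized way)

2.2.2 Collection approach for application logs

Logs written to standard output (stdout/stderr) can be collected from the container logs' JSON files by the same fluentd described in section 2.1.
Logs written to files at a specified location can be collected by a filebeat started as a sidecar in the same pod as the application, which covers users' customized collection needs.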

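3. Implementation

As section 2 shows, there are three collection paths:

fluentd started as a DaemonSet, using the built-in tail plugin to collect the logs of containerized applications and some components from the JSON files
fluentd started as a DaemonSet, using the fluent-plugin-systemd plugin to collect the logs of the systemd-started etcd and kubelet from the journal
filebeat started as a sidecar, collecting customized application logs written to files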

3.1 Deploying and configuring fluentd

fluentd needs a few extra plugins installed:

fluent-plugin-elasticsearch
(fluent-plugin-kafka)
fluent-plugin-grep
fluent-plugin-systemd

Notes:

The base image most commonly used for fluentd images is alpine, but fluent-plugin-systemd depends on libraries such as libsystemd.so.0, which cannot be installed on alpine because alpine has no systemd (see https://github.com/fluent/fluentd-docker-image/issues/49 ). The base image was therefore switched to debian:stretch-slim (see https://github.com/fluent/fluentd-docker-image/pull/71 ).
The official fluentd-docker-image does not run fluentd as root, so fluentd cannot read the journal directories such as /var/log or /run/log (see https://github.com/fluent/fluentd-docker-image/issues/48 ). Adding USER root to the Dockerfile does not help either, because of the image's entrypoint.
My fluentd Dockerfile is at https://github.com/liukuan73/fluentd . It is based on the official Dockerfile at https://github.com/fluent/fluentd-docker-image/tree/master/v0.12/debian , with the following changes:
- 1. Installed the extra plugins I need
- 2. Installed several custom filter plugins I wrote to process journal logs
- 3. Runs fluentd as root
The fluent-plugin-elasticsearch documentation recommends fluent-plugin-elasticsearch >= 2.0.0 for fluentd >= 0.14.0, and fluent-plugin-elasticsearch < 2.0.0 for fluentd >= 0.12: https://github.com/uken/fluent-plugin-elasticsearch
For how to configure fluent-plugin-systemd, see https://github.com/reevoo/fluent-plugin-systemd

Annotated fluent.conf from the ConfigMap:

# Container logs
<source>
  # The tail input plugin follows the container stdout/stderr log files
  @type tail
  # Each line is JSON, so parse it with the json parser
  format  json
  path /var/log/containers/*.log
  # Records how far each file has been read
  pos_file /var/log/containers/container.pos
  # Re-scan the list of watched files every 2 seconds
  refresh_interval 2
  # time_field is used later in <match> when building indices by year and month
  time_key time_field
  keep_time_key true
  # A trailing ".*" appends the file path to the tag, i.e.
  # container.var.log.containers.podName_namespaceName_containerName-containerID.log
  tag  container.*
</source>

<filter container.**>
  @type record_transformer
  enable_ruby
  <record>
    # Extract podName_namespaceName_containerName-containerID from the tag
    logname     ${tag.split('.')[4]}
  </record>
</filter>

# Split the container logs into application logs and component logs, handled separately
<match container.**>
  @type copy
  <store>
    @type grep
    input_key logname
    exclude ^kube
    remove_tag_prefix container
    add_tag_prefix application
  </store>
  <store>
    @type grep
    regexp1 logname ^kube
    remove_tag_prefix container
    add_tag_prefix component
  </store>
</match>
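
To make the routing above concrete, here is a minimal plain-Ruby sketch (no fluentd involved) of what the record_transformer and grep steps compute for a single tag. The helper names `logname_from_tag` and `route_for` and the sample tags are hypothetical, introduced only for illustration, and the sketch assumes pod and container names contain no dots.

```ruby
# Hypothetical helpers mirroring the config above; not part of fluentd.

# The tail source tags each event as "container." plus the file path with
# "/" turned into ".", so the file name is the fifth dot-separated part.
def logname_from_tag(tag)
  tag.split('.')[4]
end

# The two grep stores send lognames starting with "kube" to component.*
# and everything else to application.*.
def route_for(logname)
  logname =~ /^kube/ ? 'component' : 'application'
end

app_tag  = 'container.var.log.containers.web-1234567890-abcde_default_web-0123abcd.log'
kube_tag = 'container.var.log.containers.kube-apiserver-node1_kube-system_kube-apiserver-89efcdab.log'

puts route_for(logname_from_tag(app_tag))   # application
puts route_for(logname_from_tag(kube_tag))  # component
```

Note that an application whose pod name happens to start with "kube" would be routed to the component branch by this rule.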

<filter application.**>
   @type record_transformer
   # The expressions below, including host #{Socket.gethostname}, are Ruby
   enable_ruby
   <record>
     app            ${(record['logname'].split('_')[0].split('-')[-2]  =~ /\d{10}/)==0 ? record['logname'].split('_')[0].split('-')[0..-3].join("-"):record['logname'].split('_')[0].split('-')[0..-2].join("-")}
     namespace      ${record['logname'].split('_')[1]}
     podname        ${record['logname'].split('_')[0]}
     message        ${record['log']}
     logtime        ${record['time']}
     cluster        cluster-1
     host           #{Socket.gethostname}
   </record>
   remove_keys    logname,log,stream,time
</filter>
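
The one-line `app` expression in the filter above is dense, so here is the same logic restated as a plain-Ruby sketch. The helper name `app_from_logname` and the sample lognames are hypothetical. Reading the 10-digit part as a Deployment-generated hash is my interpretation of the expression, not something the config states.

```ruby
# Hypothetical helper restating the `app` expression from the filter above.
def app_from_logname(logname)
  parts = logname.split('_')[0].split('-')   # pod name, split on "-"
  # If the second-to-last part starts with 10 digits, drop the last two
  # parts (presumably hash + random suffix); otherwise drop only the last.
  if (parts[-2] =~ /\d{10}/) == 0
    parts[0..-3].join('-')
  else
    parts[0..-2].join('-')
  end
end

puts app_from_logname('web-1234567890-abcde_default_web-0123abcd')  # web
puts app_from_logname('mysql-0_default_mysql-0123abcd')             # mysql
```

Pod names whose hash part is not 10 digits long fall into the second branch, so only the final suffix is stripped for them.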
<filter component.**>
   @type record_transformer
   enable_ruby
   <record>
     message        ${record['log']}
     logtime        ${record['time']}
     cluster        cluster-1
     component_name  ${record['logname'].split('-')[0]+'-'+record['logname'].split('-')[1]}
     host           #{Socket.gethostname}
   </record>
   remove_keys    logname,log,time,stream
</filter>
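
The `component_name` expression above keeps only the first two dash-separated parts of the logname. A small plain-Ruby sketch shows what that yields; the helper name `component_name_of` and the sample lognames are hypothetical.

```ruby
# Hypothetical helper restating the component_name expression above.
def component_name_of(logname)
  parts = logname.split('-')
  parts[0] + '-' + parts[1]
end

puts component_name_of('kube-apiserver-node1_kube-system_kube-apiserver-89efcdab')
# kube-apiserver
puts component_name_of('kube-controller-manager-node1_kube-system_kube-controller-manager-12ab34cd')
# kube-controller
```

As the second call shows, a three-part name such as kube-controller-manager is shortened to kube-controller, which is worth keeping in mind when searching by component_name.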

<filter component.**>
  # Custom loglevel plugin (written by the author) that extracts the severity
  @type loglevel
</filter>
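
The source of the custom loglevel plugin is not shown here, so the following is only a guess at the kind of logic such a filter might contain, written as plain Ruby rather than as a fluentd plugin. It assumes glog-style lines, which begin with a level letter (I, W, E or F) followed by the month and day. The names `GLOG_LEVELS` and `severity_of` are hypothetical.

```ruby
# Hypothetical sketch; the author's loglevel plugin is not shown in this article.
GLOG_LEVELS = { 'I' => 'info', 'W' => 'warning', 'E' => 'error', 'F' => 'fatal' }.freeze

# glog-style lines start with a level letter followed by month and day,
# e.g. "I0102 15:04:05.000000       1 server.go:42] message".
def severity_of(message)
  m = /\A([IWEF])\d{4} /.match(message)
  m ? GLOG_LEVELS[m[1]] : 'unknown'
end

puts severity_of('E0102 15:04:05.000000       1 reflector.go:205] watch failed')  # error
puts severity_of('plain text line')                                               # unknown
```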

<match application.**>
  @type elasticsearch_dynamic
   flush_interval 5s
   hosts elasticsearch.kube-system.svc.cluster.local:9200
   # See https://github.com/uken/fluent-plugin-elasticsearch#logstash_format
   logstash_format true
   logstash_prefix k8s-application
   # Build one ES index per year and month: https://github.com/uken/fluent-plugin-elasticsearch#logstash_dataformat
   logstash_dateformat %Y.%m
   type_name ${record['cluster']}-${record['namespace']}
</match>
<match component.**>
  @type elasticsearch_dynamic
  flush_interval 5s
  hosts elasticsearch.kube-system.svc.cluster.local:9200
  logstash_format true
  logstash_prefix k8s-component
  logstash_dateformat %Y.%m
  type_name ${record['cluster']}
</match>
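
With logstash_format enabled, the index name is the prefix followed by the formatted event time. The plain-Ruby sketch below shows which index an event would land in under the two match blocks above. The helper name `index_name` is hypothetical, and the sketch assumes index names are derived from the event time in UTC.

```ruby
# Hypothetical helper; mirrors how logstash_prefix and logstash_dateformat
# combine into an index name (assuming the event time is taken in UTC).
def index_name(prefix, event_time, dateformat = '%Y.%m')
  "#{prefix}-#{event_time.getutc.strftime(dateformat)}"
end

t = Time.utc(2018, 3, 15, 8, 30, 0)
puts index_name('k8s-application', t)  # k8s-application-2018.03
puts index_name('k8s-component', t)    # k8s-component-2018.03
```

Using %Y.%m rather than a daily format means one index per month per prefix, which keeps the number of indices small at the cost of larger individual indices.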


# etcd and kubelet logs
<source>
  # fluent-plugin-systemd reads entries from the journal
  @type systemd
  # Depends on the host; usually /var/log/journal or /run/log/journal
  path /run/log/journal
  filters [{ "_SYSTEMD_UNIT": "etcd.service" }]
  tag etcd
  read_from_head false
  <storage>
    @type local
    persistent false
    path /run/log/journal/journal-etcd.pos
  </storage>
</source>
<source>
  @type systemd
  path /run/log/journal
  filters [{ "_SYSTEMD_UNIT": "kubelet.service" }]
  tag kubelet
  read_from_head false
  <storage>
    @type local
    persistent false
    path /run/log/journal/journal-kubelet.pos
  </storage>
</source>

<filter etcd>
  # Custom etcd plugin (written by the author) that extracts the severity
  @type etcd
</filter>
<filter kubelet>
  @type kubelet
</filter>

<filter {etcd}>
   @type record_transformer
   enable_ruby
   <record>
     host           #{Socket.gethostname}
     cluster        cluster-1
     component_name etcd
   </record>
   remove_keys  MESSAGE,_SELINUX_CONTEXT,__CURSOR,__REALTIME_TIMESTAMP,__MONOTONIC_TIMESTAMP,_BOOT_ID,_TRANSPORT,PRIORITY,SYSLOG_FACILITY,_UID,_GID,_CAP_EFFECTIVE,_SYSTEMD_SLICE,_MACHINE_ID,_HOSTNAME,SYSLOG_IDENTIFIER,_PID,_COMM,_EXE,_CMDLINE,_SYSTEMD_CGROUP,_SYSTEMD_UNIT
</filter>
<filter {kubelet}>
   @type record_transformer
   enable_ruby
   <record>
     host           #{Socket.gethostname}
     cluster        cluster-1
     component_name kubelet
   </record>
   remove_keys  MESSAGE,_SELINUX_CONTEXT,__CURSOR,__REALTIME_TIMESTAMP,__MONOTONIC_TIMESTAMP,_BOOT_ID,_TRANSPORT,PRIORITY,SYSLOG_FACILITY,_UID,_GID,_CAP_EFFECTIVE,_SYSTEMD_SLICE,_MACHINE_ID,_HOSTNAME,SYSLOG_IDENTIFIER,_PID,_COMM,_EXE,_CMDLINE,_SYSTEMD_CGROUP,_SYSTEMD_UNIT
</filter>

<match {etcd,kubelet}>
  @type elasticsearch_dynamic
  flush_interval 5s
  hosts elasticsearch.kube-system.svc.cluster.local:9200
  logstash_format true
  logstash_prefix journal
  logstash_dateformat %Y.%m
  type_name ${record['cluster']}
</match>


# Audit logs
<source>
  @type tail
  format json
  path /usr/local/openresty/nginx/logs/access.log
  pos_file /var/log/containers/access.pos
  time_key time_field
  keep_time_key true
  tag nginx
</source>

<filter nginx>
  @type nginx
</filter>

<match nginx>
  @type elasticsearch_dynamic
  flush_interval 5s
  hosts elasticsearch.kube-system.svc.cluster.local:9200
  index_name k8s-audit
  type_name ${record['cluster']}-${record['namespace']}
  remove_keys @timestamp,method
</match>

3.2 Deploying and configuring the filebeat sidecar

To be continued

A Detailed Look at a Unified Logging System for a Kubernetes-Based PaaS Platform