Preface:
On a freshly installed Kubernetes cluster, audit logging for the apiserver is usually not enabled.
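Audit logging is a common hardening measure for kube-apiserver: auditing every request helps secure the cluster. Audit logs also help with troubleshooting, because the content of each request is recorded. If a malformed request causes the API to return a 5xx error, we can pull the error details straight from the audit log and hand them to the developers to help them locate the problem.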
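One caveat deserves special attention: an inappropriate audit policy that records everything can waste memory on the cluster. Audit logging increases the memory consumption of the API server, because some context required for auditing has to be stored for each request. How much memory is consumed depends on the audit logging configuration.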
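Audit records originate inside kube-apiserver. Each request generates an audit event at each stage of its execution; these events are pre-processed according to a policy and written to a backend. The policy determines what is recorded, and the backend stores the records. The currently supported backends are log files and webhooks.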
I.
How do we correctly enable the apiserver audit log?
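First, we need an audit policy file. It is written in YAML and defines which requests are audited and how much of each is captured.
As usual, the audit feature is described in detail in the official documentation: Auditing | Kubernetes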
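When are events recorded?
Each request can be recorded with an associated stage. The defined stages are:
RequestReceived
- The stage for events generated as soon as the audit handler receives the request, before it is delegated down the handler chain.
ResponseStarted
- Generated once the response headers are sent but before the response body is sent. Only long-running requests (for example, watch) generate this stage.
ResponseComplete
- Generated when the response body has been completed and no more data will be sent.
Panic
- Generated when a panic occurs.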
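What is recorded?
The audit policy defines rules about which events should be recorded and what data they should include. The audit policy object structure is defined in the audit.k8s.io API group. When an event is processed, it is compared against the list of rules in order. The first matching rule sets the audit level of the event. The defined audit levels are:
None
- Events that match this rule are not logged.
Metadata
- Log the request metadata (requesting user, timestamp, resource, verb, and so on), but not the request or response body.
Request
- Log the event metadata and the request body, but not the response body. This does not apply to non-resource requests.
RequestResponse
- Log the event metadata plus the request and response bodies. This does not apply to non-resource requests.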
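The four levels above can be summarized as a small lookup table. The following is only an illustrative Python sketch, not kube-apiserver code: the names LEVEL_PARTS and recorded_parts are hypothetical, and treating a non-resource request as "metadata only" is a modelling assumption based on the note that Request and RequestResponse do not apply to such requests.

```python
# Which parts of a request each audit level keeps, following the list above.
LEVEL_PARTS = {
    "None": set(),
    "Metadata": {"metadata"},
    "Request": {"metadata", "requestObject"},
    "RequestResponse": {"metadata", "requestObject", "responseObject"},
}


def recorded_parts(level, is_resource_request=True):
    """Return the parts of a request that this level would record."""
    parts = set(LEVEL_PARTS[level])
    if not is_resource_request:
        # Assumption: bodies are never kept for non-resource requests,
        # so at most the metadata survives.
        parts &= {"metadata"}
    return parts


print(sorted(recorded_parts("Request")))
print(sorted(recorded_parts("RequestResponse", is_resource_request=False)))
```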
Below is a fairly standard audit policy definition:
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Do not generate audit events for any request in the RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # The following requests were manually identified as high-volume and low-risk, so drop them.
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
      - group: "" # core
        resources: ["endpoints", "services"]
  - level: None
    users: ["system:unsecured"]
    namespaces: ["kube-system"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["configmaps"]
  - level: None
    users: ["kubelet"] # legacy kubelet identity
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["nodes"]
  - level: None
    userGroups: ["system:nodes"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["nodes"]
  - level: None
    users:
      - system:kube-controller-manager
      - system:kube-scheduler
      - system:serviceaccount:kube-system:endpoint-controller
    verbs: ["get", "update"]
    namespaces: ["kube-system"]
    resources:
      - group: "" # core
        resources: ["endpoints"]
  - level: None
    users: ["system:apiserver"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["namespaces"]
  # Do not log these read-only URLs.
  - level: None
    nonResourceURLs:
      - /healthz*
      - /version
      - /swagger*
  # Do not log events requests.
  - level: None
    resources:
      - group: "" # core
        resources: ["events"]
  # Secrets, ConfigMaps, and TokenReviews can contain sensitive and binary data,
  # so only log them at the Metadata level.
  - level: Metadata
    resources:
      - group: "" # core
        resources: ["secrets", "configmaps"]
      - group: authentication.k8s.io
        resources: ["tokenreviews"]
  # Read-only requests to known APIs: log the request body but not the response.
  - level: Request
    verbs: ["get", "list", "watch"]
    resources:
      - group: "" # core
      - group: "admissionregistration.k8s.io"
      - group: "apps"
      - group: "authentication.k8s.io"
      - group: "authorization.k8s.io"
      - group: "autoscaling"
      - group: "batch"
      - group: "certificates.k8s.io"
      - group: "extensions"
      - group: "networking.k8s.io"
      - group: "policy"
      - group: "rbac.authorization.k8s.io"
      - group: "settings.k8s.io"
      - group: "storage.k8s.io"
  # Default level for known APIs.
  - level: RequestResponse
    resources:
      - group: "" # core
      - group: "admissionregistration.k8s.io"
      - group: "apps"
      - group: "authentication.k8s.io"
      - group: "authorization.k8s.io"
      - group: "autoscaling"
      - group: "batch"
      - group: "certificates.k8s.io"
      - group: "extensions"
      - group: "networking.k8s.io"
      - group: "policy"
      - group: "rbac.authorization.k8s.io"
      - group: "settings.k8s.io"
      - group: "storage.k8s.io"
      - group: "autoscaling.alibabacloud.com"
  # Default level for all other requests.
  - level: Metadata
Assuming this file is stored at /etc/kubernetes/logpolicy/sample-policy.yaml, we now enable audit logging in the apiserver manifest (lines marked "added" are the ones to add):
cat /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 192.168.123.11:6443
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=192.168.123.11
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --audit-policy-file=/etc/kubernetes/logpolicy/sample-policy.yaml # added: absolute path to the audit policy file
    - --audit-log-path=/var/log/kubernetes/audit-logs/audit.log # added: absolute path to the audit log
    - --audit-log-maxsize=7 # added: in megabytes; the log is rotated once it reaches 7 MB
    - --audit-log-maxbackup=2 # added: keep at most two rotated (old) log files
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-issuer=https://kubernetes.default.svc.cluster.local
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    image: registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.15
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 192.168.123.11
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-apiserver
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: 192.168.123.11
        path: /readyz
        port: 6443
        scheme: HTTPS
      periodSeconds: 1
      timeoutSeconds: 15
    resources:
      requests:
        cpu: 250m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 192.168.123.11
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/kubernetes/logpolicy/sample-policy.yaml # added
      name: audit-policy # added
      readOnly: true # added
    - mountPath: /var/log/kubernetes/audit-logs # added
      name: audit-logs # added
      readOnly: false # added: must not be true, otherwise the apiserver cannot start
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath: # added
      path: /etc/kubernetes/logpolicy/sample-policy.yaml # added
      type: File # added: must be File, because the path is a single existing file
    name: audit-policy # added
  - hostPath: # added
      path: /var/log/kubernetes/audit-logs # added: no need to create this directory by hand; it is created automatically
      type: DirectoryOrCreate # added: use this type so the directory is created if missing
    name: audit-logs # added
status: {}
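To get a feel for how the two rotation flags interact, here is a rough Python estimate of the disk space the audit log can occupy. It is a back-of-the-envelope sketch, not an official formula: the function name max_audit_log_disk_mb is made up for illustration, and it assumes each file stays at or below --audit-log-maxsize and that a --audit-log-maxbackup of 0 means the number of old files is not limited.

```python
def max_audit_log_disk_mb(maxsize_mb, maxbackup):
    """Upper bound on audit log disk usage in MB, or None if unbounded."""
    if maxbackup == 0:
        # Assumption: 0 places no restriction on the number of old files.
        return None
    # One active log file plus up to `maxbackup` rotated files.
    return maxsize_mb * (maxbackup + 1)


# Values from the manifest above: --audit-log-maxsize=7, --audit-log-maxbackup=2
print(max_audit_log_disk_mb(7, 2))
```

With the values used in this article the bound is 7 MB for the active file plus 2 x 7 MB for the rotated files.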
II.
Verifying the audit policy
The following command generates audit log entries:
kubectl get configmaps -n kube-system
Searching the log with grep and jq for get requests on configmaps returns nothing:
cat audit.log |grep configmaps|grep get |jq
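Why is there no matching entry? Running kubectl get configmaps without a resource name sends a list request, so the audit event is recorded with the verb list rather than get. Note that the following rule from the policy above is not the cause: it only suppresses get requests made by the user system:unsecured, whereas this request was made by kubernetes-admin:
  - level: None
    users: ["system:unsecured"]
    namespaces: ["kube-system"]
    verbs: ["get"]
    resources:
      - group: "" # core
        resources: ["configmaps"]
Searching for list finds the entries: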
[root@k8s-master audit-logs]# cat audit.log |grep configmaps|grep list |jq
{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "Metadata",
"auditID": "39290452-13a5-4547-a1f3-8bd214ad88b4",
"stage": "ResponseComplete",
"requestURI": "/api/v1/configmaps?limit=500",
"verb": "list",
"user": {
"username": "kubernetes-admin",
"groups": [
"system:masters",
"system:authenticated"
]
},
"sourceIPs": [
"192.168.123.11"
],
"userAgent": "kubectl/v1.23.15 (linux/amd64) kubernetes/b84cb8a",
"objectRef": {
"resource": "configmaps",
"apiVersion": "v1"
},
"responseStatus": {
"metadata": {},
"code": 200
},
"requestReceivedTimestamp": "2023-01-08T15:13:38.086276Z",
"stageTimestamp": "2023-01-08T15:13:38.092581Z",
"annotations": {
"authorization.k8s.io/decision": "allow",
"authorization.k8s.io/reason": ""
}
}
{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "Metadata",
"auditID": "ae5f5d3a-9f26-46be-bfa8-f0889a0321c4",
"stage": "ResponseComplete",
"requestURI": "/api/v1/namespaces/default/configmaps?limit=500",
"verb": "list",
"user": {
"username": "kubernetes-admin",
"groups": [
"system:masters",
"system:authenticated"
]
},
"sourceIPs": [
"192.168.123.11"
],
"userAgent": "kubectl/v1.23.15 (linux/amd64) kubernetes/b84cb8a",
"objectRef": {
"resource": "configmaps",
"namespace": "default",
"apiVersion": "v1"
},
"responseStatus": {
"metadata": {},
"code": 200
},
"requestReceivedTimestamp": "2023-01-08T15:14:01.133134Z",
"stageTimestamp": "2023-01-08T15:14:01.137090Z",
"annotations": {
"authorization.k8s.io/decision": "allow",
"authorization.k8s.io/reason": ""
}
}
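The grep pipeline above matches raw text, so it can also hit unrelated fields. As an alternative, the events can be filtered on their structured fields. The snippet below is a minimal Python sketch under the assumption that the log backend writes one JSON event per line; the helper name filter_events and the two sample lines are made up for illustration and are trimmed down from the real events shown above.

```python
import json


def filter_events(lines, resource, verb):
    """Yield audit events whose objectRef.resource and verb both match."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if ref.get("resource") == resource and event.get("verb") == verb:
            yield event


# Hypothetical, shortened sample lines standing in for audit.log content.
sample = [
    '{"verb": "list", "objectRef": {"resource": "configmaps"}, '
    '"user": {"username": "kubernetes-admin"}}',
    '{"verb": "get", "objectRef": {"resource": "nodes"}, '
    '"user": {"username": "kubelet"}}',
]

matches = list(filter_events(sample, "configmaps", "list"))
for event in matches:
    print(event["user"]["username"], event["verb"])
```

To run it against the real file, pass an open file handle instead of the sample list, for example open("/var/log/kubernetes/audit-logs/audit.log").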
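III.
Audit backends
Audit backends export audit events to external storage. kube-apiserver
provides two backends out of the box:
- Log backend, which writes events to the filesystem
- Webhook backend, which sends events to an external HTTP API
This article uses the Log backend, which only writes events to the filesystem. The resulting log file can then be collected by a shipper such as Filebeat to build a complete logging system.
Audit policy summary: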
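1. Rules act as a whitelist: a request is logged only when the rule it matches says so. (Verification: if a Metadata rule is placed after a catch-all None rule, nothing is logged; if the Metadata rule is removed and only the None rules remain, nothing is logged at all; if the Metadata rule comes before the None rules, matching requests are logged.)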
For example, this policy produces log entries:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
    resources:
      - group: "" # core
        resources: ["secrets", "configmaps"]
      - group: authentication.k8s.io
        resources: ["tokenreviews"]
  - level: None
  - level: None
    userGroups: ["system:kube-controller-manager"]
    nonResourceURLs:
      - "/api*" # Wildcard matching.
      - "/version"
  - level: None
    resources:
      - group: "" # core
        resources: ["events"]
After swapping the positions of the Metadata and None rules, nothing is logged:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: None
  - level: None
    userGroups: ["system:kube-controller-manager"]
    nonResourceURLs:
      - "/api*" # Wildcard matching.
      - "/version"
  - level: None
    resources:
      - group: "" # core
        resources: ["events"]
  - level: Metadata
    resources:
      - group: "" # core
        resources: ["secrets", "configmaps"]
      - group: authentication.k8s.io
        resources: ["tokenreviews"]
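2. Within a rule, the leading field (level) is the result, like the output of a SELECT; the remaining fields are the conditions, like a WHERE clause. (Verification: look at the log output.)
3. Within a rule, multiple conditions are ANDed together: a request must satisfy all of them for the rule's level to apply.
4. rules is an array, and earlier entries have higher priority: a request is checked against the policy file in order, and the first rule it matches determines the result.
5. omitStages can be set globally or on an individual rule. (Verification: omitted.)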
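Points 3 and 4 can be illustrated with a small simulation. This is a deliberately simplified Python model, not the kube-apiserver implementation: a request is a plain dict, a rule only supports the users, verbs, and resources conditions, resources is a flat list instead of the group/resources structure, and the names rule_matches and audit_level are hypothetical.

```python
def rule_matches(rule, request):
    """A rule matches only if every condition it specifies holds (AND)."""
    if "users" in rule and request.get("user") not in rule["users"]:
        return False
    if "verbs" in rule and request.get("verb") not in rule["verbs"]:
        return False
    if "resources" in rule and request.get("resource") not in rule["resources"]:
        return False
    return True  # a rule without conditions matches everything


def audit_level(rules, request):
    """Return the level of the first matching rule (first match wins)."""
    for rule in rules:
        if rule_matches(rule, request):
            return rule["level"]
    return "None"  # no rule matched, so nothing is logged


metadata_rule = {"level": "Metadata", "resources": ["secrets", "configmaps"]}
catch_all_none = {"level": "None"}
request = {"user": "kubernetes-admin", "verb": "list", "resource": "configmaps"}

# Metadata rule first: the request is logged at the Metadata level.
print(audit_level([metadata_rule, catch_all_none], request))
# Catch-all None rule first: it wins, and the request is not logged.
print(audit_level([catch_all_none, metadata_rule], request))
```

The two calls mirror the two example policies above: the same rules in a different order give a different result because evaluation stops at the first match.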