Intention
Non-sustainable way to customize Kubernetes
- Fork & Sending PRs to upstream
- without extensibility...
- Reduce the burden on the Kubernetes maintainers
- Encourage extensions to follow Kubernetes conventions
Extension Patterns
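A webhook sends an HTTP POST request and lets a remote server make the decision. A plugin makes the decision by invoking a binary executable. A controller is a way to implement automation: Kubernetes resources are declarative, so you define the desired state of a resource, and the controller repeatedly reads the actual state and acts to move it toward the desired state. A controller therefore keeps reading from the api-server. The built-in kube-controller-manager contains many such controllers.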
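The controller pattern described above can be sketched as a small reconcile loop. This is only an illustration under simplifying assumptions: the `Deployment` class, `reconcile`, and `control_loop` are hypothetical names, and the state lives in memory rather than being read from the api-server.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical resource holding a desired and an actual replica count."""
    desired_replicas: int
    actual_replicas: int = 0


def reconcile(obj: Deployment) -> bool:
    """Take one step toward the desired state. Returns True once converged."""
    if obj.actual_replicas < obj.desired_replicas:
        obj.actual_replicas += 1  # stand-in for "create one pod"
    elif obj.actual_replicas > obj.desired_replicas:
        obj.actual_replicas -= 1  # stand-in for "delete one pod"
    return obj.actual_replicas == obj.desired_replicas


def control_loop(obj: Deployment, max_iterations: int = 100) -> int:
    """Reconcile until converged and return the number of rounds used.

    A real controller would re-read the actual state from the api-server
    on every round instead of trusting its in-memory copy.
    """
    for round_number in range(1, max_iterations + 1):
        if reconcile(obj):
            return round_number
    raise RuntimeError("did not converge")
```

The loop never issues a one-shot command; it only compares desired and actual state, which is why it can recover from any intermediate state.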
Extension Points
kubectl plugin
Alpha in Kubernetes v1.10.
- Plugin loader Search order
- plugin.yaml
- binary
API Access Extensions
Webhook Token Authentication
The --authentication-token-webhook-config-file flag points to a configuration file, which uses the kubeconfig format:
# clusters refers to the remote service.
clusters:
  - name: name-of-remote-authn-service
    cluster:
      certificate-authority: /path/to/ca.pem # CA for verifying the remote service.
      server: https://authn.example.com/authenticate # URL of remote service to query. Must use 'https'.
# users refers to the API server's webhook configuration.
users:
  - name: name-of-api-server
    user:
      client-certificate: /path/to/cert.pem # cert for the webhook plugin to use
      client-key: /path/to/key.pem # key matching the cert
# kubeconfig files require a context. Provide one for the API server.
current-context: webhook
contexts:
- context:
    cluster: name-of-remote-authn-service
    user: name-of-api-server
  name: webhook
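When a client authenticates with a bearer token, the api-server POSTs an authentication.k8s.io/v1beta1 TokenReview object to the webhook, and the remote server returns the same kind of object with the status filled in. The first object below is the request, the second is the response: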
{
  "apiVersion": "authentication.k8s.io/v1beta1",
  "kind": "TokenReview",
  "spec": {
    "token": "(BEARERTOKEN)"
  }
}

{
  "apiVersion": "authentication.k8s.io/v1beta1",
  "kind": "TokenReview",
  "status": {
    "authenticated": true,
    "user": {
      "username": "[email protected]",
      "uid": "42",
      "groups": [
        "developers",
        "qa"
      ],
      "extra": {
        "extrafield1": [
          "extravalue1",
          "extravalue2"
        ]
      }
    }
  }
}
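The remote side of this exchange can be sketched as a function that takes the TokenReview request body and builds the response. This is a minimal sketch, not a production authenticator: `KNOWN_TOKENS`, `review_token`, and the sample user are hypothetical, and a real service would sit behind an HTTPS endpoint and query an identity provider.

```python
import json

# Hypothetical token table; a real service would query an identity provider.
KNOWN_TOKENS = {
    "secret-token": {"username": "jane", "uid": "42", "groups": ["developers", "qa"]},
}


def review_token(request_body: str) -> dict:
    """Build a TokenReview response for a TokenReview request (JSON string)."""
    review = json.loads(request_body)
    token = review.get("spec", {}).get("token", "")
    user = KNOWN_TOKENS.get(token)
    status = {"authenticated": user is not None}
    if user is not None:
        # User details are only returned for tokens that were recognized.
        status["user"] = user
    return {
        "apiVersion": "authentication.k8s.io/v1beta1",
        "kind": "TokenReview",
        "status": status,
    }
```

An unknown token yields a response whose status contains only "authenticated": false.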
Authenticating Proxy
Authorization Webhook
Admission control
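Admission controllers are a set of plugins compiled into the api-server binary. An api-server flag selects which of them run, and they check and modify resources (permissions, quotas, and so on). Examples are ResourceQuota and LimitRanger. The official controllers must be compiled into the api-server ahead of time, and reloading them requires restarting the service. Because of these limitations, Kubernetes provides several extension mechanisms.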
Dynamic Admission Controller
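Users supply a webhook to customize admission (beta in 1.9). There are two kinds: MutatingAdmissionWebhook and ValidatingAdmissionWebhook. MutatingAdmissionWebhook performs modifications, such as supplying defaults for resource fields the user did not set. ValidatingAdmissionWebhook performs checks and can reject a request, which lets you add extra admission policies.
For example, you can require every container image to come from one specific registry and reject pods that use images from any other registry.
To use one, register the address of the webhook server through a configuration object like the one below; configuration for admission plugins is passed to the api-server with --admission-control-config-file.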
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: <name of this configuration object>
webhooks:
- name: <webhook name, e.g., pod-policy.example.io>
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - pods
  clientConfig:
    service:
      namespace: <namespace of the front-end service>
      name: <name of the front-end service>
    caBundle: <pem encoded ca cert that signs the server cert used by the webhook>
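The registry policy mentioned earlier could be implemented by a validating webhook along these lines. This is a sketch under assumptions: `ALLOWED_REGISTRY` and `review_pod` are hypothetical, and the request and response shapes follow my understanding of the admission.k8s.io/v1beta1 AdmissionReview object (a `request` carrying `uid` and `object`, a `response` carrying `uid`, `allowed`, and an optional `status.message`), so check them against the API version you run.

```python
ALLOWED_REGISTRY = "registry.example.com/"  # hypothetical registry prefix


def review_pod(admission_review: dict) -> dict:
    """Allow a pod only if every container image comes from ALLOWED_REGISTRY."""
    request = admission_review["request"]
    pod_spec = request["object"]["spec"]
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    rejected = [c["image"] for c in containers
                if not c["image"].startswith(ALLOWED_REGISTRY)]

    # The response echoes the request uid so the api-server can pair them up.
    response = {"uid": request["uid"], "allowed": not rejected}
    if rejected:
        response["status"] = {
            "message": "images not from %s: %s" % (ALLOWED_REGISTRY, ", ".join(rejected))
        }
    return {
        "apiVersion": "admission.k8s.io/v1beta1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Because the function only validates and never changes the pod, it fits the ValidatingAdmissionWebhook role rather than the mutating one.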
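Mutating webhooks have side effects: changing a user's configuration can leave the user confused about what happened, and it may break the logic of automated controllers. Future versions are expected to restrict which fields can be modified.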
ref: github example-webhook-admission-controller
PodPreset
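PodPreset is a built-in admission controller that injects information into a pod at creation time, such as volumes, volume mounts, secrets, and environment variables. A PodPreset uses a label selector to decide whether to inject into a given pod. If the injection fails, the pod is still created and runs normally without it.
Workflow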
- Retrieve all PodPresets available for use.
- Check if the label selectors of any PodPreset matches the labels on the pod being created.
- Attempt to merge the various resources defined by the PodPreset into the Pod being created.
- On error, throw an event documenting the merge error on the pod, and create the pod without any injected resources from the PodPreset.
- Annotate the resulting modified Pod spec to indicate that it has been modified by a PodPreset. The annotation is of the form podpreset.admission.kubernetes.io/podpreset-
To opt a pod out of injection explicitly, set the annotation podpreset.admission.kubernetes.io/exclude: "true" on it.
A PodPreset definition looks like this:
apiVersion: settings.k8s.io/v1alpha1
kind: PodPreset
metadata:
  name: allow-database
spec:
  selector:
    matchLabels:
      role: frontend
  env:
    - name: DB_PORT
      value: "6379"
  volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
    - name: cache-volume
      emptyDir: {}
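The workflow listed above (match labels, merge, fall back on error, annotate) can be sketched roughly as follows. This is an illustration, not the admission controller's actual code: `selector_matches` and `apply_presets` are hypothetical, only `matchLabels` selectors are handled, only environment-variable conflicts are detected, and the annotation value used here (the preset's resourceVersion) is an assumption.

```python
import copy

EXCLUDE_ANNOTATION = "podpreset.admission.kubernetes.io/exclude"
APPLIED_PREFIX = "podpreset.admission.kubernetes.io/podpreset-"


def selector_matches(match_labels: dict, pod_labels: dict) -> bool:
    """True if every label required by the selector is present on the pod."""
    return all(pod_labels.get(k) == v for k, v in match_labels.items())


def apply_presets(pod: dict, presets: list) -> dict:
    """Return a copy of pod with matching presets merged in.

    On a merge conflict the original pod is returned unmodified, mirroring
    "create the pod without any injected resources".
    """
    meta = pod.get("metadata", {})
    if meta.get("annotations", {}).get(EXCLUDE_ANNOTATION) == "true":
        return pod  # the pod opted out explicitly
    labels = meta.get("labels", {})
    matching = [p for p in presets
                if selector_matches(p["spec"]["selector"]["matchLabels"], labels)]
    if not matching:
        return pod

    result = copy.deepcopy(pod)
    for preset in matching:
        spec = preset["spec"]
        for container in result["spec"]["containers"]:
            env = container.setdefault("env", [])
            for var in spec.get("env", []):
                existing = next((e for e in env if e["name"] == var["name"]), None)
                if existing is None:
                    env.append(dict(var))
                elif existing != var:
                    return pod  # conflicting value: give up on injection
            container.setdefault("volumeMounts", []).extend(
                dict(m) for m in spec.get("volumeMounts", []))
        result["spec"].setdefault("volumes", []).extend(
            dict(v) for v in spec.get("volumes", []))
        annotations = result.setdefault("metadata", {}).setdefault("annotations", {})
        # Assumed annotation value: the preset's resourceVersion, if any.
        annotations[APPLIED_PREFIX + preset["metadata"]["name"]] = \
            preset["metadata"].get("resourceVersion", "")
    return result
```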
Initializers
Alpha in 1.7.
Initializers are more powerful than both mechanisms above and can modify resources of a given type. An InitializerConfiguration determines which initializers handle a given resource type.
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: InitializerConfiguration
metadata:
  name: example-config
initializers:
  # the name needs to be fully qualified, i.e., containing at least two "."
  - name: podimage.example.com
    rules:
      # apiGroups, apiVersion, resources all support wildcard "*".
      # "*" cannot be mixed with non-wildcard.
      - apiGroups:
          - ""
        apiVersions:
          - v1
        resources:
          - pods
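A resource type can have several initializers, which run serially. Each initializer needs a corresponding controller that performs its modification or validation and then removes the initializer's name from the metadata.initializers.pending list. A pod can be scheduled onto a node only after every initializer has been removed, and until initialization finishes the resource object is hidden from clients by default.
Conversely, if an initializer controller goes offline, no resource of that type can be created successfully.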
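The per-object work of an initializer controller can be sketched as below. This is a simplified illustration: `initialize` and the `mutate` callback are hypothetical, the object is a plain dict, and a real controller would watch uninitialized objects and write the result back to the api-server.

```python
INITIALIZER_NAME = "podimage.example.com"  # name taken from the configuration above


def initialize(obj: dict, mutate) -> bool:
    """Run our initializer if it is next in line, then remove it from pending.

    Returns True if the object was processed, False if it is not our turn
    or the object is already initialized.
    """
    initializers = obj.get("metadata", {}).get("initializers")
    if not initializers or not initializers.get("pending"):
        return False  # nothing pending: already initialized
    pending = initializers["pending"]
    if pending[0]["name"] != INITIALIZER_NAME:
        return False  # initializers run serially, so wait for earlier ones
    mutate(obj)       # the actual modification or validation
    pending.pop(0)    # signal that this initializer is done
    if not pending:
        # No initializers left: drop the field so the object becomes visible.
        del obj["metadata"]["initializers"]
    return True
```

The early return for "not our turn" is what makes the chain serial, and it also shows why a missing controller blocks creation: nobody ever removes its entry from the pending list.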
ref: How Kubernetes Initializers work
kubernetes-initializers-deep-div
github initializers example
ImagePolicyWebhook
Uses a webhook to decide whether a given image is allowed to be pulled.
User-Defined Types
- What a resource type is: group, version, kind, controller & automation
- authn, authz
- storage
Custom Resource Definitions (CRD)
- Do not require programming
- Easy to deploy: kubectl create -f crd.yaml
- No new point-of-failure
For example, the etcd operator: you define an etcdop resource and deploy a cluster in one step, with no complicated configuration.
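To make the "no programming required" point concrete, here is a sketch that generates a CRD manifest as JSON, which kubectl create -f also accepts. The group `example.com`, the kind `EtcdOp`, and the helper `crd_manifest` are illustrative assumptions and not the etcd operator's real definition; the field layout follows apiextensions.k8s.io/v1beta1 as I understand it.

```python
import json


def crd_manifest(group: str, version: str, kind: str, plural: str,
                 scope: str = "Namespaced") -> dict:
    """Build a minimal CustomResourceDefinition manifest."""
    return {
        "apiVersion": "apiextensions.k8s.io/v1beta1",
        "kind": "CustomResourceDefinition",
        # The CRD name must have the form <plural>.<group>.
        "metadata": {"name": "%s.%s" % (plural, group)},
        "spec": {
            "group": group,
            "version": version,
            "scope": scope,
            "names": {"plural": plural, "singular": kind.lower(), "kind": kind},
        },
    }


if __name__ == "__main__":
    # Hypothetical resource; write the output to a file for kubectl create -f.
    print(json.dumps(crd_manifest("example.com", "v1", "EtcdOp", "etcdops"), indent=2))
```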
sample controller
- CustomResourceDefinition yaml
- code-gen
- informer
- controller
- integration with others: events, garbage collector (ownerReferences)
ref: operators
API Aggregation
trade-off
- Require coding, built atop k8s.io/apiserver library
- Highly customizable, like adding a new verb, create/delete hooks
- Typed fields, validation, defaults
- Multi-versioning, supporting old clients
- Generated OpenAPI schema
- Supports protobuf
- Supports strategic merge patch
intention
- Provide an API for registering API servers.
- Summarize discovery information from all the servers.
- Proxy client requests to individual servers.
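The proxying goal can be pictured as a routing table keyed by API group and version. This is a toy sketch only: `route`, the `registered` mapping, and the backend names are hypothetical, and the real aggregator also handles discovery, authentication, and TLS.

```python
def route(path: str, registered: dict, core: str = "core-apiserver") -> str:
    """Pick the backend that should serve a request path.

    registered maps (group, version) to the name of an extension apiserver;
    anything not registered falls through to the core apiserver.
    """
    parts = path.strip("/").split("/")
    if len(parts) >= 3 and parts[0] == "apis":
        return registered.get((parts[1], parts[2]), core)
    return core
```

For instance, with `{("metrics.k8s.io", "v1beta1"): "kube-system/metrics-server"}` registered, a request for /apis/metrics.k8s.io/v1beta1/nodes goes to the extension server, while /api/v1/pods stays with the core apiserver.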
install
- setup extension apiserver
  - run as a Deployment
  - register with the core apiserver using an apiregistration.k8s.io/v1beta1/APIService
- setup etcd storage for the extension apiserver
  - run as StatefulSet or etcd operator
- setup the extension controller-manager
  - run as a Deployment (maybe the same Pod as the extension apiserver)
  - configure to talk to the core apiserver (extension APIs are used through core
This is fairly complex to use. The official apiserver library provides the basic building blocks and API interfaces, and kubernetes-incubator/apiserver-builder provides a framework.
sample api-server
kubernetes-incubator
- service catalog (Open service broker API, SaaS)
- metrics-server (top, Horizontal Pod Autoscaler )
- custom-metrics-apiserver
CRD summary
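CRD and APIService are themselves just two resource types, under the groups apiextensions.k8s.io/v1beta1 and apiregistration.k8s.io/v1 respectively.
These two resources are special only in that they can define other resources. Rather than being about extension as such, this part is really about two resources.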
scheduler
The Kubernetes scheduler component first applies predicates to filter out nodes that do not fit, then applies priorities to choose the most suitable node among those that remain.
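The filter-then-score flow can be sketched in a few lines. This is a toy model, not kube-scheduler's implementation: `schedule`, `fits_cpu`, `most_free_cpu`, and the dict-based pod and node shapes are all hypothetical.

```python
def schedule(pod: dict, nodes: list, predicates: list, priorities: list):
    """Return the name of the best node for pod, or None if no node fits."""
    # Phase 1: predicates filter out nodes that cannot run the pod.
    feasible = [n for n in nodes if all(p(pod, n) for p in predicates)]
    if not feasible:
        return None
    # Phase 2: priorities score the remaining nodes; the highest total wins.
    best = max(feasible, key=lambda n: sum(f(pod, n) for f in priorities))
    return best["name"]


def fits_cpu(pod: dict, node: dict) -> bool:
    """Predicate: the node has enough free CPU for the pod."""
    return node["free_cpu"] >= pod["cpu"]


def most_free_cpu(pod: dict, node: dict) -> int:
    """Priority: prefer the node with the most CPU left after placement."""
    return node["free_cpu"] - pod["cpu"]
```

Predicates are hard constraints and priorities are soft preferences, so a node that fails any predicate is never scored at all.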
Three ways to customize it:
- Configure kube-scheduler through its --policy-config-file flag. The policy file can select the basic predicates and priorities; anything more advanced requires editing the kube-scheduler source code.
https://github.com/kubernetes/community/blob/master/contributors/devel/scheduler.md
- Run a new scheduler side by side with kube-scheduler. A pod selects its scheduler through spec.schedulerName.
https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
- Use a scheduler extender, an additional webhook that kube-scheduler calls. It is configured by adding an extender field to the policy file passed through the --policy-config-file flag described above.
https://github.com/fabric8io/kansible/blob/master/vendor/k8s.io/kubernetes/docs/design/scheduler_extender.md