Original link: http://blog.geekidentity.com/k8s/kops/cluster_spec_cn/
Description of keys in config and cluster.spec in kops
The description here is not complete, but aims to document those keys that are less straightforward to interpret. Our godoc reference provides a more detailed list of API values. There are two top-level API values used to describe a cluster in YAML: ClusterSpec, defined with kind: Cluster, and InstanceGroup, defined with kind: InstanceGroup.
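As a sketch (the cluster name and field values here are illustrative, not prescribed), a manifest containing both kinds looks like:

```yaml
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  name: example.cluster.k8s.local   # illustrative cluster name
spec:
  # cluster-wide keys described below go here
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes                       # illustrative instance group name
spec:
  role: Node
```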
spec
api
This object configures how we expose the API:

- dns: allows direct access to the master instances; DNS is configured to point directly to the master nodes.
- loadBalancer: a load balancer (ELB) is configured in front of the master nodes, and DNS is configured to point to the ELB.
DNS example:
spec:
api:
dns: {}
When configuring a LoadBalancer, you can choose between a public ELB or an internal (VPC-only) ELB. The type field should be Public or Internal.
Additionally, pre-created security groups can be added to the load balancer by setting additionalSecurityGroups.
spec:
api:
loadBalancer:
type: Public
additionalSecurityGroups:
- sg-xxxxxxxx
- sg-xxxxxxxx
The idle timeout of the load balancer can also be increased by setting idleTimeoutSeconds. The default idle timeout is 5 minutes; AWS allows a maximum of 3600 seconds (60 minutes). For more information, see Configuring Idle Timeout.
spec:
api:
loadBalancer:
type: Public
idleTimeoutSeconds: 300
etcdClusters v3 & tls
While kops does not yet default to etcd3, v3 and TLS authentication can be turned on for communication between cluster members. These options can be enabled via the cluster configuration file (manifest only; there are no command-line options). Warning: there is currently no upgrade path to migrate from v2 to v3, so do not attempt to enable this feature on a running v2 cluster; it must be done when the cluster is created. The sample snippet below assumes an HA cluster consisting of three master servers.
etcdClusters:
- etcdMembers:
- instanceGroup: master0-az0
name: a-1
- instanceGroup: master1-az0
name: a-2
- instanceGroup: master0-az1
name: b-1
enableEtcdTLS: true
name: main
version: 3.0.17
- etcdMembers:
- instanceGroup: master0-az0
name: a-1
- instanceGroup: master1-az0
name: a-2
- instanceGroup: master0-az1
name: b-1
enableEtcdTLS: true
name: events
version: 3.0.17
kubernetesApiAccess
This array configures the CIDRs that are able to access the kubernetes API. On AWS, this manifests as inbound security group rules on the ELB or master security groups.
For example, use this key to restrict cluster access to office IP address ranges.
spec:
kubernetesApiAccess:
- 12.34.56.78/32
cluster.spec Subnet Keys
id
The ID of the subnet to share in an existing VPC.
egress
The resource identifier (ID) of something in your existing VPC that you want to use as the egress for your network.
This feature was originally envisioned to allow re-use of NAT gateways. The usage is as follows: while the NAT gateway is a "public" resource, in the cluster spec it must be specified in the private subnet section. One way to think about this is that you are specifying "egress", i.e. the default route out from this private subnet.
spec:
subnets:
- cidr: 10.20.64.0/21
name: us-east-1a
egress: nat-987654321
type: Private
zone: us-east-1a
- cidr: 10.20.32.0/21
name: utility-us-east-1a
id: subnet-12345
type: Utility
zone: us-east-1a
publicIP
The IP of an existing EIP that you would like to attach to the NAT gateway.
spec:
subnets:
- cidr: 10.20.64.0/21
name: us-east-1a
publicIP: 203.93.148.142
type: Private
zone: us-east-1a
kubeAPIServer
This section contains configuration for kube-apiserver.
oidc flag for Open ID Connect token
Read more about this here: https://kubernetes.io/docs/admin/authentication/#openid-connect-tokens
spec:
kubeAPIServer:
oidcIssuerURL: https://your-oidc-provider.svc.cluster.local
oidcClientID: kubernetes
oidcUsernameClaim: sub
oidcUsernamePrefix: "oidc:"
oidcGroupsClaim: user_roles
oidcGroupsPrefix: "oidc:"
oidcCAFile: /etc/kubernetes/ssl/kc-ca.pem
audit logging
Read more about this here: https://kubernetes.io/docs/admin/audit
spec:
kubeAPIServer:
auditLogPath: /var/log/kube-apiserver-audit.log
auditLogMaxAge: 10
auditLogMaxBackups: 1
auditLogMaxSize: 100
auditPolicyFile: /srv/kubernetes/audit.yaml
Note: auditPolicyFile is required. If this flag is omitted, no events are logged.
An advanced audit policy file can be pushed to the master nodes using the fileAssets feature.
A sample policy file can be found here
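As a sketch combining the two features, a minimal audit policy that logs all requests at the Metadata level could be pushed via fileAssets (the asset name is illustrative; the path must match auditPolicyFile above):

```yaml
spec:
  fileAssets:
    - name: audit-policy                 # illustrative asset name
      path: /srv/kubernetes/audit.yaml   # must match auditPolicyFile
      roles: [Master]
      content: |
        apiVersion: audit.k8s.io/v1beta1
        kind: Policy
        rules:
          - level: Metadata              # log request metadata for all requests
```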
Max Requests Inflight
The maximum number of non-mutating requests in flight at a given time. When the server exceeds this, it rejects requests. Zero is unlimited. (default 400)
spec:
kubeAPIServer:
maxRequestsInflight: 1000
runtimeConfig
Keys and values here are translated into the kube-apiserver --runtime-config flag value, with entries separated by commas.
Use it to enable alpha features like:
spec:
kubeAPIServer:
runtimeConfig:
batch/v2alpha1: "true"
apps/v1alpha1: "true"
will generate the flag --runtime-config=batch/v2alpha1=true,apps/v1alpha1=true. Note that kube-apiserver accepts true as a value for switch-like flags.
serviceNodePortRange
This value is passed to kube-apiserver as the --service-node-port-range parameter.
spec:
kubeAPIServer:
serviceNodePortRange: 30000-33000
externalDns
This section of configuration options is provided for external DNS. The current external DNS provider is kops dns-controller, which can set up DNS records for Kubernetes resources. dns-controller is planned to be phased out and replaced with external-dns.
spec:
externalDns:
watchIngress: true
The default kops behavior is false. watchIngress: true uses the default dns-controller behavior, which watches the ingress controller for changes. Setting this option risks interrupting service updates in some cases.
kubelet
This section contains the configuration of the kubelet. See https://kubernetes.io/docs/admin/kubelet/
Note: If the corresponding config value is nullable, you can set the field to be empty in the spec, which will pass an empty string to the kubelet as a config value.
spec:
kubelet:
resolvConf: ""
This will pass the flag --resolv-conf= (with an empty value) to the kubelet.
Enable custom metrics support
To use custom metrics in Kubernetes, as per the custom metrics doc, we have to set the flag --enable-custom-metrics to true on all kubelets. This can be specified in the kubelet spec in your cluster.yml.
spec:
kubelet:
enableCustomMetrics: true
kubeScheduler
This section contains configuration for kube-scheduler. See https://kubernetes.io/docs/admin/kube-scheduler/
spec:
kubeScheduler:
usePolicyConfigMap: true
Will make kube-scheduler use the scheduler policy from the configmap "scheduler-policy" in the namespace kube-system.
Note that as of Kubernetes 1.8.0, kube-scheduler does not automatically reload its configuration from the configmap after a change. You need to log into the master instances and restart the Docker container manually.
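As a hedged sketch of such a configmap (the policy.cfg data key and the specific predicates/priorities chosen here are illustrative; tailor the policy to your cluster):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: scheduler-policy      # name expected by usePolicyConfigMap
  namespace: kube-system
data:
  policy.cfg: |
    {
      "kind": "Policy",
      "apiVersion": "v1",
      "predicates": [
        {"name": "PodFitsResources"}
      ],
      "priorities": [
        {"name": "LeastRequestedPriority", "weight": 1}
      ]
    }
```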
kubeControllerManager
This section contains the configuration of the controller-manager.
spec:
kubeControllerManager:
horizontalPodAutoscalerSyncPeriod: 15s
horizontalPodAutoscalerDownscaleDelay: 5m0s
horizontalPodAutoscalerUpscaleDelay: 3m0s
For more details on the horizontalPodAutoscaler flag, see the official HPA documentation and the Kops guide on how to set it .
Feature Gates
spec:
kubelet:
featureGates:
Accelerators: "true"
AllowExtTrafficLocalEndpoints: "false"
will generate flag --feature-gates=Accelerators=true,AllowExtTrafficLocalEndpoints=false
Note: Feature gate ExperimentalCriticalPodAnnotation
is enabled by default because some critical components such as kube-proxy
depend on it.
Computing resource reservation
spec:
kubelet:
kubeReserved:
cpu: "100m"
memory: "100Mi"
storage: "1Gi"
kubeReservedCgroup: "/kube-reserved"
systemReserved:
cpu: "100m"
memory: "100Mi"
storage: "1Gi"
systemReservedCgroup: "/system-reserved"
enforceNodeAllocatable: "pods,system-reserved,kube-reserved"
will generate the flags: --kube-reserved=cpu=100m,memory=100Mi,storage=1Gi --kube-reserved-cgroup=/kube-reserved --system-reserved=cpu=100m,memory=100Mi,storage=1Gi --system-reserved-cgroup=/system-reserved --enforce-node-allocatable=pods,system-reserved,kube-reserved
Learn more about reserving compute resources .
networkID
On AWS, this is the ID of the VPC where the cluster was created. If you need to create a cluster from scratch, you can leave this field unspecified; kops will create a VPC for you.
spec:
networkID: vpc-abcdefg1
See here for more information on running in an existing VPC .
hooks
Hooks allow you to perform actions before Kubernetes is installed on each node in the cluster; for example, Nvidia drivers can be installed to make use of GPUs. A hook can take the form of a Docker container (execContainer) or a raw systemd unit manifest. Hooks can be placed in the cluster spec, meaning they are deployed globally, or in an instanceGroup spec. Note: when a hook in an instanceGroup has the same name as one in the cluster spec, the instanceGroup takes precedence and the definition in the cluster spec is ignored; i.e., if the cluster spec has a unit file "myunit.service" and the instanceGroup also has a unit file "myunit.service", the file from the instanceGroup is applied.
spec:
# many sections removed
hooks:
- before:
- some_service.service
requires:
- docker.service
execContainer:
image: kopeio/nvidia-bootstrap:1.6
# these are added as -e to the docker environment
environment:
AWS_REGION: eu-west-1
SOME_VAR: SOME_VALUE
# or a raw systemd unit
hooks:
- name: iptable-restore.service
roles:
- Node
- Master
before:
- kubelet.service
manifest: |
[Service]
EnvironmentFile=/etc/environment
# do some stuff
# or disable a systemd unit
hooks:
- name: update-engine.service
disabled: true
# or you could wrap this into a full unit
hooks:
- name: disable-update-engine.service
before:
- update-engine.service
manifest: |
Type=oneshot
ExecStart=/usr/bin/systemctl stop update-engine.service
Install Ceph
spec:
# many sections removed
hooks:
- execContainer:
command:
- sh
- -c
- chroot /rootfs apt-get update && chroot /rootfs apt-get install -y ceph-common
image: busybox
fileAssets
fileAssets is an alpha feature that allows you to place inline file content into the cluster and instanceGroup specifications. It is designated alpha because you may be able to achieve the same result with Kubernetes daemonsets instead.
spec:
fileAssets:
- name: iptable-restore
# Note: if no path is specified, the default is /srv/kubernetes/assets/<name>
path: /var/lib/iptables/rules-save
roles: [Master,Node,Bastion] # a list of roles to apply the asset to, zero defaults to all
content: |
some file content
cloudConfig
disableSecurityGroupIngress
If you are using aws as the cloudProvider, you can disable the authorization of the ELB security group to the Kubernetes Nodes security group. In other words, it doesn't add security group rules. This can be useful to avoid the AWS limit: 50 rules per security group.
spec:
cloudConfig:
disableSecurityGroupIngress: true
elbSecurityGroup
Warning: This only works with Kubernetes versions above 1.7.0.
To avoid creating a security group per ELB, you can specify a security group ID to be assigned to the LoadBalancer. It must be a security group ID, not a name. api.loadBalancer.additionalSecurityGroups must be empty, because Kubernetes will add rules per port specified in the service files. This avoids the AWS limits: 500 security groups per region and 50 rules per security group.
spec:
cloudConfig:
elbSecurityGroup: sg-123445678
docker
Docker daemon options that can be overridden for all masters and nodes in the cluster. Check out the API documentation for a complete list of options.
registryMirrors
If you have a number of Docker instances (physical or VM) running, each time one of them pulls an image that is not present on the host, it will be fetched from DockerHub. By caching these images, you can keep the traffic within your local network and avoid egress bandwidth usage. This caching benefits not only cluster provisioning but everyday image pulls as well.
See: Cache-Mirror Dockerhub For Speed, and Configure the Docker daemon.
spec:
docker:
registryMirrors:
- https://registry.example.com
storage
The Docker storage driver can be specified to override the default. Make sure the driver you choose is supported by your OS and docker version.
docker:
storage: devicemapper
storageOpts:
- "dm.thinpooldev=/dev/mapper/thin-pool"
- "dm.use_deferred_deletion=true"
- "dm.use_deferred_removal=true"
sshKeyName
In some cases it may be necessary to use an existing AWS SSH key instead of allowing kops to create a new one. Provide the name of the key as it exists in AWS; this is an alternative to --ssh-public-key.
spec:
sshKeyName: myexistingkey
target
In some use cases you may wish to augment the target output with extra options. target supports a minimal set of options for this. Currently only the terraform target supports it, but kops may eventually support more if other use cases emerge.
spec:
target:
terraform:
providerExtraConfig:
alias: foo