Overview
ReplicationController (RC), Deployment, and DaemonSet are designed for stateless services: the IPs, names, and start/stop order of the Pods they manage are arbitrary. What, then, is a StatefulSet? As the name suggests, it is a set with state, used to manage stateful services such as MySQL and MongoDB clusters.
StatefulSet is essentially a variant of Deployment and became GA in v1.9. It exists to solve the problems of stateful services: the Pods it manages have fixed names and a fixed start/stop order. In a StatefulSet the Pod name serves as the network identity (hostname), and the Pods normally rely on persistent (typically network-backed) storage as well.
A Deployment is paired with an ordinary Service, whereas a StatefulSet is paired with a headless Service. A headless Service differs from an ordinary Service in that it has no Cluster IP; resolving its name returns the endpoint list of all Pods behind the headless Service.
In addition, on top of the headless Service, StatefulSet creates a DNS domain name for each Pod replica it controls. The format of this domain name is:
$(podname).$(headless service name)
FQDN: $(podname).$(headless service name).$(namespace).svc.cluster.local
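The format above can be sketched as a small Python helper. This is only an illustration of the naming rule, not part of any Kubernetes client library; the function name pod_fqdn and its default arguments are assumptions made for this example, and the values web-0 and nginx are taken from the example used in this article.

```python
def pod_fqdn(pod_name, service_name, namespace="default", cluster_domain="cluster.local"):
    """Build the per-Pod DNS name for a Pod behind a headless Service.

    Hypothetical helper: it only assembles the string
    <pod>.<headless service>.<namespace>.svc.<cluster domain>.
    """
    return f"{pod_name}.{service_name}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # Print the DNS names of the three replicas of a StatefulSet named "web".
    for ordinal in range(3):
        print(pod_fqdn(f"web-{ordinal}", "nginx"))
```

Passing a different namespace or cluster domain (for example "foo" and "kube.local") changes only the corresponding labels of the name; the Pod and Service parts stay the same.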
StatefulSet example
The following example demonstrates the characteristics described above.
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx" # declares which headless Service this StatefulSet belongs to
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates: # can be regarded as a template for PVCs
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "gluster-heketi" # storage class name; change it to one that exists in your cluster
      resources:
        requests:
          storage: 1Gi
This configuration file shows the three components of a StatefulSet:
- Headless Service: named nginx, it defines the network identity (DNS domain) of the Pods.
- StatefulSet: named web, it defines the application itself (nginx), runs three Pod replicas, and gives each Pod its own domain name.
- volumeClaimTemplates: a template for storage volume claims. Given a name and a size, it creates the PVCs automatically; the PVCs must be provisioned by a storage class.
Why a headless Service?
With a Deployment, Pod names carry no order: each name ends in a random string. A StatefulSet, by contrast, requires ordered Pods. No Pod can be arbitrarily substituted for another, and a Pod keeps the same name after it is rebuilt. Because the Pod IP changes, the Pod is identified by its name. The Pod name is the Pod's unique identifier and must remain stable and valid over time. This is where the headless Service comes in: it gives each Pod a unique, resolvable name.
Why volumeClaimTemplates?
Stateful replica sets generally need persistent storage. The defining feature of a distributed system is that each node holds different data, so the nodes cannot share one storage volume; each needs its own dedicated storage. If the volume were defined in the Pod template, as in a Deployment, all replicas would share the same volume and the same data, because every Pod is created from the same template. Each Pod in a StatefulSet needs its own dedicated volume, so the volume cannot be created from the Pod template. StatefulSet therefore uses volumeClaimTemplates, a volume claim template, which generates a distinct PVC for each Pod and binds it to a PV, so that every Pod has dedicated storage.
Create the resources:
$ kubectl create -f nginx.yaml
service "nginx" created
statefulset "web" created
Watch the three Pods being created:
# web-0 is created first
$ kubectl get pod
web-0 1/1 ContainerCreating 0 51s
# Once web-0 is Running and Ready, web-1 is created
$ kubectl get pod
web-0 1/1 Running 0 51s
web-1 0/1 ContainerCreating 0 42s
# Once web-1 is Running and Ready, web-2 is created
$ kubectl get pod
web-0 1/1 Running 0 1m
web-1 1/1 Running 0 45s
web-2 1/1 ContainerCreating 0 36s
# Finally, all three Pods are Running and Ready
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 4m
web-1 1/1 Running 0 3m
web-2 1/1 Running 0 1m
The PVCs are created automatically from volumeClaimTemplates:
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
www-web-0 Bound pvc-ecf003f3-828d-11e8-8815-000c29774d39 2G RWO gluster-heketi 7m
www-web-1 Bound pvc-0615e33e-828e-11e8-8815-000c29774d39 2G RWO gluster-heketi 6m
www-web-2 Bound pvc-43a97acf-828e-11e8-8815-000c29774d39 2G RWO gluster-heketi 4m
If the cluster has no StorageClass mechanism for dynamic PVC provisioning, you can create the PVs and PVCs manually in advance. The names of manually created PVCs must follow the StatefulSet naming rule: $(volumeClaimTemplates.name)-$(pod_name)
For a StatefulSet named web with three Pod replicas (web-0, web-1, web-2) and a volumeClaimTemplates name of www, the automatically created PVCs are named www-web-[0-2], one PVC per Pod.
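When PVCs have to be created by hand, it can help to generate the expected names up front. The following is a minimal sketch of the naming rule only; the function name pvc_names is hypothetical and the code does not talk to a cluster.

```python
def pvc_names(template_name, statefulset_name, replicas):
    """Return the PVC names a StatefulSet expects, one per Pod.

    Hypothetical helper illustrating the rule
    <volumeClaimTemplates.name>-<pod name>, where the Pod name is
    <statefulset name>-<ordinal>.
    """
    return [f"{template_name}-{statefulset_name}-{ordinal}" for ordinal in range(replicas)]


if __name__ == "__main__":
    # Values taken from the example: template "www", StatefulSet "web", 3 replicas.
    for name in pvc_names("www", "web", 3):
        print(name)
```

For the example in this article the helper yields www-web-0, www-web-1, and www-web-2, matching the kubectl get pvc output shown earlier.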
Summary of rules:
The Pod name (network identity) follows the pattern $(statefulset name)-$(ordinal), as in the example above: web-0, web-1, web-2.
StatefulSet creates a DNS domain name for each Pod replica, in the format $(podname).$(headless service name). This means services communicate through Pod domain names rather than Pod IPs: when the Node hosting a Pod fails, the Pod is rescheduled to another Node and its IP changes, but its domain name does not.
StatefulSet uses the headless Service to control the Pods' domain. The FQDN of this domain is $(service name).$(namespace).svc.cluster.local, where "cluster.local" is the cluster domain.
Based on volumeClaimTemplates, one PVC is created for each Pod. The PVC name follows the pattern $(volumeClaimTemplates.name)-$(pod_name). In the example above, volumeClaimTemplates.name = www and the Pod names are web-[0-2], so the PVCs created are www-web-0, www-web-1, and www-web-2.
Deleting a Pod does not delete its PVC; manually deleting the PVC automatically releases the PV.
Examples of how the cluster domain, the headless Service name, and the StatefulSet name affect the DNS names of the StatefulSet's Pods:
Cluster Domain | Service (ns/name) | StatefulSet (ns/name) | StatefulSet Domain | Pod DNS | Pod Hostname |
---|---|---|---|---|---|
cluster.local | default/nginx | default/web | nginx.default.svc.cluster.local | web-{0..N-1}.nginx.default.svc.cluster.local | web-{0..N-1} |
cluster.local | foo/nginx | foo/web | nginx.foo.svc.cluster.local | web-{0..N-1}.nginx.foo.svc.cluster.local | web-{0..N-1} |
kube.local | foo/nginx | foo/web | nginx.foo.svc.kube.local | web-{0..N-1}.nginx.foo.svc.kube.local | web-{0..N-1} |
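Because the hostname always has the form $(statefulset name)-$(ordinal), a process running inside a Pod can derive its own ordinal from the hostname, for example to decide which replica acts as the primary. The sketch below shows one way to do that; parse_pod_name is a hypothetical helper, and treating ordinal 0 as the primary is an assumption for illustration, not something StatefulSet prescribes.

```python
import re


def parse_pod_name(hostname):
    """Split a StatefulSet Pod hostname into (statefulset name, ordinal).

    Hypothetical helper: it expects the form <name>-<ordinal> and raises
    ValueError for anything else.
    """
    match = re.fullmatch(r"(.+)-(\d+)", hostname)
    if match is None:
        raise ValueError(f"not a StatefulSet Pod hostname: {hostname!r}")
    return match.group(1), int(match.group(2))


if __name__ == "__main__":
    name, ordinal = parse_pod_name("web-2")
    # Assumption for illustration: replica 0 is treated as the primary.
    role = "primary" if ordinal == 0 else "replica"
    print(name, ordinal, role)
```

Inside a real Pod the hostname would typically come from socket.gethostname() in the standard library rather than a literal string.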
StatefulSet start and stop order:
Ordered deployment: when a StatefulSet with multiple Pod replicas is deployed, the Pods are created sequentially (from 0 to N-1), and all preceding Pods must be Running and Ready before the next Pod is started.
Ordered deletion: when Pods are deleted, they are terminated in reverse order, from N-1 to 0.
Ordered scaling: when Pods are scaled, as with deployment, all preceding Pods must be Running and Ready before the operation proceeds.
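The ordering rules above can be sketched as a small planner that lists the steps needed to move from one replica count to another. It models only the order of operations; the function name scale_steps is hypothetical, and the sketch does not model waiting for a Pod to become Running and Ready.

```python
def scale_steps(name, current, desired):
    """List the ordered (action, pod) steps to go from current to desired replicas.

    Hypothetical model of the default ordering: Pods are created from the
    lowest missing ordinal upward and deleted from the highest ordinal downward.
    """
    if desired >= current:
        return [("create", f"{name}-{i}") for i in range(current, desired)]
    return [("delete", f"{name}-{i}") for i in range(current - 1, desired - 1, -1)]


if __name__ == "__main__":
    print(scale_steps("web", 0, 3))  # initial deployment
    print(scale_steps("web", 3, 1))  # scale down
    print(scale_steps("web", 3, 0))  # delete all Pods
```

In the real controller each create step additionally waits until every lower-ordinal Pod is Running and Ready, which this sketch leaves out.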
StatefulSet Pod management policies:
Since v1.7, the .spec.podManagementPolicy field lets you relax the Pod ordering guarantees while still preserving the uniqueness of each Pod's identity.
- OrderedReady: the default; Pods are started and stopped in the order described above.
- Parallel: tells the StatefulSet controller to start or terminate all Pods in parallel, without waiting for one Pod to become Running and Ready, or to terminate completely, before starting or stopping another.
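The difference between the two policies can be illustrated by grouping Pods into startup batches, where every Pod in a batch may start at the same time and a batch begins only after the previous one is ready. This is a simplified model under that assumption; startup_batches is a hypothetical name, not a Kubernetes API.

```python
def startup_batches(name, replicas, policy="OrderedReady"):
    """Group Pod names into batches that may be started concurrently.

    Hypothetical model: OrderedReady yields one Pod per batch, Parallel
    yields a single batch containing every Pod.
    """
    pods = [f"{name}-{i}" for i in range(replicas)]
    if policy == "OrderedReady":
        return [[pod] for pod in pods]
    if policy == "Parallel":
        return [pods] if pods else []
    raise ValueError(f"unknown podManagementPolicy: {policy}")


if __name__ == "__main__":
    print(startup_batches("web", 3))
    print(startup_batches("web", 3, "Parallel"))
```

With either policy the Pod names are the same; only the amount of waiting between Pods differs.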
StatefulSet usage scenarios:
Stable persistent storage: after a Pod is rescheduled it can still access the same persistent data; this is implemented with PVCs.
Stable network identity: after a Pod is rescheduled its PodName and HostName remain unchanged.
Ordered deployment and ordered scale-out: Pods are created in the defined order, from 0 to N-1, and every preceding Pod must be Running and Ready before the next one starts.
Ordered scale-in and ordered deletion: Pods are removed from N-1 down to 0.
Update Policy
In Kubernetes 1.7 and later, the .spec.updateStrategy field lets you configure or disable automatic rolling updates for the containers, labels, resource requests/limits, and annotations of the Pods in a StatefulSet.
OnDelete: when .spec.updateStrategy.type is set to OnDelete, the StatefulSet controller does not automatically update the Pods in the StatefulSet. Users must delete the Pods manually for the controller to create new Pods that reflect the changes.
RollingUpdate: setting .spec.updateStrategy.type to RollingUpdate enables automatic rolling updates of the Pods. This is the default strategy when .spec.updateStrategy is not specified.
The StatefulSet controller deletes and recreates each Pod in the StatefulSet. It proceeds in the same order as Pod termination (from the largest ordinal to the smallest), updating one Pod at a time, and waits for an updated Pod to be Running and Ready before updating the next one.
- Partitions: the RollingUpdate strategy can be partitioned by specifying .spec.updateStrategy.rollingUpdate.partition. If a partition is specified, all Pods with an ordinal greater than or equal to the partition are updated when the StatefulSet's .spec.template is updated.
All Pods with an ordinal less than the partition are not updated, and even if they are deleted they are recreated at the previous version. If .spec.updateStrategy.rollingUpdate.partition is greater than .spec.replicas, updates to .spec.template are not propagated to any Pod. In most cases you do not need to use partitions.
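The partition rule can be checked with a short sketch that lists which Pods a rolling update touches and in what order. It models only the selection and ordering described above; rolling_update_order is a hypothetical name, and readiness checks are not modeled.

```python
def rolling_update_order(name, replicas, partition=0):
    """Return the Pods a RollingUpdate would update, in update order.

    Hypothetical model: Pods with ordinal >= partition are updated from the
    largest ordinal down; Pods below the partition are left untouched.
    """
    if partition < 0:
        raise ValueError("partition must be non-negative")
    return [f"{name}-{i}" for i in range(replicas - 1, partition - 1, -1)]


if __name__ == "__main__":
    print(rolling_update_order("web", 3))               # no partition: every Pod
    print(rolling_update_order("web", 3, partition=2))  # only the highest ordinal
    print(rolling_update_order("web", 3, partition=5))  # partition > replicas: none
```

A common use of a partition is a staged rollout: set it to replicas - 1 so that only the highest-ordinal Pod is updated, then lower it step by step once that Pod looks healthy.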