Affinity and anti-affinity in Kubernetes

Under normal circumstances, an administrator does not need to worry about which Node a Pod is assigned to; the scheduler handles placement automatically. Sometimes, however, we need to impose scheduling constraints: for example, certain applications should run on nodes with SSD storage, certain applications should run on the same node as each other, and so on.

nodeSelector:
First, label the Nodes; then, when creating a Deployment, use nodeSelector to specify which nodes its Pods may run on.
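
For instance, a node can be labeled with kubectl label nodes <node-name> disktype=ssd, and Pods can then be pinned to such nodes. A minimal sketch, assuming an illustrative disktype=ssd label, Pod name, and image (none of which come from the original article):

apiVersion: v1
kind: Pod
metadata:
  name: ssd-pod               # illustrative name
spec:
  containers:
  - name: app
    image: nginx:latest
  nodeSelector:
    disktype: ssd             # schedule only onto nodes carrying this label

If no node carries the label, the Pod stays Pending instead of being scheduled elsewhere.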

Affinity and anti-affinity:

  • Affinity: applications A and B interact frequently, so affinity is used to place the two applications as close together as possible, ideally on the same node, to reduce the performance cost of network communication.
  • Anti-affinity: when an application is deployed with multiple replicas, anti-affinity is used to spread the instances across different nodes to improve high availability (HA).

Three types exist: nodeAffinity (node affinity), podAffinity (Pod affinity) and podAntiAffinity (Pod anti-affinity).

Strategy          Match target   Supported operators                       Topology domain   Purpose
nodeAffinity      Node labels    In, NotIn, Exists, DoesNotExist, Gt, Lt   Not supported     Decides which hosts a Pod can be scheduled onto
podAffinity       Pod labels     In, NotIn, Exists, DoesNotExist           Supported         Decides which Pods can be deployed in the same topology domain
podAntiAffinity   Pod labels     In, NotIn, Exists, DoesNotExist           Supported         Decides which Pods cannot be deployed in the same topology domain

nodeAffinity usage scenarios:

  • Deploy all Pods of service S1 onto designated hosts that match the labeling rules.
  • Deploy all Pods of service S1 onto any host except certain excluded ones.

podAffinity usage scenarios:

  • Deploy the Pods of a particular service into the same topology domain, without naming a specific domain.
  • If service S1 depends on service S2, deploy the Pods of S1 and of S2 in the same topology domain to reduce the network latency between them (or for other reasons).

podAntiAffinity usage scenarios:

  • Spread the Pods of a service across different hosts or topology domains to improve the stability of the service itself.
  • Give a Pod exclusive use of a node, guaranteeing resource isolation so that no other Pod shares the node's resources.
  • Spread the Pods of services that may interfere with each other across different hosts.

Operators:

  • In: the label's value is in a given list
  • NotIn: the label's value is not in a given list
  • Gt: the label's value is greater than a given value
  • Lt: the label's value is less than a given value
  • Exists: a label with the given key exists
  • DoesNotExist: no label with the given key exists
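
As a sketch of how these operators are written, the snippet below combines In and Gt in a hard nodeAffinity rule. The disktype and cpu-count label keys are illustrative assumptions; note that Gt and Lt are supported only for node affinity and compare label values as integers.

apiVersion: v1
kind: Pod
metadata:
  name: operator-demo         # illustrative name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype     # value must be one of the listed values
            operator: In
            values:
            - ssd
            - nvme
          - key: cpu-count    # integer value of the label must be greater than 8
            operator: Gt
            values:
            - "8"
  containers:
  - name: app
    image: nginx:latest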

The nodeSelector mechanism is the simpler of the two. Affinity and anti-affinity configuration provides far more flexible scheduling policies. The main enhancements are:

  • a more expressive matching language, not just exact matches ANDed together
  • rules can be soft preferences rather than hard requirements
  • scheduling can be constrained by the labels of other Pods, not just Node labels

Affinity rules come in two forms:

  • requiredDuringSchedulingIgnoredDuringExecution: hard. The rule must be satisfied, otherwise the Pod is not scheduled. It is enforced in the predicate (pre-selection) stage, so a Pod is never scheduled in violation of a hard rule.
  • preferredDuringSchedulingIgnoredDuringExecution: soft, best effort. The scheduler gives priority to nodes that satisfy the rule but does not require them. It is applied in the priority (scoring) stage.

A third form, requiredDuringSchedulingRequiredDuringExecution, has been proposed but is not implemented. It resembles requiredDuringSchedulingIgnoredDuringExecution, except that if a node stops satisfying the Pod's affinity while the Pod is running, the Pod is evicted from that node.

IgnoredDuringExecution means that if a Node's labels change while the Pod is running, so that the affinity rule is no longer satisfied, the Pod continues to run on that node anyway.
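
The two forms can be combined in one rule. A minimal sketch, assuming illustrative labels: the hard rule restricts scheduling to Linux nodes via the well-known kubernetes.io/os label, while the soft rule, with a weight between 1 and 100, merely prefers nodes labeled disktype=ssd.

apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo                  # illustrative name
spec:
  affinity:
    nodeAffinity:
      # hard: only Linux nodes are eligible at all
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
      # soft: among eligible nodes, prefer those with SSDs
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50                     # 1-100; higher means a stronger preference
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  containers:
  - name: app
    image: nginx:latest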

Restrictions on topologyKey:

  • For affinity and for hard (required) Pod anti-affinity, an empty topologyKey is not allowed.
  • For hard Pod anti-affinity, the LimitPodHardAntiAffinityTopology admission controller can be used to restrict topologyKey to kubernetes.io/hostname.
  • For soft Pod anti-affinity, an empty topologyKey is interpreted as the combination of kubernetes.io/hostname, failure-domain.beta.kubernetes.io/zone and failure-domain.beta.kubernetes.io/region.
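
The choice of topologyKey defines what counts as "the same place": with kubernetes.io/hostname the domain is a single node, while a zone label makes it an availability zone. Below is a hedged sketch of zone-level soft anti-affinity; the Deployment name and app label are illustrative, and it uses the newer topology.kubernetes.io/zone label rather than the beta labels named above.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: zone-spread-demo               # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: zone-spread-demo
  template:
    metadata:
      labels:
        app: zone-spread-demo
    spec:
      affinity:
        podAntiAffinity:
          # soft rule: prefer, but do not require, one replica per zone
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - zone-spread-demo
              topologyKey: topology.kubernetes.io/zone   # the zone is the topology domain
      containers:
      - name: app
        image: nginx:latest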

Example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cache
spec:
  selector:
    matchLabels:
      app: redis
  replicas: 3
  template:
    metadata:
      labels:
        app: redis
    spec:
      affinity:
        podAntiAffinity:
          # hard rule: do not schedule onto a node that already runs a Pod labeled app=redis
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - redis
            topologyKey: "kubernetes.io/hostname"   # one node = one topology domain
      containers:
      - name: redis-server
        image: redis:latest

The example above creates a Deployment with three replicas and applies an inter-Pod anti-affinity rule: a replica is not scheduled onto a node that already runs a Pod carrying the app=redis label, which prevents multiple identical instances from landing on the same node.

Building on the example above:

Create three web service instances configured like the Redis Deployment. The anti-affinity rule first ensures that no two web Pods are deployed onto the same node; the inter-Pod affinity rule then ensures that each web Pod is placed on a node that already runs a Redis Pod.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: web-store
  replicas: 3
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        podAntiAffinity:
          # hard rule: no two web-store Pods on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-store
            topologyKey: "kubernetes.io/hostname"
        podAffinity:
          # hard rule: only nodes that already run a Pod labeled app=redis are eligible
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - redis
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web-app
        image: nginx:latest

⚠️ Notes:

  • Inter-Pod affinity and anti-affinity require a significant amount of computation, which can noticeably slow scheduling in large clusters; they are not recommended in clusters larger than several hundred nodes.
  • Inter-Pod anti-affinity requires consistently labeled nodes: every node in the cluster must carry a label matching the topologyKey. If some nodes lack that label, unintended behavior can result.
  • If nodeSelector and nodeAffinity are both specified, both conditions must be satisfied before the Pod can be scheduled onto a candidate node.
  • If a Pod is already running on a node, deleting or changing the node's labels does not remove the Pod. In other words, affinity selection takes effect only while the Pod is being scheduled.
  • Within a single match expression, the expression is satisfied as soon as any one of the listed values matches.
  • If matchExpressions contains several expressions, then (for hard affinity) all of them must be satisfied before the Pod can be scheduled onto a node; see the sketch after this list.
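
A short sketch of these matching semantics (all label keys and values are illustrative): values inside one expression are ORed, while sibling expressions in the same matchExpressions list are ANDed.

apiVersion: v1
kind: Pod
metadata:
  name: match-demo            # illustrative name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          # expression 1: satisfied if disktype is ssd OR nvme
          - key: disktype
            operator: In
            values:
            - ssd
            - nvme
          # expression 2: a zone label must also exist;
          # both expressions must hold (AND) for the node to qualify
          - key: topology.kubernetes.io/zone
            operator: Exists
  containers:
  - name: app
    image: nginx:latest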

Source: blog.51cto.com/14034751/2597734