Elastic Scaling in Application Modernization

Author: Ma Wei, container consultant at Qingyun Technology and cloud-native enthusiast, currently focused on cloud-native technology; his stack in the cloud-native field covers Kubernetes, KubeSphere, KubeKey, and more.

In 2019, I was helping many enterprises deploy virtualization, introducing virtual networks and virtual storage.

By 2023, those same enterprises had gone cloud-native. For high-traffic web applications, real-time data analysis, large-scale data processing, mobile backends, and similar services, containers suit better than virtual machines: they are lightweight, fast to start, easy to port, and highly elastic.

Why do you need elastic scaling?

  • Peak load handling: during promotions, holiday shopping seasons, or emergencies, quickly expand resources on demand to keep applications available and performant.
  • Better resource utilization: dynamically adjust scale to the actual load, avoiding wasted infrastructure and reducing TCO.
  • Failure response and fault tolerance: multi-instance deployment and rapid replacement improve business continuity and availability.
  • Tracking demand changes: match front-end business demand and pressure, adjust scale quickly, and respond faster to events, meeting needs and expectations.

Horizontal Pod Autoscaling

Kubernetes itself provides elastic scaling mechanisms, including the Vertical Pod Autoscaler (VPA) and the Horizontal Pod Autoscaler (HPA). HPA increases or decreases the number of Pods behind a replica controller (Deployment, ReplicaSet, etc.) according to CPU and memory utilization; it is the built-in feature for scaling workload size horizontally.

HPA relies on the Metrics Server to collect CPU and memory data and provide resource usage measurements; it can also scale on custom metrics (for example, from Prometheus through an adapter).

HPA continuously watches the metrics from the Metrics Server, calculates the required number of replicas, and adjusts the replica count dynamically, scaling the target resource horizontally toward the configured target value.
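
For reference, a minimal HPA that targets 80% average CPU utilization might look like the following sketch (the workload name is hypothetical):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app                # hypothetical Deployment to scale
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80   # keep average CPU utilization around 80%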

But HPA has certain limitations:

  • No external metrics support. Business-side changes and dependencies are diverse: different event sources, different middleware and applications. Scaling only on CPU and memory is not enough.
  • No 1->0 scaling. When the load stays at zero, can't the application simply run no workload instances at all?

Enter Kubernetes-based Event-Driven Autoscaling (KEDA)!

What Is KEDA

KEDA scales automatically based on events. What is event-driven? My understanding: react to the various events occurring in the system and act accordingly (by scaling). KEDA is essentially HPA plus multiple triggers: as long as a trigger receives an event and fires, KEDA can drive HPA to scale automatically, and KEDA can go 1->0 and 0->1!

Architecture

KEDA itself has several components:

  • Agent: activates and deactivates Kubernetes workloads, scaling them to and from zero (the main role of keda-operator).
  • Metrics: KEDA acts as a Kubernetes metrics server, exposing rich event data to the Horizontal Pod Autoscaler and consuming events from the sources (the main role of keda-operator-metrics-apiserver).
  • Admission Webhooks: automatically validate resource changes to prevent misconfiguration.
  • Event sources: the external event/trigger sources that drive pod-count changes, such as Prometheus or Kafka.
  • Scalers: monitor the event sources, fetch metrics, and trigger scaling based on events.
  • Metrics adapter: fetches metrics from the Scalers and passes them to the HPA.
  • Controller: acts on the metrics provided by the adapter and reconciles toward the resource state specified in the ScaledObject.

The flow: a Scaler continuously monitors events from the source configured in the ScaledObject and passes the metrics to the Metrics Adapter whenever a trigger fires; the Metrics Adapter adjusts the metrics and serves them to the Controller, which scales the target Deployment out or in according to the rules in the ScaledObject.

In short, with KEDA you define a ScaledObject with an event trigger, which can be a message in a message queue, a topic subscription, a storage queue, an event from an event gateway, or a custom trigger. Based on these events, the number of replicas of the application (or the resources of the handler) is adjusted automatically, achieving elastic scaling that follows the actual load.

CRDs

  • ScaledObjects: the mapping between an event source (such as RabbitMQ) and a Kubernetes Deployment, StatefulSet, or any custom resource that defines the /scale subresource.
  • ScaledJobs: the mapping between an event source and a Kubernetes Job; the job count is adjusted by event triggers.
  • TriggerAuthentications: authentication parameters for triggers.
  • ClusterTriggerAuthentications: the cluster-scoped equivalent.
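
KEDA reads trigger credentials through these authentication CRDs. As a rough sketch, a TriggerAuthentication that pulls a RabbitMQ connection string from a Secret might look like this (all names are hypothetical):

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-trigger-auth     # hypothetical name
  namespace: default
spec:
  secretTargetRef:                # map Secret keys to trigger parameters
  - parameter: host               # the parameter the scaler expects
    name: rabbitmq-secret         # hypothetical Secret holding the connection string
    key: host                     # key inside the Secret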

Deploy KEDA

# Install with Helm:
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda

# Alternatively, install from the release manifest:
kubectl apply -f https://github.com/kedacore/keda/releases/download/v2.10.1/keda-2.10.1.yaml

root@node-1:/# kubectl get all -n keda
NAME                                          READY   STATUS    RESTARTS   AGE
pod/keda-metrics-apiserver-7d89dbcb54-v22nl   1/1     Running   0          44s
pod/keda-operator-5bb9b49d7c-kh6wt            0/1     Running   0          44s

NAME                             TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
service/keda-metrics-apiserver   ClusterIP   10.233.44.19   <none>        443/TCP,80/TCP   45s

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/keda-metrics-apiserver   1/1     1            1           45s
deployment.apps/keda-operator            0/1     1            0           45s

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/keda-metrics-apiserver-7d89dbcb54   1         1         1       45s
replicaset.apps/keda-operator-5bb9b49d7c            1         1         0       45s
root@node-1:/# kubectl get all -n keda
NAME                                          READY   STATUS    RESTARTS   AGE
pod/keda-metrics-apiserver-7d89dbcb54-v22nl   1/1     Running   0          4m8s
pod/keda-operator-5bb9b49d7c-kh6wt            1/1     Running   0          4m8s

NAME                             TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
service/keda-metrics-apiserver   ClusterIP   10.233.44.19   <none>        443/TCP,80/TCP   4m9s

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/keda-metrics-apiserver   1/1     1            1           4m9s
deployment.apps/keda-operator            1/1     1            1           4m9s

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/keda-metrics-apiserver-7d89dbcb54   1         1         1       4m9s
replicaset.apps/keda-operator-5bb9b49d7c            1         1         1       4m9s
# kubectl get crd | grep keda
clustertriggerauthentications.keda.sh                     2023-05-11T09:26:06Z
scaledjobs.keda.sh                                        2023-05-11T09:26:07Z
scaledobjects.keda.sh                                     2023-05-11T09:26:07Z
triggerauthentications.keda.sh                            2023-05-11T09:26:07Z

Deploying KEDA with KubeSphere

# KubeSphere 3.4+: enable the autoscaling component in the ClusterConfiguration
kubectl edit cc ks-installer -n kubesphere-system
spec:
···
  autoscaling:
    enabled: true
···

Extended Workload CRD

For the ScaledObject resource definition and detailed parameters, refer to https://keda.sh/docs/2.10/concepts/scaling-deployments/.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {scaled-object-name}
spec:
  scaleTargetRef:
    apiVersion:    {api-version-of-target-resource}  # Optional. Default: apps/v1
    kind:          {kind-of-target-resource}         # Optional. Default: Deployment
    name:          {name-of-target-resource}         # Mandatory. Must be in the same namespace as the ScaledObject
    envSourceContainerName: {container-name}         # Optional. Default: .spec.template.spec.containers[0]
  pollingInterval:  30                               # Optional. Default: 30 seconds
  cooldownPeriod:   300                              # Optional. Default: 300 seconds
  idleReplicaCount: 0                                # Optional. Default: ignored, must be less than minReplicaCount
  minReplicaCount:  1                                # Optional. Default: 0
  maxReplicaCount:  100                              # Optional. Default: 100
  fallback:                                          # Optional. Section to specify fallback options
    failureThreshold: 3                              # Mandatory if fallback section is included
    replicas: 6                                      # Mandatory if fallback section is included
  advanced:                                          # Optional. Section to specify advanced options
    restoreToOriginalReplicaCount: true/false        # Optional. Default: false
    horizontalPodAutoscalerConfig:                   # Optional. Section to specify HPA related options
      name: {name-of-hpa-resource}                   # Optional. Default: keda-hpa-{scaled-object-name}
      behavior:                                      # Optional. Use to modify HPA's scaling behavior
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
  triggers:
  # {list of triggers to activate scaling of the target resource}
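
Filled in, a minimal concrete ScaledObject could look like the following sketch, here using a cron trigger to hold 5 replicas during working hours (the workload name and schedule are hypothetical):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-app-cron              # hypothetical name
spec:
  scaleTargetRef:
    name: web-app                 # hypothetical Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: cron
    metadata:
      timezone: Asia/Shanghai     # IANA timezone
      start: 0 9 * * *            # scale up at 09:00
      end: 0 18 * * *             # scale back down at 18:00
      desiredReplicas: "5"        # replicas held between start and end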


View the metrics exposed by the KEDA Metrics Server

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
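
To inspect the value KEDA exposes for a specific ScaledObject, a query of the following shape can be used (namespace, metric name, and ScaledObject name are placeholders to substitute):

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/<metric-name>?labelSelector=scaledobject.keda.sh/name=<scaledobject-name>"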

Demo

KEDA currently supports 53 scalers, such as Kafka, Elasticsearch, MySQL, RabbitMQ, Prometheus, and so on. Below are examples with Prometheus and Kafka.

Prometheus & KEDA

Deploy a web application and use Prometheus to monitor its HTTP request metrics.

For a more visual demonstration, a clickable, interactive demo app is deployed here; the manifests are at: https://github.com/livewyer-ops/keda-demo/blob/v1.0.0/examples/keda/.

Access via NodePort after successful deployment:

Enter the KubeSphere project and create a custom scaling:

Set the minimum replicas to 1, the maximum replicas to 10, the polling interval to 5 seconds, and the cooldown period to 1 minute:

KubeSphere supports Cron, Prometheus, and custom triggers:

Set the trigger to Prometheus; the query is the sum of the request growth rate over 30s, and when the value exceeds the threshold of 3, the event drives the scaling:
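
Under the hood, the trigger section of the generated ScaledObject would look roughly like this sketch (the Prometheus address and metric name are assumptions; substitute your own):

triggers:
- type: prometheus
  metadata:
    serverAddress: http://prometheus-k8s.kubesphere-monitoring-system.svc:9090  # assumed KubeSphere Prometheus address
    query: sum(rate(http_requests_total[30s]))  # sum of request growth rate over 30s
    threshold: "3"                              # scale out when the query value exceeds 3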

Configure the remaining options, such as whether to restore the original replica count after the resource is deleted, and the scaling policies:

Now access the Web App concurrently:
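
For example, with a simple shell loop (your own node address and NodePort are left as placeholders here):

# Fire 1000 roughly concurrent requests at the demo app
for i in $(seq 1 1000); do
  curl -s http://<node-ip>:<node-port>/ > /dev/null &
done
wait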

You can see the metric change in the custom monitoring panel:

The Web App's replica count starts to scale out:

It eventually expands to the 10 replicas defined in the ScaledObject:

After the traffic stops, you can see the metric value gradually decrease:

The Deployment then starts to scale in:

Kafka & KEDA

The overall topology of the KEDA demo with the Kafka event source is as follows:

The Kafka demo code is at: https://github.com/ChamilaLiyanage/kafka-keda-example.git.

Deploy Kafka

Open the KubeSphere application store and view the DMP database center:

Select Kafka to install:

After installing Kafka, create a test Kafka topic with 5 partitions and a replication factor of 1:
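
If you prefer the command line to the console, creating the topic from inside a broker pod should look something like this (the pod name follows the usual Strimzi naming for a cluster called radondb-kafka, but it is an assumption; check your installation):

kubectl -n demo exec -it radondb-kafka-kafka-0 -- \
  bin/kafka-topics.sh --create --topic test \
  --partitions 5 --replication-factor 1 \
  --bootstrap-server localhost:9092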

Create a Kafka Producer service:

Send an order to the topic:

Create a Consumer service:

Send a new order and check that the Consumer service consumes it:

Now for the autoscaling: create a ScaledObject with the minimum replicas set to 0, the maximum to 10, the polling interval to 5s, and the Kafka lagThreshold to 10:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: kafka-consumer-deployment # Required. Name of the deployment we want to scale.
  pollingInterval: 5
  minReplicaCount: 0   # Optional. Default: 0
  maxReplicaCount: 10  # Optional. Default: 100
  triggers:
  - type: kafka
    metadata:
      # Required
      bootstrapServers: radondb-kafka-kafka-external-bootstrap.demo:9092 # Kafka bootstrap server host and port
      consumerGroup: order-shipper  # Must match the consumer group that consumes the topic
      topic: test
      lagThreshold: "10" # Optional. How far the consumer group may lag before scaling out

Create the custom scaling:

Now, let's submit about 100,000 order messages to the queue and watch the autoscaling in action. You'll see more kafka-consumer pods spawn as the queue accumulates messages. Watch the HPA that KEDA creates (kubectl get hpa -A):

NAMESPACE   NAME                      REFERENCE                   TARGETS      MINPODS   MAXPODS   REPLICAS   AGE
demo        keda-hpa-kafka-consumer   Deployment/kafka-consumer   5/10 (avg)   1         10        1          2m35s

Here we see the replicas peak at 5 rather than 10, because by default the replica count will not exceed the number of partitions of the Kafka topic, and the topic above has 5 partitions. Setting allowIdleConsumers: true disables this default behavior. After editing the custom scaling again, the maximum replica count reaches 10:
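
A sketch of the adjusted trigger, showing only the metadata that changes (same assumed addresses as the manifest above):

triggers:
- type: kafka
  metadata:
    bootstrapServers: radondb-kafka-kafka-external-bootstrap.demo:9092
    consumerGroup: order-shipper
    topic: test
    lagThreshold: "10"
    allowIdleConsumers: "true"  # allow more consumers (replicas) than topic partitions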

When there are no messages left to consume, the replica count drops to 0:

Conclusion

That's all for this article. Friends who need elastic scaling or are simply interested can start practicing now.
