Using eBPF to Accelerate Alibaba Cloud Service Mesh (ASM)

Background

With the rapid development of cloud-native application architecture, microservice architecture has become one of the main ways to build modern applications. In a microservices architecture, communication between services becomes crucial. In order to achieve elasticity and scalability, many organizations are beginning to adopt service mesh technology to manage communication between services.

As one of the most popular service meshes today, Istio provides a powerful set of features to simplify the management and operation of service meshes. It implements traffic management, monitoring, and security control between services by introducing a set of specialized proxies (i.e., sidecars).

In Istio, a sidecar is a special proxy deployed with each service instance that handles communication between that instance and other services. It runs as a separate container in the same pod as the application and provides the service mesh's functionality by intercepting and forwarding network traffic.

However, precisely because a sidecar runs with each service instance, it can also introduce potential performance issues, one of the main ones being latency.

Since each service instance communicates through its corresponding sidecar, the request path gets longer and network latency increases. In addition, the sidecar performs various functions such as traffic management, monitoring, and security control, which also carry some performance cost.

To address the latency introduced by the sidecar, the industry commonly uses eBPF sockops to optimize: socket communication between two processes on the same node is short-circuited, meaning TCP payloads are redirected directly between the peer sockets instead of traversing the TCP/IP protocol stack. This shortens the accelerated traffic path between the two processes on the same node.
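Although the ASM component's internals are not public, the general sockops pattern it builds on is well known: a `sockops` program registers established local connections in a `BPF_MAP_TYPE_SOCKHASH`, and an `sk_msg` program redirects outgoing payloads straight to the peer socket. A minimal kernel-side sketch of this pattern (map name, key layout, and byte-order handling are illustrative assumptions, not the actual ASM implementation):

```c
// sockops_bypass.c — illustrative sketch of the eBPF sockops/sk_msg
// short-circuit pattern; NOT the actual ASM component source.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct sock_key {
    __u32 sip4;   /* source IPv4 */
    __u32 dip4;   /* destination IPv4 */
    __u32 sport;  /* source port */
    __u32 dport;  /* destination port */
};

/* Sockhash of established sockets, keyed by the connection 4-tuple. */
struct {
    __uint(type, BPF_MAP_TYPE_SOCKHASH);
    __uint(max_entries, 65535);
    __type(key, struct sock_key);
    __type(value, int);
} sock_ops_map SEC(".maps");

/* Runs on TCP state changes: register newly established connections.
 * (Byte-order details are simplified for brevity.) */
SEC("sockops")
int bpf_sockops(struct bpf_sock_ops *skops)
{
    switch (skops->op) {
    case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
    case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB: {
        struct sock_key key = {
            .sip4  = skops->local_ip4,
            .dip4  = skops->remote_ip4,
            .sport = skops->local_port,              /* host order */
            .dport = bpf_ntohl(skops->remote_port),  /* network order */
        };
        bpf_sock_hash_update(skops, &sock_ops_map, &key, BPF_NOEXIST);
        break;
    }
    }
    return 0;
}

/* Runs on sendmsg: look up the peer's socket (key with the tuple
 * reversed) and redirect the payload to it directly, bypassing the
 * TCP/IP stack for same-node connections. */
SEC("sk_msg")
int bpf_redir(struct sk_msg_md *msg)
{
    struct sock_key peer = {
        .sip4  = msg->remote_ip4,
        .dip4  = msg->local_ip4,
        .sport = bpf_ntohl(msg->remote_port),
        .dport = msg->local_port,
    };
    return bpf_msg_redirect_hash(msg, &sock_ops_map, &peer, BPF_F_INGRESS);
}

char _license[] SEC("license") = "GPL";
```

In practice such programs are attached to the root cgroup (the `sockops` part) and to the sockhash map (the `sk_msg` part), for example via libbpf or bpftool.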

Alibaba Cloud Service Mesh recently launched a Sidecar Acceleration component. Below, we test and verify it, comparing the measured acceleration effect before and after it is enabled.

Installation, deployment, and environment

Environment preparation

First, create an ASM instance according to the documentation. The author uses the latest version of ASM v1.18 Enterprise Edition.

Then, create an ACK cluster. The ASM sidecar acceleration component only supports ACK managed and ACK dedicated clusters. The author created an ACK managed cluster on Kubernetes v1.26 with 3 nodes, using the Alibaba Cloud Linux 3 node OS image recommended by the documentation, and then added the ACK cluster to the ASM instance.

The environment information is as follows:

  • ✅ ASM instance

  • ✅ ACK cluster

  • ✅ CNI plug-in: Terway

Deploying the test example

A simplified version of the stress test program extracted from Istio's official benchmark tooling is used here.

---
apiVersion: v1
kind: Service
metadata:
  name: fortioserver
spec:
  ports:
  - name: http-echo
    port: 8080
    protocol: TCP
  - name: tcp-echoa
    port: 8078
    protocol: TCP
  - name: grpc-ping
    port: 8079
    protocol: TCP
  selector:
    app: fortioserver
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: fortioserver
  name: fortioserver
spec:
  selector:
    matchLabels:
      app: fortioserver
  template:
    metadata:
      labels:
        app: fortioserver
      annotations:
        sidecar.istio.io/proxyCPULimit: 2000m
        proxy.istio.io/config: |
          concurrency: 2
    spec:
      containers:
      - name: captured
        image: fortio/fortio:latest_release
        ports:
        - containerPort: 8080
          protocol: TCP
        - containerPort: 8078
          protocol: TCP
        - containerPort: 8079
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  annotations:
      service.beta.kubernetes.io/alibaba-cloud-loadbalancer-health-check-switch: "off"
  name: fortioclient
spec:
  ports:
  - name: http-report
    port: 8080
    protocol: TCP
  selector:
    app: fortioclient
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: fortioclient
  name: fortioclient
spec:
  selector:
    matchLabels:
      app: fortioclient
  template:
    metadata:
      annotations:
        sidecar.istio.io/proxyCPULimit: 4000m
        proxy.istio.io/config: |
           concurrency: 4
      labels:
        app: fortioclient
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - fortioserver
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: captured
        volumeMounts:
        - name: shared-data
          mountPath: /var/lib/fortio
        image: fortio/fortio:latest_release
        args:
        - report
        ports:
        - containerPort: 8080
          protocol: TCP
      volumes:
      - name: shared-data
        emptyDir: {}

According to the Sidecar Acceleration component documentation, enabling the component does not accelerate existing TCP connections. Therefore, the author configured a client-side connection pool through a DestinationRule, setting the connection idle timeout to 30s so that connections are always newly established between rounds of testing. (Leave an interval of more than 30 seconds between two rounds of testing.)

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: fortioserver
spec:
  host: fortioserver.default.svc.cluster.local
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        idleTimeout: 30s

Apply the YAML above with kubectl. Note that automatic sidecar injection must be enabled for the default namespace before deployment.

The stress test model is very simple: fortioclient -> fortioserver. After sidecar injection, the stress test traffic path becomes:

[ fortioclient -> sidecar ] -> [ sidecar -> fortioserver ]

A brief description of the YAML configuration:

1) Since most of Envoy's routing and load-balancing work is done by the outbound sidecar, the configuration above deliberately gives the outbound sidecar more CPU, setting its CPU limit to 4000m and its concurrency to 4 (for best performance), to prevent the stress test client from becoming the bottleneck.

2) To measure the same-node acceleration effect, fortioclient and fortioserver are deliberately scheduled onto the same node through pod affinity.

3) The stress test results of each round can be viewed through port 8080 of fortioclient.

Stress test method:

1) http request performance stress test

kubectl exec deployment/fortioclient -c captured -- fortio load -c 64 -qps 14000 -t 30s -a -r 0.00005 -httpbufferkb=64 -labels http-after-install-acceleration-perf-test-1 http://fortioserver:8080/echo?size=1024

2) TCP request performance stress test

kubectl exec deployment/fortioclient -c captured -- fortio load -c 64 -qps  0 -t 30s -a -r 0.00005  -labels tcp-after-install-acceleration-perf-test-1 tcp://fortioserver:8078

Here, -labels names the current stress test round so that the results of multiple rounds can be distinguished.

-qps should be adjusted to the actual scenario: 0 means no upper limit, while a non-zero value runs the test at that fixed QPS.

For the meaning of fortio related parameters, please refer to the official link document: https://github.com/fortio/fortio

Performance Testing

To avoid interference during stress testing, logging can be temporarily disabled under the observability configuration of the ASM console.

First, run a round of tests to find the QPS ceiling of the environment, and compare the QPS before and after the component is enabled to see whether there is an improvement.

Stress test parameters:

  • 64 concurrent connections
  • No upper limit on QPS
  • 30-second test duration
  • HTTP payload size of 1024 bytes (1 KB)

kubectl exec deployment/fortioclient -c captured -- fortio load -c 64 -qps 0 -t 30s -a -r 0.00005 -httpbufferkb=64 -labels http-not-install-acceleration-perf-test-1 http://fortioserver:8080/echo?size=1024

Stress test results:

You can also view the latency histogram by accessing fortioclient through its load balancer IP, which shows the latency distribution of requests.

Now test the effect after enabling the Sidecar Acceleration component:

Find the acceleration component under the component management menu of the ACK console and click Install.

Once the console reports that installation succeeded, run the same stress test command again:

kubectl exec deployment/fortioclient -c captured -- fortio load -c 64 -qps 0 -t 30s -a -r 0.00005 -httpbufferkb=64 -labels http-after-install-acceleration-perf-test-1 http://fortioserver:8080/echo?size=1024

Stress test results:

Comparison before and after enabling:

From a QPS perspective: 13521 / 11461.0 ≈ 1.18, a QPS improvement of about 18%.

From a latency perspective: 4.732 / 5.583 ≈ 0.848, so the average latency is reduced by about 15%.

The histograms in the fortio UI show this intuitively: with the acceleration component enabled, latency is lower and most requests fall in the low-latency region, whereas before enabling it a larger share of requests fell in the higher-latency region.

The author conducted multiple rounds of stress tests to rule out environmental jitter.

Across multiple rounds with adjusted concurrency, the QPS improvement consistently stays at around 15%.

Next, a set of TCP stress test comparisons was conducted.

Stress test parameters:

  • 64 concurrent connections
  • 1024-byte payload
  • 30-second test duration

Before enabling:

Execute the following command to run the stress test:

kubectl exec deployment/fortioclient -c captured -- fortio load -c 64 -qps  0 -t 30s -a -r 0.00005 --payload-size 1024  -labels tcp-not-install-acceleration-perf-test-1 tcp://fortioserver:8078

Multiple rounds of stress tests were carried out; the differences between rounds were not significant, ruling out interference.

After enabling:

Execute the following command:

kubectl exec deployment/fortioclient -c captured -- fortio load -c 64 -qps  0 -t 30s -a -r 0.00005 --payload-size 1024  -labels tcp-after-install-acceleration-perf-test-1 tcp://fortioserver:8078

Histogram comparison before and after enabling:

QPS before and after comparison:

85665 / 54564.9 ≈ 1.57, a QPS improvement of more than 50%. This is because for plain TCP, the sidecar (Envoy) only performs TCP load balancing and forwarding, with no HTTP message parsing.

Therefore, in this scenario the time packets spend traversing the TCP/IP protocol stack accounts for a relatively large share of the total, which the latency comparison also shows.

Latency before and after comparison:

0.746 ms / 1.172 ms ≈ 0.636, a latency reduction of nearly 40%.
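The before/after improvement figures quoted above can be double-checked with a few lines of Python:

```python
# Verify the before/after ratios from the stress test rounds.

def improvement(after, before):
    """Percentage change from 'before' to 'after' (negative = reduction)."""
    return (after / before - 1) * 100

# HTTP round: QPS 11461.0 -> 13521, avg latency 5.583 ms -> 4.732 ms
print(f"HTTP QPS:     {improvement(13521, 11461.0):+.1f}%")  # ~ +18.0%
print(f"HTTP latency: {improvement(4.732, 5.583):+.1f}%")    # ~ -15.2%

# TCP round: QPS 54564.9 -> 85665, avg latency 1.172 ms -> 0.746 ms
print(f"TCP QPS:      {improvement(85665, 54564.9):+.1f}%")  # ~ +57.0%
print(f"TCP latency:  {improvement(0.746, 1.172):+.1f}%")    # ~ -36.3%
```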

Summary

A sidecar in the service mesh proxies requests to and from business services and provides traffic control (routing), load balancing, and other functions, which inevitably introduces some latency. Using eBPF (by deploying the Sidecar Acceleration component) to short-circuit TCP traffic between two processes on the same node at the socket level improves performance to a certain extent: in HTTP scenarios, QPS increases by about 15%, effectively reducing the latency of business requests.

In actual business scenarios, for latency-sensitive services, upstream and downstream dependent services can be deployed on the same node through pod affinity and combined with the Sidecar Acceleration component to achieve lower latency and higher QPS.


Origin my.oschina.net/u/3874284/blog/10117446