Non-intrusive full-link A/B Test practice based on WASM

1 Background introduction

A service mesh (ServiceMesh) provides non-intrusive traffic governance for the microservices running on it. By configuring VirtualService and DestinationRule, you can implement traffic management, timeouts and retries, traffic mirroring, rate limiting, and circuit breaking without modifying the microservice code.

The premise of traffic management is that a service has multiple versions. We can classify the scenarios according to the purpose of deploying multiple versions. The following brief description makes the rest of this article easier to follow.

  • traffic routing: Based on the request information (Header/Cookie/Query Params), route the request traffic to the endpoints (Pod[]) of a specified version (Deployment) of a specified service (Service). This is what we call A/B testing.
  • traffic shifting: Through a grayscale/canary release, route the request traffic by proportion to the endpoints (Pod[]) of each version (Deployment[]) of a specified service (Service), regardless of the request content.
  • traffic switching/mirroring: Through a blue/green release, switch traffic proportionally according to the request information, and mirror (replicate) traffic.

The practice described in this article is to implement full-link A/B testing based on request headers.

1.1 Function brief description

From the documentation of the Istio community, we can easily find documentation and examples on how to route traffic to a specific version of a service based on request headers. But this example only works on the first service of the full link.

For example, a request passes through three services A, B and C, and each of the three services has an en version and an fr version. We want:

  • A request with the header user:en to follow the full-link route A1-B1-C1
  • A request with the header user:fr to follow the full-link route A2-B2-C2

The corresponding VirtualService configuration looks like this:

http:
- name: A|B|C-route
  match:
  - headers:
      user:
        exact: en
  route:
  - destination:
      host: A|B|C-svc
      subset: v1
- route:
  - destination:
      host: A|B|C-svc
      subset: v2

Through the actual measurement, we can find that only the routing of the service A is in line with our expectations. B and C cannot route to the specified version based on the Header value.

Why is this? For the microservices on the service mesh, this header appears out of thin air; the microservice code is unaware of it. Therefore, when service A requests service B, the header is not transparently transmitted (propagated): by the time A calls B, the header has already been lost. At that point, a VirtualService configuration that matches on the header for routing is meaningless.
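The loss can be reproduced in a few lines. The following is a minimal sketch under stated assumptions, not the mesh's actual behaviour: pick_subset mirrors the match rule of the VirtualService above, while app_outbound_headers and trace_chain are hypothetical names that model an application building fresh outbound requests without copying headers it does not know about.

```rust
use std::collections::HashMap;

type Headers = HashMap<String, String>;

/// Routing rule from the VirtualService above:
/// `user: en` goes to subset v1, everything else to subset v2.
fn pick_subset(headers: &Headers) -> &'static str {
    match headers.get("user").map(String::as_str) {
        Some("en") => "v1",
        _ => "v2",
    }
}

/// Hypothetical application behaviour (an assumption of this sketch):
/// the service creates a new outbound request and drops unknown headers.
fn app_outbound_headers(_inbound: &Headers) -> Headers {
    Headers::new()
}

/// Walk the chain A -> B -> C and record which subset serves each hop.
fn trace_chain(initial: Headers) -> Vec<String> {
    let mut headers = initial;
    let mut path = Vec::new();
    for service in ["A", "B", "C"] {
        path.push(format!("{}-{}", service, pick_subset(&headers)));
        // The next hop only sees what the application chose to send.
        headers = app_outbound_headers(&headers);
    }
    path
}

fn main() {
    let mut headers = Headers::new();
    headers.insert("user".to_string(), "en".to_string());
    println!("{:?}", trace_chain(headers));
}
```

With user:en on the first request, only A is routed to v1; B and C fall through to the default route, which matches the measurement described above.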

From the business side of the microservice, the only way to solve this problem is to modify the code (enumerate every header the business cares about and transparently transmit each one). But this is an intrusive modification, and it lacks the flexibility to support headers introduced later.

From the perspective of the service mesh infrastructure, any header is simply a key-value pair with no business meaning that needs to be transparently transmitted. Only by treating headers this way can the service mesh transparently transmit user-defined headers indiscriminately, and thereby support non-intrusive full-link A/B testing.

So how to achieve it?

1.2 Current status of the community

As mentioned above, in the case that the header cannot be transparently transmitted, this function cannot be achieved simply by configuring the header matching of VirtualService.

However, is there any other configuration in VirtualService that can implement transparent transmission of headers? If there is, using VirtualService alone would be the cheapest option.

After various attempts (including carefully configuring header-related set/add), I found that it is impossible to achieve. The reason is that VirtualService's intervention on the header occurs in the inbound phase, and transparent transmission requires intervention of the header in the outbound phase. The microservice workload does not have the ability to transparently transmit the header value that appears out of thin air, so the header will be lost when routing to the next service.

Therefore, we can draw a conclusion: it is impossible to simply use VirtualService to implement non-intrusive full-link A/B Test. Furthermore, the existing configurations provided by the community cannot be used directly to support this function.

Then, only the more advanced configuration of EnvoyFilter is left. This is a conclusion we were very unhappy with at first. There are two reasons:

  1. The configuration of EnvoyFilter is too complicated, and it is difficult for ordinary users to quickly learn and use it in the service mesh. Even if we provide examples, once the requirements change slightly, the examples are of little reference value for modifying EnvoyFilter.
  2. Even if EnvoyFilter is used, currently the built-in filter of Envoy does not directly support this function, and it needs to be developed with Lua or WebAssembly (WASM).

1.3 Implementation plan

Next comes the technology selection. It can be summed up in a few points:

  • The advantage of Lua is that it is lightweight; the disadvantage is that its performance is not ideal.
  • The advantage of WASM is good performance; the disadvantage is that it is more difficult to develop and distribute than Lua.
  • The mainstream implementation languages for WASM are C++ and Rust; implementations in other languages are immature or have performance problems. This article uses Rust.

We use Rust to develop a WASM filter that, in the outbound phase, reads the header defined by the user in the EnvoyFilter and passes it on to the next service.

The WASM package is distributed through a Kubernetes ConfigMap, and the Pod mounts and loads it through definitions in its annotations. (Why this form of distribution is used will be discussed later.)

2 Technical realization

Relevant code described in this section:
https://github.com/AliyunContainerService/rust-wasm-4-envoy/tree/master/propagate-headers-filter

2.1 Implementing WASM using Rust

1 Define dependencies

The WASM project has only one core dependency, the proxy-wasm crate, which is the base package for developing WASM filters in Rust. In addition, it uses serde_json for deserialization and log for printing logs. The dependencies are defined in Cargo.toml as follows:

[dependencies]
proxy-wasm = "0.1.3"
serde_json = "1.0.62"
log = "0.4.14"

2 Define the build

The final build artifact of the WASM project is a C-compatible dynamic library, which is defined in Cargo.toml as follows:

[lib]
name = "propaganda_filter"
path = "src/propagate_headers.rs"
crate-type = ["cdylib"]

3 Header transparent transmission function

First define the structures as follows, where head_tag_name is the name of the user-defined header key and head_tag_value is the corresponding value.

struct PropagandaHeaderFilter {
    // Context ID assigned by the host; it is printed in the header log below.
    context_id: u32,
    config: FilterConfig,
}

struct FilterConfig {
    head_tag_name: String,
    head_tag_value: String,
}
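To show how the JSON configuration passed to the filter could end up in FilterConfig, here is a minimal sketch using only the standard library. It is a simplified stand-in for the serde_json deserialization that the project depends on, not the project's code: string_field and parse_config are hypothetical helpers, and the parser assumes a flat object of string values without escape sequences.

```rust
#[derive(Debug, PartialEq)]
struct FilterConfig {
    head_tag_name: String,
    head_tag_value: String,
}

/// Return the string value stored under `key` in a flat JSON object.
/// Simplification for illustration: no escape sequences, no nesting.
fn string_field(json: &str, key: &str) -> Option<String> {
    let quoted_key = format!("\"{}\"", key);
    // Skip past the quoted key, then expect `:` and an opening quote.
    let after_key = &json[json.find(&quoted_key)? + quoted_key.len()..];
    let after_colon = after_key.trim_start().strip_prefix(':')?;
    let rest = after_colon.trim_start().strip_prefix('"')?;
    let end = rest.find('"')?;
    Some(rest[..end].to_string())
}

/// Build the filter configuration; `None` means a required field is missing.
fn parse_config(json: &str) -> Option<FilterConfig> {
    Some(FilterConfig {
        head_tag_name: string_field(json, "head_tag_name")?,
        head_tag_value: string_field(json, "head_tag_value")?,
    })
}

fn main() {
    let raw = r#"{"head_tag_name": "custom-version", "head_tag_value": "hello1-v1"}"#;
    println!("{:?}", parse_config(raw));
}
```

Returning None for an incomplete configuration lets the caller decide whether to skip the header or reject the configuration; a real filter would more likely rely on serde_json for robust parsing.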

The on_http_request_headers method is defined in the HttpContext trait in {proxy-wasm}/src/traits.rs. We implement transparent transmission of the header by implementing this method.

impl HttpContext for PropagandaHeaderFilter {
    fn on_http_request_headers(&mut self, _: usize) -> Action {
        let head_tag_key = self.config.head_tag_name.as_str();
        info!("::::head_tag_key={}", head_tag_key);
        if !head_tag_key.is_empty() {
            self.set_http_request_header(head_tag_key, Some(self.config.head_tag_value.as_str()));
            self.clear_http_route_cache();
        }
        for (name, value) in &self.get_http_request_headers() {
            info!("::::H[{}] -> {}: {}", self.context_id, name, value);
        }
        Action::Continue
    }
}

Lines 3-6 read the user-defined header key-value pair from the configuration and, if the key is not empty, call the set_http_request_header method to write the key-value pair into the headers of the current request.

Line 7 is a workaround for the current proxy-wasm implementation: clearing the route cache makes Envoy select the route again based on the modified headers.
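The decision logic of the filter can be exercised without Envoy. The sketch below is an illustration only: MockRequest, set_header, clear_route_cache and propagate are hypothetical stand-ins for the host calls, and the replace-on-set behaviour is an assumption about how setting a header should behave, modelled here with a plain vector.

```rust
/// Stand-in for the request state that Envoy exposes to the filter.
struct MockRequest {
    headers: Vec<(String, String)>,
    route_cache_cleared: bool,
}

impl MockRequest {
    /// Replace any existing values of `name` ("set", not "add", semantics).
    fn set_header(&mut self, name: &str, value: &str) {
        self.headers.retain(|(n, _)| !n.eq_ignore_ascii_case(name));
        self.headers.push((name.to_string(), value.to_string()));
    }

    /// Record that the route must be recomputed from the new headers.
    fn clear_route_cache(&mut self) {
        self.route_cache_cleared = true;
    }
}

/// Same decision as on_http_request_headers above:
/// write the configured header only when a key is configured.
fn propagate(req: &mut MockRequest, tag_name: &str, tag_value: &str) {
    if !tag_name.is_empty() {
        req.set_header(tag_name, tag_value);
        req.clear_route_cache();
    }
}

fn main() {
    let mut req = MockRequest {
        headers: vec![("custom-version".to_string(), "old".to_string())],
        route_cache_cleared: false,
    };
    propagate(&mut req, "custom-version", "hello1-v1");
    println!("{:?} cleared={}", req.headers, req.route_cache_cleared);
}
```

Keeping the logic in a plain function like propagate makes it possible to unit test the header handling separately from the proxy integration.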

2.2 Local verification (based on Envoy)

1 WASM build

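Use the following commands to build the WASM project. Note that this example builds the wasm32-unknown-unknown target with the nightly toolchain, so you need to temporarily switch the build environment before building. The target must also be installed for the toolchain in use, for example with rustup target add wasm32-unknown-unknown.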

rustup override set nightly
cargo build --target=wasm32-unknown-unknown --release

After the build is complete, we use docker compose to start Envoy locally to verify the WASM function.

2 Envoy configuration

This example needs to provide two files for Envoy to start: the built propaganda_filter.wasm and the Envoy configuration file envoy-local-wasm.yaml. They are mounted as follows:

volumes:
  - ./config/envoy/envoy-local-wasm.yaml:/etc/envoy-local-wasm.yaml
  - ./target/wasm32-unknown-unknown/release/propaganda_filter.wasm:/etc/propaganda_filter.wasm

Envoy supports dynamic configuration, and local tests use static configuration:

static_resources:
  listeners:
    - address:
        socket_address:
          address: 0.0.0.0
          port_value: 80
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
...
                http_filters:
                  - name: envoy.filters.http.wasm
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: "header_filter"
                          root_id: "propaganda_filter"
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "head_tag_name": "custom-version",
                                "head_tag_value": "hello1-v1"
                              }
                          vm_config:
                            runtime: "envoy.wasm.runtime.v8"
                            vm_id: "header_filter_vm"
                            code:
                              local:
                                filename: "/etc/propaganda_filter.wasm"
                            allow_precompiled: true
...

The Envoy configuration has three points worth noting:

  • Line 15 declares the entry in http_filters, named header_filter, with the type type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
  • Line 32 sets the local file path to /etc/propaganda_filter.wasm
  • Lines 20-26 hold the filter configuration. Its type is type.googleapis.com/google.protobuf.StringValue and its value is {"head_tag_name": "custom-version","head_tag_value": "hello1-v1"}. The custom header key here is custom-version and its value is hello1-v1.

3 Local verification

Execute the following command to start docker compose:

docker-compose up --build

Request local service:

curl -H "version-tag":"v1" "localhost:18000"

At this point, the Envoy log should have the following output:

proxy_1        | [2021-02-25 06:30:09.217][33][info][wasm] [external/envoy/source/extensions/common/wasm/context.cc:1152] wasm log: ::::create_http_context head_tag_name=custom-version,head_tag_value=hello1-v1
proxy_1        | [2021-02-25 06:30:09.217][33][info][wasm] [external/envoy/source/extensions/common/wasm/context.cc:1152] wasm log: ::::head_tag_key=custom-version
...
proxy_1        | [2021-02-25 06:30:09.217][33][info][wasm] [external/envoy/source/extensions/common/wasm/context.cc:1152] wasm log: ::::H[2] -> custom-version: hello1-v1

2.3 Distribution of WASM

WASM distribution refers to the process of storing WASM packages in a repository from which the specified Pods can pull them.

1 ConfigMap + Envoy local mode

Although this method is not the final form of WASM distribution, it is easy to understand and suitable for simple scenarios, so it was chosen for this example. ConfigMaps were not designed for storing WASM packages, but both ConfigMap and Envoy's local mode are very mature, and the combination of the two meets the current needs.

The Alibaba Cloud Service Mesh ASM product already provides a similar method. For details, please refer to "Writing WASM Filter for Envoy and deploying it to ASM".

The first thing to consider when putting a WASM package into a ConfigMap is the size of the package. We use wasm-gc to trim the package, as shown below:

ls -hl target/wasm32-unknown-unknown/release/propaganda_filter.wasm
wasm-gc ./target/wasm32-unknown-unknown/release/propaganda_filter.wasm ./target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
ls -hl target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm

The results are as follows; you can compare the size of the package before and after trimming:

-rwxr-xr-x  2 han  staff   1.7M Feb 25 15:38 target/wasm32-unknown-unknown/release/propaganda_filter.wasm
-rw-r--r--  1 han  staff   136K Feb 25 15:38 target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
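Why the size matters can be made concrete with a small check. This is a sketch under an assumption: it takes the documented Kubernetes limit of 1 MiB of data per ConfigMap as the threshold and ignores other entries stored in the same ConfigMap. The names CONFIGMAP_LIMIT_BYTES, fits_in_configmap and wasm_fits are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Assumed limit: Kubernetes documents that ConfigMap data cannot exceed 1 MiB.
const CONFIGMAP_LIMIT_BYTES: u64 = 1024 * 1024;

/// True when a payload of `size_bytes` stays within the assumed limit.
fn fits_in_configmap(size_bytes: u64) -> bool {
    size_bytes <= CONFIGMAP_LIMIT_BYTES
}

/// Check an actual .wasm file on disk.
fn wasm_fits(path: &Path) -> io::Result<bool> {
    Ok(fits_in_configmap(fs::metadata(path)?.len()))
}

fn main() -> io::Result<()> {
    // Approximate sizes reported by ls above: 1.7M before wasm-gc, 136K after.
    println!("before wasm-gc fits: {}", fits_in_configmap(1_700_000));
    println!("after wasm-gc fits:  {}", fits_in_configmap(136 * 1024));

    // Optionally check a real file passed as the first argument.
    if let Some(arg) = std::env::args().nth(1) {
        println!("{} fits: {}", arg, wasm_fits(Path::new(&arg))?);
    }
    Ok(())
}
```

Under this assumption the untrimmed 1.7M build would not fit, while the trimmed 136K package leaves plenty of headroom.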

Create a configmap:

wasm_image=target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
kubectl -n $NS create configmap propaganda-header --from-file=$wasm_image

Patch the specified Deployment:

patch_annotations=$(cat config/annotations/patch-annotations.yaml)
kubectl -n $NS patch deployment "hello$i-deploy-v$j" -p "$patch_annotations"

patch-annotations.yaml is as follows:

spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/userVolume: '[{"name":"wasmfilters-dir","configMap": {"name":"propaganda-header"}}]'
        sidecar.istio.io/userVolumeMount: '[{"mountPath":"/var/local/lib/wasm-filters","name":"wasmfilters-dir"}]'

2 Envoy's remote method

Envoy supports both local and remote forms of resource definition. The comparison is as follows:

# local
vm_config:
  runtime: "envoy.wasm.runtime.v8"
  vm_id: "header_filter_vm"
  code:
    local:
      filename: "/etc/propaganda_filter.wasm"

# remote
vm_config:
  runtime: "envoy.wasm.runtime.v8"
  code:
    remote:
      http_uri:
        uri: "http://*.*.*.216:8000/propaganda_filter.wasm"
        cluster: web_service
        timeout:
          seconds: 60
      sha256: "da2e22*"

The remote method is the closest to native Envoy, so it would have been the first choice for this example. However, during actual testing we found a problem with the hash verification of the package. For details, see the reference below. In addition, Daniel Zhou from the Envoy community gave me feedback that remote is not the future direction for WASM distribution in Envoy. Therefore, this example finally abandons this approach.

3 ORAS + Local method

ORAS is the reference implementation of the OCI Artifacts project that significantly simplifies the storage of arbitrary content in the OCI registry.

Use the ORAS client or API/SDK to push the Wasm module, with an allowed media type, to a registry (an OCI-compatible registry), and then deploy the Wasm Filter through a controller to the Pods of the specified workload, where it is mounted in local mode.

The Alibaba Cloud Service Mesh ASM product provides support for WebAssembly (WASM) technology. Service mesh users can deploy extended WASM Filters through ASM to the corresponding Envoy proxies in the data plane cluster. The ASMFilterDeployment Controller component supports dynamic loading of plug-ins, is easy to use, and supports hot updates. Specifically, ASM provides a new CRD, ASMFilterDeployment, and a related controller component. This controller watches ASMFilterDeployment resource objects and does two things:

  • Create an Istio EnvoyFilter Custom Resource for the control plane and push it to the corresponding asm control plane istiod
  • Pull the corresponding wasm filter image from the OCI registry and mount it into the corresponding workload pod

For details, please refer to "Simplify and expand service mesh functions based on Wasm and ORAS".

Subsequent practice sharing will use this method to distribute WASM, so stay tuned.

Others in the industry are also promoting this approach. In particular, Solo.io provides wasme, a complete WASM development framework with which you can develop, build and distribute WASM packages (OCI images) and deploy them to the WebAssembly Hub. The advantage of this solution is obvious: it fully supports the life cycle of WASM from development to launch. But its shortcoming is also obvious: the self-contained nature of wasme makes it difficult to split apart and extend beyond the Solo ecosystem.

The Alibaba Cloud Service Mesh ASM team is communicating with relevant industry teams, including Solo, on how to jointly promote an OCI specification for Wasm filters and the corresponding lifecycle management, to help customers easily extend Envoy's functions and take the application of service meshes to new heights.

2.4 Cluster verification (based on Istio)

1 Experimental example

After the WASM package is distributed to a Kubernetes ConfigMap, we can perform cluster verification. The example (source code) contains 3 Services, hello1, hello2 and hello3, and each Service contains 2 versions: v1/en and v2/fr.

Each Service is configured with VirtualService and DestinationRule to define matching Header and route to the specified version.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: hello2-vs
spec:
  hosts:
    - hello2-svc
  http:
  - name: hello2-v2-route
    match:
    - headers:
        route-v:
          exact: hello2v2
    route:
    - destination:
        host: hello2-svc
        subset: hello2v2
  - route:
    - destination:
        host: hello2-svc
        subset: hello2v1
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: hello2-dr
spec:
  host: hello2-svc
  subsets:
    - name: hello2v1
      labels:
        version: v1
    - name: hello2v2
      labels:
        version: v2

The EnvoyFilter is as follows:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: hello1v2-propaganda-filter
spec:
  workloadSelector:
    labels:
      app: hello1-deploy-v2
      version: v2
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_OUTBOUND
        proxy:
          proxyVersion: "^1\\.8\\.*"
        listener:
          filterChain:
            filter:
              name: envoy.filters.network.http_connection_manager
              subFilter:
                name: envoy.filters.http.router
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.wasm
          typed_config:
            "@type": type.googleapis.com/udpa.type.v1.TypedStruct
            type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
            value:
              config:
                name: propaganda_filter
                root_id: propaganda_filter_root
                configuration:
                  '@type': type.googleapis.com/google.protobuf.StringValue
                  value: |
                    {
                      "head_tag_name": "route-v",
                      "head_tag_value": "hello2v2"
                    }
                vm_config:
                  runtime: envoy.wasm.runtime.v8
                  vm_id: propaganda_filter_vm
                  code:
                    local:
                      filename: /var/local/lib/wasm-filters/propaganda-header-filter.wasm
                  allow_precompiled: true
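How these resources cooperate along the chain can be sketched as a small simulation. This is an illustration under assumptions rather than the mesh's real data path: pick_version mirrors the hello2 VirtualService above (exact match on route-v, otherwise v1) and is assumed to apply in the same way to hello3, the hello1 v2 entry in outbound_tag comes from the EnvoyFilter above, and the hello2 v2 entry is an assumed analogous configuration. All function names are hypothetical.

```rust
use std::collections::HashMap;

/// Route like the VirtualService above: an exact match on `route-v`
/// selects v2 of the target service, anything else falls back to v1.
fn pick_version(service: &str, headers: &HashMap<String, String>) -> &'static str {
    let wanted = format!("{}v2", service);
    if headers.get("route-v") == Some(&wanted) { "v2" } else { "v1" }
}

/// Header written by the outbound filter of a given workload, if any.
fn outbound_tag(service: &str, version: &str) -> Option<(&'static str, &'static str)> {
    match (service, version) {
        ("hello1", "v2") => Some(("route-v", "hello2v2")), // from the EnvoyFilter above
        ("hello2", "v2") => Some(("route-v", "hello3v2")), // assumed analogous config
        _ => None,
    }
}

/// Follow hello1 -> hello2 -> hello3, starting from a chosen hello1 version.
fn trace_from_hello1(version: &'static str) -> Vec<String> {
    let mut path = vec![format!("hello1-{}", version)];
    let mut current = ("hello1", version);
    for next in ["hello2", "hello3"] {
        // The application sends its request without any routing header...
        let mut headers: HashMap<String, String> = HashMap::new();
        // ...and the caller's outbound filter writes the configured one.
        if let Some((name, value)) = outbound_tag(current.0, current.1) {
            headers.insert(name.to_string(), value.to_string());
        }
        let chosen = pick_version(next, &headers);
        path.push(format!("{}-{}", next, chosen));
        current = (next, chosen);
    }
    path
}

fn main() {
    println!("{:?}", trace_from_hello1("v2"));
    println!("{:?}", trace_from_hello1("v1"));
}
```

In this model the application code never touches the header; each hop is steered only by the configuration attached to the calling workload's sidecar.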

2 Verification method

A request carrying the header, for example curl -H "route-v:v2" "http://$ingressGatewayIp:8001/hello/xxx", enters through istio-ingressgateway, and every hop of the full link is routed to the version selected by the header value. Here, since v2 is specified in the header, the full link will be hello1 v2 - hello2 v2 - hello3 v2.

The verification process and results are shown below.

for i in {1..5}; do
    curl -s -H "route-v:v2" "http://$ingressGatewayIp:$PORT/hello/eric" >>result
    echo >>result
done
check=$(grep -o "Bonjour eric" result | wc -l)
if [[ "$check" -eq "15" ]]; then
    echo "pass"
else
    echo "fail"
    exit 1
fi

result:

Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182

We see that the output Bonjour eric comes from the fr version of each service, indicating that the functional verification has passed.
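The pass criterion of the script (5 requests times 3 services, so 15 greetings) can also be expressed as a small check. This is a minimal sketch; count_greetings and all_hops_matched are hypothetical helper names, and the sample line is copied from the result shown above.

```rust
/// Count non-overlapping occurrences of `greeting` in the collected responses.
fn count_greetings(result: &str, greeting: &str) -> usize {
    result.matches(greeting).count()
}

/// Every request must be answered by the expected version of every service.
fn all_hops_matched(result: &str, greeting: &str, requests: usize, services: usize) -> bool {
    count_greetings(result, greeting) == requests * services
}

fn main() {
    let line = "Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182\n";
    // Five identical responses, as in the result listing above.
    let result = line.repeat(5);
    let verdict = if all_hops_matched(&result, "Bonjour eric", 5, 3) { "pass" } else { "fail" };
    println!("{}", verdict);
}
```

Counting greetings rather than lines means a single hop answered by the wrong version is enough to fail the check.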

3 Performance Analysis

After adding EnvoyFilter+WASM, the function verification is passed, but how much delay will this cost? This is a matter of great concern to both providers and consumers of service meshes. This section will verify the following two concerns.

  • Incremental delay overhead after adding EnvoyFilter+WASM
  • Cost comparison between the WASM version and the Lua version

3.1 Lua implementation

The implementation of Lua can be written directly into EnvoyFilter without a separate project. An example is as follows:

patch:
  operation: INSERT_BEFORE
  value:
    name: envoy.lua
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
      inlineCode: |
        function envoy_on_request(handle)
          handle:logInfo("[propagate header] route-v:hello3v2")
          handle:headers():add("route-v", "hello3v2")
        end

3.2 Stress test method

1 Deployment

  • Deploy the same Deployment/Service/VirtualService/DestinationRule in 3 namespaces
  • Deploy a Lua-based EnvoyFilter in hello-abtest-lua
  • Deploy a WASM-based EnvoyFilter in hello-abtest-wasm

hello-abtest        baseline environment
hello-abtest-lua    environment with EnvoyFilter+Lua added
hello-abtest-wasm   environment with EnvoyFilter+WASM added

2 Tools

This example uses hey as the stress testing tool. hey was formerly named boom and is intended as a replacement for ab (Apache Bench). The three environments were stress tested with the same parameters, as follows:

# Number of concurrent workers
export NUM=2000
# Requests per second (hey applies this rate limit per worker)
export QPS=2000
# Duration of the stress test
export Duration=10s

hey -c $NUM -q $QPS -z $Duration -H "route-v:v2" http://$ingressGatewayIp:$PORT/hello/eric > $SIDECAR_WASM_RESULT

Please check the hey result file carefully. The error socket: too many open files must not appear at the end of the result; otherwise the result is distorted. You can raise the limit with ulimit -n $MAX_OPENFILE_NUM and then adjust the stress test parameters to ensure the accuracy of the results.

3.3 Reporting

We selected 4 key indicators from the three results reports, as shown in the following figure:

3.4 Conclusion

  1. Compared with the baseline, both EnvoyFilter versions add latency, ranging from several milliseconds to a few hundred milliseconds. The relative increases are:
  • wasm: 1.2% and 10%, from (0.6395-0.6317)/0.6317 and (1.3290-1.2078)/1.2078
  • lua: 11% and 21%, from (0.7012-0.6317)/0.6317 and (1.4593-1.2078)/1.2078
  2. The performance of the WASM version is significantly better than that of the Lua version

Note: Unlike the Lua version, the WASM implementation is a single code base used with multiple configurations. Therefore, the WASM execution path includes one more step than Lua: reading the configuration variables.
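The relative overhead is simple arithmetic on the latency figures quoted in the conclusion above. The sketch below recomputes it; overhead_percent is a hypothetical helper name, and the four pairs are exactly the numbers from the formulas in the list.

```rust
/// Relative latency increase over the baseline, in percent.
fn overhead_percent(baseline: f64, measured: f64) -> f64 {
    (measured - baseline) / baseline * 100.0
}

fn main() {
    // (label, baseline, measured) taken from the formulas in the conclusion.
    let cases = [
        ("wasm, first indicator", 0.6317, 0.6395),
        ("wasm, second indicator", 1.2078, 1.3290),
        ("lua, first indicator", 0.6317, 0.7012),
        ("lua, second indicator", 1.2078, 1.4593),
    ];
    for (label, baseline, measured) in cases {
        println!("{}: {:.1}%", label, overhead_percent(baseline, measured));
    }
}
```

Expressing the overhead relative to the baseline makes the Lua and WASM results comparable even though the two indicators have different absolute values.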

4 Outlook

4.1 How to use

From the perspective of technical implementation, this article describes how to implement and verify a WASM filter that transparently transmits user-defined headers to support non-intrusive full-link A/B testing.

However, for a service mesh user, implementing this step by step as described in this article is tedious and error-prone.

The Alibaba Cloud Service Mesh ASM team is launching an ASM plugin directory mechanism. Users only need to select a plugin in the plugin directory and provide it with a small number of key-value settings, such as the custom header; the related resources (EnvoyFilter+WASM+VirtualService+DestinationRule) are then generated and deployed automatically.

4.2 How to expand

This example only shows the matching and routing function based on Header. If we want to match and route based on Query Params, how to extend it?

This is a topic that the ASM plugin directory is closely following, and the final plugin directory will provide best practices.


This article is original content of Alibaba Cloud and may not be reproduced without permission.
