1 Background
A service mesh provides non-intrusive traffic management for the microservices running on it. By configuring VirtualService and DestinationRule, capabilities such as traffic routing, timeouts and retries, traffic mirroring, rate limiting, and circuit breaking can be achieved without modifying the microservice code.
Traffic management presupposes that a service has multiple versions. We can classify the scenarios by the purpose of deploying multiple versions; they are described briefly below to make the rest of the article easier to follow.
- Traffic routing: based on request information (Header/Cookie/Query Params), request traffic is routed to the endpoints (Pod[]) of a specified version (Deployment) of a specified service (Service). This is what we call A/B testing.
- Traffic shifting: in a gray/canary release, request traffic is distributed proportionally, regardless of request content, across the endpoints (Pod[]) of each version (Deployment[]) of a specified service (Service).
- Traffic switching/mirroring: in a blue/green release, traffic is switched proportionally based on request information, and traffic is mirrored.
The practice described in this article implements full-link A/B testing based on a request header.
1.1 Brief description of functions
The Istio community documentation provides documents and examples showing how to route traffic to a specific version of a service based on a request header. However, such an example only takes effect on the first service of the entire link.
For example, a request passes through three services, A, B, and C, and each of the three has an en version and an fr version. We want:
- a request with the header user:en to follow the full-link route A1-B1-C1
- a request with the header user:fr to follow the full-link route A2-B2-C2
The corresponding VirtualService configuration is as follows:
http:
  - name: A|B|C-route
    match:
      - headers:
          user:
            exact: en
    route:
      - destination:
          host: A|B|C-svc
          subset: v1
  - route:
      - destination:
          host: A|B|C-svc
          subset: v2
In practice, we find that only the routing of service A meets our expectations. B and C cannot be routed to the specified version based on the header value.
Why is this? To the microservices on the mesh, this header appears out of thin air; the microservice code is unaware of it. Therefore, when service A calls service B, the header is not passed along; by the time A requests B, the header has already been lost. At that point, a VirtualService that routes by matching the header is meaningless.
From the business point of view of the microservice owner, the only way to solve this is to modify the code (enumerate all headers the business cares about and pass them along). But this is an intrusive change, and it cannot flexibly support new headers.
From the point of view of the service mesh infrastructure, any header is just a key-value pair with no business meaning that needs to be passed along. Only when the mesh can pass along user-defined headers indiscriminately can it support non-intrusive full-link A/B testing.
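The loss of the header between hops can be illustrated with a small simulation. This is a minimal sketch in plain Rust, not code from the project: route mimics the user:en match of the VirtualService above, while call_chain and its sidecar_sets_header flag are hypothetical names that model whether something on the outbound path writes the header back.

```rust
use std::collections::HashMap;

type Headers = HashMap<String, String>;

/// Mimics the VirtualService above: `user: en` selects v1, anything else falls through to v2.
fn route(headers: &Headers) -> &'static str {
    match headers.get("user").map(|v| v.as_str()) {
        Some("en") => "v1",
        _ => "v2",
    }
}

/// Walks a request through services A, B and C (hypothetical model).
/// Each service builds a fresh outbound request and forwards none of the
/// inbound headers; `sidecar_sets_header` models an outbound filter that
/// writes the `user` header back before the request leaves the Pod.
fn call_chain(ingress: &Headers, sidecar_sets_header: bool) -> Vec<String> {
    let mut path = Vec::new();
    let mut inbound = ingress.clone();
    for service in ["A", "B", "C"].iter() {
        // Routing decision made for the request arriving at this service.
        path.push(format!("{}-{}", service, route(&inbound)));
        // The service issues a new request to the next hop.
        let mut outbound = Headers::new();
        if sidecar_sets_header {
            if let Some(value) = inbound.get("user") {
                outbound.insert("user".to_string(), value.clone());
            }
        }
        inbound = outbound;
    }
    path
}
```

With a request carrying user: en, call_chain(&headers, false) yields A-v1, B-v2, C-v2, which is the behaviour observed above, while call_chain(&headers, true) yields A-v1, B-v1, C-v1.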
So how can it be achieved?
1.2 Status of the community
As explained above, when the header cannot be passed along, this function cannot be achieved simply by configuring header matching in the VirtualService.
However, is there some other VirtualService configuration that can pass the header along? If so, staying with VirtualService would be the cheapest option.
After various attempts (including careful configuration of the header-related set/add), I found this to be impossible. The reason is that VirtualService intervenes on headers in the inbound phase, whereas passing a header along requires intervening in the outbound phase. The microservice workload cannot pass along a header value that appears out of thin air, so the header is lost when the request is routed to the next service.
We can therefore conclude that a non-intrusive full-link A/B test cannot be achieved with VirtualService alone. More broadly, the existing configurations provided by the community cannot directly support this function.
That leaves only the more advanced EnvoyFilter configuration. This is a conclusion we had hoped to avoid, for two reasons:
- EnvoyFilter configuration is complicated, and it is hard for general users to learn quickly and use in a service mesh. Even if we provide examples, once the requirements change slightly, the examples offer little guidance for modifying the EnvoyFilter.
- Even with EnvoyFilter, Envoy's current built-in filters do not directly support this function; it has to be developed with Lua or WebAssembly (WASM).
1.3 Implementation scheme
Next comes technology selection, summarized briefly:
- Lua's advantage is that it is compact; its disadvantage is unsatisfactory performance.
- WASM's advantage is good performance; its disadvantage is that development and distribution are harder than with Lua.
- The mainstream WASM implementations are in C++ and Rust; implementations in other languages are not yet mature or have performance problems. This article uses Rust.
We develop a WASM filter in Rust. In the outbound phase, it reads the header that the user defined in the EnvoyFilter and writes it into the outgoing request.
The WASM package is distributed through a Kubernetes configmap, and the Pod obtains and loads the WASM module through definitions in its annotations. (Why this form of distribution was chosen is discussed later.)
2 Technical implementation
The code described in this section:
https://github.com/AliyunContainerService/rust-wasm-4-envoy/tree/master/propagate-headers-filter
2.1 Implementing WASM in Rust
1 Define dependencies
The WASM project depends on only one core crate, proxy-wasm, which is the base package for developing WASM in Rust. In addition, it uses serde_json for deserialization and log for printing logs. The dependencies are defined in Cargo.toml as follows:
[dependencies]
proxy-wasm = "0.1.3"
serde_json = "1.0.62"
log = "0.4.14"
2 Define the build
The final build artifact of the WASM project is a C-compatible dynamic library, defined in Cargo.toml as follows:
[lib]
name = "propaganda_filter"
path = "src/propagate_headers.rs"
crate-type = ["cdylib"]
3 Header propagation function
First define the structures as follows, where head_tag_name is the name of the user-defined header key and head_tag_value is the corresponding value.
struct PropagandaHeaderFilter {
    context_id: u32, // referenced as self.context_id in the log output of the implementation
    config: FilterConfig,
}
struct FilterConfig {
    head_tag_name: String,
    head_tag_value: String,
}
The on_http_request_headers method is defined in the HttpContext trait in {proxy-wasm}/src/traits.rs. We implement this method to perform the header propagation.
impl HttpContext for PropagandaHeaderFilter {
    fn on_http_request_headers(&mut self, _: usize) -> Action {
        let head_tag_key = self.config.head_tag_name.as_str();
        info!("::::head_tag_key={}", head_tag_key);
        if !head_tag_key.is_empty() {
            self.set_http_request_header(head_tag_key, Some(self.config.head_tag_value.as_str()));
            self.clear_http_route_cache();
        }
        for (name, value) in &self.get_http_request_headers() {
            info!("::::H[{}] -> {}: {}", self.context_id, name, value);
        }
        Action::Continue
    }
}
Lines 3-6 read the user-defined header key-value pair from the configuration. If the key is present, the set_http_request_header method is called to write the pair into the current request headers.
Line 7 is a workaround for the current proxy-wasm implementation. If you are interested, see the following references:
- https://github.com/istio/istio/issues/30545#issuecomment-783518257
- https://github.com/proxy-wasm/spec/issues/16
- https://www.elvinefendi.com/2020/12/09/dynamic-routing-envoy-wasm.html
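The header-writing step of lines 3-6 can also be modelled without Envoy, which makes the logic easy to unit test. The sketch below is an illustration only: Headers and apply_config are hypothetical names, the header map is a plain vector rather than the proxy-wasm host API, and it assumes that "set" replaces any existing entry with the same name.

```rust
/// Mirror of the filter configuration described above.
struct FilterConfig {
    head_tag_name: String,
    head_tag_value: String,
}

/// Hypothetical stand-in for the request header map that Envoy exposes to the filter.
type Headers = Vec<(String, String)>;

/// Writes the configured header into `headers`.
/// Assumption: "set" semantics, i.e. an existing entry with the same
/// (case-insensitive) name is replaced rather than duplicated.
fn apply_config(headers: &mut Headers, config: &FilterConfig) {
    let key = config.head_tag_name.as_str();
    if key.is_empty() {
        // Nothing configured: leave the request untouched.
        return;
    }
    headers.retain(|(name, _)| !name.eq_ignore_ascii_case(key));
    headers.push((key.to_string(), config.head_tag_value.clone()));
}
```

An empty head_tag_name leaves the headers unchanged, which corresponds to the is_empty check in the filter.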
2.2 Local verification (based on Envoy)
1 Building the WASM module
Use the following commands to build the WASM project. Note that this example builds the wasm32-unknown-unknown target with the nightly toolchain, so the build environment needs to be switched temporarily before building.
rustup override set nightly
cargo build --target=wasm32-unknown-unknown --release
After the build is complete, we use docker compose to start Envoy locally to verify the WASM function.
2 Envoy configuration
In this example, two files need to be provided to Envoy at startup: the built propaganda_filter.wasm and the Envoy configuration file envoy-local-wasm.yaml. They are mounted as follows:
volumes:
  - ./config/envoy/envoy-local-wasm.yaml:/etc/envoy-local-wasm.yaml
  - ./target/wasm32-unknown-unknown/release/propaganda_filter.wasm:/etc/propaganda_filter.wasm
Envoy supports dynamic configuration, and local testing uses static configuration:
static_resources:
  listeners:
    - address:
        socket_address:
          address: 0.0.0.0
          port_value: 80
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              ...
              http_filters:
                - name: envoy.filters.http.wasm
                  typed_config:
                    "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                    type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                    value:
                      config:
                        name: "header_filter"
                        root_id: "propaganda_filter"
                        configuration:
                          "@type": "type.googleapis.com/google.protobuf.StringValue"
                          value: |
                            {
                              "head_tag_name": "custom-version",
                              "head_tag_value": "hello1-v1"
                            }
                        vm_config:
                          runtime: "envoy.wasm.runtime.v8"
                          vm_id: "header_filter_vm"
                          code:
                            local:
                              filename: "/etc/propaganda_filter.wasm"
                          allow_precompiled: true
  ...
The Envoy configuration has three key points:
- Line 15: within http_filters, we define a filter named header_filter of type type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm.
- Line 32: the local file path is /etc/propaganda_filter.wasm.
- Lines 20-26: the configuration is of type type.googleapis.com/google.protobuf.StringValue, and its value is {"head_tag_name": "custom-version","head_tag_value": "hello1-v1"}. Here the custom header key is custom-version and its value is hello1-v1.
3 Local verification
Execute the following command to start docker compose:
docker-compose up --build
Request local service:
curl -H "version-tag: v1" "localhost:18000"
At this time, Envoy's log should have the following output:
proxy_1 | [2021-02-25 06:30:09.217][33][info][wasm] [external/envoy/source/extensions/common/wasm/context.cc:1152] wasm log: ::::create_http_context head_tag_name=custom-version,head_tag_value=hello1-v1
proxy_1 | [2021-02-25 06:30:09.217][33][info][wasm] [external/envoy/source/extensions/common/wasm/context.cc:1152] wasm log: ::::head_tag_key=custom-version
...
proxy_1 | [2021-02-25 06:30:09.217][33][info][wasm] [external/envoy/source/extensions/common/wasm/context.cc:1152] wasm log: ::::H[2] -> custom-version: hello1-v1
2.3 How to distribute WASM
Distributing WASM means storing the WASM package in a distributed repository from which the specified Pods can pull it.
1 Configmap + Envoy's local mode
Although this is not the final form of WASM distribution, it is easy to understand and suits simple scenarios, so this example uses it for the explanation. A configmap was not designed to hold WASM, but both configmap and Envoy's local mode are mature, and the combination of the two meets current needs.
Alibaba Cloud Service Mesh (ASM) already offers a similar approach. For details, see "Writing WASM Filter for Envoy and deploying it to ASM".
To put the WASM package into a configmap, the first consideration is the package size. We use wasm-gc to trim the package, as shown below:
ls -hl target/wasm32-unknown-unknown/release/propaganda_filter.wasm
wasm-gc ./target/wasm32-unknown-unknown/release/propaganda_filter.wasm ./target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
ls -hl target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
The result is as follows; you can compare the package size before and after trimming:
-rwxr-xr-x 2 han staff 1.7M Feb 25 15:38 target/wasm32-unknown-unknown/release/propaganda_filter.wasm
-rw-r--r-- 1 han staff 136K Feb 25 15:38 target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
Create configmap:
wasm_image=target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
kubectl -n $NS create configmap propaganda-header --from-file=$wasm_image
Patch the specified Deployment:
patch_annotations=$(cat config/annotations/patch-annotations.yaml)
kubectl -n $NS patch deployment "hello$i-deploy-v$j" -p "$patch_annotations"
patch-annotations.yaml is as follows:
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/userVolume: '[{"name":"wasmfilters-dir","configMap": {"name":"propaganda-header"}}]'
        sidecar.istio.io/userVolumeMount: '[{"mountPath":"/var/local/lib/wasm-filters","name":"wasmfilters-dir"}]'
2 Envoy's remote mode
Envoy supports two forms of code source definition, local and remote. The comparison is as follows:
# local
vm_config:
  runtime: "envoy.wasm.runtime.v8"
  vm_id: "header_filter_vm"
  code:
    local:
      filename: "/etc/propaganda_filter.wasm"
# remote
vm_config:
  runtime: "envoy.wasm.runtime.v8"
  code:
    remote:
      http_uri:
        uri: "http://*.*.*.216:8000/propaganda_filter.wasm"
        cluster: web_service
        timeout:
          seconds: 60
      sha256: "da2e22*"
The remote mode is the closest to native Envoy, so it was originally the first choice for this example. However, during testing, a problem was found with the hash verification of the package; see the references below for details. In addition, a senior member of the Envoy community told me that remote is not the future direction for WASM distribution in Envoy. This example therefore abandoned the approach.
- https://stackoverflow.com/questions/65871312/how-to-set-the-sha256-hex-in-envoy-wasm-remote-config
- https://envoyproxy.slack.com/archives/C78M4KW76/p1611496672017500
3 ORAS + local mode
ORAS is the reference implementation of the OCI Artifacts project, and it significantly simplifies storing arbitrary content in an OCI registry.
Using the ORAS client or API/SDK, a Wasm module with an allowed media type is pushed to an OCI-compatible registry, and a controller then deploys the Wasm filter to the Pods of the specified workload, where it is mounted in local mode.
Alibaba Cloud Service Mesh (ASM) supports WebAssembly (WASM). Service mesh users can deploy extended WASM filters through ASM to the corresponding Envoy proxies in the data plane cluster. The ASMFilterDeployment Controller component supports dynamic plug-in loading, is easy to use, and supports hot updates. Specifically, ASM provides a new CRD, ASMFilterDeployment, and a related controller component. The controller watches ASMFilterDeployment resource objects and does two things:
- Creates an Istio EnvoyFilter custom resource for the control plane and pushes it to the corresponding ASM control plane istiod
- Pulls the corresponding Wasm filter image from the OCI registry and mounts it to the corresponding workload Pods
For details, see "Simplify and expand service grid functions based on Wasm and ORAS".
A follow-up article will share practice using this method to distribute WASM, so stay tuned.
Others in the industry are also advancing this approach. In particular, Solo.io provides wasme, a complete WASM development framework with which WASM packages (OCI images) can be developed, built, and distributed, and then deployed to the WebAssembly Hub. The advantage of this solution is obvious: it fully supports the WASM life cycle from development to production. Its shortcoming is just as obvious: the self-contained wasme is difficult to split apart and extend beyond the Solo ecosystem.
The Alibaba Cloud Service Mesh ASM team is in communication with relevant industry teams, including Solo, on how to jointly promote an OCI specification for Wasm filters and the corresponding life cycle management, to help customers easily extend Envoy's capabilities and take its use in service mesh to new heights.
2.4 Cluster verification (based on Istio)
1 Experimental example
After the WASM package has been distributed to a Kubernetes configmap, we can perform cluster verification. The example (source code) contains three Services, hello1, hello2, and hello3, and each Service has two versions: v1/en and v2/fr.
Each Service is configured with a VirtualService and a DestinationRule that match the header and route to the specified version.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: hello2-vs
spec:
  hosts:
    - hello2-svc
  http:
    - name: hello2-v2-route
      match:
        - headers:
            route-v:
              exact: hello2v2
      route:
        - destination:
            host: hello2-svc
            subset: hello2v2
    - route:
        - destination:
            host: hello2-svc
            subset: hello2v1
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: hello2-dr
spec:
  host: hello2-svc
  subsets:
    - name: hello2v1
      labels:
        version: v1
    - name: hello2v2
      labels:
        version: v2
The EnvoyFilter is as follows:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: hello1v2-propaganda-filter
spec:
  workloadSelector:
    labels:
      app: hello1-deploy-v2
      version: v2
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_OUTBOUND
        proxy:
          proxyVersion: "^1\\.8\\.*"
        listener:
          filterChain:
            filter:
              name: envoy.filters.network.http_connection_manager
              subFilter:
                name: envoy.filters.http.router
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.wasm
          typed_config:
            "@type": type.googleapis.com/udpa.type.v1.TypedStruct
            type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
            value:
              config:
                name: propaganda_filter
                root_id: propaganda_filter_root
                configuration:
                  '@type': type.googleapis.com/google.protobuf.StringValue
                  value: |
                    {
                      "head_tag_name": "route-v",
                      "head_tag_value": "hello2v2"
                    }
                vm_config:
                  runtime: envoy.wasm.runtime.v8
                  vm_id: propaganda_filter_vm
                  code:
                    local:
                      filename: /var/local/lib/wasm-filters/propaganda-header-filter.wasm
                  allow_precompiled: true
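This EnvoyFilter is attached to hello1 v2 and writes route-v: hello2v2, which is exactly the value that the hello2 VirtualService matches. In other words, each workload version carries a filter configuration that names the subset of the next hop. Assuming the same naming pattern holds for every hop (an inference from this one example, not something the article states), the per-workload configuration string could be derived as in the sketch below; next_hop_config and chain_len are hypothetical names.

```rust
/// Builds the JSON configuration for the filter attached to `hello{index}` at version `v{version}`.
/// Assumption: the next hop is `hello{index + 1}` and its subset is named
/// `hello{index + 1}v{version}`, following the hello1 v2 -> hello2v2 example.
/// Returns None for an invalid index or for the last service, which has no next hop.
fn next_hop_config(index: u32, version: u32, chain_len: u32) -> Option<String> {
    if index == 0 || index >= chain_len {
        return None;
    }
    Some(format!(
        r#"{{"head_tag_name": "route-v", "head_tag_value": "hello{}v{}"}}"#,
        index + 1,
        version
    ))
}
```

For the three-service chain in this example, next_hop_config(1, 2, 3) produces the same key and value as the configuration shown above.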
2 Verification method
A request carrying the header, for example curl -H "route-v:v2" "http://$ingressGatewayIp:$PORT/hello/xxx", enters through istio-ingressgateway, and along the entire link it is routed to the specified version of each service according to the header value. Here the header specifies version v2, so the full link is hello1 v2 - hello2 v2 - hello3 v2.
The verification process and results are as follows.
for i in {1..5}; do
  curl -s -H "route-v:v2" "http://$ingressGatewayIp:$PORT/hello/eric" >>result
  echo >>result
done
# 5 requests x 3 services = 15 expected occurrences of the fr greeting
check=$(grep -o "Bonjour eric" result | wc -l)
if [[ "$check" -eq "15" ]]; then
  echo "pass"
else
  echo "fail"
  exit 1
fi
result:
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
We can see that the output Bonjour eric comes from the fr version of each service, which shows that the functional verification passed.
3 Performance analysis
After adding EnvoyFilter+WASM, the functional verification passes, but how much latency does it add? This is a question that both providers and users of a service mesh care about deeply. This section examines the following two concerns:
- The incremental latency overhead after adding EnvoyFilter+WASM
- A cost comparison between the WASM version and the Lua version
3.1 Lua implementation
The Lua implementation can be written directly in the EnvoyFilter without a separate project. An example is as follows:
patch:
  operation: INSERT_BEFORE
  value:
    name: envoy.lua
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
      inlineCode: |
        function envoy_on_request(handle)
          handle:logInfo("[propagate header] route-v:hello3v2")
          handle:headers():add("route-v", "hello3v2")
        end
3.2 Load testing method
1 Deployment
- Deploy the same Deployment/Service/VirtualService/DestinationRule in each of 3 namespaces
- In hello-abtest-lua, deploy the Lua-based EnvoyFilter
- In hello-abtest-wasm, deploy the WASM-based EnvoyFilter
hello-abtest        baseline environment
hello-abtest-lua    environment with EnvoyFilter+Lua added
hello-abtest-wasm   environment with EnvoyFilter+WASM added
2 Tools
This example uses hey as the load testing tool. The predecessor of hey is boom, which was created as a replacement for ab (Apache Bench). The three environments are tested separately with the same parameters, as follows:
# number of concurrent workers
export NUM=2000
# requests per second (hey applies the -q limit per worker)
export QPS=2000
# load test duration
export Duration=10s
hey -c $NUM -q $QPS -z $Duration -H "route-v:v2" http://$ingressGatewayIp:$PORT/hello/eric > $SIDECAR_WASM_RESULT
Check the hey result file: the message socket: too many open files must not appear at the end, otherwise the results will be affected. You can use the ulimit -n $MAX_OPENFILE_NUM command to raise the limit, and then adjust the test parameters to ensure the results are accurate.
3.3 Report
We select 4 key indicators from the three result reports, as shown in the figure below:
3.4 Conclusion
- Compared with the baseline, both EnvoyFilter versions add tens to hundreds of milliseconds of average latency. The relative increases are:
  - WASM: 1.2% (0.6395-0.6317)/0.6317 and 10% (1.3290-1.2078)/1.2078
  - Lua: 11% (0.7012-0.6317)/0.6317 and 21% (1.4593-1.2078)/1.2078
- The performance of the WASM version is significantly better than that of the Lua version
Note: unlike the Lua version, the WASM implementation is a single code base driven by multiple configurations. The WASM execution path therefore includes one more step than Lua: reading the configuration variables.
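The ratios quoted in the conclusion can be checked with a few lines of arithmetic. The sketch below uses only the baseline and measured values that appear in the text; the function names and the row labels are illustrative, and since the report figure is not reproduced here, the two rows per implementation are simply numbered rather than named.

```rust
/// Relative overhead of `with_filter` over `baseline`, in percent.
fn overhead_percent(baseline: f64, with_filter: f64) -> f64 {
    (with_filter - baseline) / baseline * 100.0
}

/// Formats the four ratios quoted in the conclusion, one line per ratio.
fn overhead_report() -> Vec<String> {
    // (label, baseline, with filter); the labels are illustrative, the numbers come from the text.
    let rows = [
        ("wasm #1", 0.6317, 0.6395),
        ("wasm #2", 1.2078, 1.3290),
        ("lua #1", 0.6317, 0.7012),
        ("lua #2", 1.2078, 1.4593),
    ];
    rows.iter()
        .map(|(label, base, measured)| {
            format!("{}: {:.1}%", label, overhead_percent(*base, *measured))
        })
        .collect()
}
```

Evaluating overhead_report() gives 1.2% and 10.0% for WASM, and 11.0% and 20.8% for Lua.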
4 Outlook
4.1 How to use
From the perspective of technical implementation, this article has described how to implement and verify a WASM filter that propagates a user-defined header, in order to support non-intrusive full-link A/B testing.
However, for a service mesh user, implementing it step by step according to this article is cumbersome and error-prone.
The Alibaba Cloud Service Mesh ASM team is launching an ASM plug-in catalog mechanism. Users only need to select a plug-in from the catalog and provide a handful of key-value settings, such as the custom header, and the related EnvoyFilter+WASM+VirtualService+DestinationRule configurations are generated and deployed automatically.
4.2 How to expand
This example only demonstrates header-based match routing. If we want to match and route based on Query Params, how should it be extended?
This is a topic the ASM plug-in catalog is following closely, and the final plug-in catalog will provide best practices.
This article is the original content of Alibaba Cloud and may not be reproduced without permission.