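1. Distributed tracing with Envoy
1.1 A trace is built from three elements
- Span: the basic unit of work, representing a single call within a transaction or request.
  - It is identified by a unique 64-bit ID and carries a summary, timestamped events, key-value annotations (tags), the ID of the span that caused it, a process ID (usually an IP address), and other optional metadata.
  - Each span knows its parent span and the trace it belongs to.
- Trace tree: a tree structure made up of a series of related spans.
- Annotation: records the occurrence of an event at a point in time; core annotations mark the start and end of a request.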
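The relationship between spans and the trace tree can be sketched in a few lines of Python. This is only an illustration of the data model described above, not how Envoy or Zipkin store spans; the `Span` class, `new_id`, and `build_tree` are hypothetical names introduced for this example.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def new_id() -> str:
    """Return a random 64-bit identifier as 16 hex characters."""
    return secrets.token_hex(8)


@dataclass
class Span:
    trace_id: str                      # trace this span belongs to
    name: str                          # short description of the call
    span_id: str = field(default_factory=new_id)
    parent_id: Optional[str] = None    # None marks the root span
    tags: Dict[str, str] = field(default_factory=dict)
    annotations: List[tuple] = field(default_factory=list)  # (timestamp_us, event)

    def child(self, name: str) -> "Span":
        """Create a child span inside the same trace."""
        return Span(trace_id=self.trace_id, name=name, parent_id=self.span_id)


def build_tree(spans: List[Span]) -> Dict[Optional[str], List[Span]]:
    """Group spans by parent ID; the None key holds the root span(s)."""
    tree: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        tree.setdefault(span.parent_id, []).append(span)
    return tree


# Mirror the call chain used later in this document.
root = Span(trace_id=new_id(), name="front-envoy")
a = root.child("service_a")
b = a.child("service_b")
c = a.child("service_c")
tree = build_tree([root, a, b, c])
print([s.name for s in tree[a.span_id]])
```

Every span carries the same trace ID, and the parent ID alone is enough to rebuild the tree.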
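2. Zipkin
Zipkin is an open-source distributed tracing system built by Twitter on the basis of the Dapper paper. It collects timing data from distributed services in order to trace service call chains and analyze service latency.
Architecture components
- collector: the daemon that receives and collects trace data
- storage: the storage component
- API (search): the query API daemon
- web UI: the user interface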
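As a rough illustration of what the collector receives, the sketch below builds a span in the Zipkin v2 JSON shape that would be POSTed as a JSON array to /api/v2/spans. The helper name `zipkin_span` and the sample duration are assumptions made for this example; the trace ID is borrowed from the curl output later in this document, and the actual HTTP call is left out.

```python
import json
import time
from typing import Dict, Optional


def zipkin_span(trace_id: str, span_id: str, name: str, service: str,
                duration_us: int, parent_id: Optional[str] = None,
                tags: Optional[Dict[str, str]] = None) -> dict:
    """Build one span in the Zipkin v2 JSON shape."""
    span = {
        "traceId": trace_id,
        "id": span_id,
        "name": name,
        "timestamp": int(time.time() * 1_000_000),  # epoch microseconds
        "duration": duration_us,                    # microseconds
        "localEndpoint": {"serviceName": service},
        "tags": tags or {},
    }
    if parent_id is not None:
        span["parentId"] = parent_id  # omitted for the root span
    return span


# Illustrative values only: a root span for the front proxy.
payload = json.dumps([
    zipkin_span("d54c7a67486decbd", "d54c7a67486decbd",
                "checkAvailability", "front-envoy", duration_us=6000)
])
print(payload)
```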
3. Jaeger
Jaeger is an open-source distributed tracing system built by Uber. It is compatible with the OpenTracing API and with Zipkin.
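4. Envoy's distributed tracing support
Envoy relies on three features to provide system-wide tracing:
- Request ID generation: Envoy generates a UUID when needed and populates the x-request-id HTTP header; applications can forward this header for unified logging and tracing
- External trace service integration: Envoy supports pluggable external trace visualization services, including LightStep, Zipkin, and Zipkin-compatible backends such as Jaeger
- Client trace ID joining: the x-client-trace-id header can be used to join an untrusted request ID to the trusted internal x-request-id header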
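For spans from different hops to end up in the same trace, the application behind each sidecar has to copy the tracing headers from the inbound request onto its outbound calls. A minimal sketch of that step, assuming the Zipkin (B3) tracer used in the examples of this document; `headers_to_forward` is a hypothetical helper, not part of any library.

```python
from typing import Dict, Mapping

# Headers an application is expected to propagate when Envoy uses the Zipkin tracer.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
)


def headers_to_forward(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Pick the tracing headers of an inbound request for reuse on outbound calls."""
    lowered = {key.lower(): value for key, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


inbound = {
    "X-Request-Id": "c7d5f29f-3740-9840-924b-af0f6847e9b4",
    "X-B3-TraceId": "d54c7a67486decbd",
    "Accept": "*/*",
}
print(headers_to_forward(inbound))
```

Header names are matched case-insensitively, and unrelated headers such as Accept are dropped.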
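The HTTP connection manager that handles the request must have a tracing object configured. A trace can be initiated in several ways:
- By an external client, via the x-client-trace-id header
- By an internal service, via the x-envoy-force-trace header
- By random sampling, via the random_sampling runtime setting
The router filter can create a child span for egress calls when start_child_span is enabled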
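The three ways of initiating a trace can be pictured as a simple decision function. The sketch below is a deliberately simplified model, not Envoy's implementation: it assumes lower-cased header names and ignores the client and overall sampling percentages. `trace_reason` and its parameters are hypothetical names.

```python
import random
from typing import Callable, Mapping, Optional


def trace_reason(headers: Mapping[str, str],
                 random_sampling_percent: float,
                 rng: Callable[[], float] = random.random) -> Optional[str]:
    """Return why a request would be traced, or None if it would not be."""
    if "x-client-trace-id" in headers:
        return "client"    # requested by an external client
    if "x-envoy-force-trace" in headers:
        return "forced"    # requested by an internal service
    if rng() * 100 < random_sampling_percent:
        return "sampled"   # picked by random sampling
    return None


print(trace_reason({"x-client-trace-id": "abc"}, random_sampling_percent=0))
print(trace_reason({}, random_sampling_percent=100))
```

Passing `rng` explicitly keeps the random branch deterministic in tests.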
5. Zipkin example
5.1 docker-compose
The setup has 8 services:
- front-envoy: the front proxy, at 172.31.85.10
- 6 backend services
  - service_a_envoy and service_a: the service_a cluster in Envoy; service_a calls service_b and service_c
  - service_b_envoy and service_b: the service_b cluster in Envoy
  - service_c_envoy and service_c: the service_c cluster in Envoy
- zipkin: the Zipkin service
version: '3.3'
services:
  front-envoy:
    image: envoyproxy/envoy-alpine:v1.21-latest
    environment:
      - ENVOY_UID=0
      - ENVOY_GID=0
    volumes:
      - "./front_envoy/envoy-config.yaml:/etc/envoy/envoy.yaml"
    networks:
      envoymesh:
        ipv4_address: 172.31.85.10
        aliases:
          - front-envoy
          - front
    ports:
      - 8088:80
      - 9901:9901
  service_a_envoy:
    image: envoyproxy/envoy-alpine:v1.20.0
    volumes:
      - "./service_a/envoy-config.yaml:/etc/envoy/envoy.yaml"
    networks:
      envoymesh:
        aliases:
          - service_a_envoy
          - service-a-envoy
    ports:
      - 8786
      - 8788
      - 8791
  service_a:
    build: service_a/
    network_mode: "service:service_a_envoy"
    #ports:
    #- 8081
    depends_on:
      - service_a_envoy
  service_b_envoy:
    image: envoyproxy/envoy-alpine:v1.20.0
    volumes:
      - "./service_b/envoy-config.yaml:/etc/envoy/envoy.yaml"
    networks:
      envoymesh:
        aliases:
          - service_b_envoy
          - service-b-envoy
    ports:
      - 8789
  service_b:
    build: service_b/
    network_mode: "service:service_b_envoy"
    #ports:
    #- 8082
    depends_on:
      - service_b_envoy
  service_c_envoy:
    image: envoyproxy/envoy-alpine:v1.20.0
    volumes:
      - "./service_c/envoy-config.yaml:/etc/envoy/envoy.yaml"
    networks:
      envoymesh:
        aliases:
          - service_c_envoy
          - service-c-envoy
    ports:
      - 8790
  service_c:
    build: service_c/
    network_mode: "service:service_c_envoy"
    #ports:
    #- 8083
    depends_on:
      - service_c_envoy
  zipkin:
    image: openzipkin/zipkin:2
    networks:
      envoymesh:
        ipv4_address: 172.31.85.15
        aliases:
          - zipkin
    ports:
      - "9411:9411"
networks:
  envoymesh:
    driver: bridge
    ipam:
      config:
        - subnet: 172.31.85.0/24
5.2 envoy.yaml
5.2.1 front envoy
The front envoy forwards all traffic to service_a and adds the x-b3-traceid and x-request-id headers to the response
node:
  id: front-envoy
  cluster: front-envoy
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
layered_runtime:
  layers:
  - name: admin
    admin_layer: {}
static_resources:
  listeners:
  - name: http_listener-service_a
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    traffic_direction: OUTBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          generate_request_id: true
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: zipkin
                collector_endpoint: "/api/v2/spans"
                collector_endpoint_version: HTTP_JSON
          codec_type: AUTO
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service_a
                decorator:
                  operation: checkAvailability
              response_headers_to_add:
              - header:
                  key: "x-b3-traceid"
                  value: "%REQ(x-b3-traceid)%"
              - header:
                  key: "x-request-id"
                  value: "%REQ(x-request-id)%"
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: zipkin
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: zipkin
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: zipkin
                port_value: 9411
  - name: service_a
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: service_a
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: service_a_envoy
                port_value: 8786
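5.2.2 service_a
Ingress: inbound traffic on port 8786 is handed to the local_service cluster (service_a at 127.0.0.1:8081)
Egress: outbound traffic is directed to service_b (listener port 8788) and service_c (listener port 8791)
collector_cluster points at the zipkin cluster, so collected spans are reported to Zipkin on port 9411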
node:
  id: service-a
  cluster: service-a
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
layered_runtime:
  layers:
  - name: admin
    admin_layer: {}
static_resources:
  listeners:
  - name: service-a-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8786
    traffic_direction: INBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: zipkin
                collector_endpoint: "/api/v2/spans"
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-a-svc-http-route
            virtual_hosts:
            - name: service-a-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: local_service
                decorator:
                  operation: checkAvailability
          http_filters:
          - name: envoy.filters.http.router
  - name: service-b-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8788
    traffic_direction: OUTBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: egress_http_to_service_b
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: zipkin
                collector_endpoint: "/api/v2/spans"
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-b-svc-http-route
            virtual_hosts:
            - name: service-b-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service_b
                decorator:
                  operation: checkStock
          http_filters:
          - name: envoy.filters.http.router
  - name: service-c-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8791
    traffic_direction: OUTBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: egress_http_to_service_c
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: zipkin
                collector_endpoint: "/api/v2/spans"
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-c-svc-http-route
            virtual_hosts:
            - name: service-c-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service_c
                decorator:
                  operation: checkStock
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: zipkin
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: zipkin
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: zipkin
                port_value: 9411
  - name: local_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: local_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8081
  - name: service_b
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: service_b
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: service_b_envoy
                port_value: 8789
  - name: service_c
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: service_c
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: service_c_envoy
                port_value: 8790
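5.2.3 service_b
Ingress: inbound traffic on port 8789 is handed to the local_service cluster (service_b at 127.0.0.1:8082); the fault filter aborts 15% of requests with a 503
collector_cluster points at the zipkin cluster, so collected spans are reported to Zipkin on port 9411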
node:
  id: service-b
  cluster: service-b
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
layered_runtime:
  layers:
  - name: admin
    admin_layer: {}
static_resources:
  listeners:
  - name: service-b-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8789
    traffic_direction: INBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: zipkin
                collector_endpoint: "/api/v2/spans"
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-b-svc-http-route
            virtual_hosts:
            - name: service-b-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: local_service
                decorator:
                  operation: checkAvailability
          http_filters:
          - name: envoy.filters.http.fault
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault
              max_active_faults: 100
              abort:
                http_status: 503
                percentage:
                  numerator: 15
                  denominator: HUNDRED
          - name: envoy.filters.http.router
            typed_config: {}
  clusters:
  - name: zipkin
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: zipkin
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: zipkin
                port_value: 9411
  - name: local_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: local_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8082
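5.2.4 service_c
Ingress: inbound traffic on port 8790 is handed to the local_service cluster (service_c at 127.0.0.1:8083); the fault filter delays 10% of requests by 3 seconds
collector_cluster points at the zipkin cluster, so collected spans are reported to Zipkin on port 9411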
node:
  id: service-c
  cluster: service-c
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
layered_runtime:
  layers:
  - name: admin
    admin_layer: {}
static_resources:
  listeners:
  - name: service-c-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8790
    traffic_direction: INBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: zipkin
                collector_endpoint: "/api/v2/spans"
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-c-svc-http-route
            virtual_hosts:
            - name: service-c-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: local_service
                decorator:
                  operation: checkAvailability
          http_filters:
          - name: envoy.filters.http.fault
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault
              max_active_faults: 100
              delay:
                fixed_delay: 3s
                percentage:
                  numerator: 10
                  denominator: HUNDRED
          - name: envoy.filters.http.router
            typed_config: {}
  clusters:
  - name: zipkin
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: zipkin
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: zipkin
                port_value: 9411
  - name: local_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: local_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8083
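5.3 Testing
After startup, send requests to 172.31.85.10. The front envoy forwards the traffic to the ingress listener of service_a, whose egress listeners then forward the calls to service_b and service_c
service_b returns a 503 with 15% probability
service_c adds a 3-second delay with 10% probability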
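How often a single request should show a fault follows from the two percentages. The short calculation below assumes the two fault filters decide independently and that every request reaches both service_b and service_c; `fault_odds` is a hypothetical helper written for this example.

```python
from typing import Dict


def fault_odds(abort_p: float, delay_p: float) -> Dict[str, float]:
    """Combine two independent fault probabilities for a single request."""
    clean = (1 - abort_p) * (1 - delay_p)
    return {
        "clean": clean,             # neither fault fires
        "any_fault": 1 - clean,     # abort, delay, or both
        "both": abort_p * delay_p,  # abort and delay on the same request
    }


odds = fault_odds(abort_p=0.15, delay_p=0.10)
for name, value in odds.items():
    print(f"{name}: {value:.1%}")
```

Under these assumptions roughly three out of four requests come back clean, and about 1.5% hit the abort and the delay at the same time.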
root@k8s-node-1:/apps/envoy/servicemesh_in_practise/Monitoring-and-Tracing/zipkin-tracing# curl -vv 172.31.85.10
* Rebuilt URL to: 172.31.85.10/
* Trying 172.31.85.10...
* TCP_NODELAY set
* Connected to 172.31.85.10 (172.31.85.10) port 80 (#0)
> GET / HTTP/1.1
> Host: 172.31.85.10
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< date: Sun, 09 Oct 2022 07:04:24 GMT
< content-length: 85
< content-type: text/plain; charset=utf-8
< x-envoy-upstream-service-time: 6
< server: envoy
< x-b3-traceid: d54c7a67486decbd
< x-request-id: c7d5f29f-3740-9840-924b-af0f6847e9b4
<
Calling Service B: Hello from service B.
Hello from service A.
Hello from service C. # a delay was injected here
* Connection #0 to host 172.31.85.10 left intact
root@k8s-node-1:/apps/envoy/servicemesh_in_practise/Monitoring-and-Tracing/zipkin-tracing# curl -vv 172.31.85.10
* Rebuilt URL to: 172.31.85.10/
* Trying 172.31.85.10...
* TCP_NODELAY set
* Connected to 172.31.85.10 (172.31.85.10) port 80 (#0)
> GET / HTTP/1.1
> Host: 172.31.85.10
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< date: Sun, 09 Oct 2022 07:04:25 GMT
< content-length: 85
< content-type: text/plain; charset=utf-8
< x-envoy-upstream-service-time: 3
< server: envoy
< x-b3-traceid: b3a61e65ba0ccefc
< x-request-id: 544ff03c-c957-9e0b-9872-433064f10ba1
<
Calling Service B: Hello from service B.
Hello from service A.
Hello from service C.
* Connection #0 to host 172.31.85.10 left intact
root@k8s-node-1:/apps/envoy/servicemesh_in_practise/Monitoring-and-Tracing/zipkin-tracing# curl -vv 172.31.85.10
* Rebuilt URL to: 172.31.85.10/
* Trying 172.31.85.10...
* TCP_NODELAY set
* Connected to 172.31.85.10 (172.31.85.10) port 80 (#0)
> GET / HTTP/1.1
> Host: 172.31.85.10
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< date: Sun, 09 Oct 2022 07:04:26 GMT
< content-length: 81
< content-type: text/plain; charset=utf-8
< x-envoy-upstream-service-time: 5
< server: envoy
< x-b3-traceid: 420f2f6de10ab309
< x-request-id: ea487228-658e-9147-8ba5-e19fa25701e1
<
Calling Service B: fault filter abortHello from service A.
Hello from service C. # a 503 was injected here
* Connection #0 to host 172.31.85.10 left intact
Trace with the injected 503 error
Trace with the injected delay
Normal trace
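To find one of these requests in the tracing UI, the trace ID has to be taken from the response headers. The sketch below pulls the headers out of saved `curl -v` output; it assumes the verbose output was captured as text (curl writes it to stderr, so something like `2>&1` is needed), and `response_headers` is a hypothetical helper.

```python
import re
from typing import Dict

# Response header lines in curl -v output look like "< name: value".
HEADER_LINE = re.compile(r"^<\s*([A-Za-z0-9-]+):\s*(.*?)\s*$")


def response_headers(curl_verbose: str) -> Dict[str, str]:
    """Collect response headers from curl -v output, keyed by lower-cased name."""
    headers: Dict[str, str] = {}
    for line in curl_verbose.splitlines():
        match = HEADER_LINE.match(line)
        if match:
            headers[match.group(1).lower()] = match.group(2)
    return headers


sample = """< HTTP/1.1 200 OK
< x-envoy-upstream-service-time: 5
< server: envoy
< x-b3-traceid: 420f2f6de10ab309
< x-request-id: ea487228-658e-9147-8ba5-e19fa25701e1
<
"""
found = response_headers(sample)
print(found["x-b3-traceid"])  # prints 420f2f6de10ab309
```

The status line and the empty `<` line are skipped because they do not have the `name: value` shape.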
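6. Jaeger example
6.1 docker-compose
The setup has 8 services:
- front-envoy: the front proxy, at 172.31.88.10
- 6 backend services
  - service_a_envoy and service_a: the service_a cluster in Envoy; service_a calls service_b and service_c
  - service_b_envoy and service_b: the service_b cluster in Envoy
  - service_c_envoy and service_c: the service_c cluster in Envoy
- jaeger: the Jaeger all-in-one service, also reachable under the network alias zipkin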
version: '3.3'
services:
  front-envoy:
    image: envoyproxy/envoy-alpine:v1.21-latest
    environment:
      - ENVOY_UID=0
      - ENVOY_GID=0
    volumes:
      - "./front_envoy/envoy-config.yaml:/etc/envoy/envoy.yaml"
    networks:
      envoymesh:
        ipv4_address: 172.31.88.10
        aliases:
          - front-envoy
          - front
    ports:
      - 8080:80
      - 9901:9901
  service_a_envoy:
    image: envoyproxy/envoy-alpine:v1.20.0
    volumes:
      - "./service_a/envoy-config.yaml:/etc/envoy/envoy.yaml"
    networks:
      envoymesh:
        aliases:
          - service_a_envoy
          - service-a-envoy
    ports:
      - 8786
      - 8788
      - 8791
  service_a:
    build: service_a/
    network_mode: "service:service_a_envoy"
    #ports:
    #- 8081
    depends_on:
      - service_a_envoy
  service_b_envoy:
    image: envoyproxy/envoy-alpine:v1.20.0
    volumes:
      - "./service_b/envoy-config.yaml:/etc/envoy/envoy.yaml"
    networks:
      envoymesh:
        aliases:
          - service_b_envoy
          - service-b-envoy
    ports:
      - 8789
  service_b:
    build: service_b/
    network_mode: "service:service_b_envoy"
    #ports:
    #- 8082
    depends_on:
      - service_b_envoy
  service_c_envoy:
    image: envoyproxy/envoy-alpine:v1.20.0
    volumes:
      - "./service_c/envoy-config.yaml:/etc/envoy/envoy.yaml"
    networks:
      envoymesh:
        aliases:
          - service_c_envoy
          - service-c-envoy
    ports:
      - 8790
  service_c:
    build: service_c/
    network_mode: "service:service_c_envoy"
    #ports:
    #- 8083
    depends_on:
      - service_c_envoy
  jaeger:
    image: jaegertracing/all-in-one:1.27
    environment:
      - COLLECTOR_ZIPKIN_HOST_PORT=9411
    networks:
      envoymesh:
        ipv4_address: 172.31.88.15
        aliases:
          - zipkin
    ports:
      - "9411:9411"
      - "16686:16686"
networks:
  envoymesh:
    driver: bridge
    ipam:
      config:
        - subnet: 172.31.88.0/24
6.2 envoy.yaml
6.2.1 front envoy
The front envoy forwards all traffic to service_a and adds the x-b3-traceid and x-request-id headers to the response
node:
  id: front-envoy
  cluster: front-envoy
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
layered_runtime:
  layers:
  - name: admin
    admin_layer: {}
static_resources:
  listeners:
  - name: http_listener-service_a
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    traffic_direction: OUTBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          generate_request_id: true
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: jaeger
                collector_endpoint: "/api/v2/spans"
                shared_span_context: false
                collector_endpoint_version: HTTP_JSON
          codec_type: AUTO
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service_a
                decorator:
                  operation: checkAvailability
              response_headers_to_add:
              - header:
                  key: "x-b3-traceid"
                  value: "%REQ(x-b3-traceid)%"
              - header:
                  key: "x-request-id"
                  value: "%REQ(x-request-id)%"
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: jaeger
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: jaeger
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: jaeger
                port_value: 9411
  - name: service_a
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: service_a
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: service_a_envoy
                port_value: 8786
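6.2.2 service_a
Ingress: inbound traffic on port 8786 is handed to the local_service cluster (service_a at 127.0.0.1:8081)
Egress: outbound traffic is directed to service_b (listener port 8788) and service_c (listener port 8791)
collector_cluster points at the jaeger cluster, so collected spans are reported to Jaeger's Zipkin-compatible collector on port 9411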
node:
  id: service-a
  cluster: service-a
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
layered_runtime:
  layers:
  - name: admin
    admin_layer: {}
static_resources:
  listeners:
  - name: service-a-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8786
    traffic_direction: INBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: jaeger
                collector_endpoint: "/api/v2/spans"
                shared_span_context: false
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-a-svc-http-route
            virtual_hosts:
            - name: service-a-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: local_service
                decorator:
                  operation: checkAvailability
          http_filters:
          - name: envoy.filters.http.router
  - name: service-b-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8788
    traffic_direction: OUTBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: egress_http_to_service_b
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: jaeger
                collector_endpoint: "/api/v2/spans"
                shared_span_context: false
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-b-svc-http-route
            virtual_hosts:
            - name: service-b-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service_b
                decorator:
                  operation: checkStock
          http_filters:
          - name: envoy.filters.http.router
  - name: service-c-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8791
    traffic_direction: OUTBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: egress_http_to_service_c
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: jaeger
                collector_endpoint: "/api/v2/spans"
                shared_span_context: false
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-c-svc-http-route
            virtual_hosts:
            - name: service-c-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service_c
                decorator:
                  operation: checkStock
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: jaeger
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: jaeger
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: jaeger
                port_value: 9411
  - name: local_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: local_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8081
  - name: service_b
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: service_b
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: service_b_envoy
                port_value: 8789
  - name: service_c
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: service_c
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: service_c_envoy
                port_value: 8790
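6.2.3 service_b
Ingress: inbound traffic on port 8789 is handed to the local_service cluster (service_b at 127.0.0.1:8082); the fault filter aborts 15% of requests with a 503
collector_cluster points at the jaeger cluster, so collected spans are reported to Jaeger's Zipkin-compatible collector on port 9411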
node:
  id: service-b
  cluster: service-b
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
layered_runtime:
  layers:
  - name: admin
    admin_layer: {}
static_resources:
  listeners:
  - name: service-b-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8789
    traffic_direction: INBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: jaeger
                collector_endpoint: "/api/v2/spans"
                shared_span_context: false
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-b-svc-http-route
            virtual_hosts:
            - name: service-b-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: local_service
                decorator:
                  operation: checkAvailability
          http_filters:
          - name: envoy.filters.http.fault
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault
              max_active_faults: 100
              abort:
                http_status: 503
                percentage:
                  numerator: 15
                  denominator: HUNDRED
          - name: envoy.filters.http.router
            typed_config: {}
  clusters:
  - name: jaeger
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: jaeger
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: jaeger
                port_value: 9411
  - name: local_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: local_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8082
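6.2.4 service_c
Ingress: inbound traffic on port 8790 is handed to the local_service cluster (service_c at 127.0.0.1:8083); the fault filter delays 10% of requests by 3 seconds
collector_cluster points at the jaeger cluster, so collected spans are reported to Jaeger's Zipkin-compatible collector on port 9411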
node:
  id: service-c
  cluster: service-c
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901
layered_runtime:
  layers:
  - name: admin
    admin_layer: {}
static_resources:
  listeners:
  - name: service-c-svc-http-listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8790
    traffic_direction: INBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          tracing:
            provider:
              name: envoy.tracers.zipkin
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
                collector_cluster: jaeger
                collector_endpoint: "/api/v2/spans"
                shared_span_context: false
                collector_endpoint_version: HTTP_JSON
          route_config:
            name: service-c-svc-http-route
            virtual_hosts:
            - name: service-c-svc-http-route
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: local_service
                decorator:
                  operation: checkAvailability
          http_filters:
          - name: envoy.filters.http.fault
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault
              max_active_faults: 100
              delay:
                fixed_delay: 3s
                percentage:
                  numerator: 10
                  denominator: HUNDRED
          - name: envoy.filters.http.router
            typed_config: {}
  clusters:
  - name: jaeger
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: jaeger
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: jaeger
                port_value: 9411
  - name: local_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: local_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8083
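6.3 Testing
After startup, send requests to 172.31.88.10. The front envoy forwards the traffic to the ingress listener of service_a, whose egress listeners then forward the calls to service_b and service_c
service_b returns a 503 with 15% probability
service_c adds a 3-second delay with 10% probability
Normal case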
# curl -vv 172.31.88.10
* Rebuilt URL to: 172.31.88.10/
* Trying 172.31.88.10...
* TCP_NODELAY set
* Connected to 172.31.88.10 (172.31.88.10) port 80 (#0)
> GET / HTTP/1.1
> Host: 172.31.88.10
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< date: Mon, 10 Oct 2022 01:03:48 GMT
< content-length: 85
< content-type: text/plain; charset=utf-8
< x-envoy-upstream-service-time: 2
< server: envoy
< x-b3-traceid: 955f82cfe41e8b8d
< x-request-id: 4e67dfcb-bd79-9565-9da3-ca66d975d56d
<
Calling Service B: Hello from service B.
Hello from service A.
Hello from service C.
* Connection #0 to host 172.31.88.10 left intact
This request happened to hit both the delay and the abort
# curl -vv 172.31.88.10
* Rebuilt URL to: 172.31.88.10/
* Trying 172.31.88.10...
* TCP_NODELAY set
* Connected to 172.31.88.10 (172.31.88.10) port 80 (#0)
> GET / HTTP/1.1
> Host: 172.31.88.10
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< date: Mon, 10 Oct 2022 01:03:51 GMT
< content-length: 81
< content-type: text/plain; charset=utf-8
< x-envoy-upstream-service-time: 3006
< server: envoy
< x-b3-traceid: 4d80e0408e190447
< x-request-id: a58608ca-2662-9696-96c6-08b48a161baa
<
Calling Service B: fault filter abortHello from service A.
Hello from service C.
* Connection #0 to host 172.31.88.10 left intact
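The two responses above can be told apart mechanically. The sketch below is a heuristic based only on what this example prints: the fault filter's "fault filter abort" text in the body and the 3-second fixed delay showing up in x-envoy-upstream-service-time. `injected_faults` and the two constants are hypothetical names for this illustration.

```python
from typing import Set

DELAY_MS = 3000                      # fixed_delay configured for service_c
ABORT_MARKER = "fault filter abort"  # body text returned when the abort fires


def injected_faults(body: str, upstream_ms: int) -> Set[str]:
    """Guess which injected faults a response ran into."""
    faults: Set[str] = set()
    if ABORT_MARKER in body:
        faults.add("abort")
    if upstream_ms >= DELAY_MS:
        faults.add("delay")
    return faults


normal = "Calling Service B: Hello from service B.\nHello from service A.\nHello from service C."
faulty = "Calling Service B: fault filter abortHello from service A.\nHello from service C."

print(sorted(injected_faults(normal, 2)))     # prints []
print(sorted(injected_faults(faulty, 3006)))  # prints ['abort', 'delay']
```

The upstream times 2 and 3006 are the x-envoy-upstream-service-time values from the two curl runs shown above.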