Circuit Breaking for Traffic

Circuit Breaking

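Circuit breaking is an important pattern for building resilient microservice applications. It lets an application limit the impact of failures, latency spikes, and other unpredictable network effects.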

Deploy the httpbin service:

The httpbin application serves as the backend service.

cat samples/httpbin/httpbin.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80

kubectl apply -f samples/httpbin/httpbin.yaml

kubectl get pod httpbin-66cdbdb6c5-9w4p6

NAME                       READY   STATUS    RESTARTS   AGE
httpbin-66cdbdb6c5-9w4p6   2/2     Running   0          5m23s

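Configure the circuit breaker:

Create a DestinationRule that applies circuit-breaking settings to calls to the httpbin service.

If mutual TLS authentication is enabled in Istio, you must add the TLS traffic policy mode: ISTIO_MUTUAL to the DestinationRule before applying it. Otherwise requests will fail with 503 errors.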
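
As a rough illustration of the mutual TLS note above, the sketch below writes a variant of the rule with a tls block under trafficPolicy. The output path and file name are assumptions made for this example, and the file is only written, not applied.

```shell
# Sketch: the httpbin DestinationRule with client-side mutual TLS added.
# The output path is a hypothetical name chosen for this example.
out="${TMPDIR:-/tmp}/destination-rule-httpbin-mtls.yaml"

cat > "$out" <<'EOF'
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL   # sidecars use Istio-issued certificates for mutual TLS
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100
EOF

echo "wrote $out"
# When mutual TLS is enabled, apply this file instead of the plain rule:
# kubectl apply -f "$out"
```

Only the tls block differs from the rule used in the rest of this walkthrough; the circuit-breaking settings are unchanged.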

mkdir samples/httpbin/networking

vim samples/httpbin/networking/destination-rule-httpbin.yaml

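apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1              # at most 1 HTTP1/TCP connection to the destination host
      http:
        http1MaxPendingRequests: 1     # at most 1 request queued while waiting for a connection
        maxRequestsPerConnection: 1    # 1 request per connection, which disables keep-alive
    outlierDetection:
      consecutiveErrors: 1             # eject a host after 1 error
      interval: 1s                     # time between ejection sweep analyses
      baseEjectionTime: 3m             # minimum ejection duration
      maxEjectionPercent: 100          # up to 100% of the hosts may be ejected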

kubectl apply -f samples/httpbin/networking/destination-rule-httpbin.yaml

kubectl get dr httpbin

NAME      HOST      AGE
httpbin   httpbin   3m42s

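Create the client:

Add a client that sends traffic to the httpbin service. The client is fortio, a load-testing tool that lets you control the number of connections, the concurrency, and the delay of outgoing HTTP calls. With fortio you can reliably trigger the circuit-breaking policy set in the DestinationRule above.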

cat samples/httpbin/sample-client/fortio-deploy.yaml

apiVersion: v1
kind: Service
metadata:
  name: fortio
  labels:
    app: fortio
    service: fortio
spec:
  ports:
  - port: 8080
    name: http
  selector:
    app: fortio
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fortio-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fortio
  template:
    metadata:
      annotations:
        sidecar.istio.io/statsInclusionPrefixes: cluster.outbound,cluster_manager,listener_manager,http_mixer_filter,tcp_mixer_filter,server,cluster.xds-grpc
      labels:
        app: fortio
    spec:
      containers:
      - name: fortio
        image: fortio/fortio:latest_release
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: http-fortio
        - containerPort: 8079
          name: grpc-ping

kubectl apply -f samples/httpbin/sample-client/fortio-deploy.yaml

Run fortio inside the client Pod to call the httpbin service. The -curl flag tells fortio to make a single call:

FORTIO_POD=$(kubectl get pod | grep fortio | awk '{ print $1 }')

kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -curl  http://httpbin:8000/get

HTTP/1.1 200 OK
server: envoy
date: Mon, 01 Mar 2021 02:32:23 GMT
content-type: application/json
content-length: 622
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 55

{
  "args": {},
  "headers": {
    "Content-Length": "0",
    "Host": "httpbin:8000",
    "User-Agent": "fortio.org/fortio-1.11.3",
    "X-B3-Parentspanid": "782cd308639c0f00",
    "X-B3-Sampled": "0",
    "X-B3-Spanid": "adbda1f9940d1821",
    "X-B3-Traceid": "4ee996565afd9133782cd308639c0f00",
    "X-Envoy-Attempt-Count": "1",
    "X-Forwarded-Client-Cert": "By=spiffe://cluster.local/ns/default/sa/httpbin;Hash=44b5c92b9b7af426d81bc6e05b6c9a4819037d54acfdf890c5220619a0c0a869;Subject=\"\";URI=spiffe://cluster.local/ns/default/sa/default"
  },
  "origin": "127.0.0.1",
  "url": "http://httpbin:8000/get"
}

The request to the backend service succeeded. Next, test circuit breaking.
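
The FORTIO_POD lookup above filters Pod names with grep, which can return several lines if more than one name contains "fortio". A label-based lookup is one possible alternative. This is a sketch that assumes the app: fortio label from fortio-deploy.yaml; fortio_pod is a hypothetical helper name.

```shell
# Sketch: look up the fortio Pod by label instead of grepping Pod names.
# fortio_pod is a hypothetical helper; app=fortio is the label set in fortio-deploy.yaml.
fortio_pod() {
  kubectl get pods -l app=fortio -o 'jsonpath={.items[0].metadata.name}'
}

# Usage, with kubectl pointed at the cluster and namespace used in this walkthrough:
# FORTIO_POD=$(fortio_pod)
```

The jsonpath expression returns only the first matching Pod, so the variable holds a single name.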

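Trip the circuit breaker:

The DestinationRule sets maxConnections: 1 and http1MaxPendingRequests: 1. These rules mean that once more than one connection and one request are in flight at the same time, istio-proxy opens the circuit and rejects further requests and connections.

Call the service with two concurrent connections (-c 2) and send 20 requests (-n 20):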

kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -c 2 -qps 0 -n 20 -loglevel Warning http://httpbin:8000/get

02:42:27 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 0 queries per second, 2->2 procs, for 20 calls: http://httpbin:8000/get
Starting at max qps with 2 thread(s) [gomax 2] for exactly 20 calls (10 per thread + 0)
02:42:27 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
Ended after 72.128739ms : 20 calls. qps=277.28
Aggregated Function Time : count 20 avg 0.0064766836 +/- 0.003552 min 0.003330814 max 0.015536857 sum 0.129533672
# range, mid point, percentile, count
>= 0.00333081 <= 0.004 , 0.00366541 , 30.00, 6
> 0.004 <= 0.005 , 0.0045 , 50.00, 4
> 0.005 <= 0.006 , 0.0055 , 65.00, 3
> 0.006 <= 0.007 , 0.0065 , 75.00, 2
> 0.007 <= 0.008 , 0.0075 , 80.00, 1
> 0.008 <= 0.009 , 0.0085 , 85.00, 1
> 0.012 <= 0.014 , 0.013 , 95.00, 2
> 0.014 <= 0.0155369 , 0.0147684 , 100.00, 1
# target 50% 0.005
# target 75% 0.007
# target 90% 0.013
# target 99% 0.0152295
# target 99.9% 0.0155061
Sockets used: 3 (for perfect keepalive, would be 2)
Jitter: false
Code 200 : 19 (95.0 %)
Code 503 : 1 (5.0 %)
Response Header Sizes : count 20 avg 218.55 +/- 50.14 min 0 max 231 sum 4371
Response Body/Total Sizes : count 20 avg 821.5 +/- 133.2 min 241 max 853 sum 16430
All done 20 calls (plus 0 warmup) 6.477 ms avg, 277.3 qps

Almost all requests completed (19 of 20). istio-proxy allows for some leeway.

Raise the number of concurrent connections to 3 (-c 3) and send 30 requests (-n 30):

kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -c 3 -qps 0 -n 30 -loglevel Warning http://httpbin:8000/get

02:46:45 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 0 queries per second, 2->2 procs, for 30 calls: http://httpbin:8000/get
Starting at max qps with 3 thread(s) [gomax 2] for exactly 30 calls (10 per thread + 0)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:46:45 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
Ended after 49.708473ms : 30 calls. qps=603.52
Aggregated Function Time : count 30 avg 0.0042163392 +/- 0.004712 min 0.000489901 max 0.014537558 sum 0.126490177
# range, mid point, percentile, count
>= 0.000489901 <= 0.001 , 0.000744951 , 36.67, 11
> 0.001 <= 0.002 , 0.0015 , 56.67, 6
> 0.002 <= 0.003 , 0.0025 , 66.67, 3
> 0.004 <= 0.005 , 0.0045 , 70.00, 1
> 0.006 <= 0.007 , 0.0065 , 73.33, 1
> 0.009 <= 0.01 , 0.0095 , 80.00, 2
> 0.01 <= 0.011 , 0.0105 , 86.67, 2
> 0.011 <= 0.012 , 0.0115 , 93.33, 2
> 0.014 <= 0.0145376 , 0.0142688 , 100.00, 2
# target 50% 0.00166667
# target 75% 0.00925
# target 90% 0.0115
# target 99% 0.0144569
# target 99.9% 0.0145295
Sockets used: 24 (for perfect keepalive, would be 3)
Jitter: false
Code 200 : 8 (26.7 %)
Code 503 : 22 (73.3 %)
Response Header Sizes : count 30 avg 61.433333 +/- 101.9 min 0 max 231 sum 1843
Response Body/Total Sizes : count 30 avg 404.03333 +/- 270.4 min 241 max 853 sum 12121
All done 30 calls (plus 0 warmup) 4.216 ms avg, 603.5 qps

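Now the expected circuit-breaking behavior appears: only 26.7% of the requests succeeded, and the rest were rejected by the circuit breaker: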

Code 200 : 8 (26.7 %)
Code 503 : 22 (73.3 %)
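
When comparing several runs, it can help to reduce a fortio report to its status-code totals. The sketch below does that with awk; summarize_codes is a hypothetical helper name, not part of fortio, and it only relies on the "Code NNN : count" lines shown above.

```shell
# Sketch: summarize the "Code NNN : count (pct %)" lines of a fortio report.
# summarize_codes is a hypothetical helper name, not part of fortio.
summarize_codes() {
  awk '
    # Fields of a summary line: "Code", status, ":", count, percentage.
    $1 == "Code" && $3 == ":" {
      total += $4
      if ($2 == "200") ok += $4
    }
    END {
      if (total == 0) { print "no Code lines found"; exit 1 }
      printf "total=%d ok=%d non_ok=%d success=%.1f%%\n", total, ok, total - ok, 100 * ok / total
    }
  '
}

# Example with the two summary lines from the run above.
summary=$(printf 'Code 200 : 8 (26.7 %%)\nCode 503 : 22 (73.3 %%)\n' | summarize_codes)
echo "$summary"
# prints: total=30 ok=8 non_ok=22 success=26.7%
```

To use it on a live run, the fortio output would be piped into summarize_codes, assuming the report is written to standard output and kubectl exec is called without -t.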

Query the istio-proxy stats for more detail on the circuit breaking:

kubectl exec $FORTIO_POD -c istio-proxy -- pilot-agent request GET stats | grep httpbin | grep pending

cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.default.rq_pending_open: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.high.rq_pending_open: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_overflow: 23
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_total: 28

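upstream_rq_pending_overflow is 23, which means that 23 calls so far have been flagged for circuit breaking. This matches the 503 responses above: 1 in the first run plus 22 in the second.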
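
To read a single counter from the stats output without scanning it by eye, a small filter is enough. This is a sketch: stat_value is a hypothetical helper, and it assumes the "name: value" line format shown above.

```shell
# Sketch: pull one counter out of Envoy stats text in "name: value" form.
# stat_value is a hypothetical helper; it reads the stats from standard input.
stat_value() {
  # $1 is the counter name, for example upstream_rq_pending_overflow.
  awk -v name="$1" -F': ' '
    {
      # Match lines whose stat name ends in ".<name>".
      suffix = "." name
      start = length($1) - length(suffix) + 1
      if (start >= 1 && substr($1, start) == suffix) { print $2; found = 1 }
    }
    END { exit (found ? 0 : 1) }
  '
}

# Example with two of the lines shown above.
stats='cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_overflow: 23
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_total: 28'

overflow=$(printf '%s\n' "$stats" | stat_value upstream_rq_pending_overflow)
total=$(printf '%s\n' "$stats" | stat_value upstream_rq_pending_total)
echo "overflow=$overflow pending_total=$total"
# prints: overflow=23 pending_total=28
```

Reading the counter before and after a load test and subtracting the two values gives the number of calls rejected by that run alone.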

Modify the circuit breaker:

Edit the DestinationRule to set maxConnections: 3 and http1MaxPendingRequests: 3.

vim samples/httpbin/networking/destination-rule-httpbin.yaml

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 3
      http:
        http1MaxPendingRequests: 3
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100

kubectl apply -f samples/httpbin/networking/destination-rule-httpbin.yaml

Rerun the load test with 3 concurrent connections (-c 3) and 30 requests (-n 30):

kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -c 3 -qps 0 -n 30 -loglevel Warning http://httpbin:8000/get

03:13:34 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 0 queries per second, 2->2 procs, for 30 calls: http://httpbin:8000/get
Starting at max qps with 3 thread(s) [gomax 2] for exactly 30 calls (10 per thread + 0)
Ended after 143.869637ms : 30 calls. qps=208.52
Aggregated Function Time : count 30 avg 0.014009803 +/- 0.01431 min 0.005262414 max 0.05379173 sum 0.42029408
# range, mid point, percentile, count
>= 0.00526241 <= 0.006 , 0.00563121 , 23.33, 7
> 0.006 <= 0.007 , 0.0065 , 40.00, 5
> 0.007 <= 0.008 , 0.0075 , 56.67, 5
> 0.008 <= 0.009 , 0.0085 , 63.33, 2
> 0.009 <= 0.01 , 0.0095 , 66.67, 1
> 0.01 <= 0.011 , 0.0105 , 73.33, 2
> 0.012 <= 0.014 , 0.013 , 80.00, 2
> 0.025 <= 0.03 , 0.0275 , 90.00, 3
> 0.05 <= 0.0537917 , 0.0518959 , 100.00, 3
# target 50% 0.0076
# target 75% 0.0125
# target 90% 0.03
# target 99% 0.0534126
# target 99.9% 0.0537538
Sockets used: 3 (for perfect keepalive, would be 3)
Jitter: false
Code 200 : 30 (100.0 %)
Response Header Sizes : count 30 avg 230.26667 +/- 0.4422 min 230 max 231 sum 6908
Response Body/Total Sizes : count 30 avg 852.26667 +/- 0.4422 min 852 max 853 sum 25568
All done 30 calls (plus 0 warmup) 14.010 ms avg, 208.5 qps

This time 100% of the requests succeeded; none was rejected by the circuit breaker:

Code 200 : 30 (100.0 %)
