Enhanced Kubernetes Observability: API Server Tracing Feature Arrives in Beta


In a distributed system, it can be difficult to figure out where a problem lies. Imagine one of the most common scenarios a Kubernetes cluster administrator faces: Pods fail to start. As the administrator, you first guess which component might be at fault and read its logs, only to find that the problem may originate in another component, whose logs you then have to read as well. That is the best case, where the logs actually provide clues. Sometimes the clues are far less obvious, and you end up relying on guesswork and spending a long time pinning the problem down. This requires the administrator to have a fairly comprehensive understanding of every component in the cluster, which makes both learning and troubleshooting expensive. With distributed tracing, by contrast, you can clearly see which component is misbehaving and quickly locate the fault.

Distributed systems often exhibit non-deterministic problems, or are too complex to reproduce locally. Tracing makes debugging and understanding distributed systems less daunting by breaking down what happens as a request flows through them. Distributed tracing is designed to help in exactly these situations, and the Kubernetes API server is perhaps the most important Kubernetes component to be able to debug.

In Kubernetes, the API server is the core component for managing and scheduling all cluster resources. It receives and processes requests from various clients and translates them into operations on the underlying resources. The stability and observability of the API server are therefore critical to the overall health of Kubernetes.

To improve the observability of the API server and help administrators better manage and maintain their clusters, Kubernetes introduced API Server Tracing. It adds trace instrumentation to the Kubernetes API server and exports the resulting spans to a backend collector. With this trace data, administrators can more easily follow the source and flow of a request and understand its processing time and outcome, making problems easier to find and fix. The same information can also be used for performance optimization and capacity planning.

Next, let's start exploring this feature.

01

Kubernetes API Server Tracing

Design details: KEP APIServer Tracing #647

Enabling

  • APIServerTracing feature gate (no longer required in v1.27+)

  • --tracing-config-file configuration file

Current status

Owning team: SIG Instrumentation

Release history: alpha in v1.22, beta in v1.27

Traced components: API server → etcd

02

Demo

Steps:

  1. Start Jaeger

  2. Start API server tracing (including etcd)

  3. View the traces in the Jaeger UI

Start the Jaeger container

Jaeger is a popular open source tool for distributed tracing and the seventh project to graduate from the Cloud Native Computing Foundation (CNCF), in October 2019. Here, Jaeger serves both as the backend that collects and stores trace data and as the UI for visualizing it. jaegertracing/all-in-one is an executable designed for quick local testing; it starts the Jaeger UI, collector, query service, and agent with an in-memory storage component.

docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 14250:14250 \
  -p 14268:14268 \
  -p 14269:14269 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.43

Details:  https://www.jaegertracing.io/docs/1.43/getting-started/
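Before moving on, it can be worth confirming that the container is up and the UI is reachable. A minimal sanity check, assuming the container above runs on the local machine and the port mappings were not changed:

docker ps --filter name=jaeger                                     # the jaeger container should show as "Up"
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:16686/   # expect 200 from the Jaeger UI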

Start Kubernetes API Server tracing

This article demonstrates two ways to try API server tracing. If you are a Kubernetes developer, you can test it directly with the Kubernetes integration tests; if you are a Kubernetes cluster administrator, you can configure the relevant parameters in your cluster.

Run the local Kubernetes integration test

Test file: test/integration/apiserver/tracing/tracing_test.go

Modify API Server tracing test code & configuration

The integration test code needs to be configured to send the trace data it collects to Jaeger.

#test/integration/apiserver/tracing/tracing_test.go:125

if err := os.WriteFile(tracingConfigFile.Name(), []byte(fmt.Sprintf(`
apiVersion: apiserver.config.k8s.io/v1beta1
kind: TracingConfiguration
samplingRatePerMillion: 1000000
endpoint: %s`, "0.0.0.0:4317")), os.FileMode(0755)); err != nil {
	t.Fatal(err)
}

Start etcd

Parameters that need to be configured:

--experimental-enable-distributed-tracing=true
--experimental-distributed-tracing-address=0.0.0.0:4317
--experimental-distributed-tracing-service-name=etcd

Modify the code:

#test/integration/framework/etcd.go:82
customFlags := []string{
	"--experimental-enable-distributed-tracing",
	"--experimental-distributed-tracing-address=0.0.0.0:4317",
	"--experimental-distributed-tracing-service-name=etcd",
}

currentURL, stop, err := RunCustomEtcd("integration_test_etcd_data", customFlags, output)

Run the test

cd ./test/integration/apiserver/tracing
go test -run TestAPIServerTracing
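Note that the Kubernetes integration tests run a real etcd, so an etcd binary must be available on your PATH. If the test fails because etcd cannot be found, a sketch of the usual setup, assuming you start from the root of a kubernetes/kubernetes checkout:

./hack/install-etcd.sh                          # downloads etcd into third_party/etcd
export PATH="$(pwd)/third_party/etcd:${PATH}"   # make the binary visible to the test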

Configure API Server tracing in a Kubernetes cluster

Here we use a Kubernetes cluster installed with kubeadm as an example.

Configure the feature gate APIServerTracing=true in the kube-apiserver.yaml manifest (versions 1.27 and above no longer need this feature gate).

Configure the tracing configuration file; here we save it as /etc/kubernetes/apitracing-config.yaml.

apiVersion: apiserver.config.k8s.io/v1beta1
kind: TracingConfiguration
endpoint: 10.6.9.3:4317
samplingRatePerMillion: 100000  # sampling rate; adjust to your needs

Then add the feature gate and the config file flag to the kube-apiserver manifest:

vim /etc/kubernetes/manifests/kube-apiserver.yaml

spec:
  containers:
  - command:
    - kube-apiserver
    - --feature-gates=APIServerTracing=true
    - --tracing-config-file=/etc/kubernetes/apitracing-config.yaml

Save and exit; the kubelet will automatically restart the API server. If your kube-apiserver static Pod does not already mount the directory containing the tracing configuration file, you may also need to add a hostPath volume mount for it.

Configure the following parameters in the etcd.yaml manifest:

vim /etc/kubernetes/manifests/etcd.yaml
spec:
  containers:
    - command:
        - etcd
        - --experimental-distributed-tracing-address=<JaegerIP:4317>
        - --experimental-distributed-tracing-service-name=etcd
        - --experimental-enable-distributed-tracing=true

Save and exit; the kubelet will automatically restart etcd.
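Optionally, you can confirm that both static Pods were restarted with the new flags before opening Jaeger. A minimal check; the Pod names include your control-plane node name, shown here as a placeholder:

kubectl -n kube-system get pod kube-apiserver-<control-plane-node> -o yaml | grep -E 'APIServerTracing|tracing-config-file'
kubectl -n kube-system get pod etcd-<control-plane-node> -o yaml | grep 'distributed-tracing'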

View Jaeger

At this point we can access the Jaeger UI at http://<JaegerIP>:16686/. In the Jaeger interface, we can clearly see the path a request takes.
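To generate a trace like the one described below, issue a simple request against the API server and then search for its service in the Jaeger UI. A minimal example; because sampling is probabilistic (samplingRatePerMillion above), you may need to repeat the request a few times before a trace appears:

kubectl get nodes    # issues a GET to /api/v1/nodes, which the API server traces and exports to Jaeger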

The cyan spans come from the API server: they cover a request to /api/v1/nodes and the gRPC Range RPC it issues to etcd. The yellow span comes from etcd handling that Range RPC.

03

Epilogue

SIG Instrumentation is actively working on tracing support for Kubernetes components. Both API Server Tracing and Kubelet Tracing reached beta in Kubernetes v1.27, so stay tuned!

References

[1] https://opentelemetry.io/docs/

[2] https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/647-apiserver-tracing (kubernetes/enhancements#647, kubernetes/kubernetes#94942, etcd-io/etcd#12919)

[3] https://kubernetes.io/zh-cn/docs/concepts/cluster-administration/system-traces/

[4] https://www.jaegertracing.io/

[5] https://github.com/jaegertracing/jaeger

[6] https://medium.com/opentracing/take-opentracing-for-a-hotrod-ride-f6e3141f7941


Author of this article

Liu Mengjiao

Current "DaoCloud Taoke" open source engineer

Kubernetes SIG Docs Approver 

Kubernetes WG structured logging Reviewer

 
