An illustrated guide to the key design of the etcd-based watch mechanism in Kubernetes

This article introduces how Kubernetes implements the watch scenario on top of etcd and the performance optimizations k8s makes for it. It walks through the cache, timer, serialized cache, bookmark mechanism, forget mechanism, data index, ringbuffer, and other components one by one, along with the problems each solves. I hope it helps readers who are interested in how the watch mechanism is implemented in the apiserver.

1. Event-driven design and controllers


In k8s, the concrete business processing logic is not coupled to the REST interface: the REST interface is only responsible for data storage. Through the controller pattern, data storage and business logic are decoupled, which keeps the apiserver's logic simple.

The controller perceives data changes of the resources it cares about through the watch interface, and drives its business logic based on the difference between the desired state recorded in the resource object and the current state. In essence, what watch does is deliver events to the controllers that subscribe to them.

2. The core mechanism of Watch

Let's first look at the basic watch module built on top of etcd.

2.1 Event types and etcd


A data change is essentially one of three kinds: added, updated, or deleted. Added and deleted are relatively easy to handle because the event can be built from the current data alone, while an update also needs the previous data. This is implemented with the help of etcd's revision and mvcc mechanisms, which make both the previous state and the updated state available, so that subsequent notifications can carry both.
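The mapping from a raw etcd event (with its previous key-value from mvcc) onto the three watch event types can be sketched as follows. This is a simplified illustration, not the actual apiserver types: `rawEvent` and `classify` are made-up names standing in for the real conversion logic.

```go
package main

import "fmt"

type eventType string

const (
	added    eventType = "ADDED"
	modified eventType = "MODIFIED"
	deleted  eventType = "DELETED"
)

// rawEvent mimics the information etcd delivers: the current value,
// the previous value (available via PrevKv thanks to mvcc), and a
// delete flag.
type rawEvent struct {
	value     []byte
	prevValue []byte
	isDelete  bool
}

// classify derives the watch event type from the raw etcd event.
func classify(e rawEvent) eventType {
	switch {
	case e.isDelete:
		return deleted
	case e.prevValue == nil:
		// No previous revision for this key: it was just created.
		return added
	default:
		// A previous revision exists: this is an update, and
		// prevValue lets the caller reconstruct the old object.
		return modified
	}
}

func main() {
	fmt.Println(classify(rawEvent{value: []byte("v1")}))                          // ADDED
	fmt.Println(classify(rawEvent{value: []byte("v2"), prevValue: []byte("v1")})) // MODIFIED
	fmt.Println(classify(rawEvent{isDelete: true, prevValue: []byte("v2")}))      // DELETED
}
```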

2.2 Event Pipeline


The event pipeline is responsible for transmitting events. In the watch implementation, message distribution is realized through a two-level pipeline. First, the events of interest are obtained by watching a key in etcd, and the raw bytes are parsed into internal events and sent to the input channel (incomingEventChan). A background goroutine then reads from the input channel, processes each event, and sends it to the output channel (resultChan), from which events are read and delivered to the corresponding client.
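The two-level pipeline above can be sketched with two channels and a background goroutine. The channel names mirror the article (incomingEventChan, resultChan), but the payload format ("key=value" strings) and the function name are illustrative assumptions:

```go
package main

import (
	"fmt"
	"strings"
)

type event struct{ key, value string }

// startPipeline wires the two-level pipeline: raw bytes arrive on the
// input channel, a background goroutine decodes them into internal
// events, and decoded events are delivered on the output channel.
func startPipeline(incomingEventChan <-chan []byte) <-chan event {
	resultChan := make(chan event, 16)
	go func() {
		defer close(resultChan)
		for raw := range incomingEventChan {
			// Decode "key=value" payloads into internal events.
			parts := strings.SplitN(string(raw), "=", 2)
			if len(parts) != 2 {
				continue // drop malformed payloads
			}
			resultChan <- event{key: parts[0], value: parts[1]}
		}
	}()
	return resultChan
}

func main() {
	in := make(chan []byte, 4)
	out := startPipeline(in)
	in <- []byte("pods/nginx=running")
	close(in)
	for e := range out {
		fmt.Printf("%s -> %s\n", e.key, e.value)
	}
}
```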

2.3 Event Buffer

The event buffer addresses the case where the event handler cannot keep up with the rate at which events occur: a buffer is needed to temporarily store events during the rate mismatch. In Go, this is usually built with a buffered chan.
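A minimal sketch of such a buffer: a buffered channel absorbs a burst of events while the consumer lags. The helper `bufferEvents` is an illustrative name, not a real apiserver function:

```go
package main

import "fmt"

// bufferEvents enqueues events until the buffer is full and reports how
// many were accepted without blocking.
func bufferEvents(buf chan string, events []string) int {
	accepted := 0
	for _, e := range events {
		select {
		case buf <- e:
			accepted++
		default:
			// Buffer full: in the real watch path this is where the
			// rate mismatch becomes visible to the dispatcher.
			return accepted
		}
	}
	return accepted
}

func main() {
	buf := make(chan string, 4) // buffered chan smooths bursts
	events := []string{"e1", "e2", "e3", "e4", "e5", "e6"}
	n := bufferEvents(buf, events)
	fmt.Println(n, len(buf)) // the buffer absorbed 4 events
}
```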

At this point, a basic watch service is essentially in place: it listens for data changes through etcd's watch interface, starts an independent goroutine to consume the events, and sends them into the event pipeline for other interfaces to consume.

3. Cacher

All state in kubernetes is persisted in etcd. How do we reduce the access pressure on it? The answer is caching, and the watch path is no exception. In this section we will see how the watch caching mechanism is implemented; the cacher here refers to the caching layer inside the apiserver.

3.1 Reflector


Reflector is a component in client-go. It obtains data through the ListWatch interface and stores it in its own internal store. The cacher uses this component to perform the watch against etcd, avoiding a separate etcd watch for every component.

3.2 watchCache


The watchCache is responsible for storing watch events and builds a local index cache for them. When the watchCache is constructed, it is also wired up for event delivery: it passes the watch events to the upper-level Cacher component through an eventHandler.
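The two responsibilities — keeping a local indexed store and forwarding every event upward through a handler — can be sketched like this. The type and field names are simplified stand-ins, not the real apiserver structures:

```go
package main

import (
	"fmt"
	"sync"
)

type wcEvent struct {
	typ   string // "ADDED", "MODIFIED", "DELETED"
	key   string
	value string
}

// watchCache keeps a local keyed store of the latest objects and forwards
// every processed event to an eventHandler supplied by the Cacher.
type watchCache struct {
	mu           sync.RWMutex
	store        map[string]string // local index of current state
	eventHandler func(wcEvent)     // delivery path up to the Cacher
}

func newWatchCache(handler func(wcEvent)) *watchCache {
	return &watchCache{store: map[string]string{}, eventHandler: handler}
}

func (w *watchCache) processEvent(e wcEvent) {
	w.mu.Lock()
	if e.typ == "DELETED" {
		delete(w.store, e.key)
	} else {
		w.store[e.key] = e.value
	}
	w.mu.Unlock()
	w.eventHandler(e) // hand the event off to the Cacher
}

func (w *watchCache) get(key string) (string, bool) {
	w.mu.RLock()
	defer w.mu.RUnlock()
	v, ok := w.store[key]
	return v, ok
}

func main() {
	var delivered []wcEvent
	wc := newWatchCache(func(e wcEvent) { delivered = append(delivered, e) })
	wc.processEvent(wcEvent{typ: "ADDED", key: "pods/nginx", value: "v1"})
	v, _ := wc.get("pods/nginx")
	fmt.Println(v, len(delivered))
}
```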

3.3 cacheWatcher


As the name suggests, cacheWatcher is a watcher (watch.Interface) implementation on top of the cache. The front-end watchServer is responsible for getting events from its ResultChan and forwarding them to clients.
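A minimal sketch of such a watcher, using a simplified stand-in for k8s.io/apimachinery's watch.Interface (just ResultChan and Stop). The real cacheWatcher handles concurrency and timeouts that are omitted here:

```go
package main

import "fmt"

type event struct{ typ, key string }

// watcher is a simplified stand-in for watch.Interface.
type watcher interface {
	ResultChan() <-chan event
	Stop()
}

// cacheWatcher serves events from a channel the Cacher pushes into.
type cacheWatcher struct {
	result  chan event
	stopped bool
}

func newCacheWatcher(bufSize int) *cacheWatcher {
	return &cacheWatcher{result: make(chan event, bufSize)}
}

func (c *cacheWatcher) ResultChan() <-chan event { return c.result }

func (c *cacheWatcher) Stop() {
	if !c.stopped {
		c.stopped = true
		close(c.result)
	}
}

// add is called by the Cacher when dispatching an event to this watcher.
func (c *cacheWatcher) add(e event) {
	if !c.stopped {
		c.result <- e
	}
}

func main() {
	var _ watcher = newCacheWatcher(1) // compile-time interface check
	w := newCacheWatcher(2)
	w.add(event{"ADDED", "pods/nginx"})
	w.Stop()
	// Buffered events are still readable after Stop closes the channel.
	for e := range w.ResultChan() {
		fmt.Println(e.typ, e.key)
	}
}
```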

3.4 Cacher


The Cacher combines the watchCache and Reflector described above with the etcd-based store to build a cached REST store. Ordinary create, update, and delete operations are forwarded directly to the underlying etcd store, while watch operations are intercepted, and a cacheWatcher component is built and returned.
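The wrapper pattern — delegate writes, intercept watches — can be sketched as follows. The `store` interface, `etcdStore`, and `cacher` types here are simplified illustrations, not the real apiserver interfaces:

```go
package main

import "fmt"

// store is a simplified stand-in for the storage interface.
type store interface {
	Create(key, value string) error
	Watch(key string) <-chan string
}

// etcdStore stands in for the real etcd-backed storage.
type etcdStore struct{ data map[string]string }

func (s *etcdStore) Create(key, value string) error {
	s.data[key] = value
	return nil
}

func (s *etcdStore) Watch(key string) <-chan string {
	// In reality this would open an etcd watch; the cacher bypasses it.
	return make(chan string)
}

// cacher wraps the store: writes pass through, watches are served locally.
type cacher struct {
	underlying *etcdStore
	watchers   []chan string
}

func (c *cacher) Create(key, value string) error {
	err := c.underlying.Create(key, value) // delegate the write
	for _, w := range c.watchers {         // then notify cache watchers
		w <- key
	}
	return err
}

func (c *cacher) Watch(key string) <-chan string {
	ch := make(chan string, 8) // served from cache, not etcd
	c.watchers = append(c.watchers, ch)
	return ch
}

func main() {
	c := &cacher{underlying: &etcdStore{data: map[string]string{}}}
	var _ store = c // the wrapper still satisfies the store interface
	w := c.Watch("pods/")
	c.Create("pods/nginx", "v1")
	fmt.Println(<-w) // the watcher sees the write without touching etcd
}
```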

4. Cacher optimization

Having seen the basic components, let's look at the optimizations k8s makes for the watch scenario, and the lessons they hold for similar scenarios.

4.1 Serialization Cache


If multiple watchers all watch the same event, each would ultimately need to serialize it. When the cacher dispatches an event and finds more than a specified number of watchers for it, it wraps the event with a caching function so that serialization happens only once for all of those watchers.
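The "serialize once for N watchers" behavior can be sketched with sync.Once guarding a lazily computed encoding. The type names and the JSON encoding are illustrative assumptions:

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

type object struct {
	Name string `json:"name"`
}

// cachingEvent wraps an event so its encoded form is computed lazily and
// exactly once, then shared by every watcher that needs it.
type cachingEvent struct {
	obj   object
	once  sync.Once
	data  []byte
	count int // how many times serialization actually ran (for illustration)
}

func (e *cachingEvent) serialized() []byte {
	e.once.Do(func() {
		e.count++
		e.data, _ = json.Marshal(e.obj)
	})
	return e.data
}

func main() {
	ev := &cachingEvent{obj: object{Name: "nginx"}}
	// Simulate dispatch to many watchers: each asks for the bytes,
	// but Marshal runs only on the first call.
	for i := 0; i < 5; i++ {
		_ = ev.serialized()
	}
	fmt.Println(string(ev.serialized()), ev.count)
}
```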

4.2 Non-blocking dispatch


We mentioned the event buffer above, but a watcher that consumes too slowly can still hold up event distribution. For this reason, the cacher divides watchers into two classes based on whether they block (that is, whether the event can be written to their channel immediately); watchers that cannot accept the event immediately are retried later.
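In Go this split falls out naturally from a select with a default branch: delivery is attempted without blocking, and watchers with full channels are collected for the retry path. A minimal sketch, with illustrative names:

```go
package main

import "fmt"

type event struct{ key string }

// dispatch tries a non-blocking delivery to each watcher and returns the
// watchers whose channels were full, to be retried later.
func dispatch(ev event, watchers []chan event) (blocked []chan event) {
	for _, w := range watchers {
		select {
		case w <- ev:
			// delivered immediately
		default:
			// channel full: defer this watcher to the retry path
			blocked = append(blocked, w)
		}
	}
	return blocked
}

func main() {
	fast := make(chan event, 1)
	slow := make(chan event) // unbuffered and unread: cannot accept now
	blocked := dispatch(event{key: "pods/nginx"}, []chan event{fast, slow})
	fmt.Println(len(blocked)) // 1
}
```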

4.3 TimeBudget

When a blocked watcher is retried, a timer built from dispatchTimeoutBudget controls the timeout. What is a budget? If the retry succeeds before its time runs out, the remaining time is returned and can be used as extra balance in subsequent attempts; a background thread also resets the budget periodically.

4.4 forget mechanism


Building on the TimeBudget above: if the retry cannot succeed within the given time, the corresponding watcher is deleted through forget. A particularly slow consumer must then re-establish its watch through subsequent retries, which reduces the watch pressure on the apiserver.
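A sketch of the forget path: a dispatcher that removes and stops a watcher whose delivery fails. Here the budget is collapsed into a single non-blocking attempt for brevity, and all names are illustrative:

```go
package main

import "fmt"

type event struct{ key string }

type watcher struct {
	id     int
	result chan event
}

type dispatcher struct {
	watchers map[int]*watcher
}

// forget removes and stops a watcher that failed to keep up, forcing the
// client to re-establish its watch later.
func (d *dispatcher) forget(id int) {
	if w, ok := d.watchers[id]; ok {
		close(w.result)
		delete(d.watchers, id)
	}
}

// tryDeliver attempts delivery to every watcher; in this sketch a watcher
// that cannot accept immediately is forgotten on the spot (the real code
// first spends its time budget on retries).
func (d *dispatcher) tryDeliver(ev event) {
	for id, w := range d.watchers {
		select {
		case w.result <- ev:
		default:
			d.forget(id)
		}
	}
}

func main() {
	d := &dispatcher{watchers: map[int]*watcher{
		1: {id: 1, result: make(chan event, 1)},
		2: {id: 2, result: make(chan event)}, // never read: too slow
	}}
	d.tryDeliver(event{key: "pods/nginx"})
	fmt.Println(len(d.watchers)) // the slow watcher was forgotten
}
```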

4.5 bookmark mechanism


The bookmark mechanism is an optimization contributed by Alibaba. Its core purpose is to handle the case where a single resource has no corresponding events for a long time: the revision tracked by the corresponding informer would then lag far behind the cluster. Bookmarks periodically carry the latest revision down to the informer, so that after a restart the informer does not fall far behind.
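The effect on the client side can be sketched as follows: both real events and periodic BOOKMARK events advance the informer's stored resourceVersion, keeping the resume point fresh. The types here are simplified illustrations:

```go
package main

import "fmt"

type event struct {
	typ             string // "ADDED", "MODIFIED", "DELETED", or "BOOKMARK"
	resourceVersion int64
}

// informer tracks the last resourceVersion it has seen, which is the
// revision it would resume from after a restart.
type informer struct {
	lastResourceVersion int64
}

func (i *informer) handle(e event) {
	// Both real events and bookmarks advance the informer's revision.
	if e.resourceVersion > i.lastResourceVersion {
		i.lastResourceVersion = e.resourceVersion
	}
}

func main() {
	inf := &informer{}
	inf.handle(event{typ: "ADDED", resourceVersion: 10})
	// No relevant object changes for a long time, but bookmarks keep
	// arriving, so the resume point does not lag the cluster:
	inf.handle(event{typ: "BOOKMARK", resourceVersion: 500})
	fmt.Println(inf.lastResourceVersion) // 500
}
```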

4.6 ringbuffer in watchCache


In the watchCache, the index cache is built through a store, but a ListWatch operation usually needs all the data after a given revision. For this type of query, the watchCache builds a ringbuffer to cache historical events.
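A sketch of such a ring buffer, queried by "all events with revision greater than rv". When the requested revision has already been evicted from the fixed-size window, the caller must fall back to a full relist; the types and method names are illustrative:

```go
package main

import "fmt"

type event struct {
	revision int64
	key      string
}

// ringBuffer keeps a fixed-size cyclic window of the most recent events.
type ringBuffer struct {
	buf        []event
	start, end int // window is [start, end), indices taken modulo len(buf)
}

func newRingBuffer(capacity int) *ringBuffer {
	return &ringBuffer{buf: make([]event, capacity)}
}

func (r *ringBuffer) add(e event) {
	if r.end-r.start == len(r.buf) {
		r.start++ // full: drop the oldest event
	}
	r.buf[r.end%len(r.buf)] = e
	r.end++
}

// since returns buffered events with revision strictly greater than rv,
// oldest first, or false if rv is older than the window (the caller must
// then fall back to a full relist).
func (r *ringBuffer) since(rv int64) ([]event, bool) {
	if r.end > r.start {
		oldest := r.buf[r.start%len(r.buf)]
		if rv < oldest.revision-1 {
			return nil, false // history already evicted
		}
	}
	var out []event
	for i := r.start; i < r.end; i++ {
		if e := r.buf[i%len(r.buf)]; e.revision > rv {
			out = append(out, e)
		}
	}
	return out, true
}

func main() {
	rb := newRingBuffer(3)
	for rev := int64(1); rev <= 5; rev++ {
		rb.add(event{revision: rev, key: fmt.Sprintf("k%d", rev)})
	}
	evs, ok := rb.since(3)
	fmt.Println(ok, len(evs)) // revisions 4 and 5 are still in the window
}
```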

5. Design summary


This article has introduced how Kubernetes implements the watch scenario on top of etcd and the performance-oriented designs k8s applies to it, walking through the cache, timer, serialized cache, bookmark mechanism, forget mechanism, data index, ringbuffer, and other components one by one, along with the problems each solves. I hope it helps readers interested in how the watch mechanism is implemented in the apiserver.

kubernetes study notes address: https://www.yuque.com/baxiaoshi/tyado3

> WeChat ID: baxiaoshi2020

> Follow the official account "Graphical Source Code" to read more source code analysis articles
