This article looks at how Kubernetes implements watch on top of etcd and at the design choices k8s makes for performance. It walks through the cache, timer, serialization cache, bookmark mechanism, forget mechanism, data index, ring buffer and other components one by one, along with the scenario each is used in and the problem it solves. I hope it helps readers who are interested in how the watch mechanism is implemented in the apiserver.
1. Event-driven and controller
In k8s, business logic is not coupled to the REST interface. The REST interface is only responsible for storing data; the controller pattern separates data storage from business logic, which keeps the apiserver's own logic simple.
A controller observes changes to the resources it cares about through the watch interface, and decides what to do based on the difference between the desired state recorded in the resource object and the current state. In essence, watch detects events and delivers them, as they occur, to the controllers that are interested in them.
2. The core mechanism of Watch
We start with the basic watch module, which is implemented on top of etcd.
2.1 Event types and etcd
A data change is essentially one of three types: add, update or delete. Adds and deletes are relatively easy to report because they can be derived from the current data, while an update may also need the previous data. This is implemented with the help of etcd's revision and MVCC mechanisms, which make both the previous state and the updated state available for the subsequent notification.
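As a rough illustration of the three event types, the sketch below classifies a change from its previous and current values. The `change` struct and `classify` function are hypothetical names invented for this example; they are not the apiserver's or etcd's actual types.

```go
package main

import "fmt"

// EventType names the three kinds of change described above.
type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Deleted  EventType = "DELETED"
)

// change is a hypothetical, simplified view of one change to a key:
// the value before and after, where nil means "the key did not exist".
type change struct {
	key      string
	prev     []byte
	cur      []byte
	revision int64
}

// classify derives the event type from the previous and current values.
// A change with no current value is treated as a delete.
func classify(c change) EventType {
	switch {
	case c.cur == nil:
		return Deleted
	case c.prev == nil:
		return Added
	default:
		return Modified
	}
}

func main() {
	changes := []change{
		{key: "/pods/a", cur: []byte("v1"), revision: 1},
		{key: "/pods/a", prev: []byte("v1"), cur: []byte("v2"), revision: 2},
		{key: "/pods/a", prev: []byte("v2"), revision: 3},
	}
	for _, c := range changes {
		fmt.Println(c.revision, c.key, classify(c))
	}
}
```

Only the update case needs both values, which is why the previous state has to be retrievable from the store.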
2.2 Event Pipeline
The event pipeline is responsible for transporting events. In the watch implementation, messages are distributed through a two-level pipeline. First, the events of interest are obtained by watching a key in etcd; the data is parsed, converting it from bytes into internal events, and sent to the input channel (incomingEventChan). A background goroutine then reads from the input channel, processes each event and sends it to the output channel (resultChan). Finally, events are read from that channel and sent to the corresponding client.
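The two-level pipeline can be sketched with two goroutines and two channels. This is a minimal sketch and not the apiserver code: the channel names follow the text, while `kv`, `internalEvent`, `Event`, `startPipeline` and the buffer size of 100 are assumptions made for the example, and "decoding" is reduced to a string conversion.

```go
package main

import "fmt"

// kv stands in for what an etcd watch delivers: a key and undecoded bytes.
type kv struct {
	key   string
	value []byte
}

// internalEvent is the parsed form that travels on incomingEventChan.
type internalEvent struct {
	key   string
	value []byte
}

// Event is what a client finally reads from resultChan.
type Event struct {
	Key    string
	Object string
}

// startPipeline wires the two stages together and returns the output channel.
func startPipeline(source <-chan kv) <-chan Event {
	incomingEventChan := make(chan internalEvent, 100)
	resultChan := make(chan Event, 100)

	// Stage 1: read from the watch source, parse, push to the input channel.
	go func() {
		defer close(incomingEventChan)
		for raw := range source {
			incomingEventChan <- internalEvent{key: raw.key, value: raw.value}
		}
	}()

	// Stage 2: take internal events, decode them, push to the output channel.
	go func() {
		defer close(resultChan)
		for ev := range incomingEventChan {
			resultChan <- Event{Key: ev.key, Object: string(ev.value)}
		}
	}()

	return resultChan
}

func main() {
	source := make(chan kv)
	go func() {
		defer close(source)
		source <- kv{key: "/pods/a", value: []byte("running")}
		source <- kv{key: "/pods/b", value: []byte("pending")}
	}()
	for ev := range startPipeline(source) {
		fmt.Println(ev.Key, ev.Object)
	}
}
```

Splitting the work into two stages means that a slow decode step does not directly stall the goroutine that reads from the underlying watch, as long as the input channel has room.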
2.3 Event Buffer
The event buffer exists because the rate at which a handler processes events may not match the rate at which events occur, so some buffering is needed to hold the events that pile up because of the mismatch. In Go, this is usually built with a buffered channel.
At this point a basic watch service is in place: it listens for data through etcd's watch interface, starts a separate goroutine to consume the events, and sends them into the event pipeline for other interfaces to use.
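A small sketch of the idea: with a buffered channel, a producer can emit a burst of events even though no consumer is reading yet. The function name `bufferBurst` and the event strings are made up for this example.

```go
package main

import "fmt"

// bufferBurst writes n events into a channel with capacity n while nothing
// is reading from it. With an unbuffered channel the first send would block.
func bufferBurst(n int) chan string {
	events := make(chan string, n)
	for i := 0; i < n; i++ {
		events <- fmt.Sprintf("event-%d", i)
	}
	close(events)
	return events
}

func main() {
	events := bufferBurst(3)
	fmt.Println("buffered:", len(events), "capacity:", cap(events))
	// The consumer drains the buffer later, at its own pace.
	for e := range events {
		fmt.Println(e)
	}
}
```

The buffer only absorbs short bursts; once it is full, a plain send blocks again.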
3. Cacher
All data and systems in kubernetes are built on etcd, so how can the access pressure on it be reduced? The answer is caching, and the same goes for watch. In this section we look at how the watch caching mechanism is implemented.
3.1 Reflector
Reflector is a component in client-go. It obtains data through the ListWatch interface and stores it in its own internal store. The Cacher uses this component to watch etcd, which avoids creating a separate etcd watch for each component.
3.2 watchCache
The watchCache is responsible for storing the watched events and maintains a local index cache for them. It is also responsible for event delivery: it passes the watched events up to the Cacher component through the eventHandler it is constructed with.
3.3 cacheWatcher
As the name suggests, cacheWatcher is a watcher (watch.Interface) implementation backed by the cache. The front-end watchServer gets events from its ResultChan and forwards them to the client.
3.4 Cacher
The Cacher wraps the etcd-based store and combines the watchCache and Reflector described above into a cached REST store. Ordinary create, update and delete operations are forwarded directly to the etcd store, while watch operations are intercepted: the Cacher builds a cacheWatcher and returns it instead.
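The routing described above can be sketched as a thin wrapper around the underlying store. Everything here is a simplified assumption for illustration: the `Storage` interface, `etcdStore` and this `cacher` type are hypothetical and far smaller than the real ones.

```go
package main

import "fmt"

// Storage is a hypothetical, minimal storage interface for this sketch.
type Storage interface {
	Create(key, value string)
	Watch(key string) <-chan string
}

// etcdStore stands in for the etcd-backed store and records what reaches it.
type etcdStore struct {
	writes  []string
	watches int
}

func (s *etcdStore) Create(key, value string) {
	s.writes = append(s.writes, key+"="+value)
}

func (s *etcdStore) Watch(key string) <-chan string {
	s.watches++
	return make(chan string)
}

// cacher forwards writes to the underlying store but serves watches itself.
type cacher struct {
	storage  Storage
	watchers []chan string
}

// Create is passed straight through to the underlying store.
func (c *cacher) Create(key, value string) {
	c.storage.Create(key, value)
}

// Watch is intercepted: it registers a cache-side watcher and never opens
// a new watch on the underlying store.
func (c *cacher) Watch(key string) <-chan string {
	w := make(chan string, 10)
	c.watchers = append(c.watchers, w)
	return w
}

func main() {
	store := &etcdStore{}
	c := &cacher{storage: store}
	c.Create("/pods/a", "v1")
	c.Watch("/pods")
	c.Watch("/pods")
	fmt.Println("writes:", len(store.writes), "etcd watches:", store.watches, "cache watchers:", len(c.watchers))
}
```

In this sketch the number of watches on the underlying store stays at zero no matter how many clients watch through the cacher; in the real design a single shared watch feeds the cache.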
4. Cacher optimization
Now that we have covered the basic components, let's look at the optimizations k8s makes for the watch scenario; they are worth learning from for similar scenarios.
4.1 Serialization Cache
If multiple watchers all watch the same event, each of them ends up needing the serialized form of it. When the Cacher dispatches an event and finds more than a specified number of watchers, it wraps the object in a caching layer so that the object is serialized only once for all of those watchers.
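One way to get "serialize once, reuse for every watcher" is to guard the encoding with `sync.Once`. The sketch below is an assumption about the general technique, not the upstream implementation; `cachingObject` here is a simplified stand-in, JSON is used only for convenience, and the `encodes` counter exists purely to make the effect visible.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// cachingObject wraps an object and remembers its serialized form.
type cachingObject struct {
	obj     interface{}
	once    sync.Once
	data    []byte
	err     error
	encodes int // number of real serializations performed
}

// Encode serializes the object on the first call and returns the cached
// bytes on every later call, even when called from several goroutines.
func (c *cachingObject) Encode() ([]byte, error) {
	c.once.Do(func() {
		c.encodes++
		c.data, c.err = json.Marshal(c.obj)
	})
	return c.data, c.err
}

func main() {
	obj := &cachingObject{obj: map[string]string{"name": "pod-a"}}
	// Three watchers ask for the same event.
	for i := 0; i < 3; i++ {
		data, err := obj.Encode()
		if err != nil {
			fmt.Println("encode failed:", err)
			return
		}
		fmt.Println(string(data))
	}
	fmt.Println("serializations:", obj.encodes)
}
```

The wrapper only pays off when several watchers receive the same object, which is why the text applies it above a watcher-count threshold.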
4.2 nonblocking
We mentioned the event buffer above, but if a watcher consumes too slowly it will still hold up event distribution. For this reason, the Cacher divides watchers into two classes according to whether they are blocked, that is, whether the event can be written to their channel directly. Watchers that cannot accept the event immediately are retried later.
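In Go, "can the event be written directly" maps naturally onto a `select` with a `default` branch. The sketch below shows that technique under simplified assumptions: the `watcher` struct and `dispatch` function are invented for the example and only loosely echo the upstream naming.

```go
package main

import "fmt"

// watcher is a simplified watcher with an input channel for events.
type watcher struct {
	id    int
	input chan string
}

// nonblockingAdd tries to hand the event over without waiting.
// It reports false when the watcher's channel is full.
func (w *watcher) nonblockingAdd(ev string) bool {
	select {
	case w.input <- ev:
		return true
	default:
		return false
	}
}

// dispatch delivers the event to every watcher that can take it right away
// and returns the ones that were blocked, so they can be retried later.
func dispatch(watchers []*watcher, ev string) []*watcher {
	var blocked []*watcher
	for _, w := range watchers {
		if !w.nonblockingAdd(ev) {
			blocked = append(blocked, w)
		}
	}
	return blocked
}

func main() {
	fast := &watcher{id: 1, input: make(chan string, 1)}
	slow := &watcher{id: 2, input: make(chan string, 1)}
	slow.input <- "older event" // the slow watcher's buffer is already full

	blocked := dispatch([]*watcher{fast, slow}, "new event")
	for _, w := range blocked {
		fmt.Println("blocked watcher:", w.id)
	}
}
```

Serving the fast watchers first means one slow consumer does not delay delivery to everyone else.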
4.3 TimeBudget
When blocked watchers are retried, a timer is built from dispatchTimeoutBudget to bound how long the retry may take. Why is it called a budget? If a retry succeeds before the timer expires, the unused time is returned to the budget and can be spent on the next round. A background goroutine also replenishes the budget periodically.
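The budget idea can be sketched as a small mutex-protected counter of time. This is a simplified sketch rather than the upstream type: the method names and the 50 ms / 100 ms values are illustrative assumptions, and the periodic refresh is exposed as a plain `refresh` method (which a ticker-driven goroutine would call) so the example stays deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Illustrative values, assumed for this sketch.
const (
	refreshPeriod = 50 * time.Millisecond
	maxBudget     = 100 * time.Millisecond
)

// timeBudget tracks how much time may be spent waiting on blocked watchers.
type timeBudget struct {
	mu     sync.Mutex
	budget time.Duration
	max    time.Duration
}

func newTimeBudget(max time.Duration) *timeBudget {
	return &timeBudget{budget: max, max: max}
}

// takeAvailable hands out the whole remaining budget and leaves zero behind.
func (t *timeBudget) takeAvailable() time.Duration {
	t.mu.Lock()
	defer t.mu.Unlock()
	available := t.budget
	t.budget = 0
	return available
}

// returnUnused gives back time that was taken but not spent, capped at max.
func (t *timeBudget) returnUnused(unused time.Duration) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if unused <= 0 {
		return
	}
	t.budget += unused
	if t.budget > t.max {
		t.budget = t.max
	}
}

// refresh is what a background goroutine would call once per refreshPeriod.
func (t *timeBudget) refresh() {
	t.returnUnused(refreshPeriod)
}

func main() {
	b := newTimeBudget(maxBudget)
	timeout := b.takeAvailable()
	fmt.Println("timeout for this round:", timeout)

	// Suppose the retry succeeded after a quarter of the timeout.
	b.returnUnused(timeout - timeout/4)
	fmt.Println("timeout for next round:", b.takeAvailable())
}
```

Capping the budget keeps a long quiet period from accumulating an arbitrarily large timeout for a later slow watcher.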
4.4 forget mechanism
Building on the TimeBudget above: if the retry still does not succeed within the given time, the watcher is removed through forget. A watcher that consumes particularly slowly can then re-establish its watch by retrying later, which reduces the watch pressure on the apiserver.
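A retry with a deadline followed by removal can be sketched with a timer and a `select`. The types and the `addWithTimeout` and `forget` methods below are simplified assumptions for illustration, and `retryTimeout` is an arbitrary value chosen for the example.

```go
package main

import (
	"fmt"
	"time"
)

// retryTimeout is an arbitrary deadline used by this sketch.
const retryTimeout = 10 * time.Millisecond

type watcher struct {
	id    int
	input chan string
}

type cacher struct {
	watchers map[int]*watcher
}

// forget removes the watcher and closes its channel so the client notices
// that the watch ended and can establish a new one.
func (c *cacher) forget(id int) {
	if w, ok := c.watchers[id]; ok {
		delete(c.watchers, id)
		close(w.input)
	}
}

// addWithTimeout waits up to timeout for the watcher to accept the event.
// If the deadline passes first, the watcher is forgotten.
func (c *cacher) addWithTimeout(w *watcher, ev string, timeout time.Duration) bool {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case w.input <- ev:
		return true
	case <-timer.C:
		c.forget(w.id)
		return false
	}
}

func main() {
	stuck := &watcher{id: 1, input: make(chan string)} // nobody reads from it
	c := &cacher{watchers: map[int]*watcher{1: stuck}}

	delivered := c.addWithTimeout(stuck, "event", retryTimeout)
	fmt.Println("delivered:", delivered, "watchers left:", len(c.watchers))
}
```

Dropping the watcher trades one client's continuity for the health of the dispatch loop; the client pays the cost of re-establishing its watch.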
4.5 bookmark mechanism
The bookmark mechanism is an optimization contributed by Alibaba. Its core purpose is to handle the case where a resource sees no events for a long time, so that the revision held by the corresponding informer falls far behind the cluster's. Bookmark events pass the current revision along, so that the informer is not far behind when it restarts.
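On the client side, a bookmark can be thought of as an event that advances the remembered revision without carrying a change. The sketch below illustrates that idea with hypothetical `Event` and `informer` types; it is not client-go's implementation.

```go
package main

import "fmt"

// Event is a simplified watch event. A bookmark carries only a revision.
type Event struct {
	Type            string // "ADDED", "MODIFIED", "DELETED" or "BOOKMARK"
	ResourceVersion int64
	Object          string
}

// informer remembers the last revision it has seen so that a restarted
// watch can resume from there.
type informer struct {
	lastResourceVersion int64
	handled             []string
}

// process records the revision of every event, but only passes real
// changes on to the handlers.
func (i *informer) process(ev Event) {
	i.lastResourceVersion = ev.ResourceVersion
	if ev.Type == "BOOKMARK" {
		return
	}
	i.handled = append(i.handled, ev.Object)
}

func main() {
	inf := &informer{}
	inf.process(Event{Type: "ADDED", ResourceVersion: 10, Object: "pod-a"})
	// No changes to this resource for a long time, while the cluster's
	// revision keeps growing. A bookmark brings the informer up to date.
	inf.process(Event{Type: "BOOKMARK", ResourceVersion: 500})
	fmt.Println("resume from:", inf.lastResourceVersion, "handled:", inf.handled)
}
```

Without the bookmark, a restart in this example would resume from revision 10, which the server may no longer be able to serve from its history.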
4.6 ringbuffer in watchCache
In watchCache, the index cache is built on the store. During a list-watch operation, however, it is usually necessary to obtain all the changes after a given revision. For this purpose, watchCache maintains a ring buffer that caches historical events.
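A fixed-size ring buffer of events that answers "everything after revision N" can be sketched as follows. The `ringBuffer` type, its methods and the error text are assumptions for this example; the real watchCache keeps more state.

```go
package main

import "fmt"

type event struct {
	rv  int64
	obj string
}

// ringBuffer keeps the most recent events in a fixed-size slice.
// start and end are logical positions; the slot is position % capacity.
type ringBuffer struct {
	buf         []event
	start, end  int
	lastEvicted int64 // revision of the newest event that was overwritten
}

func newRingBuffer(capacity int) *ringBuffer {
	return &ringBuffer{buf: make([]event, capacity)}
}

// add appends an event, overwriting the oldest one when the buffer is full.
func (r *ringBuffer) add(e event) {
	if r.end-r.start == len(r.buf) {
		r.lastEvicted = r.buf[r.start%len(r.buf)].rv
		r.start++
	}
	r.buf[r.end%len(r.buf)] = e
	r.end++
}

// eventsSince returns all retained events newer than rv. It fails when
// events newer than rv have already been overwritten.
func (r *ringBuffer) eventsSince(rv int64) ([]event, error) {
	if rv < r.lastEvicted {
		return nil, fmt.Errorf("revision %d is too old, history starts after %d", rv, r.lastEvicted)
	}
	var out []event
	for i := r.start; i < r.end; i++ {
		if e := r.buf[i%len(r.buf)]; e.rv > rv {
			out = append(out, e)
		}
	}
	return out, nil
}

func main() {
	rb := newRingBuffer(3)
	for rv := int64(1); rv <= 5; rv++ {
		rb.add(event{rv: rv, obj: fmt.Sprintf("change-%d", rv)})
	}
	events, err := rb.eventsSince(3)
	fmt.Println(events, err)
	_, err = rb.eventsSince(1)
	fmt.Println(err)
}
```

A watcher that asks for a revision older than the retained history gets an error rather than a silently incomplete stream, and is expected to list again.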
5. Design summary
To recap, this article covered how kubernetes implements watch on top of etcd and the performance-oriented design choices around it: the cache, the timer, the serialization cache, the bookmark mechanism, the forget mechanism, the data index and the ring buffer, together with the scenario each is used in and the problem it solves. I hope it helps readers who are interested in how the watch mechanism is implemented in the apiserver.
kubernetes study notes address: https://www.yuque.com/baxiaoshi/tyado3