How to efficiently keep track of K8s resource changes? An analysis of the K8s Informer implementation mechanism

Overview

Entering the world of K8s, you will find many Controllers, each responsible for reconciling a particular resource type (for example, Pods are managed through the DeploymentController and ReplicaSetController), with the goal of keeping the actual state in line with the state the user declared.
There are dozens of resource types in K8s. How can users both inside and outside K8s conveniently and efficiently observe changes to a given resource type? That is exactly what the Informer discussed in this article is designed for. This article analyzes the mechanism component by component: Reflector, DeltaFIFO (delta queue), Indexer, Controller, SharedInformer (shared resource notifier), processorListener (event listener/processor), and workqueue (event-processing work queue).

Starting with Reflector

The main responsibility of the Reflector is to pull the full list of a given resource type from the apiserver and then continuously watch (ListAndWatch) its Add/Update/Delete events, storing them in a local store implemented by DeltaFIFO.

First look at the Reflector structure definition:

// staging/src/k8s.io/client-go/tools/cache/reflector.go
type Reflector struct {
    // name uniquely identifying this Reflector, typically file:line
    name string

    // the next three are used to confirm the expected type
    expectedTypeName string
    expectedType     reflect.Type
    expectedGVK      *schema.GroupVersionKind

    // the Store interface; concretely backed by DeltaFIFO
    store Store
    // used to pull the full list and watch increments from the apiserver
    listerWatcher ListerWatcher

    // the next two handle backoff/retry on failure
    backoffManager         wait.BackoffManager
    initConnBackoffManager wait.BackoffManager

    // resync period requested by the informer user
    resyncPeriod time.Duration
    // decides whether the conditions for a resync are met
    ShouldResync func() bool

    clock clock.Clock

    // whether the List call is paginated
    paginatedResult bool

    // last synced resource version; watch only receives changes newer than this
    lastSyncResourceVersion string
    // whether the last synced resource version is still available
    isLastSyncResourceVersionUnavailable bool
    // lock protecting the resource version
    lastSyncResourceVersionMutex sync.RWMutex

    // page size for the watch list
    WatchListPageSize int64
    // callback handler invoked when watch fails
    watchErrorHandler WatchErrorHandler
}

As can be seen from the structure definition, the Reflector performs ListAndWatch for a specified target resource type and supports pagination-related settings.
After the first full pull of the target resource type, syncWith replaces (Replace) the full set into the DeltaFIFO queue/items; the Reflector then keeps watching incremental events for that type and applies them to the DeltaFIFO queue/items with deduplication, where they wait to be consumed.

The check of the watched object's type is implemented with Go reflection, as follows:

// staging/src/k8s.io/client-go/tools/cache/reflector.go
// watchHandler watches w and keeps *resourceVersion up to date.
func (r *Reflector) watchHandler(start time.Time, w watch.Interface, resourceVersion *string, errc chan error, stopCh <-chan struct{}) error {
    ...
    if r.expectedType != nil {
        if e, a := r.expectedType, reflect.TypeOf(event.Object); e != a {
            utilruntime.HandleError(fmt.Errorf("%s: expected type %v, but watch event object had type %v", r.name, e, a))
            continue
        }
    }
    if r.expectedGVK != nil {
        if e, a := *r.expectedGVK, event.Object.GetObjectKind().GroupVersionKind(); e != a {
            utilruntime.HandleError(fmt.Errorf("%s: expected gvk %v, but watch event object had gvk %v", r.name, e, a))
            continue
        }
    }
    ...
}

The target resource type is confirmed via reflection, which is why the name Reflector fits so well. The type to List/Watch is fixed when the ListerWatcher is built in NewSharedIndexInformer, but watchHandler double-checks the type of every watch event.
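To make this concrete, here is a minimal sketch of driving a Reflector by hand, outside of any informer. It assumes an already configured *kubernetes.Clientset named clientset; the function name is illustrative, everything else is the client-go API (NewListWatchFromClient, NewDeltaFIFOWithOptions, NewReflector):

package main

import (
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/fields"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/cache"
)

// runPodReflector stands up a Reflector whose local store is a DeltaFIFO.
// clientset is assumed to be an already configured *kubernetes.Clientset.
func runPodReflector(clientset *kubernetes.Clientset, stopCh <-chan struct{}) *cache.DeltaFIFO {
    lw := cache.NewListWatchFromClient(
        clientset.CoreV1().RESTClient(), "pods", metav1.NamespaceAll, fields.Everything())

    fifo := cache.NewDeltaFIFOWithOptions(cache.DeltaFIFOOptions{
        KeyFunction: cache.MetaNamespaceKeyFunc, // objKey = "namespace/name"
    })

    // resyncPeriod = 0: no forced periodic relist
    r := cache.NewReflector(lw, &corev1.Pod{}, fifo, 0)
    go r.Run(stopCh) // List once, then Watch and keep the DeltaFIFO filled

    return fifo
}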

Meet DeltaFIFO

As before, first look at the DeltaFIFO structure definition:

// staging/src/k8s.io/client-go/tools/cache/delta_fifo.go
type DeltaFIFO struct {
    // read-write lock and condition variable
    lock sync.RWMutex
    cond sync.Cond

    // kv store: objKey1 -> Deltas[obj1-Added, obj1-Updated...]
    items map[string]Deltas

    // stores only the objKeys
    queue []string

    // whether the queue has been populated: set to true when the first batch is
    // inserted via Replace(), or on the first Add/Update/Delete call
    populated bool
    // number of objects inserted by the first Replace() call
    initialPopulationCount int

    // keyFunc computes the objKey for a given obj
    keyFunc KeyFunc

    // known objects; in practice this is the Indexer
    knownObjects KeyListerGetter

    // whether the queue has been closed
    closed bool

    // emit the first batch with the Replaced delta type (for compatibility with the older Sync)
    emitDeltaTypeReplaced bool
}

DeltaType has the following values:

// staging/src/k8s.io/client-go/tools/cache/delta_fifo.go
type DeltaType string

const (
    Added    DeltaType = "Added"
    Updated  DeltaType = "Updated"
    Deleted  DeltaType = "Deleted"
    Replaced DeltaType = "Replaced" // first list or re-list
    Sync     DeltaType = "Sync"     // in older versions a resync was reported as Sync
)

From the Reflector analysis above we know DeltaFIFO's responsibility: incoming events are processed under the queue lock (queueActionLocked), deduplicated (dedupDeltas), and stored in the local store maintained by DeltaFIFO itself, namely queue (objKeys only) and items (objKey and its accumulated Deltas). They are then continuously consumed through Pop, with the related logic handled by the Process(item) callback.
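A minimal consumption sketch, reusing the fifo returned by the Reflector sketch above. In the client-go version analyzed here PopProcessFunc takes a single obj argument; a real handler must also deal with cache.DeletedFinalStateUnknown on deletes:

// Pop blocks until some objKey has pending Deltas, then hands the accumulated
// Deltas slice for that key to the process callback in one call.
for {
    _, err := fifo.Pop(func(obj interface{}) error {
        for _, d := range obj.(cache.Deltas) {
            if pod, ok := d.Object.(*corev1.Pod); ok {
                fmt.Printf("%s %s/%s\n", d.Type, pod.Namespace, pod.Name)
            }
        }
        return nil
    })
    if err != nil {
        break // e.g. the FIFO was closed
    }
}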

Indexer

The resources obtained by ListAndWatch in the previous step are stored in DeltaFIFO, and Pop is then called to consume them from the queue. In actual use, the Process function is sharedIndexInformer.HandleDeltas: it performs Add/Update/Delete according to the DeltaTypes above, and creates, updates, or deletes the corresponding indexes at the same time.

The specific index implementation is as follows:

// staging/src/k8s.io/client-go/tools/cache/index.go
// map of index name => index function
type Indexers map[string]IndexFunc

// map of index name => Index (the map of index values)
type Indices map[string]Index

// map of index values: indexedValue computed by the index function => set of objKeys [objKey1, objKey2...]
type Index map[string]sets.String

Index function (IndexFunc): the function that computes index values; multiple different index functions can be registered. The default and most commonly used one is MetaNamespaceIndexFunc.

Index value (indexedValue): in some places called indexKey; the index value (such as ns1) computed by the index function (IndexFunc).

Object key (objKey): the unique key of the object obj (such as ns1/pod1), corresponding one-to-one with a resource object.

It can be seen that Indexer embeds the ThreadSafeStore interface and is ultimately implemented by threadSafeMap.

The difference between an index function IndexFunc (such as MetaNamespaceIndexFunc) and a KeyFunc (such as MetaNamespaceKeyFunc): the former defines how to compute an index value, the latter defines how to obtain the object key (objKey). The difference between an index key (indexKey, in some places indexedValue) and an object key (objKey): the former is the index value (such as ns1) computed by the IndexFunc, the latter is the unique key of the obj (such as ns1/pod1).

Controller

As the central hub, the Controller integrates the components above (Reflector, DeltaFIFO, Indexer/Store) and acts as the bridge to downstream consumers.

The Controller interface is implemented by the controller struct:

It is a convention in K8s: an exported interface (uppercase) is implemented by a corresponding unexported struct (lowercase).

// staging/src/k8s.io/client-go/tools/cache/controller.go
type controller struct {
    config         Config
    reflector      *Reflector // the component analyzed above
    reflectorMutex sync.RWMutex
    clock          clock.Clock
}

type Config struct {
    // actually implemented by DeltaFIFO
    Queue

    // needed to construct the Reflector
    ListerWatcher

    // handler for objs popped from the queue
    Process ProcessFunc

    // target object type
    ObjectType runtime.Object

    // full resync period
    FullResyncPeriod time.Duration

    // function deciding whether to resync
    ShouldResync ShouldResyncFunc

    // if true and Process() returns an err, the item is re-queued
    RetryOnError bool

    // callback invoked when Watch returns an err
    WatchErrorHandler WatchErrorHandler

    // watch page size
    WatchListPageSize int64
}

Controller.Run is started in a goroutine. It starts the Reflector's ListAndWatch() to pull the full list and watch incremental resources from the apiserver and store them into DeltaFIFO, and then starts processLoop to keep consuming from DeltaFIFO via Pop. In sharedIndexInformer the Pop processing function is HandleDeltas, which on the one hand maintains Add/Update/Delete of the Indexer, and on the other hand hands events to the downstream sharedProcessor for handler processing.
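A sketch of wiring this low-level Controller by hand, assuming lw and fifo are built as in the Reflector sketch earlier and the cache, corev1, fmt and time imports; real programs normally go through SharedIndexInformer instead, and in the client-go version analyzed here Process takes a single obj argument:

// cache.New glues the internal Reflector (built from ListerWatcher + ObjectType),
// the Queue (DeltaFIFO) and the Process callback together; Run starts ListAndWatch
// plus the processLoop.
func runLowLevelController(lw cache.ListerWatcher, fifo *cache.DeltaFIFO, stopCh <-chan struct{}) {
    cfg := &cache.Config{
        Queue:            fifo,             // DeltaFIFO implements Queue
        ListerWatcher:    lw,               // used to build the internal Reflector
        ObjectType:       &corev1.Pod{},    // target resource type
        FullResyncPeriod: 30 * time.Second, // periodic resync
        RetryOnError:     false,
        Process: func(obj interface{}) error { // called for each popped cache.Deltas
            for _, d := range obj.(cache.Deltas) {
                fmt.Println("delta:", d.Type)
            }
            return nil
        },
    }
    cache.New(cfg).Run(stopCh) // blocks until stopCh is closed
}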

Start Shared Informer

The SharedInformer interface is embedded in SharedIndexInformer, which is implemented by sharedIndexInformer (again the pattern: an uppercase interface implemented by a lowercase struct).

Take a look at the structure definition:

// staging/src/k8s.io/client-go/tools/cache/shared_informer.go
type SharedIndexInformer interface {
    SharedInformer
    // AddIndexers add indexers to the informer before it starts.
    AddIndexers(indexers Indexers) error
    GetIndexer() Indexer
}

type sharedIndexInformer struct {
    indexer    Indexer
    controller Controller

    // the processing component, the focus of what follows
    processor *sharedProcessor

    // detects unexpected cache mutations; generally used for debugging, disabled by default
    cacheMutationDetector MutationDetector

    // needed to construct the Reflector
    listerWatcher ListerWatcher

    // target type, used by the Reflector to check the resource type
    objectType runtime.Object

    // period at which the Reflector checks whether to resync
    resyncCheckPeriod time.Duration

    // default resync period used when the caller does not specify one
    defaultEventHandlerResyncPeriod time.Duration
    clock                           clock.Clock

    // two bools expressing three states: before the controller starts, started, stopped
    started, stopped bool
    startedLock      sync.Mutex

    // while Pop is consuming the queue, adding a new listener must take this lock
    // to avoid inconsistent delivery
    blockDeltas sync.Mutex

    // callback invoked when Watch returns an err
    watchErrorHandler WatchErrorHandler
}

type sharedProcessor struct {
    listenersStarted bool
    listenersLock    sync.RWMutex
    listeners        []*processorListener
    syncingListeners []*processorListener // the listeners that currently need to resync
    clock            clock.Clock
    wg               wait.Group
}

As the structure definition shows, the embedded controller (analyzed above) drives the Reflector's ListAndWatch, stores into DeltaFIFO, and starts the Pop consumption loop; in sharedIndexInformer the Pop processing function is HandleDeltas.

All listeners are added to the processorListener slice through sharedIndexInformer.AddEventHandler, which behaves differently depending on whether the controller has already started:

// staging/src/k8s.io/client-go/tools/cache/shared_informer.go
func (s *sharedIndexInformer) AddEventHandlerWithResyncPeriod(handler ResourceEventHandler, resyncPeriod time.Duration) {
    ...

    // if not started yet, just addListener and return
    if !s.started {
        s.processor.addListener(listener)
        return
    }

    // take the lock
    s.blockDeltas.Lock()
    defer s.blockDeltas.Unlock()

    s.processor.addListener(listener)

    // iterate over all existing objects and send them to the newly added listener
    for _, item := range s.indexer.List() {
        listener.add(addNotification{newObj: item})
    }
}

Then, HandleDeltas calls sharedProcessor.distribute to deliver the object to all listening listeners according to its delta type (Added/Updated/Deleted/Replaced/Sync).
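On the user side, registering a listener just means passing a ResourceEventHandler. A minimal sketch, assuming podInformer is a cache.SharedIndexInformer for Pods (in the client-go version analyzed here AddEventHandler has no return value):

// Each handler registered here becomes one processorListener inside sharedProcessor.
podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
    AddFunc: func(obj interface{}) {
        pod := obj.(*corev1.Pod)
        fmt.Println("added:", pod.Namespace+"/"+pod.Name)
    },
    UpdateFunc: func(oldObj, newObj interface{}) {
        fmt.Println("updated:", newObj.(*corev1.Pod).Name)
    },
    DeleteFunc: func(obj interface{}) {
        // obj may be a cache.DeletedFinalStateUnknown if the watch missed the delete
        fmt.Println("deleted")
    },
})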

Register SharedInformerFactory

SharedInformerFactory is the factory for creating and sharing SharedInformers, a classic factory design pattern with high cohesion and low coupling. Its structure is defined as follows:

// staging/src/k8s.io/client-go/informers/factory.go
type SharedInformerFactory interface {
    internalinterfaces.SharedInformerFactory // the key internal interface
    ForResource(resource schema.GroupVersionResource) (GenericInformer, error)
    WaitForCacheSync(stopCh <-chan struct{}) map[reflect.Type]bool

    Admissionregistration() admissionregistration.Interface
    Internal() apiserverinternal.Interface
    Apps() apps.Interface
    Autoscaling() autoscaling.Interface
    Batch() batch.Interface
    Certificates() certificates.Interface
    Coordination() coordination.Interface
    Core() core.Interface
    Discovery() discovery.Interface
    Events() events.Interface
    Extensions() extensions.Interface
    Flowcontrol() flowcontrol.Interface
    Networking() networking.Interface
    Node() node.Interface
    Policy() policy.Interface
    Rbac() rbac.Interface
    Scheduling() scheduling.Interface
    Storage() storage.Interface
}

// staging/src/k8s.io/client-go/informers/internalinterfaces/factory_interfaces.go
type SharedInformerFactory interface {
    Start(stopCh <-chan struct{}) // starts SharedIndexInformer.Run
    InformerFor(obj runtime.Object, newFunc NewInformerFunc) cache.SharedIndexInformer // registers the informer for the target type
}

Take PodInformer as an example to illustrate how users build their own Informer. PodInformer is defined as follows:

// staging/src/k8s.io/client-go/informers/core/v1/pod.go
type PodInformer interface {
    Informer() cache.SharedIndexInformer
    Lister() v1.PodLister
}

It is implemented by the lowercase podInformer (once again the K8s style of an uppercase interface with a lowercase implementation):

type podInformer struct {
    factory          internalinterfaces.SharedInformerFactory
    tweakListOptions internalinterfaces.TweakListOptionsFunc
    namespace        string
}

func (f *podInformer) defaultInformer(client kubernetes.Interface, resyncPeriod time.Duration) cache.SharedIndexInformer {
    return NewFilteredPodInformer(client, f.namespace, resyncPeriod,
        cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc}, f.tweakListOptions)
}

func (f *podInformer) Informer() cache.SharedIndexInformer {
    return f.factory.InformerFor(&corev1.Pod{}, f.defaultInformer)
}

func (f *podInformer) Lister() v1.PodLister {
    return v1.NewPodLister(f.Informer().GetIndexer())
}

The user passes in the target type (&corev1.Pod{}) and a constructor (defaultInformer), and calls SharedInformerFactory.InformerFor to register the target Informer; SharedInformerFactory.Start then runs it, kicking off the SharedIndexInformer -> Controller -> Reflector -> DeltaFIFO flow analyzed above.

By letting the user register an Informer with nothing more than a target type and a constructor, SharedInformerFactory achieves a high-cohesion, low-coupling design.
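Putting the user side together, a minimal sketch (assuming an already configured *kubernetes.Clientset named clientset, plus the informers, labels, fmt and time imports from client-go/apimachinery):

// Build the factory, register the Pod informer, start everything and wait for the caches.
factory := informers.NewSharedInformerFactory(clientset, 30*time.Second)

podInformer := factory.Core().V1().Pods().Informer() // registered via InformerFor
podLister := factory.Core().V1().Pods().Lister()     // read-only view backed by the Indexer

stopCh := make(chan struct{})
factory.Start(stopCh)            // runs SharedIndexInformer.Run for every registered informer
factory.WaitForCacheSync(stopCh) // blocks until the initial List of each type is synced

pods, _ := podLister.Pods("default").List(labels.Everything())
fmt.Println("pods in default:", len(pods))
_ = podInformer // event handlers can now be added via AddEventHandler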

Callback: processorListener

All listeners are implemented by processorListener and are divided into two groups, listeners and syncingListeners; distribution traverses all listeners of the relevant group and hands the data to each processorListener for processing.

Because the resyncPeriod set by each listener may differ, those without one (resyncPeriod = 0) are placed only in the listeners group, while those with a resyncPeriod set are also placed in the syncingListeners group. If a listener's resyncPeriod is constrained in more than one place (sharedIndexInformer.resyncCheckPeriod, sharedIndexInformer.AddEventHandlerWithResyncPeriod), the minimum value minimumResyncPeriod is taken.

// staging/src/k8s.io/client-go/tools/cache/shared_informer.go
func (p *sharedProcessor) distribute(obj interface{}, sync bool) {
    p.listenersLock.RLock()
    defer p.listenersLock.RUnlock()

    if sync {
        for _, listener := range p.syncingListeners {
            listener.add(obj)
        }
    } else {
        for _, listener := range p.listeners {
            listener.add(obj)
        }
    }
}

From the code we can see that processorListener cleverly uses two channels (addCh, nextCh) plus a pendingNotifications buffer (a growing ring implemented on a slice, default initialBufferSize = 1024). This transfers data efficiently without letting a slow downstream handler block the upstream distributor, and it is a pattern worth learning from.
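A simplified, self-contained sketch of that pattern (not the client-go code itself; the listener type and names here are illustrative, and the ring buffer is k8s.io/utils/buffer.RingGrowing):

package main

import (
    "fmt"
    "time"

    "k8s.io/utils/buffer"
)

// listener is a stripped-down stand-in for processorListener: add() hands an event to
// pop(), pop() parks notifications in a growing ring buffer while the consumer is busy,
// and run() consumes from nextCh.
type listener struct {
    addCh   chan interface{}
    nextCh  chan interface{}
    pending *buffer.RingGrowing
}

func newListener() *listener {
    return &listener{
        addCh:   make(chan interface{}),
        nextCh:  make(chan interface{}),
        pending: buffer.NewRingGrowing(1024), // initialBufferSize
    }
}

func (p *listener) add(obj interface{}) { p.addCh <- obj }

func (p *listener) pop() {
    defer close(p.nextCh)
    var nextCh chan<- interface{} // nil until there is something to send
    var notification interface{}
    for {
        select {
        case nextCh <- notification:
            // consumer took the value; refill from the buffer
            var ok bool
            notification, ok = p.pending.ReadOne()
            if !ok {
                nextCh = nil // buffer empty: disable this select case
            }
        case obj, ok := <-p.addCh:
            if !ok {
                return
            }
            if notification == nil { // nothing in flight: hand it over directly
                notification = obj
                nextCh = p.nextCh
            } else { // consumer busy: park it in the ring buffer
                p.pending.WriteOne(obj)
            }
        }
    }
}

func (p *listener) run() {
    for obj := range p.nextCh {
        fmt.Println("handle:", obj)
    }
}

func main() {
    p := newListener()
    go p.pop()
    go p.run()
    for i := 0; i < 5; i++ {
        p.add(i)
    }
    time.Sleep(100 * time.Millisecond) // let the handler drain (demo only)
}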

workqueue gets busy

Through the processorListener callback in the previous step, events are handed to the registered ResourceEventHandler for the actual add/update/delete (CUD) processing, by invoking the registered OnAdd/OnUpdate/OnDelete functions.

To process quickly without blocking the processorListener callback, a workqueue is usually used for asynchronous, decoupled processing. In its structure, workqueue.RateLimitingInterface embeds DelayingInterface, DelayingInterface embeds Interface, and the whole thing is implemented by rateLimitingType, providing three core capabilities: rate limiting, delayed enqueueing (implemented with a priority queue based on a min-heap), and plain queue processing.

In addition, the code shows that K8s implements three RateLimiters: BucketRateLimiter, ItemExponentialFailureRateLimiter, and ItemFastSlowRateLimiter. By default the Controller combines the first two:

// staging/src/k8s.io/client-go/util/workqueue/default_rate_limiters.go
func DefaultControllerRateLimiter() RateLimiter {
    return NewMaxOfRateLimiter(
        NewItemExponentialFailureRateLimiter(5*time.Millisecond, 1000*time.Second),
        // 10 qps, 100 bucket size.  This is only for retry speed and its only the overall factor (not per item)
        &BucketRateLimiter{Limiter: rate.NewLimiter(rate.Limit(10), 100)},
    )
}

In this way, the user side can call the workqueue methods to get flexible queue behavior, such as giving up after a number of failed retries, delaying re-enqueue after a failure, and limiting queue throughput (QPS), achieving non-blocking asynchronous processing.
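Finally, a minimal sketch of the usual handler + worker pattern in the client-go version analyzed here (assuming podInformer from the earlier sketch and the workqueue and cache imports; syncPod is a hypothetical reconcile function):

// The event handler only enqueues keys; the worker does the real work and drives
// retries through the rate limiter.
queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())

podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
    AddFunc: func(obj interface{}) {
        if key, err := cache.MetaNamespaceKeyFunc(obj); err == nil {
            queue.Add(key) // enqueue "namespace/name", never the object itself
        }
    },
})

// worker loop
go func() {
    for {
        key, shutdown := queue.Get()
        if shutdown {
            return
        }
        func() {
            defer queue.Done(key)
            if err := syncPod(key.(string)); err != nil {
                if queue.NumRequeues(key) < 5 {
                    queue.AddRateLimited(key) // delayed, rate-limited retry
                    return
                }
            }
            queue.Forget(key) // success, or gave up: reset the failure counter
        }()
    }
}()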

Summary

This article analyzed the Reflector, DeltaFIFO (delta queue), Indexer, Controller, SharedInformer (shared resource notifier), processorListener (event listener), workqueue (event-processing work queue) and related components in K8s, walking through the Informer implementation mechanism with source code, diagrams and text, in order to better understand how the K8s Informer works.

It can be seen that, to keep the core flow efficient and non-blocking, K8s makes heavy use of goroutines, channels, queues, indexes and map-based deduplication; through well-designed interfaces it exposes plenty of extension points to users; and it applies a consistent naming convention for interfaces and their implementations. All of these are worth learning and borrowing from.

Original: blog.csdn.net/weixin_45804031/article/details/125893897