Client-go之Informer机制本地存储Indexer

在这里插入图片描述
若想与作者沟通交流问题，请关注微信公众号“云原生手记”

背景

client-go Informer机制的主要逻辑是controller控制reflector从apiserver获取感兴趣的资源对象的数据，然后将数据放入DeltaFIFO队列中，controller将从队列中消费出对象增量数据，然后将增量存入本地存储，同时根据增量类型（Added,Update,Delete）进行事件通知。那么client-go中的本地存储是什么样的存在呢？我们之前常听说k8s中的list-watch机制能够让用户在取对象数据时直接在本地捞数据，而不是请求apiserver获取，从而减轻apiserver的压力，其中的本地数据就是本地存储Indexer中所存储的数据。所以本文就研读client-go的本地存储Indexer，探求其实现原理。
在这里插入图片描述

Indexer

在读Indexer部分代码时，有些概念有点搞不清，也不明白这样设计的原因，直到我看到了阳明大佬的文章（https://mp.weixin.qq.com/s/xCa6yZTk0X76IZhOx6IbHQ)。大佬对于Indexer理解以及举例简直是妙啊，这边也推荐各位小伙伴看下阳明大佬的文章，加深理解。
本地存储Indexer涉及的源码文件有：
client-go\tools\cache\index.go
client-go\tools\cache\store.go
client-go\tools\cache\thread_safe_store.go

Indexer接口

本地存储接口定义为Indexer，中文可以意为索引器，那么为什么要叫索引器呢？比如，我们client-go中一般是根据namespace或者资源名称等来取对象，这个取对象的过程其实就是在本地存储取的。那把本地存储当成数据库，为了能更快的获取数据，你需要使用索引。同样这边的Indexer就是充当索引的功能，但是，就像你数据库中一张表可能不止一个索引吧，对于一个资源对象，你获取它的方式除了通过名称，还能通过命名空间，如果对象是pod，还能通过nodename来找出满足条件的对象，所以针对一类资源的索引器可以有多个的。了解了Indexer的用途，我们来看下索引器Indexer接口中的函数：

type Indexer interface {
    
    
	Store // store接口
	// 根据索引器名称和对象，获取存储的对象
	Index(indexName string, obj interface{
    
    }) ([]interface{
    
    }, error)
	// 根据索引器名称和被索引的值获取被存储对象的key
	IndexKeys(indexName, indexedValue string) ([]string, error)
	// 根据索引器名称获取该索引器被索引的值
	ListIndexFuncValues(indexName string) []string
	// 根据索引名称和被索引的值获取对象
	ByIndex(indexName, indexedValue string) ([]interface{
    
    }, error)
	// 获取索引器
	GetIndexers() Indexers
	// 添加新的索引器
	AddIndexers(newIndexers Indexers) error
}

继续该文件中定义有关Indexer的类型：理解IndexFunc、Index、Indexers、Indice这几个结构，对于理解indexer至关重要，整个indexer就围绕这个几个结构进行操作了。

// 定义索引函数类型，根据一个对象可以计算一系列可被索引的值，一般被索引值只有一个，见下面的函数。
type IndexFunc func(obj interface{
    
    }) ([]string, error)

// 默认的indexFunc，根据Namespace定义的索引函数，返回数据为命名空间
func MetaNamespaceIndexFunc(obj interface{
    
    }) ([]string, error) {
    
    
	meta, err := meta.Accessor(obj)
	if err != nil {
    
    
		return []string{
    
    ""}, fmt.Errorf("object has no meta: %v", err)
	}
	return []string{
    
    meta.GetNamespace()}, nil // 索引值集就是Namespace
}

type Index map[string]sets.String // 索引值对应对象键集合，比如命名空间索引，命名空间名称对应命名空间下的对象键集合，根据对象键可以获取对象

type Indexers map[string]IndexFunc  // 根据索引器名称获取索引函数，一个索引必须有自己的索引函数

type Indices map[string]Index // 索引器集合（相当于一张表有很多索引），根据索引器名称获取索引器

Indexer的实现cache

属性

type cache struct {
    
    
	cacheStorage ThreadSafeStore // 安全存储,是个接口
	keyFunc KeyFunc // 用于计算对象键
}

因为cache中很多方法都是调用了ThreadSafeStore中的方法，所以有必要先看下ThreadSafeStore接口及其实现类。

ThreadSafeStore接口及其实现

看下ThreadSafeStore接口：

type ThreadSafeStore interface {
    
    
	Add(key string, obj interface{
    
    })
	Update(key string, obj interface{
    
    })
	Delete(key string)
	Get(key string) (item interface{
    
    }, exists bool)
	List() []interface{
    
    }
	ListKeys() []string
	Replace(map[string]interface{
    
    }, string)
	Index(indexName string, obj interface{
    
    }) ([]interface{
    
    }, error)
	IndexKeys(indexName, indexKey string) ([]string, error)
	ListIndexFuncValues(name string) []string
	ByIndex(indexName, indexKey string) ([]interface{
    
    }, error)
	GetIndexers() Indexers
	AddIndexers(newIndexers Indexers) error // 添加索引器
	Resync() error // 方法被弃
}

继续看下ThreadSafeStore的实现threadSafeMap
属性：

type threadSafeMap struct {
    
    
	lock  sync.RWMutex // 一把读写锁
	items map[string]interface{
    
    } // 一个Map。是用来存对象的，根据对象键可以在该Map中获取对象
	indexers Indexers //  map[string]IndexFunc ，索引名称->indexfunc
	indices Indices // map[string]Index , 索引名称->索引器
}

初始化方法NewThreadSafeStore：

只需要传入Indexers和Indices两个map即可

func NewThreadSafeStore(indexers Indexers, indices Indices) ThreadSafeStore {
    
    
	return &threadSafeMap{
    
    
		items:    map[string]interface{
    
    }{
    
    },
		indexers: indexers,
		indices:  indices,
	}
}

方法(增删改)：
这几个方法在修改完items中的对象后，还需要对Indices操作，因为你操作对象后还需要修改下索引后面的对象键集合

func (c *threadSafeMap) Add(key string, obj interface{
    
    }) {
    
    
	c.lock.Lock() // 加锁
	defer c.lock.Unlock() //解锁
	oldObject := c.items[key] // 获取该key下的老对象
	c.items[key] = obj // 覆盖，key为对象键直接存入，index存储的key也需要添加。（获取对象时先去索引器获取对象键集合，然后再根据对象键到items中拿对象）
	c.updateIndices(oldObject, obj, key) // 要是有老对象就先删除老对象，在更新新对象到index
}

func (c *threadSafeMap) Update(key string, obj interface{
    
    }) {
    
    
	c.lock.Lock()
	defer c.lock.Unlock()
	oldObject := c.items[key]
	c.items[key] = obj
	c.updateIndices(oldObject, obj, key)
}

func (c *threadSafeMap) Delete(key string) {
    
    
	c.lock.Lock()
	defer c.lock.Unlock()
	if obj, exists := c.items[key]; exists {
    
    
		c.deleteFromIndices(obj, key) // 从Indice中删除老对象
		delete(c.items, key)
	}
}

updateIndices方法：

在这里插入图片描述

func (c *threadSafeMap) updateIndices(oldObj interface{
    
    }, newObj interface{
    
    }, key string) {
    
    
	if oldObj != nil {
    
     // 老对象不为空，那就从Indice中删除老对象
		c.deleteFromIndices(oldObj, key) // 将obj从每个管理的Index中删除
	}
	for name, indexFunc := range c.indexers {
    
     // 遍历索引器
		indexValues, err := indexFunc(newObj) // 获取对象在这个索引下索引值集合。比若说，default命名空间下在Node1上的pod1，它的对应的索引有命名空间索引、Pod名称索引，nodename索引，pod在每个索引里的索引键集合是default,pod1,Node1
		if err != nil {
    
    
			panic(fmt.Errorf("unable to calculate an index entry for key %q on index %q: %v", key, name, err))
		}
		index := c.indices[name] // 根据索引名称获取索引器
		if index == nil {
    
     // 索引器为空，该新对象之前没建过索引，这边给它新建一个
			index = Index{
    
    }
			c.indices[name] = index // 新建这个索引
		}
		// 索引不为空
		// 将获取到的索引键集合中索引值对应的对象数据添加上
		for _, indexValue := range indexValues {
    
     // 遍历索引键集合
			set := index[indexValue] // 在索引器中获取索引值集合
			if set == nil {
    
     // 对象键集合为空
				set = sets.String{
    
    }
				index[indexValue] = set // 集合为空，就新建一个
			}
			set.Insert(key) // 将新的对象键添加进去
		}
	}
}

Get方法：

根据对象键获取对象

func (c *threadSafeMap) Get(key string) (item interface{
    
    }, exists bool) {
    
    
	c.lock.RLock()
	defer c.lock.RUnlock()
	item, exists = c.items[key] // 从item中根据对象键拿对象
	return item, exists
}

List():

返回本地存储中资源对象的一份拷贝

func (c *threadSafeMap) List() []interface{
    
    } {
    
    
	c.lock.RLock()
	defer c.lock.RUnlock()
	list := make([]interface{
    
    }, 0, len(c.items))
	for _, item := range c.items {
    
    
		list = append(list, item) 
	}
	return list // 返回item中所有的对象
}

ListKey：
获取所有的对象键

func (c *threadSafeMap) ListKeys() []string {
    
    
	c.lock.RLock()
	defer c.lock.RUnlock()
	list := make([]string, 0, len(c.items))
	for key := range c.items {
    
    
		list = append(list, key)
	}
	return list
}

ListKeys:
获取所有对象键

func (c *threadSafeMap) ListKeys() []string {
    
    
	c.lock.RLock()
	defer c.lock.RUnlock()
	list := make([]string, 0, len(c.items))
	for key := range c.items {
    
    
		list = append(list, key)
	}
	return list
}

Replace：

func (c *threadSafeMap) Replace(items map[string]interface{
    
    }, resourceVersion string) {
    
    
	c.lock.Lock()
	defer c.lock.Unlock()
	c.items = items // map替换

	// 重新构建所有的索引
	c.indices = Indices{
    
    } // indice清空
	for key, item := range c.items {
    
    
		c.updateIndices(nil, item, key)// 构建新的索引器
	}
}

Index:
根据索引名称和查询对象，获取一组该索引下的对象键集合

func (c *threadSafeMap) Index(indexName string, obj interface{
    
    }) ([]interface{
    
    }, error) {
    
    
	c.lock.RLock()
	defer c.lock.RUnlock()

	indexFunc := c.indexers[indexName] // 获取索引器即索引函数
	if indexFunc == nil {
    
    // 索引器的索引函数不能为空，否则会出错
		return nil, fmt.Errorf("Index with name %s does not exist", indexName)
	}

	indexedValues, err := indexFunc(obj) // 获取对象在该索引下的值集合
	if err != nil {
    
    
		return nil, err
	}
	index := c.indices[indexName] //获取索引器

	var storeKeySet sets.String
	if len(indexedValues) == 1 {
    
     // 只有一个Index
		// 大多数情况下，这里只有一个值，
		storeKeySet = index[indexedValues[0]] // 获取索引里的对象键集合
	} else {
    
    
		// 去重
		storeKeySet = sets.String{
    
    }
		for _, indexedValue := range indexedValues {
    
    
			for key := range index[indexedValue] {
    
    
				storeKeySet.Insert(key)
			}
		}
	}
	// 返回对象键集合的拷贝
	list := make([]interface{
    
    }, 0, storeKeySet.Len())
	for storeKey := range storeKeySet {
    
    
		list = append(list, c.items[storeKey])
	}
	return list, nil
}

ByIndex:
根据索引名称和被索引的值，获取对象键集合，是一份拷贝

// ByIndex returns a list of the items whose indexed values in the given index include the given indexed value
func (c *threadSafeMap) ByIndex(indexName, indexedValue string) ([]interface{
    
    }, error) {
    
    
	c.lock.RLock()
	defer c.lock.RUnlock()

	indexFunc := c.indexers[indexName] // 获取索引函数
	if indexFunc == nil {
    
     // 这边的判空也是对索引合法性的检查，如果都不存在索引函数，那就没必要继续了
		return nil, fmt.Errorf("Index with name %s does not exist", indexName)
	}

	index := c.indices[indexName] // 根据索引器名字查找索引器

	set := index[indexedValue] // 根据被索引的值获取索引的对象key集合
	list := make([]interface{
    
    }, 0, set.Len()) // 构造一个拷贝返回出去
	for key := range set {
    
    
		list = append(list, c.items[key])
	}

	return list, nil
}

IndexKeys：
根据索引名称和索引值获取对象键集合

func (c *threadSafeMap) IndexKeys(indexName, indexedValue string) ([]string, error) {
    
    
	c.lock.RLock()
	defer c.lock.RUnlock()
	// indexFunc用于判断索引是否存在
	indexFunc := c.indexers[indexName] // 获取索引函数
	if indexFunc == nil {
    
    
		return nil, fmt.Errorf("Index with name %s does not exist", indexName)
	}

	index := c.indices[indexName] // 获取索引

	set := index[indexedValue] // 获取对象键集合 
	return set.List(), nil
}

ListIndexFuncValues：
根据索引器名称获取该索引下的所有对象键集合

func (c *threadSafeMap) ListIndexFuncValues(indexName string) []string {
    
    
	c.lock.RLock()
	defer c.lock.RUnlock()

	index := c.indices[indexName] // 获取索引器
	names := make([]string, 0, len(index))
	for key := range index {
    
    
		names = append(names, key)
	}
	return names// 返回改索引下的所有Key的集合
}

AddIndexers:
添加索引器

func (c *threadSafeMap) AddIndexers(newIndexers Indexers) error {
    
    
	c.lock.Lock()
	defer c.lock.Unlock()

	if len(c.items) > 0 {
    
     // 已经有对象数据了，就不能添加索引器了
		return fmt.Errorf("cannot add indexers to running index")
	}

	oldKeys := sets.StringKeySet(c.indexers) // 老的索引器名称集合
	newKeys := sets.StringKeySet(newIndexers) // 新的索引器名称集合

	if oldKeys.HasAny(newKeys.List()...) {
    
     // 新老索引集合索引名称有重合就不允许了
		return fmt.Errorf("indexer conflict: %v", oldKeys.Intersection(newKeys))
	}
	// 添加新的索引器
	for k, v := range newIndexers {
    
    
		c.indexers[k] = v
	}
	return nil
}

deleteFromIndices:
从indice中指定的Index下删除对象键

func (c *threadSafeMap) deleteFromIndices(obj interface{
    
    }, key string) {
    
    
	for name, indexFunc := range c.indexers {
    
     // 遍历索引器几个
		indexValues, err := indexFunc(obj) // 获取对象索引值集合
		if err != nil {
    
    
			panic(fmt.Errorf("unable to calculate an index entry for key %q on index %q: %v", key, name, err))
		}

		index := c.indices[name] // 根据索引器名字获取索引器
		if index == nil {
    
     //找不到index
			continue
		}
		// 找到对应的索引器了
		for _, indexValue := range indexValues {
    
     // 遍历对象的索引值数组
			set := index[indexValue] // 获取对象键结合
			if set != nil {
    
     // 对象键集合不为空
				set.Delete(key) // 从对象键集合中删除待删除对象的key
				if len(set) == 0 {
    
     // 如果删除后，对象键集合为空，就删除该索引
					delete(index, indexValue)
				}
			}
		}
	}
}

cache的方法

1、增删改方法：
看完threadSafeMap再来看这个，你会发现这边的增删改方法就是调用了threadSafeMap中的增删改方法。

func (c *cache) Add(obj interface{
    
    }) error {
    
    
	key, err := c.keyFunc(obj)
	if err != nil {
    
    
		return KeyError{
    
    obj, err}
	}
	c.cacheStorage.Add(key, obj)
	return nil
}

// Update sets an item in the cache to its updated state.
func (c *cache) Update(obj interface{
    
    }) error {
    
    
	key, err := c.keyFunc(obj)
	if err != nil {
    
    
		return KeyError{
    
    obj, err}
	}
	c.cacheStorage.Update(key, obj)
	return nil
}

// Delete removes an item from the cache.
func (c *cache) Delete(obj interface{
    
    }) error {
    
    
	key, err := c.keyFunc(obj)
	if err != nil {
    
    
		return KeyError{
    
    obj, err}
	}
	c.cacheStorage.Delete(key)
	return nil
}

2、其他方法
同样都是对threadSafeMap的一层封装

func (c *cache) List() []interface{
    
    } {
    
    
	return c.cacheStorage.List()
}

// ListKeys returns a list of all the keys of the objects currently
// in the cache.
func (c *cache) ListKeys() []string {
    
    
	return c.cacheStorage.ListKeys()
}

// GetIndexers returns the indexers of cache
func (c *cache) GetIndexers() Indexers {
    
    
	return c.cacheStorage.GetIndexers()
}

// Index returns a list of items that match on the index function
// Index is thread-safe so long as you treat all items as immutable
func (c *cache) Index(indexName string, obj interface{
    
    }) ([]interface{
    
    }, error) {
    
    
	return c.cacheStorage.Index(indexName, obj)
}

func (c *cache) IndexKeys(indexName, indexKey string) ([]string, error) {
    
    
	return c.cacheStorage.IndexKeys(indexName, indexKey)
}

// ListIndexFuncValues returns the list of generated values of an Index func
func (c *cache) ListIndexFuncValues(indexName string) []string {
    
    
	return c.cacheStorage.ListIndexFuncValues(indexName)
}

func (c *cache) ByIndex(indexName, indexKey string) ([]interface{
    
    }, error) {
    
    
	return c.cacheStorage.ByIndex(indexName, indexKey)
}

func (c *cache) AddIndexers(newIndexers Indexers) error {
    
    
	return c.cacheStorage.AddIndexers(newIndexers)
}

// Get returns the requested item, or sets exists=false.
// Get is completely threadsafe as long as you treat all items as immutable.
func (c *cache) Get(obj interface{
    
    }) (item interface{
    
    }, exists bool, err error) {
    
    
	key, err := c.keyFunc(obj)
	if err != nil {
    
    
		return nil, false, KeyError{
    
    obj, err}
	}
	return c.GetByKey(key)
}

// GetByKey returns the request item, or exists=false.
// GetByKey is completely threadsafe as long as you treat all items as immutable.
func (c *cache) GetByKey(key string) (item interface{
    
    }, exists bool, err error) {
    
    
	item, exists = c.cacheStorage.Get(key)
	return item, exists, nil
}

// Replace will delete the contents of 'c', using instead the given list.
// 'c' takes ownership of the list, you should not reference the list again
// after calling this function.
func (c *cache) Replace(list []interface{
    
    }, resourceVersion string) error {
    
    
	items := make(map[string]interface{
    
    }, len(list))
	for _, item := range list {
    
    
		key, err := c.keyFunc(item)
		if err != nil {
    
    
			return KeyError{
    
    item, err}
		}
		items[key] = item
	}
	c.cacheStorage.Replace(items, resourceVersion)
	return nil
}

// Resync is meaningless for one of these
func (c *cache) Resync() error {
    
    
	return nil
}

cache初始化

初始化的时候只传入了一个KeyFunc用于获取对象Key,然后在里面新建了ThreadSafeStore对象。并初始化了Indexer和Indices。

func NewStore(keyFunc KeyFunc) Store {
    
    
	return &cache{
    
    
		cacheStorage: NewThreadSafeStore(Indexers{
    
    }, Indices{
    
    }),
		keyFunc:      keyFunc,
	}
}

不知道读者是否好奇传进来的KeyFunc是什么样的？我在controller.go中找到了使用NewStore的调用者：

func NewInformer(
	lw ListerWatcher,
	objType runtime.Object,
	resyncPeriod time.Duration,
	h ResourceEventHandler,
) (Store, Controller) {
    
    
	clientState := NewStore(DeletionHandlingMetaNamespaceKeyFunc) // 新建Indexer，传入DeletionHandlingMetaNamespaceKeyFunc作为keyfunc

	return clientState, newInformer(lw, objType, resyncPeriod, h, clientState)
}

DeletionHandlingMetaNamespaceKeyFunc函数：

func DeletionHandlingMetaNamespaceKeyFunc(obj interface{
    
    }) (string, error) {
    
    
	if d, ok := obj.(DeletedFinalStateUnknown); ok {
    
     // DeletedFinalStateUnknown表示因为与apiserver失联导致的对象本地存在，而api-server已经删除的意外状态
		return d.Key, nil
	}
	return MetaNamespaceKeyFunc(obj) // 正常使用的keyFunc是这个
}

MetaNamespaceKeyFunc函数：

func MetaNamespaceKeyFunc(obj interface{
    
    }) (string, error) {
    
    
	if key, ok := obj.(ExplicitKey); ok {
    
     // 对象是个字符串，就返回字符串
		return string(key), nil
	}
	meta, err := meta.Accessor(obj) // 获取对象的Meta数据
	if err != nil {
    
    
		return "", fmt.Errorf("object has no meta: %v", err)
	}
	if len(meta.GetNamespace()) > 0 {
    
     // 如果对象存在命名空间，那就返回namespaxce/name的形式作为对象键。（像node这样的资源，是不存在namespace设置的，一般为空）
		return meta.GetNamespace() + "/" + meta.GetName(), nil
	}
	return meta.GetName(), nil // 没有namespace的对象，就返回name作为对象键

}

基于以上初始化过程以及调用，我们了解到，新建Indexer是只需要传入KeyFunc，然后内部初始化ThreadSafeStore。对于KeyFunc到底是什么样的，我们也探究过了，对象键取决于对象，分为三种返回：

对象是string，那就返回string
对象有Meta数据，对象Namespace不为空，对象键为：namespaxce/name
对象有Meta数据，对象Namespace为空，对象键为：name

总结

本篇文章主要介绍了client-go种的Indexer源码，Indexer充当的是内部存储的角色，用途是在本地缓存一份用户感兴趣的数据，用户取对象数据一般不直接访问apiserver，先去缓存取数据，这样就减轻了apiserver和etcd集群的压力。Indexer作为内存存储的同时，还实现了针对资源对象的索引机制，且可以自己定义索引器，使得本地存储获取资源比较快，这边的索引简单理解就是个Map,Map的特性就是便于查询。
在这里插入图片描述