REDIS18_IntSet (integer array), Dict, DictHT, DictEntry, progressive rehash

①. Intset - integer array

  • ①. IntSet is one of the implementations of the set type in Redis. It is built on an integer array and is variable-length and ordered. The structure is as follows:
typedef struct intset {
	uint32_t encoding; /* encoding: elements are stored as 16-bit, 32-bit, or 64-bit integers */
	uint32_t length;   /* number of elements */
	int8_t contents[]; /* integer array holding the element data */
} intset;
  • ②. The encoding has three possible values, one for each supported integer width:
/* INTSET_ENC_INT16 < INTSET_ENC_INT32 < INTSET_ENC_INT64 */
#define INTSET_ENC_INT16 (sizeof(int16_t)) /* 2-byte integers, same range as Java's short */
#define INTSET_ENC_INT32 (sizeof(int32_t)) /* 4-byte integers, same range as Java's int */
#define INTSET_ENC_INT64 (sizeof(int64_t)) /* 8-byte integers, same range as Java's long */
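As a minimal sketch of how a value could be mapped to one of these encodings, the following helper picks the smallest width that can hold it. The names ENC_INT16, ENC_INT32, ENC_INT64, value_encoding, and needs_upgrade are hypothetical and chosen for this illustration only; they mirror the macros above but are not the Redis functions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the INTSET_ENC_* macros: bytes per element. */
enum {
    ENC_INT16 = sizeof(int16_t),
    ENC_INT32 = sizeof(int32_t),
    ENC_INT64 = sizeof(int64_t)
};

/* Return the smallest encoding able to hold v. */
static uint8_t value_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX) return ENC_INT32;
    return ENC_INT16;
}

/* Return 1 if adding v to a set that currently uses current_enc
 * would require switching to a wider encoding, 0 otherwise. */
static int needs_upgrade(uint8_t current_enc, int64_t v) {
    assert(current_enc == ENC_INT16 || current_enc == ENC_INT32 ||
           current_enc == ENC_INT64);
    return value_encoding(v) > current_enc;
}
```

Because the three encodings are ordered by width, a plain numeric comparison is enough to decide whether an upgrade is needed.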
  • ③. To make lookups efficient, Redis stores all the integers of an intset in the contents array in ascending order.
  • ④. Suppose the intset holds three integers, each within the range of int16_t. The encoding used is then INTSET_ENC_INT16, and each part occupies the following number of bytes:
  1. encoding: 4 bytes
  2. length: 4 bytes
  3. contents: 2 bytes * 3 = 6 bytes
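The byte counts above can be checked with a small sketch. The helper name intset_bytes is hypothetical; the struct is the same layout as the intset shown earlier, and the figure of 8 header bytes assumes a typical platform where two uint32_t fields need no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the intset struct shown earlier. */
typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;

/* Total bytes needed for an intset holding `length` elements
 * of `encoding` bytes each (header plus contents). */
static size_t intset_bytes(uint32_t encoding, uint32_t length) {
    assert(encoding == 2 || encoding == 4 || encoding == 8);
    return sizeof(intset) + (size_t)encoding * length;
}
```

With three 2-byte elements this gives 4 + 4 + 6 = 14 bytes; after an upgrade to 4-byte elements with four entries it gives 4 + 4 + 16 = 24 bytes.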
  • ⑤. Now add the number 50000. It exceeds the range of int16_t, so the intset automatically upgrades its encoding to a size that fits. In this case the process is as follows:
  1. The encoding is upgraded to INTSET_ENC_INT32, so each integer occupies 4 bytes, and the array is resized according to the new encoding and the new number of elements.
  2. The existing elements are copied to their new positions in reverse order (back to front), so that no element is overwritten before it has been moved.
  3. The new element is placed at the end of the array.
  4. Finally, the encoding attribute of the intset is set to INTSET_ENC_INT32 and the length attribute to 4.
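The upgrade steps can be sketched on a bare buffer, independent of the Redis helpers. This is only an illustration under simplifying assumptions: it handles just the int16_t to int32_t case, ignores endianness conversion, and the function name upgrade16to32_and_add is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Widen n sorted int16_t values stored in buf to int32_t in place and add
 * value, which must lie outside the int16_t range. Returns the (possibly
 * moved) buffer, which then holds n + 1 sorted int32_t values. */
static void *upgrade16to32_and_add(void *buf, size_t n, int32_t value) {
    assert(value < INT16_MIN || value > INT16_MAX);
    int prepend = value < 0;            /* negative: new minimum, goes first */

    buf = realloc(buf, (n + 1) * sizeof(int32_t));
    if (buf == NULL) abort();

    /* Move elements back to front: the new slot of element i starts at or
     * after byte 4*i, beyond every old 2-byte element that is still unread. */
    for (size_t i = n; i-- > 0;) {
        int16_t old;
        int32_t wide;
        memcpy(&old, (char *)buf + i * sizeof old, sizeof old);
        wide = old;
        memcpy((char *)buf + (i + prepend) * sizeof wide, &wide, sizeof wide);
    }

    /* Place the new value in the free slot at the front or at the end. */
    memcpy((char *)buf + (prepend ? 0 : n) * sizeof value, &value, sizeof value);
    return buf;
}
```

For instance, starting from the three hypothetical int16_t values 5, 10, 20 and adding 50000, the buffer ends up holding the int32_t values 5, 10, 20, 50000. Copying front to back instead would overwrite 10 before it had been read.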

typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;

intset *intsetAdd(intset *is, int64_t value, uint8_t *success)
{
    // Work out which encoding the incoming value needs: 16-bit if it fits,
    // otherwise 32-bit, otherwise 64-bit
    uint8_t valenc = _intsetValueEncoding(value);
    // Position at which to insert
    uint32_t pos;
    if (success)
        *success = 1;

    /* Upgrade encoding if necessary. If we need to upgrade, we know that
     * this value should be either appended (if > 0) or prepended (if < 0),
     * because it lies outside the range of existing values. */
    // Check whether valenc exceeds the current encoding of the intset
    if (valenc > intrev32ifbe(is->encoding))
    {
        /* This always succeeds, so we don't need to curry *success. */
        // The current encoding is too small, so an upgrade is needed
        return intsetUpgradeAndAdd(is, value);
    }
    else
    {
        /* Abort if the value is already present in the set.
         * This call will populate "pos" with the right position to insert
         * the value when it cannot be found. */
        // Search the intset for an element equal to value; pos receives its index
        if (intsetSearch(is, value, &pos))
        {
            // Found: return immediately, because an intset cannot hold duplicates
            if (success)
                *success = 0;
            return is;
        }
        // Grow the array by one element
        is = intsetResize(is, intrev32ifbe(is->length) + 1);
        if (pos < intrev32ifbe(is->length))
            intsetMoveTail(is, pos, pos + 1);
    }

    _intsetSet(is, pos, value);
    is->length = intrev32ifbe(intrev32ifbe(is->length) + 1);
    return is;
}
/* Upgrades the intset to a larger encoding and inserts the given integer. */
static intset *intsetUpgradeAndAdd(intset *is, int64_t value)
{
    // Current encoding of the intset
    uint8_t curenc = intrev32ifbe(is->encoding);
    // New encoding required by value
    uint8_t newenc = _intsetValueEncoding(value);
    // Number of elements
    int length = intrev32ifbe(is->length);
    // A negative value is inserted at the head, a positive one at the tail
    int prepend = value < 0 ? 1 : 0;

    /* First set new encoding and resize */
    // Set the new encoding
    is->encoding = intrev32ifbe(newenc);
    // Resize the array
    is = intsetResize(is, intrev32ifbe(is->length) + 1);

    /* Upgrade back-to-front so we don't overwrite values.
     * Note that the "prepend" variable is used to make sure we have an empty
     * space at either the beginning or the end of the intset. */
    // Iterate in reverse, moving each element to its new position
    while (length--)
        _intsetSet(is, length + prepend, _intsetGetEncoded(is, length, curenc));

    /* Set the value at the beginning or the end. */
    // Insert the new element; prepend decides between head and tail
    if (prepend)
        _intsetSet(is, 0, value);
    else
        _intsetSet(is, intrev32ifbe(is->length), value);
    // Update the element count
    is->length = intrev32ifbe(intrev32ifbe(is->length) + 1);
    return is;
}
static uint8_t intsetSearch(intset *is, int64_t value, uint32_t *pos)
{
    // Initialize the binary search bounds: min, max, mid
    int min = 0, max = intrev32ifbe(is->length) - 1, mid = -1;
    int64_t cur = -1;

    /* The value can never be found when the set is empty */
    if (intrev32ifbe(is->length) == 0)
    {
        // The array is empty: the value becomes the first element
        if (pos)
            *pos = 0;
        return 0;
    }
    else
    {
        /* Check for the case where we know we cannot find the value,
         * but do know the insert position. */
        // The array is not empty: a value above the maximum goes at the tail
        if (value > _intsetGet(is, max))
        {
            if (pos)
                *pos = intrev32ifbe(is->length);
            return 0;
        }
        // A value below the minimum needs no search: it goes at the head
        else if (value < _intsetGet(is, 0))
        {
            if (pos)
                *pos = 0;
            return 0;
        }
    }
    // Binary search
    while (max >= min)
    {
        mid = ((unsigned int)min + (unsigned int)max) >> 1;
        cur = _intsetGet(is, mid);
        if (value > cur)
        {
            min = mid + 1;
        }
        else if (value < cur)
        {
            max = mid - 1;
        }
        else
        {
            break;
        }
    }

    if (value == cur)
    {
        // Found: report the index of the existing element
        if (pos)
            *pos = mid;
        return 1;
    }
    else
    {
        // Not found: report the position at which value should be inserted
        if (pos)
            *pos = min;
        return 0;
    }
}
  • ⑥. An intset can be regarded as a special integer array with the following characteristics:
  1. Redis ensures that the elements of an intset are unique and ordered.
  2. It has an encoding upgrade mechanism (as above, from 2 bytes to 4 bytes per element), so wider encodings are used only when needed, which saves memory.
  3. Lookups use binary search.
②. Dict - DictHT - DictEntry

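  • ①. Dict: the dictionary, which stores key-value mappings. Among other things, Redis uses it to hold the keys and values of a database.
typedef struct dict{
	dictType *type; // dict type; provides type-specific functions such as the hash function
	void *privdata; // private data, used for special hash operations
	dictht ht[2]; // a Dict holds two hash tables: one with the current data, the other normally empty and used during rehash
	long rehashidx; // rehash progress; -1 means no rehash is in progress
	int16_t pauserehash; // whether rehash is paused: greater than 0 means paused, 0 means not paused
}dict;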
  • ②. dictht: the hash table that holds the data. Its default size is 4. A Dict has two of them because the old and the new array must both be maintained while the table is being resized.
typedef struct dictht{
	// Array of entries;
	// each slot holds a pointer to a dictEntry
	dictEntry **table;
	// Size of the hash table
	unsigned long size;
	// Mask of the hash table size, always equal to size - 1
	unsigned long sizemask;
	// Number of entries
	unsigned long used;
}dictht;
  • ③. dictEntry: the node type that stores a single key-value pair
typedef struct dictEntry {
    void *key; // key: the K stored in Redis, whose type is SDS
    union {
        void *val; // pointer to a value object
        uint64_t u64; // unsigned integer
        int64_t s64; // signed integer
        double d;  // double
    } v; // val: the V stored in Redis, whose type is redisObject
    struct dictEntry *next; // pointer to the next entry in the same bucket
} dictEntry;
  • ④. To summarize the dictionary (Dict), the hash table (DictHashTable), and the hash node (DictEntry):
    When a key-value pair is added to the Dict, Redis first computes the hash value h of the key and then uses h & sizemask to determine the array index at which the entry is stored. For example, to store k1=v1 in a table of size 4 (so sizemask = 3), assume the hash value of k1 is h = 1. Then 1 & 3 = 1, so k1=v1 is stored at array index 1.
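The index calculation can be sketched as follows. The function name bucket_index is hypothetical; the sketch only illustrates why the mask works, namely that for a power-of-two size, h & (size - 1) equals h % size.

```c
#include <assert.h>
#include <stdint.h>

/* Bucket index for hash h in a table whose size is a power of two. */
static unsigned long bucket_index(uint64_t h, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0); /* must be a power of two */
    unsigned long sizemask = size - 1;             /* e.g. size 4 -> mask 0b11 */
    return (unsigned long)(h & sizemask);
}
```

With size 4, a hash of 1 maps to index 1 and a hash of 6 maps to index 2. This is also why the table size is always kept at a power of two: the mask is cheaper than a modulo and still covers every bucket.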
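typedef struct dict{
	dictType *type; // dict type; provides type-specific functions such as the hash function
	void *privdata; // private data, used for special hash operations
	dictht ht[2]; // a Dict holds two hash tables: one with the current data, the other normally empty and used during rehash
	long rehashidx; // rehash progress; -1 means no rehash is in progress
	int16_t pauserehash; // whether rehash is paused: greater than 0 means paused, 0 means not paused
}dict;
typedef struct dictht{
	// Array of entries;
	// each slot holds a pointer to a dictEntry
	dictEntry **table;
	// Size of the hash table
	unsigned long size;
	// Mask of the hash table size, always equal to size - 1
	unsigned long sizemask;
	// Number of entries
	unsigned long used;
}dictht;
typedef struct dictEntry{
	void *key; // key
	union{
		void *val;
		uint64_t u64;
		int64_t s64;
		double d;
	}v; // value
	// Pointer to the next entry
	struct dictEntry *next;
}dictEntry;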


③. Dict’s expansion and contraction mechanism

  • ①. The hash table in Dict is implemented as an array combined with singly linked lists. As the number of elements grows, hash collisions inevitably increase, and once the linked lists become too long, lookup efficiency drops sharply.

  • ②. Dict checks the load factor (LoadFactor = used/size) every time a new key-value pair is added. The hash table is expanded when either of the following conditions is met:

  1. LoadFactor >= 1 and the server is not running a background process such as BGSAVE or BGREWRITEAOF (in other words, expand while the CPU is not busy)
  2. LoadFactor > 5, regardless of background processes
  • ③. Expansion source code analysis:
static int _dictExpandIfNeeded(dict *d){
	// If a rehash is already in progress, return OK
	if(dictIsRehashing(d)) return DICT_OK;
	// If the hash table is empty, initialize it to the default size: 4
	if(d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);
	// Expand (dictExpand) when the load factor (used/size) has reached 1 and no
	// child process such as bgrewriteaof is running, or when the load factor exceeds 5
	if(d->ht[0].used >= d->ht[0].size &&
	  (dict_can_resize || d->ht[0].used/d->ht[0].size > dict_force_resize_ratio)){
		// Request a size of used + 1; dictExpand adjusts it and actually uses
		// the first 2^n that is greater than or equal to used + 1
		return dictExpand(d, d->ht[0].used + 1);
	}
	return DICT_OK;
}
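The comment in the listing above notes that the requested size is rounded up to a power of two. A minimal sketch of that rounding, assuming an initial size of 4, is shown below. The names next_power and HT_INITIAL_SIZE are hypothetical and are not taken from the Redis source.

```c
#include <assert.h>
#include <limits.h>

#define HT_INITIAL_SIZE 4UL /* assumed to mirror DICT_HT_INITIAL_SIZE */

/* Return the first power of two >= size, never smaller than the initial size. */
static unsigned long next_power(unsigned long size) {
    unsigned long i = HT_INITIAL_SIZE;

    /* Cap very large requests so that the doubling below cannot overflow. */
    if (size >= LONG_MAX) return LONG_MAX + 1UL;

    while (i < size) {
        assert(i <= ULONG_MAX / 2); /* holds because size < LONG_MAX */
        i *= 2;
    }
    return i;
}
```

For a table with used = 4 the request is 5, which rounds up to 8; a request of 9 rounds up to 16. Requests of 4 or less stay at the initial size of 4.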
  • ④. In addition to expansion, Dict will also check the load factor every time an element is deleted. When LoadFactor < 0.1, the hash table will be shrunk.
// t_hash.c, hashTypeDelete()
if(dictDelete((dict*)o->ptr, field) == C_OK){
	deleted = 1;
	// After a successful delete, check whether the Dict needs resizing; if so, call dictResize
	if(htNeedsResize(o->ptr)) dictResize(o->ptr);
}
// server.c
int htNeedsResize(dict *dict){
	long long size, used;
	// Size of the hash table
	size = dictSlots(dict);
	// Number of entries
	used = dictSize(dict);
	// size > 4 (the initial hash table size) and the load factor is below 0.1
	return (size > DICT_HT_INITIAL_SIZE && (used*100/size < HASHTABLE_MIN_FILL));
}
int dictResize(dict *d){
	unsigned long minimal;
	// Return an error if a BGSAVE, BGREWRITEAOF, or rehash is in progress
	if(!dict_can_resize || dictIsRehashing(d))
		return DICT_ERR;
	// Get used, i.e. the number of entries
	minimal = d->ht[0].used;
	// If used is less than 4, reset it to 4
	if(minimal < DICT_HT_INITIAL_SIZE)
		minimal = DICT_HT_INITIAL_SIZE;
	// Resize to minimal; in practice the first 2^n greater than or equal to minimal
	return dictExpand(d, minimal);
}

④. Progressive rehash of Dict

  • ①. The rehash of a Dict is not completed in one go. If the Dict contains millions of entries and all of them had to be rehashed at once, the main thread would very likely be blocked. The rehash is therefore carried out gradually over many operations, which is why it is called progressive rehash. The process is as follows:

  • ②. Calculate the size of the new hash table. The value depends on whether the table is being expanded or shrunk:

  1. When expanding, the new size is the first 2^n that is greater than or equal to dict.ht[0].used + 1
  2. When shrinking, the new size is the first 2^n that is greater than or equal to dict.ht[0].used (and not less than 4)
  • ③. Allocate memory according to the new size, create a dictht, and assign it to dict.ht[1]

  • ④. Set dict.rehashidx = 0 to mark the start of the rehash

  • ⑤. On every insert, lookup, update, or delete, check whether dict.rehashidx is greater than -1. If so, rehash the entry list at dict.ht[0].table[rehashidx] into dict.ht[1] and increment rehashidx. This repeats until all the data in dict.ht[0] has been rehashed into dict.ht[1]

  • ⑥. Assign dict.ht[1] to dict.ht[0], initialize dict.ht[1] to an empty hash table, and release the memory of the original dict.ht[0]

  • ⑦. Set rehashidx to -1, which means the rehash is complete.

  • ⑧. While the rehash is in progress, insertions are written directly to ht[1], whereas lookups, updates, and deletes search dict.ht[0] first and then dict.ht[1]. This guarantees that ht[0] only shrinks and never grows, so it eventually becomes empty as the rehash proceeds.
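The steps above can be sketched with a toy dictionary. This is a simplified illustration, not the Redis implementation: all names (toydict, toy_add, toy_find, and so on) are hypothetical, keys are unsigned integers used directly as their own hash, the caller is assumed never to add a key twice, and each operation migrates exactly one non-empty bucket.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry { unsigned long key; int val; struct entry *next; } entry;
typedef struct table { entry **buckets; unsigned long size, used; } table;
typedef struct toydict { table ht[2]; long rehashidx; } toydict;

static void table_init(table *t, unsigned long size) {
    assert((size & (size - 1)) == 0);            /* zero or a power of two */
    t->buckets = size ? calloc(size, sizeof(entry *)) : NULL;
    if (size && !t->buckets) abort();
    t->size = size;
    t->used = 0;
}

static void toy_init(toydict *d) {
    table_init(&d->ht[0], 4);
    table_init(&d->ht[1], 0);
    d->rehashidx = -1;                           /* -1: no rehash in progress */
}

/* Steps 3 and 4: allocate ht[1] and mark the start of the rehash. */
static void toy_start_rehash(toydict *d, unsigned long size) {
    assert(d->rehashidx == -1);
    table_init(&d->ht[1], size);
    d->rehashidx = 0;
}

/* Step 5: migrate one non-empty bucket; steps 6 and 7 once ht[0] is empty. */
static void toy_rehash_step(toydict *d) {
    if (d->rehashidx == -1) return;
    table *old = &d->ht[0], *new = &d->ht[1];
    while ((unsigned long)d->rehashidx < old->size &&
           old->buckets[d->rehashidx] == NULL)
        d->rehashidx++;                          /* skip empty buckets */
    if ((unsigned long)d->rehashidx < old->size) {
        entry *e = old->buckets[d->rehashidx];
        while (e) {                              /* move the whole chain */
            entry *next = e->next;
            unsigned long idx = e->key & (new->size - 1);
            e->next = new->buckets[idx];
            new->buckets[idx] = e;
            old->used--;
            new->used++;
            e = next;
        }
        old->buckets[d->rehashidx++] = NULL;
    }
    if (old->used == 0) {                        /* everything has moved */
        free(old->buckets);
        d->ht[0] = d->ht[1];
        table_init(&d->ht[1], 0);
        d->rehashidx = -1;
    }
}

/* Step 8: inserts go to ht[1] while a rehash is in progress. */
static void toy_add(toydict *d, unsigned long key, int val) {
    toy_rehash_step(d);
    table *t = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    entry *e = malloc(sizeof *e);
    if (!e) abort();
    unsigned long idx = key & (t->size - 1);
    e->key = key;
    e->val = val;
    e->next = t->buckets[idx];
    t->buckets[idx] = e;
    t->used++;
}

/* Step 8: lookups search ht[0] first, then ht[1] during a rehash. */
static entry *toy_find(toydict *d, unsigned long key) {
    toy_rehash_step(d);
    for (int i = 0; i < 2; i++) {
        table *t = &d->ht[i];
        if (t->size)
            for (entry *e = t->buckets[key & (t->size - 1)]; e; e = e->next)
                if (e->key == key) return e;
        if (d->rehashidx == -1) break;           /* ht[1] is unused otherwise */
    }
    return NULL;
}
```

After four keys are added and toy_start_rehash(&d, 8) is called, each later toy_add or toy_find moves one more bucket, so the cost of the rehash is spread over ordinary operations instead of being paid in a single blocking pass.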

Origin blog.csdn.net/TZ845195485/article/details/129655157