Article directory
- 1. Set data type
- 2. Set underlying structure
- 3. Set common commands
- 3.1. Add collection elements
- 3.2. View all values of the collection
- 3.3. Determine whether a value is in the set
- 3.4. View the number of stored values in a collection
- 3.5. Delete the element with the specified value in the collection
- 3.6. Randomly select an element in a set
- 3.7. Randomly delete an element in a collection
- 3.8. Move a value in one set to another set
- 3.9. Set operation: difference set
- 3.10. Set operation: intersection
- 3.11. Set operation: union
1. Set data type
1.1. Introduction to Set type
A Redis Set is an unordered collection of unique string members stored under a single key; members are not kept in insertion order. Sets are typically backed by a hash table (section 2 covers the compact integer encoding used for small integer-only sets), so adding, deleting, and looking up a member take O(1) time on average. Compared with lists, sets have two distinguishing characteristics: they are unordered and they do not allow duplicates.
A set can store at most 2^32-1 elements. The concept is close to a set in mathematics, that is, a collection of distinct concrete or abstract objects that share some property.
In short, a Redis set is a collection of unique values. Redis also supports the basic set operations (intersection, union, and difference) through a few simple commands.
1.2. Set application scenarios
Common application scenarios include voting systems, tagging systems, mutual friends, mutual follows, shared interests, lotteries, product filters, visitor IP statistics, and so on.
Typical use cases:
- Likes, dislikes, favorites: the Set type guarantees that a user can like a given item only once;
- Mutual follows, tags: the Set type supports intersection, so it can be used to compute the friends, official accounts, etc. that two users both follow;
- Sweepstakes: store the usernames of the winners of an activity. Because the Set type deduplicates its members, the same user cannot win the prize twice.
2. Set underlying structure
2.1. Introduction to the underlying structure of Set
A Redis Set is stored either as an integer set (IntSet) or as a hash table (HashTable), and Redis converts an IntSet to a HashTable when needed. IntSet storage is used only while both of the following conditions hold; otherwise HashTable is used:
- All elements held by the set object are integer values;
- The number of elements held by the set object does not exceed 512 (the default value of the set-max-intset-entries configuration option).
Taking the SADD command as an example, the whole adding process is as follows:
- Check whether the set exists; if it does not, create it.
- Add the incoming members one by one; each member is first passed through tryObjectEncoding to save memory.
- setTypeAdd decides whether an encoding conversion is needed while the member is added.
    void saddCommand(redisClient *c) {
        robj *set;
        int j, added = 0;
        // Look up the set object
        set = lookupKeyWrite(c->db,c->argv[1]);
        // The object does not exist: create a new one and attach it to the database
        if (set == NULL) {
            set = setTypeCreate(c->argv[2]);
            dbAdd(c->db,c->argv[1],set);
        // The object exists: check its type
        } else {
            if (set->type != REDIS_SET) {
                addReply(c,shared.wrongtypeerr);
                return;
            }
        }
        // Add all input elements to the set
        for (j = 2; j < c->argc; j++) {
            c->argv[j] = tryObjectEncoding(c->argv[j]);
            // Count an addition only if the element was not already in the set
            if (setTypeAdd(set,c->argv[j])) added++;
        }
        // If at least one element was added, run the following
        if (added) {
            // Signal that the key was modified
            signalModifiedKey(c->db,c->argv[1]);
            // Send a keyspace event notification
            notifyKeyspaceEvent(REDIS_NOTIFY_SET,"sadd",c->argv[1],c->db->id);
        }
        // Mark the database as dirty
        server.dirty += added;
        // Reply with the number of elements added
        addReplyLongLong(c,added);
    }
A closer look at how a single element is added: if the set is already HashTable-encoded, the element is added as an ordinary hash table entry. If the set is IntSet-encoded, the following checks apply:
- If the value can be converted to an integer (isObjectRepresentableAsLongLong), it is stored in the IntSet.
- If, after the value is stored in the IntSet, the length exceeds 512 (REDIS_SET_MAX_INTSET_ENTRIES), the set is converted to HashTable encoding.
- In all other cases, the set is converted to HashTable encoding and the element is stored there.
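The decision described above can be sketched as a small standalone function. This is only an illustration under stated assumptions: the names `set_encoding`, `MAX_INTSET_ENTRIES`, `is_integer_string`, and `encoding_after_add` are made up for this sketch and are not Redis identifiers, and the integer check is a simplification built on `strtoll` rather than Redis's own parser (for instance, it accepts leading zeros).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical names for this sketch; Redis uses its own constants. */
enum set_encoding { ENC_INTSET, ENC_HASHTABLE };
#define MAX_INTSET_ENTRIES 512 /* threshold taken from the text above */

/* Returns 1 if the whole string is a base-10 integer that fits in long long. */
int is_integer_string(const char *s) {
    assert(s != NULL);
    /* Reject empty strings, leading whitespace, and a leading '+'. */
    if (!(*s == '-' || (*s >= '0' && *s <= '9'))) return 0;
    char *end;
    errno = 0;
    (void)strtoll(s, &end, 10);
    /* Valid only if something was parsed, nothing is left over, no overflow. */
    return errno == 0 && end != s && *end == '\0';
}

/* Encoding the set ends up with after adding `member` (assumed not yet
 * present) to a set that currently uses `current` and holds `length` members. */
enum set_encoding encoding_after_add(enum set_encoding current,
                                     size_t length,
                                     const char *member) {
    if (current == ENC_HASHTABLE) return ENC_HASHTABLE;    /* stays a hash table */
    if (!is_integer_string(member)) return ENC_HASHTABLE;  /* non-integer member */
    /* Integer member: stay an IntSet unless the new length exceeds the limit. */
    return (length + 1 > MAX_INTSET_ENTRIES) ? ENC_HASHTABLE : ENC_INTSET;
}
```

For example, adding "42" to a three-element IntSet keeps the IntSet encoding, while adding "apple", or adding a 513th integer, yields HashTable.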
2.2. Integer set IntSet
The integer set IntSet is the data structure Redis uses to store sets of integer values. It holds integers and guarantees that no duplicate elements appear. When a set contains only integer elements and is small, Redis uses an IntSet as the underlying implementation.
Internally, an IntSet is an array (int8_t contents[]) whose elements are kept in sorted order, which allows lookups by binary search.
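To make the role of the sorted array concrete, here is a minimal sketch of a binary search and of an insert that keeps the array sorted and free of duplicates. It is not the Redis implementation: it works on a plain fixed-capacity `int64_t` array, and the function names `sorted_search` and `sorted_insert` are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Binary search in a sorted array. Returns 1 and sets *pos to the index of
 * value if it is present; otherwise returns 0 and sets *pos to the index
 * where value would have to be inserted to keep the array sorted. */
int sorted_search(const int64_t *a, size_t len, int64_t value, size_t *pos) {
    size_t lo = 0, hi = len; /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < value) lo = mid + 1;
        else hi = mid;
    }
    *pos = lo;
    return lo < len && a[lo] == value;
}

/* Inserts value into the sorted array a (capacity cap). Returns 1 if the
 * value was added and 0 if it was already present, so duplicates never appear. */
int sorted_insert(int64_t *a, size_t *len, size_t cap, int64_t value) {
    size_t pos;
    if (sorted_search(a, *len, value, &pos)) return 0;
    assert(*len < cap); /* this sketch does not grow the array */
    /* Shift the tail one slot to the right to open a gap at pos. */
    memmove(&a[pos + 1], &a[pos], (*len - pos) * sizeof a[0]);
    a[pos] = value;
    (*len)++;
    return 1;
}
```

The lookup costs O(log n) comparisons, while an insert may have to shift up to n elements, which is one reason this layout suits small sets.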
The structure is defined as follows:
    typedef struct IntSet{
        // Encoding format
        uint32_t encoding;
        // Number of elements in the set
        uint32_t length;
        // Element data
        int8_t contents[];
    } IntSet;
Let's break it down:
Attribute | Description |
---|---|
“encoding” | Encoding format, which determines the integer type of the stored elements |
“length” | The number of elements in the array, that is, the overall length of the array |
“contents[]” | The integers of the set, each stored as one item of the array. Features: arranged in ascending order of value, with no duplicates |
"contents" is the underlying storage of the integer set. It holds every element of the set, arranged from smallest to largest and without repetition (order and uniqueness are enforced when an element is inserted). Although the "contents" array is declared as int8_t, the actual element type depends on the value of "encoding", so any operation on an integer set reads "encoding" first.
For example, after running SADD numbers 1 3 5, the set object uses the IntSet encoding: "length" is 3 and "contents" holds 1, 3, 5 in ascending order.
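As a rough illustration of how an encoding field can govern the element width of a byte array declared as int8_t, the sketch below picks the narrowest width for a value and reads an element back according to that width. The constants `ENC_INT16`, `ENC_INT32`, `ENC_INT64` and the functions `value_encoding` and `contents_get` are assumptions made for this example, and elements are read in host byte order for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoding tags: the width of one element, in bytes. */
#define ENC_INT16 ((uint32_t)sizeof(int16_t))
#define ENC_INT32 ((uint32_t)sizeof(int32_t))
#define ENC_INT64 ((uint32_t)sizeof(int64_t))

/* Narrowest encoding able to hold v. */
uint32_t value_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX) return ENC_INT32;
    return ENC_INT16;
}

/* Reads element number pos from the raw bytes, interpreting them with the
 * width given by encoding. memcpy avoids unaligned or aliasing accesses. */
int64_t contents_get(const int8_t *contents, uint32_t encoding, uint32_t pos) {
    const int8_t *p = contents + (size_t)pos * encoding;
    if (encoding == ENC_INT64) {
        int64_t v;
        memcpy(&v, p, sizeof v);
        return v;
    }
    if (encoding == ENC_INT32) {
        int32_t v;
        memcpy(&v, p, sizeof v);
        return v;
    }
    assert(encoding == ENC_INT16);
    int16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

With this scheme a set holding only small values such as 1, 3, 5 needs two bytes per element, and a wider encoding becomes necessary only once a value outside the 16-bit range is stored.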
2.3. Hash table HashTable
Key-value pairs in Redis are represented by dictEntry objects, and the hash table wraps an array of dictEntry pointers. This is the hash table structure dictht:
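    typedef struct dictht {
        dictEntry **table;      // Hash table array (the buckets)
        unsigned long size;     // Size of the hash table
        unsigned long sizemask; // Mask used to compute the index; always equal to size-1
        unsigned long used;     // Number of nodes currently in the hash table
    } dictht;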
PS: table is an array, each element of which points to a dictEntry object.
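The role of sizemask can be shown with a short sketch. It assumes the table size is a power of two, so that masking with size-1 selects the low bits of the hash. The function names are invented for this example, and FNV-1a is used only as a simple stand-in hash function, not as the one Redis uses.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in string hash (64-bit FNV-1a), chosen only to keep the sketch short. */
uint64_t fnv1a(const char *key) {
    uint64_t h = UINT64_C(14695981039346656037);
    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= UINT64_C(1099511628211);
    }
    return h;
}

/* Bucket index for a hash value in a table of `size` buckets.
 * Assumes size is a power of two, so sizemask == size - 1. */
unsigned long bucket_index(uint64_t hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    unsigned long sizemask = size - 1;
    return (unsigned long)(hash & sizemask);
}
```

For a power-of-two size, `hash & sizemask` gives the same result as `hash % size` but needs only a single bitwise AND.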
A hashtable-encoded set object uses a dictionary as its underlying implementation. Each key of the dictionary is a string object holding one set element, and every value of the dictionary is NULL. For example, after running SADD fruits "apple" "banana" "cherry", the dictionary holds three keys ("apple", "banana", "cherry"), each mapped to NULL.
3. Set common commands
3.1. Add collection elements
Use the SADD command to add elements to a set
SADD set value
If the value already exists, it is not added and the command returns 0; otherwise the command returns the number of elements added
3.2. View all values of the collection
Use the SMEMBERS command to view all values of the collection
SMEMBERS set
3.3. Determine whether a value is in the set
Use the SISMEMBER command to determine whether a value is in the set; it returns 1 if the value is a member and 0 otherwise
SISMEMBER set value
3.4. View the number of stored values in a collection
Use the SCARD command to view the number of stored values in a collection
SCARD set
3.5. Delete the element with the specified value in the collection
Use SREM to remove elements with specified values from a collection
SREM set value
3.6. Randomly select an element in a set
Use the SRANDMEMBER command to randomly select an element in a collection
SRANDMEMBER set
3.7. Randomly delete an element in a collection
Use the SPOP command to remove a random element from a set and return it
SPOP set
3.8. Move a value in one set to another set
Use the SMOVE command to move a value from one set to another
SMOVE source target value
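The return-value conventions of SADD, SISMEMBER, SCARD, SREM, and SMOVE can be modeled with a toy set. The sketch below is not Redis code and does not talk to a server: it represents a set of small integers (0 to 63) as the bits of a 64-bit word, and all `tiny_*` names are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not Redis code: bit i set means integer i (0..63) is a member. */
typedef uint64_t tiny_set;

static uint64_t bit(unsigned v) {
    assert(v < 64);
    return UINT64_C(1) << v;
}

/* Like SADD with one value: 1 if added, 0 if it was already a member. */
int tiny_sadd(tiny_set *s, unsigned v) {
    if (*s & bit(v)) return 0;
    *s |= bit(v);
    return 1;
}

/* Like SISMEMBER: 1 if v is a member, 0 otherwise. */
int tiny_sismember(tiny_set s, unsigned v) {
    return (s & bit(v)) != 0;
}

/* Like SCARD: number of members (counts the set bits). */
unsigned tiny_scard(tiny_set s) {
    unsigned n = 0;
    while (s) {
        s &= s - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Like SREM with one value: 1 if removed, 0 if it was not a member. */
int tiny_srem(tiny_set *s, unsigned v) {
    if (!(*s & bit(v))) return 0;
    *s &= ~bit(v);
    return 1;
}

/* Like SMOVE: 1 if v was moved, 0 if v was not in src (nothing changes). */
int tiny_smove(tiny_set *src, tiny_set *dst, unsigned v) {
    if (!tiny_srem(src, v)) return 0;
    tiny_sadd(dst, v); /* harmless if dst already contains v */
    return 1;
}
```

In this model, adding the same value twice returns 1 and then 0, and moving a value that is not in the source leaves both sets untouched and returns 0.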
3.9. Set operation: difference set
Use the SDIFF command to compute the difference: the members of set1 that are not in set2
SDIFF set1 set2
3.10. Set operation: intersection
Use the SINTER command to compute the intersection: the members present in both sets
SINTER set1 set2
3.11. Set operation: union
Use the SUNION command to compute the union: the members present in at least one of the sets
SUNION set1 set2
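What the three set operations compute can be shown on a toy model in which a set of small integers (0 to 63) is stored as the bits of a 64-bit word. This only illustrates the results of SDIFF, SINTER, and SUNION and says nothing about how Redis computes them. The type `tiny_set` and the functions below are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not Redis code: bit i set means integer i (0..63) is a member. */
typedef uint64_t tiny_set;

/* Builds a set from an array of members; each member must be in 0..63. */
tiny_set tiny_from(const unsigned *members, size_t n) {
    tiny_set s = 0;
    for (size_t i = 0; i < n; i++) {
        assert(members[i] < 64);
        s |= UINT64_C(1) << members[i];
    }
    return s;
}

/* Difference: members of a that are not in b (order of arguments matters). */
tiny_set tiny_sdiff(tiny_set a, tiny_set b) { return a & ~b; }

/* Intersection: members present in both a and b. */
tiny_set tiny_sinter(tiny_set a, tiny_set b) { return a & b; }

/* Union: members present in a, in b, or in both. */
tiny_set tiny_sunion(tiny_set a, tiny_set b) { return a | b; }
```

With set1 = {1, 2, 3} and set2 = {2, 3, 4}, the difference set1 minus set2 is {1}, the intersection is {2, 3}, and the union is {1, 2, 3, 4}. Note that the difference is not symmetric: set2 minus set1 is {4}.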