Common data types in Redis

This article draws on the book Redis Development and Operations (Redis开发与运维).

1、String

        The string type is the most basic data structure in Redis. All keys are strings, and the other data structures are built on top of the string type, so learning it lays the foundation for the other four data structures.

        As shown in the figure above, a string value can be text (a simple string or a complex one such as JSON or XML), a number (integer or floating point), or even binary data (pictures, audio, video), but a single value cannot exceed 512MB.

        Commands

        There are many string commands. This article covers them in two groups: common and uncommon. The distinction is relative, so the uncommon commands are still worth understanding.

        For a summary of the related commands, see the article "Usage of Redis".

        Common commands

        (1) Setting value
set key value [ex seconds] [px milliseconds] [nx|xx]

        The following operation sets the key hello to the value world. The reply OK means the set succeeded.

        The set command has several options:

  • ex seconds: set an expiration time in seconds for the key.
  • px milliseconds: set an expiration time in milliseconds for the key.
  • nx: the key must not already exist for the set to succeed; used for inserts.
  • xx: the opposite of nx; the key must already exist for the set to succeed; used for updates.

        In addition to the set options, Redis also provides the setex and setnx commands:
setex key seconds value
setnx key value

        They behave the same as the ex and nx options. The following examples illustrate the differences between setnx, set, and set xx.

        First, an example of setnx:

        When the key does not exist, setnx creates it successfully (returning 1). When the key already exists, setnx fails (returning 0) and the existing value is left unchanged.

        set:

        Unlike setnx, set still succeeds when the key already exists: it simply overwrites the value.

        set xx:

        From the figure, we can see that set xx succeeds only for a key that already exists. When the key does not exist, the set fails.

        So what are setnx and set xx used for in practice?

        Because Redis processes commands on a single thread, if multiple clients execute setnx key value at the same time, only one of them can succeed. This makes setnx one way to implement a distributed lock. The official Redis documentation describes how to implement distributed locks: http://redis.io/topics/distlock

        In some cases you may want to update a key only if it already exists, so that a new key is not created by mistake. The command SET key value XX ensures that the value is updated only if the key exists.
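        The nx and xx rules described above can be summarized in a few lines. The following is a minimal in-memory sketch, not Redis itself: `model_set` and the dict-based `store` are hypothetical names introduced here for illustration, and expiration (ex/px) is deliberately left out.

```python
def model_set(store, key, value, nx=False, xx=False):
    """Mimic the nx/xx behavior of SET on a plain dict (no expiration)."""
    if nx and xx:
        # Real Redis rejects this combination with a syntax error.
        raise ValueError("nx and xx cannot be combined")
    exists = key in store
    if nx and exists:
        return None   # redis-cli prints (nil): the key is already present
    if xx and not exists:
        return None   # redis-cli prints (nil): the key is missing
    store[key] = value
    return "OK"


store = {}
print(model_set(store, "hello", "world", nx=True))   # OK: the key was absent
print(model_set(store, "hello", "redis", nx=True))   # None: the key exists, value unchanged
print(model_set(store, "hello", "redis"))            # OK: a plain set overwrites
print(model_set(store, "missing", "x", xx=True))     # None: xx needs an existing key
```

        Only one of several competing nx callers can get "OK" for the same key, which is the property the distributed lock idea relies on.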

        (2) Get value
get key

        From the figure, we can see that when the key exists, get returns its value, and when the key does not exist, nil (empty) is returned.

        (3) Batch setting and batch getting values
mset key value [key value ...]
mget key [key ...]

        If some keys do not exist, nil (empty) is returned in their place, and the results come back in the same order as the keys passed in.

        Batch commands can effectively improve efficiency. Without a command like mget, fetching n keys means executing get n times, as shown in the figure below. The time cost is:

        time for n gets = n × network time + n × command time

        With mget, the same n lookups need only a single round trip, as shown below. The time cost becomes:

        time for n gets = 1 × network time + n × command time

        Redis can perform tens of thousands of read and write operations per second, but that figure describes the processing capacity of the Redis server. For the client, each command costs network time in addition to command time. Assume the network time is 1ms and the command time is 0.1ms (based on processing 10,000 commands per second). The difference between executing 1000 get commands and one mget command is then:

Comparison of 1000 get commands and 1 mget command
Operation                           Time
1000 get commands                   1000 × 1 + 1000 × 0.1 = 1100ms = 1.1s
1 mget (batching 1000 keys)         1 × 1 + 1000 × 0.1 = 101ms = 0.101s
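
        The arithmetic in the comparison can be reproduced with a short calculation. This is only a sketch of the cost model stated above (1ms network time, 0.1ms command time); the function name `total_time_us` is made up for this example, and times are kept in whole microseconds to avoid floating point rounding.

```python
def total_time_us(round_trips, commands, network_us=1000, command_us=100):
    """Total client-side time in microseconds: network cost plus command cost."""
    return round_trips * network_us + commands * command_us


n = 1000
sequential_us = total_time_us(round_trips=n, commands=n)   # n separate get calls
batched_us = total_time_us(round_trips=1, commands=n)      # one mget with n keys

print(sequential_us / 1000, "ms")   # 1100.0 ms
print(batched_us / 1000, "ms")      # 101.0 ms
```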

        Because Redis itself is fast, the network often becomes the performance bottleneck for developers. Batch operations help improve processing efficiency, but note that the number of keys sent in a single batch operation should not be unbounded: too many can block Redis or congest the network.

        (4) Counting
incr key

        The incr command increments a value. There are three possible outcomes:

  • If the value is not an integer, an error is returned.
  • If the value is an integer, the incremented result is returned.
  • If the key does not exist, it is treated as 0 and incremented, so 1 is returned.

        For example, running incr on a nonexistent key returns 1, and running incr on that key again returns 2. If the value is not an integer, an error is returned. As shown in the picture:

        In addition to incr, Redis provides decr (decrement), incrby (increment by a specified number), decrby (decrement by a specified number), and incrbyfloat (increment by a floating point number):

decr key
incrby key increment
decrby key decrement
incrbyfloat key increment

        Many storage systems and programming languages implement counters internally with a CAS (compare-and-swap) mechanism, which carries some CPU overhead. Redis does not have this problem: it executes commands on a single thread, so every command that reaches the Redis server is executed sequentially.
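
        The three outcomes of incr listed earlier can be modeled on a plain dict. This is a simplified sketch under stated assumptions: `model_incrby` is a hypothetical helper, values are kept as decimal text the way Redis strings are, and the 64-bit overflow check that Redis performs is not modeled.

```python
import re

# Canonical decimal integers only: "0", "42", "-7" (not "007", "1.5", "abc").
_INTEGER = re.compile(r"0|-?[1-9][0-9]*")


def model_incrby(store, key, increment=1):
    """Increment the integer stored at key and return the new value."""
    raw = store.get(key, "0")            # a missing key counts as 0
    if not _INTEGER.fullmatch(raw):
        raise ValueError("value is not an integer or out of range")
    result = int(raw) + increment
    store[key] = str(result)             # the value is stored back as text
    return result


store = {}
print(model_incrby(store, "counter"))    # 1
print(model_incrby(store, "counter"))    # 2
store["hello"] = "world"
try:
    model_incrby(store, "hello")
except ValueError as err:
    print("error:", err)
```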

        Uncommonly used commands

        (1) Append a value
append key value

        append adds a value to the end of a string, for example:

        (2) String length
strlen key

        From the picture, we can see that the value of the key redis is 6 bytes long: strlen counts bytes, and each Chinese character takes three bytes in UTF-8.
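
        The byte count can be checked outside Redis. The snippet below is a small illustration that assumes the client sends the value UTF-8 encoded; the two-character value "世界" is a stand-in chosen for this example, since the original screenshot is not available.

```python
value = "世界"   # assumed example value: two Chinese characters
encoded = value.encode("utf-8")

print(len(value))     # 2 (characters)
print(len(encoded))   # 6 (bytes, which is what strlen counts)
```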

        (3) Set and return the original value
getset key value

        getset sets a value just like set, but it also returns the key's previous value, for example:

        (4) Set the character at the specified position
setrange key offset value

        The following operation changes the value from hello to fello:

        (5) Get part of the string
getrange key start end

        start and end are the starting and ending offsets. Offsets are counted from 0, and both ends are inclusive. For example, the operation in the figure above gets ell from the middle of the value fello.
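
        The offset rules of setrange and getrange can be mirrored with byte slicing. This is a rough model rather than the Redis implementation: `model_setrange` and `model_getrange` are hypothetical names, and only non-negative offsets are handled.

```python
def model_setrange(value: bytes, offset: int, patch: bytes) -> bytes:
    """Overwrite bytes starting at offset, zero-padding if offset is past the end."""
    padded = value.ljust(offset, b"\x00")
    return padded[:offset] + patch + padded[offset + len(patch):]


def model_getrange(value: bytes, start: int, end: int) -> bytes:
    """Return the bytes from start to end, both inclusive (non-negative offsets only)."""
    return value[start:end + 1]


value = model_setrange(b"hello", 0, b"f")
print(value)                          # b'fello'
print(model_getrange(value, 1, 3))    # b'ell'
```

        Note the `end + 1` in the slice: Python slices exclude the end index, while getrange includes it.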

        String type command time complexity

String type command time complexity
Command                             Time complexity
set key value                       O(1)
get key                             O(1)
del key [key ...]                   O(k), k is the number of keys
mset key value [key value ...]      O(k), k is the number of keys
mget key [key ...]                  O(k), k is the number of keys
incr key                            O(1)
decr key                            O(1)
incrby key increment                O(1)
decrby key decrement                O(1)
incrbyfloat key increment           O(1)
append key value                    O(1)
strlen key                          O(1)
setrange key offset value           O(1)
getrange key start end              O(n), n is the length of the string. Because reading a string is very fast, it can be treated as O(1) when the string is not very long.

        Internal encoding

        The string type has three internal encodings:

  • int: an 8-byte long integer.
  • embstr: a string of 39 bytes or fewer.
  • raw: a string longer than 39 bytes.

        Redis decides which internal encoding to use based on the type and length of the current value. Note that the 39-byte boundary applies to versions before Redis 3.2; from Redis 3.2 onward the embstr limit is 44 bytes.

        Examples are as follows:

# Integer example
127.0.0.1:6379> set key 8653
OK
127.0.0.1:6379> object encoding key
"int"

# Short string example
# A string of 39 bytes or fewer: embstr
127.0.0.1:6379> set key "hello world"
OK
127.0.0.1:6379> object encoding key
"embstr"

# Long string example
# A string longer than 39 bytes: raw
127.0.0.1:6379> set key "one string greater than 39 byte .................................."
OK
127.0.0.1:6379> object encoding key
"raw"
127.0.0.1:6379> strlen key
(integer) 66
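
        The selection rule can be written down as a small predictor. This is a sketch of the rule as described in this article, not the Redis source: `predict_string_encoding` is a hypothetical function, the default limit of 39 is the threshold quoted above (pass 44 for Redis 3.2 and later), and it assumes the value is sent UTF-8 encoded.

```python
def predict_string_encoding(value: str, embstr_limit: int = 39) -> str:
    """Guess what OBJECT ENCODING would report for a string value."""
    try:
        number = int(value)
    except ValueError:
        number = None
    # Only a canonical decimal integer that fits in 64 bits is stored as int.
    if number is not None and str(number) == value and -2**63 <= number < 2**63:
        return "int"
    size = len(value.encode("utf-8"))
    return "embstr" if size <= embstr_limit else "raw"


print(predict_string_encoding("8653"))          # int
print(predict_string_encoding("hello world"))   # embstr
print(predict_string_encoding("x" * 66))        # raw
```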

        Typical usage scenarios

        1. Caching

        The figure above shows a typical caching scenario: Redis serves as the cache layer and MySQL as the storage layer, and most requests are served from Redis. Because Redis supports high concurrency, a cache can usually speed up reads and writes and reduce pressure on the back end.

        The following pseudocode simulates the access process in the figure above:

// getUserInfo looks up a user's information by the given id
UserInfo getUserInfo(Long id) {
    // Build the Redis key for this user
    String userRedisKey = "user:info:" + id;
    // Try to read the value from Redis first
    String value = redis.get(userRedisKey);
    UserInfo userInfo;
    if (value != null) {
        // Cache hit: deserialize the cached value
        userInfo = deserialize(value);
    } else {
        // Cache miss: load the user information from MySQL
        userInfo = mysql.get(id);
        if (userInfo != null) {
            // Serialize userInfo and write it back to Redis
            // with a 1-hour (3600 seconds) expiration time
            redis.setex(userRedisKey, 3600, serialize(userInfo));
        }
    }
    // Return the result
    return userInfo;
}
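
        For readers who want to run the flow end to end, here is a self-contained version of the same read path. Everything in it is a stand-in: `FakeCache` imitates only the get/setex subset of a Redis client, `USERS` plays the role of MySQL, and JSON is an arbitrary choice of serialization.

```python
import json
import time


class FakeCache:
    """In-memory stand-in for the get/setex subset of a Redis client."""

    def __init__(self):
        self._data = {}   # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= time.monotonic():
            return None   # missing or expired
        return entry[0]

    def setex(self, key, seconds, value):
        self._data[key] = (value, time.monotonic() + seconds)


USERS = {1: {"name": "tom", "age": 23, "city": "beijing"}}   # stand-in for MySQL
db_calls = []   # records every lookup that reaches the "database"


def load_user_from_db(user_id):
    db_calls.append(user_id)
    return USERS.get(user_id)


def get_user_info(cache, user_id):
    key = f"user:info:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                   # cache hit
    user = load_user_from_db(user_id)               # cache miss
    if user is not None:
        cache.setex(key, 3600, json.dumps(user))    # write back, 1-hour TTL
    return user


cache = FakeCache()
get_user_info(cache, 1)
get_user_info(cache, 1)
print(db_calls)   # [1]: the second call was served from the cache
```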

        2. Counting

        Many applications use Redis as their basic counting tool. It provides fast counting and query caching, and the data can be persisted asynchronously to other data stores. For example, a video site can use Redis to count video plays: each time a user plays a video, that video's play count is incremented by 1.

        3. Rate limiting

        For security reasons, many applications ask users to enter an SMS verification code each time they log in, to confirm that they are who they claim to be. To keep the SMS interface from being called too often, the number of verification codes a user can request per minute is limited, for example to no more than 5 per minute. As shown in the picture:

        This can be implemented with Redis. The following pseudocode gives the basic idea:

String phoneNum = "188xxxxxxxx";
String key = "shortMsg:limit:" + phoneNum;
// SET key 1 EX 60 NX
isExists = redis.set(key, 1, "EX 60", "NX");
if (isExists != null || redis.incr(key) <= 5) {
    // Allowed
} else {
    // Rate limited
}

        This is how Redis can be used to implement rate limiting. A similar idea applies when, for example, a website wants to limit an IP address to no more than n requests per second.
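
        The same fixed-window idea can be tried without a Redis server. The class below is an illustrative sketch: `FixedWindowLimiter` is a made-up name, an in-memory dict replaces Redis, and the current time is passed in explicitly so the behavior is easy to follow.

```python
class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each fixed time window."""

    def __init__(self, limit=5, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counters = {}   # key -> [count, time when the window expires]

    def allow(self, key, now):
        entry = self._counters.get(key)
        if entry is None or now >= entry[1]:
            # Same role as SET key 1 EX 60 NX succeeding: start a new window.
            self._counters[key] = [1, now + self.window_seconds]
            return True
        entry[0] += 1                     # same role as INCR key
        return entry[0] <= self.limit


limiter = FixedWindowLimiter(limit=5, window_seconds=60)
results = [limiter.allow("shortMsg:limit:demo", now=t) for t in range(7)]
print(results)   # [True, True, True, True, True, False, False]
print(limiter.allow("shortMsg:limit:demo", now=60))   # True: a new window has started
```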

        Beyond the scenarios introduced above, strings fit many other use cases. Developers can combine the string commands creatively to suit their own needs.

2、Hash

        Almost all programming languages provide a hash type, which may be called a hash, a dictionary, or an associative array. In Redis, the hash type means that the value itself is a key-value structure of the form value={{field1, value1}, ..., {fieldN, valueN}}. The relationship between a Redis key-value pair and the hash type is shown in the following figure.

        Commands

        For a summary of the related commands, see the article "Usage of Redis".

        (1) Set a value
hset key field value
        (2) Get a value
hget key field
        (3) Delete fields
hdel key field [field ...]
        (4) Count the fields
hlen key
        (5) Set or get field-value pairs in batches
hmget key field [field ...]
hmset key field value [field value ...]
        (6) Check whether a field exists
hexists key field
        (7) Get all fields
hkeys key
        (8) Get all values
hvals key
        (9) Get all field-value pairs
hgetall key
        (10) hincrby and hincrbyfloat
hincrby key field increment
hincrbyfloat key field increment
        (11) Get the string length of a value (requires Redis 3.2 or later)
hstrlen key field

        The following table describes the time complexity of hash type commands

Time complexity of hash type commands
Command                                   Time complexity
hset key field value                      O(1)
hget key field                            O(1)
hdel key field [field ...]                O(k), k is the number of fields
hlen key                                  O(1)
hgetall key                               O(n), n is the total number of fields
hmget key field [field ...]               O(k), k is the number of fields
hmset key field value [field value ...]   O(k), k is the number of fields
hexists key field                         O(1)
hkeys key                                 O(n), n is the total number of fields
hvals key                                 O(n), n is the total number of fields
hsetnx key field value                    O(1)
hincrby key field increment               O(1)
hincrbyfloat key field increment          O(1)
hstrlen key field                         O(1)

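        Internal encoding

        The hash type has two internal encodings:

  • ziplist (compressed list): when the number of elements in the hash is less than hash-max-ziplist-entries (default 512) and all values are smaller than hash-max-ziplist-value (default 64 bytes), Redis uses ziplist as the internal implementation of the hash. A ziplist stores multiple elements contiguously in a more compact structure, so it saves more memory than hashtable.
  • hashtable (hash table): when the hash does not meet the ziplist conditions, Redis uses hashtable as the internal implementation, because ziplist reads and writes become less efficient at that point, whereas hashtable reads and writes are O(1).

        1. When there are few fields and no large values, the internal encoding is ziplist.

        2. When any value is larger than 64 bytes, the internal encoding changes from ziplist to hashtable.

        3. When the number of fields exceeds 512, the internal encoding also changes from ziplist to hashtable.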

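        In practice, however, we find that the internal encoding of a hash is reported as listpack, which is neither ziplist nor hashtable, as shown in the figure:

        This is because newer Redis versions (7.0 and later) use listpack in place of ziplist as the compact encoding for small hashes. It is an internal change made to improve memory use and performance; the hash has not been converted from a hashtable.

        When the number of fields and the size of the fields and values stay within certain thresholds, Redis stores the hash as a listpack. A listpack is a compact binary format that stores the field-value pairs one after another in a single block of memory, which reduces memory usage for small hashes.

        Although the internal encoding is reported as "listpack", users still access and modify the data with the ordinary hash commands. The encoding is an internal optimization, and users never operate on the listpack directly.

        So when does this happen?

        Whether Redis stores a hash as a listpack or as a hashtable depends on the following:

  1. Number of fields: the hash stays in listpack encoding while its field count stays within hash-max-listpack-entries (hash-max-ziplist-entries is the older name for the same setting). The exact threshold depends on the Redis version and configuration.

  2. Size of fields and values: every field name and every value must be no longer than hash-max-listpack-value bytes.

  3. Usage pattern: the access pattern itself does not affect the choice. Only the two thresholds above matter. A lookup in a listpack is a linear scan, which is why this encoding is reserved for small hashes.

  4. Memory optimization: Redis does not consult system memory or load when choosing an encoding. The two thresholds are the mechanism for trading memory efficiency against speed, and they can be tuned in the configuration.

        Note that the conversion is automatic and requires no action from the user. Once a hash exceeds either threshold, Redis converts it to hashtable, and it does not convert the hash back if it later shrinks.

        To learn exactly when a specific Redis version switches encodings and how to adjust these thresholds, refer to that version's documentation or source code. The details can differ slightly between versions.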
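
        The threshold rule can be expressed as a small predictor. This is an illustrative sketch under assumptions: `predict_hash_encoding` is a hypothetical function, the defaults 512 and 64 are the values quoted in this article, sizes are measured on UTF-8 bytes, and "compact" stands for ziplist or listpack depending on the Redis version.

```python
def predict_hash_encoding(fields, max_entries=512, max_value_bytes=64):
    """Guess the internal encoding of a hash given as a dict of str -> str."""
    if len(fields) > max_entries:
        return "hashtable"                # too many fields
    for name, value in fields.items():
        if len(name.encode("utf-8")) > max_value_bytes:
            return "hashtable"            # a field name is too long
        if len(value.encode("utf-8")) > max_value_bytes:
            return "hashtable"            # a value is too long
    return "compact"                      # ziplist or listpack


small = {"name": "tom", "age": "23", "city": "beijing"}
print(predict_hash_encoding(small))                           # compact
print(predict_hash_encoding({**small, "bio": "x" * 65}))      # hashtable
print(predict_hash_encoding({f"f{i}": "v" for i in range(513)}))   # hashtable
```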

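        Usage scenarios

        Compared with caching user information as a serialized string, the hash type is more intuitive and more convenient to update. Each user's id can serve as the key suffix, with multiple field-value pairs holding that user's attributes.

        Note, however, that the hash type differs from a relational database in two ways:

  • The hash type is sparse, while a relational database is fully structured. For example, each hash key can have different fields, whereas once a new column is added to a relational table, every row must have a value for it (even if that value is NULL).
  • A relational database supports complex relational queries, whereas emulating such queries in Redis is difficult to develop and costly to maintain.

        So far we have seen three ways to cache user information:

        1. Native string type: one key per attribute

set user:1:name tom
set user:1:age 23
set user:1:city beijing

        Advantages: simple and intuitive, and each attribute can be updated individually.

        Disadvantages: it uses too many keys and a lot of memory, and the user information has poor cohesion, so this approach is generally not used in production.

        2. Serialized string type: serialize the user information and store it under a single key.

set user:1 serialize(userInfo)

        Advantages: simpler programming, and a well-chosen serialization format can improve memory efficiency.

        Disadvantages: serialization and deserialization have some overhead, and every attribute update requires fetching all the data, deserializing it, updating it, and serializing it back to Redis.

        3. Hash type: one field-value pair per user attribute, all stored under a single key.

hmset user:1 name tom age 23 city beijing

        Advantages: simple and intuitive, and used appropriately it can reduce memory usage.

        Disadvantages: the conversion between the ziplist and hashtable internal encodings must be kept under control, because hashtable consumes more memory.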
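
        The difference in update cost between the three layouts can be seen in a toy model. The sketch below uses plain dicts in place of Redis, JSON as an assumed serialization format, and hypothetical helper names; the point is only how much data each update has to touch.

```python
import json


def update_age_per_attribute(store, user_id, age):
    store[f"user:{user_id}:age"] = str(age)          # like one SET on one key


def update_age_serialized(store, user_id, age):
    user = json.loads(store[f"user:{user_id}"])      # GET and deserialize everything
    user["age"] = age
    store[f"user:{user_id}"] = json.dumps(user)      # serialize and SET everything


def update_age_hash(store, user_id, age):
    store[f"user:{user_id}"]["age"] = str(age)       # like one HSET on one field


per_attribute = {"user:1:name": "tom", "user:1:age": "23", "user:1:city": "beijing"}
serialized = {"user:1": json.dumps({"name": "tom", "age": 23, "city": "beijing"})}
hashed = {"user:1": {"name": "tom", "age": "23", "city": "beijing"}}

update_age_per_attribute(per_attribute, 1, 24)
update_age_serialized(serialized, 1, 24)
update_age_hash(hashed, 1, 24)

print(len(per_attribute), len(serialized), len(hashed))   # 3 1 1 (keys used per user)
```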

3、List

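        The list type stores multiple ordered strings and can be thought of as a doubly linked list.

        As the first list in the figure above shows, the five elements a, b, c, d, and e form an ordered list from left to right. Each string in a list is called an element, and a list can hold at most 2^32-1 elements. In Redis you can push and pop at both ends of a list, fetch the elements in a given range, fetch the element at a given index, and so on. The list is a flexible data structure that can act as both a stack and a queue, and it has many uses in practice.

        The list type has two characteristics. First, the elements are ordered, so an element or a range of elements can be fetched by index, as the second list in the figure above shows. Second, elements may be repeated.

        Commands

        For a summary of the related commands, see the article "Usage of Redis".

                Basic commands
# Insert elements from the left/right
lpush/rpush key value [value ...]

# Insert an element before or after a given element
linsert key before/after pivot value

# Get the elements in a given range
lrange key start end

# Get the element at a given index
lindex key index

# Get the length of the list
llen key

# Pop an element from the left/right
lpop/rpop key

# Trim the list to a given index range
ltrim key start end

# Set the element at a given index
lset key index newValue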

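        Deleting elements
# Delete the specified elements
lrem key count value

        The lrem command finds elements in the list that are equal to value and deletes them. Its behavior depends on count:

  • count>0: scan from left to right and delete at most count elements.
  • count<0: scan from right to left and delete at most |count| elements.
  • count=0: delete all matching elements.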
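
        The three count cases can be modeled on a Python list. This is a sketch for illustration only: `model_lrem` is a hypothetical function that returns a new list together with the number of removed elements instead of modifying a Redis key.

```python
def model_lrem(items, count, value):
    """Return (remaining_items, removed_count) following the count rules of lrem."""
    limit = abs(count) if count != 0 else len(items)
    # Scan from the tail when count is negative, otherwise from the head.
    order = range(len(items) - 1, -1, -1) if count < 0 else range(len(items))
    to_drop = set()
    for i in order:
        if len(to_drop) == limit:
            break
        if items[i] == value:
            to_drop.add(i)
    kept = [x for i, x in enumerate(items) if i not in to_drop]
    return kept, len(to_drop)


items = ["a", "b", "a", "c", "a"]
print(model_lrem(items, 2, "a"))    # (['b', 'c', 'a'], 2)
print(model_lrem(items, -2, "a"))   # (['a', 'b', 'c'], 2)
print(model_lrem(items, 0, "a"))    # (['b', 'c'], 3)
```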

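        Blocking pop
# Blocking pop
blpop key [key ...] timeout
brpop key [key ...] timeout

        blpop and brpop are the blocking versions of lpop and rpop. Apart from the direction they pop from, they are used in the same way, so brpop is used for the explanation below. The brpop command takes two parameters:

  • key [key ...]: the keys of one or more lists.
  • timeout: the blocking time (in seconds).

  1. The list is empty: if timeout=3, the client returns after waiting 3 seconds; if timeout=0, the client blocks indefinitely.
    client1:

    If another client adds data to the list in the meantime, the blocked client returns immediately:
  2. The list is not empty: the client returns immediately.

        Two points need attention when using brpop:

  1. With multiple keys, brpop checks the keys from left to right and returns to the client as soon as one of the keys can pop an element.
  2. If multiple clients run brpop on the same key, the client that ran brpop first gets the popped value.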
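
        These two points can be illustrated with a single-process model. The class below is a rough sketch, not how Redis is implemented: `ListServerModel` is a made-up name, blocking is represented by a queue of waiting clients rather than real threads, and timeouts are not modeled.

```python
from collections import deque


class ListServerModel:
    def __init__(self):
        self.lists = {}           # key -> deque; the left end is the list head
        self.waiters = deque()    # (client, keys) in the order clients started waiting
        self.delivered = []       # (client, key, value) handed to blocked clients

    def brpop(self, client, keys):
        for key in keys:                       # check the keys from left to right
            items = self.lists.get(key)
            if items:
                return key, items.pop()        # pop from the right end
        self.waiters.append((client, keys))    # nothing available: the client blocks
        return None

    def lpush(self, key, value):
        for i, (client, keys) in enumerate(self.waiters):
            if key in keys:                    # the longest-waiting client wins
                del self.waiters[i]
                self.delivered.append((client, key, value))
                return
        self.lists.setdefault(key, deque()).appendleft(value)


server = ListServerModel()
server.lpush("list:b", "x")
print(server.brpop("client0", ["list:a", "list:b"]))   # ('list:b', 'x')

server.brpop("client1", ["queue"])   # blocks: the list is empty
server.brpop("client2", ["queue"])   # blocks behind client1
server.lpush("queue", "job")
print(server.delivered)              # [('client1', 'queue', 'job')]
```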

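         The following table describes the time complexity of list type commands

Time complexity of list type commands
Operation type  Command                                 Time complexity
Add             rpush key value [value ...]             O(k), k is the number of elements
                lpush key value [value ...]             O(k), k is the number of elements
                linsert key before|after pivot value    O(n), n is the distance from pivot to the head or tail of the list
Find            lrange key start end                    O(s+n), s is the start offset, n is the range from start to end
                lindex key index                        O(n), n is the index offset
                llen key                                O(1)
Delete          lpop key                                O(1)
                rpop key                                O(1)
                lrem key count value                    O(n), n is the length of the list
                ltrim key start end                     O(n), n is the total number of elements to trim
Modify          lset key index value                    O(n), n is the index offset
Blocking        blpop brpop                             O(1)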

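        Internal encoding

        The list type has the following internal encodings:

  • ziplist (compressed list): when the number of elements in the list is less than list-max-ziplist-entries (default 512) and every element is smaller than list-max-ziplist-value (default 64 bytes), Redis uses ziplist as the internal implementation of the list to reduce memory usage.
  • linkedlist (linked list): when the list does not meet the ziplist conditions, Redis uses linkedlist as the internal implementation of the list.
  • quicklist (quick list): a linked list has relatively high overhead. The prev and next pointers alone take 16 bytes (a pointer is 8 bytes on a 64-bit system), and each node is allocated separately, which aggravates memory fragmentation and hurts memory management efficiency. Starting with Redis 3.2, the list data structure was therefore reworked to use quicklist in place of ziplist and linkedlist.

        Quick list

        A quicklist is in effect a hybrid of ziplist and linkedlist: it splits the linked list into segments, stores each segment compactly as a ziplist, and chains the ziplists together with bidirectional pointers.

For details, see https://www.cnblogs.com/hunternet/p/12624691.html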
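
        The segmenting idea can be sketched in a few lines. This is a toy model under loud assumptions: `QuickListSketch` is a hypothetical class, each node is a small Python list standing in for one ziplist, the outer Python list stands in for the chain of doubly linked nodes, and the per-node limit of 4 is arbitrary.

```python
class QuickListSketch:
    def __init__(self, max_per_node=4):
        self.max_per_node = max_per_node
        self.nodes = []   # each inner list plays the role of one compact segment

    def rpush(self, value):
        # Start a new tail segment when the list is empty or the tail is full.
        if not self.nodes or len(self.nodes[-1]) >= self.max_per_node:
            self.nodes.append([])
        self.nodes[-1].append(value)

    def lpush(self, value):
        # Start a new head segment when the list is empty or the head is full.
        if not self.nodes or len(self.nodes[0]) >= self.max_per_node:
            self.nodes.insert(0, [])
        self.nodes[0].insert(0, value)

    def lindex(self, index):
        # Skip whole segments, then index inside one (non-negative index only).
        for node in self.nodes:
            if index < len(node):
                return node[index]
            index -= len(node)
        return None


ql = QuickListSketch(max_per_node=2)
for value in [1, 2, 3, 4, 5]:
    ql.rpush(value)
ql.lpush(0)
print(ql.nodes)       # [[0], [1, 2], [3, 4], [5]]
print(ql.lindex(3))   # 3
```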

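        Usage scenarios

        1) Message queue

        The lpush + brpop combination in Redis implements a blocking queue: producer clients use lpush to insert elements at the left of the list, and multiple consumer clients use brpop to compete, in a blocking fashion, for the elements at the tail of the list. Having multiple consumers provides load balancing and high availability for consumption.

        2) Article list

        Each user has their own list of articles, which needs to be displayed page by page. A list fits here because it is ordered and also supports fetching elements by index range.

  1. Store each article as a hash. For example, each article has three attributes: title, timestamp, and content.

  2. Add articles to the article list, using user:{id}:articles as the key of the user's article list.

  3. Fetch the user's article list page by page.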

# Pseudocode
articles = lrange user:1:articles 0 9
for article in {articles}:
    hgetall {article}
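
        Turning a page number into the start and end arguments of lrange takes a little arithmetic, because both ends of the range are inclusive. The helper names below (`page_bounds`, `model_lrange`) are hypothetical and a Python list stands in for the Redis list.

```python
def page_bounds(page, page_size=10):
    """Return the (start, end) indexes for a 1-based page; end is inclusive."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return start, start + page_size - 1


def model_lrange(items, start, end):
    """Mimic lrange for non-negative indexes: both ends are inclusive."""
    return items[start:end + 1]


article_ids = [f"article:{i}" for i in range(25)]   # made-up ids for the demo
print(page_bounds(1))                                       # (0, 9)
print(len(model_lrange(article_ids, *page_bounds(1))))      # 10
print(model_lrange(article_ids, *page_bounds(3))[0])        # article:20
```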

        As a rule of thumb when choosing list commands:

  • lpush + lpop = Stack

  • lpush + rpop = Queue

  • lpush + ltrim = Capped Collection (bounded collection)

  • lpush + brpop = Message Queue
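
        The first three combinations can be tried with Python's `collections.deque`, treating the left end as the head of the list. This is only an analogy: the `lpush` and `ltrim` helpers below are hypothetical stand-ins for the Redis commands of the same name.

```python
from collections import deque


def lpush(items, value):
    items.appendleft(value)      # the left end is the head of the list


def ltrim(items, start, end):
    """Keep only indexes start..end (inclusive, non-negative), like ltrim."""
    kept = list(items)[start:end + 1]
    items.clear()
    items.extend(kept)


stack, queue, capped = deque(), deque(), deque()
for value in ["a", "b", "c", "d"]:
    lpush(stack, value)
    lpush(queue, value)
    lpush(capped, value)
    ltrim(capped, 0, 2)          # lpush + ltrim: keep only the 3 newest items

print(stack.popleft())   # d  (lpush + lpop: last in, first out)
print(queue.pop())       # a  (lpush + rpop: first in, first out)
print(list(capped))      # ['d', 'c', 'b']
```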

4、Set

To be updated

5、ZSet

To be updated

6、BitMaps

To be updated

7、HyperLogLog

To be updated

8、GEO

To be updated

Origin blog.csdn.net/Deikey/article/details/132639340