The principle and use of bitmap in Redis

Principle

First, a clarification: Redis has 5 basic data types, and BitMap is not a new one. Underneath, a BitMap is stored as an ordinary Redis string.

Usually we store a string in Redis, for example "big". Its bitmap, one ASCII byte per character, is as follows:

b        i        g
01100010 01101001 01100111

1 byte = 8 bits

Therefore, the string "big" occupies 3 bytes, which is 24 bits.

Redis added the bitmap commands setbit and getbit in version 2.2.0, followed by bitcount and bitop in version 2.6.0. Although these are new commands, they do not introduce a new data type: they operate on the same string values that set and get work with.

With these commands, Redis can work on individual bits, reading or changing the value at each bit offset. A few examples:

127.0.0.1:6379 > set hello big
"OK"
 
127.0.0.1:6379 > getbit hello 0
"0"
 
127.0.0.1:6379 > getbit hello 1
"1"

127.0.0.1:6379 > setbit hello 7 1
"0"

127.0.0.1:6379 > get hello
"cig"

The example above shows that:

  • getbit and setbit perform bit operations on strings and can read or modify the value of a single bit;
  • After a bit is modified, the stored string itself changes: setting bit 7 turns the first byte from 01100010 (b) into 01100011 (c), so big becomes cig.
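The session above can be reproduced without a server. The following is a minimal pure-Python sketch of the bit addressing involved, assuming offset 0 is the most significant bit of the first byte. The functions getbit and setbit here are illustrative helpers that model the behaviour on a bytearray; they are not a Redis client API.

```python
def getbit(buf: bytearray, offset: int) -> int:
    """Return the bit at `offset`, counting from the most significant bit of byte 0."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(buf):
        return 0  # bits past the end of the string read as 0
    return (buf[byte_index] >> (7 - bit_index)) & 1


def setbit(buf: bytearray, offset: int, value: int) -> int:
    """Set the bit at `offset` and return its previous value, growing `buf` if needed."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(buf):
        buf.extend(bytes(byte_index + 1 - len(buf)))  # pad with zero bytes
    mask = 1 << (7 - bit_index)
    old = 1 if buf[byte_index] & mask else 0
    if value:
        buf[byte_index] |= mask
    else:
        buf[byte_index] &= ~mask & 0xFF
    return old


hello = bytearray(b"big")
first_two = (getbit(hello, 0), getbit(hello, 1))  # (0, 1): 'b' is 01100010
previous = setbit(hello, 7, 1)                    # 0: the old value of bit 7
print(first_two, previous, bytes(hello))          # (0, 1) 0 b'cig'
```

Setting bit 7 flips only the lowest bit of the first byte, which is why the other two characters are unchanged.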

The original meaning of BitMap is to use a single bit, set to 0 or 1, to map the state of an element.

Since a bit can only represent two states, 0 and 1, a BitMap can only map elements that have two possible states. The advantage of using bits is that it saves a lot of memory space.


BitMap related commands

In Redis, a Bitmap is a series of consecutive binary digits (0 or 1). A Bitmap can therefore be thought of as an array of bits, in which the array subscript of each bit is called its offset. AND, OR, XOR and other bit operations can be performed on Bitmaps.

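# Set a bit; value can only be 0 or 1
setbit key offset value

# Get a bit
getbit key offset

# Count the bits set to 1 in the given range
# start and end are byte indexes, not bit offsets
bitcount key start end

# Operations between BitMaps
# operations: the bitwise operator, one of:
#   AND  bitwise AND  &
#   OR   bitwise OR   |
#   XOR  bitwise XOR  ^
#   NOT  bitwise NOT  ~
# result: the key in which the result is stored
# key1 … keyn: the input keys; several may be given, separated by spaces; NOT accepts only one key
# When BITOP processes strings of different lengths, the missing part of the shorter string is treated as 0. The return value is the length in bytes of the string stored in the destination key, which equals the length of the longest input string.
bitop [operations] [result] [key1] [keyn…]

# Return the position of the first bit with the given value (0/1) in the key
bitpos [key] [value]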
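To make the semantics of bitcount, bitop and bitpos concrete, here is a rough pure-Python model that works on bytes values. It is a sketch under stated assumptions, not Redis code: the function names mirror the commands only for readability, byte ranges and NOT are left out, and bitpos is simplified (Redis has extra rules when searching for 0 bits).

```python
from functools import reduce
from operator import and_, or_, xor


def bitcount(value: bytes) -> int:
    """Count the bits set to 1 in the whole value."""
    return sum(bin(byte).count("1") for byte in value)


def bitop(op, *values: bytes) -> bytes:
    """Combine values byte by byte with `op`, padding shorter values with zero bytes."""
    width = max(len(v) for v in values)
    padded = [v.ljust(width, b"\x00") for v in values]
    return bytes(reduce(op, column) for column in zip(*padded))


def bitpos(value: bytes, bit: int) -> int:
    """Return the offset of the first bit equal to `bit`, or -1 if none is found.

    Simplified model: real Redis applies extra rules when searching for 0 bits.
    """
    for offset in range(len(value) * 8):
        byte_index, bit_index = divmod(offset, 8)
        if (value[byte_index] >> (7 - bit_index)) & 1 == bit:
            return offset
    return -1


a = bytes([0b10110000])              # one byte
b = bytes([0b11010000, 0b00000001])  # two bytes, so `a` is zero-padded

both = bitop(and_, a, b)    # 10010000 00000000
either = bitop(or_, a, b)   # 11110000 00000001
differ = bitop(xor, a, b)   # 01100000 00000001

print(len(both), bitcount(both), bitcount(either), bitcount(differ))  # 2 2 5 3
```

The result is as long as the longest input, which matches the return value described for bitop above.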

BitMap space calculation

Because a BitMap is stored as a string, and the size of a string value is limited, the maximum size of a BitMap follows directly from the string limit.

The maximum length of a string in Redis is 512 MB, so the number of bits a BitMap can hold is:

512 * 1024 * 1024 * 8  =  2^32

Offsets start at 0, so the largest valid offset is 2^32 - 1.
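The arithmetic can be checked in a few lines of Python. The variable names are illustrative; the only input is the 512 MB string limit stated above. Note that because offsets are zero-based, the largest usable offset is one less than the bit count.

```python
MAX_STRING_BYTES = 512 * 1024 * 1024  # 512 MB string limit
total_bits = MAX_STRING_BYTES * 8     # bits available in one value
max_offset = total_bits - 1           # offsets are zero-based

print(total_bits == 2**32)  # True
print(max_offset)           # 4294967295
```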

Usage scenarios

1. User check-in

Many websites provide a check-in function and need to display each user's check-ins for the last month, which can be implemented with a BitMap.
Derive the offset from the date: offset = (day of the year for today) % (number of days in the year), and use key = year:userId.

If you need to store the user's detailed check-in information, consider writing it in an asynchronous thread.

# On the first day of 2021, the user with Id = userId checks in
setbit 2021:userId 1 1
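As a sketch of how the key and offset for a check-in might be derived, the following Python applies the formula given above. The function name checkin_key_and_offset is hypothetical, and the code only builds the command text rather than talking to Redis.

```python
import calendar
from datetime import date


def checkin_key_and_offset(user_id: str, day: date) -> tuple[str, int]:
    """Build the year:userId key and the day-of-year offset for a check-in."""
    days_in_year = 366 if calendar.isleap(day.year) else 365
    offset = day.timetuple().tm_yday % days_in_year
    return f"{day.year}:{user_id}", offset


key, offset = checkin_key_and_offset("userId", date(2021, 1, 1))
print(f"setbit {key} {offset} 1")  # setbit 2021:userId 1 1
```

One consequence of the modulo in this formula is that the last day of the year wraps around to offset 0, so each year still uses exactly one bit per day.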

2. Counting active users (user login status)

Use the date as the key and the user id as the offset, and set the bit to 1 if the user was active on that day. You can define the criteria for "active" yourself.

Suppose:

  • On 20220101 the activity bits are: [1, 0, 1, 1, 0]
  • On 20220102 the activity bits are: [1, 1, 0, 1, 0]

Count the users who were active on both days:

bitop and dest1 20220101 20220102
# The offsets set to 1 in dest1 are the IDs of users active on both days
bitcount dest1

Count the users who were active on at least one day from 20220101 to 20220102:

bitop or dest2 20220101 20220102
# The offsets set to 1 in dest2 are the IDs of users active on at least one of the two days
bitcount dest2
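The two results can be worked through by hand with the sample data above. This Python sketch models each day's bitmap as a list of bits indexed by user ID (an assumption made for readability; Redis stores them packed into a string) and applies the same AND and OR steps.

```python
day1 = [1, 0, 1, 1, 0]  # 20220101, index = user ID
day2 = [1, 1, 0, 1, 0]  # 20220102

# bitop and: a bit is 1 only if the user was active on both days
both = [a & b for a, b in zip(day1, day2)]
# bitop or: a bit is 1 if the user was active on at least one day
either = [a | b for a, b in zip(day1, day2)]

both_ids = [user_id for user_id, bit in enumerate(both) if bit]
either_ids = [user_id for user_id, bit in enumerate(either) if bit]

print(sum(both), both_ids)      # 2 [0, 3]
print(sum(either), either_ids)  # 4 [0, 1, 2, 3]
```

So bitcount dest1 would report 2 users and bitcount dest2 would report 4 for this data.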

3. Tracking user online status

If you need to provide an interface for querying whether a user is currently online, a BitMap is also worth considering: it saves space and is very efficient. Only one key is needed. The user id is the offset, and the bit is set to 1 if the user is online and 0 if not.

# userId logs in; set the status to 1
setbit key userId 1

# Get the status of userId: 1 = online; 0 = offline
getbit key userId
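To put a number on the space saving, the sketch below estimates how many bytes a single online-status key needs. It assumes user IDs are dense non-negative integers used directly as offsets; the figure of 100 million users and the helper name bitmap_bytes are hypothetical.

```python
def bitmap_bytes(max_user_id: int) -> int:
    """Bytes needed so that offset `max_user_id` fits in the bitmap."""
    return max_user_id // 8 + 1


users = 100_000_000                 # hypothetical user count, IDs 0 .. users - 1
size = bitmap_bytes(users - 1)
print(size)                         # 12500000 bytes, about 12.5 MB
```

The size is driven by the largest ID, not by how many users are online, so sparse or very large IDs reduce the benefit.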

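4. Implementing a Bloom filter

A Bloom filter can be built on top of a BitMap. It is commonly used to guard against cache penetration, where requests for keys that do not exist bypass the cache and hit the database every time.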
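As an illustration of how a Bloom filter maps onto a bit array, here is a minimal in-memory sketch in Python. Everything specific in it is an assumption chosen for the example: the class name, the 1024-bit size, the three hash functions, and the use of salted SHA-256 to derive offsets. With Redis, the same offsets would be written with setbit and checked with getbit.

```python
import hashlib


class BloomFilter:
    """A tiny Bloom filter backed by a bytearray used as a bitmap."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _offsets(self, item: str):
        # Derive one bit offset per hash by salting the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        # Equivalent to one setbit per offset.
        for offset in self._offsets(item):
            self.bits[offset // 8] |= 1 << (7 - offset % 8)

    def might_contain(self, item: str) -> bool:
        # Equivalent to one getbit per offset; all must be 1.
        return all(
            self.bits[offset // 8] & (1 << (7 - offset % 8))
            for offset in self._offsets(item)
        )


known_ids = BloomFilter()
known_ids.add("user:1")
print(known_ids.might_contain("user:1"))  # True
```

A False answer means the item was definitely never added, so the request can be rejected before it reaches the database. A True answer may be a false positive, so it still needs to be confirmed against the cache or database.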


Summary

  1. A BitMap stores data at the smallest unit, the bit, and its biggest advantage is that it is very space-saving;
  2. setbit and getbit run in O(1) time, while bitcount, bitop and bitpos are O(n) in the length of the string, so single-bit operations are very fast;
  3. Computations on the binary data are fast, and a BitMap can be easily expanded;
  4. Don't set a bit at a very large offset on a short or empty BitMap: Redis must allocate and zero all the intervening bytes at once, which may block the server.


Origin blog.csdn.net/weixin_44259720/article/details/122364337