BitMap
What is a BitMap
A BitMap (bitmap) is in essence a byte array read in its binary representation, so every position holds one of only two digits, 0 or 1.
As the figure shows:
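
To make the byte-array view concrete, the following minimal sketch prints the bits behind a short byte string. The sample value `big` and the helper name `to_bits` are illustrative assumptions, not taken from the text.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as 8 binary digits, most significant bit first."""
    return " ".join(format(byte, "08b") for byte in data)


sample = b"big"  # arbitrary example value (assumption)
bits = to_bits(sample)
print(bits)  # 01100010 01101001 01100111
```

Each group of eight digits is one byte of the array, and a bit offset simply counts digits from the left.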
Important API
Command | Meaning |
---|---|
getbit key offset | Returns the bit stored at the given offset in the string value at key. |
setbit key offset value | Sets or clears the bit at the given offset in the string value at key. (1) The return value is the value of that bit before the setbit call. (2) value can only be 0 or 1. (3) offset starts at 0; even if the existing bitmap has only 10 bits, offset can be 1000. |
bitcount key [start end] | Returns the number of bits set to 1 within the given range of the bitmap. If start and end are omitted, the whole bitmap is counted. |
bitop op destKey key1 [key2 ...] | Performs an and (intersection), or (union), not (negation), or xor (exclusive or) operation on one or more BitMaps and stores the result in destKey. |
bitpos key targetBit [start end] | Returns the offset of the first bit equal to targetBit within the given range of the bitmap. (1) Returns -1 if no such bit is found. (2) If start and end are omitted, the whole bitmap is searched. (3) targetBit can only be 0 or 1. |
Demo
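
The command semantics listed in the table can be modeled in memory with a small sketch. This is not a Redis client: the `BitMap` class and the `bitop` helper are hypothetical names, range arguments and the `not` operation are left out, and bits are numbered from the most significant bit of the first byte, which is how Redis numbers them.

```python
class BitMap:
    """In-memory model of the bitmap commands (a sketch, not a Redis client)."""

    def __init__(self) -> None:
        self.data = bytearray()

    def setbit(self, offset: int, value: int) -> int:
        """Set or clear one bit and return its previous value."""
        if value not in (0, 1):
            raise ValueError("value must be 0 or 1")
        byte_index, bit_index = divmod(offset, 8)
        # Grow with zero bytes until the offset fits.
        if byte_index >= len(self.data):
            self.data.extend(bytes(byte_index + 1 - len(self.data)))
        mask = 0x80 >> bit_index  # offset 0 is the most significant bit
        previous = 1 if self.data[byte_index] & mask else 0
        if value:
            self.data[byte_index] |= mask
        else:
            self.data[byte_index] &= ~mask & 0xFF
        return previous

    def getbit(self, offset: int) -> int:
        """Return the bit at offset; offsets past the end read as 0."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.data):
            return 0
        return 1 if self.data[byte_index] & (0x80 >> bit_index) else 0

    def bitcount(self) -> int:
        """Count the bits set to 1 in the whole bitmap."""
        return sum(bin(byte).count("1") for byte in self.data)

    def bitpos(self, target_bit: int) -> int:
        """Return the first offset holding target_bit, or -1 if none."""
        for offset in range(len(self.data) * 8):
            if self.getbit(offset) == target_bit:
                return offset
        return -1


def bitop(op: str, *sources: BitMap) -> BitMap:
    """Combine one or more bitmaps with and/or/xor, zero-padding shorter ones."""
    combine = {
        "and": lambda x, y: x & y,
        "or": lambda x, y: x | y,
        "xor": lambda x, y: x ^ y,
    }[op]
    width = max(len(source.data) for source in sources)
    padded = [bytes(source.data).ljust(width, b"\x00") for source in sources]
    result = BitMap()
    result.data = bytearray(padded[0])
    for other in padded[1:]:
        result.data = bytearray(combine(x, y) for x, y in zip(result.data, other))
    return result


bm = BitMap()
first = bm.setbit(7, 1)   # 0: the bit was clear before
again = bm.setbit(7, 1)   # 1: the bit was already set
bm.setbit(50, 1)          # grows the bitmap to 7 bytes

other = BitMap()
other.setbit(7, 1)
other.setbit(8, 1)

print(bm.getbit(7), bm.getbit(1000), bm.bitcount(), bm.bitpos(1))
print(bitop("and", bm, other).bitcount(), bitop("or", bm, other).bitcount())
```

Reading an offset far past the end returns 0 without allocating anything, whereas writing to it grows the underlying byte array.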
Scenarios
Counting unique daily users. Each user is identified by a user ID; when a user visits the website or performs an action, the bit that corresponds to that user in the bitmap is set to 1.
The following compares storing this data in a set and in a BitMap.
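
The idea can be sketched in memory with one byte array per day. The helper names (`mark_active`, `unique_users`, `active_on_both`) and the sample user IDs are assumptions made for illustration; in Redis the same steps would correspond to setbit, bitcount, and bitop with and.

```python
def mark_active(day_bitmap: bytearray, user_id: int) -> None:
    """Set the bit for user_id, growing the bitmap with zero bytes if needed."""
    byte_index, bit_index = divmod(user_id, 8)
    if byte_index >= len(day_bitmap):
        day_bitmap.extend(bytes(byte_index + 1 - len(day_bitmap)))
    day_bitmap[byte_index] |= 0x80 >> bit_index


def unique_users(day_bitmap: bytearray) -> int:
    """Count the bits set to 1, i.e. the distinct users seen that day."""
    return sum(bin(byte).count("1") for byte in day_bitmap)


def active_on_both(day_a: bytearray, day_b: bytearray) -> int:
    """Count users present in both bitmaps (bytes missing from one count as 0)."""
    return sum(bin(x & y).count("1") for x, y in zip(day_a, day_b))


monday, tuesday = bytearray(), bytearray()
for user_id in [3, 17, 3, 42, 17]:  # repeated visits set the same bit again
    mark_active(monday, user_id)
for user_id in [17, 99]:
    mark_active(tuesday, user_id)

monday_count = unique_users(monday)      # 3 distinct users
tuesday_count = unique_users(tuesday)    # 2 distinct users
both_days = active_on_both(monday, tuesday)  # only user 17
print(monday_count, tuesday_count, both_days)
```

Repeated visits by the same user cost nothing extra, because they only set a bit that is already 1.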
Scene 1: 100 million users in total, 50 million of them unique visitors
Data type | Space per userid | Number of users to store | Total memory |
---|---|---|---|
set | 32 bits (assuming userid is an integer; many websites actually use a long integer) | 50,000,000 | 32 bits × 50,000,000 = 200 MB |
BitMap | 1 bit | 100,000,000 | 1 bit × 100,000,000 = 12.5 MB |

Keeping one such structure per day accumulates as follows:

Data type | One day | One month | One year |
---|---|---|---|
set | 200 MB | 6 GB | 72 GB |
BitMap | 12.5 MB | 375 MB | 4.5 GB |
Scene 2: only 1 million unique users
Data type | Space per userid | Number of users to store | Total memory |
---|---|---|---|
set | 32 bits (assuming userid is an integer; many websites actually use a long integer) | 1,000,000 | 32 bits × 1,000,000 = 4 MB |
BitMap | 1 bit | 100,000,000 | 1 bit × 100,000,000 = 12.5 MB |
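The comparison above shows that when the number of unique users is large, a BitMap has a clear advantage and saves a great deal of memory. When the number of unique users is small, however, a set is still the better choice, because a BitMap incurs unnecessary storage overhead.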
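
The arithmetic behind the two tables can be reproduced with a short sketch. It uses the same simplified model as the tables: 32 bits per set member with no per-entry overhead, one bit per registered user for the bitmap, and decimal megabytes. The function names and the break-even calculation are additions for illustration.

```python
MB = 1_000_000  # decimal megabytes, matching the tables


def set_bytes(unique_users: int, bits_per_id: int = 32) -> int:
    """Memory for a set under the simplified model: bits_per_id per member."""
    return unique_users * bits_per_id // 8


def bitmap_bytes(total_users: int) -> int:
    """Memory for a bitmap: one bit per registered user, rounded up to bytes."""
    return (total_users + 7) // 8


total_users = 100_000_000

scene1_set_mb = set_bytes(50_000_000) / MB      # 200.0
scene1_bitmap_mb = bitmap_bytes(total_users) / MB  # 12.5
scene2_set_mb = set_bytes(1_000_000) / MB       # 4.0

# Under this model the two structures cost the same when
# 32 bits * unique_users == 1 bit * total_users.
break_even_users = total_users // 32            # 3,125,000

print(scene1_set_mb, scene1_bitmap_mb, scene2_set_mb, break_even_users)
```

With 100 million registered users, the bitmap only wins in this model once more than about 3.1 million of them are active; a real set carries additional per-entry overhead, which would move that threshold lower.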
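Usage notes
- type = string: a BitMap is a string type, with a maximum size of 512 MB.
- Mind the offset passed to setbit: a very large offset can make the call slow, because the string has to grow to cover that offset.
- A BitMap is not always the best choice.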
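
The size limit and the offset warning are connected: writing a single bit forces the string to extend up to that bit. A minimal sketch of the arithmetic, where `bytes_for_offset` is a hypothetical helper and not a Redis command:

```python
def bytes_for_offset(offset: int) -> int:
    """Bytes a bitmap must hold so that the bit at `offset` exists."""
    return offset // 8 + 1


small = bytes_for_offset(1000)           # 126 bytes
largest = bytes_for_offset(2**32 - 1)    # 536,870,912 bytes
print(small, largest, largest == 512 * 1024 * 1024)
```

Setting one bit near the top of the offset range therefore asks for the full 512 MB in one step, which is why a sparse bitmap with a few very large user IDs can be both slow to write and wasteful.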