Are you sure you don't want to learn how Redis strings work under the hood?

Foreword
Redis has five basic data structures: string, list, set, zset and hash. Of these, string is the simplest and the most commonly used. The type looks simple, but its internal structure is very delicately designed...
Basic introduction
Unlike a Java String, a Redis string can be modified. It is a dynamic string (Simple Dynamic String, abbreviated SDS) whose internal structure resembles Java's ArrayList: it maintains a byte array and pre-allocates spare space to reduce frequent memory reallocation. While the string is shorter than 1MB, each expansion doubles the available space; once it is longer than 1MB, each expansion adds only 1MB.
PS: the maximum length of a string is 512MB.
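The growth rule above can be sketched in a few lines of Python. This is only a model of the policy as described here, not Redis's actual C code; the function name grown_capacity and the constant names are made up for illustration.

```python
ONE_MB = 1024 * 1024
MAX_STRING_BYTES = 512 * ONE_MB  # maximum string length stated in the text


def grown_capacity(needed_len: int) -> int:
    """Return the capacity to reserve for a string that must hold needed_len bytes.

    Hypothetical helper: it models the expansion policy described in the text.
    """
    if needed_len > MAX_STRING_BYTES:
        raise ValueError("string would exceed the 512MB limit")
    if needed_len < ONE_MB:
        return needed_len * 2      # small strings: double the space
    return needed_len + ONE_MB     # large strings: add only 1MB of slack


print(grown_capacity(10))           # 20
print(grown_capacity(3 * ONE_MB))   # 4194304
```

Under this model a 10-byte string is given 20 bytes, while a 3MB string grows to 4MB rather than 6MB, which keeps the wasted slack bounded for large values.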
> set name test
OK
> get name
"test"
> mset name1 test1 name2 test2
OK
> mget name1 name2
1) "test1"
2) "test2"
> del name
(integer) 1
The commands above are the basic string operations. mset and mget write and read several strings in a single round trip, which saves network overhead.
Unlike a Java String, a Redis string can also store an integer, and a string that holds an integer can be incremented. The integer must fit in the signed 64-bit range, $-2^{63}$ to $2^{63}-1$.
If the number you store falls outside this range, it is kept as an ordinary string and cannot be incremented. This is decided by the string's encoding format.
A string consists of a sequence of bytes, each of 8 bits, so this data structure can also be used as a bitmap.
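As a rough illustration of the bitmap idea, the sketch below treats a Python bytearray as the string's bytes. The helpers set_bit and get_bit are hypothetical names, not a Redis API; the bit numbering (offset 0 is the most significant bit of the first byte) follows the way Redis's SETBIT and GETBIT commands number bits.

```python
def set_bit(buf: bytearray, offset: int, value: int) -> int:
    """Set the bit at offset to value and return the previous bit."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(buf):
        # Grow the buffer with zero bytes so the target byte exists.
        buf.extend(b"\x00" * (byte_index + 1 - len(buf)))
    mask = 1 << (7 - bit_index)  # offset 0 is the most significant bit
    old = 1 if buf[byte_index] & mask else 0
    if value:
        buf[byte_index] |= mask
    else:
        buf[byte_index] &= ~mask & 0xFF
    return old


def get_bit(buf: bytearray, offset: int) -> int:
    """Return the bit at offset; bits past the end read as 0."""
    byte_index, bit_index = divmod(offset, 8)
    if byte_index >= len(buf):
        return 0
    return (buf[byte_index] >> (7 - bit_index)) & 1


bitmap = bytearray()
for position in (1, 2, 7):
    set_bit(bitmap, position, 1)
print(bytes(bitmap))  # b'a', because 0b01100001 is the ASCII code of "a"
```

Setting bits 1, 2 and 7 produces the single byte 0x61, which read back as a string is the letter "a". This is why a bitmap and a string are the same value seen in two ways.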
> set foo 1
OK
> get foo
"1"
> incr foo
(integer) 2
> get foo
"2"
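The increment behaviour can be modelled roughly as follows. This is a sketch of the rule described above (only values inside the signed 64-bit range can be incremented), with hypothetical helper names as_int64 and incr; the error messages are illustrative and the parsing is simplified compared with what the server does.

```python
import re

INT64_MIN = -2**63
INT64_MAX = 2**63 - 1
_DECIMAL = re.compile(r"-?[0-9]+")


def as_int64(text: str):
    """Return text as an int if it is a decimal number in signed 64-bit range, else None."""
    if not _DECIMAL.fullmatch(text):
        return None
    number = int(text)
    if INT64_MIN <= number <= INT64_MAX:
        return number
    return None


def incr(text: str) -> str:
    """Model of incrementing a stored string value by one."""
    number = as_int64(text)
    if number is None:
        raise ValueError("value is not an integer or out of range")
    if number == INT64_MAX:
        raise OverflowError("increment would overflow")
    return str(number + 1)


print(incr("1"))                 # 2
print(as_int64(str(2**63)))      # None
```

A value one past the upper bound is rejected by as_int64, so under this model it stays an ordinary string and incr refuses to operate on it.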
Internal principle
Basic structure
(Figure: basic structure of an SDS string.)
The content field stores the string's bytes and, as in C, ends with a 0x0 byte. This terminator is not counted in len. The structure is defined as follows:
struct SDS<T> {
    T capacity;      // array capacity
    T len;           // actual length
    byte flags;      // flag bits; the low three bits indicate the type
    byte[] content;  // contents of the array
}
You can see that capacity and len are generic. Why not simply use int? Because Redis is heavily optimized internally: to reduce memory usage, it represents strings of different lengths with different integer types. In addition, when a string is created, len and capacity are equal, so no spare space is reserved, because in most scenarios strings are rarely modified. (Redis really does push memory optimization to the extreme.)
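To make the idea of a generic T concrete, here is a small sketch that picks the narrowest unsigned integer type able to hold the length and packs a header in the capacity, len, flags order shown above. The width table, the type ids stored in flags, and the name pack_sds are assumptions for illustration; they are not taken from the Redis source.

```python
import struct

# (largest length representable, struct format code) for each candidate type T.
# Illustrative table: 1-, 2-, 4- and 8-byte unsigned integers.
_WIDTHS = [
    (2**8 - 1, "B"),
    (2**16 - 1, "H"),
    (2**32 - 1, "I"),
    (2**64 - 1, "Q"),
]


def pack_sds(content: bytes) -> bytes:
    """Pack content as capacity, len, flags, bytes and a trailing 0x0."""
    length = len(content)
    for type_id, (max_len, code) in enumerate(_WIDTHS):
        if length <= max_len:
            break
    else:
        raise ValueError("content too long")
    # At creation time capacity equals len, so no spare space is reserved.
    header = struct.pack("<" + code + code + "B", length, length, type_id)
    return header + content + b"\x00"


print(len(pack_sds(b"test")))      # 8: 3 header bytes + 4 content bytes + terminator
print(len(pack_sds(b"x" * 300)))   # 306: 5 header bytes + 300 content bytes + terminator
```

A 4-byte string pays only 3 bytes of header, whereas a fixed 4-byte int for both capacity and len would cost 9 bytes regardless of length.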
Encoding format
A Redis string has three encoding formats: int, embstr and raw. The differences between them are described in detail below.
Before that, a word about RedisObject, the Redis object header. Every Redis object has the following header structure.
struct RedisObject {
    int4 type;       // one of the 5 data types
    int4 encoding;   // internal encoding of the value, e.g. int or embstr
    int24 lru;       // LRU information used when evicting objects under memory pressure
    int32 refcount;  // number of references to this object
    void *ptr;       // pointer to the object's content
}
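Adding up the field widths in this header gives its size. The short calculation below assumes a 64-bit platform, so that the pointer occupies 64 bits; the dictionary and function names are made up for the example.

```python
# Bit widths of the RedisObject fields listed above.
# Assumption: 64-bit platform, so the pointer is 64 bits wide.
FIELD_BITS = {
    "type": 4,
    "encoding": 4,
    "lru": 24,
    "refcount": 32,
    "ptr": 64,
}


def header_bytes(field_bits: dict) -> int:
    """Return the header size in bytes, assuming the fields pack without padding."""
    total_bits = sum(field_bits.values())
    if total_bits % 8 != 0:
        raise ValueError("fields do not fill a whole number of bytes")
    return total_bits // 8


print(header_bytes(FIELD_BITS))  # 16
```

The first three fields share one 32-bit word, so the whole header comes to 16 bytes.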
int encoding
When the stored value is a 64-bit signed integer, Redis uses the int encoding, and the key can then be incremented. At startup Redis creates 10,000 shared redisObject instances (the header structure described above) holding the values [0, 10000). If an integer stored in Redis falls within [0, 10000), no new object is created; the key points directly to the shared object and takes up no extra space.
Use the object encoding command to view the encoding format, and the debug object command to view more information.
> set foo 1
OK
> object encoding foo
"int"
> set foo2 1
OK
> debug object foo
Value at:0x7f44b020aca0 refcount:2147483647 encoding:int serializedlength:2 lru:14691591 lru_seconds_idle:72588
> debug object foo2
Value at:0x7f44b020aca0 refcount:2147483647 encoding:int serializedlength:2 lru:14691591 lru_seconds_idle:72594
Both foo and foo2 report the address 0x7f44b020aca0, so they point to the same object.
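The sharing seen in the session above can be imitated with a pre-built pool. This is a sketch that assumes a pool of 10,000 small integers created once at startup; IntObject and int_object are hypothetical names, and Python object identity stands in for the memory address.

```python
SHARED_INTEGERS = 10000  # assumed pool size: values 0..9999


class IntObject:
    """Minimal stand-in for an int-encoded object."""
    __slots__ = ("value",)

    def __init__(self, value: int):
        self.value = value


# Built once, like the shared objects created at startup.
_SHARED_POOL = [IntObject(i) for i in range(SHARED_INTEGERS)]


def int_object(value: int) -> IntObject:
    """Return the shared object for small values, or a fresh object otherwise."""
    if 0 <= value < SHARED_INTEGERS:
        return _SHARED_POOL[value]
    return IntObject(value)


foo = int_object(1)
foo2 = int_object(1)
print(foo is foo2)                           # True: same shared object
print(int_object(10000) is int_object(10000))  # False: outside the pool
```

Two keys holding the value 1 end up referencing one object, which mirrors the identical address in the debug object output.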
embstr encoding
When a string is short (len <= 44 bytes), Redis uses the embstr encoding, short for embedded string. The SDS structure is embedded in the RedisObject: both are allocated with a single malloc call, so their memory is contiguous.
(Figure: embstr layout, with the RedisObject header and the SDS in one contiguous block.)
raw encoding
When a string is longer (len > 44 bytes), Redis uses the raw encoding. The biggest difference from embstr is that the RedisObject and the SDS are no longer stored together, so their memory addresses are not contiguous.
(Figure: raw layout, with the RedisObject header pointing to a separately allocated SDS.)
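Putting the three encodings together, the choice can be approximated as below. This is a simplified model built from the rules in this article (signed 64-bit integers use int, strings up to 44 bytes use embstr, longer ones use raw). The function name choose_encoding is hypothetical, and treating only canonical decimal text as an integer is an assumption of the sketch.

```python
import re

EMBSTR_LIMIT = 44  # longest string stored as embstr, per the text
# Canonical decimal text only: no leading zeros, no "-0" (an assumption of this sketch).
_CANONICAL_INT = re.compile(rb"0|-?[1-9][0-9]*")


def choose_encoding(value: bytes) -> str:
    """Return "int", "embstr" or "raw" for a string value."""
    if _CANONICAL_INT.fullmatch(value) and -2**63 <= int(value) <= 2**63 - 1:
        return "int"
    if len(value) <= EMBSTR_LIMIT:
        return "embstr"
    return "raw"


print(choose_encoding(b"1"))        # int
print(choose_encoding(b"a" * 44))   # embstr
print(choose_encoding(b"a" * 45))   # raw
```

On a live server the result can be compared against the output of the object encoding command shown earlier.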
Think about it
Why are there two string formats, embstr and raw, and why is the dividing line 44 bytes?
jemalloc, the default memory allocator used by Redis, allocates memory in sizes that are powers of two ($2^n$). To hold a complete embstr object it allocates at least 32 bytes, and 64 bytes if the string is a little longer. Anything that needs more than 64 bytes is considered a large string, unsuitable for embstr storage, and is stored with the raw encoding.
So the question is: how long can the string be when the total space is 64 bytes? The answer is 44 bytes.
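Of the 64 bytes, the RedisObject header takes 16 bytes and the SDS header takes 3 bytes (capacity, len and flags at one byte each), which leaves 45 bytes for content. Subtracting the trailing 0x0 leaves 44 bytes for the string itself.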
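The arithmetic behind the 44-byte limit can be checked directly. The sizes below are assumptions stated for the example: a 16-byte RedisObject header on a 64-bit platform, and a 3-byte SDS header in which capacity, len and flags each take one byte.

```python
REDIS_OBJECT_HEADER = 16  # type + encoding + lru (4 bytes), refcount (4), ptr (8)
SDS_HEADER = 3            # capacity, len and flags at one byte each
TERMINATOR = 1            # trailing 0x0 byte


def embstr_max_len(allocation_bytes: int) -> int:
    """Return the longest string that fits in one allocation of the given size."""
    return allocation_bytes - REDIS_OBJECT_HEADER - SDS_HEADER - TERMINATOR


print(embstr_max_len(64))  # 44
print(embstr_max_len(32))  # 12
```

The same formula shows that the smaller 32-byte allocation holds strings of up to 12 bytes.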
Reproduced from: https://juejin.im/post/5cf4984c6fb9a07eae2a4834

Origin blog.csdn.net/weixin_34111790/article/details/91447581