Redis source code analysis (three) - RDB persistence

Redis RDB persistence

  Redis supports two persistence mechanisms: RDB and AOF. RDB serializes the data held in memory at a point in time and stores it on disk. The data is serialized as compactly as possible, so the file is not pure ASCII. Its advantages are small files and fast recovery; its disadvantages are that every save must serialize the entire in-memory data set, and that modifications made while persistence is in progress are not captured. AOF instead stores the commands that operate on the data; recovery reads the commands from the AOF file and executes them again. Its advantage is that commands issued while persistence is in progress can still be recorded; its disadvantages are that the file is entirely ASCII-encoded, so it takes more space, and that recovery is slower. This chapter gives a brief introduction to the RDB persistence file format.

  Redis has several data types, such as string, set, zset, hash and list, and it has multiple dbs (a db is a dictionary that stores data of the string, list, set and other types). To serialize these different types of data, Redis defines a corresponding persistence format for each.

1. Serializing strings

  When a string is serialized, its length is stored first, followed by the actual bytes of the string. To save space, Redis stores the length in as few bytes as possible, so the length is encoded in one of the following formats depending on its value:

  1. len < 1 << 6: the length is encoded in 1 byte; the high 2 bits are 00 and the low 6 bits hold the length
  2. 1 << 6 <= len < 1 << 14: the length is encoded in 2 bytes; the high 2 bits of the first byte are 01 and the following 14 bits hold the length
  3. 1 << 14 <= len <= UINT32_MAX: the length is encoded in 5 bytes; the first byte is 0x80 and the following 4 bytes hold the length
  4. UINT32_MAX < len: the length is encoded in 9 bytes; the first byte is 0x81 and the following 8 bytes hold the length

The corresponding definitions in Redis are as follows:

/* Defines related to the dump file format. To store 32 bits lengths for short
 * keys requires a lot of space, so we check the most significant 2 bits of
 * the first byte to interpreter the length:
 *
 * 00|XXXXXX => if the two MSB are 00 the len is the 6 bits of this byte
 * 01|XXXXXX XXXXXXXX =>  01, the len is 14 byes, 6 bits + 8 bits of next byte
 * 10|000000 [32 bit integer] => A full 32 bit len in net byte order will follow
 * 10|000001 [64 bit integer] => A full 64 bit len in net byte order will follow
 * 11|OBKIND this means: specially encoded object will follow. The six bits
 *           number specify the kind of object that follows.
 *           See the RDB_ENC_* defines.
 *
 * Lengths up to 63 are stored using a single byte, most DB keys, and may
 * values, will fit inside. */

For example, the string "hello, world" is encoded as follows:

0xC "hello, world"

0xC is the 1-byte encoding of the string length, 12, and the 12 bytes that follow are the actual content of the string. Because the high bits of the first byte uniquely identify which format is in use, deserialization reads one byte first and then chooses the correct operation based on its value.

  In the Redis source file rdb.c, the function rdbSaveLen encodes a length and rdbLoadLen decodes it.
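The four length formats can be sketched in a few lines of C. This is a minimal illustration, not the code of rdbSaveLen and rdbLoadLen: the names sketch_save_len and sketch_load_len are hypothetical, and the sketch works on a plain byte buffer instead of a Redis I/O stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode len into buf (at least 9 bytes) using the four formats above.
 * Returns the number of bytes written. Multi-byte lengths are written in
 * network byte order (big endian), as the Redis comment describes. */
size_t sketch_save_len(unsigned char *buf, uint64_t len) {
    assert(buf != NULL);
    if (len < (1u << 6)) {
        buf[0] = (unsigned char)len;                 /* 00|XXXXXX */
        return 1;
    }
    if (len < (1u << 14)) {
        buf[0] = (unsigned char)((len >> 8) | 0x40); /* 01|XXXXXX */
        buf[1] = (unsigned char)(len & 0xFF);
        return 2;
    }
    int nbytes = (len <= UINT32_MAX) ? 4 : 8;
    buf[0] = (nbytes == 4) ? 0x80 : 0x81;
    for (int i = 0; i < nbytes; i++) {
        buf[1 + i] = (unsigned char)(len >> (8 * (nbytes - 1 - i)));
    }
    return (size_t)nbytes + 1;
}

/* Decode a length from buf into *len. Returns the number of bytes consumed,
 * or 0 if the first byte does not start a plain length (for example the
 * 11|XXXXXX special encodings). */
size_t sketch_load_len(const unsigned char *buf, uint64_t *len) {
    assert(buf != NULL && len != NULL);
    unsigned type = buf[0] >> 6;
    if (type == 0) {
        *len = buf[0] & 0x3F;
        return 1;
    }
    if (type == 1) {
        *len = ((uint64_t)(buf[0] & 0x3F) << 8) | buf[1];
        return 2;
    }
    if (buf[0] == 0x80 || buf[0] == 0x81) {
        int nbytes = (buf[0] == 0x80) ? 4 : 8;
        uint64_t v = 0;
        for (int i = 0; i < nbytes; i++) v = (v << 8) | buf[1 + i];
        *len = v;
        return (size_t)nbytes + 1;
    }
    return 0;
}
```

Calling sketch_save_len with 12 writes the single byte 0x0C, which matches the "hello, world" example above.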

2. Serializing integers

  When the value to be serialized is an integer, the integer is serialized directly in binary rather than converted to an ASCII string, which saves further space. As with lengths, Redis stores the value in as few bytes as possible depending on its magnitude. Similar to the string length encoding, the high 2 bits of the first byte are set to 11 to indicate that an integer follows, and the low 6 bits distinguish how many bytes the value occupies.

  1. (value >= -(1 << 7) && value <= (1 << 7) - 1): the low 6 bits are 0, and the following 1 byte stores the integer value.
  2. (value >= -(1 << 15) && value <= (1 << 15) - 1): the low 6 bits are 1, and the following 2 bytes store the integer value.
  3. (value >= -((long long) 1 << 31) && value <= ((long long) 1 << 31) - 1): the low 6 bits are 2, and the following 4 bytes store the integer value.

Overall form:

11000000 [8 bit integer]
11000001 [16 bit integer]
11000010 [32 bit integer]

For example, the integer value 0x300 is serialized as follows:

0xC1 0x0300

0xC1 indicates that the following integer is stored as an int16, and 0x0300 is the integer value. Because the high 2 bits of the first byte are 11, which differs from every string length format, a reader can tell whether a string or an integer follows.

  In addition, when the high 2 bits of the first byte are 11 and the low 6 bits are 3 (RDB_ENC_LZF), the data that follows is compressed. The first byte is followed by two length values, the length of the compressed data and the length of the data before compression, and then by the compressed data itself.
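The three integer formats can be sketched as follows. The function names are hypothetical, and the sketch assumes that the payload bytes are written low byte first (little endian), so 0x300 becomes the bytes 0xC1 0x00 0x03; the example above shows the value as 0x0300 without stating a byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode value into enc (at least 5 bytes). Returns the number of bytes
 * written, or 0 if the value does not fit in 32 bits and therefore cannot
 * use the integer encoding. Assumption: payload is little endian. */
size_t sketch_encode_integer(long long value, unsigned char *enc) {
    assert(enc != NULL);
    int nbytes;
    if (value >= -(1 << 7) && value <= (1 << 7) - 1) {
        enc[0] = 0xC0;                      /* 11|000000, int8 follows */
        nbytes = 1;
    } else if (value >= -(1 << 15) && value <= (1 << 15) - 1) {
        enc[0] = 0xC1;                      /* 11|000001, int16 follows */
        nbytes = 2;
    } else if (value >= -((long long)1 << 31) &&
               value <= ((long long)1 << 31) - 1) {
        enc[0] = 0xC2;                      /* 11|000010, int32 follows */
        nbytes = 4;
    } else {
        return 0;
    }
    uint64_t bits = (uint64_t)value;        /* two's complement bit pattern */
    for (int i = 0; i < nbytes; i++) {
        enc[1 + i] = (unsigned char)(bits >> (8 * i));
    }
    return (size_t)nbytes + 1;
}

/* Decode an integer written by sketch_encode_integer. Returns the number of
 * bytes consumed, or 0 if enc[0] is not one of the three integer markers. */
size_t sketch_decode_integer(const unsigned char *enc, long long *value) {
    assert(enc != NULL && value != NULL);
    int nbytes;
    switch (enc[0]) {
    case 0xC0: nbytes = 1; break;
    case 0xC1: nbytes = 2; break;
    case 0xC2: nbytes = 4; break;
    default: return 0;
    }
    uint64_t bits = 0;
    for (int i = 0; i < nbytes; i++) {
        bits |= (uint64_t)enc[1 + i] << (8 * i);
    }
    long long v = (long long)bits;
    /* Sign-extend: if the top bit of the payload is set, subtract 2^(8n). */
    if (bits & ((uint64_t)1 << (8 * nbytes - 1))) {
        v -= (long long)1 << (8 * nbytes);
    }
    *value = v;
    return (size_t)nbytes + 1;
}
```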

3. RDB_TYPE_* and RDB_OPCODE_*

  Redis is a key-value cache system, and every value is stored in a db dictionary as a key-value pair. A value in Redis is not limited to the string type, however; it can also be a structured type such as set, list or hash. To serialize these types, Redis first stores one byte holding the data type. For a composite type it then stores the number of members, and finally it traverses the members and stores each one as a basic int or string value.

  Redis defines the following type values:

/* Map object types to RDB object types. Macros starting with OBJ_ are for
 * memory storage and may change. Instead RDB types must be fixed because
 * we store them on disk. */
#define RDB_TYPE_STRING 0
#define RDB_TYPE_LIST   1
#define RDB_TYPE_SET    2
#define RDB_TYPE_ZSET   3
#define RDB_TYPE_HASH   4
#define RDB_TYPE_ZSET_2 5 /* ZSET version 2 with doubles stored in binary. */
#define RDB_TYPE_MODULE 6
#define RDB_TYPE_MODULE_2 7 /* Module value with annotations for parsing without
                               the generating module being loaded. */
/* NOTE: WHEN ADDING NEW RDB TYPE, UPDATE rdbIsObjectType() BELOW */

/* Object types for encoded objects. */
#define RDB_TYPE_HASH_ZIPMAP    9
#define RDB_TYPE_LIST_ZIPLIST  10
#define RDB_TYPE_SET_INTSET    11
#define RDB_TYPE_ZSET_ZIPLIST  12
#define RDB_TYPE_HASH_ZIPLIST  13
#define RDB_TYPE_LIST_QUICKLIST 14
#define RDB_TYPE_STREAM_LISTPACKS 15
/* NOTE: WHEN ADDING NEW RDB TYPE, UPDATE rdbIsObjectType() BELOW */

These values indicate the type of the serialized value, such as list or set. They also indicate the concrete implementation of the value: a set may be implemented with a hash table, in which case the type is RDB_TYPE_SET, or with a sorted integer array, in which case the type is RDB_TYPE_SET_INTSET. Deserialization first reads the one-byte type value to determine the type of the data that follows, and then reconstructs the corresponding type.

For example, a key-value pair of string type, "key1" "hello, world", is serialized as:

RDB_TYPE_STRING 0x4 "key1" 0xC "hello, world"

First comes the 1-byte type value RDB_TYPE_STRING, then the 1-byte length value 0x4, meaning that the next 4 bytes are the key string, and then the 1-byte length value 0xC, meaning that a 12-byte string value follows.

When deserializing:

  1. Read the first byte and get RDB_TYPE_STRING, so a string object follows.
  2. Read the length value 0x4, then read the corresponding 4 bytes to get the key.
  3. Finally, read the length 0xC, then read the corresponding 12 bytes to get the value.

For example, a key-value pair of list type, "key2" with the members 2 3 4 0x300, is serialized as:

RDB_TYPE_LIST 0x4 "key2" 0x4 0xc0 0x2 0xc0 0x3 0xc0 0x4 0xC1 0x0300

First comes the 1-byte type value RDB_TYPE_LIST, then the length value 0x4, meaning a 4-byte key string, then another 0x4, meaning that the list has four members, and then the four members in order, each serialized in the integer format.

When deserializing:

  1. Read the first byte and get RDB_TYPE_LIST, so a list object follows.
  2. Read the key length 0x4, then read 4 bytes to get "key2".
  3. Read the length 0x4 to get the number of list members, 4.
  4. Read a length and then the integer value, repeating four times to complete the reconstruction of the list object.
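The four deserialization steps for the list example can be sketched as below. The type sketch_list and the function sketch_load_list are hypothetical. The sketch only handles 1-byte lengths and int8/int16 members, and it assumes the 16-bit payload is stored low byte first, so the last member appears in the buffer as 0xC1 0x00 0x03.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_TYPE_LIST 1   /* same value as RDB_TYPE_LIST */
#define SKETCH_MAX_ITEMS 16

typedef struct {
    char key[64];
    long long items[SKETCH_MAX_ITEMS];
    size_t count;
} sketch_list;

/* Read one int8- or int16-encoded member. Returns bytes consumed, 0 on error. */
static size_t sketch_read_int(const unsigned char *p, size_t avail,
                              long long *out) {
    if (avail < 2) return 0;
    if (p[0] == 0xC0) {
        int v = p[1];
        if (v >= 0x80) v -= 0x100;          /* sign-extend 8 bits */
        *out = v;
        return 2;
    }
    if (p[0] == 0xC1 && avail >= 3) {
        int v = p[1] | (p[2] << 8);         /* assumed little endian */
        if (v >= 0x8000) v -= 0x10000;      /* sign-extend 16 bits */
        *out = v;
        return 3;
    }
    return 0;
}

/* Decode one list record. Returns 0 on success, -1 on malformed or
 * unsupported input. */
int sketch_load_list(const unsigned char *buf, size_t n, sketch_list *out) {
    assert(buf != NULL && out != NULL);
    size_t pos = 0;
    if (n < 2 || buf[pos++] != SKETCH_TYPE_LIST) return -1;   /* step 1 */
    size_t klen = buf[pos++];                                 /* step 2 */
    if (klen >= sizeof(out->key) || pos + klen >= n) return -1;
    memcpy(out->key, buf + pos, klen);
    out->key[klen] = '\0';
    pos += klen;
    size_t count = buf[pos++];                                /* step 3 */
    if (count > SKETCH_MAX_ITEMS) return -1;
    for (size_t i = 0; i < count; i++) {                      /* step 4 */
        size_t used = sketch_read_int(buf + pos, n - pos, &out->items[i]);
        if (used == 0) return -1;
        pos += used;
    }
    out->count = count;
    return 0;
}
```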

  Besides RDB_TYPE_*, which corresponds to the stored data types, Redis has a class of RDB_OPCODE_* values that mark other kinds of data. For example, RDB_OPCODE_EXPIRETIME indicates that the data that follows is the expire time of an object, and RDB_OPCODE_SELECTDB indicates that the data that follows is a db index; until the next RDB_OPCODE_SELECTDB is met, all deserialized data is stored in the db dictionary with that index. Redis defines the following RDB_OPCODE_* values:

/* Special RDB opcodes (saved/loaded with rdbSaveType/rdbLoadType). */
#define RDB_OPCODE_MODULE_AUX 247   /* Module auxiliary data. */
#define RDB_OPCODE_IDLE       248   /* LRU idle time. */
#define RDB_OPCODE_FREQ       249   /* LFU frequency. */
#define RDB_OPCODE_AUX        250   /* RDB aux field. */
#define RDB_OPCODE_RESIZEDB   251   /* Hash table resize hint. */
#define RDB_OPCODE_EXPIRETIME_MS 252    /* Expire time in milliseconds. */
#define RDB_OPCODE_EXPIRETIME 253       /* Old expire time in seconds. */
#define RDB_OPCODE_SELECTDB   254   /* DB number of the following keys. */
#define RDB_OPCODE_EOF        255   /* End of the RDB file. */

The data that follows these special types usually has a fixed length: RDB_OPCODE_EXPIRETIME is followed by an expire time expressed in 32 bits, in seconds; RDB_OPCODE_EXPIRETIME_MS is followed by an expire time expressed in 64 bits, in milliseconds; RDB_OPCODE_SELECTDB is followed by the db index, stored with the length encoding; and RDB_OPCODE_AUX introduces a key-value pair whose key and value are both basic types, string or int.

  Apart from the fixed magic bytes at the beginning of the rdb file, every piece of serialized data in an rdb file is preceded by an RDB_TYPE_* or RDB_OPCODE_* value that indicates how the following data is stored, so that the correct deserialization operation can be chosen.

4. RDB serialization process

1. Serialize the preamble information, such as the magic identifier, version number and timestamp

2. Traverse the dbs and serialize each db

   2.1 Serialize RDB_OPCODE_SELECTDB

   2.2 Serialize RDB_OPCODE_RESIZEDB, which stores the size of the db

   2.3 Serialize each key-value pair in the db

       2.3.1 Serialize the expire time, RDB_OPCODE_EXPIRETIME

       2.3.2 Serialize the LRU value, RDB_OPCODE_IDLE

       2.3.3 Serialize the LFU value, RDB_OPCODE_FREQ

       2.3.4 Serialize the value type

       2.3.5 Serialize the key (string)

       2.3.6 Serialize the value
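The outline above can be sketched for the simplest case: one db that holds only short string values. Everything named sv_* is hypothetical. The version string "0009" and the two lengths written after RDB_OPCODE_RESIZEDB (key count and count of keys with an expire time) are assumptions, and the sketch leaves out the aux fields, expire times, LRU/LFU values and the checksum that a real rdb file carries after the EOF opcode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    SV_TYPE_STRING = 0,     /* RDB_TYPE_STRING */
    SV_OP_RESIZEDB = 251,   /* RDB_OPCODE_RESIZEDB */
    SV_OP_SELECTDB = 254,   /* RDB_OPCODE_SELECTDB */
    SV_OP_EOF = 255         /* RDB_OPCODE_EOF */
};

typedef struct {
    unsigned char data[256];
    size_t len;
} sv_buf;

static void sv_put(sv_buf *b, const void *p, size_t n) {
    assert(b->len + n <= sizeof(b->data));
    memcpy(b->data + b->len, p, n);
    b->len += n;
}

static void sv_put_byte(sv_buf *b, unsigned char c) {
    sv_put(b, &c, 1);
}

/* Strings shorter than 64 bytes only: 1-byte length, then the raw bytes. */
static void sv_put_string(sv_buf *b, const char *s) {
    size_t n = strlen(s);
    assert(n < 64);
    sv_put_byte(b, (unsigned char)n);
    sv_put(b, s, n);
}

/* Serialize db 0 holding nkeys string pairs (nkeys < 64) into b. */
void sv_save(sv_buf *b, const char *keys[], const char *vals[], size_t nkeys) {
    assert(b != NULL && nkeys < 64);
    b->len = 0;
    sv_put(b, "REDIS0009", 9);              /* 1.    magic + assumed version */
    sv_put_byte(b, SV_OP_SELECTDB);         /* 2.1   select db ...           */
    sv_put_byte(b, 0);                      /*       ... index 0             */
    sv_put_byte(b, SV_OP_RESIZEDB);         /* 2.2   size hint               */
    sv_put_byte(b, (unsigned char)nkeys);   /*       number of keys          */
    sv_put_byte(b, 0);                      /*       keys with an expire     */
    for (size_t i = 0; i < nkeys; i++) {    /* 2.3   every key-value pair    */
        sv_put_byte(b, SV_TYPE_STRING);     /* 2.3.4 value type              */
        sv_put_string(b, keys[i]);          /* 2.3.5 key                     */
        sv_put_string(b, vals[i]);          /* 2.3.6 value                   */
    }
    sv_put_byte(b, SV_OP_EOF);              /* end-of-file opcode            */
}
```

Saving the single pair "key1" "hello, world" this way produces 34 bytes: 9 for the header, 2 for the db selection, 3 for the size hint, 19 for the pair and 1 for the EOF opcode.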

Redis first introduces rio, an abstraction layer over I/O. Serialization writes to a rio interface, and depending on the target behind the rio, the serialized content can be written to a file or to multiple sockets. Ordinary persistence uses a file as the output target, while master-slave synchronization may use sockets as the output target. The rio abstraction decouples serialization from the underlying I/O.
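The idea behind this kind of abstraction can be sketched with a struct that carries a write function pointer. This is not the rio definition from the Redis source: sk_rio and its helpers are hypothetical, and an in-memory buffer stands in for the socket target so that the sketch stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct sk_rio sk_rio;
struct sk_rio {
    /* Backend hook: returns 1 on success, 0 on error. */
    int (*write)(sk_rio *r, const void *buf, size_t len);
    size_t written;                     /* total bytes written so far */
    union {
        FILE *fp;                       /* file target */
        struct {
            unsigned char *ptr;
            size_t cap;
            size_t pos;
        } mem;                          /* memory target (socket stand-in) */
    } io;
};

static int sk_file_write(sk_rio *r, const void *buf, size_t len) {
    return fwrite(buf, 1, len, r->io.fp) == len;
}

static int sk_mem_write(sk_rio *r, const void *buf, size_t len) {
    if (r->io.mem.pos + len > r->io.mem.cap) return 0;
    memcpy(r->io.mem.ptr + r->io.mem.pos, buf, len);
    r->io.mem.pos += len;
    return 1;
}

void sk_rio_init_file(sk_rio *r, FILE *fp) {
    r->write = sk_file_write;
    r->written = 0;
    r->io.fp = fp;
}

void sk_rio_init_mem(sk_rio *r, unsigned char *ptr, size_t cap) {
    r->write = sk_mem_write;
    r->written = 0;
    r->io.mem.ptr = ptr;
    r->io.mem.cap = cap;
    r->io.mem.pos = 0;
}

int sk_rio_write(sk_rio *r, const void *buf, size_t len) {
    assert(r != NULL && r->write != NULL);
    if (!r->write(r, buf, len)) return 0;
    r->written += len;
    return 1;
}

/* Serialization code only sees sk_rio, never the concrete target. */
int sk_save_string(sk_rio *r, const char *s) {
    size_t n = strlen(s);
    if (n >= 64) return 0;              /* 1-byte length format only */
    unsigned char lenbyte = (unsigned char)n;
    return sk_rio_write(r, &lenbyte, 1) && sk_rio_write(r, s, n);
}
```

The same sk_save_string call works unchanged whether the sk_rio was set up with sk_rio_init_file or sk_rio_init_mem, which is the decoupling the paragraph above describes.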

  The call stack of the Redis serialization functions is as follows:

[Figure: call stacks of the serialization functions. Left: the output target is a file. Right: the output target is sockets.]

5. RDB deserialization process

 1. Read the 9 magic bytes and verify them

Repeat steps 2-3:

2. Read one byte, an RDB_TYPE_* or RDB_OPCODE_* value

3. Process the data that follows according to that RDB_OPCODE_* or RDB_TYPE_* value

In step 3, if a basic type (int or string) has to be read, the process is:

  1. Call rdbLoadLen to read the length
  2. Depending on what rdbLoadLen returns, read a string or an integer value

If the type is a composite type, such as list or set, the process is:

  1. Call rdbLoadLen to read the number of members of the composite type
  2. Read member values in a loop until the specified number has been read. Reading a member is essentially the read of a basic string or int described above.
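The read loop of steps 1 to 3 can be sketched over an in-memory image. The function ld_load is hypothetical and deliberately narrow: it only checks the "REDIS" prefix of the 9-byte header, understands RDB_OPCODE_SELECTDB, RDB_OPCODE_EOF and RDB_TYPE_STRING with 1-byte lengths, and skips over key and value bytes instead of building objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    LD_TYPE_STRING = 0,     /* RDB_TYPE_STRING */
    LD_OP_SELECTDB = 254,   /* RDB_OPCODE_SELECTDB */
    LD_OP_EOF = 255         /* RDB_OPCODE_EOF */
};

/* Walk an rdb image held in memory. Returns the number of string keys found,
 * or -1 on malformed or unsupported input. The index given by the last
 * SELECTDB opcode is stored in *last_db. */
long ld_load(const unsigned char *buf, size_t n, int *last_db) {
    assert(buf != NULL && last_db != NULL);
    if (n < 9 || memcmp(buf, "REDIS", 5) != 0) return -1;    /* step 1 */
    size_t pos = 9;
    long keys = 0;
    while (pos < n) {
        unsigned char type = buf[pos++];                     /* step 2 */
        if (type == LD_OP_EOF) return keys;                  /* step 3 */
        if (type == LD_OP_SELECTDB) {
            if (pos >= n || buf[pos] >= 64) return -1;
            *last_db = buf[pos++];
        } else if (type == LD_TYPE_STRING) {
            for (int part = 0; part < 2; part++) {           /* key, value */
                if (pos >= n || buf[pos] >= 64) return -1;   /* 1-byte len */
                size_t len = buf[pos++];
                if (len > n - pos) return -1;
                pos += len;                                  /* skip bytes */
            }
            keys++;
        } else {
            return -1;                                       /* unsupported */
        }
    }
    return -1;                          /* ran out of data before EOF */
}
```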

The input for deserialization is a file. Even when serialization wrote to sockets, the receiving end first stores the data in a file and then deserializes that file. The deserialization call stack is:

  1. rdbLoad initializes a rio stream on the file
  2. rdbLoadRio takes the rio as input, reads the data from the file and completes the deserialization.

6. Child process

 Because Redis runs in single-threaded mode, it performs the persistence operation in a child process; otherwise the process would stop responding to requests while persisting.

  Because of how fork works, the child process it creates has the same memory contents as the parent, so right after the fork call the child has the complete contents of the db as of that moment. And thanks to copy-on-write, fork does not copy a large amount of memory: only when a write occurs is the affected memory page copied, so the fork itself is not extremely time-consuming.

  The parent process, however, keeps accepting client commands, and its modifications to memory are not reflected in the child process, so modifications made while RDB persistence is in progress are not included in the rdb file.
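The snapshot behaviour of fork can be observed with a small POSIX program. This is only an illustration of the mechanism, not Redis code: cow_snapshot_demo is a hypothetical name, a single int stands in for the db, and a pipe is used so that the child looks at its copy only after the parent has already changed its own.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the child process saw, or -1 on error. */
int cow_snapshot_demo(void) {
    static int db_value;            /* stands in for the in-memory db */
    int to_child[2];
    db_value = 1;
    if (pipe(to_child) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: wait until the parent has modified its copy, then report
         * the value the child still sees through its exit status. */
        char go;
        close(to_child[1]);
        if (read(to_child[0], &go, 1) != 1) _exit(255);
        _exit(db_value);
    }

    /* Parent: keeps "serving writes" while the child holds the snapshot. */
    close(to_child[0]);
    db_value = 2;
    if (write(to_child[1], "g", 1) != 1) return -1;
    close(to_child[1]);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
    assert(db_value == 2);          /* the parent sees its own change */
    return WEXITSTATUS(status);
}
```

The function returns 1: the child still sees the value from the moment of the fork, even though the parent changed it to 2 before the child read it.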

Origin www.cnblogs.com/yang-zd/p/11627779.html