Demystifying Redis persistence [translation]

The author of Redis saw many serious misconceptions about Redis persistence on the forums, so he wrote this article to discuss how persistence works.

Write process

First, let's look at what a database actually does when it performs a write. There are five steps:

  1. The client sends a write request to the server (the data is in the client's memory).
  2. The database server receives the write request (the data is in the server's memory).
  3. The server calls the write(2) system call to write the data to disk (the data is in the kernel's buffer).
  4. The operating system transfers the data from the buffer to the disk controller (the data is in the disk cache).
  5. The disk controller writes the data to the physical medium (the data is actually on disk).

Those are roughly the five steps of a write. With them in mind, let's look at failures at different levels.

  • When the database process fails but the kernel is still fine, the data is safe as long as step 3 has completed, because the operating system will carry out the remaining steps and make sure the data eventually reaches the disk.
  • When the system loses power, all of the caches mentioned in the five steps are lost, and both the database and the operating system stop working. Data is therefore guaranteed to survive a power failure only after step 5 has completed; data that has only reached one of the first four steps will be lost.

Having understood these five steps, we may want answers to the following questions:

  • How often does the database call write(2) to write data to the kernel buffer?
  • How often does the kernel write the data in its buffer to the disk controller?
  • When does the disk controller write the data in its cache to the physical medium?

The first question is usually fully under the database's control. For the second, the operating system has its default policy, but we can also use the fsync family of calls provided by the POSIX API to force the operating system to write data from the kernel buffer to the disk controller. The third seems to be beyond the database's reach, but in practice the disk write cache is disabled in most cases, or the cache is enabled only for reads, so that writes are not cached and go directly to the disk. The recommended practice is to enable the write cache only when the disk device has battery backup.
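Steps 3 and 4 can be sketched in Python as follows. This is only an illustration: the helper name durable_append and the demo file are hypothetical, not part of Redis. os.write hands the bytes to the kernel buffer, and os.fsync asks the kernel to push them to the disk; whether the disk controller then persists its own cache is outside the program's control.

```python
import os
import tempfile

def durable_append(path: str, payload: bytes) -> int:
    """Append payload to path, then ask the kernel to flush it to the device.

    Returns the number of bytes written. Hypothetical helper for illustration.
    """
    # O_APPEND makes every write land at the end of the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # Step 3: write(2) copies the data into the kernel buffer.
        written = 0
        while written < len(payload):
            written += os.write(fd, payload[written:])
        # Step 4: fsync(2) asks the kernel to push the buffer to the disk.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)

# Hypothetical demo file inside a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.log")
durable_append(demo_path, b"first record\n")
durable_append(demo_path, b"second record\n")
```

If the process crashes after os.write returns but before os.fsync, the data still survives as long as the kernel keeps running, which is the first failure case described above.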

Data corruption means that the data cannot be recovered. Everything above was about making sure that data is actually written to disk, but being written to disk does not mean the data cannot be damaged. For example, a single write request may result in two separate write operations; if a failure occurs, one of them may complete safely while the other never happens. If the database's data file structure is poorly organized, this can leave the data completely unrecoverable.

There are usually three strategies for organizing data so that a damaged data file does not become unrecoverable:

  1. The crudest approach organizes the data in a way that does not guarantee recoverability at all. Instead, after a data file is damaged, the data is restored from a replica kept up to date through synchronization. MongoDB with journaling turned off and Replica Sets configured works this way.
  2. Another approach adds an operation log on top of the first one and records what each operation did, so that data can be recovered by replaying the log. Because the operation log is written sequentially in append-only fashion, the log itself cannot end up in an unrecoverable state. This is similar to MongoDB with journaling turned on.
  3. A safer approach is for the database never to modify old data and to complete every write by appending, so that the data itself is a log and can never end up unrecoverable. CouchDB is a good example of this approach.

RDB Snapshot

Let's first look at one of Redis's persistence strategies, the RDB snapshot. Redis can save a snapshot of its current data to a data file as a persistence mechanism. But how does a database that is continuously being written to produce a snapshot? Redis relies on the copy-on-write mechanism of the fork system call. When generating a snapshot, the current process forks a child process, and the child loops through all of the data and writes it out as an RDB file.

We can configure when RDB snapshots are generated with Redis's save directive. For example, you can take a snapshot after 10 minutes if at least 100 writes have occurred, or after 1 hour if at least 1000 writes have occurred, and several rules can be in effect together. These rules are defined in the Redis configuration file; you can also set them while Redis is running with the CONFIG SET command, without restarting Redis.
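One way to picture how several save rules combine is the small Python sketch below. It is not taken from the Redis source: the function name should_snapshot and the exact comparisons are assumptions made for illustration. Each rule is a (seconds, changes) pair, and a snapshot is due as soon as any one rule is satisfied.

```python
from typing import List, Tuple

# Each rule mirrors one "save <seconds> <changes>" line in the configuration.
SaveRule = Tuple[int, int]

def should_snapshot(rules: List[SaveRule], elapsed: int, dirty: int) -> bool:
    """Return True if any rule is satisfied.

    elapsed: seconds since the last snapshot.
    dirty:   number of writes since the last snapshot.
    Hypothetical helper; the comparison operators are an assumption.
    """
    return any(elapsed >= seconds and dirty >= changes
               for seconds, changes in rules)

# The two example rules from the text: 100 writes / 10 minutes, 1000 writes / 1 hour.
rules = [(600, 100), (3600, 1000)]
```

With these rules, 150 writes after 700 seconds trigger a snapshot, while 50 writes after the same 700 seconds do not.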

Redis's RDB file never gets corrupted, because the write is performed in a new process. When generating a new RDB file, the child process first writes the data to a temporary file and then uses the atomic rename system call to rename the temporary file to the RDB file. So no matter when a failure occurs, the RDB file is always usable.
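The temporary-file-plus-rename pattern can be sketched in Python like this. It is a minimal illustration of the technique, not Redis's actual code; the function name save_snapshot and the file names are hypothetical. os.replace maps to the rename system call on POSIX systems, so readers see either the complete old file or the complete new one.

```python
import os
import tempfile

def save_snapshot(path: str, data: bytes) -> None:
    """Write data to a temporary file, then atomically rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the new content is on disk first
        os.replace(tmp_path, path)  # atomic switch from old file to new file
    except BaseException:
        # On failure, discard the partial temporary file; path is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

# Hypothetical demo: two successive snapshots of the same file.
demo_dir = tempfile.mkdtemp()
rdb_path = os.path.join(demo_dir, "dump.rdb")
save_snapshot(rdb_path, b"snapshot v1")
save_snapshot(rdb_path, b"snapshot v2")
```

If the process dies before os.replace, the previous file is still in place and fully intact, which is the property the paragraph above relies on.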

The RDB file is also one part of how Redis implements master-slave synchronization internally.

However, RDB clearly has a shortcoming: once the database fails, the data saved in the RDB file is not up to date, and everything written between the last RDB file generation and the moment Redis went down is lost. For some workloads this is tolerable, and we recommend RDB persistence for them, because the cost of enabling RDB is low. But for applications with high data safety requirements that cannot tolerate data loss, RDB is not enough, so Redis introduced another important persistence mechanism: the AOF log.

AOF log

AOF stands for append only file. As the name suggests, it is a log file written by appending. Unlike the binlog of a typical database, the AOF file is readable plain text, and its content is simply a sequence of standard Redis commands. For example, let's run the following experiment with Redis 2.6, enabling AOF through a startup parameter:

./redis-server --appendonly yes

We then execute the following commands:

redis 127.0.0.1:6379> set key1 Hello
OK
redis 127.0.0.1:6379> append key1 " World!"
(integer) 12
redis 127.0.0.1:6379> del key1
(integer) 1
redis 127.0.0.1:6379> del non_existing_key
(integer) 0

Then we look at the AOF log file and get the following:

$ cat appendonly.aof
*2
$6
SELECT
$1
0
*3
$3
set
$4
key1
$5
Hello
*3
$6
append
$4
key1
$7
 World!
*2
$3
del
$4
key1

As you can see, each write operation produces a corresponding command in the log. Note that the last del command was not recorded in the AOF log, because Redis determined that it makes no change to the current data set, so there is no need to record a useless write command. In addition, the AOF log is not generated strictly from the client's requests. For example, the INCRBYFLOAT command is recorded in the AOF log as a SET, because floating point arithmetic may behave differently on different systems. To avoid the same log producing different data sets on different systems, only the result of the operation is recorded, using SET.
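The listing above follows the Redis protocol framing: a line starting with * gives the number of arguments, and each argument is preceded by a line starting with $ that gives its length in bytes. A small Python sketch of that encoding is shown below; the function name encode_command is hypothetical, and lines end with CRLF even though cat does not show it.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command in the multi-bulk format seen in the AOF listing."""
    # "*<number of arguments>" header.
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        raw = arg.encode("utf-8")
        # "$<byte length>" followed by the argument itself.
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)

# The three recorded writes from the experiment above.
log = (
    encode_command("set", "key1", "Hello")
    + encode_command("append", "key1", " World!")
    + encode_command("del", "key1")
)
```

Note how the argument " World!" is seven bytes long because of its leading space, which matches the $7 line in the listing.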

AOF rewrite

You may be wondering: if every write command produces a log entry, won't the AOF file become very large? The answer is yes, the AOF file keeps growing, so Redis provides a feature called AOF rewrite. It regenerates the AOF file so that the new file records each piece of data only once, unlike the old file, which may record many operations on the same value. The process is similar to RDB generation: Redis forks a child process that walks through the data directly and writes a new temporary AOF file. While the new file is being written, all write operations are still logged to the original AOF file and are also recorded in a memory buffer. When the rewrite completes, the buffered log entries are written to the temporary file in one go. Redis then calls the atomic rename to replace the old AOF file with the new one.
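The effect of a rewrite can be illustrated with a toy Python model. Note the assumption: Redis builds the new file from the data in memory in a forked child, whereas this sketch obtains the same final state by replaying a short command list, purely so that the example is self-contained. The function name rewrite_log is hypothetical, and only set, append and del are handled.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, ...]

def rewrite_log(commands: List[Command]) -> List[Command]:
    """Return a minimal log that yields the same final data set."""
    state: Dict[str, str] = {}
    for cmd in commands:
        name = cmd[0].lower()
        if name == "set":
            state[cmd[1]] = cmd[2]
        elif name == "append":
            state[cmd[1]] = state.get(cmd[1], "") + cmd[2]
        elif name == "del":
            for key in cmd[1:]:
                state.pop(key, None)
    # One SET per surviving key, regardless of how many times it was modified.
    return [("set", key, value) for key, value in state.items()]

old_log = [
    ("set", "key1", "Hello"),
    ("append", "key1", " World!"),
    ("set", "key2", "temporary"),
    ("del", "key2"),
]
new_log = rewrite_log(old_log)
```

Here four logged operations shrink to a single SET for key1, and key2 disappears from the log entirely because it no longer exists.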

As the process above shows, both RDB and AOF operations are sequential I/O, so their performance is high. Likewise, when the database is restored from an RDB file or an AOF log, the data is read sequentially and loaded into memory, so no random disk reads occur.

AOF reliability settings

AOF is a file write operation whose purpose is to get the operation log onto disk, so it goes through the same five write steps described above. So how safe are AOF writes? This is configurable: after Redis calls write(2) on the AOF, the appendfsync option controls when fsync is called to push the data to disk. The three appendfsync settings below offer progressively stronger safety.

appendfsync no

When appendfsync is set to no, Redis never calls fsync on its own to sync the AOF log to disk, so everything depends on the operating system's scheduling. On most Linux systems, the kernel flushes the buffered data to disk about every 30 seconds.

appendfsync everysec

When appendfsync is set to everysec, Redis by default calls fsync once per second to write the buffered data to disk. If an fsync call takes longer than one second, however, Redis delays the next fsync and waits another second. The fsync then happens after about two seconds, and this time it is carried out no matter how long it takes. Because the file descriptor is blocked while that fsync runs, the current write operation blocks as well. The conclusion is that in most cases Redis performs an fsync every second, and in the worst case every two seconds.

Most database systems call this group commit: several writes are combined and their log entries are flushed to disk in one go.

appendfsync always

When appendfsync is set to always, fsync is called once for every write operation. Data is safest in this mode, but because every write performs an fsync, performance suffers.
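The difference between the three settings can be sketched with a simplified Python model. This is an assumption-laden illustration, not how Redis is implemented: the class AppendLog and the helper count_syncs are hypothetical, the everysec check is done inline on each write rather than in the background, and the delayed-fsync behavior described above is not modeled.

```python
import os
import tempfile
import time

class AppendLog:
    """Toy append-only log with the three fsync policies."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("no", "everysec", "always"):
            raise ValueError("unknown policy: %r" % policy)
        self.policy = policy
        self.clock = clock
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_sync = clock()
        self.sync_count = 0

    def append(self, record: bytes) -> None:
        os.write(self.fd, record)  # data reaches the kernel buffer
        now = self.clock()
        due = self.policy == "everysec" and now - self.last_sync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)      # data is pushed towards the disk
            self.last_sync = now
            self.sync_count += 1

    def close(self) -> None:
        os.close(self.fd)

def count_syncs(policy: str, write_times) -> int:
    """Replay writes at the given (fake) timestamps and count fsync calls."""
    now = [0.0]
    path = os.path.join(tempfile.mkdtemp(), "demo.aof")
    log = AppendLog(path, policy, clock=lambda: now[0])
    try:
        for t in write_times:
            now[0] = t
            log.append(b"x\n")
        return log.sync_count
    finally:
        log.close()
```

Under this model, five writes spread over 2.5 seconds cost five fsync calls with always, two with everysec, and none with no, which is the trade-off between safety and performance that the three settings expose.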

What changes with pipelining

With pipelining, a client sends N commands at once and then waits for the results of all N commands to come back together. Using pipelining means giving up per-command confirmation of the return value. Because the N commands are executed within the same execution pass, there may be some deviation when appendfsync is set to everysec, since executing the N commands may take longer than 1 or 2 seconds. But the maximum is guaranteed not to exceed the total execution time of the N commands.

Comparison with PostgreSQL and MySQL

There is not much more to say here, because data safety at the operating system level has already been covered at length above, and different databases are in fact implemented in much the same way. In short, the conclusion is that with AOF enabled, Redis's data safety on a single machine is no weaker than that of mature SQL databases.

What is all this persistence for? For restoring data after a restart, of course. Redis is an in-memory database, and RDB and AOF are both just measures to ensure that its data can be recovered. So when Redis recovers with RDB or AOF, it reads the RDB or AOF file and reloads it into memory. Compared with MySQL and similar databases, this startup takes much longer, because MySQL does not need to load its data into memory.

On the other hand, once MySQL has started serving requests, the hot data it accesses is loaded into memory gradually. We usually call this warming up, and until warm-up completes, performance is not very high. Redis's advantage is that it loads the data into memory in one go and so warms up all at once. As soon as Redis has finished starting, it serves requests at full speed.

Startup time also differs between RDB and AOF. Starting from RDB is faster, for two reasons. First, an RDB file holds only one record per piece of data, unlike the AOF log, which may hold several operation records for the same data, so each piece of data only needs to be written once. Second, the storage format of the RDB file matches the encoding Redis uses for data in memory, so no re-encoding is needed and CPU consumption is far lower than when loading an AOF log.

 


Origin www.cnblogs.com/supperhan/p/11456885.html