Redis persistence strategy: RDB and AOF

Redis persistence:

Redis provides two persistence methods that work at different levels: RDB and AOF.

RDB persistence can generate point-in-time snapshots of datasets at specified time intervals.

AOF persistence records every write command the server executes and restores the dataset by re-executing those commands when the server starts. Commands in the AOF file are saved in the Redis protocol format, and new commands are appended to the end of the file. Redis can also rewrite the AOF file in the background so that it does not grow beyond the size actually needed to hold the state of the dataset.

Redis can also use AOF persistence and RDB persistence at the same time. In that case, when Redis restarts it prefers the AOF file for restoring the dataset, because the AOF file usually holds a more complete dataset than the RDB file. You can even turn persistence off entirely, so that the data exists only while the server is running.
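To make the "Redis protocol format" used by the AOF file concrete, here is a minimal Python sketch of how a single write command could be serialized before being appended. The function name `encode_command` is made up for illustration; the framing itself (an array header `*N`, then each argument as a length-prefixed bulk string terminated by CRLF) is the standard Redis protocol layout.

```python
def encode_command(*args: str) -> bytes:
    """Serialize a command as a Redis-protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]  # array header: number of arguments
    for arg in args:
        data = arg.encode("utf-8")
        # each argument: "$<byte length>\r\n<bytes>\r\n"
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


print(encode_command("SET", "counter", "100"))
# b'*3\r\n$3\r\nSET\r\n$7\r\ncounter\r\n$3\r\n100\r\n'
```

Because every entry is self-delimiting, entries can simply be concatenated at the end of the file and read back one by one at startup.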

It is important to understand the similarities and differences between RDB persistence and AOF persistence. The following subsections will introduce these two persistence features in detail, and explain their similarities and differences.

Advantages of RDB:

An RDB file is a very compact file that holds a Redis dataset at a single point in time. This makes it well suited to backups: for example, you can keep an hourly RDB backup for the last 24 hours and a daily RDB backup for every day of the month. That way, if you run into problems, you can restore the dataset to any of several earlier versions.

RDB is well suited to disaster recovery: it is a single, compact file that can be sent (after encryption) to another data center or to Amazon S3.

RDB maximizes Redis performance: the only thing the parent process has to do when saving an RDB file is fork a child process, which then handles all of the remaining save work, so the parent never performs any disk I/O.

RDB is faster than AOF when restoring large datasets.
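The hourly and daily backup scheme mentioned above can be sketched as a small retention rule. This is only an illustration: the function name `backups_to_keep` is hypothetical, and "every day of the month" is assumed here to mean the last 30 days.

```python
from datetime import datetime, timedelta


def backups_to_keep(timestamps, now):
    """Return the snapshot times worth keeping under an hourly/daily scheme.

    Keeps every snapshot from the last 24 hours, plus the newest snapshot
    of each calendar day within the last 30 days (an assumed window).
    """
    keep = []
    days_covered = set()
    for ts in sorted(timestamps, reverse=True):  # newest first
        age = now - ts
        if age < timedelta(0) or age > timedelta(days=30):
            continue  # from the future, or too old to keep
        if age <= timedelta(hours=24) or ts.date() not in days_covered:
            keep.append(ts)
        days_covered.add(ts.date())
    return sorted(keep)
```

Any backup file whose timestamp is not in the returned list could then be deleted by whatever job manages the backup directory.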

Disadvantages of RDB:

If you need to minimize the risk of losing data when the server fails, RDB is not for you. Redis lets you configure save points to control how often the RDB file is written, but because each save has to write the state of the entire dataset, it is not a cheap operation. In practice you will probably save the RDB file no more often than once every 5 minutes, so in the event of an outage you could lose several minutes of data.

Every time the RDB file is saved, Redis calls fork() to create a child process that does the actual persistence work. When the dataset is large, fork() can be time-consuming and may stop the server from serving clients for some number of milliseconds; if the dataset is very large and CPU time is tight, this pause may even last a full second. AOF rewrites also require fork(), but you can make the interval between AOF rewrites as long as you like without any loss of data durability.

Disadvantages of AOF:

For the same dataset, an AOF file is usually larger than the corresponding RDB file. Depending on the fsync policy in use, AOF may be slower than RDB. In general, performance with fsync once per second is still very high, and with fsync turned off AOF can be as fast as RDB even under heavy load. However, under a huge write load RDB can give better guarantees about maximum latency.

AOF has had bugs in the past where, because of particular commands, reloading the AOF file did not restore the dataset exactly as it was when it was saved. (The blocking command BRPOPLPUSH, for example, once caused such a bug.) Tests have been added to the test suite for this situation: they automatically generate random, complex datasets and reload them to make sure everything is correct. Such bugs are rare in AOF, but by comparison they are almost impossible with RDB.

RDB and AOF, which one should I use?

In general, if you want a degree of data safety comparable to what PostgreSQL provides, you should use both persistence features. If you care a lot about your data but can still tolerate a few minutes of data loss, you can use RDB persistence alone. Many users rely on AOF persistence alone, but we do not recommend this: taking RDB snapshots regularly is very convenient for database backups, restoring a dataset from RDB is faster than restoring it from AOF, and using RDB also avoids the AOF bugs mentioned earlier. For these reasons, we may merge AOF and RDB into a single persistence model in the future. (This is a long-term plan.)

RDB snapshot:

By default, Redis saves database snapshots in a binary file named dump.rdb. You can configure Redis to save the dataset automatically whenever the condition "at least M changes to the dataset within N seconds" is met. You can also save the dataset manually by calling SAVE or BGSAVE. For example, the following setting makes Redis save the dataset automatically whenever "at least 1000 keys have been changed within 60 seconds":
save 60 1000
This persistence method is called snapshotting.
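The save-point rule can be expressed as a small predicate. The sketch below is a simplified reading of the rule, not Redis's actual code; the function and parameter names are invented for illustration.

```python
def should_save(save_points, changes_since_save, seconds_since_save):
    """Return True if any (seconds, changes) save point is satisfied."""
    return any(
        changes_since_save >= min_changes and seconds_since_save >= seconds
        for seconds, min_changes in save_points
    )


save_points = [(60, 1000)]  # mirrors the "save 60 1000" setting
print(should_save(save_points, changes_since_save=1500, seconds_since_save=60))   # True
print(should_save(save_points, changes_since_save=999, seconds_since_save=300))   # False
```

Several save points can be listed together, in which case a snapshot is triggered as soon as any one of them is satisfied.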

How snapshots work:

When Redis needs to save the dump.rdb file, the server does the following:
Redis calls fork(), so that there are now a parent process and a child process.
The child process writes the dataset to a temporary RDB file.
When the child process finishes writing the new RDB file, Redis replaces the old RDB file with the new one and deletes the old file.
This way of working lets Redis benefit from the copy-on-write mechanism.
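The snapshot steps above can be sketched in Python on a Unix-like system. This is an illustration under stated assumptions, not Redis's implementation: JSON stands in for the binary RDB format, the function name `snapshot` and the `temp-` file prefix are made up, and the parent simply waits for the child instead of continuing to serve clients.

```python
import json
import os
import tempfile


def snapshot(dataset: dict, path: str) -> None:
    """Write `dataset` to `path` from a forked child, swapping it in atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    pid = os.fork()
    if pid == 0:
        # Child: sees a copy-on-write view of the parent's memory as of fork().
        status = 1
        try:
            fd, tmp = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
            with os.fdopen(fd, "w") as f:
                json.dump(dataset, f)  # JSON is a stand-in for the RDB format
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)  # atomic rename over the old snapshot
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    # Parent: a real server would keep handling clients; here we just wait.
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0:
        raise RuntimeError("snapshot child process failed")
```

Writing to a temporary file first means that a reader of `path` always sees either the complete old snapshot or the complete new one, never a half-written file.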

Append-only file (AOF):

Snapshots are not very durable: if Redis goes down for any reason, the server loses the most recent writes that had not yet been saved to a snapshot. For some programs data durability is not the most important consideration, but for programs that need full durability the snapshot feature is not suitable.
Since version 1.1, Redis has offered a fully durable persistence method: AOF persistence.
You can turn on AOF by modifying the configuration file:
appendonly yes
From then on, whenever Redis executes a command that changes the dataset (e.g. SET), the command is appended to the end of the AOF file.
When Redis restarts, it can then rebuild the dataset by re-executing the commands in the AOF file.
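The append-then-replay idea can be shown with a toy log. This is a sketch only: the class `ToyAOF` and helper `apply_command` are hypothetical, each entry is stored as one JSON line rather than in the Redis protocol format, and only three commands are modeled.

```python
import json
import os


def apply_command(store: dict, command: list) -> None:
    """Apply one write command to an in-memory store."""
    name, key, *rest = command
    if name == "SET":
        store[key] = rest[0]
    elif name == "INCR":
        store[key] = str(int(store.get(key, "0")) + 1)
    elif name == "DEL":
        store.pop(key, None)
    else:
        raise ValueError(f"unsupported command: {name}")


class ToyAOF:
    """Append-only log: one JSON-encoded command per line."""

    def __init__(self, path: str):
        self.path = path

    def append(self, command: list) -> None:
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(command) + "\n")

    def replay(self) -> dict:
        """Rebuild the dataset by re-executing every logged command in order."""
        store: dict = {}
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as f:
                for line in f:
                    apply_command(store, json.loads(line))
        return store
```

After a restart, calling `replay()` on the same file yields the same dataset the commands produced the first time, which is the whole basis of AOF recovery.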

AOF rewrite:

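Because AOF works by continually appending commands to the end of the file, the AOF file grows larger and larger as more write commands are executed. For example, if you call INCR on a counter 100 times, the AOF file needs 100 entries just to hold the current value of that counter. In practice, however, a single SET command is enough to hold the counter's current value, and the other 99 entries are redundant. To handle this, Redis supports an interesting feature: it can rebuild the AOF file without interrupting service to clients. When you execute BGREWRITEAOF, Redis generates a new AOF file containing the minimum set of commands needed to rebuild the current dataset.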
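The counter example can be reproduced in a few lines. In this sketch the function name `rewrite` is invented and only SET, INCR and DEL are modeled. Note that Redis builds the new file from the dataset it holds in memory; the sketch replays a command list only as a convenient way to obtain that final state.

```python
def rewrite(commands):
    """Collapse a command log into one SET per surviving key."""
    store = {}
    for name, key, *rest in commands:
        if name == "SET":
            store[key] = rest[0]
        elif name == "INCR":
            store[key] = str(int(store.get(key, "0")) + 1)
        elif name == "DEL":
            store.pop(key, None)
    # The final state is all that matters: emit the shortest equivalent log.
    return [("SET", key, value) for key, value in store.items()]


log = [("INCR", "counter")] * 100
print(len(log), "->", rewrite(log))
# 100 -> [('SET', 'counter', '100')]
```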

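How durable is AOF?

You can configure how often Redis fsyncs data to disk.
There are three options:
fsync every time a new command is appended to the AOF file: very slow, and very safe.
fsync once per second: fast enough (about the same as using RDB persistence), and you lose only 1 second of data on failure.
Never fsync: leave the data to the operating system. The faster, and less safe, choice.
The recommended (and default) policy is to fsync once per second, which balances speed and safety.
The always-fsync policy is very slow in practice, even after the improvements made to the relevant code in Redis 2.0: calling fsync so often means this policy can never be fast.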
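The three fsync policies can be sketched as a small append-only writer. The policy names `always`, `everysec` and `no` match the values of the `appendfsync` setting in redis.conf; everything else (the class `AppendLog`, and checking the one-second timer only when a write arrives rather than in the background) is a simplifying assumption for illustration.

```python
import os
import time


class AppendLog:
    """Append-only file writer with a configurable fsync policy."""

    POLICIES = ("always", "everysec", "no")

    def __init__(self, path: str, policy: str = "everysec"):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown fsync policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_sync = time.monotonic()

    def append(self, record: bytes) -> None:
        self.file.write(record)
        self.file.flush()  # hand the bytes to the operating system
        now = time.monotonic()
        due = self.policy == "everysec" and now - self.last_sync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.file.fileno())  # force the data onto the disk
            self.last_sync = now
        # policy "no": the OS decides when the data reaches the disk

    def close(self) -> None:
        self.file.close()
```

With `always`, every append pays for a disk sync; with `everysec`, at most about one second of acknowledged writes is at risk; with `no`, the exposure depends entirely on the operating system's own flushing.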

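What should I do if the AOF file gets corrupted?

The server may crash while the AOF file is being written. If the crash leaves the AOF file corrupted, Redis refuses to load it on restart, so that the consistency of the data is not compromised.

When this happens, you can repair the corrupted AOF file as follows:

Make a backup copy of the existing AOF file.
Use the redis-check-aof program that ships with Redis to repair the original AOF file.
$ redis-check-aof --fix <filename>
(Optional) Use diff -u to compare the repaired AOF file with the backup of the original file and see what differs between the two.
Restart the Redis server and wait for it to load the repaired AOF file and restore the data.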
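
How AOF works:

AOF rewriting, like RDB snapshot creation, makes clever use of the copy-on-write mechanism.

The steps of an AOF rewrite are:

Redis calls fork(), so that there are now a parent process and a child process.
The child process starts writing the contents of the new AOF file to a temporary file.
For every newly executed write command, the parent process accumulates it in an in-memory buffer and also appends it to the end of the existing AOF file, so that the existing AOF file remains safe even if a crash occurs in the middle of the rewrite.
When the child process finishes the rewrite, it sends a signal to the parent process; on receiving the signal, the parent appends everything in the in-memory buffer to the end of the new AOF file.
Redis then atomically replaces the old file with the new one, and from then on all commands are appended directly to the end of the new AOF file.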
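The bookkeeping in the rewrite steps above (keep appending to the old file, buffer new writes, append the buffer to the new file, swap atomically) can be simulated in a single process. This is a toy model under clear assumptions: the class `RewritableLog` is hypothetical, a dictionary copy stands in for the child's fork()-time view of memory, only SET is supported, and keys and values must not contain spaces or newlines.

```python
import os


class RewritableLog:
    """Toy append-only log that can be compacted while writes continue."""

    def __init__(self, path: str):
        self.path = path
        self.store = {}
        self.rewrite_buffer = None  # a list only while a rewrite is running

    def set(self, key: str, value: str) -> None:
        self.store[key] = value
        line = f"SET {key} {value}\n"
        with open(self.path, "a") as f:  # the existing file stays complete
            f.write(line)
        if self.rewrite_buffer is not None:
            self.rewrite_buffer.append(line)  # also remember it for the new file

    def begin_rewrite(self) -> dict:
        """Stand-in for fork(): freeze the dataset the 'child' will write."""
        self.rewrite_buffer = []
        return dict(self.store)

    def finish_rewrite(self, frozen: dict) -> None:
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            for key, value in frozen.items():  # the child's compact output
                f.write(f"SET {key} {value}\n")
            f.writelines(self.rewrite_buffer)  # writes made during the rewrite
        os.replace(tmp, self.path)  # atomic swap of old file for new
        self.rewrite_buffer = None
```

A crash before `os.replace` leaves the original file untouched and still containing every command, which is the safety property the steps describe.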

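Switching from RDB persistence to AOF persistence:

Make a backup of your latest dump.rdb file.
Move the backup to a safe place.
Run the following two commands:
redis-cli> CONFIG SET appendonly yes
redis-cli> CONFIG SET save ""
Make sure the number of keys in the database has not changed after the commands run.
Make sure write commands are being appended correctly to the end of the AOF file.
The first of the two commands turns on AOF: Redis blocks until the initial AOF file has been created, then resumes processing command requests and starts appending write commands to the end of the AOF file.
The second command turns off RDB. This step is optional: if you prefer, you can use both RDB and AOF persistence at the same time.
Don't forget to turn on AOF in redis.conf! Otherwise, after the server restarts, the settings made with CONFIG SET will be lost and the server will start with its original configuration.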
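The two CONFIG SET commands and the key-count check can be scripted. The sketch below assumes a client object that exposes `dbsize()` and `config_set(name, value)`, as the `redis.Redis` class in the redis-py library does; the function name `switch_to_aof` is made up, and the key-count comparison is only meaningful if no other client adds or removes keys in the meantime.

```python
def switch_to_aof(client, disable_rdb: bool = True) -> int:
    """Turn on AOF at runtime and verify that the key count is unchanged.

    `client` is assumed to provide dbsize() and config_set(name, value).
    Returns the number of keys in the database.
    """
    keys_before = client.dbsize()
    client.config_set("appendonly", "yes")  # start writing the AOF file
    if disable_rdb:
        client.config_set("save", "")  # optional: clear the RDB save points
    keys_after = client.dbsize()
    if keys_after != keys_before:
        raise RuntimeError(
            f"key count changed from {keys_before} to {keys_after}"
        )
    return keys_after
```

With redis-py installed, this could be called as `switch_to_aof(redis.Redis())`. The script does not replace editing redis.conf: the runtime settings are lost on the next restart.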

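Interaction between RDB and AOF:

In Redis 2.4 and later, BGREWRITEAOF cannot run while BGSAVE is in progress, and conversely BGSAVE cannot run while BGREWRITEAOF is in progress.
This prevents two Redis background processes from doing heavy disk I/O at the same time.
If BGSAVE is running and the user explicitly calls BGREWRITEAOF, the server replies with an OK status and tells the user that BGREWRITEAOF has been scheduled: it will start as soon as BGSAVE finishes.

When Redis starts with both RDB persistence and AOF persistence enabled, it prefers the AOF file for restoring the dataset, because the AOF file usually holds the most complete data.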
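The mutual exclusion between the two background jobs can be modeled as a tiny state machine. This is an illustrative sketch: the class `BackgroundJobs` and the return values `"started"`, `"scheduled"` and `"refused"` are invented labels, not the replies Redis actually sends.

```python
class BackgroundJobs:
    """Allow at most one of BGSAVE / BGREWRITEAOF to run at a time."""

    def __init__(self):
        self.running = None  # None, "BGSAVE" or "BGREWRITEAOF"
        self.rewrite_scheduled = False

    def bgsave(self) -> str:
        if self.running is not None:
            return "refused"  # another background job is already running
        self.running = "BGSAVE"
        return "started"

    def bgrewriteaof(self) -> str:
        if self.running == "BGSAVE":
            self.rewrite_scheduled = True  # run once the save has finished
            return "scheduled"
        if self.running == "BGREWRITEAOF":
            return "refused"
        self.running = "BGREWRITEAOF"
        return "started"

    def job_finished(self) -> None:
        """Called when the current background job completes."""
        self.running = None
        if self.rewrite_scheduled:
            self.rewrite_scheduled = False
            self.running = "BGREWRITEAOF"  # start the deferred rewrite
```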

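Backing up Redis data:

Redis is very friendly to data backups, because you can copy the RDB file while the server is running: once an RDB file has been created, it is never modified. When the server needs to create a new RDB file, it first writes the contents to a temporary file, and only when the temporary file is complete does it use rename(2) to atomically replace the old RDB file with it. This means that copying the RDB file is completely safe at any time.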
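Since the RDB file can be copied at any time, a backup job can be as simple as a timestamped file copy. The sketch below is one possible shape for such a job; the function name `backup_rdb` and the `dump-YYYYMMDD-HHMMSS.rdb` naming pattern are assumptions made for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_rdb(rdb_path, backup_dir, now=None) -> Path:
    """Copy the RDB file into backup_dir under a timestamped name."""
    now = now or datetime.now()
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / f"dump-{now:%Y%m%d-%H%M%S}.rdb"
    shutil.copy2(rdb_path, dest)  # copies the contents and file metadata
    return dest
```

Run from a scheduler once an hour, this produces the kind of backup history from which older copies can later be pruned or shipped off-site.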

Reposted from: http://my.oschina.net/davehe/blog/174662
