An in-depth look at Redis persistence: methods and principles

Redis offers two persistence methods: RDB, which is based on snapshots, and AOF, which is based on a log. Each has its own advantages and disadvantages. This article introduces both, and hopefully after reading it you will have a more complete and clear understanding of the two.

RDB snapshot persistence

Let's start with RDB snapshots. RDB is the persistence method Redis enables by default, so there is nothing to switch on. Here is the configuration related to RDB:

################################ SNAPSHOTTING  ################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#   save ""
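# Trigger rules for automatic snapshots: the first number is a time in seconds,
# the second is the number of changes. For example, 10000 changes in 60 seconds
# trigger a snapshot automatically.
save 900 1
save 300 10
save 60 10000

# Whether the main thread stops accepting writes when snapshot generation fails
stop-writes-on-bgsave-error yes

# Whether to compress the stored data
rdbcompression yes

# Whether to verify the RDB file when restoring data
rdbchecksum yes

# The filename where to dump the DB
# Name of the file generated by an RDB snapshot
dbfilename dump.rdb

# Directory where the snapshot is written; the AOF file is stored here as well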
dir .

There is not much RDB configuration, and even less that we need to adjust: usually only the snapshot trigger rules and the file storage path, depending on business volume.

RDB persistence can be triggered in two ways: manually and automatically. Manual triggering uses one of the following two commands:

  • save: blocks the Redis server from responding to other commands until the RDB snapshot is complete. On an instance with a lot of memory this can block for a long time, so it is not recommended in production.

  • bgsave: the Redis main process forks a child process, which is responsible for generating the RDB snapshot and exits automatically when it finishes. bgsave only blocks during the fork itself, which is usually brief, so this is the recommended command for manual triggering.

Besides the manual commands, Redis also has internal mechanisms that trigger RDB persistence automatically. This happens in the following cases:

  • The save rules in the configuration. A rule such as save 60 10000 above has the general form "save m n", meaning that if the data set is modified at least n times within m seconds, bgsave is triggered automatically.

  • With master-replica replication, when a replica performs a full resynchronization, the master automatically runs bgsave to generate an RDB file and sends it to the replica.

  • Executing the debug reload command to reload Redis automatically triggers a save operation.

  • When the shutdown command is executed and AOF persistence is not enabled, Redis automatically saves an RDB snapshot before exiting (a blocking save, provided save rules are configured).
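The "save m n" rules can be sketched as a small predicate. This is a minimal illustration, not Redis's actual code: the function name and its parameters are hypothetical, and it follows the wording of the redis.conf comment (both the number of seconds and the number of changes must have occurred since the last snapshot).

```python
# (seconds, changes) pairs mirroring "save 900 1", "save 300 10", "save 60 10000".
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_trigger_bgsave(dirty, last_save_time, now, rules=SAVE_RULES):
    """Return True if any "save m n" rule is satisfied.

    dirty          -- number of changes since the last successful snapshot
    last_save_time -- timestamp (seconds) of the last successful snapshot
    now            -- current timestamp (seconds)
    """
    elapsed = now - last_save_time
    # A rule fires when enough time has passed AND enough keys have changed.
    return any(dirty >= changes and elapsed >= seconds
               for seconds, changes in rules)
```

With the default rules, a single change is enough once 15 minutes have passed, while 10000 changes are needed to snapshot after only one minute.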

That covers how RDB persistence is triggered. The save command is rarely used; in most cases bgsave is used instead, so let's look at how bgsave works behind the scenes, starting with the flow chart:

bgsave operational flow chart

The bgsave command goes through roughly the following steps:

  • 1. When bgsave is executed, the Redis main process checks whether an RDB/AOF child process is already running. If so, bgsave returns immediately without doing anything.
  • 2. The parent process forks a child process. The parent is blocked during the fork and cannot accept other commands until the fork completes.
  • 3. The child process writes a temporary snapshot file from the parent's in-memory data, replaces the original RDB file with the new one when it is done, and notifies the parent that the snapshot is complete.
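The three steps above can be sketched in Python on a Unix-like system. This is only an illustration under stated assumptions: the class SnapshotSketch and its methods are hypothetical names, JSON stands in for the real binary RDB format, and the parent learns about completion by waiting on the child rather than through Redis's own notification mechanism.

```python
import json
import os


class SnapshotSketch:
    """Toy model of the bgsave flow (hypothetical, not Redis code)."""

    def __init__(self, rdb_path):
        self.rdb_path = rdb_path
        self.child_pid = None  # pid of the running snapshot child, if any

    def bgsave(self, data):
        # Step 1: refuse to start if a snapshot child is already running.
        if self.child_pid is not None:
            return False
        # Step 2: fork; the child sees the parent's memory as of this moment.
        pid = os.fork()
        if pid == 0:
            status = 1
            try:
                # Step 3: write a temp file, then replace the old file atomically.
                tmp_path = f"{self.rdb_path}.tmp-{os.getpid()}"
                with open(tmp_path, "w", encoding="utf-8") as f:
                    json.dump(data, f)
                    f.flush()
                    os.fsync(f.fileno())
                os.replace(tmp_path, self.rdb_path)
                status = 0
            finally:
                os._exit(status)  # the child must never run the parent's code
        self.child_pid = pid
        return True

    def wait_for_child(self):
        """Block until the child exits; return True if it reported success."""
        _, status = os.waitpid(self.child_pid, 0)
        self.child_pid = None
        return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

Because the child works on its own copy of the data taken at fork time, writes the parent makes afterwards do not appear in the snapshot.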

That is what happens behind bgsave. To wrap up RDB, let's summarize its advantages and disadvantages. The advantages of RDB:

  • An RDB file is a snapshot of a Redis node's in-memory data at a point in time. It is well suited for backups, or for uploading to a remote server or file system for disaster recovery.
  • Restoring data from RDB is much faster than from AOF.

Where there are advantages there are also disadvantages. The disadvantages of RDB:

  • RDB cannot provide real-time or per-second persistence. As we have seen, every bgsave has to fork a child process, which is a heavyweight operation and too costly to run frequently.
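  • RDB files are saved in a specific binary format, and several RDB format versions exist across Redis releases, so an older Redis server may be unable to read an RDB file written in a newer format.

If our data requirements are strict and we cannot afford to lose even a second of data, RDB persistence clearly cannot meet them. Does Redis have a way to satisfy this? It does: AOF persistence, which we turn to next.

AOF persistence

Redis does not enable AOF persistence by default; we have to turn it on ourselves by changing appendonly no to appendonly yes in the redis.conf configuration file. Unlike RDB, AOF persists data by recording the commands that were executed. Here is what the AOF persistence file appendonly.aof looks like: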

*2
$6
SELECT
$1
0
*3
$3
set
$6
mykey1
$6
你好
*3
$3
set
$4
key2
$5
hello
*1
$8
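The excerpt above uses the Redis serialization protocol (RESP): a line such as *3 announces a command with three arguments, and each $n gives the length in bytes of the argument that follows. A minimal sketch of that encoding, assuming a hypothetical helper name encode_command and CRLF line endings as RESP specifies:

```python
def encode_command(*args):
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        raw = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # "$<n>" is the length in bytes, not in characters.
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)
```

This is why the value 你好 is recorded with $6 in the excerpt: it is two characters but six bytes in UTF-8.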

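The AOF file looks roughly like the excerpt above; you can inspect appendonly.aof on your own Redis server for the details. Because it is plain text, we can also edit values directly in appendonly.aof (keeping the $ length prefixes consistent), and Redis will load the modified values the next time it restarts. These look like simple commands, but a lot happens between a command being executed and it landing in appendonly.aof. Below is the AOF persistence flow chart:

AOF persistence flow chart

Two operations are critical in AOF persistence: appending commands to the aof_buf buffer, and syncing the aof_buf buffer to the AOF file. Let's go through both in detail:

1. Why are commands written to the aof_buf buffer instead of directly to the AOF file?

Redis handles commands on a single thread. If every write command were appended straight to the AOF file on disk, the frequent I/O would make Redis's performance depend entirely on the machine's hardware. To keep responses fast, Redis adds an aof_buf buffer layer and relies on the operating system's caching, which improves performance. That solves the performance problem but raises another one: how does the data in aof_buf get synced to the AOF file, and who does it? That brings us to the next operation: fsync.

2. How is the aof_buf buffer synced to the AOF file?

Flushing the buffered data to the AOF file on disk is left to the Linux kernel. Because the kernel's write-back interval is relatively long, a system crash means everything written within that interval is lost, which is not what we want. Linux therefore provides the fsync system call, which forces a single file (here, the AOF file) to be synced to disk and blocks until the write has completed, guaranteeing that the data is persisted. Building on this, Redis exposes the appendfsync option in redis.conf so that we can decide when to sync to disk. It has the following three values: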

# appendfsync always
appendfsync everysec
# appendfsync no
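  • always: syncs the buffer to disk on every write command. This guarantees that no data is lost, but it drastically reduces Redis's throughput, which can drop to only a few hundred TPS. That goes against what Redis is designed for, so this mode is not recommended.
  • everysec: the default policy. Data is synced once per second, which has little effect on throughput in practice, and in the worst case we lose only about 1 second of data. This is the recommended policy because it balances performance and data safety.
  • no: Redis does nothing itself and leaves syncing the buffer to the AOF file to the operating system. The operating system's sync interval is not fixed and can be as long as 30 seconds, so a failure can lose considerably more data.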
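The three policies differ only in when fsync is called after the buffer has been written. The following sketch makes that visible; AofWriterSketch and its fields are hypothetical names, the clock is passed in explicitly so the behaviour is easy to follow, and the code assumes os.write writes the whole buffer in one call.

```python
import os


class AofWriterSketch:
    """Toy AOF writer illustrating appendfsync (hypothetical, not Redis code)."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown appendfsync policy: {policy}")
        self.policy = policy
        self.aof_buf = []        # encoded commands waiting to be written
        self.fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
        self.unsynced = False    # True if written data has not been fsynced yet
        self.last_fsync = 0.0
        self.fsync_count = 0     # kept only so the effect can be observed

    def append(self, encoded_command):
        self.aof_buf.append(encoded_command)

    def flush(self, now):
        """Write the buffer to the file, then fsync according to the policy."""
        if self.aof_buf:
            os.write(self.fd, b"".join(self.aof_buf))
            self.aof_buf.clear()
            self.unsynced = True
        if not self.unsynced:
            return
        if self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        ):
            os.fsync(self.fd)
            self.unsynced = False
            self.last_fsync = now
            self.fsync_count += 1
        # policy "no": the operating system decides when to write back

    def close(self):
        os.close(self.fd)
```

Calling flush after every command, "always" performs one fsync per command, "everysec" at most one per second, and "no" none at all.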

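Those are the three disk sync policies. But you may have noticed a problem: the AOF file is append-only, so it keeps growing as the server runs. An oversized AOF file affects the Redis server and even the host, and loading it takes too long when Redis restarts. How does Redis solve this? It introduces a rewrite mechanism to keep the AOF file from growing too large.

3. How does Redis rewrite the AOF file?

An AOF rewrite converts the data inside the Redis process into write commands and writes them to a new AOF file. The rewritten file is smaller than the old one for the following reasons:

  • Data that has already expired in the process is no longer written to the file.
  • The old AOF file contains commands that no longer matter, such as del key1, hdel key2, srem keys, set a 111, set a 222. The rewrite is generated directly from the in-process data, so the new AOF file keeps only the write commands for the final data.
  • Multiple write commands can be merged into one; for example lpush list a, lpush list b, lpush list c can become lpush list a b c. To keep a single command from growing large enough to overflow the client buffer, operations on list, set, hash, zset and similar types are split into multiple commands at a boundary of 64 elements.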
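A rewrite can be pictured as generating commands from the current data rather than replaying history. The sketch below is a simplified illustration: rewrite_commands and its arguments are hypothetical, only string and list values are handled, and it emits RPUSH so that the stored order is reproduced (a choice made for this example).

```python
CHUNK = 64  # elements per command, following the 64-element limit described above


def rewrite_commands(db, expires, now):
    """Build a compact command list that recreates `db`.

    db      -- dict mapping key -> str (string value) or list (list value)
    expires -- dict mapping key -> absolute expiry timestamp in seconds
    now     -- current timestamp in seconds
    """
    commands = []
    for key, value in db.items():
        expire_at = expires.get(key)
        if expire_at is not None and expire_at <= now:
            continue  # already expired: nothing is written for this key
        if isinstance(value, list):
            # Merge elements into as few commands as the chunk size allows.
            for i in range(0, len(value), CHUNK):
                commands.append(["RPUSH", key, *value[i:i + CHUNK]])
        else:
            commands.append(["SET", key, value])
        if expire_at is not None:
            # Keep the remaining time to live, as a millisecond timestamp.
            commands.append(["PEXPIREAT", key, str(int(expire_at * 1000))])
    return commands
```

However many times a key was overwritten or deleted in the old file, it contributes at most a handful of commands to the new one.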

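The rewritten AOF file is smaller, which saves disk space and, more importantly, shortens the load time when Redis restores data. Like RDB persistence, AOF rewriting can be triggered manually or automatically. To trigger it manually, simply call the bgrewriteaof command, which we will look at in more detail below. Automatic triggering is controlled by the following settings in redis.conf: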

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
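  • auto-aof-rewrite-percentage: how much the current AOF file size (aof_current_size) has grown relative to the AOF file size after the last rewrite (aof_base_size), as a percentage. The default is 100, meaning the file has grown by 100%, that is, to twice the size it had after the last rewrite.
  • auto-aof-rewrite-min-size: the minimum AOF file size for a rewrite to run. The default is 64MB, so a rewrite can only be triggered once the AOF file has reached 64MB.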
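The two settings combine into a single check. The following is a minimal sketch of that condition under the interpretation given above; the function name and parameters are hypothetical, and the exact boundary handling in Redis may differ.

```python
def should_auto_rewrite(aof_current_size, aof_base_size,
                        percentage=100, min_size=64 * 1024 * 1024):
    """Return True if an automatic AOF rewrite should start (sizes in bytes)."""
    if percentage == 0:
        return False  # a percentage of zero disables automatic rewrites
    if aof_current_size <= min_size:
        return False  # the file is still too small to be worth rewriting
    base = aof_base_size if aof_base_size > 0 else 1  # avoid division by zero
    growth = (aof_current_size - base) * 100 / base
    return growth >= percentage
```

With the defaults, a file that was 64MB after the last rewrite becomes eligible again at 128MB, while a file that grew from 8MB to 32MB does not qualify because it is still below the minimum size.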

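When both conditions are met, Redis automatically triggers an AOF rewrite. The details are somewhat similar to RDB snapshot generation. Below is the AOF rewrite flow chart:

AOF rewrite flow chart

AOF rewriting is also handed to a child process, much like RDB snapshot generation. During the rewrite, Redis sets up an aof_rewrite_buf buffer to hold the commands the main process handles in the meantime. Once the child has finished writing the new AOF file, the buffered commands are appended to it, and finally the new AOF file replaces the old one. Note that during the rewrite the old AOF file continues to be synced to disk as usual, so that no data is lost if the rewrite fails.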

Restoring data from Redis persistence

Redis is memory-based: all data lives in memory, so a restart, crash, or other failure would lose everything. That is why we need persistence. When the server restarts, Redis loads data from the persistence files and restores the data set to its state before the restart. How does Redis implement this recovery? Let's look at the data recovery flow chart:

Redis data recovery

The recovery process is fairly simple. When AOF is enabled, Redis gives priority to the AOF file and tries to load the RDB file only if the AOF file does not exist. Why load the AOF file first when RDB recovery is faster? My personal view is that the AOF file holds more complete data and is more compatible across versions than RDB. Note that if an RDB/AOF file exists but fails to load, the Redis service will fail to start.
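The load order described above can be written as a small decision function. This mirrors the flow as this article describes it, not the Redis source code; the function name and its boolean parameters are hypothetical.

```python
def choose_load_source(aof_enabled, aof_exists, rdb_exists):
    """Return which file to load at startup: "aof", "rdb", or None."""
    if aof_enabled and aof_exists:
        return "aof"   # AOF takes priority when it is enabled and present
    if rdb_exists:
        return "rdb"   # otherwise fall back to the RDB snapshot
    return None        # nothing to load: start with an empty data set
```

A caller would then load the chosen file and abort startup if loading fails, matching the note that a failed load prevents Redis from starting.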

Closing remarks

There are already plenty of excellent Redis tutorial series online, so please forgive any similarities. Writing original content takes effort, so your support is appreciated. If anything in this article is incorrect, please point it out. Thank you.


Origin www.cnblogs.com/jamaler/p/11897144.html