Two ways of Redis data persistence

Redis data persistence

Data persistence means writing the data held in memory to disk. Redis keeps its data in memory, so when Redis crashes, that data disappears and is lost. The way to solve this problem is to persist the data stored in Redis.

Two methods of redis persistence

image-20230319100338554

Data recovery in the event of a downtime

image-20230319100441056

RDB

RDB (Redis DataBase) persistence takes a snapshot of the in-memory data at configured points in time and saves it to disk (the default file name is dump.rdb). If Redis goes down, the data can be recovered on the next start by loading the snapshot file from disk back into memory.

At specified intervals, a point-in-time snapshot of the dataset is taken.


Configuration files (Redis 6 vs. Redis 7)

Redis6

image-20230319101212262

Redis7

image-20230319101358523

trigger method

automatic trigger

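Modify the configuration file

vim /myredis/redis7.conf

# snapshot when at least 2 changes are made within 5 seconds
save 5 2
# directory where the dump file is saved; the directory must already exist
dir /myredis/dumpfiles
# name of the dump file
dbfilename dump6379.rdb

trigger backup

# once these two commands have been executed, a backup file is generated in the configured dir
set key1 value1
set key2 value2

restore backup

Move the backup file (dump6379.rdb with the configuration above) into the directory configured by dir and start the service. Redis loads the file on startup.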

Important:

  • Executing the flushall or flushdb command can also produce a dump file, but that file reflects the emptied data, so it is useless as a backup.

  • Keep backups on a different machine from the Redis server, so that the backup files are not lost if the physical machine is damaged.
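The save-point rule used by the automatic trigger can be modelled in a few lines of Python. This is a simplified sketch, not Redis's actual implementation: the function name `should_snapshot` and its parameters are hypothetical, and it only assumes that a snapshot becomes due once enough time has passed since the last save and enough changes have accumulated.

```python
def should_snapshot(save_points, changes_since_save, seconds_since_save):
    """Return True if any (seconds, changes) save point is satisfied.

    save_points: list of (seconds, changes) pairs, one per "save" line.
    changes_since_save: number of modifications since the last snapshot.
    seconds_since_save: seconds elapsed since the last snapshot.
    """
    return any(
        seconds_since_save > seconds and changes_since_save >= changes
        for seconds, changes in save_points
    )


# Mirrors the "save 5 2" line from the configuration in this article.
save_points = [(5, 2)]

# Two changes and more than five seconds elapsed: a snapshot is due.
print(should_snapshot(save_points, changes_since_save=2, seconds_since_save=6))

# Only one change: no snapshot, no matter how much time has passed.
print(should_snapshot(save_points, changes_since_save=1, seconds_since_save=60))
```

With several `save` lines, each pair is checked independently and any one of them is enough to start a snapshot.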

manual trigger

save

save runs in the main process and blocks the Redis server until the persistence work is completed. While save is executing, Redis cannot process any other command, so it should not be used in production.


bgsave (default)

bgsave takes a snapshot of all the data currently in memory asynchronously. The work is done by a child process in the background, so the main process can keep handling requests, including writes, at the same time.


Use the lastsave command to get the time of the last successful snapshot, returned as a Unix timestamp

lastsave
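Because lastsave returns a Unix timestamp (seconds since the epoch), it is usually converted before being read by a person. A minimal Python sketch follows; the helper name `lastsave_to_utc` and the sample value are made up for illustration.

```python
from datetime import datetime, timezone


def lastsave_to_utc(timestamp: int) -> datetime:
    """Convert the integer returned by LASTSAVE into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Hypothetical value, standing in for what `redis-cli lastsave` might print.
print(lastsave_to_utc(1679205600).isoformat())
```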

advantages

  • RDB is a very compact single-file point-in-time representation of Redis data. RDB files are great for backups. For example, an RDB file can be archived hourly for the last 24 hours and an RDB snapshot saved daily for 30 days, which allows for easy recovery of different versions of a dataset in the event of a disaster.
  • RDB is very suitable for disaster recovery, because it is a single compact file that can be transferred to a remote data center or cloud server.
  • RDB maximizes Redis performance because the only work the Redis parent process needs to do for persistence is to fork a child process that will do all the rest. The parent process never does disk I/O or similar.
  • RDB allows faster restarts with large datasets compared to AOF.
  • On replicas, RDB supports partial resynchronization after restarts and failovers.

Summary

  • Suitable for large-scale data recovery
  • Backups can be scheduled according to business needs
  • Suitable when requirements for data integrity and consistency are low
  • RDB files load into memory much faster than AOF files

disadvantages

  • RDB is not good if you need to minimize the chance of data loss when Redis stops working (e.g. power outage). You can configure multiple save points at which an RDB file is produced, but if Redis stops working for any reason without shutting down properly, everything written since the last snapshot (typically the latest minutes of data) is lost.
  • RDB needs to fork() often in order to persist to disk using a child process. If the data set is large, fork() can be time consuming, and if the data set is large and the CPU performance is not good, it may cause Redis to stop serving clients for some milliseconds or even a second. AOF also requires fork(), but less frequently, and the frequency of rewriting the log can be tuned without any trade-off for durability.

Summary

  • Backups are made at intervals. If Redis goes down, the data written between the latest snapshot and the crash is lost.
  • Every snapshot is a full dump of the in-memory data. If the amount of data is large, the I/O can seriously affect server performance.
  • RDB relies on fork() in the main process, which can cause a momentary delay on the server with a large data set. The child shares memory with the parent through copy-on-write, so heavy writes during a snapshot can make memory usage grow considerably.

the case

Data loss case, simulating Redis downtime

Set save 5 2 (2 modifications within 5 seconds)

①, normal write

# write data normally
set k1 v1
set k2 v2

②, simulate downtime

# write k3
set k3 v3
# find the redis process
ps -ef|grep redis
# kill it to simulate a crash (3878 is the PID reported by ps in this example)
kill -9 3878

③, restart redis

After restarting Redis, the k3 data is lost: it was written after the last snapshot, and a single change does not satisfy save 5 2, so it was never persisted.

Check the dump.rdb file (redis-check-rdb verifies the file; it does not modify it)

cd /usr/local/bin
redis-check-rdb /myredis/dumpfiles/dump6379.rdb


Trigger RDB snapshot

Which situations trigger an RDB snapshot?

  • The save points set in the configuration file are satisfied
  • The save/bgsave command is executed manually
  • The flushdb/flushall command is executed
  • shutdown is executed while AOF persistence is not enabled
  • During master-replica replication, the master node triggers a snapshot automatically

Disable RDB

  1. Command

    redis-cli config set save ""
    
  2. Configuration file redis.conf

    # set the following in the configuration file
    save ""
    

RDB optimization configuration items

Configuration file redis.conf

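# 1. save point: snapshot after <changes> modifications within <seconds> seconds
save <seconds> <changes>
# 2. name of the dump file
dbfilename
# 3. directory where the dump file is saved
dir
# 4. Default yes. Setting it to no means you do not care about data inconsistency, or have other means to detect and control it; Redis then keeps accepting write requests even when writing a snapshot fails
stop-writes-on-bgsave-error
# 5. Default yes. Controls whether snapshots stored on disk are compressed. If enabled, Redis compresses with the LZF algorithm. Turn it off if you do not want to spend CPU on compression
rdbcompression
# 6. Default yes. After storing a snapshot, Redis can verify the data with a CRC64 checksum, at a cost of roughly 10% in performance. Turn it off if you want maximum performance
rdbchecksum
# 7. Deletes the RDB files used by replication on instances that have no persistence enabled. Default no (the option is disabled)
rdb-del-sync-files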

AOF

AOF (Append Only File) persistence records every write command sent to Redis (in appendonly.aof by default). When Redis restarts after a crash, the saved commands are re-executed to recover the data.

Each write operation is recorded in the form of a log (read operations are not recorded). The file is only ever appended to, never modified in place. On startup, Redis reads the file and rebuilds the data. In other words, when Redis restarts, the write commands in the log file are executed from first to last to complete the data recovery.


By default Redis does not enable AOF; you need to enable it manually by setting appendonly yes

work process

image-20230320091709004

No. | Description
1 | Clients are the source of commands. There are many of them, sending a steady stream of request commands.
2 | When these commands arrive at the Redis server, they are not written directly into the AOF file. They are first placed in the AOF buffer, which is an area in memory. Its purpose is to collect commands and write them to disk in batches, avoiding frequent disk I/O.
3 | The AOF buffer is written to the AOF file on disk according to one of the three write-back (file sync) strategies.
4 | As the AOF content grows, commands are merged according to rules (known as AOF rewriting) to keep the file from bloating, which compacts the AOF file.
5 | When the Redis server restarts, it loads the data from the AOF file.

Three Writeback Strategies

always

Synchronous write-back: the log is written to disk synchronously right after each write command is executed

everysec

Write-back every second: after each write command is executed, the log is only written to the memory buffer of the AOF file, and the contents of the buffer are written to disk once per second

no

Write-back controlled by the operating system: after each write command is executed, the log is only written to the memory buffer of the AOF file, and the operating system decides when to write the buffer contents to disk

Summary

Configuration item | Write-back timing | Advantage | Disadvantage
always | Synchronous write-back | High reliability, almost no data loss | Every write command must be written to disk, which has a large impact on performance
everysec | Write-back every second | Moderate performance | Up to 1 second of data is lost on a crash
no | Write-back controlled by the operating system | Good performance | More data may be lost on a crash
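The difference between the three strategies comes down to when fsync is called on the log file. The following Python sketch illustrates that idea only; the class `AofWriter` is hypothetical, it stores commands as plain text lines rather than in the Redis protocol, and it checks the one-second interval at append time, whereas Redis performs the everysec sync in a background thread.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy append-only log that mimics the three appendfsync policies."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0

    def append(self, command):
        # Hand the bytes to the operating system (page cache).
        self.file.write(command.encode() + b"\n")
        self.file.flush()
        if self.policy == "always":
            self._fsync()  # force to disk after every command
        elif self.policy == "everysec":
            if time.monotonic() - self.last_fsync >= 1.0:
                self._fsync()  # force to disk at most once per second
        # policy "no": never call fsync, the OS decides when to flush

    def _fsync(self):
        os.fsync(self.file.fileno())
        self.last_fsync = time.monotonic()
        self.fsync_calls += 1

    def close(self):
        self.file.close()


with tempfile.TemporaryDirectory() as tmp:
    for policy in ("always", "everysec", "no"):
        writer = AofWriter(os.path.join(tmp, policy + ".aof"), policy)
        for i in range(3):
            writer.append(f"set k{i} v{i}")
        print(policy, "fsync calls:", writer.fsync_calls)
        writer.close()
```

The count of fsync calls per policy makes the trade-off visible: more calls mean less data at risk but more time spent waiting on the disk.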

Configuration files (Redis 6 vs. Redis 7)

Modify the configuration file

enable AOF

vim /myredis/redis7.conf
# change appendonly to yes
appendonly yes

Use the default writeback strategy

# write back every second
appendfsync everysec

aof file save path

  • redis6: The AOF file is saved in the same location as the RDB file. Both are controlled by the dir setting in the redis.conf configuration file

  • redis7: The appendonly.aof files are saved under dir + appenddirname

    # change the previously configured dir /myredis/dumpfiles to the following
    dir /myredis
    # set appenddirname
    appenddirname "appendonlydir"
    

aof file save name

  • redis6: there is exactly one file (appendonly.aof by default)

  • redis7: the AOF is split into several files:

    • base: the base file

    • incr: the incremental file

    • manifest: the manifest (list) file, which tracks the following AOF types:

      • base : The base AOF, which is generally generated by a child process during a rewrite. There is at most one such file.

      • incr : The incremental AOF, which is generally created when an AOF rewrite starts. There may be several such files.

      • history : The historical AOF, which is what base and incr AOFs turn into. Every time an AOF rewrite completes successfully, the base and incr AOFs that existed before the rewrite become history, and history AOFs are deleted automatically by Redis.

      To manage these AOF files, Redis introduces a manifest file that tracks them. To make AOF backup and copying easier, all AOF files and the manifest file are placed in a separate directory, whose name is determined by the appenddirname configuration.
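As a small illustration of the Redis 7 layout described above, the Python sketch below builds the AOF directory from dir and appenddirname and sorts file names into the three kinds. The helper names are hypothetical, and the sample file names are an assumption about what a Redis 7 instance with the default file name typically produces, so compare them with your own directory before relying on them.

```python
from pathlib import PurePosixPath


def aof_directory(dir_setting, appenddirname):
    """Join the 'dir' and 'appenddirname' settings into the AOF directory path."""
    return PurePosixPath(dir_setting) / appenddirname


def classify_aof_file(filename):
    """Guess the kind of a multi-part AOF file from its name."""
    if filename.endswith(".manifest"):
        return "manifest"
    if ".base." in filename:
        return "base"
    if ".incr." in filename:
        return "incr"
    return "unknown"


print(aof_directory("/myredis", "appendonlydir"))

# Assumed example names; check the actual contents of your appenddirname directory.
for name in (
    "appendonly.aof.1.base.rdb",
    "appendonly.aof.1.incr.aof",
    "appendonly.aof.manifest",
):
    print(name, "->", classify_aof_file(name))
```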

repair aof file

redis-check-aof --fix appendonly.aof.1.incr.aof

advantages

  • With AOF, Redis is much more durable. You can choose between different fsync strategies: no fsync at all, fsync every second, or fsync on every query. With the default strategy of fsync every second, write performance is still great. fsync is performed by a background thread, and the main thread tries hard to perform writes when no fsync is in progress, so you lose at most about one second of writes.
  • The AOF log is append-only, so there are no seek issues and no corruption issues on power outages. Even if for some reason (disk full or otherwise) the log ends with a half-written command, the redis-check-aof tool can fix it easily.
  • When the AOF becomes too large, Redis can rewrite it automatically in the background. Rewriting is completely safe: while Redis continues appending to the old file, a brand new file is generated using the minimal set of operations required to create the current dataset, and once the second file is ready, Redis switches the two and starts appending to the new one.
  • AOF contains a log of all operations, one after another, in a format that is easy to understand and parse. You can even export an AOF file easily. For example, even if you accidentally flushed everything with the FLUSHALL command, you can still save your dataset by stopping the server, removing the latest command, and restarting Redis, as long as no log rewrite was performed in the meantime.

Summary

Better protection against data loss, high performance, and emergency recovery.

disadvantages

  • AOF files are typically larger than equivalent RDB files for the same dataset.
  • Depending on the exact fsync strategy, AOF can be slower than RDB. In general, with fsync set to every second, performance is still very high, and with fsync disabled it should be as fast as RDB even under heavy load. RDB is still able to provide more guarantees about maximum latency even under huge write load.

Summary

For the same data set, the AOF file is much larger than the RDB file, and recovery is slower than with RDB. AOF also runs less efficiently than RDB: the every-second sync strategy still performs well, and with syncing left to the operating system the efficiency is about the same as RDB.

AOF rewriting mechanism

Since AOF persistence means that Redis continuously records write commands into the AOF file, the file grows larger and larger as Redis keeps running, occupying more disk space and making recovery from the AOF take longer.

To solve this problem, Redis has a rewriting mechanism. When the size of the AOF file exceeds the configured threshold, Redis automatically compacts the content of the AOF file, keeping only the minimal set of commands needed to restore the data. A rewrite can also be started manually with the bgrewriteaof command.

Summary: rewriting compacts the AOF file so that it keeps only the minimal set of commands needed to recover the data.

Why do you need to rewrite?

set k1 v1
set k1 v3
set k1 v3
set k1 v4

Without rewriting, all four commands above are saved in the AOF file. That not only takes up space, it also means all four commands must be executed on every startup, even though only the effect of the last command matters.

After a rewrite, only the last command is kept.
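The effect of rewriting on the four commands above can be sketched in Python. This is only an illustration of the result: the function `compact` is hypothetical, it understands nothing but `set key value` lines, and Redis itself does not compact the old log but builds the new file from the data in memory.

```python
def compact(commands):
    """Keep only the last 'set' for each key, in order of the final write."""
    latest = {}
    for line in commands:
        op, key, value = line.split()
        if op.lower() != "set":
            raise ValueError(f"this sketch only handles set commands: {line}")
        latest.pop(key, None)  # move the key to the end of the ordering
        latest[key] = value
    return [f"set {key} {value}" for key, value in latest.items()]


log = ["set k1 v1", "set k1 v3", "set k1 v3", "set k1 v4"]
print(compact(log))  # ['set k1 v4']
```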

trigger method

automatic trigger


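  1. The current AOF size has doubled compared with the AOF size after the last rewrite (auto-aof-rewrite-percentage, default 100)
  2. The AOF file has reached the minimum size for rewriting (auto-aof-rewrite-min-size, default 64mb)

A rewrite is triggered only when both conditions are met at the same time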
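The two conditions can be written down as a small Python check. This is a sketch under stated assumptions: the function name and parameter names are hypothetical, sizes are in bytes, and the defaults mirror auto-aof-rewrite-percentage 100 and auto-aof-rewrite-min-size 64mb.

```python
def should_rewrite(current_size, base_size, percentage=100, min_size=64 * 1024 * 1024):
    """Return True when both automatic rewrite conditions hold.

    current_size: current AOF size in bytes.
    base_size: AOF size in bytes right after the last rewrite.
    percentage: required growth over base_size, in percent (0 disables the check).
    min_size: the AOF must be larger than this many bytes.
    """
    if percentage == 0:
        return False  # automatic rewriting disabled
    if current_size <= min_size:
        return False  # file still too small to bother
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100
    return growth >= percentage


# With a 1 KB minimum, as in the example configuration later in this article:
print(should_rewrite(2048, 1024, percentage=100, min_size=1024))  # True
print(should_rewrite(1500, 1024, percentage=100, min_size=1024))  # False
```

The minimum size exists so that a tiny file that merely doubled (say from 10 bytes to 20) does not cause a pointless rewrite.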

manual trigger

The client sends the bgrewriteaof command to the server

Rewriting the AOF file does not rearrange the original file. Redis reads the key-value pairs that currently exist in the server and records each of them with a single command in place of the multiple commands that previously touched that key. The resulting new file replaces the original AOF file.

the case

Preparation

# enable AOF
appendonly yes
# rewrite threshold, changed to 1k
auto-aof-rewrite-min-size 1k
# turn off hybrid mode, changed to no
aof-use-rdb-preamble no
# delete all previous AOF and RDB files (commands omitted)

Automatic trigger example

Start the Redis service and keep writing data to it until the AOF reaches 1KB, at which point the rewriting mechanism starts.

Manual trigger example

# send the command
bgrewriteaof

rewriting principle

  1. When rewriting starts, Redis creates a "rewrite child process", which writes the minimal set of commands needed to rebuild the current in-memory data to a temporary file.
  2. At the same time, the main process accumulates newly received write commands in a memory buffer while continuing to write them to the original AOF file. This keeps the original AOF file usable and guards against a failure during the rewrite.
  3. When the "rewrite child process" finishes its work, it sends a signal to the parent process. On receiving the signal, the parent process appends the write commands cached in memory to the new AOF file.
  4. When the appending is finished, Redis replaces the old AOF file with the new one. From then on, new write commands are appended to the new AOF file.
  5. Rewriting the AOF file does not read the old AOF file. Instead, the entire in-memory database content is written to a new AOF file as commands, which is somewhat similar to taking a snapshot.
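The steps above can be mimicked with plain Python data structures. This is a toy model, not Redis's code: a dict copy stands in for the forked child's view of memory, a list stands in for the rewrite buffer, and only `set` commands are supported. All names are hypothetical.

```python
def apply(db, command):
    """Apply one 'set key value' command to an in-memory dict."""
    op, key, value = command.split()
    if op.lower() != "set":
        raise ValueError(f"this sketch only handles set commands: {command}")
    db[key] = value


def replay(commands):
    """Rebuild a dataset by executing a log from first to last command."""
    db = {}
    for command in commands:
        apply(db, command)
    return db


def rewrite_aof(snapshot, buffered_writes):
    """Build the new log: one command per key, then the buffered writes."""
    new_log = [f"set {key} {value}" for key, value in snapshot.items()]
    new_log.extend(buffered_writes)
    return new_log


db = {"k1": "v4", "k2": "v2"}   # current in-memory data
snapshot = dict(db)              # step 1: what the child process sees

rewrite_buffer = []              # step 2: writes arriving during the rewrite
for command in ("set k3 v3", "set k1 v5"):
    apply(db, command)
    rewrite_buffer.append(command)

new_aof = rewrite_aof(snapshot, rewrite_buffer)  # steps 3 and 4
print(new_aof)
print(replay(new_aof) == db)  # the new log rebuilds the current data
```

Replaying the new log gives the same data as the live dict, which is the property the rewrite has to preserve.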

RDB + AOF

  • The RDB persistence method stores snapshots of the data at specified time intervals

  • The AOF persistence method records each write operation sent to the server. When the server restarts, these commands are executed again automatically to restore the original data. AOF appends each write operation to the end of the file in the Redis protocol format.

RDB and AOF can coexist. If AOF is enabled, the AOF file is loaded first to restore the original data, because in general the data set saved by AOF is more complete than the one saved in the RDB file.

Should you use only AOF?

It is not recommended. RDB is better suited to backing up the database, while AOF is a poor fit for backups because the file is constantly changing.


recommended way

RDB + AOF Hybrid

  1. Enable hybrid mode

    aof-use-rdb-preamble yes
    
  2. RDB + AOF Hybrid

    The RDB image provides full persistence, and AOF provides incremental persistence

    Redis first stores a snapshot in RDB format and then records all subsequent write operations in AOF format. When the rewrite conditions are met or a rewrite is triggered manually, the latest data is stored as a new RDB record. When the service restarts, the data is restored from the RDB part and the AOF part together, which preserves data integrity and also speeds up recovery. Put simply, part of the content produced by hybrid persistence is in RDB format and part is in AOF format: the AOF consists of an RDB-format preamble followed by AOF-format commands.
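One way to see whether a given file begins with an RDB-format part is to look at its first bytes, since RDB data starts with the ASCII signature REDIS. The Python sketch below does exactly that; the function name is hypothetical and the two sample files are fabricated in a temporary directory purely for demonstration.

```python
import os
import tempfile

RDB_MAGIC = b"REDIS"  # RDB payloads begin with this signature


def has_rdb_preamble(path):
    """Return True if the file starts with the RDB signature."""
    with open(path, "rb") as f:
        return f.read(len(RDB_MAGIC)) == RDB_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    hybrid = os.path.join(tmp, "hybrid.aof")
    plain = os.path.join(tmp, "plain.aof")

    # Fake header bytes only; this is not a loadable snapshot.
    with open(hybrid, "wb") as f:
        f.write(b"REDIS0010" + b"<binary snapshot placeholder>")

    # A single SET command in the Redis protocol, as a plain AOF would store it.
    with open(plain, "wb") as f:
        f.write(b"*3\r\n$3\r\nset\r\n$2\r\nk1\r\n$2\r\nv1\r\n")

    print("hybrid:", has_rdb_preamble(hybrid))
    print("plain:", has_rdb_preamble(plain))
```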

pure cache

Disable both RDB and AOF

  • Disable RDB

    Even with RDB persistence disabled, the save and bgsave commands can still be used to generate RDB files

    save ""
    
  • Disable AOF

    Even with AOF persistence disabled, the bgrewriteaof command can still be used to generate AOF files

    appendonly no
    


Origin blog.csdn.net/weixin_52372879/article/details/129801918