Redis persistence - data loss and solution

Data write-back mechanism of Redis
The data write-back mechanism of Redis comes in two forms: synchronous and asynchronous.
Synchronous write-back is the SAVE command: the main process writes the data to disk itself. With a large dataset this blocks the server for a long time, so it is generally not recommended.
Asynchronous write-back is the BGSAVE command: the main process forks a copy of itself, and the child process writes the data back to disk, exiting when the write-back finishes. Since the main process is never blocked this way, the server does not stall, and this method is the default and the one generally used.
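The difference between the two can be sketched in plain shell, with a background subshell standing in for the forked child (file names and data are illustrative, not what Redis actually writes):

```shell
#!/bin/sh
# Sketch of the BGSAVE idea: the parent keeps serving while a child
# process writes the snapshot. Everything here is a stand-in: the
# directory, file names and "dataset" are illustrative only.
WORKDIR=$(mktemp -d)

snapshot() {
    # Child: dump the "dataset" to disk, then exit.
    echo "key1=value1" > "$WORKDIR/dump.rdb"
}

snapshot &                             # "fork": runs in a child process
CHILD=$!

echo "parent still serving requests"   # the parent is not blocked

wait "$CHILD"                          # Redis learns of completion via SIGCHLD
cat "$WORKDIR/dump.rdb"
```

With SAVE, the `snapshot` call would run in the foreground and the "serving requests" line could only print after the dump finished.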
Personally, I feel that forking the whole main process is clumsy, but it seems to be the only workable method. Hot data in memory can be modified at any moment, and writing a consistent image to disk requires freezing that image for a period of time; freezing the main process would make the server appear dead. Forking a new process instead captures a copy of the memory image at that instant, so the main process never has to freeze; all the work is done in the child.
Forking a process with a small memory footprint costs little, but once the process occupies memory on the order of gigabytes, fork becomes a frightening operation. Worse, what about forking a process that uses 14 GB on a host with 16 GB of RAM? By default the kernel will refuse to allocate the memory. And the more frequently the data changes on a host, the more frequent the forks, so the cost of the fork operation itself may end up no better than the stall it was meant to avoid.
After finding the cause, set the kernel parameter directly in /etc/sysctl.conf:
vm.overcommit_memory = 1
and apply it with:
sysctl -p
The Linux kernel decides whether to grant a memory allocation request according to the value of vm.overcommit_memory:
vm.overcommit_memory = 1: always grant the request.
vm.overcommit_memory = 0: compare the virtual memory requested against the system's currently free physical memory plus swap, and grant or refuse accordingly.
vm.overcommit_memory = 2: compare the process's total allocated virtual memory plus the new request against the system's free physical memory plus swap, and grant or refuse accordingly.
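The three modes can be sketched as a small shell function. This is a toy model following the description above, not the kernel's actual accounting (the real mode-0 heuristic is more permissive, and mode 2 uses a commit limit derived from overcommit_ratio); sizes below are in MB and the numbers are illustrative:

```shell
#!/bin/sh
# Toy model of the vm.overcommit_memory decision described above.
# Arguments: mode, requested MB, free physical MB, swap MB, MB already
# allocated by the process. Not the kernel's real algorithm.
overcommit_check() {
    mode=$1; request=$2; free_phys=$3; swap=$4; allocated=$5
    case $mode in
        1)
            echo allow ;;                      # always grant
        0)
            if [ "$request" -le $((free_phys + swap)) ]; then
                echo allow                     # heuristic comparison
            else
                echo deny
            fi ;;
        2)
            if [ $((allocated + request)) -le $((free_phys + swap)) ]; then
                echo allow                     # strict accounting
            else
                echo deny
            fi ;;
    esac
}

# The "14 GB fork on a 16 GB host" case: the child needs as much virtual
# memory as the parent already holds, but little free memory remains.
overcommit_check 0 14336 1024 2048 14336    # prints "deny"
overcommit_check 1 14336 1024 2048 14336    # prints "allow"
```

Mode 1 lets the fork succeed because copy-on-write means the child rarely touches most of those pages, which is exactly why it helps BGSAVE.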
Redis persistence practice and disaster recovery simulation
References:
Redis Persistence http://redis.io/topics/persistence
Google Groups https://groups.google.com/forum/?fromgroups=#!forum/redis-db
1. Discussion and Understanding of Redis Persistence
At present, Redis has two persistence methods: RDB and AOF.
First, we should clarify what persisted data is for; the answer is: data recovery after a restart.
Redis is an in-memory database, whether it is RDB or AOF, it is only a measure to ensure data recovery.
Therefore, when Redis uses RDB and AOF for recovery, it will read the RDB or AOF file and reload it into memory.
RDB is Snapshot snapshot storage, which is the default persistence method.
It can be understood as a semi-persistent mode, that is, data is periodically saved to disk according to a certain strategy.
The corresponding generated data file is dump.rdb, and the snapshot period is defined by the save parameter in the configuration file.
The following are the default snapshot settings:
save 900 1    #flush to disk if at least 1 key changed within 900 seconds
save 300 10   #flush to disk if at least 10 keys changed within 300 seconds
save 60 10000 #flush to disk if at least 10,000 keys changed within 60 seconds
The RDB file will not be corrupted by a crash, because its write operation is performed in a new process.
When a new RDB file is generated, the child process generated by Redis will first write the data to a temporary file, and then rename the temporary file to the RDB file through the atomic rename system call.
This way, at any time of failure, the Redis RDB file is always available.
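The temp-file-plus-rename pattern can be sketched in plain shell (the file names here are illustrative, not Redis's actual temp-file naming):

```shell
#!/bin/sh
# The pattern described above: write the new snapshot to a temporary
# file, then make it visible with a single atomic rename. A crash
# before the rename leaves the old dump.rdb untouched.
DIR=$(mktemp -d)
echo "old snapshot" > "$DIR/dump.rdb"

# Child writes the new snapshot to a temporary file first...
echo "new snapshot" > "$DIR/temp-1234.rdb"

# ...and only the rename (one atomic step on the same filesystem)
# replaces the visible RDB file.
mv "$DIR/temp-1234.rdb" "$DIR/dump.rdb"

cat "$DIR/dump.rdb"    # prints "new snapshot"
```

Had the process died before the `mv`, a reader of `dump.rdb` would still have seen the complete old snapshot; there is no window in which the file is half-written.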
At the same time, the RDB file of Redis is also a part of the internal implementation of Redis master-slave synchronization.
The first synchronization between Slave and Master works as follows:
the Slave sends a synchronization request to the Master; the Master first dumps an RDB file, then transfers the full RDB file to the Slave, and then forwards the commands it buffered during the dump. At that point the initial synchronization is complete.
The second and subsequent synchronization implementations are as follows:
The Master then sends the changed data directly to each Slave in real time.
But no matter what causes the Slave and Master to disconnect and reconnect, the process of the above two steps will be repeated.
The master-slave replication of Redis is based on the persistence of memory snapshots. As long as there is a Slave, there will be memory snapshots.
It is easy to see RDB's shortcoming: once the database goes down, the data saved in the RDB file is not fully up to date.
All data from the last time the RDB file was generated to when Redis was down is lost.
AOF (Append-Only File) has better persistence than RDB.
Because when using the AOF persistence method, Redis will append each received write command to the file through the Write function, similar to MySQL's binlog.
When Redis restarts, it rebuilds the entire database contents in memory by re-executing the write commands saved in the file.
The corresponding setting parameters are:
$ vim /opt/redis/etc/redis_6379.conf
appendonly yes #Enable AOF persistence mode
appendfilename appendonly.aof #The name of the AOF file, the default is appendonly.aof
# appendfsync always #fsync after every write command: the most complete persistence guarantee, but the slowest; generally not recommended
appendfsync everysec #fsync once per second: a good compromise between performance and durability; the recommended setting
# appendfsync no #leave flushing entirely to the OS (generally about once every 30 seconds): the best performance but the weakest durability guarantee; not recommended
AOF's more complete persistence brings another problem: the persistence file keeps growing.
For example, if we call the INCR test command 100 times, all 100 commands must be saved in the file, but in fact 99 are redundant.
Because to restore the state of the database, it is enough to save a SET test 100 in the file.
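That redundancy is easy to see in a plain shell simulation (no Redis involved): replaying the 100 INCR commands ends at exactly the value the single rewritten SET records.

```shell
#!/bin/sh
# Simulate replaying an AOF that holds 100 "INCR test" commands,
# versus the rewritten AOF that holds one "SET test 100".
test_counter=0                           # the key "test" starts empty
i=0
while [ "$i" -lt 100 ]; do
    test_counter=$((test_counter + 1))   # replay one INCR test
    i=$((i + 1))
done

rewritten=100                            # effect of: SET test 100

echo "$test_counter $rewritten"          # prints "100 100"
```

Both replays reconstruct the same state, but the rewritten file needs one command instead of 100, which is the whole point of the rewrite described next.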
In order to compress AOF persistent files, Redis provides the bgrewriteaof command.
After receiving this command, Redis takes a snapshot-like approach: it saves the in-memory data to a temporary file in the form of commands, then replaces the original file, thereby keeping the growth of the AOF file under control.
Because this simulates a snapshot, the rewrite does not read the old AOF file; instead, it writes a brand-new AOF file from the commands needed to rebuild the entire in-memory database.
The corresponding setting parameters are:
$ vim /opt/redis/etc/redis_6379.conf
no-appendfsync-on-rewrite yes #During a log rewrite, do not fsync appended commands; only place them in the buffer, to avoid disk I/O contention with the rewrite
auto-aof-rewrite-percentage 100 #Automatically start a new log rewriting process when the current AOF file is twice the size it had after the last rewrite
auto-aof-rewrite-min-size 64mb #The minimum size of the current AOF file before a new log rewriting process may start, to avoid frequent rewrites of a small file when Redis has just started
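The interaction of the two parameters can be sketched as a shell function; this follows the description above rather than Redis's source, and the sizes fed in are illustrative:

```shell
#!/bin/sh
# Sketch of the auto-rewrite trigger: fire only when the AOF has grown
# by the configured percentage since the last rewrite AND has reached
# the configured minimum size. Sizes are in bytes.
should_rewrite() {
    current=$1; base=$2; percentage=$3; min_size=$4
    growth_limit=$((base + base * percentage / 100))
    if [ "$current" -ge "$growth_limit" ] && [ "$current" -ge "$min_size" ]; then
        echo yes
    else
        echo no
    fi
}

MIN=$((64 * 1024 * 1024))                     # auto-aof-rewrite-min-size 64mb

should_rewrite 150000000 70000000 100 "$MIN"  # doubled and over 64mb: yes
should_rewrite 1000000    400000  100 "$MIN"  # doubled but tiny file: no
```

The min-size guard is what prevents a freshly started instance, whose AOF doubles from a few kilobytes almost immediately, from rewriting constantly.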
What to choose? The following is the official recommendation:
Generally, if you want to provide high data security, it is recommended that you use both persistence methods at the same time.
If you can live with a few minutes of data loss from a disaster, then you can just use RDB.
Many users only use AOF, but we recommend that since RDB can take a full snapshot of the data from time to time and provide faster restarts, it is better to use RDB as well.
Therefore, we hope to unify AOF and RDB into a persistence mode in the future (long-term plan).
In terms of data recovery:
RDB's startup time will be shorter for two reasons:
First, there is only one record for each data in the RDB file, and there may not be multiple operation records of a data like the AOF log. So each piece of data only needs to be written once.
Another reason is that the storage format of the RDB file is the same as the encoding format of the Redis data in the memory, and no data encoding work is required, so the CPU consumption is much smaller than the loading of the AOF log.
2. Disaster recovery simulation
Since the persistent data is used for data recovery after restart, it is very necessary for us to conduct such a disaster recovery simulation.
It is said that if data is to be persisted while keeping the service stable, it is advisable to leave half of the physical memory free: when a snapshot is taken, the child process forked for the dump can, in the worst case, occupy as much memory as the parent, and copy-on-write itself has a noticeable impact on performance and memory consumption.
At present, the usual design idea is to use the Replication mechanism to make up for the deficiencies in the performance of aof and snapshot, so as to achieve data persistence.
That is, neither Snapshot nor AOF is performed on the Master to ensure the read and write performance of the Master, while on the Slave, both Snapshot and AOF are enabled for persistence and data security.
First, modify the following configuration on the Master:
$ sudo vim /opt/redis/etc/redis_6379.conf
#save 900 1 #Disable Snapshot
#save 300 10
#save 60 10000

appendonly no #Disable AOF
Next, modify the following configuration on the Slave:
$ sudo vim /opt/redis/etc/redis_6379.conf
save 900 1 #Enable Snapshot
save 300 10
save 60 10000

appendonly yes #Enable AOF
appendfilename appendonly.aof #AOF file name
# appendfsync always
appendfsync everysec #Force write to disk once per second
# appendfsync no 

no-appendfsync-on-rewrite yes #During log rewriting, no command append operation is performed
auto-aof-rewrite-percentage 100 #Automatically start a new log rewriting process
auto-aof-rewrite-min-size 64mb #The minimum size of the AOF file before a new log rewriting process may start
Then start the Master and the Slave respectively:
$ /etc/init.d/redis start
After the startup is complete, confirm on the Master that the Snapshot parameters are not enabled:
redis 127.0.0.1:6379> CONFIG GET save
1) "save"
2) ""
Then generate 250,000 records in the Master with the following script:
dongguo@redis:/opt/redis/data/6379$ cat redis-cli-generate.temp.sh
#!/bin/bash

REDISCLI="redis-cli -a slavepass -n 1 SET"
ID=1

while(($ID<50001))
do
  INSTANCE_NAME="i-2-$ID-VM"
  UUID=`cat /proc/sys/kernel/random/uuid`
  PRIVATE_IP_ADDRESS=10.`echo "$RANDOM % 255 + 1" | bc`.`echo "$RANDOM % 255 + 1" | bc`.`echo "$RANDOM % 255 + 1" | bc`
  CREATED=`date "+%Y-%m-%d %H:%M:%S"`

  $REDISCLI vm_instance:$ID:instance_name "$INSTANCE_NAME"
  $REDISCLI vm_instance:$ID:uuid "$UUID"
  $REDISCLI vm_instance:$ID:private_ip_address "$PRIVATE_IP_ADDRESS"
  $REDISCLI vm_instance:$ID:created "$CREATED"

  $REDISCLI vm_instance:$INSTANCE_NAME:id "$ID"

  ID=$(($ID+1))
done
dongguo@redis:/opt/redis/data/6379$ ./redis-cli-generate.temp.sh
During the data generation you can clearly see that the Master creates the dump.rdb file only when the Slave synchronizes for the first time, and afterwards sends changes to the Slave as incremental commands.
The dump.rdb file did not grow any further.
dongguo@redis:/opt/redis/data/6379$ ls -lh
total 4.0K
-rw-r--r-- 1 root root 10 Sep 27 00:40 dump.rdb
On the Slave, by contrast, you can see that both the dump.rdb file and the AOF file keep growing, and the AOF file grows noticeably faster than dump.rdb:
dongguo@redis-slave:/opt/redis/data/6379$ ls -lh
total 24M
-rw-r--r-- 1 root root 15M Sep 27 12:06 appendonly.aof
-rw-r--r-- 1 root root 9.2M Sep 27 12:06 dump.rdb
After the data insertion finishes, first confirm the current data volume:
redis 127.0.0.1:6379> info
redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:27623
run_id:e00757f7b2d6885fa9811540df9dfed39430b642
uptime_in_seconds:1541
uptime_in_days:0
lru_clock:650187
used_cpu_sys:69.28
used_cpu_user:7.67
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:1
connected_slaves:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33055824
used_memory_human:31.52M
used_memory_rss:34717696
used_memory_peak:33055800
used_memory_peak_human:31.52M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:0
changes_since_last_save:250000
bgsave_in_progress:0
last_save_time:1348677645
bgrewriteaof_in_progress:0
total_connections_received:250007
total_commands_processed:750019
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:246
vm_enabled:0
role:master
slave0:10.6.1.144,6379,online
db1:keys=250000,expires=0
The current data volume is 250,000 keys, occupying 31.52M of memory.
Then we directly kill the Redis process of the Master to simulate a disaster.
dongguo@redis:/opt/redis/data/6379$ sudo killall -9 redis-server
Check the status on the Slave:
redis 127.0.0.1:6379> info
redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:1627
uptime_in_days:0
lru_clock:654181
used_cpu_sys:29.69
used_cpu_user:1.21
used_cpu_sys_children:1.70
used_cpu_user_children:1.23
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047696
used_memory_human:31.52M
used_memory_rss:34775040
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:3308
bgsave_in_progress:0
last_save_time:1348718951
bgrewriteaof_in_progress:0
total_connections_received:4
total_commands_processed:250308
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:694
vm_enabled:0
role:slave
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
master_host:10.6.1.143
master_port:6379
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
master_link_down_since_seconds:25
slave_priority:100
db1:keys=250000,expires=0
You can see that master_link_status is down: the Master is no longer reachable. Meanwhile the Slave is still running well and retains both the AOF and RDB files.
Next, we will restore the data on the Master from the AOF and RDB files saved on the Slave.
First, cancel the synchronization state on the Slave, to prevent a restarted Master from overwriting the Slave's data before the recovery is complete, which would lose everything:
redis 127.0.0.1:6379> SLAVEOF NO ONE
OK
Confirm that no master-related configuration information remains:
redis 127.0.0.1:6379> INFO
redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:13003
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:1961
uptime_in_days:0
lru_clock:654215
used_cpu_sys:29.98
used_cpu_user:1.22
used_cpu_sys_children:1.76
used_cpu_user_children:1.42
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047696
used_memory_human:31.52M
used_memory_rss:34779136
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348719252
bgrewriteaof_in_progress:0
total_connections_received:4
total_commands_processed:250311
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:1119
vm_enabled:0
role:master
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
db1:keys=250000,expires=0
Archive the data files on the Slave:
dongguo@redis-slave:/opt/redis/data/6379$ tar cvf /home/dongguo/data.tar *
appendonly.aof
dump.rdb
Upload data.tar to the Master and try to restore the data.
In the Master's data directory there is a small data file left over from initializing the Slave; delete it:
dongguo@redis:/opt/redis/data/6379$ ls -l
total 4
-rw-r--r-- 1 root root 10 Sep 27 00:40 dump.rdb
dongguo@redis:/opt/redis/data/6379$ sudo rm -f dump.rdb
Then extract the data files:
dongguo@redis:/opt/redis/data/6379$ sudo tar xf /home/dongguo/data.tar
dongguo@redis:/opt/redis/data/6379$ ls -lh
total 29M
-rw-r--r-- 1 root root 18M Sep 27 01:22 appendonly.aof
-rw-r--r-- 1 root root 12M Sep 27 01:22 dump.rdb
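The copy-and-extract round trip above can be exercised generically with temporary stand-in directories (the paths and file contents below are illustrative, not the real /opt/redis/data/6379):

```shell
#!/bin/sh
# Generic round trip of the backup step used above: archive the
# persistence files from the "Slave" data directory, extract them
# into the "Master" data directory.
SLAVE_DIR=$(mktemp -d)    # stand-in for the Slave's data directory
MASTER_DIR=$(mktemp -d)   # stand-in for the Master's data directory
TARFILE=$(mktemp)

echo "aof-contents" > "$SLAVE_DIR/appendonly.aof"
echo "rdb-contents" > "$SLAVE_DIR/dump.rdb"

# "Slave": pack both persistence files into one archive.
tar -cf "$TARFILE" -C "$SLAVE_DIR" appendonly.aof dump.rdb

# "Master": unpack the copy into the (cleaned) data directory.
tar -xf "$TARFILE" -C "$MASTER_DIR"
ls "$MASTER_DIR"
```

The important detail, as in the walkthrough, is that the stale dump.rdb on the target is removed or overwritten before Redis is started, so the server only ever sees the Slave's files.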
Start Redis on the Master:
dongguo@redis:/opt/redis/data/6379$ sudo /etc/init.d/redis start
Starting Redis server...
Check whether the data has been restored:
redis 127.0.0.1:6379> INFO
redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id:16959
run_id:6e5ba6c053583414e75353b283597ea404494926
uptime_in_seconds:22
uptime_in_days:0
lru_clock:650292
used_cpu_sys:0.18
used_cpu_user:0.20
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
connected_clients:1
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33047216
used_memory_human:31.52M
used_memory_rss:34623488
used_memory_peak:33047192
used_memory_peak_human:31.52M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:0
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348680180
bgrewriteaof_in_progress:0
total_connections_received:1
total_commands_processed:1
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
vm_enabled:0
role:master
db1:keys=250000,expires=0
You can see that all 250,000 records have been fully restored to the Master.
At this point, you can safely restore the synchronization settings of the slave.
redis 127.0.0.1:6379> SLAVEOF 10.6.1.143 6379
OK
Check synchronization status:
redis 127.0.0.1:6379> INFO
redis_version:2.4.17
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.5
process_id
run_id:9b8b398fc63a26d160bf58df90cf437acce1d364
uptime_in_seconds:2652
uptime_in_days:0
lru_clock:654284
used_cpu_sys:30.01
used_cpu_user:2.12
used_cpu_sys_children:1.76
used_cpu_user_children:1.42
connected_clients:2
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:33056288
used_memory_human:31.52M
used_memory_rss:34766848
used_memory_peak:33064400
used_memory_peak_human:31.53M
mem_fragmentation_ratio:1.05
mem_allocator:jemalloc-3.0.0
loading:0
aof_enabled:1
changes_since_last_save:0
bgsave_in_progress:0
last_save_time:1348719252
bgrewriteaof_in_progress:1
total_connections_received:6
total_commands_processed:250313
expired_keys:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:12217
vm_enabled:0
role:slave
aof_current_size:17908619
aof_base_size:16787337
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
master_host:10.6.1.143
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_priority:100
db1:keys=250000,expires=0
master_link_status shows up, so synchronization is back to normal.
During this recovery we copied both the AOF and the RDB file, so which of them actually performed the data recovery?
In fact, when a Redis server goes down, data is restored into memory on restart according to the following priority:
1. If only AOF is configured, the AOF file is loaded on restart to restore the data;
2. If both RDB and AOF are configured, only the AOF file is loaded on startup to restore the data;
3. If only RDB is configured, the dump file is loaded on startup to restore the data.
In other words, AOF takes priority over RDB, which is understandable: AOF itself guarantees more complete data than RDB does.
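This priority rule can be sketched as a small shell function; it mirrors the three cases above, not Redis's actual startup code, and the directory used is a temporary stand-in:

```shell
#!/bin/sh
# Simplified model of the startup decision: if AOF is enabled and its
# file exists, it wins; otherwise fall back to the RDB dump; otherwise
# start empty. Not Redis's real loading code.
pick_recovery_file() {
    aof_enabled=$1; datadir=$2
    if [ "$aof_enabled" = "yes" ] && [ -f "$datadir/appendonly.aof" ]; then
        echo appendonly.aof
    elif [ -f "$datadir/dump.rdb" ]; then
        echo dump.rdb
    else
        echo none
    fi
}

DIR=$(mktemp -d)
touch "$DIR/appendonly.aof" "$DIR/dump.rdb"   # both files present

pick_recovery_file yes "$DIR"   # prints "appendonly.aof"
pick_recovery_file no  "$DIR"   # prints "dump.rdb"
```

This is why, in the recovery above, it was the AOF file that actually rebuilt the dataset, even though both files had been copied over.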
In this case, we ensured the data by enabling AOF and RDB on the Slave and restored the Master.
However, this is not practical in our current production environment, because our data carries expiration times: very frequent writes make the AOF file grow abnormally large, far beyond the actual data volume, and that in turn makes data recovery take a huge amount of time.
Therefore, we enable only Snapshot on the Slave for local persistence, and additionally we can consider increasing the save frequency, or calling BGSAVE from a scheduled task to take periodic snapshots, so as to keep the local data as complete as possible.
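One way such a scheduled task could look, as a hypothetical crontab entry (the redis-cli path, port, and 10-minute interval are all assumptions to adjust for your deployment):

```
# Hypothetical crontab entry on the Slave: request an extra BGSAVE
# every 10 minutes, on top of the save rules, to narrow the window
# of data that a snapshot can miss.
*/10 * * * * /usr/local/bin/redis-cli -p 6379 BGSAVE > /dev/null 2>&1
```

Because BGSAVE forks, the interval should stay coarse enough that successive dumps do not overlap on a large dataset.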
Under such an architecture, if only the Master hangs up, the Slave is complete and the data recovery can reach 100%.
If the Master and Slave hang at the same time, the data recovery can also reach an acceptable level.
