Redis persistence and cache learning and understanding notes

Redis persistence:

Redis is an in-memory database: if the state of the database in memory is not saved to disk, that state disappears as soon as the server process exits. To prevent this, Redis provides persistence!

RDB(Redis DataBase)

RDB writes a snapshot of the in-memory data set to disk at specified intervals, i.e. a point-in-time SnapShot. On recovery, Redis reads the snapshot file directly back into memory.

For persistence, Redis forks a separate child process, which first writes the data to a temporary file; when the persistence process finishes, the temporary file replaces the previous snapshot file. The main process performs no disk I/O during the whole procedure, which keeps performance extremely high. If large-scale data recovery is needed and perfect data integrity is not critical, RDB is more efficient than AOF. The disadvantage of RDB is that the data written since the last snapshot may be lost. RDB is the default, and under normal circumstances there is no need to modify this configuration!

Triggering mechanism:
1. When a configured save rule is met, an RDB snapshot is triggered automatically.
2. Executing the flushall command also triggers the RDB rules.
3. Shutting down Redis generates an RDB file.
4. Backups automatically produce a dump.rdb file.
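For reference, the save rules mentioned in step 1 live in redis.conf; the stock configuration ships with rules like the following (snapshot if at least N keys changed within M seconds):

```conf
# Trigger an RDB snapshot (BGSAVE) when, within the given window,
# at least that many keys have changed:
save 900 1      # 15 min,  >= 1 change
save 300 10     # 5 min,   >= 10 changes
save 60 10000   # 1 min,   >= 10000 changes

# Name and location of the snapshot file
dbfilename dump.rdb
dir ./
```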

AOF(Append Only File)

AOF records every command we run, as a history; to restore, Redis replays this file!
Each write operation is recorded in the form of a log: AOF appends every write instruction Redis executes to the file (the file is append-only and is never rewritten in place). When Redis starts, it reads the file and re-executes the logged instructions one by one to rebuild the data and complete the recovery.

Advantages:
1. If every modification is synced to disk, file integrity is better!
2. If syncing happens once per second, at most one second of data may be lost.
Disadvantages:
1. For the same data set, the AOF file is much larger than the RDB file, and repairing and loading it is slower than RDB!
2. AOF also runs slower than RDB, which is why Redis's default configuration uses RDB persistence.
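For reference, AOF is controlled by directives like these in redis.conf (everysec is the trade-off described in the advantages above):

```conf
appendonly yes                  # enable AOF (off by default; RDB is the default)
appendfilename "appendonly.aof"

# How often to fsync the AOF to disk:
# always   - fsync on every write (safest, slowest)
# everysec - fsync once per second (default; may lose ~1s of data)
# no       - let the OS decide
appendfsync everysec

# AOF rewrite triggers (see the performance recommendations section)
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
```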

Expansion:

Open two persistence methods at the same time:

In this case, when Redis restarts it loads the AOF file first to restore the original data, because under normal circumstances the data set saved by the AOF file is more complete than the one saved by the RDB file.

RDB data is not real-time, and when both methods are enabled a restarting server only looks for the AOF file. Should we then use AOF alone? The author recommends not: RDB is better suited for backing up the database (a constantly changing AOF file is awkward to back up), it restarts faster, and it avoids potential AOF bugs, so it is worth keeping as a safety net.

Performance recommendations

Because the RDB file is only used for backup purposes, it is recommended to persist RDB only on the Slave, and a snapshot every 15 minutes is enough — keep only the `save 900 1` rule.

If you enable AOF, the benefit is that in the worst case no more than two seconds of data are lost, and the startup script is simpler: it only needs to load the AOF file. The costs are continuous I/O and AOF rewrite: at the end of a rewrite, the blocking caused by writing the new data produced during the rewrite into the new file is almost unavoidable. As long as the hard disk can afford it, the frequency of AOF rewrites should be reduced as much as possible: the default rewrite base size of 64MB is too small and can be raised to 5GB or more, and the default trigger of rewriting once the file exceeds 100% of the base size can be changed to a more appropriate value.

If AOF is not enabled, high availability can be achieved with Master-Slave Replication alone, which saves a lot of I/O and avoids the system jitter that rewrites cause. The price is that if the Master and Slave go down at the same time, more than ten minutes of data may be lost, and the startup script must compare the RDB files on the Master and Slave and load the newer one. Weibo uses this architecture.

Redis cache

The problems to consider are: cache penetration, cache breakdown, and the avalanche effect on expiry.

Cache penetration

Cache penetration means querying data that definitely does not exist. Because the cache is written passively on a miss, and (for fault tolerance) a value that cannot be found in the storage layer is not written to the cache, every request for this non-existent data goes to the storage layer, which defeats the purpose of caching.

Under heavy traffic this can bring the DB down, and if someone deliberately floods the application with requests for a non-existent key, it becomes an attack vector.

Solution
There are several ways to effectively solve cache penetration. The most common is a Bloom filter: hash all possibly existing data into a sufficiently large bitmap, so that a query for data that is certainly absent is intercepted by the bitmap, sparing the underlying system from the query pressure.
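As a sketch of the idea (not production code — real deployments typically use a library or a Redis-side module), a minimal Bloom filter might look like this:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash positions per key in a bitmap.
    A lookup may return a false positive, but never a false negative,
    so a "not present" answer can safely skip the storage layer."""

    def __init__(self, size_bits=1 << 20, hashes=5):
        self.size = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        # Derive `hashes` independent positions from SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # All k bits set -> "maybe present"; any bit clear -> definitely absent.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))
```

In the cache-penetration setup, every valid key is added to the filter at load time, and a request whose key fails `might_contain` is rejected before it ever reaches the cache or the DB.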

There is also a simpler, cruder method: if a query returns an empty result (whether because the data does not exist or because of a system fault), still cache the empty result, but with a very short expiration time, no longer than five minutes.
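A sketch of that null-caching approach, under assumptions: `DictCache` is a tiny in-memory stand-in for a Redis client (a real redis-py client exposes the same `get`/`setex` calls), and `db` is any object with a `get` method:

```python
import time

NULL_MARKER = "__NULL__"   # sentinel meaning "key known not to exist"
NULL_TTL = 300             # cache empty results for at most 5 minutes

class DictCache:
    """In-memory stand-in for a Redis client with get/setex."""
    def __init__(self):
        self._store = {}

    def setex(self, key, ttl, value):
        self._store[key] = (value, time.time() + ttl)

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires = item
        if time.time() >= expires:
            del self._store[key]
            return None
        return value

def get_with_null_caching(cache, db, key, ttl=3600):
    """Read-through lookup that also caches misses briefly, so repeated
    requests for a non-existent key never hammer the storage layer."""
    cached = cache.get(key)
    if cached is not None:
        return None if cached == NULL_MARKER else cached
    value = db.get(key)                          # hit the storage layer once
    if value is None:
        cache.setex(key, NULL_TTL, NULL_MARKER)  # cache the empty result
        return None
    cache.setex(key, ttl, value)
    return value
```

The short TTL on the marker is the trade-off the text describes: a key that comes into existence later is invisible for at most five minutes.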

Cache avalanche

Cache avalanche means that many cache entries were set with the same expiration time, causing them all to expire at the same moment; every request is then forwarded to the DB, whose momentary load becomes too heavy.

Solution
Redis high availability
The meaning of this idea: since a Redis instance may go down, add a few more instances, so that when one fails the others keep working — in effect, build a cluster.

The avalanche effect of mass cache expiry is devastating to the underlying system, so most system designs use locks or queues to ensure single-threaded writes to the cache, preventing a flood of concurrent requests from landing on the underlying storage when entries expire.

A simple remedy worth sharing is to spread out the cache expiration times: add a random offset to the base expiration time, for example a random 1–5 minutes, so that expiration times rarely coincide and a collective expiry event becomes much less likely.
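The jitter idea above is a one-liner; a minimal sketch (function name is illustrative):

```python
import random

def jittered_ttl(base_ttl, min_jitter=60, max_jitter=300):
    """Return the base TTL plus a random 1-5 minute offset, so that
    entries written together do not all expire at the same moment."""
    return base_ttl + random.randint(min_jitter, max_jitter)

# e.g. cache.setex(key, jittered_ttl(3600), value) instead of a fixed 3600
```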

Rate limiting and degradation
The idea of this solution is to use locks or queues to control the number of threads that read the database and write the cache after a cache entry expires. For example, allow only one thread per key to query the data and write the cache, while the other threads wait.

Cache breakdown

For keys that have an expiration time set and may be accessed with extreme concurrency at certain points in time — very "hot" data — we must consider the problem of the cache being "broken down". The difference from a cache avalanche: breakdown concerns the cache for a single key, whereas an avalanche concerns many keys.

When the cache for such a key expires, the large number of concurrent requests arriving at that moment all load the data from the back-end DB and write it back to the cache — and this burst of concurrency can instantly crush the back-end DB.

Solution:
Set hot data to never expire:
At the cache level, simply set no expiration time, so the problems caused by a hot key expiring never occur.

Add a mutex lock:
Distributed lock: use a distributed lock to ensure that, for each key, only one thread at a time queries the back-end service; the other threads, unable to obtain the lock, simply wait. This method shifts the pressure of high concurrency onto the distributed lock, which is therefore put under great strain.
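A minimal sketch of the SET-NX-EX pattern that typically underlies such a lock. `FakeRedis` is an in-memory stand-in (a real redis-py client exposes the same `set(nx=..., ex=...)`/`get`/`delete` calls); production code must also handle TTL expiry and release the lock atomically, e.g. via a Lua script:

```python
import uuid

class FakeRedis:
    """In-memory stand-in for a Redis client. NX semantics are modeled;
    EX (TTL) is accepted but not enforced in this toy version."""
    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False, ex=None):
        if nx and key in self._data:
            return None              # SET ... NX fails if the key exists
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)

def acquire_lock(client, lock_key, ttl=10):
    """Try SET lock_key token NX EX ttl.
    Returns the owner token on success, None if someone else holds it."""
    token = str(uuid.uuid4())        # unique token identifies the owner
    if client.set(lock_key, token, nx=True, ex=ttl):
        return token
    return None

def release_lock(client, lock_key, token):
    """Release only if we still own the lock (check-then-delete; a real
    implementation should make this atomic with a Lua script)."""
    if client.get(lock_key) == token:
        client.delete(lock_key)
```

A thread that fails to acquire the lock waits and retries the cache read, so only one thread per key ever reaches the back-end DB.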

Origin blog.csdn.net/weixin_46011971/article/details/108897132