Redis in Practice

To improve system throughput, we often introduce a caching layer into the business architecture.

  • Cache penetration
  • Collection caches
  • Hot-data caches
    • Using locks to guarantee strong consistency
    • Optimistic strategy
  • Rename
    • Generating temporary keys
  • SortedSet
    • Delay queues
    • Sliding window
  • Basic knowledge

Cache penetration

To keep invalid data from occupying cache space, we usually do not store empty objects in the cache, but this strategy gives rise to the cache-penetration problem.

When a query asks for data that does not exist, the data naturally cannot be found in the cache, and by the cache-miss logic the request goes on to the database. Every query for nonexistent data therefore reaches the database, a phenomenon known as cache penetration.

To reduce these meaningless database accesses, we can cache a placeholder indicating that the data does not exist.

Deleted data is considerably more likely to be queried than data that never existed, so when data is deleted we should put a "deleted" placeholder into the cache.

Collection caches

Redis offers List, Hash, Set, SortedSet and other collection data structures; caches built on them we will call collection caches.

Collection caches typically have complex update logic (or consistency that is hard to guarantee) but simple rebuild logic, although rebuilding the cache may put heavy pressure on the database.

Counters share these characteristics, complex update logic but a simple, database-heavy rebuild, so the author classifies them as collection caches too. Counters are especially complex when the counted object has a complicated state machine, for example a user's total article count versus the count of publicly visible articles.

Take an article's comment list as an example. When the Redis comment-list cache is empty, there are two possible reasons:

  • The cache has expired
  • The article has no comments

If the comment list is missing from the cache when we try to update it after a new comment, we must determine whether this is a cache miss or the article simply had no comments. Do not blindly insert the comment with an LPUSH or ZADD instruction.

Elements in a collection cache should be immutable objects or object IDs. Continuing with the comment list: if we store serialized comment objects directly in the List or SortedSet, we can locate a comment only by knowing every field of the object. After a comment is edited, we can no longer find it, and modifying the original content in place is difficult. And if a comment lives in several collection caches, we have to modify all of them.

Besides, a full object takes far more bytes than a comment ID; when the data must be stored in multiple places, using IDs saves a lot of memory.
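A minimal sketch of this ID-based layout, with plain Python structures standing in for Redis (the ID list would be a Redis List manipulated with LPUSH/LRANGE, and the per-comment lookup an MGET over per-ID keys); all names here are illustrative.

```python
# Comment bodies live in one place keyed by ID; the per-article list
# cache stores only the IDs, so an edit touches a single entry.
comments = {                      # stand-in for per-comment string keys
    101: {"id": 101, "text": "first!"},
    102: {"id": 102, "text": "nice article"},
}
article_comment_ids = [102, 101]  # stand-in for a Redis List of IDs

def list_comments():
    # Resolve IDs to bodies, like an MGET over "comment:<id>" keys.
    return [comments[i] for i in article_comment_ids]

def edit_comment(cid, text):
    # One write updates every list that references this comment.
    comments[cid]["text"] = text
```

Editing a comment now touches one entry, no matter how many cached lists reference its ID.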

Hot-data caches

In real business we frequently have to handle cache expiry for hot data. Hot data receives a huge volume of concurrent reads; once the cache expires, a large number of threads may hit the database at once, possibly slowing responses or even bringing the database down.

In some scenarios hot data is also written frequently. With an update-the-cache-on-write policy this is usually not a problem. If instead we delete the cache on every update, the rapid updates of hot data cause frequent cache deletions, producing a flood of cache misses and further errors. And with a delete-the-cache-then-update-the-database policy, the heavy concurrent reads are very likely to write stale data back into the cache.

If the hot data is a collection cache such as a Set or SortedSet, we may not be able to rebuild it with a single atomic command, so we have to consider how to keep the rebuild process thread-safe.

Depending on how strong the consistency requirement for the hot data is, we have two kinds of strategy.

Using locks to guarantee strong consistency

For scenarios with strong consistency requirements we can use a distributed lock service: a read request must acquire a read lock before accessing the data, and a write request must acquire a write lock before updating it.

When a cache miss occurs, the distributed lock service guarantees that exactly one thread acquires the write lock and rebuilds the cache, while the other reader threads block because they cannot get the lock, until the rebuild completes. This avoids the database pressure caused by many threads rebuilding the cache redundantly, but it cannot avoid the slow responses.

Just as calling getInstance() from multiple threads in the Singleton pattern can create duplicate objects, rebuilding the cache under a lock has a similar problem. Thread A sees a cache miss and takes the write lock to rebuild; before the rebuild finishes, thread B also reads the cache, still sees it missing, and tries to take the write lock too. Since the write lock is held by thread A, thread B blocks until the rebuild is done and only then gets the write lock. Because the cache has already been rebuilt, letting thread B rebuild it again is pointless overhead.

The Check-Lock-Check strategy familiar from the Singleton pattern solves this problem:

try {
  acquire read lock
  read cache
} finally {
  release read lock
}
if (cache missed) {
  try {
    acquire write lock
    read cache
    if (cache still missed) {
      rebuild cache
    }
  } finally {
    release write lock
  }
}

Because the write lock protects the rebuild, we need not worry about thread-safety issues while rebuilding the cache.

Optimistic strategy

When a hot-data cache expires, we can first write a placeholder into the cache and then perform the rebuild. Other threads that read the placeholder return an empty result without touching the database, which also avoids the harm that massive thread blocking could cause.

The placeholder does not guarantee that only one thread accesses the database: before thread A has written the placeholder, thread B may already have seen the cache miss and entered the rebuild process.

If we cannot make the rebuild atomic, we can perform it on a temporary key and then use the RENAME command to atomically swap it in as the official key visible to all threads.

Rename

Although Redis commands are atomic, we often meet operations that a single command cannot complete. Besides using distributed locks to make a complex process thread-safe, in some scenarios we can use the RENAME command to reduce the overhead.

A typical scenario is the one mentioned above: when a cache rebuild or update cannot be done atomically, the current thread can perform the operation on a private temporary key and then use RENAME to atomically swap it in as the official key visible to all threads.

Another common scenario is writing dirty data into a Set or Hash and consuming it asynchronously with the SSCAN or HSCAN command. SSCAN only guarantees that an element present in the set from the start to the end of the traversal is returned at least once; if new data is added during the traversal, elements may be returned repeatedly or missed.

We can have the asynchronous thread RENAME the dirty-data set to a private temporary key; while the asynchronous thread iterates over its private dirty set, other threads can keep adding data to the live dirty-data set.
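A sketch of this rename-then-scan pattern. A dict of sets stands in for Redis: `rename` mimics RENAME's atomic key move, and the loop over the private set plays the role of SSCAN. The key names and functions are illustrative.

```python
store = {"dirty": {"a", "b"}}   # stand-in for Redis keys -> sets

def rename(src, dst):
    # RENAME atomically moves the value; the old key stops existing.
    store[dst] = store.pop(src)

def drain_dirty():
    # Move the live dirty set to a private key, so writers can keep
    # adding to a fresh "dirty" set while we iterate safely.
    rename("dirty", "{dirty}-tmp")
    processed = set()
    for item in store["{dirty}-tmp"]:   # plays the role of SSCAN
        processed.add(item)
    del store["{dirty}-tmp"]            # done: drop the private set
    return processed
```

Writers that add data after the rename simply recreate the live "dirty" set, untouched by the traversal.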

Generating temporary keys

In a cluster environment, the RENAME and RENAMENX commands may only be supported when both keys are in the same slot. We can therefore use the hash-tag mechanism to ensure the temporary key lands in the same slot as the original key.

If the original key is "original", we can generate the temporary key "{original}-1". The braces mean that only the substring inside them determines the hash slot, so "{original}-1" maps to the same slot as "original".

The point of a temporary key is to let a single thread operate on it free of concurrency issues, so we must check whether the temporary key is already occupied by another thread.

There are two strategies for generating temporary keys:

  • Original key plus a random value, e.g. "{original}-kGi3X1": the collision probability is low, but it is hard to scan the database for such temporary keys
  • Original key plus a counter, e.g. "{original}-1", "{original}-2": it is easy to scan the database for temporary keys, but the collision probability is higher.

Checking that a temporary key does not exist before using it is not safe: between thread A's check and its actual use of the key, another thread may run the same check, find the key available, and take it.

To avoid temporary-key conflicts, we can first try to set a placeholder. For example, before using "{original}-1", first execute "SETNX {original}-1-lock"; only if that succeeds can "{original}-1" be used safely. In practice this amounts to a simple distributed lock.

The random-value strategy should be used for cache rebuilds and updates, where conflicts must be minimized. The counter strategy should be used for dirty-data traversal: if a traversal is interrupted, we can scan for the unreleased temporary keys by counter and resume it.
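The counter strategy and the SETNX-style claim can be sketched together. Here `claimed` stands in for the "...-lock" keys that SETNX would create in Redis (where the claim is atomic; this single-threaded sketch only shows the key layout), and all names are illustrative.

```python
claimed = set()   # stand-in for the "{key}-<n>-lock" keys set via SETNX

def temp_key(key, n):
    # The hash tag {key} pins the temporary key to the same cluster
    # slot as the original key, so RENAME between them stays legal.
    return "{%s}-%d" % (key, n)

def acquire_temp_key(key):
    # Walk the counter until a claim succeeds. In Redis the claim is
    # one atomic SETNX; a bare existence check would race with other
    # threads performing the same check.
    n = 1
    while temp_key(key, n) in claimed:
        n += 1
    claimed.add(temp_key(key, n))
    return temp_key(key, n)
```

Because the keys are numbered, an interrupted traversal can later be found again by scanning the counters in order.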

SortedSet

SortedSet is the only Redis data structure that is both sorted and range-searchable, which gives it some rather flexible applications.

Delay queues

In scenarios without strong consistency requirements, a SortedSet can act as a delay queue: use the message content as the member and the scheduled execution time, as a UNIX timestamp, as the score.

A polling thread calls ZRANGEBYSCORE to fetch the messages whose scheduled time is earlier than the current time and delivers them to consumer processes.

127.0.0.1:6379> ZADD DelayQueue 1554728822 msg
(integer) 1
127.0.0.1:6379> ZRANGEBYSCORE DelayQueue 0 1554728933 WITHSCORES
1) "msg"
2) "1554728822"
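The polling step can be sketched in Python, with a sorted list of (score, member) pairs standing in for the SortedSet: `zadd` plays ZADD, and `poll_due` plays the ZRANGEBYSCORE query plus the removal of delivered messages. Names are illustrative.

```python
import bisect

queue = []   # sorted list of (score, member), stand-in for a SortedSet

def zadd(score, member):
    bisect.insort(queue, (score, member))   # keep members ordered by score

def poll_due(now):
    # ZRANGEBYSCORE DelayQueue 0 <now>: take every message whose
    # scheduled timestamp is not later than the current time...
    due = [m for s, m in queue if s <= now]
    # ...then remove the delivered messages from the queue.
    del queue[:len(due)]
    return due
```

Messages scheduled in the future simply stay in the queue until a later poll.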

Because of Redis's persistence mechanism, we cannot build any strongly consistent queue service on top of Redis.

Do not use Redis as a message queue in business scenarios with strong consistency requirements.

Sliding window

In business scenarios such as trending searches or rate limiting, we need to quickly find, for example, the most-searched keywords of the last hour.

As with the delay queue, use the keyword as the SortedSet member and a UNIX timestamp as its score.

Use the ZRANGEBYSCORE command to query the events that occurred within a given period, and the ZREMRANGEBYSCORE command to remove data that has expired out of the window.
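A sketch of the sliding window used as a counter: a list of (timestamp, event) pairs stands in for the SortedSet, the trimming step mimics ZREMRANGEBYSCORE, and the count mimics ZRANGEBYSCORE over the window. The window length and all names are illustrative.

```python
WINDOW = 3600   # one-hour window, in seconds

events = []     # (timestamp, event) pairs, stand-in for a SortedSet

def record(ts, event):
    events.append((ts, event))          # ZADD the event with its timestamp

def count_in_window(now):
    # ZREMRANGEBYSCORE: drop everything at or before the window start...
    events[:] = [(t, e) for t, e in events if t > now - WINDOW]
    # ...then count what remains inside the window (ZRANGEBYSCORE).
    return len(events)
```

The same shape works for rate limiting: reject a request when the count inside the window exceeds the quota.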

Basic knowledge

Readers of this article are assumed to have some experience using Redis as a cache, so these basics are placed last to avoid wasting your time.

  1. IO operations typically cost far more than CPU computation. Use batch commands such as MGET or the Pipeline mechanism to reduce IO time wherever possible, and never read or write Redis inside a loop
  2. Redis uses a single-threaded model on top of the kernel's IO multiplexing, which guarantees that command execution is atomic and serialized. (This is still true of Redis 4.0 as of this writing; later versions may introduce a multithreaded core)
  3. Redis's RDB and AOF are both asynchronous persistence mechanisms, so Redis cannot guarantee that no data is lost after a crash. Do not use Redis for business scenarios with strong consistency requirements.


Origin www.linuxidc.com/Linux/2020-04/162788.htm