Server cache (Cache)

Caching is a technique that stores frequently accessed content in a system that is closer to the consumer and faster to access, thereby improving access speed.

How Server Caching Works

The outline is as follows:

  • Cache mode
  • Cache eviction
  • Cache breakdown
  • Cache penetration
  • Cache avalanche

Cache mode

The more common modes fall into two categories: Cache-aside and Cache-as-SoR (SoR: System of Record, i.e. the DB that holds the authoritative data). Cache-as-SoR in turn covers Read-through, Write-through, and Write-behind.

Cache-aside

Cache-aside is a relatively common cache mode. In this mode, the process of reading data can be summarized as follows:

  1. Read the cache; if the value exists, return it directly. If it does not exist, go to step 2.
  2. Read the SoR, update the cache, and return the value.

The code is as follows:

# read v1
def get(key):
    value = cache.get(key)
    if value is None:
        value = db.get(key)
        cache.set(key, value)
    return value

The process of writing data is:

  1. Write the SoR
  2. Write the cache

The code is as follows:

# write v1
def set(key, value):
    db.set(key, value)
    cache.set(key, value)

The logic looks simple, but in high-concurrency distributed scenarios there are quite a few pitfalls.

Cache-as-SoR

In the Cache-aside mode, the cache-maintenance logic has to be implemented and maintained by the business side itself. Cache-as-SoR instead puts that logic on the storage side: db + cache form a single transparent whole from the caller's point of view, so the business does not need to care about the implementation details and simply calls get/set. Common Cache-as-SoR modes include Read Through, Write Through, and Write Behind.

  • Read Through: on a read, query the cache; on a miss, the cache layer queries the SoR and updates itself, so the next access hits the cache directly (i.e. cache-aside implemented on the storage side)
  • Write Through: on a write, query the cache; if it hits, update the cache and let the cache layer update the SoR
  • Write Behind: on a write, do not update the SoR immediately; update only the cache, return right away, and update the SoR asynchronously (eventually consistent)

The Read/Write Through modes are easy to understand: the cache and the SoR are updated synchronously, and reads go to the cache first, falling back to the SoR on a miss. The main value of this family of modes is to relieve read pressure on the SoR and speed up overall response; writes are not optimized, so they suit read-heavy, write-light scenarios. With Write Behind, cache and SoR updates are asynchronous, and writes can be batched and merged during the asynchronous window, so write performance improves as well.
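As an illustration, here is a minimal Write Behind sketch (with Read Through on the read path); the in-memory cache/db objects, the queue-based buffer, and the flush interval are assumptions for this example, not taken from the original text:

# Write Behind sketch: update the cache synchronously, flush to the SoR asynchronously
import queue
import threading
import time

class WriteBehindCache:
    def __init__(self, cache, db, flush_interval=1.0):
        self.cache = cache
        self.db = db
        self.pending = queue.Queue()              # writes waiting to reach the SoR
        worker = threading.Thread(target=self._flush_loop, args=(flush_interval,), daemon=True)
        worker.start()

    def get(self, key):
        value = self.cache.get(key)
        if value is None:                         # Read Through: on a miss, load from the SoR
            value = self.db.get(key)
            self.cache.set(key, value)
        return value

    def set(self, key, value):
        self.cache.set(key, value)                # update the cache and return immediately
        self.pending.put((key, value))            # the SoR is updated asynchronously

    def _flush_loop(self, interval):
        while True:
            time.sleep(interval)
            batch = {}
            while not self.pending.empty():       # merge repeated writes to the same key
                k, v = self.pending.get()
                batch[k] = v
            for k, v in batch.items():
                self.db.set(k, v)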

The following two figures, taken from Wikipedia, are the flow charts of Write Through and Write Behind:

 

[Figures: Write Through and Write Behind flow charts]

Summary

At present, many DBs have a built-in memory-based cache that lets them respond to requests faster. For example, HBase uses a block-based cache, and MongoDB's high performance also relies on using a large amount of system memory as a cache. However, adding a layer of local cache inside the application is even more effective: it saves a great deal of network I/O, which greatly reduces the system's processing latency and at the same time reduces the pressure on the downstream cache + db.
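As a hedged illustration of that point, here is a minimal two-level read path with an in-process dict in front of redis; the local TTL, the key layout, and the redis client are assumptions for this sketch:

# Two-level read sketch: process-local dict in front of a shared redis cache
import time

import redis

r = redis.StrictRedis()
local_cache = {}                     # key -> (value, expire_at)
LOCAL_TTL = 5                        # short local TTL keeps in-process copies from going stale

def get(key):
    entry = local_cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]              # served from process memory, no network I/O
    value = r.get(key)               # fall back to the shared redis cache
    if value is not None:
        local_cache[key] = (value, time.time() + LOCAL_TTL)
    return value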

Cache eviction

Cache eviction is an old topic, and there are only a few commonly used eviction strategies, such as FIFO, LFU, and LRU; LRU in particular is regarded as the default choice. Of course, depending on the business scenario, other strategies may fit better.

A FIFO eviction strategy is usually implemented with a Queue + Dict; after all, a queue is inherently first-in, first-out. New cache objects are appended to the tail of the queue, and when the queue is full the object at the head of the queue is dequeued and expired.
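A minimal FIFO sketch along those lines; the class name and default size are illustrative:

# FIFO eviction sketch: Queue (deque) + Dict
from collections import deque

class FIFOCache:
    def __init__(self, max_size=1024):
        self.max_size = max_size
        self.queue = deque()             # keys in insertion order
        self.data = {}                   # key -> value

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        if key not in self.data:
            if len(self.data) >= self.max_size:
                oldest = self.queue.popleft()   # expire the head of the queue
                del self.data[oldest]
            self.queue.append(key)
        self.data[key] = value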

The core idea of LFU (Least Frequently Used) is that the least frequently used data is evicted first: the number of times each object is accessed is counted, and when eviction is needed the entry with the lowest count is removed. LFU is therefore usually implemented with a min heap + Dict. Because each min-heap update costs O(log n), LFU runs in O(log n), slightly worse than the O(1) of FIFO and LRU.
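A minimal min-heap-based LFU sketch, using lazy invalidation of outdated heap entries; the class name and default size are illustrative:

# LFU eviction sketch: min heap + Dict, with lazy invalidation of stale heap entries
import heapq
import itertools

class LFUCache:
    def __init__(self, max_size=1024):
        self.max_size = max_size
        self.data = {}                   # key -> [value, freq]
        self.heap = []                   # (freq, tie_breaker, key); may contain stale tuples
        self.counter = itertools.count()

    def _push(self, key, freq):
        heapq.heappush(self.heap, (freq, next(self.counter), key))   # O(log n)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        entry[1] += 1                    # bump the access count
        self._push(key, entry[1])
        return entry[0]

    def set(self, key, value):
        if key in self.data:
            self.data[key][0] = value
            return
        while len(self.data) >= self.max_size:
            freq, _, victim = heapq.heappop(self.heap)
            # skip tuples whose frequency has changed since they were pushed
            if victim in self.data and self.data[victim][1] == freq:
                del self.data[victim]
        self.data[key] = [value, 1]
        self._push(key, 1)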

LRU (Least Recently Used) is based on the principle of locality: if data has been used recently, it is very likely to be used again soon; conversely, if data has not been used for a long time, the probability that it will be used in the future is low.

LRU eviction is usually implemented with a doubly linked list + Dict (the linked list used in production is generally doubly linked). The most recently accessed data is moved from its original position to the head of the list, so the head of the list holds the most recently used data and the tail holds the data that has gone unused the longest; the entry to evict can therefore be found in O(1).

# LRU cache eviction, rough logic, lock-free version
# (Node and DoubleLink are assumed doubly-linked-list helpers)

max_size = 1024
data_dict = dict()      # key -> list node
link = DoubleLink()     # doubly linked list, head = most recently used

def get(key):
    node = data_dict.get(key)
    if node is None:
        return None
    link.MoveToFront(node)      # mark as most recently used
    return node.value

def add(key, value):
    node = Node(key, value)
    data_dict[key] = node       # keep the dict and the list in sync
    link.PushFront(node)
    if link.size() > max_size:
        tail = link.back()      # the least recently used entry sits at the tail
        del data_dict[tail.key]
        link.remove_back()

PS

  1. The lru_cache implementation in Python 3's functools (a short usage example follows this list)
  2. An LRU cache implementation in Golang
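For reference, a minimal usage example of the built-in functools.lru_cache decorator mentioned above; the decorated function and the stand-in db dict are illustrative:

# functools.lru_cache usage example
from functools import lru_cache

db = {}                            # stand-in for the real store used in the snippets above

@lru_cache(maxsize=128)            # keep at most 128 distinct results in the LRU cache
def get_user(user_id):
    return db.get(user_id)

get_user(42)                       # first call reads the store and caches the result
get_user(42)                       # second call is served from the cache
print(get_user.cache_info())       # hits, misses, maxsize, currsize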

Cache breakdown

In a high-concurrency scenario (such as a flash sale), if a key expires at some moment while a large number of requests are hitting that key at the same time, those requests all fall directly on the downstream DB. This is cache breakdown, and the resulting pressure can easily take the DB, and the business on top of it, down in one wave.

In this case, a common way to protect the downstream is to serialize access to the downstream DB with a mutex: the thread/process that acquires the lock reads the DB and updates the cache, while the processes that fail to acquire the lock retry the whole get logic.

This logic can be implemented with redis's set method (NX + EX as the lock) as follows:

# read v2
import time

import redis

r = redis.StrictRedis()
# cache and db are the same objects as in read v1

def get(key, retry=3):
    def _get(k):
        value = cache.get(k)
        if value is None:
            if r.set(k, 1, ex=1, nx=True):   # lock, auto-expires after 1s
                value = db.get(k)
                cache.set(k, value)
                return value, True
            else:
                return None, False            # someone else holds the lock
        else:
            return value, True

    while retry:
        value, flag = _get(key)
        if flag:
            return value
        time.sleep(1)    # failed to acquire the lock, retry after a short sleep
        retry -= 1
    raise Exception("get failed")

Cache penetration

When the requested data simply does not exist, such non-existent data is generally never written to the cache, so every request for it goes straight to the downstream db. When the volume of such requests is large, this also puts the downstream db at risk.

Solutions:

  1. Consider caching this kind of data for an appropriately short time, storing the empty result as a special sentinel value (a minimal sketch follows this list).
  2. A more rigorous approach is a BloomFilter. A BloomFilter never misses when testing whether a key exists (if the BloomFilter says the key does not exist, it definitely does not exist), but it can report false positives (if the BloomFilter says the key exists, the key may still be absent). HBase uses BloomFilters internally to quickly rule out rows that do not exist.
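A minimal sketch of option 1, remembering misses as a sentinel for a short TTL; the sentinel value and the 60-second TTL are illustrative, and r/db are the same assumed clients as in the earlier snippets:

# Cache the absence of a key as a sentinel for a short time
NULL_SENTINEL = b"__NULL__"              # redis returns bytes by default

def get(key):
    value = r.get(key)
    if value is not None:
        return None if value == NULL_SENTINEL else value
    value = db.get(key)
    if value is None:
        r.set(key, NULL_SENTINEL, ex=60)   # remember the miss briefly
        return None
    r.set(key, value)
    return value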

Penetration prevention based on BloomFilter :

# read v3
import time

import redis

r = redis.StrictRedis()
# cache and db are the same objects as in read v1

def get(key, retry=3):
    def _get(k):
        value = cache.get(k)
        if value is None:
            if not Bloomfilter.get(k):
                # On a cache miss, check the Bloomfilter first;
                # the Bloomfilter must be updated in the same transaction as db writes
                return None, True
            if r.set(k, 1, ex=1, nx=True):   # lock
                value = db.get(k)
                cache.set(k, value)
                return value, True
            else:
                return None, False
        else:
            return value, True

    while retry:
        value, flag = _get(key)
        if flag:
            return value
        time.sleep(1)
        retry -= 1
    raise Exception("get failed")

Cache avalanche

When, for some reason such as simultaneous expiration or a restart, a large number of cached entries become invalid at the same time, a flood of requests hits the downstream service or DB directly, creating enormous pressure and possibly a crash. This is a cache avalanche.

Simultaneous expiration is often caused by a cold start or a traffic surge: a large amount of data is written to the cache within a very short period with the same TTL, so it all expires at roughly the same time.

Solutions:

  1. A relatively simple method is randomized expiration, i.e. set each entry's expiration time to expire + random (a minimal sketch follows this list).
  2. A better solution is to build a second-level cache, for example a local_cache + redis scheme, or the redis + redis two-level pattern from the cache design discussed earlier.
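A minimal sketch of randomized expiration; the base TTL, the jitter range, and the redis client are illustrative assumptions:

# Randomized expiration: expire + random
import random

BASE_TTL = 600                       # base expiration in seconds
JITTER = 120                         # spread expirations over an extra 0-120 seconds

def set_with_jitter(key, value):
    ttl = BASE_TTL + random.randint(0, JITTER)
    r.set(key, value, ex=ttl)        # r is the redis client used in the snippets above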

Beyond that, have a sensible degradation plan. In a high-concurrency scenario, once it is detected that the load may affect (or already affects) the resource, protect downstream resources with rate limiting and degradation so they are not overwhelmed and made unavailable, rebuild the cache gradually while the limit is in place, and lift the rate limit and degradation once the cache has recovered.
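As a hedged illustration of that idea, a minimal degradation sketch that uses a semaphore as a crude concurrency limit on db reads; the limit, the fallback value, and the cache/db objects are assumptions:

# Degrade instead of overwhelming the db: cap concurrent db reads, serve a fallback otherwise
import threading

db_slots = threading.BoundedSemaphore(value=50)   # at most 50 concurrent db reads

def get_with_degrade(key, fallback=None):
    value = cache.get(key)
    if value is not None:
        return value
    if not db_slots.acquire(blocking=False):      # limiter saturated: degrade
        return fallback                           # serve a default instead of hitting the db
    try:
        value = db.get(key)
        cache.set(key, value)                     # rebuild the cache while limiting
        return value
    finally:
        db_slots.release()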


Origin: blog.csdn.net/gtncwy/article/details/80758511