Six common high-concurrency cache problems: how many do you know?

Foreword

In general, for today's Internet websites or apps, the flow of a user request can be represented by the figure below: the request starts at the interface layer (a browser or an app), is forwarded over the network to the application service, and finally reaches the storage layer, which may simply be a database or a file system; the result is then returned and rendered on the interface.

[Figure: user request flow from the interface, through the network and application service, to storage]

With the popularity of the Internet, content has become richer and more complex, and the number of users and the volume of traffic keep growing, so our applications need to support ever higher concurrency. At the same time, the computation performed by our application servers and database servers keeps increasing. However, the resources of these servers are limited and technology changes slowly: the number of requests they can accept per second, and the amount of data they can write to the database or file system, are both bounded.

How can we make effective use of these limited resources to provide the greatest possible throughput? An effective way is to introduce caching and break the standard flow shown above: at each stage, a request can fetch the target data directly from a cache and return it, which reduces the amount of computation further down the chain, effectively improves response times, and lets the limited resources serve more users. As the figure suggests, caching can actually appear at any of stages 1-4.

Cache consistency problems

When the freshness requirements on the data are very high, we need to keep the cached data consistent with the database, and we also need to keep cache nodes and their replicas consistent with one another, so that no visible divergence appears.

This relies mainly on the cache expiration and update policies. Generally, when the data changes, we actively update the value in the cache or simply remove the corresponding cache entry.
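As a minimal sketch of this idea (assuming a Redis cache accessed through the redis-py client, and a hypothetical update_db_row() data-access helper that is not part of any real library), the common "write the database first, then delete the cache entry" pattern looks roughly like this:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

def update_user(user_id, fields):
    # 1. Write the new data to the database first.
    #    update_db_row() is a hypothetical DAO helper, shown only for illustration.
    update_db_row("user", user_id, fields)
    # 2. Then invalidate the cached copy; the next read rebuilds it from the
    #    database instead of patching a possibly stale cached value.
    r.delete(f"user:{user_id}")
```

Deleting rather than rewriting the cached value means the next read reloads it from the database, which avoids leaving a stale value behind when several updates race with each other.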


Cache concurrency problems

When a cache entry expires, trying to fetch the data again from the back-end database is a perfectly reasonable process. Under high-concurrency scenarios, however, many requests may hit the database at the same time to rebuild the same entry, which puts enormous pressure on the back end and can even lead to an "avalanche".

In addition, while a cache key is being updated it may also be read by a large number of requests, which leads to consistency problems. So how do we avoid problems like these?

The natural idea is some kind of "lock" mechanism: when a cache entry is being updated or has expired, a request first tries to acquire a lock, fetches the data from the database, rebuilds the cache and then releases the lock; the other requests only sacrifice a short wait before they can read the data directly from the cache again.
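A minimal sketch of such a lock, assuming Redis via redis-py (SET with NX plus an expiry acts as the mutex) and a hypothetical load_from_db() helper:

```python
import time
import redis

r = redis.Redis()

def get_with_mutex(key, ttl=300, lock_ttl=10):
    while True:
        value = r.get(key)
        if value is not None:
            return value
        # Only one request wins the lock: NX makes the SET atomic, and the
        # expiry prevents a crashed holder from blocking everyone forever.
        if r.set(f"lock:{key}", "1", nx=True, ex=lock_ttl):
            try:
                value = load_from_db(key)   # hypothetical database loader
                r.setex(key, ttl, value)
            finally:
                r.delete(f"lock:{key}")
            return value
        # Lost the race: wait briefly, then re-check the (now rebuilt) cache.
        time.sleep(0.05)
```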


Cache penetration problems

Cache penetration is in some places also called "breakdown". Many people understand cache penetration as: cache invalidation or expiration lets a large number of requests through to the back-end database servers, causing a huge impact on the database.

That is actually a misunderstanding. Real cache penetration looks like this:

Under high concurrency, if a heavily accessed key misses the cache, then for fault-tolerance reasons every request falls back to the back-end database, so a large number of requests reach the database. When the data behind that key is itself empty, the database ends up running many unnecessary queries concurrently, which causes a huge impact and a lot of pressure.

Cache penetration can be avoided with the following common approaches:

1. Cache empty objects

Cache the result even when the query returns nothing: if it is a collection, cache an empty (non-null) collection; if it is a single object, use a marker field to tell "empty" apart from "not cached". This keeps requests from penetrating to the back-end database.

At the same time, the freshness of such cached entries still has to be guaranteed. This approach is cheap to implement and suits data that has a low hit rate but may be updated frequently.
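A minimal sketch of caching empty results, assuming redis-py and a hypothetical query_db() helper; the sentinel value and the shorter TTL for empty entries are illustrative choices:

```python
import json
import redis

r = redis.Redis()
EMPTY = json.dumps(None)   # sentinel meaning "the database has no row"

def get_user(user_id, ttl=300, empty_ttl=60):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)      # may legitimately be None
    row = query_db(user_id)            # hypothetical database query helper
    if row is None:
        # Cache the "empty" result with a shorter TTL so that a later insert
        # becomes visible quickly.
        r.setex(key, empty_ttl, EMPTY)
        return None
    r.setex(key, ttl, json.dumps(row))
    return row
```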

2. Filter separately

Keep all keys whose data may be empty in one dedicated place and intercept requests against it before they go any further, which also keeps requests from penetrating to the back-end database. This approach is relatively more complex to implement and suits data that has a low hit rate and is rarely updated.
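A common concrete form of this idea is to keep the set of keys that actually exist (often a Bloom filter) and check it before touching the cache or database. A minimal sketch that approximates the filter with an exact Redis set, assuming a preloaded "user:valid_ids" set and a hypothetical query_db() helper:

```python
import redis

r = redis.Redis()
VALID_IDS = "user:valid_ids"   # assumed to be preloaded with every existing user id

def get_user_filtered(user_id):
    # Intercept before touching the cache or database: if the id is not in
    # the known-valid set, the data cannot exist, so return immediately.
    if not r.sismember(VALID_IDS, user_id):
        return None
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return cached
    row = query_db(user_id)            # hypothetical database query helper
    if row is not None:
        r.setex(key, 300, row)
    return row
```

In practice a Bloom filter is often used instead of an exact set to keep the memory footprint small, at the cost of a small false-positive rate.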



Cache thrashing problems

Cache thrashing, sometimes called "cache jitter", can be regarded as a milder failure than an "avalanche", but it still hits the system and degrades performance for a while. It is usually caused by cache node failures, and the approach recommended in the industry is to solve it with consistent hashing.
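To make the idea concrete, here is a minimal, self-contained sketch of a consistent hash ring with virtual nodes (not any particular library's implementation): when a cache node dies, only the keys that hashed to it move to another node, instead of almost every key being remapped as with plain modulo hashing.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map keys to cache nodes so that adding or removing a node only
    remaps a small fraction of keys; virtual nodes smooth the balance."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self._ring = []     # sorted hashes of all virtual nodes
        self._nodes = {}    # virtual-node hash -> real node name
        for node in nodes:
            self.add(node)

    def _hash(self, value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            bisect.insort(self._ring, h)
            self._nodes[h] = node

    def remove(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            self._ring.remove(h)
            del self._nodes[h]

    def get(self, key):
        # Walk clockwise to the first virtual node at or after the key's hash.
        h = self._hash(key)
        idx = bisect.bisect(self._ring, h) % len(self._ring)
        return self._nodes[self._ring[idx]]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.get("user:42"))   # picks one node; most keys keep their node if another node is removed
```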


Cache avalanche

A cache avalanche means that, because of the cache, a large number of requests reach the back-end database, the database collapses and then the whole system collapses; a real disaster.

There are many possible causes. The "cache concurrency", "cache penetration" and "cache thrashing" problems described above can all end up triggering an avalanche, and they may also be exploited by malicious attackers.

There is one more situation: at some point in time, the caches the system pre-loaded all expire together, periodically and in a concentrated burst, which can also trigger an avalanche. To avoid this kind of periodic mass expiration, we can assign different expiration times so that entries expire at staggered moments instead of all at once.
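A minimal sketch of staggered expiration, assuming redis-py; the base TTL and the jitter range are illustrative numbers:

```python
import random
import redis

r = redis.Redis()

def warm_cache(entries, base_ttl=3600, jitter=600):
    """Preload entries but stagger their expirations: each key gets the base
    TTL plus a random offset, so they do not all expire at the same moment."""
    for key, value in entries.items():
        ttl = base_ttl + random.randint(0, jitter)
        r.setex(key, ttl, value)
```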

From the application-architecture point of view, we can reduce the impact with rate limiting, degradation and circuit breaking, and we can also use multi-level caches to avoid this kind of disaster.

In addition, looking at the whole development process of the system, we should strengthen stress testing, simulate real-world scenarios as closely as possible, and expose problems as early as possible so that we can guard against them.


Cache bottomless pit phenomenon

This problem was raised by Facebook engineers. Around 2010, Facebook already had as many as 3,000 memcached nodes caching thousands of gigabytes of content.

They ran into a problem: the frequency of memcached connections grew and efficiency dropped, so they added more memcached nodes. After adding them, they found that the problem caused by the connection frequency was still there and had not improved at all. This is what is known as the "bottomless pit phenomenon".


Today's mainstream databases, caches, NoSQL stores and search middleware all support "sharding", to satisfy requirements such as high performance, high concurrency, high availability and scalability.

Some shard on the client side by hashing the key and taking it modulo the instance count (or by consistent hashing) to map it to an instance, some map by value range on the client side, and some do the sharding on the server side.
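A minimal sketch of the client-side modulo approach, assuming three redis-py clients whose hostnames are placeholders:

```python
import zlib
import redis

# One client per shard; in practice the addresses come from configuration.
shards = [
    redis.Redis(host="cache-1"),
    redis.Redis(host="cache-2"),
    redis.Redis(host="cache-3"),
]

def shard_for(key):
    # Hash the key and take it modulo the shard count to pick an instance.
    return shards[zlib.crc32(key.encode()) % len(shards)]

def cache_set(key, value, ttl=300):
    shard_for(key).setex(key, ttl, value)

def cache_get(key):
    return shard_for(key).get(key)
```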

However, on a multi-instance deployment each operation may need to talk to several different nodes over the network to complete; the more nodes involved, the bigger the overhead and the larger the performance impact.

It can mainly be avoided and optimized in the following respects:

1. Data distribution

Some business data suits hash distribution while other data suits range distribution; choosing the right scheme avoids part of the network I/O overhead.

2. I/O optimization

Take advantage of connection pooling, NIO and similar techniques to reduce connection overhead as much as possible and improve the capacity for concurrent connections.
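For example, with redis-py a shared connection pool caps and reuses connections instead of opening a new one per request; a minimal sketch:

```python
import redis

# A shared pool limits the number of sockets to the cache node and lets
# concurrent workers reuse established connections.
pool = redis.ConnectionPool(host="localhost", port=6379, max_connections=50)

def get_cached(key):
    client = redis.Redis(connection_pool=pool)   # borrows a connection from the pool
    return client.get(key)
```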

3. Data access

Fetching a large data set in one go costs less network I/O than fetching many small data sets in separate requests.
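For example, one batched MGET replaces many single GET round trips; a minimal sketch assuming redis-py:

```python
import redis

r = redis.Redis()

user_ids = [1, 2, 3, 4, 5]
keys = [f"user:{uid}" for uid in user_ids]

# One MGET round trip for the whole batch.
values_batch = r.mget(keys)

# Equivalent result, but one network round trip per key.
values_loop = [r.get(k) for k in keys]
```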

Of course, the bottomless pit phenomenon is not common, and in the vast majority of companies you may never encounter it.


Finally

Feel free to share this article with others, and remember to give it a like. Thanks for your support!

