How the cache is designed

The problem:

When the system receives a large number of requests and each one has to hit the database (a query, say), then for data that rarely changes, going to the database every time wastes a lot of performance.

This is especially true for operations over massive data sets: if everything is loaded from the DB, it will test users' patience.

A simple analogy: suppose we go to a residential community to find out whether someone is at home. With no way to communicate, we have to pass the community's security gate each time, walk to the specific building, and only when we arrive at the apartment door do we finally know whether anyone is home.

If we instead have a better security guard, one who already knows whether anyone is home in each apartment, then by asking the guard directly we avoid the wasted trip and naturally improve efficiency.

Here, the excellent guard is essentially a cache: every access goes to the cache first, which greatly improves access efficiency and system performance.

Clearly, having an excellent security guard matters a lot.

What basic cache design looks like

Cache design likewise aims to fix the system's inefficiency, so that the system can achieve high performance and high concurrency.

For example, directly accessing a single-machine database such as MySQL gives on the order of thousands of QPS, while accessing a cache can reach tens or even hundreds of thousands. That gap is not marginal; it is a qualitative leap.

Cache design mostly comes down to the order in which the DB and the cache are operated on, and who performs the operation. It is roughly divided into the following four modes:

  • Cache Aside
  • Read Through
  • Write Through
  • Write Behind Caching

Of these four modes, Cache Aside is the most commonly used; we will discuss it in detail later.

The other three modes mean the following.

Read Through

  • The cache is updated during read operations: on a cache miss, the cache server itself loads the data from the database into the cache.
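As a rough illustration, the miss handling described above can be sketched with a tiny in-process class. `ReadThroughCache` and its `loader` argument are made-up names, and a plain dict stands in for the real database; this is a sketch of the idea, not a real cache server API:

```python
# Read Through sketch: callers only talk to the cache layer; on a miss,
# the cache itself loads the value from the backing store via `loader`.
class ReadThroughCache:
    def __init__(self, loader):
        self._store = {}       # the cache's own storage
        self._loader = loader  # invoked by the cache on a miss

    def get(self, key):
        if key not in self._store:
            # Cache miss: the cache loads the data itself and keeps it.
            self._store[key] = self._loader(key)
        return self._store[key]

db = {"product:42": "widget"}           # stand-in for the database
cache = ReadThroughCache(loader=db.get)
print(cache.get("product:42"))  # miss: loaded by the cache, then stored
print(cache.get("product:42"))  # hit: served straight from the cache
```

The point of the pattern is that the application never touches the database for reads; the cache layer owns the loading logic.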

Write Through

  • When updating the database, if the cache is hit, the cache is updated first and the cache server then updates the database itself.
  • If the cache is not hit, the database is updated directly.
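A minimal sketch of this write path, assuming a dict as the backing database; the class and method names are illustrative only:

```python
# Write Through sketch: the application writes through the cache layer,
# which updates the cached copy on a hit and writes the database
# synchronously in either case, so the two never diverge.
class WriteThroughCache:
    def __init__(self, db):
        self._store = {}  # the cache's own storage
        self._db = db     # backing database (a dict stands in here)

    def get(self, key):
        return self._store.get(key)

    def put(self, key, value):
        if key in self._store:
            # Cache hit: update the cached copy first...
            self._store[key] = value
        # ...then the cache layer writes the database synchronously.
        # On a miss, only the database is updated.
        self._db[key] = value

db = {"k": 1}
cache = WriteThroughCache(db)
cache._store["k"] = 1   # pretend "k" was cached by an earlier read
cache.put("k", 2)       # hit: cache and database both become 2
print(cache.get("k"), db["k"])
```

Because `put()` only returns after the database write, writes are slower than Write Behind but consistency is preserved.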

Write Behind Caching, as the name suggests, does its work after the write lands in the cache. In fact this mode only updates the cache and does not update the database synchronously; the cache server asynchronously flushes the data to the database in batches.

Obviously this mode is faster, but keeping the database and the cache consistent is its weak point, and data can even be lost, for example if our cache server goes down.
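The asynchronous batch flush, and the lag it creates between cache and database, can be sketched like this. The class name, the flush interval, and the dict-as-database are all made up for illustration; a real implementation would also need durable handling of the dirty set, which is exactly the data-loss risk mentioned above:

```python
import threading, time

# Write Behind sketch: writes touch only the cache; a background thread
# flushes dirty entries to the database in batches, asynchronously.
class WriteBehindCache:
    def __init__(self, db, flush_interval=0.05):
        self._store = {}     # cache contents
        self._dirty = set()  # keys written since the last flush
        self._db = db        # backing database (a dict here)
        self._lock = threading.Lock()
        threading.Thread(target=self._flusher, args=(flush_interval,),
                         daemon=True).start()

    def put(self, key, value):
        with self._lock:
            self._store[key] = value  # only the cache is updated here
            self._dirty.add(key)      # mark for the next batch flush

    def _flusher(self, interval):
        # Periodically push all dirty entries to the database in one batch.
        while True:
            time.sleep(interval)
            with self._lock:
                batch = {k: self._store[k] for k in self._dirty}
                self._dirty.clear()
            self._db.update(batch)

db = {}
cache = WriteBehindCache(db)
cache.put("a", 1)
print(db.get("a"))  # right after put, the database usually still lags behind
time.sleep(0.2)     # give the flusher time to run a cycle
print(db.get("a"))
```

If the process dies between `put()` and the next flush, the dirty entries vanish with it, which is the failure mode the text warns about.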

How the Cache Aside read/write pattern works

So how does the Cache Aside pattern handle reads and writes? Let's take a look.

The read and write logic of the Cache Aside pattern looks like this:

When reading data:

  • Read the cache first; if the data is there, return it directly.
  • If the cache has no data, read it from the database and backfill the cache with the result.

When writing data:

  • Write to the database; on success, delete the corresponding cache entry.
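The steps above can be sketched with two dicts standing in for the cache server (e.g. Redis) and the database; `read` and `write` are illustrative helpers, not a real client API:

```python
# Cache Aside sketch: the application itself coordinates cache and DB.
cache = {}
db = {"user:1": "alice"}

def read(key):
    # 1. Try the cache first; on a hit, return immediately.
    if key in cache:
        return cache[key]
    # 2. On a miss, read the database and backfill the cache.
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value

def write(key, value):
    # Write the database first, then delete the cache entry, so the
    # next read repopulates the cache with fresh data.
    db[key] = value
    cache.pop(key, None)

print(read("user:1"))   # miss: loads "alice" from the db and caches it
write("user:1", "bob")  # db updated, cache entry invalidated
print(read("user:1"))   # miss again: returns the fresh value "bob"
```

Note that the write path deletes the cache entry rather than updating it; the next read repairs the cache, which keeps the write path simple.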

A careful reader might raise this question: suppose a read finds nothing in the cache and goes to the database; while that query is in flight, a write updates the database and then deletes the cache entry. The read then puts the value it fetched, which is by now the old data, into the cache. At that point the cache holds stale data, which can be considered dirty data.

In practice we need not worry much about this: it has been argued that the probability of this happening is extremely low.

That is because the write has to lock the table, and we know that database writes are slower than reads. So when a read and a write to the DB run concurrently, the write naturally returns its result later. (No quibbling here about the read and the write touching different amounts of data; the comparison is naturally made under equal conditions.)

As the figure shows, under equal conditions, when a DB read starts first and a DB write arrives while it is in flight, the read returns first and the write returns its result afterwards.

In fact, another approach used here is: when writing data, on a successful write, also sync the data into the cache.

With that variant, there are actually two paths from the database into the cache: reads and writes. In practice we can add a distributed lock to handle this, guaranteeing that when the database is written, the cache is written at the same time before the data becomes accessible; the same applies to DB reads.
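A sketch of this variant, with `threading.Lock` standing in for a real distributed lock (such as one built on Redis `SETNX`); a real deployment needs a lock shared across processes, which this single-process stand-in is not:

```python
import threading

# "Write DB and cache together under a lock" sketch. Both the read path
# and the write path take the same lock, so a reader can never backfill
# the cache while a write is halfway through.
lock = threading.Lock()
cache = {}
db = {}

def write(key, value):
    with lock:
        db[key] = value     # write the database...
        cache[key] = value  # ...and sync the cache before releasing

def read(key):
    with lock:
        if key in cache:        # cache hit
            return cache[key]
        value = db.get(key)     # miss: load from the DB
        if value is not None:
            cache[key] = value  # backfill under the same lock
        return value

write("user:1", "alice")
print(read("user:1"))
```

The lock serializes cache/DB updates and closes the stale-backfill race, at the cost of reduced concurrency on hot keys.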

What problems does caching bring?

Besides high performance and high concurrency, introducing a cache naturally also brings some problems, for example:

  • Cache breakdown (a hot key expiring)
  • Cache penetration
  • Cache avalanche

All three of the above happen because the cache layer's line of defense has failed, letting external requests hit the database in various forms and for various reasons. For the details of when cache breakdown, penetration, and avalanche occur and how to handle them, see the earlier article on Redis cache penetration, breakdown, and avalanche.

Thanks for reading, and feel free to discuss. Leave a like and a follow before you go.



Origin juejin.im/post/7120941198042693662