Redis cache and database dual-write consistency problems

First, caches, thanks to their high concurrency and high performance, are widely used in projects. On the read side there is little debate: everyone follows essentially the same flow (read the cache first, fall back to the database on a miss, then backfill the cache).

But on the write side, should we update the database and then update the cache, or delete the cache? Or should we delete the cache first and then update the database? This remains highly controversial, and there is no single comprehensive post that analyzes the different schemes. So, with some trepidation and at the risk of being flamed, I wrote this article.

Article structure

This article consists of three parts:
1. Explain the cache update strategies
2. Analyze the drawbacks of each strategy
3. Propose improvements for those drawbacks

Main text

One note up front: in theory, setting an expiration time on cached entries is the fallback that guarantees eventual consistency. Under this scheme, every cached value gets a TTL, all writes treat the database as the source of truth, and the cache operation is only best-effort. That means if the database write succeeds but the cache update fails, then once the TTL expires, subsequent read requests naturally read the new value from the database and backfill the cache. The discussion below therefore does not rely on this TTL-based fallback.
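A minimal sketch of that TTL fallback using the Jedis client (the 60-second TTL and the `updateDatabase` helper are illustrative assumptions, not part of the original post):

import redis.clients.jedis.Jedis;

public class TtlFallbackWriter {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public void write(String key, String value) {
        updateDatabase(key, value);        // the database write is the source of truth
        try {
            // best-effort cache refresh with a 60-second expiration
            jedis.setex(key, 60, value);
        } catch (Exception e) {
            // if the cache write fails, the stale entry simply expires after the TTL
            // and the next read backfills the new value from the database
        }
    }

    private void updateDatabase(String key, String value) {
        // hypothetical placeholder for the real database update
    }
}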
Here we discuss three update strategies:

  1. Update the database first, then update the cache

  2. Delete the cache first, then update the database

  3. Update the database first, then delete the cache

And nobody needs to ask why there is no "update the cache first, then update the database" strategy: if the cache is updated first and the database write then fails, the cache holds data that was never persisted, so that option is not worth discussing.

(1) Update the database first, then update the cache

This scheme is generally opposed. Why? For the following two reasons.
- Reason one (thread-safety angle)
If request A and request B both perform an update, the following interleaving can occur:
(1) Thread A updates the database
(2) Thread B updates the database
(3) Thread B updates the cache
(4) Thread A updates the cache

Request A's cache update should have happened before request B's, but because of network delays B updates the cache before A does. The cache ends up holding A's older value while the database holds B's newer value, i.e. dirty data, so this scheme is not considered.
- Reason two (business-scenario angle)
Two points:
(1) If the business writes the database often but reads relatively rarely, this scheme keeps refreshing a cache that is hardly ever read, which wastes performance.
(2) If the value written to the cache is not the raw database value but the result of a series of complex calculations, then recomputing and rewriting the cache after every single database write is also a waste of performance. Clearly, deleting the cache is more suitable.

The next question is the genuinely controversial one: delete the cache first and then update the database, or update the database first and then delete the cache?

(2) Delete the cache first, then update the database

The reason this scheme can lead to inconsistency: suppose request A performs an update while request B performs a query at the same time. The following can happen:
(1) Request A writes, so it deletes the cache
(2) Request B queries and finds the cache empty
(3) Request B queries the database and gets the old value
(4) Request B writes the old value into the cache
(5) Request A writes the new value into the database

This leaves the cache holding the old value while the database holds the new one. Worse, if no expiration time is set on the cache, the data stays dirty forever.
So how do we solve it? With a delayed double delete.
Pseudo-code:

public void write(String key, Object data) {
    // 1. Delete the cache
    redis.delKey(key);
    // 2. Update the database
    db.updateData(data);
    // 3. Sleep, then delete the cache again to clear any dirty value
    //    written back by a concurrent read during the window above
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
    redis.delKey(key);
}

In plain words:
(1) Delete the cache first
(2) Write to the database (these two steps are the same as before)
(3) Sleep for one second, then delete the cache again

This way, any dirty cache entry produced within that one second gets deleted again.
So how is this one second determined, i.e. how long should we actually sleep?
You should measure how long the read-path business logic of your own project takes, then set the write path's sleep time to that read latency plus a few hundred milliseconds. The goal is to make sure the read request has finished, so the write request can delete any dirty cache entry that the read request left behind.
What if MySQL uses a read-write-separated (master-slave) architecture?
OK, in that case the cause of inconsistency is still two concurrent requests, one update request A and one query request B:
(1) Request A writes, so it deletes the cache
(2) Request A writes the data to the master database
(3) Request B queries the cache and finds no value
(4) Request B queries a slave; master-slave replication has not finished yet, so it reads the old value
(5) Request B writes the old value into the cache
(6) Replication completes and the slave now holds the new value

That is how the data becomes inconsistent here. The fix is still the delayed double delete, except that the sleep time is now the master-slave replication delay plus a few hundred milliseconds.
What if this synchronous sleep-and-delete hurts throughput?
OK, then make the second delete asynchronous: start a separate thread (or scheduled task) to perform it. The write request no longer has to sleep before returning, which restores throughput.
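A minimal sketch of the asynchronous second delete, reusing the article's hypothetical cache/database helpers as interfaces and a scheduled executor (the one-second delay is just the example value from above):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AsyncDoubleDeleteWriter {
    // the article's hypothetical cache and database clients
    interface Cache { void delKey(String key); }
    interface Db { void updateData(Object data); }

    private final Cache redis;
    private final Db db;
    // single background thread that performs the delayed second delete
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    AsyncDoubleDeleteWriter(Cache redis, Db db) {
        this.redis = redis;
        this.db = db;
    }

    public void write(String key, Object data) {
        redis.delKey(key);        // 1. delete the cache
        db.updateData(data);      // 2. update the database
        // 3. schedule the second delete instead of sleeping in the caller
        scheduler.schedule(() -> redis.delKey(key), 1, TimeUnit.SECONDS);
    }
}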
What if the second delete fails?
That is a very good question, because if the second delete fails, the following can happen. Again there are two requests, an update request A and a query request B; for simplicity assume a single database:
(1) Request A writes, so it deletes the cache
(2) Request B queries and finds the cache empty
(3) Request B queries the database and gets the old value
(4) Request B writes the old value into the cache
(5) Request A writes the new value into the database
(6) Request A tries to delete the cache entry written by request B, and the delete fails.

In other words, if the second cache delete fails, the cache and the database become inconsistent again.
How do we solve that?
For the concrete solution, first look at how strategy (3) is handled below; the fix is given there.

(3) Update the database first, then delete the cache

First, some background. There is a well-known cache update pattern called the "Cache-Aside pattern". It works as follows (a sketch follows these points):

  • Miss: the application first reads the cache; if nothing is there, it reads the data from the database and, on success, puts it into the cache.

  • Hit: the application reads the data from the cache and returns it.

  • Update: write the data to the database first and, on success, invalidate the cache.

  • In addition, Facebook's well-known paper "Scaling Memcache at Facebook" also describes updating the database first and then deleting the cache.
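A minimal sketch of the Cache-Aside read and update paths, again with hypothetical cache and database interfaces standing in for real clients:

public class CacheAsideExample {
    // hypothetical cache and database clients
    interface Cache { Object get(String key); void set(String key, Object value); void delKey(String key); }
    interface Db { Object query(String key); void updateData(String key, Object value); }

    private final Cache redis;
    private final Db db;

    CacheAsideExample(Cache redis, Db db) {
        this.redis = redis;
        this.db = db;
    }

    // Read path: a hit returns the cached value; a miss falls back to the database and backfills.
    public Object read(String key) {
        Object value = redis.get(key);
        if (value == null) {
            value = db.query(key);
            redis.set(key, value);
        }
        return value;
    }

    // Update path: write the database first, then invalidate the cache.
    public void update(String key, Object value) {
        db.updateData(key, value);
        redis.delKey(key);
    }
}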

Doesn't this have concurrency problems too?
It does. Suppose there are two requests, a query request A and an update request B. The following can happen:
(1) The cache entry has just expired
(2) Request A queries the database and gets the old value
(3) Request B writes the new value to the database
(4) Request B deletes the cache
(5) Request A writes the old value it read into the cache

If this happens, dirty data really does appear.
But how likely is it?
It requires an inherent precondition: the database write in step (3) must take less time than the database read in step (2), so that step (4) can happen before step (5). But think about it: database reads are much faster than writes (otherwise why bother with read-write separation at all; it only makes sense because reads are faster and cheaper), so step (3) finishing before step (2) is hard to see in practice.
Suppose someone insists on being pedantic and absolutely must handle even this case. What then?
How do we solve this concurrency problem?
First, setting an expiration time on the cache is one effective measure. Second, use the asynchronous delayed-delete strategy given in (2): delete the cache again once the read request has finished.
Are there other causes of inconsistency?
Yes. This is a problem shared by update strategy (2) and update strategy (3): what happens if the cache delete fails? Then inconsistency appears again. For example, a write request updates the database but then fails to delete the cache, leaving the two inconsistent. This is also the question left open at the end of strategy (2).
How do we solve it?
Provide a retry mechanism as a safety net. Here are two schemes.
Scheme one:

The flow is as follows (a code sketch follows the list):
(1) Update the database
(2) The cache delete fails for some reason
(3) Send the key that needs to be deleted to a message queue
(4) Consume the message yourself and obtain the key to be deleted
(5) Keep retrying the delete operation until it succeeds
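A minimal sketch of scheme one, using an in-process BlockingQueue to stand in for a real message queue (the queue choice, retry loop, and interface names are assumptions for illustration):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DeleteRetryExample {
    // hypothetical cache client
    interface Cache { void delKey(String key); }

    private final Cache redis;
    // stands in for a real message queue holding keys whose delete failed
    private final BlockingQueue<String> retryQueue = new LinkedBlockingQueue<>();

    DeleteRetryExample(Cache redis) {
        this.redis = redis;
        Thread consumer = new Thread(this::consume);
        consumer.setDaemon(true);
        consumer.start();
    }

    // business code: if the delete fails, publish the key for retry
    public void invalidate(String key) {
        try {
            redis.delKey(key);
        } catch (Exception e) {
            retryQueue.offer(key);
        }
    }

    // consumer: keep retrying the delete until it succeeds
    private void consume() {
        try {
            while (true) {
                String key = retryQueue.take();
                try {
                    redis.delKey(key);
                } catch (Exception e) {
                    retryQueue.offer(key);   // put it back and try again later
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}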

However, this scheme has the drawback of intruding heavily on the business code. Hence scheme two: start a separate program that subscribes to the database binlog and extracts the operations it needs; then another non-business component consumes the information produced by that subscription program and performs the cache deletions.
Scheme two:

The flow is as follows:
(1) Update the database
(2) The database writes the operation to its binlog
(3) A subscription program extracts the required data and the key from the binlog
(4) A separate piece of non-business code obtains this information
(5) It tries to delete the cache and finds the delete has failed
(6) It sends the information to a message queue
(7) The key is read back from the message queue and the delete is retried.

Note: there is existing middleware for subscribing to binlogs, called canal, which can subscribe to MySQL binlogs. As for Oracle, I currently do not know whether ready-made middleware exists. Also, the retry mechanism above uses a message queue; if the consistency requirement is not very high, you can simply retry from another thread inside the program every so often. These details are flexible; it is just one way of thinking about it.
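A minimal sketch of that simpler in-process variant, retrying the failed delete from a background scheduler until it succeeds (the five-second retry interval and the interface name are assumptions):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class InProcessDeleteRetry {
    // hypothetical cache client
    interface Cache { void delKey(String key); }

    private final Cache redis;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    InProcessDeleteRetry(Cache redis) {
        this.redis = redis;
    }

    // called when the normal cache delete has already failed
    public void retryDelete(String key) {
        scheduler.schedule(() -> {
            try {
                redis.delKey(key);
            } catch (Exception e) {
                retryDelete(key);   // still failing: schedule another attempt
            }
        }, 5, TimeUnit.SECONDS);
    }
}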

 

 

Memory size control

Caching the full set of attributes (for example, the whole row of the user table) gives better generality and easier maintenance.

But when cache performance and space matter, caching only the attributes we actually need is enough (at the cost of worse maintainability if the table structure changes later).

Cache penetration (requests go straight to the storage layer, so the cache layer loses its purpose):

        Queries for data that does not exist in the database at all, for example product lookups with IDs that do not exist, hit the DB every time. If someone exploits this maliciously, the resulting load can put excessive direct pressure on the DB.

Solutions:

       1. When a query for some key finds no corresponding data in the database, cache a default value such as "NULL" for that key and give it an expiration time. Until that cache entry expires, all further accesses through this key are absorbed by the cache. If data for this key later appears in the DB, then once the cached placeholder expires, the next access through this key can fetch the new value. (A sketch follows the list.)

       2. The other common approach is a Bloom filter (a small amount of memory can represent a very large data set): hash all possibly-existing keys into a sufficiently large bitmap, and reject any key that is definitely not in the bitmap, shielding the underlying storage from those queries. (A Bloom filter is essentially a very long binary vector plus a set of random mapping functions; it can be used to test whether an element is in a set. Its advantage is query time and space efficiency far beyond ordinary algorithms; its disadvantages are a certain false-positive rate and difficulty of deletion.)
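A minimal sketch of the null-value caching idea from solution 1, using Jedis (the "NULL" sentinel, the 60-second TTL, and the `queryDatabase` helper are illustrative assumptions):

import redis.clients.jedis.Jedis;

public class NullValueCacheExample {
    private static final String NULL_SENTINEL = "NULL";
    private final Jedis jedis = new Jedis("localhost", 6379);

    public String get(String key) {
        String cached = jedis.get(key);
        if (cached != null) {
            // a cached sentinel means "known to be absent from the DB"
            return NULL_SENTINEL.equals(cached) ? null : cached;
        }
        String value = queryDatabase(key);
        if (value == null) {
            // cache the absence briefly so repeated misses stop hitting the DB
            jedis.setex(key, 60, NULL_SENTINEL);
            return null;
        }
        jedis.setex(key, 60, value);
        return value;
    }

    private String queryDatabase(String key) {
        return null;   // hypothetical placeholder for the real DB lookup
    }
}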

 About Bloom filters:
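A minimal sketch using the Bloom filter from Google's Guava library to screen out keys that cannot exist; the expected insertion count and false-positive rate are illustrative assumptions:

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

public class BloomFilterExample {
    // sized for roughly 1,000,000 keys with about a 1% false-positive rate
    private final BloomFilter<String> filter = BloomFilter.create(
            Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

    // load every key that really exists (e.g. all product IDs) at startup
    public void preload(Iterable<String> existingKeys) {
        for (String key : existingKeys) {
            filter.put(key);
        }
    }

    public boolean mightExist(String key) {
        // false means the key definitely does not exist:
        // reject the request without touching the cache or the DB
        return filter.mightContain(key);
    }
}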

 

Cache avalanche (large-scale cache invalidation):

      A large portion of cached entries expires at the same time, so the subsequent requests all fall on the database, which has to absorb a huge number of requests in a short time and may be knocked over.

Solutions:

     1. Stagger the expiration times of the cached keys across the system, so that a large number of keys do not all miss at the same moment;

     2. Redesign the cache access path: when querying data for a key, query the cache first; on a miss, contend for a distributed lock, and only the process that obtains the lock queries the DB and sets the cache, then releases the lock; the other processes either wait for the lock and then read the cache again, or query the DB after the lock is released (see the sketch after this list);

     3. Keep the whole Redis cluster highly available, and replace failed machines as soon as possible;

     4. Use a local cache such as ehcache plus rate limiting and degradation with hystrix, to keep MySQL from being crushed.

If the database has already been overwhelmed, you can also use Redis's persistence mechanism to restore the cached data as quickly as possible.
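A minimal sketch of the locked-rebuild idea in solution 2, using a local ReentrantLock to stand in for a real distributed lock (for example one built on Redis SETNX); the interfaces and the 60-second TTL are assumptions:

import java.util.concurrent.locks.ReentrantLock;

public class LockedRebuildExample {
    // hypothetical cache and database clients
    interface Cache { String get(String key); void setWithExpire(String key, String value, int seconds); }
    interface Db { String query(String key); }

    private final Cache redis;
    private final Db db;
    // stands in for a distributed lock shared by all application instances
    private final ReentrantLock rebuildLock = new ReentrantLock();

    LockedRebuildExample(Cache redis, Db db) {
        this.redis = redis;
        this.db = db;
    }

    public String get(String key) {
        String value = redis.get(key);
        if (value != null) {
            return value;
        }
        rebuildLock.lock();
        try {
            // re-check: another thread may have rebuilt the entry while we waited
            value = redis.get(key);
            if (value == null) {
                value = db.query(key);
                redis.setWithExpire(key, value, 60);
            }
            return value;
        } finally {
            rebuildLock.unlock();
        }
    }
}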

Cache bottomless pit:

Adding a large number of nodes to keep up with the business, yet performance does not improve and may even drop.

When the client uses one cache node, a single mget is enough; but with three cache nodes, the same batch of keys may require three mgets (more network round trips). Every additional node means another mget from the client, which adds pressure on both the client and the servers.

Moreover, the batch only completes when the slowest node has finished its mget, and that is with a parallel design; a serial design is slower still.

The lesson from this example: more machines != higher performance.

But it is not hopeless. The usual IO optimizations are:

  1. Optimize commands. For example, investigate slow commands such as keys, or hgetall on a big key.
  2. Reduce the number of network round trips. This is the optimization used most in practice: communicate as few times as possible (a sketch follows the list).
  3. Reduce access cost. For example, have the client use long-lived connections or a connection pool, NIO, and so on.
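A minimal sketch of point 2, batching several reads into one round trip with a Jedis pipeline (the keys and connection details are illustrative assumptions):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
import redis.clients.jedis.Response;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PipelineExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            List<String> keys = Arrays.asList("product:1", "product:2", "product:3");
            List<Response<String>> responses = new ArrayList<>();

            // queue all GETs locally, then flush them in a single round trip
            Pipeline pipeline = jedis.pipelined();
            for (String key : keys) {
                responses.add(pipeline.get(key));
            }
            pipeline.sync();

            for (Response<String> response : responses) {
                System.out.println(response.get());
            }
        }
    }
}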

To sum up

This article is really just a summary of the consistency schemes currently circulating on the internet. For the "delete the cache, then update the database" strategy there is also a proposal that maintains an in-memory queue to serialize operations; I looked at it and found the implementation very complex and unnecessary, so it is not presented here. Finally, I hope you got something out of it.
