Redis and MySQL Double-Write Consistency: Schemes Explained

I. Introduction

Caches, thanks to their high concurrency and high performance, are widely used in projects. On the read side there is little debate: everyone follows the flow in the figure below.

[Figure: cache read flow]

Updating is another matter. After updating the database, should you update the cache or delete it? Or should you delete the cache first and then update the database? This is where the real controversy lies.

This article does three things: it explains the cache update strategies, analyzes the drawbacks of each, and proposes remedies for those drawbacks.

II. Update Strategies and Consistency

Let me note first that, in theory, setting an expiration time on cached entries is the scheme that guarantees eventual consistency. Under that scheme, every cached entry carries an expiration time; writes to the database are authoritative, and cache operations are only best-effort. That is, if the database write succeeds but the cache update fails, then once the entry expires, subsequent read requests will naturally read the new value from the database and backfill the cache. The strategies discussed below therefore do not depend on setting an expiration time. We discuss three update strategies:
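The expiration-time fallback can be sketched with a tiny TTL cache. This is purely illustrative: the maps stand in for Redis, the `setex`/`get` names are borrowed loosely from Redis SETEX semantics, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

public class TtlCache {
    // value plus absolute expiry time, loosely mimicking Redis SETEX
    static final Map<String, String> values = new HashMap<>();
    static final Map<String, Long> expiresAt = new HashMap<>();

    static void setex(String key, String value, long ttlMillis, long now) {
        values.put(key, value);
        expiresAt.put(key, now + ttlMillis);
    }

    // A read after expiry misses, forcing a reload from the database;
    // this is what bounds how long a stale entry can survive.
    static String get(String key, long now) {
        Long exp = expiresAt.get(key);
        if (exp == null || now >= exp) return null; // expired: treat as a miss
        return values.get(key);
    }

    public static void main(String[] args) {
        setex("k", "stale", 1000, 0);
        System.out.println(get("k", 500));   // still live: prints stale
        System.out.println(get("k", 1500));  // expired: prints null
    }
}
```

Even if every delete or update of the cache fails, a stale entry lives at most one TTL before a read falls through to the database.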

  • 1. Update the database first, then update the cache
  • 2. Delete the cache first, then update the database
  • 3. Update the database first, then delete the cache

III. Update the Database First, Then Update the Cache

This scheme is generally opposed. Why? Two reasons.

Reason one (thread-safety angle): if request A and request B both perform an update concurrently, the following interleaving can occur:

  • (1) thread A updates the database
  • (2) thread B updates the database
  • (3) thread B updates the cache
  • (4) thread A updates the cache

Request A's update logically precedes request B's, but for network reasons B finishes updating the cache before A does, and A's late cache write then overwrites B's. The database now holds B's newer value while the cache holds A's older one. That is dirty data, which is why this scheme is rejected.
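With plain HashMaps standing in for the database and the cache (an illustrative sketch, not real client code), the interleaving above can be replayed deterministically to show the mismatched final state:

```java
import java.util.HashMap;
import java.util.Map;

public class UpdateCacheRace {
    static final Map<String, String> db = new HashMap<>();
    static final Map<String, String> cache = new HashMap<>();

    // Replays the four steps from the text in order; a single thread is
    // enough, since only the ordering of the operations matters.
    static boolean replay() {
        db.put("k", "A");      // (1) thread A updates the database
        db.put("k", "B");      // (2) thread B updates the database
        cache.put("k", "B");   // (3) thread B updates the cache
        cache.put("k", "A");   // (4) thread A's late cache update (stale!)
        return db.get("k").equals(cache.get("k"));
    }

    public static void main(String[] args) {
        System.out.println("consistent = " + replay()); // consistent = false
    }
}
```

The database ends up with "B" while the cache ends up with "A": exactly the dirty-data outcome described above.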

Reason two (business-scenario angle), in two points:

  • (1) If the business writes the database often but reads it relatively rarely, this scheme updates the cache frequently even though the cache is seldom read, wasting performance.
  • (2) If the value written to the cache is not the raw database value but the result of a series of complex computations, then recomputing and rewriting the cache after every database write is clearly wasteful. Deleting the cache is the better fit.

What follows is the most controversial part: delete the cache first and then update the database, or update the database first and then delete the cache?

IV. Delete the Cache First, Then Update the Database

Here is why this scheme leads to inconsistency. Suppose request A performs an update while request B performs a query at the same time. The following can happen:

  • (1) request A performs a write and deletes the cache
  • (2) request B queries the cache and finds nothing there
  • (3) request B queries the database and gets the old value
  • (4) request B writes the old value into the cache
  • (5) request A writes the new value into the database

This leaves the cache and the database inconsistent. Worse, if no expiration time is set on the cache, the data stays dirty forever.
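The same deterministic-replay technique as before (HashMaps standing in for the real stores, purely for illustration) shows the stale backfill this ordering produces:

```java
import java.util.HashMap;
import java.util.Map;

public class DeleteFirstRace {
    static final Map<String, String> db = new HashMap<>();
    static final Map<String, String> cache = new HashMap<>();

    // Replays steps (1)-(5) in order; only the interleaving matters.
    static boolean replay() {
        db.put("k", "old");
        cache.put("k", "old");
        cache.remove("k");                  // (1) A deletes the cache
        String v = cache.get("k");          // (2) B misses the cache
        if (v == null) v = db.get("k");     // (3) B reads the old DB value
        cache.put("k", v);                  // (4) B backfills the old value
        db.put("k", "new");                 // (5) A writes the new value
        return db.get("k").equals(cache.get("k"));
    }

    public static void main(String[] args) {
        System.out.println("consistent = " + replay()); // consistent = false
    }
}
```

The cache holds "old" while the database holds "new", and without a TTL that mismatch never heals on its own.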

So how do we solve this? One approach: the delayed double-delete strategy. Pseudocode:

public void write(String key, Object data) throws InterruptedException {
    redis.delKey(key);        // 1. delete the cache first
    db.updateData(data);      // 2. then write the database
    Thread.sleep(1000);       // 3. wait roughly one read request's worth of time
    redis.delKey(key);        // 4. delete the cache again
}

In plain terms:

  • (1) delete the cache first
  • (2) write the database (these two steps are the same as before)
  • (3) sleep for one second, then delete the cache again; this clears out any dirty cache data produced within that second.

So how is this one second determined? Concretely, how long should the sleep be?

For the case above, readers should measure the read path of their own project: the time a data query plus its business logic takes. The write request's sleep should then be that read latency plus a few hundred milliseconds. The goal is to guarantee that the second delete runs after any concurrent read request has finished, so it removes whatever dirty cache data that read left behind.

What about a MySQL architecture with read-write splitting?

In that case the inconsistency again comes from two concurrent requests: request A performing an update and request B performing a query:

  • (1) request A performs a write and deletes the cache
  • (2) request A writes the new value to the master database
  • (3) request B queries the cache and finds no value
  • (4) request B queries a slave; master-slave replication has not yet finished, so it reads the old value
  • (5) request B writes the old value into the cache
  • (6) replication completes, and the slave now holds the new value

This again leaves the data inconsistent. The remedy is still the delayed double delete, except that the sleep time is now based on the master-slave replication lag, plus a few hundred milliseconds.

What if this synchronous second delete reduces throughput?

Then make the second delete asynchronous: spawn a separate thread to perform it, so the write request returns without sleeping for a second. That restores throughput.
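A sketch of the asynchronous variant, again with maps standing in for Redis and the database; a `ScheduledExecutorService` issues the delayed second delete so the caller never sleeps. All names here are illustrative, not a real client API.

```java
import java.util.Map;
import java.util.concurrent.*;

public class AsyncDoubleDelete {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Map<String, String> db = new ConcurrentHashMap<>();

    // Delete, update, then schedule the second delete on a background
    // thread instead of sleeping, so the caller returns immediately.
    static void write(String key, String data, long delayMillis,
                      ScheduledExecutorService scheduler) {
        cache.remove(key);                           // first delete
        db.put(key, data);                           // update the database
        scheduler.schedule(() -> cache.remove(key),  // delayed second delete
                delayMillis, TimeUnit.MILLISECONDS);
    }

    // Demonstrates that a stale backfill between the two deletes gets purged.
    static boolean demo() throws InterruptedException {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        write("k", "new", 100, scheduler);
        cache.put("k", "old");   // a concurrent read backfills a stale value
        scheduler.shutdown();    // pending delayed tasks still run by default
        scheduler.awaitTermination(5, TimeUnit.SECONDS);
        return !cache.containsKey("k");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("stale entry purged = " + demo());
    }
}
```

The write call itself only does two fast operations; the one-second wait has moved off the request path onto the scheduler thread.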

What if the second delete fails?

This is a very good question, because if the second delete fails, the following can happen. Again there are two requests: request A performs an update and request B performs a query. For convenience, assume a single database instance:

  • (1) request A performs a write and deletes the cache
  • (2) request B queries the cache and finds nothing
  • (3) request B queries the database and gets the old value
  • (4) request B writes the old value into the cache
  • (5) request A writes the new value into the database
  • (6) request A tries to delete the cache entry that request B wrote, and the delete fails

That is, if the second cache delete fails, the cache and database end up inconsistent again. How to solve this? See the concrete remedies under update strategy (3) below.

V. Update the Database First, Then Delete the Cache

First, some background. A cache update routine proposed abroad, known as the "Cache-Aside pattern", specifies the following:

  1. Miss: the application first reads from the cache; if the data is absent, it reads from the database and, on success, puts the value into the cache.
  2. Hit: the application reads the data from the cache and returns it.
  3. Update: the application writes the data to the database first and, on success, invalidates the cache.
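The three rules can be sketched as follows. The maps stand in for the real cache and database, and the method names are illustrative rather than any particular client's API.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheAside {
    static final Map<String, String> cache = new HashMap<>();
    static final Map<String, String> db = new HashMap<>();

    // Read path: a hit returns from the cache; a miss loads from the
    // database and backfills the cache on success.
    static String read(String key) {
        String v = cache.get(key);
        if (v != null) return v;           // hit
        v = db.get(key);                   // miss: go to the database
        if (v != null) cache.put(key, v);  // backfill on success
        return v;
    }

    // Update path: write the database first, then invalidate the cache.
    static void update(String key, String value) {
        db.put(key, value);
        cache.remove(key);
    }

    public static void main(String[] args) {
        db.put("k", "v1");
        System.out.println(read("k"));   // v1 (miss, then backfill)
        update("k", "v2");
        System.out.println(read("k"));   // v2 (cache was invalidated)
    }
}
```

Note that the update path never writes the cache; the next read repopulates it, which is exactly what sidesteps the stale-overwrite race of strategy (1).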

In addition, Facebook's well-known paper "Scaling Memcache at Facebook" proposes the same policy: update the database, then delete the cache.

Does this scheme avoid concurrency problems entirely?

No. Suppose request A performs a query while request B performs an update. The following can occur:

(1) the cache entry has just expired

(2) request A queries the database and gets the old value

(3) request B writes the new value to the database

(4) request B deletes the cache

(5) request A writes the old value into the cache

If this happens, dirty data does indeed result.

But how likely is that?

This situation has a precondition: the database write in step (3) must take less time than the database read in step (2), so that step (4) can happen before step (5).

But consider: database reads are much faster than writes (otherwise, why bother with read-write splitting? The point of splitting is exactly that reads are faster and consume fewer resources). So step (3) finishing sooner than step (2) is unlikely, and this scenario is hard to hit in practice. Still, suppose someone insists on arguing the point, or is compulsive enough to demand a fix. What then?

How can this concurrency problem be solved?

First, setting an expiration time on the cache is an effective scheme. Second, adopt the asynchronous delayed-delete strategy given under strategy (2): after the read request has had time to complete, delete the cache once more.

Are there other causes of inconsistency?

Yes. This is a problem shared by cache update strategies (2) and (3): if deleting the cache fails, inconsistency appears. For example, a request updates the database successfully, but the cache delete fails; from then on the two disagree. This is also the question left open at the end of the discussion of strategy (2).

How to solve it? Provide a retry-based safeguard mechanism. Here are two options.

Option one: as shown in the figure below.

[Figure: Option one, message-queue retry flow]

The flow is:

  • (1) update the database
  • (2) the cache delete fails because of some problem
  • (3) send the key to be deleted to a message queue
  • (4) consume the message and obtain the key to be deleted
  • (5) keep retrying the delete until it succeeds

The drawback of this option is that it is quite intrusive to the business code. Hence option two: start a separate program that subscribes to the database binlog to pick up the required operations, then have another non-business program take the information from that subscriber and perform the cache deletes.
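Steps (3) to (5) can be sketched with a `BlockingQueue` standing in for the message queue and a cache delete that is simulated to fail twice before succeeding. All names are illustrative; a real setup would use an actual broker.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class DeleteRetryQueue {
    static final BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
    static int failuresLeft;
    static int attempts;

    // Simulated cache delete that fails a fixed number of times first.
    static boolean tryDeleteCache(String key) {
        attempts++;
        return --failuresLeft < 0;
    }

    // On failure, publish the key to the queue; the consumer loop
    // retries the delete until it finally succeeds.
    static int deleteWithRetry(String key) throws InterruptedException {
        failuresLeft = 2;   // simulate: the first two deletes fail
        attempts = 0;
        if (!tryDeleteCache(key)) {
            queue.put(key);                    // (3) send key to the queue
            while (true) {
                String k = queue.take();       // (4) consume the key
                if (tryDeleteCache(k)) break;  // (5) retry until success
                queue.put(k);                  // still failing: requeue
            }
        }
        return attempts;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("attempts = " + deleteWithRetry("k")); // attempts = 3
    }
}
```

The intrusiveness complained about above is visible here: the publish-and-retry plumbing sits directly inside the write path's code.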

Option two:

[Figure: Option two, binlog-subscription flow]

The flow is as follows:

  • (1) update the database
  • (2) the database writes the operation to its binlog
  • (3) a subscriber program extracts the required data and key from the binlog
  • (4) a separate piece of non-business code obtains that information
  • (5) it tries to delete the cache and finds that the delete failed
  • (6) it sends the information to a message queue
  • (7) the key is taken back off the message queue and the delete is retried.
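Option two can be sketched by feeding simulated binlog events to a subscriber that performs the cache deletes. This is a toy model: the queue of keys stands in for the real binlog stream, and every name is invented for illustration; a real deployment would consume actual binlog events through a middleware.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class BinlogInvalidator {
    static final Map<String, String> cache = new HashMap<>();
    static final Map<String, String> db = new HashMap<>();
    // Stands in for the binlog: a stream of keys touched by writes.
    static final Queue<String> binlog = new ArrayDeque<>();

    // (1)+(2): a database write also lands in the (simulated) binlog,
    // so the business code never touches the cache at all.
    static void updateDatabase(String key, String value) {
        db.put(key, value);
        binlog.add(key);
    }

    // (3)-(5): the non-business subscriber drains the binlog and deletes
    // the corresponding cache keys. A real system would push failed
    // deletes onto a message queue, as in steps (6)-(7).
    static void runSubscriber() {
        String key;
        while ((key = binlog.poll()) != null) {
            cache.remove(key);
        }
    }

    public static void main(String[] args) {
        cache.put("k", "stale");
        updateDatabase("k", "new");
        runSubscriber();
        System.out.println("stale entry present = " + cache.containsKey("k"));
    }
}
```

The design win over option one is that invalidation lives entirely in the subscriber; the business write path only updates the database.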

Note: for the scheme above, an off-the-shelf middleware for subscribing to the binlog already exists, called canal, which can subscribe to MySQL's binlog. As for Oracle, I currently do not know of a ready-made middleware. For the retry mechanism, I used a message queue; if the consistency requirement is not very high, simply retrying periodically from another thread inside the program also works. These details are flexible and open to your own design; the above is just one way of thinking about the problem.


Origin blog.csdn.net/ziwuzhulin/article/details/94646149