On database and cache dual-write consistency

Caches are widely used in projects because of their high concurrency and high performance. The cache read flow is as follows:

[Figure: cache read flow]
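
As a rough illustration of the read flow, here is a minimal sketch. It assumes the common pattern of checking the cache first and falling back to the database on a miss; the two dictionaries and the `read` function are hypothetical stand-ins, not a real cache or database client.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-ins for a cache server and a database.
cache: Dict[str, str] = {}
database: Dict[str, str] = {"user:1": "alice"}


def read(key: str) -> Optional[str]:
    """Return the value for key, consulting the cache before the database."""
    value = cache.get(key)
    if value is not None:
        return value              # cache hit
    value = database.get(key)     # cache miss: fall back to the database
    if value is not None:
        cache[key] = value        # populate the cache for later reads
    return value
```

The first call to `read("user:1")` goes to the database and fills the cache; later calls are served from the cache.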

Dual-write consistency has the following three requirements:

  1. The cache must not serve dirty data
  2. The cache may serve stale data, but it must become eventually consistent within a tolerable time
  3. That tolerable time should be as short as possible

To satisfy all three requirements, read and write requests can be serialized into an in-memory queue, which guarantees that no inconsistency occurs. However, serialization greatly reduces system throughput, and under normal circumstances several times more machines are needed to support the online traffic.

[Figure: dual-write serialization]
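
One way to picture the serialization idea is the sketch below. It is only an assumption about how such a queue might look: a single-worker `ThreadPoolExecutor` plays the role of the in-memory queue, and the dictionaries and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for a cache server and a database.
cache = {}
database = {"k": "v0"}

# A single worker thread drains the executor's internal FIFO queue,
# so submitted operations run strictly one after another.
serial = ThreadPoolExecutor(max_workers=1)


def _read(key):
    if key not in cache:
        cache[key] = database[key]
    return cache[key]


def _write(key, value):
    database[key] = value
    cache[key] = value


def read(key):
    """Queue a read and wait for its result."""
    return serial.submit(_read, key).result()


def write(key, value):
    """Queue a write and wait until it has been applied."""
    serial.submit(_write, key, value).result()
```

Because every request waits its turn behind all earlier ones, no interleaving is possible, which is also why throughput drops.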

So here we discuss three common approaches:

  1. Update the database first, then update the cache
  2. Delete the cache first, then update the database
  3. Update the database first, then delete the cache

1. Update the database first, then update the cache

This approach is generally opposed, mainly for the following two reasons:

Reason 1: the thread-safety perspective.
Suppose request A and request B both perform an update at the same time. The following can happen:

  1. Thread A updates the database
  2. Thread B updates the database
  3. Thread B updates the cache
  4. Thread A updates the cache

A's cache update should have happened before B's, but because of network delays or other reasons B updated the cache before A did. The cache ends up holding A's value while the database holds B's. This is dirty data, so the approach is not considered.
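
The interleaving above can be replayed deterministically. This is only an illustrative sketch: the dictionaries and helper names are hypothetical, and the four steps are executed in the problematic order by hand rather than by real threads.

```python
# Hypothetical in-memory stand-ins for a database and a cache.
database = {}
cache = {}


def db_update(value):
    database["k"] = value


def cache_update(value):
    cache["k"] = value


# The order from the list: A updates the database, B updates the database,
# B updates the cache, A updates the cache.
steps = [
    (db_update, "A"),
    (db_update, "B"),
    (cache_update, "B"),
    (cache_update, "A"),
]
for step, value in steps:
    step(value)

# The database now holds B's value while the cache holds A's value.
inconsistent = database["k"] != cache["k"]
```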

"Update the cache first, then update the database" produces dirty data for the same reason, so it is not considered either.

Reason 2: the business-scenario perspective.
There are two points:

  1. If the business writes to the database often but reads the data rarely, this approach updates the cache frequently even though the data is hardly ever read, which wastes performance.
  2. If the value written to the database is not written to the cache directly but only after a series of complex calculations, then recomputing the cached value after every database write is undoubtedly a waste of performance. Clearly, deleting the cache is more suitable.

If you must update the cache, consider adding a version number to the cached data, so that an older update cannot overwrite a newer one.
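
A minimal sketch of the version-number idea follows. The layout (a `(version, value)` tuple per key) and the function name are assumptions made for illustration; in a real cache the compare-and-set would also have to be atomic on the cache side, which this in-memory example does not attempt to show.

```python
# Hypothetical cache that stores a (version, value) pair per key.
cache = {}


def cache_update(key, version, value):
    """Apply the update only if it is newer than the cached entry.

    Returns True if the cache was changed, False if the update was stale.
    """
    current = cache.get(key)
    if current is not None and current[0] >= version:
        return False              # an equal or newer version is already cached
    cache[key] = (version, value)
    return True
```

With this check, if B's update (version 2) reaches the cache before A's (version 1), A's late update is rejected instead of overwriting the newer value.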

[Figure: why delete the cache]

2. Delete the cache first, then update the database

This approach can also lead to inconsistency. Suppose request A performs an update while request B performs a query. The following can happen:

  1. Request A, a write, deletes the cache
  2. Request B queries the cache and finds no entry
  3. Request B queries the database and gets the old value
  4. Request B writes the old value to the cache
  5. Request A writes the new value to the database

This sequence leaves the cache and the database inconsistent. Moreover, if the cache entry has no expiration time set, the data stays dirty forever.
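
The five steps can be replayed by hand to see the end state. This is an illustrative sketch with hypothetical in-memory dictionaries, not real concurrent requests.

```python
# Hypothetical in-memory stand-ins; both start with the old value.
database = {"k": "old"}
cache = {"k": "old"}

cache.pop("k", None)          # 1. request A deletes the cache
b_seen = cache.get("k")       # 2. request B finds no cache entry
b_value = database["k"]       # 3. request B reads the old value from the database
cache["k"] = b_value          # 4. request B writes the old value to the cache
database["k"] = "new"         # 5. request A writes the new value to the database

# The cache holds "old" while the database holds "new".
```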

[Figure: delete the cache first, then update the database]

Solution:

  1. Delete the cache
  2. Write to the database (these two steps are the same as before)
  3. Sleep for a certain period (for example, 1 second or 200 ms), then delete the cache again. This removes any dirty data that was written to the cache in the meantime.
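
The three steps might look like the following sketch. The dictionaries, the function name, and the default delay are assumptions for illustration; the right delay depends on how long a read request takes in the actual system.

```python
import time

# Hypothetical in-memory stand-ins for a cache server and a database.
cache = {"k": "old"}
database = {"k": "old"}


def write_with_double_delete(key, value, delay_seconds=0.2):
    """Delete the cache, write the database, wait, then delete the cache again."""
    cache.pop(key, None)          # 1. delete the cache
    database[key] = value         # 2. write to the database
    time.sleep(delay_seconds)     # 3. wait for in-flight reads to finish ...
    cache.pop(key, None)          #    ... then delete whatever they cached
```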

However, because the thread sleeps, this solution still hurts throughput considerably.

3. Update the database first, then delete the cache

This approach is used in many engineering solutions, so let us check whether it is actually safe.
Suppose there are two requests: request A performs a query and request B performs an update. The following situation can then arise:

  1. The cache entry has just expired
  2. Request A queries the database and gets the old value
  3. Request B writes the new value to the database
  4. Request B deletes the cache
  5. Request A writes the old value it read to the cache

In this case dirty data does arise. However, the scenario assumes that the database write in step 3 takes less time than the read in step 2, so that step 4 can happen before step 5. In practice, database reads are much faster than writes, so this interleaving is unlikely.
Consistency can either be guaranteed with a protocol such as 2PC or Paxos, or one can try to reduce the probability of dirty data under concurrency. Probably because 2PC is too slow and Paxos too complex, Facebook chose the third approach.
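
The write path of the third approach can be sketched as follows. The dictionaries and function names are hypothetical stand-ins; the point is only the order of the two steps in `write`.

```python
# Hypothetical in-memory stand-ins for a cache server and a database.
cache = {"k": "old"}
database = {"k": "old"}


def write(key, value):
    database[key] = value         # 1. update the database
    cache.pop(key, None)          # 2. delete (invalidate) the cache entry


def read(key):
    if key not in cache:          # cache miss: reload from the database
        cache[key] = database[key]
    return cache[key]
```

After `write("k", "new")` the cache entry is gone, so the next `read("k")` reloads the new value from the database.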

What if deleting the cache fails?

Start a subscriber program that subscribes to the database binlog and extracts the data that needs to be operated on. Then, in the application, run a separate piece of code that receives the information from the subscriber program and performs the cache deletion.

[Figure: retrying the cache deletion]

Alibaba's open-source middleware canal can be used to subscribe to the binlog.
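
The retry side of this design could be sketched as below. Nothing here uses canal's actual API: the `pending` queue stands in for the stream of keys extracted from binlog events, and `process_deletions`, `flaky_delete`, and the retry limit are hypothetical names and values chosen for illustration.

```python
import queue

# Hypothetical stand-ins: a cache, and a queue of (key, attempt) pairs
# that a binlog subscriber would feed with the keys of changed rows.
cache = {"user:1": "stale"}
pending = queue.Queue()


def process_deletions(delete, max_attempts=3):
    """Delete every pending key, re-queuing a key when its deletion fails.

    Returns the keys that still failed after max_attempts tries.
    """
    failed = []
    while not pending.empty():
        key, attempt = pending.get()
        try:
            delete(key)
        except ConnectionError:
            if attempt + 1 < max_attempts:
                pending.put((key, attempt + 1))   # retry later
            else:
                failed.append(key)                # give up and report
    return failed


calls = {"n": 0}


def flaky_delete(key):
    """Example delete function that fails on its first call only."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("cache unreachable")
    cache.pop(key, None)
```

Putting `("user:1", 0)` on `pending` and calling `process_deletions(flaky_delete)` fails once, re-queues the key, and removes it from the cache on the second attempt.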

Summary

This article summarizes the cache and database consistency schemes currently found on the Internet. I hope you gain something from it.

Finally, my experience and ability are limited, so suggestions and comments on this article are welcome.


Origin www.cnblogs.com/mseddl/p/11570647.html