9. How do you ensure consistency between the cache and the database under double-write?

Author: Zhonghua Shishan

Interview question

How do you keep the cache and the database consistent when you write to both?

What the interviewer is getting at

As soon as you use a cache, you may end up writing to both the cache and the database. Once you double-write, you inevitably face a data consistency problem. So how do you solve it?

Analysis of the question

In general, if the cache is allowed to be slightly and occasionally inconsistent with the database, that is, if your system does not strictly require the cache and the database to stay consistent, then it is best not to use the scheme described later in this section, namely serializing read and write requests through an in-memory queue.

Serialization guarantees that inconsistency cannot occur, but it also significantly reduces the system's throughput: handling the same traffic may take several times more machines than usual.

Cache Aside Pattern

The most classic pattern for reading and writing a cache plus a database is the Cache Aside Pattern.

  • On a read, check the cache first; if the value is not in the cache, read the database, put the result into the cache, and return the response.
  • On an update, update the database first and then delete the cache (see the sketch below).
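Below is a minimal sketch of the cache-aside pattern in Java. The Cache and Database interfaces are hypothetical placeholders for whatever Redis client and DAO the real system uses; they are only here to make the read and write paths concrete.

```java
// Minimal cache-aside sketch. Cache and Database are placeholder interfaces,
// not a real client API.
public class CacheAsideExample {

    interface Cache {
        String get(String key);
        void put(String key, String value);
        void delete(String key);
    }

    interface Database {
        String load(String key);               // read the current value
        void update(String key, String value); // write the new value
    }

    private final Cache cache;
    private final Database db;

    CacheAsideExample(Cache cache, Database db) {
        this.cache = cache;
        this.db = db;
    }

    // Read path: try the cache first; on a miss, load from the database
    // and populate the cache before returning.
    public String read(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = db.load(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    // Write path: update the database, then delete (not update) the cache.
    // The next read lazily reloads or recomputes the cached value.
    public void write(String key, String newValue) {
        db.update(key, newValue);
        cache.delete(key);
    }
}
```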

Why delete the cache instead of updating it?

The reason is simple: in many of the more complex caching scenarios, the cached value is not simply the value pulled straight out of the database.

For example, you might update a field in one table, while the corresponding cached value has to be recomputed by querying data from two other tables and combining the results.

In addition, updating the cache can be very expensive. Does every database modification really have to trigger a corresponding cache update? In some scenarios that may be fine, but for caches whose values require complex computation it is not. If you frequently modify the several tables involved in one cached value, the cache gets updated over and over. The real question, though, is whether that cached value is actually read frequently.

For example, suppose the table fields involved in one cached value are modified 20 or 100 times in a minute; the cache would then be updated 20 or 100 times, yet in that same minute it is read only once, so it is mostly cold data. If instead you simply delete the cache, it is recomputed at most once in that minute, which greatly reduces the overhead: the cached value is only computed when it is actually used.

Deleting the cache instead of updating it is really a lazy-computation idea: do not redo the complex calculation every time regardless of whether the result will even be used; recompute it only when it is actually needed. MyBatis and Hibernate follow the same lazy-loading idea. When you query a department that has a list of employees, there is no need to pull out the data of all 1,000 employees every time you query the department. In 80% of cases, querying the department only requires the department's own information. Load the department first, and only when you actually want to access the employees inside it does the framework go to the database and query the 1,000 employees.

The most basic inconsistency problem and its solution

Problem: update the database first, then delete the cache. If deleting the cache fails, the database holds the new data while the cache still holds the old data, and the two are now inconsistent.

[Figure: redis-junior-inconsistent]

Solution: delete the cache first, then update the database. If the database update fails, the database still holds the old data and the cache is simply empty, so there is no inconsistency: a read that misses the cache will load the old data from the database and then repopulate the cache.
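As a rough illustration of this ordering, here is a small sketch using the same hypothetical Cache and Database placeholders as above, with the cache deleted before the database write:

```java
// "Delete cache first, then update the database" ordering.
// Cache and Database remain hypothetical placeholder interfaces.
public class DeleteCacheFirstExample {

    interface Cache { void delete(String key); }
    interface Database { void update(String key, String value); }

    private final Cache cache;
    private final Database db;

    DeleteCacheFirstExample(Cache cache, Database db) {
        this.cache = cache;
        this.db = db;
    }

    // If the database update fails after the delete, the cache is empty and
    // the database still holds the old value, so the next read simply
    // reloads the old value and nothing diverges.
    public void write(String key, String newValue) {
        cache.delete(key);        // step 1: invalidate the cache
        db.update(key, newValue); // step 2: update the database
    }
}
```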

A more complicated inconsistency scenario

The data changes: the writer first deletes the cache and then goes to modify the database, but the database has not been modified yet. A read request comes in, finds the cache empty, queries the database, gets the old pre-modification value, and puts it into the cache. The writer then finishes modifying the database. Done: the cache and the database now hold different data...

Why does this problem only show up in high-concurrency scenarios with traffic in the millions?

The problem can only occur when the same piece of data is read and written concurrently. If concurrency is very low, say 10,000 reads per day, the inconsistent scenario just described will almost never happen. But if you serve millions of requests a day with tens of thousands of concurrent reads per second, then as long as there is even one data-updating request per second, the cache-versus-database inconsistency above can occur.

The solution is as follows:

When updating data, route the operation by the data's unique identifier and send it to an internal JVM queue. When reading data, if the value is not found in the cache, the "reload from the database + refresh the cache" operation is also routed by the same unique identifier and sent to the same JVM queue.

Each queue has a dedicated worker thread that takes operations off that queue and executes them one at a time. So a data-change operation first deletes the cache and then goes to update the database, but may not have finished the update yet. If a read request arrives at this point and misses the cache, it can first send a "refresh the cache" request into the queue; that request backlogs behind the update, and the read then waits synchronously for the cache refresh to complete.

One optimization: within a single queue it is pointless to string together multiple cache-refresh requests for the same data, so you can filter them. If the queue already contains a pending cache-refresh request for that data, there is no need to enqueue another one; just wait for the refresh request already ahead in the queue to complete.

Once the worker thread for that queue finishes the previous operation (the database modification), it performs the next one, the cache-refresh operation: it reads the latest value from the database and writes it into the cache.

While waiting, the read request keeps polling; if it finds the value in the cache within the allowed time, it returns it directly. If the wait exceeds a certain timeout, the request reads the current (old) value directly from the database. A sketch of the whole scheme follows.
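Here is one possible Java sketch of the scheme described above: operations are routed by key hash to one of several in-memory queues, each drained by a single worker thread, duplicate cache-refresh requests are filtered, and the read path polls with a timeout. The Cache and Database interfaces, the queue count, and the timing values are all illustrative assumptions, not a definitive implementation.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the "serialize through in-memory queues" approach.
public class SerializedCacheUpdater {

    interface Cache {
        String get(String key);
        void put(String key, String value);
        void delete(String key);
    }

    interface Database {
        String load(String key);
        void update(String key, String value);
    }

    private static final int QUEUE_COUNT = 8;   // illustrative value

    private final Cache cache;
    private final Database db;
    private final BlockingQueue<Runnable>[] queues;
    // Keys that already have a pending cache-refresh task; used to filter
    // duplicate refresh requests (the optimization mentioned above).
    private final Set<String> pendingRefresh = ConcurrentHashMap.newKeySet();

    @SuppressWarnings("unchecked")
    SerializedCacheUpdater(Cache cache, Database db) {
        this.cache = cache;
        this.db = db;
        this.queues = new BlockingQueue[QUEUE_COUNT];
        for (int i = 0; i < QUEUE_COUNT; i++) {
            BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
            queues[i] = queue;
            // One worker thread per queue: operations routed to the same
            // queue execute strictly one after another.
            Thread worker = new Thread(() -> {
                while (true) {
                    try {
                        Runnable task = queue.take();
                        try {
                            task.run();
                        } catch (RuntimeException e) {
                            // a real system would log and handle the failure
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
            worker.setDaemon(true);
            worker.start();
        }
    }

    // Route by the data's unique key so all operations on the same key land
    // in the same queue (and therefore the same worker thread).
    private BlockingQueue<Runnable> queueFor(String key) {
        return queues[Math.abs(key.hashCode() % QUEUE_COUNT)];
    }

    // Write path: enqueue "delete cache, then update database" so it is
    // serialized with any cache refresh triggered by concurrent reads.
    public void write(String key, String newValue) {
        queueFor(key).offer(() -> {
            cache.delete(key);
            db.update(key, newValue);
        });
    }

    // Read path: on a cache miss, enqueue a refresh (unless one is already
    // pending), poll the cache until the timeout, then fall back to the DB.
    public String read(String key, long timeoutMillis) throws InterruptedException {
        String value = cache.get(key);
        if (value != null) {
            return value;
        }
        if (pendingRefresh.add(key)) {            // no refresh queued yet
            queueFor(key).offer(() -> {
                try {
                    String latest = db.load(key);
                    if (latest != null) {
                        cache.put(key, latest);
                    }
                } finally {
                    pendingRefresh.remove(key);
                }
            });
        }
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            value = cache.get(key);
            if (value != null) {
                return value;                     // refreshed within the timeout
            }
            Thread.sleep(5);                      // brief poll interval
        }
        return db.load(key);                      // timed out: read the database directly
    }
}
```

A production version would also need bounded queues, error handling for failed database writes, and monitoring of the backlog length; the sketch leaves those out.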

Under high concurrency, this solution has several issues to watch out for:

  • Read requests blocked for too long

Because read requests are now slightly asynchronous, you must watch out for read timeouts: every read request has to return within its timeout window.

The biggest risk of this solution is that the data may be updated so frequently that a large backlog of update operations builds up in the queue; read requests then time out in large numbers and fall through directly to the database. Be sure to run realistic simulated tests to see how frequently the data is actually updated.

Also, since one queue may hold backlogged updates for multiple data items, you need to test against your own business workload; you may have to deploy multiple service instances, each responsible for a share of the data updates. If a single memory queue really backlogs 100 inventory-modification operations for different products, and each modification takes about 10ms to complete, then the read request for the last product may wait 10ms * 100 = 1000ms = 1s before it gets its data, which blocks the read request for far too long.

You must run stress tests against the real behavior of the business system, in an environment that simulates production, to see how many update operations the memory queues may backlog at the busiest moments and how long the read request behind the last update would hang. If read requests must return within 200ms, and your calculation shows that even at the busiest time the backlog is only 10 updates with a wait of at most 200ms, that is acceptable.

If the memory queues are likely to backlog a particularly large number of updates, then add machines, so that the service instances deployed on each machine handle less data and each memory queue backlogs fewer update operations.

In practice, based on experience from previous projects, data is usually written infrequently, so under normal conditions the backlog of updates in the queue should be small. For projects with this kind of read-heavy, cache-backed architecture, write traffic is generally very low; a few hundred write QPS is already quite good.

Let's do a rough estimate.

Suppose there are 500 write operations per second. Splitting each second into five 200ms slices gives 100 writes per slice; spread over 20 memory queues, each queue may backlog about 5 write operations. Performance testing shows each write usually completes in about 20ms, so a read request for data in any queue hangs for a short while at most and is certain to return within 200ms.

From this simple calculation we know that a single machine can support a few hundred write QPS without problem. If write QPS grows tenfold, scale out the machines tenfold as well, keeping 20 queues per machine.

  • Read-request concurrency is too high

Here you must also run stress tests: when the situation above does occur, there is a risk that a sudden flood of read requests hangs on the service with delays of tens of milliseconds. Check whether the service can withstand it, and how many machines are needed to absorb the peak of this worst case.

However, not all data is updated at the same moment, so the cache entries do not all expire at once. At any given time only the cache for a small subset of data is invalid, and the read concurrency for those particular items should not be especially high.

  • Request routing across multiple deployed service instances

The service may be deployed as multiple instances, so you must ensure that requests performing a data update and requests performing the corresponding cache refresh are all routed by the Nginx server to the same service instance.

For example, all read and write requests for the same product are routed to the same machine. You can do the routing yourself based on a hash of a request parameter, or use the hash-based routing feature of Nginx or similar between services.
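For the application-side option, a hash-based router might look like the following sketch. The instance list and the productId parameter are made-up examples; Nginx's own hash routing achieves the same effect at the load-balancer level.

```java
import java.util.List;

// Illustrative only: pick a target service instance from a fixed list by
// hashing the product id, so every request for the same product goes to the
// same instance.
public class InstanceRouter {

    private final List<String> instances;  // e.g. ["10.0.0.1:8080", "10.0.0.2:8080"]

    InstanceRouter(List<String> instances) {
        this.instances = instances;
    }

    public String instanceFor(String productId) {
        int index = Math.abs(productId.hashCode() % instances.size());
        return instances.get(index);
    }
}
```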

  • Hot products skewing the request routing

If read and write requests for one product are especially heavy and all land in the same queue on the same machine, that machine may come under too much pressure. That said, the cache is only emptied at the moment the product's data is updated, which is the only time the concurrent read/write contention arises. So look at your own business: if the update frequency is not too high, the impact of this problem is not particularly large, although some machines may indeed carry a higher load.


Source: www.cnblogs.com/morganlin/p/11980471.html