When a write is performed, the cache must also be updated so that data read from the cache stays consistent with the persisted data in the database.
Because updating the database and updating the cache are two separate steps, it is hard to guarantee that the combined update is atomic.
When designing an update strategy, several aspects need to be considered:
- Impact on system throughput: for example, a strategy that updates the cache puts less load on the database than a strategy that deletes the cache entry
- Concurrency safety: some abnormal interleavings of concurrent reads and writes can cause inconsistency, such as stale data remaining in the cache for a long time
- Impact of failed updates: if an operation fails, how can the impact on the business be minimized?
- Difficulty of detecting and repairing faults: errors caused by failed operations leave detailed records in the logs, so they are easy to detect and repair. Data errors caused by concurrency problems show no obvious signs and are hard to find, and because concurrency problems are more likely during traffic peaks, they carry a larger risk.
There are two ways to update the cache:
- Delete (invalidate) the cache entry: the next read misses the cache, loads the new data from the database, and repopulates the cache
- Update the cache: overwrite the stale cache entry directly with the newly written data
The cache operation and the database update can be ordered in two ways:
- Database first, then cache
- Cache first, then database
Combining the two choices gives four update strategies, which we analyze one by one.
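The four combinations can be sketched as follows. This is a minimal illustration rather than a production design: plain dictionaries stand in for the database and the cache, and all function names (`read`, `db_then_delete`, `db_then_update`, `delete_then_db`, `update_then_db`) are hypothetical.

```python
# Plain dicts stand in for the database and the cache (illustrative assumption).
db = {}
cache = {}


def read(key):
    """Read path: serve from the cache, or load from the database on a miss."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value  # repopulate the cache after a miss
    return value


def db_then_delete(key, value):
    """Update the database first, then delete the cache entry."""
    db[key] = value
    cache.pop(key, None)


def db_then_update(key, value):
    """Update the database first, then overwrite the cache entry."""
    db[key] = value
    cache[key] = value


def delete_then_db(key, value):
    """Delete the cache entry first, then update the database."""
    cache.pop(key, None)
    db[key] = value


def update_then_db(key, value):
    """Overwrite the cache entry first, then update the database."""
    cache[key] = value
    db[key] = value
```

In a single thread with no failures all four behave the same from the reader's point of view. The differences only appear when a step fails or when threads interleave.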
Concurrency problems usually arise when a thread that starts later finishes its operation first; we call this a "false start". The sections below look at the errors that a "false start" causes in each of the four strategies.
Update the database first, then delete the cache
If the database update succeeds but the cache deletion fails, subsequent reads return stale data from the cache, resulting in inconsistency.
Concurrency errors that may occur:
Time | Thread A | Thread B | Database | Cache |
---|---|---|---|---|
1 | Cache miss | | v1 | null |
2 | Read v1 from the database | | v1 | null |
3 | | Update the database to v2 | v2 | null |
4 | | Delete the cache entry | v2 | null |
5 | Write v1 to the cache | | v2 | v1 |
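
The interleaving in the table can be replayed step by step. The sketch below assumes dictionaries stand in for the database and the cache, with thread A as the reader and thread B as the writer; it runs the steps sequentially in the problematic order instead of using real threads, so the outcome is deterministic.

```python
# Deterministic replay of the interleaving in the table above.
# Dicts stand in for the database and the cache (illustrative assumption).
db = {"k": "v1"}
cache = {}

# Time 1: thread A misses the cache.
assert "k" not in cache
# Time 2: thread A reads v1 from the database.
value_read_by_a = db["k"]
# Time 3: thread B updates the database to v2.
db["k"] = "v2"
# Time 4: thread B deletes the cache entry (it is already empty).
cache.pop("k", None)
# Time 5: thread A writes the value it read earlier into the cache.
cache["k"] = value_read_by_a

print(db["k"], cache["k"])  # v2 v1: the cache now holds stale data
```

The stale entry stays in the cache until the next write or until the entry expires.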
Update the database first, then update the cache
As with the delete-the-cache strategy, if the database update succeeds but the cache update fails, the data becomes inconsistent.
Concurrency errors that may occur:
Time | Thread A | Thread B | Database | Cache |
---|---|---|---|---|
0 | | | v0 | v0 |
1 | Update the database to v1 | | v1 | v0 |
2 | | Update the database to v2 | v2 | v0 |
3 | | Update the cache to v2 | v2 | v2 |
4 | Update the cache to v1 | | v2 | v1 |
This kind of write conflict between two threads, in which thread A overwrites the cache with old data, can be avoided by comparing data versions.
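
Version comparison might look like the sketch below. It assumes every write carries a monotonically increasing version number (for example one assigned by the database); the `VersionedCache` class and its `set_if_newer` method are hypothetical names for illustration, not the API of any particular cache. With a real cache server the compare-and-write step would have to be atomic on the cache side.

```python
import threading


class VersionedCache:
    """Cache whose entries carry a version; older versions never overwrite newer ones."""

    def __init__(self):
        self._data = {}  # key -> (version, value)
        self._lock = threading.Lock()

    def set_if_newer(self, key, version, value):
        """Store the value only if its version is newer than the cached one."""
        with self._lock:
            current = self._data.get(key)
            if current is not None and current[0] >= version:
                return False  # reject the stale write
            self._data[key] = (version, value)
            return True

    def get(self, key):
        with self._lock:
            entry = self._data.get(key)
            return None if entry is None else entry[1]


cache = VersionedCache()
cache.set_if_newer("k", 2, "v2")             # thread B's cache update arrives first
accepted = cache.set_if_newer("k", 1, "v1")  # thread A's late update is rejected
print(accepted, cache.get("k"))  # False v2
```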
Delete the cache first, then update the database
Concurrency errors that may occur:
Time | Thread A | Thread B | Database | Cache |
---|---|---|---|---|
1 | Delete the cache entry | | v1 | null |
2 | | Cache miss | v1 | null |
3 | | Read v1 from the database | v1 | null |
4 | Update the database to v2 | | v2 | null |
5 | | Write v1 to the cache | v2 | v1 |
Update the cache first, then update the database
If the cache update succeeds but the database update fails, subsequent reads return data that has not been persisted. Because cached data is volatile, this state is very dangerous.
Database writes are more likely to fail, for example because of key constraints, so this strategy is risky.
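
The failure case can be illustrated with a small sketch. It assumes dictionaries stand in for the database and the cache, and uses a hypothetical `ConstraintViolation` exception and `db_write` function to simulate a database write rejected by a constraint.

```python
class ConstraintViolation(Exception):
    """Hypothetical error raised when a database constraint rejects a write."""


db = {"k": "v0"}
cache = {"k": "v0"}


def db_write(key, value):
    # Stand-in for a database write that fails on a constraint violation.
    raise ConstraintViolation(f"write of {key!r} rejected")


def update_cache_then_db(key, value):
    cache[key] = value    # step 1 succeeds
    db_write(key, value)  # step 2 fails; nothing rolls the cache back


try:
    update_cache_then_db("k", "v1")
except ConstraintViolation:
    pass

print(db["k"], cache["k"])  # v0 v1: readers see a value that was never persisted
```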
Concurrency errors that may occur:
Time | Thread A | Thread B | Database | Cache |
---|---|---|---|---|
0 | | | v0 | v0 |
1 | Update the cache to v1 | | v0 | v1 |
2 | | Update the cache to v2 | v0 | v2 |
3 | | Update the database to v2 | v2 | v2 |
4 | Update the database to v1 | | v1 | v2 |
Asynchronous update
Dual-write update logic is complex and has many consistency problems. Instead, we can update the cache by subscribing to database updates.
Alibaba has open-sourced Canal, a component for incremental subscription to and consumption of the MySQL binlog.
With this approach the API server only writes to the database, while a separate thread subscribes to the incremental database binlog and updates the cache.
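
The division of work can be sketched as follows. This is only an illustration of the pattern: a `queue.Queue` stands in for the binlog subscription and does not reflect the Canal client API, dictionaries stand in for the database and the cache, and the names `api_write` and `cache_updater` are hypothetical.

```python
import queue
import threading

db = {}
cache = {}
change_log = queue.Queue()  # stand-in for a binlog subscription (not the Canal API)
STOP = object()             # sentinel that tells the subscriber to exit


def api_write(key, value):
    """The API server writes only to the database; the change is then logged."""
    db[key] = value
    change_log.put((key, value))


def cache_updater():
    """Subscriber thread: applies each logged change to the cache, in log order."""
    while True:
        event = change_log.get()
        if event is STOP:
            break
        key, value = event
        cache[key] = value


worker = threading.Thread(target=cache_updater)
worker.start()
api_write("k", "v1")
api_write("k", "v2")
change_log.put(STOP)
worker.join()
print(db["k"], cache["k"])  # v2 v2
```

Because a single subscriber applies changes in log order, two writers can no longer overwrite each other's cache updates out of order.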
This strategy has a concurrency problem similar to that of updating the database first and then deleting the cache:
Time | Read thread | Write thread | Asynchronous thread | Database | Cache |
---|---|---|---|---|---|
1 | Cache miss | | | v1 | null |
2 | Read v1 from the database | | | v1 | null |
3 | | Update the database to v2 | | v2 | null |
4 | | | Delete or update the cache | v2 | null |
5 | Write v1 to the cache | | | v2 | v1 |
This problem can likewise be resolved by comparing data versions whenever the cache is written, both when the asynchronous thread updates it and when a reader repopulates it.