Five caching strategies: their advantages, disadvantages, and how to choose a combination

Today's post is a translation of an article on caching strategies; the original title is "Caching Strategies and How to Choose the Right One". A friend shared it and thought it was a good summary. Since many friends can't be bothered to read English, Pippi has tried to translate it at his own modest level, in the hope that it helps more people see it.

Caching is one of the easiest ways to improve system performance. Databases (and NoSQL stores) are comparatively slow, and speed is often the key to winning.

Used properly, a cache reduces response times and lightens the load on the database, which also cuts costs. Several caching strategies are listed below, and choosing the right one makes a real difference. The right cache policy depends on the data and on the data access patterns; in other words, on how the data is written and read. For example:

  • Is the system write-heavy, with less frequent reads? (e.g., time-based logs)

  • Is the data written once and read many times? (e.g., user profiles)

  • Is the returned data always unique? (e.g., search queries)

Choosing the right caching strategy is key to improving performance. Let's take a quick look at the various caching strategies.

The first: Cache-Aside

This is probably the most commonly used caching strategy. The cache sits off to the side, and the application talks directly to both the cache and the database.

In brief:

  1. Application first checks the cache.

  2. If the data is found in the cache, we have a cache hit. The data is read and returned to the application.

  3. If the data is not found in the cache, we have a cache miss. The application has to do some extra work: it queries the database for the data, returns it to the client, and also stores it in the cache so that subsequent reads of the same data hit the cache.
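The three steps above can be sketched in a few lines of Python. The dicts stand in for a real cache (e.g., Redis) and a real database, and the key names are illustrative only:

```python
# Minimal cache-aside read path: the application owns the lookup logic.
database = {"user:1": {"name": "Alice"}}
cache = {}

def get(key):
    # 1. Check the cache first.
    if key in cache:
        return cache[key]            # cache hit
    # 2. Cache miss: read from the database instead.
    value = database.get(key)
    # 3. Populate the cache so subsequent reads of the same key hit.
    if value is not None:
        cache[key] = value
    return value
```

After the first `get("user:1")` misses and fills the cache, later calls for the same key are served from the cache without touching the database.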

Cache-aside is particularly well suited to read-heavy scenarios. Systems using cache-aside are somewhat resilient to cache failures: if the cache cluster goes down, the system can still operate by going directly to the database. (That said, this doesn't help much if the cache drops out during peak load; response times can become terrible, and in the worst case the database may stop working.)

Another advantage is that the data model in the cache can differ from the data model in the database. For example, the response generated for several queries can be stored against a single request id.

When cache-aside is used, the most common write strategy is to write data directly to the database. When this happens, the cache may become inconsistent with the database. To work around this, developers usually attach a TTL and keep serving stale data until the TTL expires. If data freshness must be guaranteed, developers either invalidate the cache entry or use an appropriate write strategy, which we will discuss later.
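Both workarounds can be sketched together: writes go straight to the database and invalidate the cached copy, while cached reads carry an expiry time. This is a minimal sketch with dicts standing in for real stores, and the 60-second TTL is an illustrative value:

```python
import time

database = {}
cache = {}     # key -> (value, expires_at)
TTL = 60.0     # seconds; illustrative, tune per workload

def write(key, value):
    # Cache-aside writes go directly to the database; the cached copy
    # (if any) is invalidated so the next read fetches fresh data.
    database[key] = value
    cache.pop(key, None)

def read(key):
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:
            return value             # still fresh: serve from cache
        del cache[key]               # TTL expired: drop stale entry
    value = database.get(key)
    if value is not None:
        cache[key] = (value, time.monotonic() + TTL)
    return value
```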

The second: Read-Through Cache

Under the read-through strategy, the cache sits in line with the database. On a cache miss, the cache loads the missing data from the database, populates itself, and returns the data to the application (refer to the diagram).

Both cache-aside and read-through load data lazily, that is, only when it is read for the first time.

Although read-through and cache-aside are very similar, there are at least two key differences:

  1. In cache-aside, the application is responsible for fetching data from the database and populating the cache. In read-through, this logic is usually provided by a library or a standalone cache provider.

  2. Unlike cache-aside, the data model of a read-through cache cannot differ from the data model of the database.
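The first difference is the important one in code: with read-through, the application only ever talks to the cache, and the cache itself knows how to load from the backing store. A minimal sketch, with the `loader` callback standing in for the library-provided database logic:

```python
class ReadThroughCache:
    """The cache, not the application, owns the miss-handling logic."""

    def __init__(self, loader):
        self._store = {}
        self._loader = loader        # e.g., a database query function

    def get(self, key):
        if key not in self._store:
            # Cache miss: the cache loads from the database and
            # populates itself before answering.
            self._store[key] = self._loader(key)
        return self._store[key]

database = {"article:7": "Breaking news"}
articles = ReadThroughCache(loader=database.get)
```

The application code shrinks to `articles.get("article:7")`; it never touches `database` directly.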

Read-through caches work best for read-heavy workloads where the same data is requested many times, for example a news story. One disadvantage is that the first request for any piece of data always results in a cache miss, incurring the extra cost of loading it into the cache. Developers deal with this by "warming" or "pre-heating" the cache with manually issued queries. As with cache-aside, the data can become inconsistent between the cache and the database; the solution lies in the write strategy, which we will look at next.

The third: Write-Through Cache

In this write strategy, data is first written to the cache, and then to the database. The cache sits in line with the database, and writes always go through the cache to the main database.

On its own, a write-through cache doesn't seem to do much; in fact, it introduces extra write latency, because data is written to the cache first and then to the primary database. But when paired with a read-through cache, we get all the benefits of read-through plus a guarantee of data consistency, which frees us from cache-invalidation techniques.
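The write path can be sketched as a cache that forwards every write to its backing database. This is a minimal illustration with dicts standing in for real stores, not a production implementation:

```python
class WriteThroughCache:
    """Every write updates the cache, then the database, in order."""

    def __init__(self, db):
        self._store = {}
        self._db = db

    def put(self, key, value):
        # Write the cache first, then the database. Both copies are
        # updated on every write, keeping cache and database consistent
        # at the cost of extra write latency.
        self._store[key] = value
        self._db[key] = value

    def get(self, key):
        return self._store.get(key)

database = {}
cache = WriteThroughCache(database)
```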

DynamoDB Accelerator (DAX) is a good example of a write-through / read-through cache. It sits inline between the application and DynamoDB, and reads and writes to DynamoDB can go through DAX. (Note: if you plan to use DAX, be sure to familiarize yourself with its data consistency model and how it interacts with DynamoDB.)

The fourth: Write-Around

Under this strategy, data is written directly to the database, and only data that is read makes its way into the cache. Write-around can be combined with read-through, and provides good performance when data is written once and read rarely or never, for example real-time logs or chat-room messages. This pattern can also be combined with cache-aside.
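A minimal sketch of write-around combined with a cache-filling read path, again using dicts as stand-ins for real stores:

```python
database = {}
cache = {}

def write_around(key, value):
    # Writes bypass the cache and go straight to the database;
    # drop any cached copy so reads can't see stale data.
    database[key] = value
    cache.pop(key, None)

def read(key):
    # Only reads populate the cache, so write-once, rarely-read data
    # never takes up cache space.
    if key not in cache:
        value = database.get(key)
        if value is not None:
            cache[key] = value
    return cache.get(key)
```

A log line that is written and never read stays out of the cache entirely; only keys that are actually requested get cached.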

The fifth: Write-Back

Under this strategy, the application writes data to the cache, which acknowledges the write immediately and writes the data back to the database after some delay. This strategy is also sometimes known as write-behind.

Write-back caching improves write performance and is very useful for write-heavy workloads. Combined with read-through, it also works well for mixed workloads, since the most recently updated and accessed data is always available in the cache. It is resilient to database failures to a large extent and can tolerate some database downtime. If the cache supports batching or coalescing, it can reduce the overall number of database writes, which lowers the load and reduces costs.
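The deferred flush and the write-coalescing benefit can be sketched with a dirty-key set; dicts stand in for real stores, and a real implementation would flush on a timer or under memory pressure:

```python
database = {}
cache = {}
dirty = set()    # keys written to the cache but not yet persisted

def write(key, value):
    # The write is acknowledged as soon as the cache is updated...
    cache[key] = value
    dirty.add(key)

def flush():
    # ...and persisted to the database later, in one batch. Repeated
    # writes to the same key coalesce into a single database write.
    for key in dirty:
        database[key] = cache[key]
    dirty.clear()
```

Note the trade-off the article warns about below: anything in `dirty` when the cache dies is lost for good.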

Some developers use Redis for both cache-aside and write-back, to better absorb spikes during peak load. The main disadvantage is that if the cache fails, the data may be lost permanently. Most relational database storage engines (such as InnoDB) have write-back caching enabled by default internally: queries are first written to memory and eventually flushed to disk.

To sum up

In this article, we discussed the advantages and disadvantages of different caching strategies. In practice, carefully evaluate your goals, understand your data access (read/write) patterns, and choose the best strategy or combination of strategies.

What if you choose wrong, and the strategy doesn't match your goals or access patterns? You may introduce extra latency, or at least not see the full benefit. For example, if you use write-through / read-through where write-around / read-through is actually the better fit (the written data is accessed infrequently), the cache fills up with useless garbage. Arguably, if the cache is large enough, this may not be a problem. But in many real high-throughput systems, where memory is never big enough and server cost always has to be considered, the right strategy matters a great deal.

Origin juejin.im/post/5d5fde2051882505a87a918d