Architect's Diary - Memcached Restrictions and Usage Recommendations

Limitations of Memcached

  • There is no limit to the number of items that can be stored, as long as there is enough memory.
  • The maximum memory a single Memcached process can use is 2GB (a limitation of 32-bit builds). To use more memory, run multiple Memcached processes on different ports.
  • Memcached caps an item's relative expiration time at 30 days, controlled by the constant REALTIME_MAXDELTA (60*60*24*30); an expiration value larger than this is interpreted as an absolute Unix timestamp.
  • Memcached lacks authentication and security controls, so Memcached servers should be placed behind a firewall.
  • Memcached itself is a server designed for caching, so it does not address data persistence. When stored content reaches the configured memory limit, it automatically evicts unused items using the LRU (Least Recently Used) algorithm.
  • The maximum key length is 250 bytes; longer keys cannot be stored. This is controlled by the constant KEY_MAX_LENGTH (250).
  • A single item can hold at most 1MB of data; larger values will not be stored. This is controlled by the constant POWER_BLOCK (1048576), the default slab page size, which requires recompiling to change.
  • The default maximum number of simultaneous connections is 200, controlled by freetotal in conn_init(); the soft maximum is 1024, controlled by settings.maxconns = 1024.
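The key and item-size limits above can be enforced on the client side before an item is ever sent to the server. A minimal sketch (the function name `validate_item` is hypothetical; the limits are the documented 250-byte key cap and 1MB value cap):

```python
# Client-side pre-check against Memcached's documented limits:
# keys are at most 250 bytes, values at most 1MB (the default slab page size).
KEY_MAX_LENGTH = 250
ITEM_MAX_BYTES = 1024 * 1024  # 1MB

def validate_item(key: str, value: bytes) -> bool:
    """Return True if the item fits within Memcached's default limits."""
    key_bytes = key.encode("utf-8")
    if len(key_bytes) > KEY_MAX_LENGTH:
        return False
    # In the text protocol, keys may not contain spaces or control characters.
    if any(b <= 32 or b == 127 for b in key_bytes):
        return False
    return len(value) <= ITEM_MAX_BYTES

print(validate_item("user:42", b"x" * 100))            # small item -> True
print(validate_item("k" * 251, b"data"))               # key too long -> False
print(validate_item("big", b"x" * (1024 * 1024 + 1)))  # value > 1MB -> False
```

Rejecting oversized items in the application avoids a round trip that the server would refuse anyway.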

Memcached does not implement a redundancy mechanism, nor does it do any fault tolerance

When a node fails, the cluster does nothing about it; how to respond is entirely up to the client. There are several options to choose from:
1: Ignore it! Before the failed node is restored or replaced, the remaining nodes can absorb the load.
2: Remove the failed node from the node list. Be careful doing this! Under the remainder (modulo) hashing algorithm, adding or removing a node on the client invalidates almost all cached data: because the node list the hash maps onto has changed, most keys will map to different nodes.
3: Start a hot-standby node that takes over the IP address of the failed node. This avoids any remapping of keys.
4: If you want to add and remove nodes without disturbing the original hash mapping, use a consistent hashing algorithm.
5: Rehashing. When the client finds a node is down, it hashes the key again (with a different hash algorithm) and selects another node. Note that the client does not remove the down node from the node list, so the next access may still hash to it first. If a node flaps between up and down, this double-hashing approach is risky: stale (dirty) data may end up on both nodes.
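The difference between options 2 and 4 can be measured directly. The sketch below (pure Python, with made-up node names and MD5 as the stable hash) removes one of four nodes under remainder hashing and under a simple consistent-hash ring, and counts how many keys get remapped:

```python
import hashlib
from bisect import bisect

def h(s: str) -> int:
    # Stable hash (Python's built-in hash() is randomized per process).
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def mod_node(key, nodes):
    # Remainder (modulo) hashing: key -> nodes[hash % len(nodes)].
    return nodes[h(key) % len(nodes)]

def build_ring(nodes, vnodes=100):
    # Consistent hashing: place each node at many points on a ring.
    return sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))

def ring_node(key, ring):
    # A key belongs to the first ring point clockwise from its hash.
    points = [p for p, _ in ring]
    return ring[bisect(points, h(key)) % len(ring)][1]

nodes = ["cache1", "cache2", "cache3", "cache4"]
keys = [f"key{i}" for i in range(1000)]

# Remainder hashing: remove one node and count keys that move.
before = {k: mod_node(k, nodes) for k in keys}
after = {k: mod_node(k, nodes[:-1]) for k in keys}
mod_moved = sum(before[k] != after[k] for k in keys) / len(keys)

# Consistent hashing: same removal, far fewer keys move.
before_c = {k: ring_node(k, build_ring(nodes)) for k in keys}
after_c = {k: ring_node(k, build_ring(nodes[:-1])) for k in keys}
ch_moved = sum(before_c[k] != after_c[k] for k in keys) / len(keys)

print(f"remainder hashing: {mod_moved:.0%} of keys remapped")
print(f"consistent hashing: {ch_moved:.0%} of keys remapped")
```

With modulo hashing, removing one of four nodes remaps roughly three quarters of the keys; with the ring, only the keys that lived on the removed node (about a quarter) move.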

The usual purpose of using Memcached

Caching database query results reduces the number of database accesses, and caching hot data improves the speed and scalability of web applications. For example:
1: Cache simple query results. The query cache stores the result set corresponding to a given query statement. Caching an entire result set works best for SQL statements that run frequently but whose results rarely change, such as loading specific filtered content.
2: Cache simple row-based query results.
3: Cache more than just SQL data: commonly used hot data such as rendered pages can also be cached to save CPU time.
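Point 1 is the classic cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache. A minimal sketch, where a plain dict stands in for a memcached client and `query_db` is a hypothetical database call:

```python
import hashlib

cache = {}        # stands in for a memcached client's get/set
db_queries = 0    # counts trips to the "database"

def query_db(sql: str) -> str:
    """Stand-in for a real database call; counts how often it runs."""
    global db_queries
    db_queries += 1
    return f"result set for [{sql}]"

def cached_query(sql: str) -> str:
    # Key the cache on a digest of the SQL text: memcached keys must be
    # short and contain no spaces, so the raw statement is unsuitable.
    key = "sql:" + hashlib.md5(sql.encode()).hexdigest()
    result = cache.get(key)
    if result is None:
        result = query_db(sql)      # cache miss: hit the database
        cache[key] = result         # populate the cache for next time
    return result

cached_query("SELECT * FROM categories")  # miss -> database
cached_query("SELECT * FROM categories")  # hit  -> cache
print(db_queries)  # prints 1: the database was queried only once
```

A real client call would also pass an expiration time so stale result sets age out on their own.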

Use tiered caching

Memcached can serve a large amount of cached data at high speed, but depending on the system it is still worth maintaining a multi-layer cache structure. In addition to the Memcached cache, a multi-level cache can be built with local caches (such as ehcache, oscache, etc.).
For example, a local cache can hold basic data that is small but frequently accessed (product categories, connection information, server state variables, application configuration variables, etc.). Keeping such data as close to the processor as possible makes sense: it helps reduce page generation time and increases reliability in the event of a memcached failure.
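A two-tier lookup can be sketched as follows, with dicts standing in for both the in-process cache (the role ehcache/oscache play) and the shared memcached tier; `source_load` is a hypothetical stand-in for the authoritative source:

```python
local_cache = {}    # L1: in-process, closest to the CPU
memcached = {}      # L2: stands in for the shared memcached tier

def source_load(key: str) -> str:
    """Stand-in for the authoritative source (database, config files, ...)."""
    return f"value-of-{key}"

def tiered_get(key: str) -> str:
    if key in local_cache:            # L1 hit: no network round trip at all
        return local_cache[key]
    value = memcached.get(key)
    if value is None:                 # miss in both tiers: load from source
        value = source_load(key)
        memcached[key] = value
    local_cache[key] = value          # promote into the local tier
    return value

print(tiered_get("product-categories"))  # loads from source, fills both tiers
print("product-categories" in local_cache)  # prints True
```

If the memcached tier goes down, L1 still answers for the hottest data, which is the reliability benefit mentioned above.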

  • Special note: the cache must be updated whenever the underlying data is updated
  • Warm up your cache
    If you have a high-traffic site, the cache starts out empty, and a crowd of visitors then hits the site, the database may be overwhelmed while the cache is being populated.
    Solution: fill Memcached with the commonly used data ahead of time. For example, write scripts that cache common pages, or write a command-line tool that fills the cache.
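Such a warm-up script can be as simple as iterating over the hot pages and caching each one before traffic arrives. A minimal sketch (`render_page`, the key format, and the path list are all hypothetical; a dict stands in for the memcached client):

```python
cache = {}  # stands in for a memcached client

def render_page(path: str) -> str:
    """Stand-in for the expensive work (DB queries + templating)."""
    return f"<html>rendered {path}</html>"

def warm_cache(paths):
    # Pre-populate the cache so the first real visitors get cache hits
    # instead of stampeding the database.
    for path in paths:
        cache[f"page:{path}"] = render_page(path)

hot_paths = ["/", "/products", "/about"]
warm_cache(hot_paths)
print(len(cache))  # prints 3: all hot pages are cached before the first visitor
```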

Typical application scenarios of Memcached

1: Distributed application
2: Database front-end cache
3: Data sharing between servers
4: Data that changes and is queried frequently but does not necessarily need to be written to the database, such as users' online status
5: Data that changes infrequently but is queried frequently, whether or not it is stored in the database, is also a good fit

Scenarios that are not suitable for using Memcached

1: Data that changes frequently and must be written to the database as soon as it changes, such as stock and financial data
2: Data that is accessed infrequently; caching it brings no benefit and, on the contrary, slows the system down, because network connections also consume resources
3: The size of the cache object is greater than 1MB
4: The length of the key is greater than 250 characters
5: The hosting environment does not allow the memcached service to run, or cannot run it well.
If the application is hosted on a low-end virtual private server, virtualization technologies such as VMware and Xen are not well suited to running memcached: memcached needs to take over and control large blocks of memory, and if the memory memcached manages is swapped out by the OS, memcached's performance drops sharply.
6: The application runs in an insecure environment.
Memcached provides no security mechanism of its own; anyone with network access can reach it, for example via telnet. If the application runs on a shared system, the security implications must be considered.
7: The business needs persistent data, or what it really needs is a database.

In general, batch import and export of items in Memcached should not be performed

Memcached is a non-blocking server, so anything that could cause it to pause or momentarily deny service deserves careful consideration. Imagine that the cached data changes between export and import: you would need to deal with dirty data. If cached data expires between export and import, what do you do with it? That said, bulk import is very useful in one scenario: if you have a lot of data that never changes and you want the cache to warm up quickly, bulk importing cached data is helpful.

What if I really need to export and import items in memcached in batches?

In some scenarios batch export and import really is necessary, for example to deal with the "thundering herd" problem (a node fails and the database is overwhelmed by repeated queries), or with poorly optimized queries.
A possible solution is to use MogileFS (or CouchDB or similar software) to store the items: pre-compute the items and dump them to disk. MogileFS can easily overwrite items and provide fast access. You can even cache the items from MogileFS in memcached to speed up reads. The MogileFS + Memcached combination speeds up responses on cache misses and improves the application's availability.
