1. The system cache
On sites with massive concurrent access, and in large clusters such as OpenStack, a relational database, especially a very large one, becomes the bottleneck: if it must handle thousands of requests per second, each querying one record out of a table with hundreds of millions of rows, query efficiency drops sharply and the database cannot sustain the load.
A cache system solves the inefficiency and pressure that large-scale concurrent access puts on the database. By keeping frequently used, active data in memory, it avoids repeated queries against the database, eliminating both the frequent disk I/O and the cost of querying very large tables. For this reason a cache system is an almost indispensable component of a large web site.
A cache system can be thought of as a memory-based database. Compared with the large back-end production database, a memory-based cache provides fast data access, which speeds up responses to client requests and reduces the access pressure on the back-end database.
2. Memcached concepts
Memcached is an open-source, high-performance, distributed memory object caching system. By caching data and objects in memory it reduces the number of database reads, improves site access speed, and accelerates dynamic web applications by relieving database load.
Memcached is an in-memory cache: frequently accessed objects and data are kept in memory and read and written through its API. Internally, data is stored in an in-memory hash table as key-value pairs. Because Memcached provides no authentication, access control, or other security management, Memcached servers in internet-facing architectures are generally placed in a trusted, secured zone.
When a Memcached node runs out of free physical memory, it uses the Least Recently Used (LRU) algorithm to evict recently inactive data, freeing memory in which to store new data.
Memcached has clear advantages in solving the many data-caching problems of large clusters and is easy to extend, so more and more users adopt it as their cluster caching system. In addition, Memcached exposes an open API, so it can be used from most major programming languages, including Java, C/C++, C#, Perl, Python, PHP, and Ruby.
Because of these advantages, Memcached has become the caching system of choice for many open-source cluster projects. In OpenStack, the Keystone identity service uses Memcached to cache tokens and tenant identity information, so that user login authentication does not have to query user data from the back-end MySQL database every time; this greatly speeds up authentication under the heavy database load of a large OpenStack cluster. Likewise, the Horizon web management interface and the Swift object storage project use Memcached to cache data and improve the response time of client requests.
3. The Memcached caching workflow
1. The client checks whether the requested data is in Memcached. If it is, the data is returned directly, with no further operations.
2. If the requested data is not in Memcached, the client queries the database, returns the data retrieved from the database to the caller, and caches a copy of it in Memcached.
3. Each time the database is updated, Memcached is updated as well, to keep the cached data consistent.
4. When the memory allocated to Memcached is used up, an expiration policy combined with LRU (Least Recently Used) decides what to evict: expired data is replaced first, then the data that has not been used recently.
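The read-through workflow in steps 1–3 can be sketched in a few lines of Python. Plain dicts stand in for Memcached and the backing database here; the names `cache`, `database`, `get`, and `update` are illustrative, not part of any client library.

```python
cache = {}                      # stands in for Memcached
database = {"user:1": "alice"}  # stands in for the SQL back end

def get(key):
    """Steps 1-2: try the cache first, fall back to the database."""
    if key in cache:
        return cache[key]           # cache hit: return directly
    value = database.get(key)       # cache miss: query the database
    if value is not None:
        cache[key] = value          # populate the cache for next time
    return value

def update(key, value):
    """Step 3: write the database and the cache together for consistency."""
    database[key] = value
    cache[key] = value

print(get("user:1"))    # first access: loaded from the database
print(get("user:1"))    # second access: served from the cache
update("user:1", "bob") # both stores updated together
print(get("user:1"))
```

The key property is that the database is only hit on a miss; every later read of the same key is served from memory until the entry is evicted or updated.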
4. Memcached features
1. Simple protocol
Memcached uses a text-based, line-oriented protocol, so you can read and write data on the server directly through a telnet session.
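To show what that line-oriented protocol looks like, the sketch below builds the raw bytes of a `set` and a `get` command, exactly as you would type them over telnet. No server is contacted; this only illustrates the wire format.

```python
# Memcached text protocol format:
#   set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
#   get <key>\r\n
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode()
    return header + value + b"\r\n"

def build_get(key: str) -> bytes:
    return f"get {key}\r\n".encode()

print(build_set("greeting", b"hello"))  # b'set greeting 0 0 5\r\nhello\r\n'
print(build_get("greeting"))            # b'get greeting\r\n'
```

Because every command is a human-readable line terminated by `\r\n`, the protocol is easy to debug interactively, which is exactly why a plain telnet client is enough to inspect a running server.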
2. Event handling based on libevent
libevent is a library written in C that wraps event-handling mechanisms such as kqueue on BSD systems and epoll on Linux behind a single interface, so performance remains good even as the number of server-side connections grows. Memcached uses this library for asynchronous event handling.
3. Built-in memory management
Memcached manages memory in its own, very efficient way: all data lives in Memcached's built-in memory store, and when that space fills up, the LRU algorithm automatically removes cache entries that have not been used, so the memory of expired items is reused. Memcached does not address disaster recovery: once the service restarts, all cached data is lost.
4. Distributed, with independent nodes
Memcached servers do not communicate with one another; each node stores and retrieves data independently and shares nothing with the others. Distribution is implemented entirely in the client: through the client's design, Memcached gains distributed behavior and can support huge caches and large-scale applications.
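Since the servers never talk to each other, it is the client that decides which node owns each key. A minimal sketch of that client-side routing is shown below; the server addresses are made up for the example, and real clients typically use consistent hashing rather than this simple modulo scheme, so that adding or removing a node remaps fewer keys.

```python
import hashlib

# Hypothetical node list; the servers never see or share this mapping.
servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

def server_for(key: str) -> str:
    """Pick the node that owns `key` by hashing the key on the client side."""
    digest = hashlib.md5(key.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]

for key in ["user:1", "user:2", "session:xyz"]:
    print(key, "->", server_for(key))
```

Because the mapping is a pure function of the key and the node list, every client instance independently routes the same key to the same server, with no coordination between the nodes themselves.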
5. Factors to consider when using Memcached
1. Single point of failure per node
In a Memcached cluster every node stores data independently; there is no data synchronization or mirroring between nodes. If a node fails or restarts, all the data cached in its memory is lost, and it is only cached on that server again when it is next accessed.
2. Storage space limits
Because Memcached stores its data in memory, it is necessarily bounded by the address space: a 32-bit system can cache at most about 2 GB, while on a 64-bit system the cache is effectively unlimited, constrained only by how much physical memory the Memcached server has.
3. Storage unit limits
Memcached stores data in key-value units: a key may be at most 250 bytes and a value at most 1 MB. Items exceeding these limits cannot be stored.
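A client can check these limits before attempting a store. The `validate_item` helper below is hypothetical, not part of any Memcached client library; it simply encodes the 250-byte key and 1 MB value limits just described.

```python
MAX_KEY_BYTES = 250
MAX_VALUE_BYTES = 1024 * 1024  # 1 MB default item size limit

def validate_item(key: str, value: bytes) -> bool:
    """Return True if the item fits within Memcached's default size limits."""
    if len(key.encode()) > MAX_KEY_BYTES:
        return False                # key longer than 250 bytes
    if len(value) > MAX_VALUE_BYTES:
        return False                # value larger than 1 MB
    return True

print(validate_item("user:1", b"x" * 100))               # True
print(validate_item("k" * 300, b"x"))                    # False: key too long
print(validate_item("big", b"x" * (2 * 1024 * 1024)))    # False: value too large
```

In practice, values larger than 1 MB are usually a sign the data should be split across several keys or compressed before caching.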
4. Memory fragmentation
Memcached allocates its storage in fixed-size chunks. Stored values almost never exactly fill a chunk, so some internal fragmentation, and therefore wasted space, is inevitable.
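The waste can be quantified: each item is rounded up to the smallest chunk class that fits it. The sketch below uses a 1.25 growth factor, which mirrors Memcached's default, but the 80-byte base size and the resulting numbers are only illustrative.

```python
def chunk_sizes(base=80, factor=1.25, max_size=1024 * 1024):
    """Generate the chunk size classes, each `factor` larger than the last."""
    sizes = []
    size = base
    while size < max_size:
        sizes.append(int(size))
        size *= factor
    sizes.append(max_size)
    return sizes

def wasted_bytes(item_size: int) -> int:
    """Bytes left unused in the chunk that a given item lands in."""
    for chunk in chunk_sizes():
        if chunk >= item_size:
            return chunk - item_size
    raise ValueError("item larger than the maximum chunk")

# With classes 80, 100, 125, ... a 101-byte item occupies a 125-byte
# chunk, wasting 24 bytes of that chunk.
print(wasted_bytes(101))  # 24
print(wasted_bytes(100))  # 0: exactly fills the 100-byte class
```

The waste is bounded by the growth factor: the larger the factor, the fewer size classes exist, and the more of each chunk can be left empty.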
5. LRU algorithm limitations
Memcached's LRU algorithm does not operate over the global storage space but per slab, where a slab is a collection of chunks of the same size inside Memcached.
6. Data access security
The Memcached server has no built-in authentication mechanism: anyone who can open an unencrypted telnet connection to the server can perform any operation on its data.