Slab Allocation mechanism: organize memory for reuse
By default, recent versions of memcached use a mechanism called Slab Allocation to allocate and manage memory. Before this mechanism existed, memory was allocated by simply calling malloc and free for every record. That approach, however, leads to memory fragmentation, burdens the operating system's memory manager, and in the worst case can make the operating system slower than the memcached process itself. Slab Allocation was born to solve this problem.
Let's take a look at the principle of Slab Allocator. Here is the goal of the slab allocator in the memcached documentation:
the primary goal of the slabs subsystem in memcached was to eliminate memory fragmentation issues totally by using fixed-size memory chunks coming from a few predetermined size classes.
In other words, the basic principle of Slab Allocation is to divide allocated memory into blocks of predetermined sizes, which completely avoids memory fragmentation.
The mechanism itself is quite simple: the allocated memory is divided into chunks of various predetermined sizes,
and chunks of the same size are grouped together into classes (sets of chunks) (Figure 1).
Figure 1 Structure diagram of Slab Allocation
Moreover, the slab allocator is also designed to reuse allocated memory: once allocated, memory is never released back to the system but is instead reused.
Key Terms for Slab Allocation
Page
The memory space allocated to a slab class; the default size is 1 MB. After being assigned to a slab class, a page is divided into chunks matching that class's chunk size.
Chunk
Memory space for caching records.
Slab Class
A group of chunks of a specific size.
The principle of caching records in Slab
The following describes how memcached selects slabs for data sent by clients and caches them in chunks.
According to the size of the received data, memcached selects the slab class whose chunk size best fits the data (Figure 2).
memcached keeps a list of free chunks for each slab class, picks a chunk from that list,
and caches the data in it.
Figure 2 Method for selecting groups to store records in
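The selection step above amounts to finding the smallest chunk size that still fits the item. A minimal Python sketch (the function name is hypothetical; the real logic lives in memcached's C code, and the class sizes are the factor-1.25 defaults shown later in this article):

```python
import bisect

# Chunk sizes of each slab class, sorted ascending
# (the default growth-factor-1.25 sizes quoted later in the article).
SLAB_CLASS_SIZES = [88, 112, 144, 184, 232, 296, 376, 472, 592, 744]

def pick_slab_class(item_size):
    """Return the smallest chunk size that can hold item_size bytes,
    or None if the item is too large for any class."""
    i = bisect.bisect_left(SLAB_CLASS_SIZES, item_size)
    if i == len(SLAB_CLASS_SIZES):
        return None  # larger than the biggest chunk size
    return SLAB_CLASS_SIZES[i]

print(pick_slab_class(100))  # -> 112: smallest class that fits 100 bytes
print(pick_slab_class(112))  # -> 112: exact fit
print(pick_slab_class(800))  # -> None: no class is large enough
```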
In fact, Slab Allocator has advantages and disadvantages. The following describes its shortcomings.
Disadvantages of Slab Allocator
Slab Allocator solves the original memory fragmentation problem, but the new mechanism also brings new problems to memcached.
The problem is that since memory is handed out in chunks of fixed lengths, it cannot be used with perfect efficiency. For example, caching 100 bytes of data in a 128-byte chunk wastes the remaining 28 bytes (Figure 3).
Figure 3 Use of chunk space
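The waste in Figure 3 is easy to quantify: an item always occupies a whole chunk, so the overhead is the chunk size minus the item size. A trivial sketch using the numbers from the text:

```python
def waste(item_size, chunk_size):
    """Bytes lost when an item is stored in a fixed-size chunk."""
    return chunk_size - item_size

# The example from the text: a 100-byte record in a 128-byte chunk.
lost = waste(100, 128)
print(lost)              # -> 28 bytes wasted
print(lost / 128 * 100)  # -> 21.875 (% of the chunk)
```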
There is currently no perfect solution to this problem, but the documentation records the most effective approach:
The most efficient way to reduce the waste is to use a list of size classes that closely matches (if that's at all possible) common sizes of objects that the clients of this particular installation of memcached are likely to store.
That is, if the common sizes of the data the clients send are known in advance, or if only data of similar sizes will be cached, waste can be reduced by using a list of size classes suited to those sizes.
Unfortunately, a custom list of size classes cannot be specified yet; for that we can only look forward to future versions. We can, however, tune the spacing between slab class sizes. The growth factor option is described next.
Tuning with Growth Factor
By specifying the growth factor at startup (via the -f option), memcached
can control the differences between slab class sizes to some extent. The default value is 1.25. Before this option was introduced, the factor was fixed at 2, a strategy known as "powers of 2".
Let's try starting memcached in verbose mode with the previous settings:
$ memcached -f 2 -vv
Here is the verbose output after startup:
slab class 1: chunk size 128 perslab 8192
slab class 2: chunk size 256 perslab 4096
slab class 3: chunk size 512 perslab 2048
slab class 4: chunk size 1024 perslab 1024
slab class 5: chunk size 2048 perslab 512
slab class 6: chunk size 4096 perslab 256
slab class 7: chunk size 8192 perslab 128
slab class 8: chunk size 16384 perslab 64
slab class 9: chunk size 32768 perslab 32
slab class 10: chunk size 65536 perslab 16
slab class 11: chunk size 131072 perslab 8
slab class 12: chunk size 262144 perslab 4
slab class 13: chunk size 524288 perslab 2
It can be seen that, starting from the 128-byte class, chunk sizes double from one class to the next. The problem with this setting is that the gaps between slab classes are large, and in some cases this wastes quite a lot of memory. To minimize that waste, the growth factor option was added two years ago.
Let's look at the output with the current default setting (f=1.25); for reasons of space, only classes up to the 10th are shown here:
slab class 1: chunk size 88 perslab 11915
slab class 2: chunk size 112 perslab 9362
slab class 3: chunk size 144 perslab 7281
slab class 4: chunk size 184 perslab 5698
slab class 5: chunk size 232 perslab 4519
slab class 6: chunk size 296 perslab 3542
slab class 7: chunk size 376 perslab 2788
slab class 8: chunk size 472 perslab 2221
slab class 9: chunk size 592 perslab 1771
slab class 10: chunk size 744 perslab 1409
It can be seen that the gaps between classes are much smaller than with a factor of 2, making this setting more suitable for caching records of a few hundred bytes. The output above may look like it contains calculation errors; these deviations are deliberate, to keep chunk sizes byte-aligned.
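Those apparent "errors" come from alignment: after multiplying by the growth factor, memcached rounds each chunk size up to a multiple of 8 bytes. A sketch that reproduces the sizes shown above (the 88-byte starting size depends on the item header size of this particular build, so treat it as an assumption):

```python
ALIGN = 8      # chunk sizes are kept 8-byte aligned
FACTOR = 1.25  # the default growth factor
size = 88      # smallest chunk size in the output above (build-dependent)

sizes = []
for _ in range(10):
    sizes.append(size)
    size = int(size * FACTOR)
    if size % ALIGN:                     # round up to the next multiple of 8
        size += ALIGN - size % ALIGN

print(sizes)  # -> [88, 112, 144, 184, 232, 296, 376, 472, 592, 744]
```

For example, 88 * 1.25 = 110, which is rounded up to 112; 112 * 1.25 = 140, rounded up to 144, and so on, matching the verbose output exactly.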
When introducing memcached into production, rather than deploying with the default values as-is, it is best to estimate the expected average data length and adjust the growth factor to obtain the most appropriate settings. Memory is a precious resource, and wasting it would be a shame.
Next, I will introduce how to use the stats command of memcached to view various information such as the utilization of slabs.
View the internal state of memcached
memcached has a command called stats that can be used to obtain various information.
There are many ways to execute this command; telnet is the simplest:
$ telnet hostname port
After connecting to memcached, enter stats and press Enter to get various information, including resource utilization. In addition, entering "stats slabs" or "stats items" returns information about cached records. To end the session, enter quit.
Details of these commands can be found in the protocol.txt document in the memcached package.
$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
stats
STAT pid 481
STAT uptime 16574
STAT time 1213687612
STAT version 1.2.5
STAT pointer_size 32
STAT rusage_user 0.102297
STAT rusage_system 0.214317
STAT curr_items 0
STAT total_items 0
STAT bytes 0
STAT curr_connections 6
STAT total_connections 8
STAT connection_structures 7
STAT cmd_get 0
STAT cmd_set 0
STAT get_hits 0
STAT get_misses 0
STAT evictions 0
STAT bytes_read 20
STAT bytes_written 465
STAT limit_maxbytes 67108864
STAT threads 4
END
quit
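Each line of the reply has the form "STAT <name> <value>", terminated by a line reading "END", so scripted access is straightforward. A minimal Python sketch that parses such a reply into a dictionary (the sample text is taken from the session above):

```python
def parse_stats(reply):
    """Turn a raw 'stats' reply into a {name: value} dict."""
    stats = {}
    for line in reply.splitlines():
        if line == "END":
            break
        _, name, value = line.split(" ", 2)  # "STAT <name> <value>"
        stats[name] = value
    return stats

reply = "STAT pid 481\nSTAT uptime 16574\nSTAT version 1.2.5\nEND\n"
stats = parse_stats(reply)
print(stats["version"])  # -> 1.2.5
print(stats["uptime"])   # -> 16574
```

In practice the reply would be read from a socket connected to port 11211, but the parsing is the same.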
In addition, if libmemcached, a client library for C/C++, is installed, the memstat command comes with it. Its usage is very simple: it retrieves the same information as telnet in fewer steps and can query multiple servers at once.
$ memstat --servers=server1,server2,server3,...
libmemcached can be obtained from:
Check the usage of slabs
Using memcached-tool, a Perl script written by Brad Fitzpatrick, the creator of memcached, you can easily view slab usage (it formats memcached's return values into an easy-to-read form). The script can be obtained from the following address:
The method of use is also extremely simple:
$ memcached-tool hostname:port option
There is no need to specify an option when viewing slab usage, so the following command is enough:
$ memcached-tool hostname:port
The information obtained is as follows:
# Item_Size Max_age 1MB_pages Count Full?
1 104 B 1394292 s 1215 12249628 yes
2 136 B 1456795 s 52 400919 yes
3 176 B 1339587 s 33 196567 yes
4 224 B 1360926 s 109 510221 yes
5 280 B 1570071 s 49 183452 yes
6 352 B 1592051 s 77 229197 yes
7 440 B 1517732 s 66 157183 yes
8 552 B 1460821 s 62 117697 yes
9 696 B 1521917 s 143 215308 yes
10 872 B 1695035 s 205 246162 yes
11 1.1 kB 1681650 s 233 221968 yes
12 1.3 kB 1603363 s 241 183621 yes
13 1.7 kB 1634218 s 94 57197 yes
14 2.1 kB 1695038 s 75 36488 yes
15 2.6 kB 1747075 s 65 25203 yes
16 3.3 kB 1760661 s 78 24167 yes
The meaning of each column is:
Column | Meaning |
# | slab class number |
Item_Size | chunk size |
Max_age | age of the oldest record in the LRU |
1MB_pages | number of pages allocated to the slab class |
Count | number of records in the slab class |
Full? | whether the slab class has run out of free chunks |
The information obtained from this script is very handy for tuning and is highly recommended.
Summary of Memory Storage
This article briefly explained memcached's caching mechanism and tuning methods. I hope readers now understand memcached's memory management principles along with their advantages and disadvantages.
Next time, I will continue with the principles of LRU and Expire, as well as the latest development direction of memcached: its pluggable architecture.