Foreword
Anyone who has written a C program knows that memory is allocated dynamically with malloc(), which in glibc is backed by the ptmalloc2 memory allocator.
Besides glibc, two well-known memory allocators in the industry are Google's tcmalloc and Facebook's jemalloc. Both have an advantage over glibc in avoiding memory fragmentation and in performance, and the benefit is more pronounced in multi-threaded environments.
Golang also implements its own memory allocator, based on principles similar to tcmalloc. Simply put, it maintains one large global memory area, and each thread (a P in Golang) maintains a small private memory area; when the private memory runs short, the thread requests more from the global area.
In addition, memory allocation and GC (garbage collection) are closely related, so it is necessary to understand memory allocation before studying how GC works.
Basic concepts
To manage memory itself, the usual practice is to first request a block of memory from the system, cut it into small pieces, and then manage those pieces with a memory allocation algorithm.
Taking a 64-bit system as an example, the memory that a Golang program requests from the system at startup is shown below:
The pre-requested memory is divided into three parts: spans, bitmap, and arena. The arena is the so-called heap area, from which the application allocates memory as needed. The spans and bitmap areas exist to manage the arena.
The arena is 512GB in size. To make it easier to manage, the arena is divided into pages of 8KB each, for a total of 512GB / 8KB pages.
The spans area stores pointers to spans, one pointer per page, so its size is (512GB / 8KB) * 8 bytes (the pointer size) = 512MB.
The size of the bitmap area is also derived from the arena size, but it is mainly used by GC.
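The arithmetic above can be checked with a few constants. This is a minimal sketch; the names pageSize, arenaSize, pointerSize, arenaPages, and spansAreaSize are illustrative and are not taken from the runtime source.

```go
package main

import "fmt"

const (
	pageSize    = 8 << 10   // 8KB per page
	arenaSize   = 512 << 30 // 512GB arena on a 64-bit system
	pointerSize = 8         // bytes per pointer on a 64-bit system
)

// arenaPages returns the number of pages the arena is divided into.
func arenaPages() uint64 { return arenaSize / pageSize }

// spansAreaSize returns the bytes needed to store one span pointer per page.
func spansAreaSize() uint64 { return arenaPages() * pointerSize }

func main() {
	fmt.Println("pages in arena:", arenaPages())
	fmt.Println("spans area in MB:", spansAreaSize()>>20)
}
```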
span
A span is the key data structure for managing the pages of the arena. Each span contains one or more contiguous pages. To serve small-object allocations, the memory in a span is divided into smaller blocks; large objects, such as those exceeding the page size, are served by multiple pages.
class
Spans are divided into a series of classes according to object size. Each class represents objects of one fixed size and a fixed span size, as shown in the following table:
// class bytes/obj bytes/span objects waste bytes
// 1 8 8192 1024 0
// 2 16 8192 512 0
// 3 32 8192 256 0
// 4 48 8192 170 32
// 5 64 8192 128 0
// 6 80 8192 102 32
// 7 96 8192 85 32
// 8 112 8192 73 16
// 9 128 8192 64 0
// 10 144 8192 56 128
// 11 160 8192 51 32
// 12 176 8192 46 96
// 13 192 8192 42 128
// 14 208 8192 39 80
// 15 224 8192 36 128
// 16 240 8192 34 32
// 17 256 8192 32 0
// 18 288 8192 28 128
// 19 320 8192 25 192
// 20 352 8192 23 96
// 21 384 8192 21 128
// 22 416 8192 19 288
// 23 448 8192 18 128
// 24 480 8192 17 32
// 25 512 8192 16 0
// 26 576 8192 14 128
// 27 640 8192 12 512
// 28 704 8192 11 448
// 29 768 8192 10 512
// 30 896 8192 9 128
// 31 1024 8192 8 0
// 32 1152 8192 7 128
// 33 1280 8192 6 512
// 34 1408 16384 11 896
// 35 1536 8192 5 512
// 36 1792 16384 9 256
// 37 2048 8192 4 0
// 38 2304 16384 7 256
// 39 2688 8192 3 128
// 40 3072 24576 8 0
// 41 3200 16384 5 384
// 42 3456 24576 7 384
// 43 4096 8192 2 0
// 44 4864 24576 5 256
// 45 5376 16384 3 256
// 46 6144 24576 4 0
// 47 6528 32768 5 128
// 48 6784 40960 6 256
// 49 6912 49152 7 768
// 50 8192 8192 1 0
// 51 9472 57344 6 512
// 52 9728 49152 5 512
// 53 10240 40960 4 0
// 54 10880 32768 3 128
// 55 12288 24576 2 0
// 56 13568 40960 3 256
// 57 14336 57344 4 0
// 58 16384 16384 1 0
// 59 18432 73728 4 0
// 60 19072 57344 3 128
// 61 20480 40960 2 0
// 62 21760 65536 3 256
// 63 24576 24576 1 0
// 64 27264 81920 3 128
// 65 28672 57344 2 0
// 66 32768 32768 1 0
The columns in the table have the following meanings:
- class: the class ID; each span structure has a class ID, which indicates the object type the span can handle
- bytes/obj: the number of bytes in one object of this class
- bytes/span: the number of heap bytes occupied by each span, i.e. number of pages * page size
- objects: the number of objects each span can hold, i.e. (bytes/span) / (bytes/obj)
- waste bytes: the memory fragmentation produced by each span, i.e. (bytes/span) % (bytes/obj)
As the table shows, the maximum object size is 32KB. Objects larger than 32KB are represented by a special class whose class ID is 0; each span of this class contains only one object.
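The objects and waste bytes columns follow directly from the other two. The sketch below recomputes them for three rows copied from the table; the sizeClass type and its methods are hypothetical helpers, not runtime code.

```go
package main

import "fmt"

// sizeClass holds the independent columns of one row of the class table.
type sizeClass struct {
	id       int
	objSize  int // bytes/obj
	spanSize int // bytes/span
}

// objects returns how many objects fit in one span of this class.
func (c sizeClass) objects() int { return c.spanSize / c.objSize }

// waste returns the bytes left over at the end of the span.
func (c sizeClass) waste() int { return c.spanSize % c.objSize }

func main() {
	rows := []sizeClass{
		{4, 48, 8192},
		{10, 144, 8192},
		{34, 1408, 16384},
	}
	for _, c := range rows {
		fmt.Printf("class %d: objects=%d waste=%d\n", c.id, c.objects(), c.waste())
	}
}
```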
span data structure
A span is the basic unit of memory management. Each span manages objects of one particular class; according to the object size, the one or more pages in the span are split into blocks for management.
src/runtime/mheap.go:mspan
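The data structure is defined as:
type mspan struct {
    next *mspan // forward list pointer, used to link spans together
    prev *mspan // backward list pointer, used to link spans together
    startAddr uintptr // start address, i.e. the address of the managed pages
    npages uintptr // number of pages managed
    nelems uintptr // number of blocks, i.e. how many blocks are available for allocation
    allocBits *gcBits // allocation bitmap; each bit indicates whether a block is allocated
    allocCount uint16 // number of allocated blocks
    spanclass spanClass // class ID from the class table
    elemsize uintptr // object size from the class table, i.e. the block size
}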
Taking class 10 as an example, the span and the memory it manages are shown in the following figure:
With spanclass 10, the class table gives npages = 1, nelems = 56, and elemsize = 144. startAddr is set to the address of a page when the span is initialized. allocBits points to a bitmap in which each bit indicates whether a block is allocated; in this example two blocks have been allocated, so allocCount is 2.
next and prev link multiple spans together, which makes it easier to manage multiple spans, as described next.
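To make the roles of allocBits and allocCount concrete, here is a minimal sketch of block allocation inside one span. The span type, newSpan, and alloc are simplified stand-ins invented for illustration; the real mspan uses gcBits and a more elaborate free-index search.

```go
package main

import "fmt"

// span is a simplified, hypothetical stand-in for mspan.
type span struct {
	startAddr  uintptr
	elemSize   uintptr
	nelems     uintptr
	allocBits  []uint8 // one bit per block, 1 = allocated
	allocCount uint16
}

func newSpan(startAddr, elemSize, nelems uintptr) *span {
	return &span{
		startAddr: startAddr,
		elemSize:  elemSize,
		nelems:    nelems,
		allocBits: make([]uint8, (nelems+7)/8),
	}
}

// alloc marks the first free block as allocated and returns its address.
// The second result is false when every block is already in use.
func (s *span) alloc() (uintptr, bool) {
	for i := uintptr(0); i < s.nelems; i++ {
		if s.allocBits[i/8]&(1<<(i%8)) == 0 {
			s.allocBits[i/8] |= 1 << (i % 8)
			s.allocCount++
			return s.startAddr + i*s.elemSize, true
		}
	}
	return 0, false
}

func main() {
	// Class 10: one page, 56 blocks of 144 bytes. The start address is made up.
	s := newSpan(0x10000, 144, 56)
	a, _ := s.alloc()
	b, _ := s.alloc()
	fmt.Printf("first=%#x second=%#x allocCount=%d\n", a, b, s.allocCount)
}
```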
cache
With span as the basic memory management unit, a data structure is still needed to manage the spans; this structure is called mcentral. When a thread needs to allocate memory, it obtains a span from the memory managed by mcentral. To avoid constant locking when multiple threads request memory, Golang gives each thread its own cache of spans; this cache is mcache.
src/runtime/mcache.go:mcache
The cache data structure is defined as:
type mcache struct {
    alloc [67*2]*mspan // list of mspans grouped by class
}
alloc is an array of mspan pointers whose size is twice the total number of classes. Each element of the array represents a list of spans of one class, and each class has two span lists: the objects in the first list contain pointers, and the objects in the second list do not. This is done to improve GC scan performance, because span lists that contain no pointers need not be scanned.
Depending on whether an object contains pointers, objects are divided into two kinds, noscan and scan: noscan means the object has no pointers, while scan means it has pointers and must be scanned by GC.
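One way to index a [67*2] array by class and scan kind is to fold the noscan flag into the low bit of the index. The sketch below is modeled on that idea; the function names and the exact bit layout are assumptions made for illustration, not a quotation of the runtime code.

```go
package main

import "fmt"

const numSizeClasses = 67 // class IDs 0..66

// spanClass packs a class ID and a noscan flag into one index.
type spanClass uint8

// makeSpanClass puts the class ID in the upper bits and noscan in bit 0.
func makeSpanClass(sizeClass uint8, noscan bool) spanClass {
	sc := spanClass(sizeClass << 1)
	if noscan {
		sc |= 1
	}
	return sc
}

// sizeClass recovers the class ID from the packed value.
func (sc spanClass) sizeClass() uint8 { return uint8(sc >> 1) }

// noscan reports whether objects of this span class contain no pointers.
func (sc spanClass) noscan() bool { return sc&1 != 0 }

func main() {
	scan := makeSpanClass(10, false)
	noscan := makeSpanClass(10, true)
	fmt.Println("scan index:", scan, "noscan index:", noscan)
	fmt.Println("array length:", numSizeClasses*2)
}
```

With this layout every class occupies two adjacent slots, so the largest index is 66*2+1 = 133, which fits the 134-element array.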
The relationship between mcache and span is shown below:
When an mcache is initialized it holds no spans; it acquires spans dynamically from central during use and caches them. Depending on how the data is used, the number of spans of each class differs. As shown in the figure, there are more spans of class 0 than of class 1, which indicates that this thread allocates more small objects.
central
The cache is a private resource that serves a single thread, while central is a global resource that serves multiple threads. When a thread runs out of memory it requests more from central, and when a thread releases memory it is recycled into central.
src/runtime/mcentral.go:mcentral
The central data structure is defined as:
type mcentral struct {
    lock mutex // mutex
    spanclass spanClass // span class ID
    nonempty mSpanList // list of spans that still have free blocks
    empty mSpanList // list of spans that have no free blocks
    nmalloc uint64 // cumulative number of objects allocated
}
- lock: a mutex between threads that prevents read/write conflicts among multiple threads
- spanclass: each mcentral manages a set of span lists that share the same class
- nonempty: the list of spans that still have memory available
- empty: the list of spans that have no memory available
- nmalloc: the cumulative number of objects allocated
The steps for a thread to get a span from central are:
- Lock
- Take an available span from the nonempty list and remove it from that list
- Put the removed span into the empty list
- Return the span to the thread
- Unlock
- The thread stores the span in its mcache
The steps for a thread to return a span are:
- Lock
- Remove the span from the empty list
- Add the span to the nonempty list
- Unlock
The above is a simplified flow of how a thread gets a span from central and returns it; for brevity, the details are not expanded here.
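The two flows above can be sketched as follows. This is a simplified model under stated assumptions: the central type, cacheSpan, and uncacheSpan are illustrative names, slices replace the runtime's linked mSpanList, and sync.Mutex replaces the runtime's internal mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// span is a placeholder; only its identity matters in this sketch.
type span struct{ id int }

// central is a hypothetical, simplified model of mcentral.
type central struct {
	lock     sync.Mutex
	nonempty []*span // spans that still have free blocks
	empty    []*span // spans with no free blocks, or handed out to a thread
}

// cacheSpan hands one span from nonempty to the calling thread and
// moves it to empty. It returns nil when nonempty has no span.
func (c *central) cacheSpan() *span {
	c.lock.Lock()
	defer c.lock.Unlock()
	if len(c.nonempty) == 0 {
		return nil
	}
	s := c.nonempty[0]
	c.nonempty = c.nonempty[1:]
	c.empty = append(c.empty, s)
	return s
}

// uncacheSpan takes a span back from a thread: it is removed from
// empty and appended to nonempty. Unknown spans are ignored.
func (c *central) uncacheSpan(s *span) {
	c.lock.Lock()
	defer c.lock.Unlock()
	for i, e := range c.empty {
		if e == s {
			c.empty = append(c.empty[:i], c.empty[i+1:]...)
			c.nonempty = append(c.nonempty, s)
			return
		}
	}
}

func main() {
	c := &central{nonempty: []*span{{id: 1}, {id: 2}}}
	s := c.cacheSpan()
	fmt.Println("got span", s.id, "nonempty:", len(c.nonempty), "empty:", len(c.empty))
	c.uncacheSpan(s)
	fmt.Println("returned span", s.id, "nonempty:", len(c.nonempty), "empty:", len(c.empty))
}
```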
heap
As the mcentral data structure shows, each mcentral manages spans of only one class. In fact, every class has its own mcentral, and this set of mcentrals is stored in the mheap data structure.
src/runtime/mheap.go:mheap
The heap data structure is defined as:
type mheap struct {
    lock mutex
    spans []*mspan
    bitmap uintptr // points to the start of the bitmap; the bitmap grows from high addresses to low addresses
    arena_start uintptr // start address of the arena area
    arena_used uintptr // address up to which the arena area has been used
    central [67*2]struct {
        mcentral mcentral
        pad [sys.CacheLineSize - unsafe.Sizeof(mcentral{})%sys.CacheLineSize]byte
    }
}
- lock: a mutex
- spans: points to the spans area, which maps pages to spans
- bitmap: the start address of the bitmap
- arena_start: the start address of the arena area
- arena_used: the highest address currently in use in the arena area
- central: two mcentrals for each class
As the data structure shows, mheap manages all of the memory; in fact, Golang manages memory through a single global variable of type mheap.
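The page-to-span mapping held in spans can be sketched as an index computed from the address offset inside the arena. The heap type, setSpan, and spanOf below are hypothetical simplifications, and the addresses are made up for the example.

```go
package main

import "fmt"

const pageSize = 8 << 10 // 8KB

// span records which pages it covers.
type span struct {
	startAddr uintptr
	npages    uintptr
}

// heap is a hypothetical, simplified model of mheap.
type heap struct {
	arenaStart uintptr
	spans      []*span // one entry per arena page
}

// setSpan records s as the owner of every page it covers.
func (h *heap) setSpan(s *span) {
	first := (s.startAddr - h.arenaStart) / pageSize
	for i := uintptr(0); i < s.npages; i++ {
		h.spans[first+i] = s
	}
}

// spanOf returns the span that owns addr, or nil if the page is unused.
// addr is assumed to lie inside the arena.
func (h *heap) spanOf(addr uintptr) *span {
	return h.spans[(addr-h.arenaStart)/pageSize]
}

func main() {
	h := &heap{arenaStart: 0x100000, spans: make([]*span, 16)}
	s := &span{startAddr: 0x100000 + 2*pageSize, npages: 2}
	h.setSpan(s)
	fmt.Println(h.spanOf(s.startAddr+pageSize+100) == s) // address inside the second page of s
	fmt.Println(h.spanOf(0x100000) == nil)               // page 0 has no span
}
```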
The memory managed by mheap is illustrated as follows:
The memory pre-allocated from the system is divided into three regions, spans, bitmap, and arena, all of which are managed through mheap. Next, let's look at the memory allocation process.
Memory allocation process
Different allocation logic applies depending on the size of the object being allocated:
- (0, 16B) and the object contains no pointers: tiny allocation
- (0, 16B) and the object contains pointers: normal allocation
- [16B, 32KB]: normal allocation
- (32KB, +∞): large object allocation
Tiny allocation and large object allocation are memory management optimizations; only the normal allocation method is covered here.
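The size ranges above amount to a small decision function. In the sketch below, classify and the two constants are illustrative names chosen for this example; the boundaries follow the list above.

```go
package main

import "fmt"

const (
	maxTinySize  = 16       // objects below 16B without pointers use tiny allocation
	maxSmallSize = 32 << 10 // objects up to 32KB use normal allocation
)

// classify returns which allocation path an object of the given
// size would take according to the ranges listed above.
func classify(size uintptr, hasPointers bool) string {
	switch {
	case size > maxSmallSize:
		return "large"
	case size < maxTinySize && !hasPointers:
		return "tiny"
	default:
		return "normal"
	}
}

func main() {
	fmt.Println(classify(8, false))
	fmt.Println(classify(8, true))
	fmt.Println(classify(1024, true))
	fmt.Println(classify(64<<10, false))
}
```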
Taking a request for n bytes of memory as an example, allocation proceeds as follows:
- Get the current thread's private cache, mcache
- Compute the appropriate class ID from the size
- Look up an available span in the alloc[class] list of mcache
- If mcache has no available span, request a new span from mcentral and add it to mcache
- If mcentral has no available span either, request a new span from mheap and add it to mcentral
- Take the address of a free object from the span and return it
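The steps above can be sketched end to end as a toy model. Everything here is a simplification invented for illustration: only classes 1 to 9 are included, the heap is a bump pointer over a made-up address range, locking is omitted, and the scan/noscan split is ignored.

```go
package main

import "fmt"

const pageSize = 8192

// classSizes holds bytes/obj for classes 1..9 of the class table;
// index 0 is unused. Each of these classes has a one-page span.
var classSizes = []uintptr{0, 8, 16, 32, 48, 64, 80, 96, 112, 128}

// sizeToClass returns the smallest class that fits n, or -1.
func sizeToClass(n uintptr) int {
	for c := 1; c < len(classSizes); c++ {
		if n <= classSizes[c] {
			return c
		}
	}
	return -1
}

type span struct{ start, elemSize, nelems, next uintptr }

func (s *span) full() bool { return s.next == s.nelems }

// alloc returns the address of the next free block.
func (s *span) alloc() uintptr {
	addr := s.start + s.next*s.elemSize
	s.next++
	return addr
}

// heap hands out fresh one-page spans from a bump pointer.
type heap struct{ brk uintptr }

func (h *heap) allocSpan(class int) *span {
	s := &span{start: h.brk, elemSize: classSizes[class], nelems: pageSize / classSizes[class]}
	h.brk += pageSize
	return s
}

// central keeps spans of one class and refills itself from the heap.
type central struct {
	nonempty []*span
	h        *heap
}

func (c *central) cacheSpan(class int) *span {
	if len(c.nonempty) == 0 {
		c.nonempty = append(c.nonempty, c.h.allocSpan(class))
	}
	s := c.nonempty[0]
	c.nonempty = c.nonempty[1:]
	return s
}

// cache is the per-thread structure with one current span per class.
type cache struct {
	alloc    [10]*span
	centrals *[10]central
}

// malloc follows the steps: class lookup, cache, central, heap.
func (mc *cache) malloc(n uintptr) (uintptr, bool) {
	class := sizeToClass(n)
	if class < 0 {
		return 0, false // outside this toy model
	}
	s := mc.alloc[class]
	if s == nil || s.full() {
		s = mc.centrals[class].cacheSpan(class)
		mc.alloc[class] = s
	}
	return s.alloc(), true
}

func main() {
	h := &heap{brk: 0x100000}
	var centrals [10]central
	for i := range centrals {
		centrals[i].h = h
	}
	mc := &cache{centrals: &centrals}
	a, _ := mc.malloc(100)
	b, _ := mc.malloc(100)
	c, _ := mc.malloc(8)
	fmt.Printf("%#x %#x %#x\n", a, b, c)
}
```

In this model the second 100-byte request is served from the cached span without touching central, which is the point of keeping a per-thread cache.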
Summary
Memory allocation in Golang is a very complex process that is also interleaved with GC. Only the key data structures are described here, so as to understand the principles without getting bogged down in implementation details.
- At startup a Golang program requests a large block of memory and divides it into the spans, bitmap, and arena areas
- The arena area is divided into pages
- A span manages one or more pages
- mcentral manages multiple spans for threads to request and use
- mcache is a thread-private resource whose spans come from mcentral