Go language memory management

Foreword

Anyone who has written a C program knows that memory is allocated dynamically via malloc(), which on glibc-based systems uses the ptmalloc2 allocator that glibc provides.
Besides glibc's allocator, two other well-known memory allocators in the industry are Google's tcmalloc and Facebook's jemalloc. Both have an advantage over glibc in avoiding memory fragmentation and in performance, and the advantage is more pronounced in multi-threaded environments.

Go also implements its own memory allocator, whose design is similar to tcmalloc. Put simply, it maintains one large global memory area, and each thread (a P in Go) maintains a small private memory area. When the private memory runs out, the thread requests more from the global area.

In addition, memory allocation and GC (garbage collection) are closely related, so it is worth understanding memory allocation before studying how GC works.

Basic concepts

To manage memory itself, the usual practice is to request a block of memory from the system first, cut it into small pieces, and then manage those pieces through a memory allocation algorithm.
Taking a 64-bit system as an example, the memory that a Go program requests from the system at startup is laid out as follows:
[Figure: the pre-allocated memory divided into the spans, bitmap, and arena regions]
The pre-allocated memory is divided into three parts: spans, bitmap, and arena. The arena is the so-called heap area, from which the application allocates the memory it needs. The spans and bitmap regions exist to manage the arena.

The arena is 512GB in size. For ease of management, the arena is divided into pages of 8KB each, for a total of 512GB / 8KB pages.

The spans region stores pointers to spans, with each pointer corresponding to one page, so the size of the spans region is (512GB / 8KB) * 8 bytes (pointer size) = 512MB.

The size of the bitmap region is also derived from the arena size. It is used mainly by the GC.
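
The arithmetic above can be checked with a few lines of Go. This is only a sketch of the calculation; the constant and function names (arenaSize, arenaPages, spansRegionSize) are chosen here for illustration and are not taken from the runtime.

```go
package main

import "fmt"

const (
	arenaSize   = 512 << 30 // 512GB arena, as described above
	pageSize    = 8 << 10   // 8KB per page
	pointerSize = 8         // bytes per pointer on a 64-bit system
)

// arenaPages returns how many pages the arena is divided into.
func arenaPages() int64 { return arenaSize / pageSize }

// spansRegionSize returns the bytes needed to hold one span pointer per page.
func spansRegionSize() int64 { return arenaPages() * pointerSize }

func main() {
	fmt.Println("pages in arena:", arenaPages())
	fmt.Println("spans region in MB:", spansRegionSize()>>20)
}
```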

span

The span is the key data structure for managing arena pages. Each span contains one or more contiguous pages. To serve small-object allocation, a span is divided into smaller blocks. Large objects, such as those bigger than a page, are served by spans made up of multiple pages.

class

Spans are divided into a series of classes according to object size. Each class represents objects of one fixed size and a fixed span size, as shown in the following table:

// class  bytes/obj  bytes/span  objects  waste bytes
//     1          8        8192     1024            0
//     2         16        8192      512            0
//     3         32        8192      256            0
//     4         48        8192      170           32
//     5         64        8192      128            0
//     6         80        8192      102           32
//     7         96        8192       85           32
//     8        112        8192       73           16
//     9        128        8192       64            0
//    10        144        8192       56          128
//    11        160        8192       51           32
//    12        176        8192       46           96
//    13        192        8192       42          128
//    14        208        8192       39           80
//    15        224        8192       36          128
//    16        240        8192       34           32
//    17        256        8192       32            0
//    18        288        8192       28          128
//    19        320        8192       25          192
//    20        352        8192       23           96
//    21        384        8192       21          128
//    22        416        8192       19          288
//    23        448        8192       18          128
//    24        480        8192       17           32
//    25        512        8192       16            0
//    26        576        8192       14          128
//    27        640        8192       12          512
//    28        704        8192       11          448
//    29        768        8192       10          512
//    30        896        8192        9          128
//    31       1024        8192        8            0
//    32       1152        8192        7          128
//    33       1280        8192        6          512
//    34       1408       16384       11          896
//    35       1536        8192        5          512
//    36       1792       16384        9          256
//    37       2048        8192        4            0
//    38       2304       16384        7          256
//    39       2688        8192        3          128
//    40       3072       24576        8            0
//    41       3200       16384        5          384
//    42       3456       24576        7          384
//    43       4096        8192        2            0
//    44       4864       24576        5          256
//    45       5376       16384        3          256
//    46       6144       24576        4            0
//    47       6528       32768        5          128
//    48       6784       40960        6          256
//    49       6912       49152        7          768
//    50       8192        8192        1            0
//    51       9472       57344        6          512
//    52       9728       49152        5          512
//    53      10240       40960        4            0
//    54      10880       32768        3          128
//    55      12288       24576        2            0
//    56      13568       40960        3          256
//    57      14336       57344        4            0
//    58      16384       16384        1            0
//    59      18432       73728        4            0
//    60      19072       57344        3          128
//    61      20480       40960        2            0
//    62      21760       65536        3          256
//    63      24576       24576        1            0
//    64      27264       81920        3          128
//    65      28672       57344        2            0
//    66      32768       32768        1            0

Each column in the table has the following meaning:

  • class: the class ID. Each span structure has a class ID, which indicates the type of object the span can handle
  • bytes/obj: the number of bytes in one object of this class
  • bytes/span: the number of bytes each span occupies in the heap, i.e. number of pages * page size
  • objects: the number of objects each span can allocate, i.e. (bytes/span) / (bytes/obj)
  • waste bytes: the memory fragmentation each span produces, i.e. (bytes/span) % (bytes/obj)
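
As a quick check of the last two formulas, the sketch below recomputes the objects and waste bytes columns for two rows of the table (class 10 and class 34). The helper names objectsPerSpan and wasteBytes are made up for this example.

```go
package main

import "fmt"

// objectsPerSpan returns how many objects of objSize bytes fit in a span.
func objectsPerSpan(spanBytes, objSize int) int { return spanBytes / objSize }

// wasteBytes returns the bytes left over at the end of the span.
func wasteBytes(spanBytes, objSize int) int { return spanBytes % objSize }

func main() {
	rows := []struct{ class, objSize, spanBytes int }{
		{10, 144, 8192},
		{34, 1408, 16384},
	}
	for _, r := range rows {
		fmt.Printf("class %d: objects=%d waste=%d\n",
			r.class,
			objectsPerSpan(r.spanBytes, r.objSize),
			wasteBytes(r.spanBytes, r.objSize))
	}
}
```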

As the table shows, the maximum object size is 32KB. Objects larger than 32KB are represented by a special class whose class ID is 0, and each span of class 0 contains only one object.

span data structure

The span is the basic unit of memory management. Each span manages objects of one particular class. According to the object size, the span splits its one or more pages into multiple blocks and manages them.
The data structure is defined as mspan in src/runtime/mheap.go:

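type mspan struct {
	next *mspan			// forward pointer of the linked list, used to link spans together
	prev *mspan			// backward pointer of the linked list, used to link spans together
	startAddr uintptr // start address, i.e. the address of the pages being managed
	npages    uintptr // number of pages managed
	
	nelems uintptr // number of blocks, i.e. how many blocks can be allocated

	allocBits  *gcBits // allocation bitmap; each bit records whether one block is allocated

	allocCount  uint16     // number of blocks already allocated
	spanclass   spanClass  // class ID from the class table

	elemsize    uintptr    // object size from the class table, i.e. the block size
}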

Taking class 10 as an example, the span and the memory it manages look like this:
[Figure: a class 10 span and the page it manages]
With spanclass 10, the class table gives npages = 1, nelems = 56, and elemsize = 144. startAddr is set to the address of a page when the span is initialized. allocBits points to a bitmap in which each bit indicates whether a block has been allocated. In this example two blocks have been allocated, so allocCount is 2.

next and prev link multiple spans together, which makes it easier to manage them, as described next.
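
To make the roles of nelems, elemsize, allocBits, and allocCount concrete, here is a minimal sketch of a span that hands out blocks by scanning a bitmap. The toySpan type, its byte-slice bitmap, and the linear scan are simplifications assumed for this example. The real mspan uses a *gcBits and a more elaborate search.

```go
package main

import "fmt"

// toySpan is a simplified, hypothetical stand-in for mspan.
type toySpan struct {
	startAddr  uintptr
	nelems     uintptr
	elemsize   uintptr
	allocBits  []uint8 // one bit per block; 1 means allocated
	allocCount uint16
}

// newToySpan builds a span covering spanBytes bytes split into elemsize blocks.
func newToySpan(startAddr, spanBytes, elemsize uintptr) *toySpan {
	n := spanBytes / elemsize
	return &toySpan{
		startAddr: startAddr,
		nelems:    n,
		elemsize:  elemsize,
		allocBits: make([]uint8, (n+7)/8),
	}
}

// alloc marks the first free block as used and returns its address.
// The second result is false when the span has no free block left.
func (s *toySpan) alloc() (uintptr, bool) {
	for i := uintptr(0); i < s.nelems; i++ {
		if s.allocBits[i/8]&(1<<(i%8)) == 0 {
			s.allocBits[i/8] |= 1 << (i % 8)
			s.allocCount++
			return s.startAddr + i*s.elemsize, true
		}
	}
	return 0, false
}

func main() {
	s := newToySpan(0x1000, 8192, 144) // class 10: one 8KB page, 144-byte blocks
	a, _ := s.alloc()
	b, _ := s.alloc()
	fmt.Printf("nelems=%d first=%#x second=%#x allocCount=%d\n",
		s.nelems, a, b, s.allocCount)
}
```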

cache

With the span as the basic unit of memory management, we also need a data structure to manage the spans themselves. This data structure is called mcentral. When a thread needs memory, it allocates from spans managed by mcentral. To avoid constant locking when multiple threads request memory, Go gives each thread its own cache of spans, and this cache is mcache.

The cache data structure is defined as mcache in src/runtime/mcache.go:

type mcache struct {
	alloc [67*2]*mspan // mspans grouped by class
}

alloc is an array of mspan pointers whose size is twice the total number of classes. Each element of the array represents a list of spans of one class. Each class has two span lists: the spans in the first hold objects that contain pointers, and the spans in the second hold objects that do not. This is done to improve GC scan performance, because span lists whose objects contain no pointers do not need to be scanned.

Depending on whether an object contains pointers, objects are divided into two kinds, noscan and scan. noscan means the object has no pointers, while scan means it has pointers and must be scanned by the GC.
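
One way to index an array of 67*2 entries by both class and scan/noscan is to pack the two into a single number. The sketch below keeps the noscan flag in the lowest bit, loosely modeled on the runtime's spanClass type. Treat the exact bit layout as an assumption of this example rather than a statement about every Go version.

```go
package main

import "fmt"

// spanClass packs a size class and a noscan flag into one array index.
type spanClass uint8

// makeSpanClass puts the size class in the upper bits and noscan in bit 0.
func makeSpanClass(sizeclass uint8, noscan bool) spanClass {
	sc := spanClass(sizeclass << 1)
	if noscan {
		sc |= 1
	}
	return sc
}

// sizeclass recovers the class ID from the packed value.
func (sc spanClass) sizeclass() uint8 { return uint8(sc >> 1) }

// noscan reports whether objects of this span class contain no pointers.
func (sc spanClass) noscan() bool { return sc&1 != 0 }

func main() {
	scan := makeSpanClass(10, false)
	noscan := makeSpanClass(10, true)
	fmt.Println("class 10 scan index:", scan)
	fmt.Println("class 10 noscan index:", noscan)
}
```

With this layout the scan and noscan spans of one class sit in adjacent slots of alloc.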

The relationship between mcache and spans is shown below:
[Figure: mcache and the spans it holds for each class]
mcache holds no spans when it is initialized. It acquires spans dynamically from central during use and caches them. Depending on usage, the number of spans differs from class to class. In the figure, class 0 has more spans than class 1, which indicates that this thread allocates more small objects.

central

The cache is a thread-private resource that serves a single thread, whereas central is a global resource that serves multiple threads. When a thread runs out of memory it requests more from central, and when a thread releases memory it is reclaimed into central.

The central data structure is defined as mcentral in src/runtime/mcentral.go:

type mcentral struct {
	lock      mutex     // mutex
	spanclass spanClass // span class ID
	nonempty  mSpanList // list of spans that still have free blocks
	empty     mSpanList // list of spans that have no free blocks

	nmalloc uint64      // cumulative number of objects allocated
}

  • lock: a mutex shared between threads, preventing read/write conflicts among multiple threads
  • spanclass: each mcentral manages a set of span lists that share the same class
  • nonempty: the list of spans that still have memory available
  • empty: the list of spans that have no memory available
  • nmalloc: the cumulative number of objects allocated

A thread obtains a span from central as follows:

  1. Lock
  2. Take a usable span from the nonempty list and remove it from that list
  3. Put the removed span into the empty list
  4. Return the span to the thread
  5. Unlock
  6. The thread caches the span in its mcache

A thread returns a span as follows:

  1. Lock
  2. Remove the span from the empty list
  3. Add the span to the nonempty list
  4. Unlock

The above is only a simplified flow of how a thread obtains a span from central and returns it. For brevity, the details are not expanded here.
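
The two flows can be sketched in Go as follows. This is an illustrative toy, not the runtime code: toyCentral, cacheSpan, and uncacheSpan are names assumed for the example, slices stand in for mSpanList, and sync.Mutex stands in for the runtime's own mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// span is a placeholder for mspan; only an ID is kept for the demo.
type span struct{ id int }

// toyCentral is a hypothetical, simplified mcentral.
type toyCentral struct {
	lock     sync.Mutex
	nonempty []*span // spans that still have free blocks
	empty    []*span // spans handed out to a thread
}

// cacheSpan moves one span from nonempty to empty and returns it,
// or returns nil when no usable span is left.
func (c *toyCentral) cacheSpan() *span {
	c.lock.Lock()
	defer c.lock.Unlock()
	if len(c.nonempty) == 0 {
		return nil
	}
	s := c.nonempty[0]
	c.nonempty = c.nonempty[1:]
	c.empty = append(c.empty, s)
	return s
}

// uncacheSpan moves a returned span from empty back to nonempty.
func (c *toyCentral) uncacheSpan(s *span) {
	c.lock.Lock()
	defer c.lock.Unlock()
	for i, e := range c.empty {
		if e == s {
			c.empty = append(c.empty[:i], c.empty[i+1:]...)
			c.nonempty = append(c.nonempty, s)
			return
		}
	}
}

func main() {
	c := &toyCentral{nonempty: []*span{{id: 1}, {id: 2}}}
	s := c.cacheSpan()
	fmt.Println("got span", s.id, "nonempty:", len(c.nonempty), "empty:", len(c.empty))
	c.uncacheSpan(s)
	fmt.Println("after return, nonempty:", len(c.nonempty), "empty:", len(c.empty))
}
```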

heap

As the mcentral data structure shows, each mcentral manages only spans of one specific class. In fact, every class has a corresponding mcentral, and this collection of mcentrals is stored in the mheap data structure.

The heap data structure is defined as mheap in src/runtime/mheap.go:

type mheap struct {
	lock      mutex

	spans []*mspan

	bitmap        uintptr 	// start address of the bitmap; the bitmap grows from high addresses toward low addresses

	arena_start uintptr		// start address of the arena region
	arena_used  uintptr		// address up to which the arena region has been used

	central [67*2]struct {
		mcentral mcentral
		pad      [sys.CacheLineSize - unsafe.Sizeof(mcentral{})%sys.CacheLineSize]byte
	}
}

  • lock: mutex
  • spans: points to the spans region, used to map pages to their spans
  • bitmap: the start address of the bitmap
  • arena_start: the start address of the arena region
  • arena_used: the highest address currently in use in the arena region
  • central: the mcentrals, two for each class

As the data structure shows, mheap manages all of the memory. In fact, Go manages memory through a single global variable of type mheap.
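
The page-to-span mapping kept in spans can be illustrated with a small sketch. The toyHeap type and the setSpan and spanOf helpers below are assumptions made for this example, and bounds checks are left out, so addresses are assumed to lie inside the arena.

```go
package main

import "fmt"

const pageSize = 8 << 10 // 8KB pages, as in the arena description

// span is a placeholder for mspan with just the fields needed here.
type span struct {
	startAddr uintptr
	npages    uintptr
}

// toyHeap is a hypothetical, simplified mheap.
type toyHeap struct {
	arenaStart uintptr
	spans      []*span // one entry per arena page
}

// setSpan records s as the owner of every page it covers.
func (h *toyHeap) setSpan(s *span) {
	first := (s.startAddr - h.arenaStart) / pageSize
	for i := uintptr(0); i < s.npages; i++ {
		h.spans[first+i] = s
	}
}

// spanOf returns the span owning the page that contains addr,
// or nil if no span has been assigned to that page.
func (h *toyHeap) spanOf(addr uintptr) *span {
	return h.spans[(addr-h.arenaStart)/pageSize]
}

func main() {
	h := &toyHeap{arenaStart: 0x10000, spans: make([]*span, 16)}
	s := &span{startAddr: 0x10000 + 2*pageSize, npages: 2}
	h.setSpan(s)
	fmt.Println(h.spanOf(s.startAddr+pageSize+100) == s)
	fmt.Println(h.spanOf(0x10000) == nil)
}
```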

The memory managed by mheap is illustrated below:
[Figure: mheap managing the spans, bitmap, and arena regions]
The memory pre-allocated from the system is divided into three regions, spans, bitmap, and arena, all managed through mheap. Next, let's look at the memory allocation process.

Memory allocation process

Different allocation logic is used depending on the size of the object to be allocated:

  • (0, 16B) and the object contains no pointers: tiny allocation
  • (0, 16B) and the object contains pointers: normal allocation
  • [16B, 32KB]: normal allocation
  • (32KB, -): large object allocation

Tiny allocation and large object allocation are memory management optimizations. Here we only look at the normal allocation method.
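
The size rules listed above fit in a single switch. The function name allocKind and its string results are invented for this sketch and only mirror the boundaries listed above.

```go
package main

import "fmt"

// allocKind reports which allocation path a request of the given size takes,
// following the size ranges listed above.
func allocKind(size uintptr, hasPointers bool) string {
	switch {
	case size > 32<<10:
		return "large"
	case size < 16 && !hasPointers:
		return "tiny"
	default:
		return "normal"
	}
}

func main() {
	fmt.Println(allocKind(8, false))
	fmt.Println(allocKind(8, true))
	fmt.Println(allocKind(4096, true))
	fmt.Println(allocKind(64<<10, false))
}
```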

Taking a request for n bytes of memory as an example, allocation proceeds as follows:

  1. Get the current thread's private cache, mcache
  2. Compute the appropriate class ID from the size
  3. Look up an available span in mcache's alloc[class] list
  4. If mcache has no available span, request a new span from mcentral and add it to mcache
  5. If mcentral has no available span either, request a new span from mheap and add it to mcentral
  6. Take the address of a free object from the span and return it
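
The fallback chain in these steps can be sketched as a toy three-level allocator. Everything here is an assumption made for illustration: the type names, the five-entry classSizes table (only the first classes of the table above), one-page spans, free lists kept as slices of addresses, and the absence of locking and of scan/noscan separation.

```go
package main

import "fmt"

const pageSize = 8 << 10

// classSizes holds the object sizes of classes 1-5 from the class table; index 0 is unused.
var classSizes = []uintptr{0, 8, 16, 32, 48, 64}

// sizeToClass returns the smallest class whose objects can hold n bytes, or -1.
func sizeToClass(n uintptr) int {
	for c := 1; c < len(classSizes); c++ {
		if n <= classSizes[c] {
			return c
		}
	}
	return -1
}

type span struct{ free []uintptr } // addresses of idle objects

type heap struct{ next uintptr } // next unused page address

// grow carves a fresh one-page span for the given class out of the heap.
func (h *heap) grow(class int) *span {
	s := &span{}
	for off := uintptr(0); off+classSizes[class] <= pageSize; off += classSizes[class] {
		s.free = append(s.free, h.next+off)
	}
	h.next += pageSize
	return s
}

type central struct {
	class int
	spans []*span
	heap  *heap
}

// cacheSpan hands out a span, asking the heap for a new one if needed (step 5).
func (c *central) cacheSpan() *span {
	if len(c.spans) == 0 {
		c.spans = append(c.spans, c.heap.grow(c.class))
	}
	s := c.spans[0]
	c.spans = c.spans[1:]
	return s
}

type cache struct {
	alloc   []*span
	central []*central
}

func newCache(h *heap) *cache {
	m := &cache{alloc: make([]*span, len(classSizes)), central: make([]*central, len(classSizes))}
	for c := 1; c < len(classSizes); c++ {
		m.central[c] = &central{class: c, heap: h}
	}
	return m
}

// malloc follows steps 2-6 above for sizes covered by classSizes.
func (m *cache) malloc(n uintptr) uintptr {
	class := sizeToClass(n) // step 2
	if class < 0 {
		panic("size not covered by this sketch")
	}
	s := m.alloc[class] // step 3
	if s == nil || len(s.free) == 0 {
		s = m.central[class].cacheSpan() // step 4
		m.alloc[class] = s
	}
	addr := s.free[0] // step 6
	s.free = s.free[1:]
	return addr
}

func main() {
	m := newCache(&heap{next: 0x100000})
	fmt.Printf("%#x %#x %#x\n", m.malloc(40), m.malloc(48), m.malloc(8))
}
```

A 40-byte request is rounded up to class 4 (48 bytes), so the first two calls return neighboring blocks of the same span, while the 8-byte request triggers a new span for class 1.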

Summary

Go memory allocation is a very complex process, and it is also interleaved with the GC process. Only the key data structures are described here, to convey the principles without getting bogged down in implementation details.

  1. At startup, a Go program requests a large block of memory and divides it into the spans, bitmap, and arena regions
  2. The arena region is divided into small blocks by page
  3. A span manages one or more pages
  4. mcentral manages multiple spans for threads to request
  5. mcache is a thread-private resource whose spans come from mcentral

Origin blog.csdn.net/u013474436/article/details/103746444