C++ Memory Management (Part 2)

Continuing from the previous lecture, this lecture covers the allocator in the C++ standard library, that is, memory management as provided through std::allocator. Unlike the version 1.1 and 1.2 managers from the previous lecture (which followed C++ Primer), this allocator sits at a higher level, and its memory management efficiency, both in space and in time, is greatly improved. Hou Jie, the well-known C++ lecturer, has a detailed explanation of it on Bilibili; if you are interested in the underlying mechanism, going straight to his course is well worth it.

Lecture 2 std::allocator

[Figure: layout of a block returned by malloc in VC6 debug mode]
This figure shows the structure of a memory block returned by malloc (VC6, debug mode). Suppose we ask for 12 bytes. The block the system actually hands us is larger than 12 bytes (the block size in the figure): it also carries two cookies, 8 bytes in total, plus a debug header and debug tail amounting to 4*8 + 4 = 36 bytes, which together with the 12 bytes we asked for gives 56 bytes. The pad area is a fill area: every block must be sized to a multiple of 4, 8 or 16 (depending on the implementation; in VC6 it is 16), so the pad expands the block up to the next multiple of 16, here 64 bytes. The role and significance of each area will be explained in detail in the third lecture; for now we only need to know the actual size and layout shown above, and that the returned pointer points to the first of the 12 usable bytes.

The last lecture pointed out that the most important goals in memory management are to reduce the number of malloc calls and to reduce the memory eaten by cookies. This lecture focuses on the cookies. A natural question: if we go to such lengths to reduce cookie waste, why were cookies designed in at all? Cookies record the block size, which is what allows free() to know how much to release, so in general they are necessary. But an industrial-grade program may perform enormous numbers of small allocations. Suppose we perform one million allocations, and all one million are for containers with the same parameters (same size, same type); if every one of those allocations carries its own pair of cookies, the redundancy is huge. So cookies are still necessary in general, but for this kind of redundancy we can design a data structure that eliminates the unnecessary cookie waste, which is the main purpose of this lecture.
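As a quick sanity check of the sizes above (the 8 bytes of cookies and the 4*8 + 4 debug area are the figures used in the lecture; this is just the arithmetic, not VC6's actual code):

```cpp
#include <cstddef>
#include <cstdio>

int main() {
    std::size_t requested   = 12;            // what the program asked malloc for
    std::size_t cookies     = 2 * 4;         // upper cookie + lower cookie
    std::size_t debug_extra = 4 * 8 + 4;     // debug header and tail, 36 bytes
    std::size_t total  = requested + cookies + debug_extra;   // 56 bytes
    std::size_t padded = (total + 16 - 1) / 16 * 16;           // pad up to a multiple of 16
    std::printf("before pad: %zu bytes, actual block: %zu bytes\n", total, padded);  // 56 -> 64
    return 0;
}
```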

The allocator we discuss this time comes from GNU C 2.9 (G2.9). In fact, std::allocator, the allocator in G2.9's standard library, does no special memory management at all; the allocator actually used by the G2.9 containers is a different one. Opening the source, we find that the second template parameter of the containers is not class Alloc = allocator but class Alloc = alloc, i.e. std::alloc. The designer of this alloc built a pool allocator for memory management, and the technique is clever enough to make allocation run very efficiently; it is well worth the reader's deep study.

Design Ideas
[Figure: the 16 free lists of G2.9 std::alloc]
In the last lecture we overrode operator new in each class and gave each class its own free list for its memory allocation. In G2.9's std::alloc, all of those lists are gathered into a single array of 16 linked lists, free_list[16]; logically it is an array, and each slot holds the head pointer of one list. The block size served by each list grows from low to high: #0 serves 8-byte blocks, #1 serves 16-byte blocks, #2 serves 24-byte blocks, ..., #15 serves 128-byte blocks. Two questions arise: what do we do when the size we need is greater than 128 bytes, or is not a multiple of 8? When the requested size exceeds 128 bytes, the allocator simply calls malloc. When it is at most 128 bytes, alloc hands out memory without cookies, in line with the lightweight design goal; and if the requested size is not a multiple of 8, it is rounded up to the nearest multiple of 8. A small illustration of these two rules follows.
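Here is a small stand-alone illustration (not the library's code) of those two rules, rounding a request up to a multiple of 8 and mapping it to one of the 16 lists; the function names round_up and list_index are mine:

```cpp
#include <cstddef>
#include <cstdio>
#include <initializer_list>

// Round a request up to the next multiple of 8 (13 -> 16, 32 -> 32, ...).
std::size_t round_up(std::size_t bytes) {
    return (bytes + 8 - 1) & ~std::size_t(7);
}

// Which of the 16 lists serves this size: 8 bytes -> #0, 32 bytes -> #3, 128 bytes -> #15.
std::size_t list_index(std::size_t bytes) {
    return (bytes + 8 - 1) / 8 - 1;
}

int main() {
    for (std::size_t n : {1, 13, 32, 88, 104, 128})
        std::printf("request %3zu -> rounded to %3zu -> free_list #%zu\n",
                    n, round_up(n), list_index(n));
    return 0;
}
```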

Now the general idea. Suppose a container requests 32 bytes. The allocator locates slot 32/8 - 1 = 3, i.e. free_list element #3, and mallocs a chunk of 32 * 20 * 2 bytes; this large chunk is the one that carries cookies. The first 640 bytes are cut into 20 blocks of 32 bytes to serve requests for 32 bytes, and the remaining 640 bytes are set aside as a memory pool (the reserve pool). Now suppose another container requests 64 bytes. The allocator finds that the pool still holds 640 bytes, so instead of calling malloc again for another cookie-carrying chunk, it takes the pool's 640 bytes, cuts them into ten 64-byte blocks, hands the first block to the container, and keeps the remaining nine for future 64-byte requests. At this point the pool is empty, and the next request from a different block size will require another malloc.

[Figure: serving requests from the free lists and the memory pool]

There is one more very important concept. In the last lecture, to avoid the memory cost of a separate next pointer, version 1.1 used a union so that the first four bytes of each free block served as the next pointer. The same trick is used here under the name embedded pointer: the first four bytes of every small block act as a next pointer, chaining together all the free blocks of a list. Once a block is handed out, the user simply writes data into it, overwriting the pointer, which no longer matters because the block has left the list. When the block is reclaimed, its first bytes are treated as the embedded pointer again and the block is linked back into the free_list.
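A minimal sketch of the embedded-pointer idea: the union matches the obj type used in the G2.9 source, while push_front is only an illustrative helper (essentially what deallocate does), not a function from the library.

```cpp
// The embedded pointer: while a block sits on a free list, its first bytes are the link;
// once the block is handed out, user data simply overwrites them.
union obj {
    union obj* free_list_link;   // valid only while the block is free
    char       client_data[1];   // the same bytes, seen as user data after allocation
};

// Reclaiming a block just threads it back onto the front of its list (illustrative helper).
inline void push_front(obj*& head, void* block) {
    obj* p = static_cast<obj*>(block);
    p->free_list_link = head;    // point the reclaimed block at the current first free block
    head = p;                    // the reclaimed block becomes the new head
}
```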

Finally, let's walk through the process step by step to get a feel for the details.
[Figure: container 1 requests 32 bytes]
Suppose container 1 requests 32 bytes. At first the memory pool is empty, so a large chunk must be obtained from malloc. Its size is 32 * 20 * 2 + RoundUp(heap_size >> 4) = 1280 bytes, where RoundUp() is the round-up function: the appendix it adds is one sixteenth of the total amount obtained from the system so far, rounded up to a multiple of 8 (zero on this first call). The chunk is associated with slot 32/8 - 1 = #3. The first 640 bytes are cut into 20 blocks: the first block is handed out and the remaining 19 form a free list hanging off #3; the last 640 bytes become the memory pool, ready for containers of other sizes.
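A quick check of that 1280 figure; the formula is the one chunk_alloc() uses (shown in full later), and heap_size is still 0 on the first call:

```cpp
#include <cstddef>
#include <cstdio>

int main() {
    std::size_t block = 32, heap_size = 0;                            // nothing obtained yet
    std::size_t appendix = ((heap_size >> 4) + 7) & ~std::size_t(7);  // RoundUp(heap_size / 16)
    std::size_t bytes_to_get = 2 * (20 * block) + appendix;
    std::printf("%zu\n", bytes_to_get);  // 1280: 640 bytes -> 20 blocks on #3, 640 bytes -> the pool
    return 0;
}
```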
[Figure: container 2 requests 64 bytes]
Now a container 2 requests 64 bytes. First look at list node 64/8 - 1 = #7 to see whether it has free blocks to hand out. It does not, so check the memory pool. The pool has capacity (640 bytes), so its 640 bytes are cut into ten 64-byte blocks: the first is handed to the container, and the remaining 9 form a free list hung at node #7. The pool is now empty.
[Figure: container 3 requests 96 bytes]
In the same way, if container 3 requests 96 bytes, first look at list node 96/8 - 1 = #11 for free blocks; there are none, so check the pool. The pool is also empty, so a new large chunk must be obtained from malloc, exactly as for container 1: 96 * 20 * 2 + RoundUp(1280 >> 4) = 3840 + 80 = 3920 bytes, of which the first 1920 bytes become 20 blocks at #11 (one handed out, 19 on the list) and the remaining 2000 bytes become the new memory pool.
[Figure: container 4 requests 88 bytes]
In the same way, if container 4 then requests 88 bytes, first look at list node 88/8 - 1 = #10 for free blocks; there are none, so check the pool. The pool has capacity (2000 bytes), enough for the usual 20 blocks: 2000 - 88 * 20 = 240, so it is cut into 20 blocks, the first handed out and the remaining 19 hung at node #10 as a free list. The pool is left with 240 bytes.

[Figure: state of the free lists and the memory pool]

The rest proceeds in the same way. Now let's look at some corner cases, as in the figure below. Suppose at some moment the memory pool has only 80 bytes left, and a container 5 requests 104 bytes. There are no free blocks at list node 104/8 - 1 = #12, and the pool's 80 bytes cannot satisfy a 104-byte request. In this case the system first hangs the pool's leftover 80 bytes at list node 80/8 - 1 = #9, emptying the pool, and then mallocs a new large chunk exactly as it did for container 1.

[Figure: container 5 requests 104 bytes with only 80 bytes left in the pool]

Suppose instead that at some moment the pool has only 168 bytes left, and a container 6 requests 48 bytes. There are no free blocks at list node 48/8 - 1 = #5, so check the pool: it has 168 bytes, enough for three 48-byte blocks. Three blocks are carved out of the 168 bytes; the first is handed out and the remaining 2 form a free list hung at node #5. The pool is left with 168 - 48*3 = 24 bytes. This is how fragments are handled in this structure.

[Figure: container 6 requests 48 bytes with 168 bytes left in the pool]

Next, the salvage (backfill) technique used when malloc fails. Suppose the pool is empty at some moment, the cumulative amount of memory obtained is 9688 bytes, and the system heap only holds about 10000 bytes in total. Now a container 7 requests 72 bytes. There are no free blocks at list node 72/8 - 1 = #8 and the pool has nothing to give, so as usual we would malloc a large chunk, but this time the system has no memory left and malloc fails. Looking at the figure above, however, there are still free blocks sitting on other lists waiting to be handed out, so why not use them to satisfy container 7? The idea is to backfill from the nearest larger non-empty list: we need 72 bytes and #8 is empty, so start searching upward from #9. In the figure (the previous one above) there is a free 80-byte block at #9, so that block is taken: 72 of its bytes serve the request at #8, and the remaining 8 bytes go into the memory pool as backfill.

With the design ideas and the illustrations in hand, the last step is the implementation itself, shown in the figures below.

std::alloc (G2.9)

The allocator in G2.9 is split into two levels, and everything discussed above actually lives in the second level. The first-level allocator is only brought in when the second-level allocation fails: it allocates with malloc, and it is where the _callnewh() / set_new_handler() mechanism from the previous lecture is implemented. We do not need to look into it further and can cut straight to the second-level allocator.

[Figure: second-level allocator source, constants and data members]
First, three enumeration constants are defined before the class; this is really just constant definition. As mentioned above, all the lists are gathered into free_list[16], an array of 16 linked lists whose block sizes grow from low to high: #0 serves 8 bytes, #1 serves 16 bytes, #2 serves 24 bytes, ..., #15 serves 128 bytes. The three constants are exactly the lower bound (alignment) of the small blocks, their upper bound, and the number of lists. Now look at the data members of the class: the union obj is the embedded pointer; free_list[_NFREELISTS] is the array of 16 list-head pointers; start_free and end_free are the start and end pointers of the memory pool; and heap_size is the total amount of memory obtained from the system so far. As for the member functions, ROUND_UP() rounds a requested size up to a multiple of 8, FREELIST_INDEX() computes which of the 16 lists a request belongs to, and finally there are the two key functions refill() and chunk_alloc(): the former stocks a free list, the latter obtains a large chunk of memory.
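Since the screenshots of the source are not reproduced here, below is a condensed sketch of these declarations, following the G2.9 (SGI) source. The real code is a template (__default_alloc_template) with thread-safety machinery; that is omitted, and it is shown here as a plain class named alloc.

```cpp
#include <cstddef>
using std::size_t;   // keep the sketch close to the source's plain size_t

// Constants, with the names used in the source:
enum { __ALIGN = 8 };                            // alignment / lower bound of the small blocks
enum { __MAX_BYTES = 128 };                      // upper bound handled by the free lists
enum { __NFREELISTS = __MAX_BYTES / __ALIGN };   // 16 lists

class alloc {   // the real code is a template: __default_alloc_template<threads, inst>
private:
    union obj {                                  // the embedded pointer
        union obj* free_list_link;               // link to the next free block
        char       client_data[1];               // the same bytes, seen as user data
    };

    static obj* volatile free_list[__NFREELISTS];   // the 16 list heads
    static char*  start_free;                    // start of the memory pool
    static char*  end_free;                      // end of the memory pool
    static size_t heap_size;                     // total bytes obtained from the system so far

    static size_t ROUND_UP(size_t bytes) {       // round a request up to a multiple of 8
        return (bytes + __ALIGN - 1) & ~(__ALIGN - 1);
    }
    static size_t FREELIST_INDEX(size_t bytes) { // which of the 16 lists serves this size
        return (bytes + __ALIGN - 1) / __ALIGN - 1;
    }

    static void* refill(size_t n);               // carve a fresh chunk into blocks for one list
    static char* chunk_alloc(size_t size, int& nobjs);  // obtain a big chunk (pool or malloc)

public:
    static void* allocate(size_t n);
    static void  deallocate(void* p, size_t n);
};

// Static members: all lists start empty and the pool starts empty.
char*  alloc::start_free = 0;
char*  alloc::end_free   = 0;
size_t alloc::heap_size  = 0;
alloc::obj* volatile alloc::free_list[__NFREELISTS] =
    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
```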

First, look at the allocate() and deallocate() source.
[Figure: allocate() source]
In allocate(), first compare the requested size against the 128-byte upper bound of the small blocks. If it exceeds the bound, the first-level allocator is called directly, which allocates with malloc. Otherwise, determine which list the requested size belongs to: my_free_list is set to the slot of the corresponding list in the array, and result is set to the first block on that list. If result is null, the list is empty and refill() must be called to stock it; otherwise it is enough to pop the first block by pointing the list head at the next free block via the embedded pointer.
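A sketch of allocate(), continuing the class above and following the G2.9 source; the only liberty taken is that the large-block path is shown as a plain std::malloc call, whereas the real code delegates to the first-level allocator.

```cpp
#include <cstdlib>   // std::malloc / std::free for the large-block fallback in this sketch

void* alloc::allocate(size_t n)
{
    if (n > (size_t)__MAX_BYTES)                 // larger than 128 bytes: bypass the free lists
        return std::malloc(n);                   // (the real code calls the first-level allocator)

    obj* volatile* my_free_list = free_list + FREELIST_INDEX(n);
    obj* result = *my_free_list;
    if (result == 0)                             // this list is empty: go stock it
        return refill(ROUND_UP(n));

    *my_free_list = result->free_list_link;      // pop the first block off the list
    return result;
}
```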
[Figure: deallocate() source]

In deallocate(), first compare the size of the block being returned against the 128-byte upper bound of the small blocks (larger blocks are handed back to the first-level allocator, i.e. freed). Then determine which list node the block belongs to, set the reclaimed block's embedded pointer to the list's current first block (i.e. whatever the head pointer points to), and finally point the head pointer at the reclaimed block, completing the recycling.
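And deallocate(), again following the G2.9 source, with the large-block path shown as a plain std::free:

```cpp
void alloc::deallocate(void* p, size_t n)
{
    if (n > (size_t)__MAX_BYTES) {               // large block: it came from malloc above
        std::free(p);                            // (the real code calls the first-level allocator)
        return;
    }
    obj* q = (obj*)p;
    obj* volatile* my_free_list = free_list + FREELIST_INDEX(n);
    q->free_list_link = *my_free_list;           // the reclaimed block points at the old first block
    *my_free_list = q;                           // and becomes the new head of its list
}
```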
[Figure]
Let's look at the source code of refill().
[Figure: refill() source]
First, assume we can take 20 blocks and hand that request to chunk_alloc() to obtain a large chunk. Because nobjs is passed by reference, chunk_alloc() may lower it; if nobjs comes back as 1, exactly one block of size n was obtained, there is nothing to cut, and the block is handed straight to the caller. Otherwise the chunk must be cut into small blocks of size n. The list head *my_free_list is pointed at the second small block, because the first block is the one being handed out, so the cutting loop starts from block two. Inside the loop, the cursor next_obj is first cast to a char pointer so that adding one block's worth of bytes advances it by exactly one block, and the result is cast back to an obj pointer and stored; that is one step of the cutting. Each iteration writes an embedded pointer into the head of the current block pointing at the start of the next block, until the last block, which is given a null link, and the loop exits.
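A sketch of refill(), continuing the class above and following the G2.9 source:

```cpp
// Ask chunk_alloc() for up to 20 blocks of size n, hand the first block to the caller,
// and chain the rest onto the corresponding free list.
void* alloc::refill(size_t n)
{
    int nobjs = 20;
    char* chunk = chunk_alloc(n, nobjs);       // nobjs is passed by reference and may shrink

    if (nobjs == 1)                            // only one block came back: nothing to cut
        return chunk;

    obj* volatile* my_free_list = free_list + FREELIST_INDEX(n);
    obj* result   = (obj*)chunk;               // block #1 goes to the caller
    obj* next_obj = (obj*)(chunk + n);         // the free list starts at block #2
    *my_free_list = next_obj;

    for (int i = 1; ; ++i) {
        obj* current_obj = next_obj;
        next_obj = (obj*)((char*)next_obj + n);   // step forward by one block
        if (nobjs - 1 == i) {                  // last block: terminate the list
            current_obj->free_list_link = 0;
            break;
        }
        current_obj->free_list_link = next_obj;   // chain this block to the next one
    }
    return result;
}
```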

Finally, we look at the most critical chunk_alloc() function.
[Figure: chunk_alloc() source, first half]
First the local variables: total_bytes is the amount wanted, i.e. 20 * block size (nobjs starts at 20), and bytes_left is the remaining size of the memory pool (end_free - start_free). First compare bytes_left with total_bytes. If bytes_left is large enough, the pool can supply all 20 blocks, with anything extra staying in the pool: we simply point result at the pool's start address and move start_free forward. Otherwise, if bytes_left is still at least one block, the pool cannot supply 20 blocks but can supply some smaller number: we work out how many blocks fit, adjust nobjs accordingly, and again move start_free. In every other case the pool holds only a fragment too small for even one block; the fragment is handled by hanging it on the list node matching its size, which empties the pool and leaves it ready to take in a fresh chunk from malloc.
[Figure: chunk_alloc() source, second half]
Then a large chunk is obtained with malloc. If that succeeds, we just update start_free, end_free and heap_size and serve the request from the new chunk. If start_free == 0, the allocation failed, and we fall back to the existing free lists as described above. Looking at the code, a for loop starts from the list node for the requested size and steps up by 8 bytes at a time until 128, looking for a list that still has a block big enough for the current request. If one is found, the pointer p takes that block off its list, the block becomes the new (small) memory pool by resetting start_free and end_free, and chunk_alloc() is called again to carve the request out of it.
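A sketch of chunk_alloc(), continuing the class above and following the G2.9 source. The last-resort path is simplified here to a second std::malloc plus an exception, where the real code hands over to the first-level allocator (and thus the new-handler mechanism).

```cpp
#include <new>   // std::bad_alloc, used where the real code calls the first-level allocator

char* alloc::chunk_alloc(size_t size, int& nobjs)
{
    char*  result;
    size_t total_bytes = size * nobjs;            // ideally 20 blocks
    size_t bytes_left  = end_free - start_free;   // what the memory pool still holds

    if (bytes_left >= total_bytes) {              // pool covers all 20 blocks
        result = start_free;
        start_free += total_bytes;
        return result;
    } else if (bytes_left >= size) {              // pool covers at least one block
        nobjs = (int)(bytes_left / size);         // hand out as many as fit (fewer than 20)
        total_bytes = size * nobjs;
        result = start_free;
        start_free += total_bytes;
        return result;
    } else {                                      // pool cannot supply even one block
        size_t bytes_to_get = 2 * total_bytes + ROUND_UP(heap_size >> 4);
        if (bytes_left > 0) {                     // park the leftover fragment on its matching list
            obj* volatile* my_free_list = free_list + FREELIST_INDEX(bytes_left);
            ((obj*)start_free)->free_list_link = *my_free_list;
            *my_free_list = (obj*)start_free;
        }
        start_free = (char*)std::malloc(bytes_to_get);
        if (start_free == 0) {                    // malloc failed: scavenge a larger free list
            for (size_t i = size; i <= (size_t)__MAX_BYTES; i += __ALIGN) {
                obj* volatile* my_free_list = free_list + FREELIST_INDEX(i);
                obj* p = *my_free_list;
                if (p != 0) {                     // found a free block of size i
                    *my_free_list = p->free_list_link;
                    start_free = (char*)p;        // it becomes the new (tiny) memory pool
                    end_free   = start_free + i;
                    return chunk_alloc(size, nobjs);   // retry; leftovers are handled above
                }
            }
            end_free = 0;                         // nothing left anywhere
            start_free = (char*)std::malloc(bytes_to_get);   // the real code asks the
            if (start_free == 0) throw std::bad_alloc();     // first-level allocator instead
        }
        heap_size += bytes_to_get;                // total taken from the system so far
        end_free   = start_free + bytes_to_get;
        return chunk_alloc(size, nobjs);          // recurse to serve the request from the new chunk
    }
}
```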

The above is the main idea and code of the G2.9 version of alloc for memory management. There are only a handful of core functions and the data structures are simple, but the design is well worth careful study.


Origin blog.csdn.net/GGGGG1233/article/details/115026839