A brief discussion of allocators in STL

The allocator is one of the six major components of the STL, and every container relies on it to work. Yet it is almost invisible to users: like a hero behind the scenes, it never steps onto the stage, and the audience hardly sees it even though it is essential. As a user, you rarely need to care how it is implemented underneath, and you rarely even get the chance to use it directly. Here I will briefly share my understanding of it.

How do we obtain a piece of memory under normal circumstances?

malloc obtains a block of memory and returns its starting address;
The new operator (a new expression) typically ends up in malloc as well, but it does more than hand you memory: it also initializes that memory by calling the constructor of the corresponding object.
operator new is the C++ way of obtaining raw memory. Note that the new operator and operator new are two different things. In common implementations operator new also calls malloc; it simply wraps the call and reports failure by throwing an exception.
The allocate function of the allocators originally shipped by compiler vendors such as VC, BC, and GNU C is in turn implemented by calling operator new.

So, have you noticed? All roads lead to the same place: almost every approach ends up acquiring memory by calling malloc. Depending on the platform, malloc in turn calls the API provided by the operating system to obtain the actual memory.
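The four routes can be put side by side in a small sketch. The function name sum_from_four_sources is made up for illustration, and whether operator new and std::allocator really end up in malloc depends on the implementation; the code only shows the call sites and the matching release calls.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical helper: obtains room for one int in four different ways
// and returns the sum of the values stored there.
inline int sum_from_four_sources() {
    // 1. malloc: raw bytes, no constructor is run.
    int* a = static_cast<int*>(std::malloc(sizeof(int)));
    assert(a != nullptr && "malloc failed");
    *a = 1;  // int is trivial, so plain assignment is enough

    // 2. new expression: allocates and initializes in one step.
    int* b = new int(2);

    // 3. operator new: allocation only, throws std::bad_alloc on failure.
    int* c = static_cast<int*>(::operator new(sizeof(int)));
    *c = 3;

    // 4. std::allocator: the interface the containers go through.
    std::allocator<int> alloc;
    int* d = alloc.allocate(1);
    *d = 4;

    const int sum = *a + *b + *c + *d;

    // Each block goes back through the matching release function.
    alloc.deallocate(d, 1);
    ::operator delete(c);
    delete b;
    std::free(a);
    return sum;
}
```

Mixing the pairs, for example calling free on memory obtained from a new expression, is undefined behavior even when both happen to be built on malloc.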


However, if you request 10 bytes, the block malloc hands out is not really 10 bytes. You can use 10 bytes, yes, but there is some additional overhead: malloc attaches so-called "cookies" to both ends of the block for its own bookkeeping. It is like buying something online: what arrives is not just the item itself but also the box, the bag, and the delivery note that helped it reach you. These extras may be of no use to you, but they are an unavoidable cost.

From this perspective, consider a container that holds small elements but a great many of them. Suppose you want to store 2-byte short elements, 1 million of them. If the container's underlying allocator requests memory for each element separately, then because of the cookies each request may actually consume 10 bytes: 2 bytes are the memory you wanted and the remaining 8 bytes are overhead. The 1 million elements needed only 2 million bytes, but you end up paying for 10 million. That is hardly efficient.

This is not to say that cookies by themselves consume a lot of memory; it is a matter of proportion. If the elements in your container are large, the additional overhead is relatively tiny and completely acceptable. More often, though, the elements are not that large, and the overhead becomes a significant share of the memory used.
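The proportion argument can be checked with a little arithmetic. The sketch below assumes the figure used in the text, 8 bytes of cookie per block; real malloc implementations use different amounts and also round sizes up for alignment, so the numbers are illustrative only. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed figure from the text: 8 bytes of cookie per malloc block.
constexpr std::size_t kAssumedCookieBytes = 8;

// Total bytes consumed when every element is allocated on its own.
constexpr std::size_t bytes_with_cookies(std::size_t element_size, std::size_t count) {
    return (element_size + kAssumedCookieBytes) * count;
}

// Fraction of each block that is bookkeeping rather than payload.
inline double overhead_ratio(std::size_t element_size) {
    assert(element_size > 0);  // an empty element makes no sense here
    return static_cast<double>(kAssumedCookieBytes) /
           static_cast<double>(element_size + kAssumedCookieBytes);
}
```

Under this assumption a 2-byte element wastes about 80% of its block, while a 1024-byte element wastes less than 1%.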

How to solve this problem?

One idea used in SGI STL is to set up many allocators in advance, each responsible for only one fixed block size. When the container requests memory, the allocator for the matching size obtains one large chunk, cuts it into fixed-size blocks, and returns the starting address of one such block to the user.

With this strategy the additional overhead no longer hurts: real memory is requested only once, for the large chunk, so the cookie is paid only once. The fixed-size blocks cut from that chunk carry no cookies of their own, so they add no extra overhead.
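A minimal sketch of this idea follows. FixedPool is a hypothetical class written for illustration, not SGI STL's code: it calls malloc once, threads the chunk into equal blocks, and then serves requests from that list. It never grows and is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical fixed-size pool: one malloc call, many equal blocks.
class FixedPool {
public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)), chunk_(nullptr), free_list_(nullptr) {
        assert(block_count > 0);
        chunk_ = static_cast<char*>(std::malloc(block_size_ * block_count));
        if (chunk_ == nullptr) throw std::bad_alloc();
        // Thread the blocks together, last block first, so the list
        // ends up starting at the beginning of the chunk.
        for (std::size_t i = block_count; i > 0; --i) {
            Node* node = reinterpret_cast<Node*>(chunk_ + (i - 1) * block_size_);
            node->next = free_list_;
            free_list_ = node;
        }
    }
    ~FixedPool() { std::free(chunk_); }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Pop the first free block; no cookie is added per block.
    void* allocate() {
        if (free_list_ == nullptr) return nullptr;  // pool exhausted
        Node* node = free_list_;
        free_list_ = node->next;
        return node;
    }

    // Push a block back onto the free list.
    void deallocate(void* p) {
        Node* node = static_cast<Node*>(p);
        node->next = free_list_;
        free_list_ = node;
    }

    std::size_t block_size() const { return block_size_; }

private:
    struct Node { Node* next; };

    // Each block must be able to hold the embedded next pointer
    // and keep pointer alignment.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = sizeof(Node);
        return ((n < a ? a : n) + a - 1) / a * a;
    }

    std::size_t block_size_;
    char* chunk_;
    Node* free_list_;
};
```

The free list costs no extra memory: the next pointer lives inside each block while the block is unused and is overwritten once the block is handed out.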

SGI STL provides two levels of memory allocator:

When the request is larger than 128 bytes, memory is obtained directly with operator new (SGI STL's own first-level allocator calls malloc); this is the first-level memory allocator;
When the request is 128 bytes or smaller, the second-level memory allocator is used, that is, a memory pool implemented with free lists.

Why two levels? Mainly to reduce memory fragmentation and the number of malloc calls. The memory pool thus acts as a middle layer between application code and the system's memory allocation calls.

First level memory allocator

operator new

operator new can be overloaded:

When overloaded, the return type must be declared as void*;
The first parameter must be the size of the space to allocate, in bytes, with type size_t; further parameters may follow.

For example:

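   #include <cstddef>
   #include <cstdlib>
   #include <new>

   class Foo
   {
   public:
     // Class-specific allocation: called by "new Foo" before the constructor runs.
     static void *operator new(std::size_t size)
     {
         void *p = std::malloc(size);
         if (p == nullptr)
             throw std::bad_alloc();  // operator new must not return null
         return p;
     }

     // Class-specific deallocation: called by "delete p" after the destructor runs.
     static void operator delete(void *p, std::size_t size)
     {
         (void)size;  // unused in this simple version
         std::free(p);
     }
   };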

This version is implemented simply with malloc and free; a memory pool can be substituted later.

C++ also provides a global operator new and operator delete, which can be reached explicitly by writing ::operator new and ::operator delete.
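The difference between the class-specific and the global versions can be observed with a counter. In this sketch Tracked and count_class_new_calls are hypothetical names; the point is only that a plain new expression picks the class's own operator new, while the :: prefix forces the global one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class that counts how often its own operator new runs.
struct Tracked {
    static int class_new_calls;

    static void* operator new(std::size_t size) {
        ++class_new_calls;
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) noexcept { std::free(p); }

    int value = 0;
};
int Tracked::class_new_calls = 0;

// Returns how many of the two allocations went through Tracked::operator new.
inline int count_class_new_calls() {
    Tracked::class_new_calls = 0;
    Tracked* a = new Tracked;    // uses Tracked::operator new
    Tracked* b = ::new Tracked;  // forces the global operator new
    assert(a->value == 0 && b->value == 0);  // both objects were constructed
    const int calls = Tracked::class_new_calls;
    delete a;    // uses Tracked::operator delete
    ::delete b;  // forces the global operator delete
    return calls;
}
```

Each object has to be released through the same family it was allocated with, which is why the second object is deleted with ::delete.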

placement new

operator new implements the first step of a new expression, which is to allocate memory. So who calls the constructor? That is the job of placement new, whose syntax is:

 Object * p = new (address) ClassConstruct(...)

Here address must be a void*. Placement new is declared in the <new> header. It can also be overloaded, and the global version can be reached with the :: prefix.

For example:

 void* raw = ::operator new(sizeof(int));  // step 1: allocate raw memory only
 int* ptr = ::new (raw) int();             // step 2: construct an int in that memory

In fact, placement new is also an overloaded version of operator new! This overload, however, is the one normally used to invoke a constructor. For example:

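 #include <cstddef>
 #include <cstdlib>
 #include <new>

 class Foo
 {
 public:
     // Ordinary operator new overload
     void* operator new(std::size_t size)
     {
         void* p = std::malloc(size);
         if (p == nullptr)
             throw std::bad_alloc();
         return p;
     }

     // Same form as the placement operator new that the standard library already provides
     void* operator new(std::size_t size, void* start)
     {
         (void)size;    // nothing is allocated here
         // optional bookkeeping could go here
         return start;  // hand back the caller's memory unchanged
     }

     // Matches the malloc-based operator new above
     void operator delete(void* p)
     {
         std::free(p);
     }
 };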

What is the benefit of splitting a new expression (and likewise a delete expression) into two steps? When a new expression allocates memory, a large enough free region has to be found in the heap. This can be slow, and it may fail with an exception when not enough space is left.

Placement new solves this problem. Construction happens in a memory buffer prepared in advance, so no search is needed, the cost of obtaining the storage is constant, and no out-of-memory exception can occur for it while the program is running. Placement new is therefore well suited to applications that have strict timing requirements and must run for a long time without interruption.

In short, repeatedly allocating memory through new is wasteful. With placement new the memory is fixed once, and objects are repeatedly constructed and destroyed in it without any further allocation or release.

Note: if you use placement new, don't forget to call the destructor before operator delete, unless the element's destructor is trivial.
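The whole life cycle can be sketched as follows: allocate once, construct and destroy several times in the same storage, release once. Probe and reuse_storage are hypothetical names; the alive counter exists only to show that every constructor call is balanced by an explicit destructor call.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that tracks how many instances are currently alive.
struct Probe {
    static int alive;
    int id;
    explicit Probe(int i) : id(i) { ++alive; }
    ~Probe() { --alive; }
};
int Probe::alive = 0;

// Constructs and destroys `rounds` objects in the same storage and
// returns the sum of their ids.
inline int reuse_storage(int rounds) {
    void* raw = ::operator new(sizeof(Probe));  // allocate once
    int sum = 0;
    for (int i = 1; i <= rounds; ++i) {
        Probe* p = ::new (raw) Probe(i);  // construct in the existing storage
        assert(Probe::alive == 1);
        sum += p->id;
        p->~Probe();  // non-trivial destructor: call it by hand, keep the storage
    }
    ::operator delete(raw);  // release once, after the last destructor call
    return sum;
}
```

Calling delete p inside the loop would be wrong here, because it would both run the destructor and release the storage that the next round still needs.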

allocator

The STL allocator is responsible for allocating memory for the container, releasing that memory, calling the constructors of the elements, and calling their destructors.

In fact, once the material above is understood, the STL allocator is quite simple.

Four major methods are provided to the outside world:

allocate method: calls operator new
construct method: calls placement new
deallocate method: calls operator delete
destroy method: calls ~T()
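Put together, the four methods fit in a few lines. SimpleAllocator below is a hypothetical minimal allocator, not the code of any particular standard library; it ignores over-aligned types and adds only the members that std::vector needs in order to accept it (value_type, the converting constructor, and the comparison operators).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical minimal allocator built from the four operations above.
template <class T>
struct SimpleAllocator {
    using value_type = T;

    SimpleAllocator() = default;
    template <class U>
    SimpleAllocator(const SimpleAllocator<U>&) noexcept {}

    // allocate: raw storage for n objects via operator new.
    T* allocate(std::size_t n) {
        assert(n <= static_cast<std::size_t>(-1) / sizeof(T));  // no overflow
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    // deallocate: release the storage via operator delete.
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }

    // construct: run a constructor in place via placement new.
    template <class U, class... Args>
    void construct(U* p, Args&&... args) {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }

    // destroy: run the destructor without releasing the storage.
    template <class U>
    void destroy(U* p) { p->~U(); }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const SimpleAllocator<T>&, const SimpleAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const SimpleAllocator<T>&, const SimpleAllocator<U>&) noexcept { return false; }
```

It can be passed as the second template argument of a container, for example std::vector<int, SimpleAllocator<int>>, or driven by hand in the order allocate, construct, destroy, deallocate.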

Note: not every class needs destroy to be called. When the destructor of a class is trivial, running it can be skipped. What makes a destructor trivial? This can be checked with the std::is_trivially_destructible class template. Specifically:

The destructor is implicitly defined, that is, you do not write your own destructor
The destructor is not a virtual function
Its base classes and non-static members are also trivially destructible.

In fact, you will find that basic_string does not call element destructors before releasing its memory, precisely because basic_string requires the element (character) type to have a trivial destructor. vector and the other containers do need to call the destructors before releasing memory.
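The three conditions can be checked directly with the type trait. The structs below are made-up examples, and destroy_range is a hypothetical helper showing how a container might skip the destructor loop for trivial types; real libraries typically make this choice at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <type_traits>

struct Plain { int x; double y; };     // implicit destructor, trivial members
struct WithString { std::string s; };  // member has a non-trivial destructor
struct UserDtor { ~UserDtor() {} };    // user-provided destructor
struct VirtualDtor { virtual ~VirtualDtor() = default; };  // virtual destructor

static_assert(std::is_trivially_destructible<Plain>::value, "Plain is trivial");
static_assert(!std::is_trivially_destructible<WithString>::value, "member is not");
static_assert(!std::is_trivially_destructible<UserDtor>::value, "user-provided");
static_assert(!std::is_trivially_destructible<VirtualDtor>::value, "virtual");

// Hypothetical range destroy: returns the number of destructor calls made.
template <class T>
std::size_t destroy_range(T* first, T* last) {
    assert(first <= last);
    if (std::is_trivially_destructible<T>::value) return 0;  // nothing to run
    std::size_t calls = 0;
    for (; first != last; ++first, ++calls) first->~T();
    return calls;
}
```

For an array of int the helper returns 0 without touching the elements, while for UserDtor it makes one destructor call per object.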

Second level memory allocator

First request a large chunk of memory, then cut it into small blocks strung together in a singly linked list. The memory pool contains sixteen such free lists, each responsible for a different block size. The sizes grow in multiples of 8 bytes, from 8 up to 128; for example, list #7 (counting from 0) is responsible for 64-byte blocks.
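How a request size maps to one of the sixteen lists can be sketched as below, assuming the constants SGI STL uses for its default allocator (8-byte steps, 128-byte maximum). The function names are chosen for illustration and other libraries may use different values.

```cpp
#include <cassert>
#include <cstddef>

// Constants as in SGI STL's default allocator; an assumption for anything else.
constexpr std::size_t kAlign = 8;       // block sizes are multiples of 8
constexpr std::size_t kMaxBytes = 128;  // larger requests bypass the pool
constexpr std::size_t kNumFreeLists = kMaxBytes / kAlign;  // 16 lists

// Round a request up to the next multiple of kAlign.
constexpr std::size_t round_up(std::size_t bytes) {
    return (bytes + kAlign - 1) & ~(kAlign - 1);
}

// True if the request is small enough for the second-level allocator.
constexpr bool uses_pool(std::size_t bytes) {
    return bytes <= kMaxBytes;
}

// Index (0..15) of the free list that serves a request of `bytes`.
inline std::size_t freelist_index(std::size_t bytes) {
    assert(bytes >= 1 && bytes <= kMaxBytes);
    return (bytes + kAlign - 1) / kAlign - 1;
}
```

With these constants a 2-byte request is rounded up to 8 bytes and served from list 0, a 64-byte request comes from list 7, and anything above 128 bytes goes to the first-level allocator.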

The quality of the STL memory pool design is also controversial: some argue that the allocator in the C++ standard library is redundant, and because the allocator is a template parameter, containers that use different allocators are different types.
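The last point is easy to demonstrate. In the sketch below OtherAllocator is a hypothetical bare-bones allocator written only to have a second allocator type; the static_assert shows that the two vectors are unrelated types, so a function meant to accept both has to be a template over the allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical bare-bones allocator, present only as a second allocator type.
template <class T>
struct OtherAllocator {
    using value_type = T;
    OtherAllocator() = default;
    template <class U>
    OtherAllocator(const OtherAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) { return static_cast<T*>(::operator new(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};
template <class T, class U>
bool operator==(const OtherAllocator<T>&, const OtherAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const OtherAllocator<T>&, const OtherAllocator<U>&) noexcept { return false; }

using DefaultVec = std::vector<int>;
using OtherVec = std::vector<int, OtherAllocator<int>>;

// Same element type, different allocator: two unrelated types.
static_assert(!std::is_same<DefaultVec, OtherVec>::value, "allocator is part of the type");

// A function that should accept both must be a template over the allocator.
template <class Alloc>
int sum(const std::vector<int, Alloc>& v) {
    int total = 0;
    for (int x : v) total += x;
    return total;
}
```

A function declared as taking const std::vector<int>& would reject OtherVec at compile time, which is the practical cost of this design.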