"Malloc Source Code Interpretation II" - behind the language, all algorithms

Since no more detailed documentation exists, the only way forward is to keep reading the malloc source itself. Let's continue taking apart the pieces we can understand.

struct malloc_chunk
{
  INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */
  INTERNAL_SIZE_T size;      /* Size in bytes, including overhead. */
  struct malloc_chunk* fd;   /* double links -- used only if free. */
  struct malloc_chunk* bk;
};

typedef struct malloc_chunk* mchunkptr;

/*

   malloc_chunk details:

    (The following includes lightly edited explanations by Colin Plumb.)

    Chunks of memory are maintained using a `boundary tag' method as
    described in e.g., Knuth or Standish.  (See the paper by Paul
    Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
    survey of such techniques.)  Sizes of free chunks are stored both
    in the front of each chunk and at the end.  This makes
    consolidating fragmented chunks into bigger chunks very fast.  The
    size fields also hold bits representing whether chunks are free or
    in use.

    In short: every chunk carries boundary tags, so the size of a free
    chunk is recorded both at its front and at its end, which makes
    merging fragmented chunks into bigger ones very fast. The size
    fields also hold the bits that say whether a chunk is free or in use.

Reading this, I realized how ingenious malloc's design is. Each chunk (block) of memory is described by a small header, and free chunks are tied together in a doubly linked list. The Knuth mentioned in the comment is Donald Knuth, a master of algorithms whose achievements go far beyond this one technique. Anyone who has studied computer algorithms has probably heard of him; if you haven't, it doesn't matter, you will soon enough. Because sizes are recorded at both ends of a chunk, adjacent memory fragments (small pieces of memory) can be found and merged into one large chunk quickly. So people have been working on the problem of memory fragmentation for some 30 years. I just didn't expect that such a simple data structure would be used to manage memory allocation. In my humble opinion, anyone who still treats memory fragmentation as a trick question to quiz others with is showing off half-knowledge, and reveals only the gaps in their own ability.

Let's see how our chunk is designed!

detail:

    An allocated chunk looks like this:


    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if allocated            | |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk, in bytes                         |P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             User data starts here...                          .
            .                                                               .
            .             (malloc_usable_space() bytes)                     .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk                                     |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Here chunk, mem, and nextchunk are three pointers into the same region, each marking a different position, as shown above. Let's continue with the source's own explanation of the chunk design.

   Where "chunk" is the front of the chunk for the purpose of most of
    the malloc code, but "mem" is the pointer that is returned to the
    user.  "Nextchunk" is the beginning of the next contiguous chunk.

    Chunks always begin on even word boundaries, so the mem portion
    (which is returned to the user) is also on an even word boundary, and
    thus double-word aligned.
    
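    In short: the malloc code itself works with "chunk", the address of
    the header, while the caller only ever sees "mem", which lies just
    past the two size fields. "nextchunk" is where the next contiguous
    chunk begins. Since chunks start on even word boundaries, mem is
    double-word aligned as well.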

Note the memory alignment here: the pointer handed back to the user always sits on an even word boundary (a boundary counted in words, not in bits), so user memory is double-word aligned.
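
To make the chunk/mem relationship and the alignment rule concrete, here is a minimal sketch in C. The names chunk_to_mem, mem_to_chunk and request_to_size are hypothetical names chosen for this illustration, not the macros in the source, and the sketch assumes that pointers and size_t have the same width. The padding rule (one size word of overhead, rounded up to a double word, never smaller than a whole header) is my reading of the layout above rather than a copy of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified copy of the chunk header discussed above. */
struct chunk {
    size_t prev_size;   /* size of previous chunk, if it is free */
    size_t size;        /* size of this chunk, plus flag bits */
    struct chunk *fd;   /* list links, used only while the chunk is free */
    struct chunk *bk;
};

#define SIZE_SZ    (sizeof(size_t))
#define ALIGNMENT  (2 * SIZE_SZ)            /* double-word alignment */
#define ALIGN_MASK (ALIGNMENT - 1)
#define MIN_CHUNK  (sizeof(struct chunk))   /* room for both links once freed */

/* The user pointer sits just past the two size fields. */
static void *chunk_to_mem(struct chunk *p)
{
    return (char *)p + 2 * SIZE_SZ;
}

/* Step back from the user pointer to the header. */
static struct chunk *mem_to_chunk(void *mem)
{
    return (struct chunk *)((char *)mem - 2 * SIZE_SZ);
}

/* Pad a request: add one size word of overhead, round up to the
   alignment, and never go below the minimum chunk size. */
static size_t request_to_size(size_t req)
{
    size_t padded = (req + SIZE_SZ + ALIGN_MASK) & ~(size_t)ALIGN_MASK;
    size_t size = padded < MIN_CHUNK ? MIN_CHUNK : padded;
    assert((size & ALIGN_MASK) == 0);   /* always a multiple of two words */
    return size;
}
```

On a system where size_t is 8 bytes, request_to_size(100) gives 112: 100 bytes plus one 8-byte size word, rounded up to the next multiple of 16.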

    Free chunks are stored in circular doubly-linked lists, and look like this:

    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk                            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `head:' |             Size of chunk, in bytes                         |P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Forward pointer to next chunk in list             |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Back pointer to previous chunk in list            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Unused space (may be 0 bytes long)                .
            .                                                               .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `foot:' |             Size of chunk, in bytes                           |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

So this is the layout of a memory block, including the flag bits packed into its size field. It is worth noting that the so-called heap is really one contiguous range of virtual memory whose length can vary. For details, see Chapter 7 of "Linux-UNIX System Programming Manual". Without further ado, let's continue with the source code.
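
The circular doubly linked list of free chunks can be sketched in a few lines of C. This is only an illustration under my own assumptions: the helper names bin_init, bin_is_empty, link_front and unlink_chunk are hypothetical, and the list header is modelled as an ordinary chunk whose links point back at itself when the list is empty.

```c
#include <assert.h>
#include <stddef.h>

struct chunk {
    size_t prev_size;
    size_t size;
    struct chunk *fd;   /* forward pointer: next chunk in the list */
    struct chunk *bk;   /* back pointer: previous chunk in the list */
};

/* An empty list is a header whose links point back at itself. */
static void bin_init(struct chunk *bin)
{
    bin->fd = bin;
    bin->bk = bin;
}

static int bin_is_empty(const struct chunk *bin)
{
    return bin->fd == bin;
}

/* Link a freed chunk in right after the header (the front of the list). */
static void link_front(struct chunk *bin, struct chunk *p)
{
    p->fd = bin->fd;
    p->bk = bin;
    bin->fd->bk = p;
    bin->fd = p;
}

/* Take a chunk off whatever list it is on; no search is needed. */
static void unlink_chunk(struct chunk *p)
{
    assert(p->fd->bk == p && p->bk->fd == p);   /* list must be consistent */
    p->bk->fd = p->fd;
    p->fd->bk = p->bk;
}
```

Because the list is circular and doubly linked, removing a chunk costs two pointer writes no matter where it sits, which is what lets the allocator pull a neighbouring free chunk out of its list while merging.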

   The P (PREV_INUSE) bit, stored in the unused low-order bit of the
    chunk size (which is always a multiple of two words), is an in-use
    bit for the *previous* chunk.  If that bit is *clear*, then the
    word before the current chunk size contains the previous chunk
    size, and can be used to find the front of the previous chunk.
    (The very first chunk allocated always has this bit set,
    preventing access to non-existent (or non-owned) memory.)

    Note that the `foot' of the current chunk is actually represented
    as the prev_size of the NEXT chunk. (This makes it easier to
    deal with alignments etc).

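    In short: the lowest bit of a chunk's size field says whether the
    *previous* chunk is in use. When that bit is clear, the word just
    before the size field holds the previous chunk's size, so the
    allocator can step backwards to the front of that chunk. The "foot"
    of a chunk is not a separate field: it is the prev_size slot of the
    NEXT chunk.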

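What this means is that a chunk's size is always a multiple of two words (a word here is the width of INTERNAL_SIZE_T, typically 4 or 8 bytes, not 16 bits), so the low-order bits of the size field are always zero and are free to carry flags. PREV_INUSE is a macro defined as the mask 0x1. ANDing it with a size field is not always true: the result is nonzero only when the previous adjacent chunk is in use.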


/* size field is or'ed with PREV_INUSE when previous adjacent chunk in use */

#define PREV_INUSE 0x1

/* size field is or'ed with IS_MMAPPED if the chunk was obtained with mmap() */

#define IS_MMAPPED 0x2

/* Bits to mask off when extracting size */

#define SIZE_BITS (PREV_INUSE|IS_MMAPPED)


/* Ptr to next physical malloc_chunk. */

#define next_chunk(p) ((mchunkptr)( ((char*)(p)) + ((p)->size & ~PREV_INUSE) ))
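
As a rough illustration of how these flag bits and the boundary tags work together, here is a small sketch in C. The function names (chunk_size, prev_in_use, next_of, prev_of, mark_free) are hypothetical and written as functions rather than macros for readability. Note one deliberate difference: the sketch masks off both flag bits when computing a size, whereas the next_chunk macro quoted above masks only PREV_INUSE.

```c
#include <assert.h>
#include <stddef.h>

struct chunk {
    size_t prev_size;
    size_t size;
    struct chunk *fd;
    struct chunk *bk;
};

#define PREV_INUSE 0x1
#define IS_MMAPPED 0x2
#define SIZE_BITS  (PREV_INUSE | IS_MMAPPED)

/* Real size, with both flag bits masked off. */
static size_t chunk_size(const struct chunk *p)
{
    return p->size & ~(size_t)SIZE_BITS;
}

/* Nonzero only when the chunk physically before p is in use. */
static int prev_in_use(const struct chunk *p)
{
    return (p->size & PREV_INUSE) != 0;
}

/* The next physical chunk starts chunk_size(p) bytes after p. */
static struct chunk *next_of(struct chunk *p)
{
    return (struct chunk *)((char *)p + chunk_size(p));
}

/* Only valid when the previous chunk is free, so prev_size is meaningful. */
static struct chunk *prev_of(struct chunk *p)
{
    assert(!prev_in_use(p));
    return (struct chunk *)((char *)p - p->prev_size);
}

/* Mark p free: write its foot and clear the P bit of its successor. */
static void mark_free(struct chunk *p)
{
    struct chunk *next = next_of(p);
    next->prev_size = chunk_size(p);
    next->size &= ~(size_t)PREV_INUSE;
}
```

After mark_free(a), the chunk following a can reach a's header through prev_of without any search, which is the whole point of the boundary tag.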

Exceptions, and caveats.

 The two exceptions to all this are

     1. The special chunk `top', which doesn't bother using the
        trailing size field since there is no
        next contiguous chunk that would have to index off it. (After
        initialization, `top' is forced to always exist.  If it would
        become less than MINSIZE bytes long, it is replenished via
        malloc_extend_top.)

     2. Chunks allocated via mmap, which have the second-lowest-order
        bit (IS_MMAPPED) set in their size fields.  Because they are
        never merged or traversed from any other chunk, they have no
        foot size or inuse information.

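In short, there are two exceptions:

    1. The special chunk "top" needs no trailing size field, because no
       contiguous chunk follows it. After initialization it always
       exists, and it is replenished through malloc_extend_top whenever
       it would shrink below MINSIZE bytes.

    2. Chunks allocated via mmap have the IS_MMAPPED bit set in their
       size fields. They are never merged with, or reached from, any
       other chunk, so they carry no foot size or in-use information.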

bins: this is where the concept of the bin (literally a "box") comes in, and it is one of the more important parts of memory allocation. There are 128 bins in total, and each one heads a doubly linked list of free chunks, as explained below. (It is worth noting that the latest version of the source has redesigned the bins, so what follows is only a rough picture.) Don't ask me why there are 128.

#define NAV             128   /* number of bins */

   Available chunks are kept in any of several places (all declared below):

   * `av': An array of chunks serving as bin headers for consolidated
       chunks. Each bin is doubly linked.  The bins are approximately
       proportionally (log) spaced.  There are a lot of these bins
       (128). This may look excessive, but works very well in
       practice.  All procedures maintain the invariant that no
       consolidated chunk physically borders another one. Chunks in
       bins are kept in size order, with ties going to the
       approximately least recently used chunk.
       
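       In short: "av" is an array of 128 bin headers, each heading a
       doubly linked list, with bin sizes spaced roughly logarithmically.
       No consolidated chunk ever physically borders another one, and
       within a bin the chunks are kept in size order, with ties going
       to the approximately least recently used chunk.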

Bin design and ordering:


       The chunks in each bin are maintained in decreasing sorted order by
       size.  This is irrelevant for the small bins, which all contain
       the same-sized chunks, but facilitates best-fit allocation for
       larger chunks. (These lists are just sequential. Keeping them in
       order almost never requires enough traversal to warrant using
       fancier ordered data structures.)  Chunks of the same size are
       linked with the most recently freed at the front, and allocations
       are taken from the back.  This results in LRU or FIFO allocation
       order, which tends to give each chunk an equal opportunity to be
       consolidated with adjacent freed chunks, resulting in larger free
       chunks and less fragmentation.
         
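        In short: each bin is sorted from largest to smallest, which only
        matters for the larger bins and gives them best-fit allocation.
        Among chunks of equal size the most recently freed one sits at
        the front while allocations are taken from the back, so every
        freed chunk gets a fair chance to be merged with its neighbours
        before it is reused.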
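
The ordering rule can be sketched as follows. This is a simplified illustration, not the source's own code: bin_insert_sorted and bin_take_best_fit are hypothetical names, the size field is assumed to hold a plain size with no flag bits, and the bin header is again modelled as a chunk that points at itself when empty.

```c
#include <assert.h>
#include <stddef.h>

struct chunk {
    size_t prev_size;
    size_t size;        /* plain size here: flag bits are ignored in this sketch */
    struct chunk *fd;
    struct chunk *bk;
};

static void bin_init(struct chunk *bin)
{
    bin->fd = bin->bk = bin;
}

/* Insert p so that sizes decrease from front to back; a newly freed
   chunk goes in front of older chunks of the same size. */
static void bin_insert_sorted(struct chunk *bin, struct chunk *p)
{
    struct chunk *cur = bin->fd;
    while (cur != bin && p->size < cur->size)
        cur = cur->fd;
    /* link p in just before cur */
    p->fd = cur;
    p->bk = cur->bk;
    cur->bk->fd = p;
    cur->bk = p;
}

/* Scan from the back (smallest first): the first chunk that is big
   enough is the best fit, and among equal sizes it is the oldest. */
static struct chunk *bin_take_best_fit(struct chunk *bin, size_t need)
{
    for (struct chunk *cur = bin->bk; cur != bin; cur = cur->bk) {
        if (cur->size >= need) {
            assert(cur->bk->fd == cur && cur->fd->bk == cur);
            cur->bk->fd = cur->fd;   /* unlink the chosen chunk */
            cur->fd->bk = cur->bk;
            return cur;
        }
    }
    return NULL;   /* nothing in this bin fits */
}
```

With chunks of 64, 128, 64 and 96 bytes freed in that order, a request for 60 bytes gets the first 64-byte chunk that was freed, and the next such request gets the second one: first in, first out among equals.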


    * `top': The top-most available chunk (i.e., the one bordering the
       end of available memory) is treated specially. It is never
       included in any bin, is used only if no other chunk is
       available, and is released back to the system if it is very
       large (see M_TRIM_THRESHOLD).
       
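       In short: "top" borders the end of available memory, is never
       placed in a bin, is used only when no other chunk is available,
       and is released back to the system when it becomes very large
       (see M_TRIM_THRESHOLD).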
        

    * `last_remainder': A bin holding only the remainder of the
       most recently split (non-top) chunk. This bin is checked
       before other non-fitting chunks, so as to provide better
       locality for runs of sequentially allocated chunks.
        
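       In short: "last_remainder" holds only what was left over from the
       most recent split of a non-top chunk. It is checked before other
       non-fitting chunks, so that runs of sequential allocations land
       close together in memory.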

    *  Implicitly, through the host system's memory mapping tables.
       If supported, requests greater than a threshold are usually
       serviced via calls to mmap, and then later released via munmap.

       In short: where mmap is supported, requests above a threshold are
       usually served through mmap and later released with munmap.
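
The "last_remainder" entry in the list above depends on splitting a free chunk in two. Here is a minimal sketch of such a split, under stated assumptions: split_chunk is a hypothetical helper, MINSIZE is taken to be the size of one chunk header, and the requested size is assumed to be already padded and aligned. The remainder it returns is the kind of chunk that would be parked in last_remainder.

```c
#include <assert.h>
#include <stddef.h>

struct chunk {
    size_t prev_size;
    size_t size;
    struct chunk *fd;
    struct chunk *bk;
};

#define PREV_INUSE 0x1
#define MINSIZE    (sizeof(struct chunk))   /* assumed minimum chunk size */

/* Carve `need` bytes off the front of free chunk p, whose size is `total`.
   Returns the remainder chunk, or NULL when the leftover would be too
   small to stand alone, in which case p is left whole. */
static struct chunk *split_chunk(struct chunk *p, size_t total, size_t need)
{
    assert(need <= total);
    size_t left = total - need;
    if (left < MINSIZE)
        return NULL;

    struct chunk *rem = (struct chunk *)((char *)p + need);
    p->size = need | (p->size & PREV_INUSE);   /* keep p's own P bit */
    rem->size = left | PREV_INUSE;             /* p, just before rem, is now in use */

    /* rem is free, so record its foot in the chunk that follows it */
    ((struct chunk *)((char *)rem + left))->prev_size = left;
    return rem;
}
```

Serving the next request from this remainder keeps consecutive allocations next to each other in memory, which is the locality benefit the source comment describes.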

There is also the arena design used for multi-threading, but let's stop here for the time being.

To be continued...

references:

Source from the original author of the allocator:

Memory Allocator (oswego.edu)

The latest, revised version of the source code: https://ftp.gnu.org/gnu/glibc/glibc-2.37.tar.gz

If you haven't read it yet, don't start there: there is too much of it, and it reads more like commentary than like code. The latest version differs greatly from the original: many structures have been adjusted and the notes rewritten, nearly 2,000 lines of code have been added, the concept of bins has been redesigned, and the arena structure has been reworked for thread safety.

In the next chapter we will take our time and look at how it has been adjusted and redesigned!
