Introduction

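As noted in the previous article, the allocator class splits the work of the new operator (allocate memory, then construct the object) and of the delete operator (destroy the object, then release the memory) into two separate phases. Memory allocation and release are handled by ::allocate() and ::deallocate(); object construction and destruction are handled by ::construct() and ::destroy(). Drawing on 《STL源码剖析》 and the SGI STL implementation, this article analyzes construction and destruction in the allocator class and then walks through SGI's two-level allocator.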

Implementing destroy and construct

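construct is simple: it uses placement new to invoke the object's constructor. destroy is slightly more involved. It uses the type_traits technique to check whether the element type that the iterators point to has a non-trivial destructor. If the destructor is trivial, destroy returns immediately; otherwise it calls the destructor on each element in turn. When a very large range of objects has to be destroyed but their destructors are trivial (they do nothing), calling them one by one is pointless, so skipping the loop saves a full pass over the range.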

#include <new>
#include <iostream>
#include <iterator>
#include <type_traits>

template<class T1, class T2>
inline void construct(T1 *p, const T2& value){
    new (p) T1(value); // placement new: calls T1::T1(value)
}

// First version of destroy: takes a single pointer
template<class T>
inline void destroy(T* pointer){
    pointer->~T();
}

// Non-trivial destructor: call it on every element in the range
template<class ForwardIterator>
inline void __destroy_aux(ForwardIterator first, ForwardIterator last, std::false_type){
    std::cout << "in false_type" << std::endl;
    for (; first != last; ++first)
        destroy(&(*first));
}

// Trivial destructor: nothing to do
template<class ForwardIterator>
inline void __destroy_aux(ForwardIterator, ForwardIterator, std::true_type){
    std::cout << "in true_type" << std::endl;
}

// std::is_trivially_destructible is the standard C++11 trait;
// SGI STL uses its own __type_traits for the same purpose
template<class ForwardIterator, class T>
inline void __destroy(ForwardIterator first, ForwardIterator last, T*){
    __destroy_aux(first, last, std::is_trivially_destructible<T>());
}

// Second version: takes two iterators and destroys the objects in [first, last),
// using type traits to pick the most suitable strategy for the element type
template<class ForwardIterator>
inline void destroy(ForwardIterator first, ForwardIterator last){
    typedef typename std::iterator_traits<ForwardIterator>::value_type value_type;
    __destroy(first, last, static_cast<value_type*>(0));
}
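
To make the two-phase lifecycle concrete, here is a minimal, self-contained sketch that allocates raw storage, constructs objects in it, destroys them, and only then frees the storage. The names Tracked, construct_at_raw, destroy_range and lifecycle_demo are hypothetical and exist only for this illustration; the dispatch uses the standard std::is_trivially_destructible trait rather than SGI's own traits.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical type that counts destructor calls (illustration only).
struct Tracked {
    static int destroyed;
    std::string name;
    explicit Tracked(const std::string& n) : name(n) {}
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Phase 2 of "new": build an object in storage we already own.
template<class T1, class T2>
inline void construct_at_raw(T1* p, const T2& value) {
    new (p) T1(value);
}

// Non-trivial destructor: run it on each element.
template<class T>
inline void destroy_range_impl(T* first, T* last, std::false_type) {
    for (; first != last; ++first) first->~T();
}

// Trivial destructor: nothing to do, so the loop is skipped entirely.
template<class T>
inline void destroy_range_impl(T*, T*, std::true_type) {}

template<class T>
inline void destroy_range(T* first, T* last) {
    destroy_range_impl(first, last, std::is_trivially_destructible<T>());
}

// Allocate, construct n objects, destroy them, free the storage.
// Returns the number of destructor calls observed, or -1 if malloc fails.
inline int lifecycle_demo(int n) {
    Tracked::destroyed = 0;
    Tracked* raw = static_cast<Tracked*>(std::malloc(n * sizeof(Tracked)));
    if (!raw) return -1;
    for (int i = 0; i < n; ++i) construct_at_raw(raw + i, std::string("obj"));
    destroy_range(raw, raw + n);   // phase 1 of "delete"
    std::free(raw);                // phase 2 of "delete"
    return Tracked::destroyed;
}
```

Calling lifecycle_demo(3) runs three destructors, while destroy_range on an int array selects the empty overload at compile time.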

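The SGI STL implementation

SGI STL is one of the classic STL implementations. What does SGI have to consider when designing its allocator?
1. Requesting space from the heap
2. Multithreading
3. Running out of memory
4. Memory fragmentation
To keep things simple, we ignore multithreading here.

Of the remaining three issues, requesting memory can be done with the library functions malloc and free, and the out-of-memory case is handled alongside them. Memory fragmentation needs more careful treatment.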

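Memory fragmentation

When a program serves a stream of memory requests of mixed sizes (large blocks, small blocks, interleaved allocations and releases), a large number of small blocks may be allocated and freed continuously. This easily leads to memory fragmentation: a request for a relatively large block fails even though the total amount of free memory is sufficient, because no single contiguous region is large enough.

To address fragmentation, SGI designed a two-level allocator:

Requests larger than 128 bytes go to the first-level allocator.
Requests of 128 bytes or less go to the second-level allocator, which manages small blocks with a memory pool.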
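
The boundary case is easy to get wrong, so here is a tiny sketch of the dispatch rule. The names kMaxSmallBytes, AllocLevel and level_for are hypothetical; the only fact taken from the implementation is that a request goes to the first level when its size is strictly greater than 128 bytes.

```cpp
#include <cassert>
#include <cstddef>

enum { kMaxSmallBytes = 128 };                    // threshold described above
enum AllocLevel { kFirstLevel = 1, kSecondLevel = 2 };

// Decide which allocator level serves a request of n bytes.
inline AllocLevel level_for(std::size_t n) {
    return n > static_cast<std::size_t>(kMaxSmallBytes) ? kFirstLevel
                                                        : kSecondLevel;
}
```

A request of exactly 128 bytes therefore still counts as a small block.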

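To conform to the STL specification, SGI also defines the simple_alloc interface. It converts element counts into byte counts and forwards to whichever allocator it is given, so either level satisfies the standard interface: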

#include <stddef.h>  // size_t

template<class T, class Alloc>
class simple_alloc{
public:
    // allocate storage for n objects of type T
    static T* allocate(size_t n){
        return 0 == n ? 0 : (T*)Alloc::allocate(n * sizeof(T));
    }
    // allocate storage for a single object
    static T* allocate(){
        return (T*)Alloc::allocate(sizeof(T));
    }
    // release storage for n objects
    static void deallocate(T *p, size_t n){
        if (0 != n) Alloc::deallocate(p, n * sizeof(T));
    }
    // release storage for a single object
    static void deallocate(T *p){
        Alloc::deallocate(p, sizeof(T));
    }
};
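
As a usage sketch, the snippet below plugs a hypothetical byte-level allocator (counting_malloc_alloc, which simply records the last byte count it was asked for) into a trimmed copy of simple_alloc. It only illustrates the conversion from element counts to bytes; it is not how SGI containers are wired up in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical byte-level allocator that remembers the last request size.
struct counting_malloc_alloc {
    static std::size_t last_bytes;
    static void* allocate(std::size_t n) { last_bytes = n; return std::malloc(n); }
    static void deallocate(void* p, std::size_t) { std::free(p); }
};
std::size_t counting_malloc_alloc::last_bytes = 0;

// Trimmed copy of the wrapper: element counts in, byte counts out.
template<class T, class Alloc>
class simple_alloc {
public:
    static T* allocate(std::size_t n) {
        return 0 == n ? 0 : static_cast<T*>(Alloc::allocate(n * sizeof(T)));
    }
    static void deallocate(T* p, std::size_t n) {
        if (0 != n) Alloc::deallocate(p, n * sizeof(T));
    }
};

// Allocate `count` ints through the wrapper and report how many bytes
// the underlying allocator was actually asked for.
inline std::size_t bytes_requested_for_ints(std::size_t count) {
    typedef simple_alloc<int, counting_malloc_alloc> int_alloc;
    int* p = int_alloc::allocate(count);
    std::size_t bytes = counting_malloc_alloc::last_bytes;
    int_alloc::deallocate(p, count);
    return bytes;
}
```

A container written against simple_alloc never multiplies by sizeof(T) itself, which is what lets the two allocator levels be swapped without touching container code.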

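Implementing the first-level and second-level allocators

The main difference between the two levels is the size of the requests each one handles. For requests larger than 128 bytes, the first-level allocator calls malloc and free directly and emulates C++'s set_new_handler mechanism to deal with out-of-memory conditions.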
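
The first-level allocator is not listed in this article, so the following is only a rough sketch of the idea, assuming a handler that is retried until malloc succeeds and a std::bad_alloc exception when no handler is installed. The class name first_level_alloc and its member names are hypothetical and do not reproduce SGI's actual declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

class first_level_alloc {
public:
    typedef void (*handler_t)();

    // Install a new out-of-memory handler and return the previous one,
    // in the spirit of std::set_new_handler.
    static handler_t set_malloc_handler(handler_t f) {
        handler_t old = oom_handler;
        oom_handler = f;
        return old;
    }

    static void* allocate(std::size_t n) {
        void* p = std::malloc(n);
        while (0 == p) {                      // out of memory
            if (0 == oom_handler) throw std::bad_alloc();
            oom_handler();                    // give the client a chance to free memory
            p = std::malloc(n);               // then try again
        }
        return p;
    }

    static void deallocate(void* p, std::size_t /* n */) { std::free(p); }

private:
    static handler_t oom_handler;
};
first_level_alloc::handler_t first_level_alloc::oom_handler = 0;
```

The handler is expected to either release memory or terminate the loop itself (for example by throwing), otherwise allocate would spin forever.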

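The second-level allocator is considerably more complex. Its implementation:

Maintains 16 free lists, which serve 16 different block sizes (8, 16, ..., 128 bytes).
Fills the free lists from a memory pool, which in turn gets its memory from malloc.
Forwards to the first-level allocator when memory runs out or when the request is larger than 128 bytes.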
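
The mapping from a request size to one of the 16 free lists can be sketched on its own. The helpers below mirror the rounding and indexing arithmetic used by the second-level allocator; the names kAlign, kMaxBytes, kNumFreeLists, round_up and freelist_index are chosen for this example.

```cpp
#include <cassert>
#include <cstddef>

enum { kAlign = 8, kMaxBytes = 128, kNumFreeLists = kMaxBytes / kAlign };

// Round a request up to the next multiple of kAlign (a power of two).
inline std::size_t round_up(std::size_t bytes) {
    return (bytes + kAlign - 1) & ~static_cast<std::size_t>(kAlign - 1);
}

// Index of the free list that serves a request of `bytes` bytes
// (valid for 1 <= bytes <= kMaxBytes).
inline std::size_t freelist_index(std::size_t bytes) {
    return (bytes + kAlign - 1) / kAlign - 1;
}
```

For instance, a 30-byte request is rounded up to 32 bytes and served from list 3, and a 128-byte request maps to the last list, index 15.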

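The code is shown below. The second-level allocator uses the memory pool to manage memory that has been obtained but not yet handed out. When a client requests memory, allocate() first checks whether the matching free list has a block available. If it does, that block is returned and removed from the free list. If the free list has no block of the matching size, refill() calls chunk_alloc() to request memory from the pool. chunk_alloc() handles the request according to the state of the pool: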

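By default it hands out 20 blocks of the requested size; one is returned to the client and the remaining 19 are added to the matching free list.
If the pool cannot supply 20 blocks (the pool's "water level" is given by end_free and start_free), it hands out as many whole blocks as the remaining memory allows. For example, with 104 bytes left and a block size of 32 bytes, it hands out 3 blocks.
If the pool cannot supply even one block, it requests memory from the heap, but first it moves the leftover pool memory onto the appropriate free list. Because block sizes and pool refills are all multiples of 8 bytes, the leftover is itself a multiple of 8. For example, if a client asks for 32 bytes and only 24 bytes remain, those 24 bytes go onto the 24-byte free list and nothing is wasted.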
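
The three cases can be condensed into a small decision function. This is only a sketch of the counting logic, not of the pool itself: blocks_available is a hypothetical helper that returns how many blocks the pool could hand out immediately, with 0 meaning the pool has to be refilled from the heap first.

```cpp
#include <cassert>
#include <cstddef>

// bytes_left: bytes currently in the pool (end_free - start_free)
// size:       block size, already rounded up to a multiple of 8
// wanted:     number of blocks asked for (20 by default)
inline int blocks_available(std::size_t bytes_left, std::size_t size, int wanted) {
    std::size_t total = size * static_cast<std::size_t>(wanted);
    if (bytes_left >= total)                 // case 1: full request fits
        return wanted;
    if (bytes_left >= size)                  // case 2: at least one block fits
        return static_cast<int>(bytes_left / size);
    return 0;                                // case 3: refill needed
}
```

In the real function the second case also updates nobjs through its reference parameter, which is how refill() learns that it received fewer blocks than it asked for.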

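If the heap request fails, the only option left is to check whether a free list for a larger block size has a block that can serve the current request. Note that chunk_alloc is then called recursively so that nobjs is adjusted correctly.
Finally, if no free list has a large enough block, the first-level allocator is called in the hope that its out-of-memory handler can release some memory; otherwise the first-level allocator throws an exception.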

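#include <stddef.h>  // size_t
#include <stdlib.h>  // malloc
// malloc_alloc below is the first-level allocator (not shown in this listing)

enum {__NOBJS = 20};      // number of nodes obtained each time a free list is refilled
enum {__ALIGN = 8};       // small blocks are rounded up to a multiple of this
enum {__MAX_BYTES = 128}; // upper bound on the size of a small block
enum {__NFREELISTS = __MAX_BYTES / __ALIGN}; // number of free lists
template<bool threads, int inst>
class __default_alloc_template{
private:
    // round bytes up to a multiple of __ALIGN
    static size_t ROUND_UP(size_t bytes){
        return (((bytes) + __ALIGN - 1) & ~(__ALIGN - 1));
    }

    // a free block stores the link to the next free block in its own memory
    union obj{
        obj *free_list_link;
        char client_data[1];
    };

    static obj * volatile free_list[__NFREELISTS];
    // index of the free list that serves a block of the given size
    static size_t FREELIST_INDEX(size_t bytes){
        return (((bytes) + __ALIGN - 1) / __ALIGN - 1);
    }
    static void *refill(size_t n);
    static char *chunk_alloc(size_t size, int &nobjs);

    static char *start_free;  // start of the memory pool
    static char *end_free;    // end of the memory pool
    static size_t heap_size;  // total bytes obtained from the heap so far
public:
    static void *allocate(size_t n);
    static void deallocate(void *p, size_t n);
    static void *reallocate(void *p, size_t old_sz, size_t new_sz);
};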
template<bool threads, int inst>
char *__default_alloc_template<threads, inst>::start_free = 0;

template<bool threads, int inst>
char *__default_alloc_template<threads, inst>::end_free = 0;


template<bool threads, int inst>
size_t __default_alloc_template<threads, inst>::heap_size = 0;


template<bool threads, int inst>
typename __default_alloc_template<threads, inst>::obj * volatile __default_alloc_template<threads, inst>::free_list[__NFREELISTS] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };

template<bool threads, int inst>
void * __default_alloc_template<threads, inst>::allocate(size_t n){
    obj *volatile * my_free_list;
    obj *result;
    // large requests go to the first-level allocator
    if (n > (size_t)__MAX_BYTES)
        return (malloc_alloc::allocate(n));
    // pick the free list that matches the request size
    my_free_list = free_list + FREELIST_INDEX(n);
    result = *my_free_list;
    if (0 == result){ // the free list is empty, so refill it
        void *r = refill(ROUND_UP(n));
        return r;
    }
    // take the first block and advance the list head to the next block
    *my_free_list = result->free_list_link;
    return result;
}

template<bool threads, int inst>
void  __default_alloc_template<threads, inst>::deallocate(void *p, size_t n){
    obj *q = (obj *)p;
    obj * volatile *my_free_list;
    // large blocks were obtained from the first-level allocator
    if (n > (size_t)__MAX_BYTES){
        malloc_alloc::deallocate(p, n);
        return;
    }
    my_free_list = free_list + FREELIST_INDEX(n);
    // push the released block onto the head of the free list
    q->free_list_link = *my_free_list;
    *my_free_list = q;
}

template<bool threads, int inst>
void * __default_alloc_template<threads, inst>::refill(size_t n){
    int nobjs = __NOBJS;
    // chunk_alloc may lower nobjs if the pool cannot supply all blocks
    char *chunk = chunk_alloc(n, nobjs);
    obj * volatile *my_free_list;
    obj *result;
    obj * current_obj, *next_obj;
    int i;
    // if only one block was obtained, return it directly
    if (1 == nobjs) return chunk;

    my_free_list = free_list + FREELIST_INDEX(n);

    result = (obj *)chunk;
    *my_free_list = next_obj = (obj *)(chunk + n);
    // starting from block 1 (block 0 goes to the caller), link the blocks into a list
    for (i = 1;; ++i){
        current_obj = next_obj;
        if (nobjs - 1 == i){ // the last block's link is null
            current_obj->free_list_link = 0;
            break;
        }
        // advance by n bytes, not by n objects
        current_obj->free_list_link = next_obj = (obj *)((char *)next_obj + n);
    }
    return result;
}

// size has already been rounded up with ROUND_UP
// nobjs is passed by reference because its value may be lowered
template<bool threads, int inst>
char*  __default_alloc_template<threads, inst>::chunk_alloc(size_t size, int &nobjs){
    char * result;
    size_t total_bytes = size * nobjs;
    size_t bytes_left = end_free - start_free;
    // the pool can satisfy the whole request: return start_free and raise the water level
    if (bytes_left >= total_bytes){
        result = start_free;
        start_free += total_bytes;
        return result;
    }
    // the pool cannot satisfy the whole request but can supply at least one block
    else if (bytes_left >= size){
        nobjs = bytes_left / size;
        total_bytes = size * nobjs;
        result = start_free;
        start_free += total_bytes;
        return result;
    }
    // the pool cannot supply even one block
    else{
        size_t bytes_to_get = 2 * total_bytes + ROUND_UP(heap_size >> 4);
        // put the leftover pool memory onto the matching free list
        if (bytes_left > 0){
            obj *volatile *my_free_list = free_list + FREELIST_INDEX(bytes_left);
            ((obj*)start_free)->free_list_link = *my_free_list;
            *my_free_list = (obj*)start_free;
        }
        start_free = (char*)malloc(bytes_to_get);
        // the heap is out of memory
        if (0 == start_free){
            int i;
            obj * volatile *my_free_list, *p;
            // look for an unused block in the free lists of this size or larger
            for (i = size; i <= __MAX_BYTES; i += __ALIGN){
                my_free_list = free_list + FREELIST_INDEX(i);
                p = *my_free_list;
                // this free list has an unused block
                if (0 != p){
                    *my_free_list = p->free_list_link;
                    start_free = (char *)p;
                    end_free = start_free + i;
                    // recurse so that nobjs is adjusted to what i bytes can supply
                    return (chunk_alloc(size, nobjs));
                }
            }

            // No free list has a large enough block. Fall back to the first-level allocator.
            // It also uses malloc, but it installs an out-of-memory handler that may
            // release some memory or throw an exception.
            end_free = 0;
            start_free = (char *)malloc_alloc::allocate(bytes_to_get);
        }
        heap_size += bytes_to_get;
        end_free = start_free + bytes_to_get;
        // recurse to hand out blocks from the refilled pool
        return (chunk_alloc(size, nobjs));
    }
}
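
The free-list manipulation in allocate(), deallocate() and refill() relies on storing the link pointer inside the free block itself. The following stand-alone sketch isolates that trick; node, push_block, pop_block and thread_blocks are hypothetical names, and the buffer passed in is assumed to be suitably aligned for a pointer and at least count * size bytes long, with size no smaller than sizeof(node).

```cpp
#include <cassert>
#include <cstddef>

union node {
    node* free_list_link;   // meaningful while the block sits on a free list
    char  client_data[1];   // meaningful while the client owns the block
};

// Return a block to the list: it becomes the new head (as in deallocate).
inline void push_block(node*& head, void* block) {
    node* q = static_cast<node*>(block);
    q->free_list_link = head;
    head = q;
}

// Take the first block off the list (as in allocate); null if the list is empty.
inline void* pop_block(node*& head) {
    node* result = head;
    if (result != 0) head = result->free_list_link;
    return result;
}

// Carve `count` blocks of `size` bytes out of `buffer` and link them so that
// the block at the lowest address ends up at the head of the list.
inline node* thread_blocks(char* buffer, std::size_t size, int count) {
    node* head = 0;
    for (int i = count - 1; i >= 0; --i)
        push_block(head, buffer + static_cast<std::size_t>(i) * size);
    return head;
}
```

Because the link lives in memory the client is not using, the free lists cost no extra space per block, which is the main reason for the union.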

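Summary

Following the SGI STL, we implemented a two-level allocator to manage memory allocation. The first level handles requests larger than 128 bytes, while the second level uses free lists plus a memory pool to handle requests of 128 bytes or less.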
