http://blog.csdn.net/cosmoslhf/article/details/41209925

ION: Basic Concepts and an Analysis of How It Works

Reposted on 2014-11-17 15:17:05

References:

http://lwn.net/Articles/480055/

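《ARM体系结构与编程》 (ARM Architecture and Programming), memory system chapter.

A note before the repost:

ION brings the many different kinds of kernel memory allocation under one management interface. More importantly, it is designed to support passing memory between user-space processes and accessing it from each of them.

Each ion_buffer is associated with a struct file, and its handle lives in the process's file descriptor space rather than in a separate handle space inside the /dev/ion device. This has several advantages:

One handle per buffer makes it possible to control each buffer's lifetime flexibly and at fine granularity;

An fd is exported to the user process, so each buffer can be mmap'ed individually;

Using struct file lets the existing struct file_operations be reused for mmap;

In the binder driver, the fd can be passed between processes as a BINDER_TYPE_FD object, with fget/fput providing reference counting at the struct file level;

When a buffer does not need to be accessed from user space, it does not need to be associated with a struct file at all: the kernel structures ion_handle/ion_buffer already identify the buffer uniquely. The association with a struct file is therefore made, and the fd returned, in ioctl(ion, ION_IOC_SHARE/ION_IOC_MAP, &share), for use by a later mmap call. Alternatively, the process may not mmap the buffer at all and only transfer it to another process through binder. In this way user space controls how buffers flow, while the kernel moves the buffer data.

Reposted from http://blog.csdn.net/kris_fei/article/details/8588661 and http://blog.csdn.net/kris_fei/article/details/8618587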

Platform examined:

chipset: MSM8X25Q

codebase: Android 4.1

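ION concepts:

ION is Google's next-generation memory manager. It supports different memory allocation mechanisms, such as carveout (PMEM), physically contiguous memory (kmalloc), virtually contiguous but physically non-contiguous memory (vmalloc), and IOMMU.

Both user space and kernel space can use ION. User space creates a client through /dev/ion.

Speaking of clients, here are a few of the more important ION concepts.

Heap: describes a source of memory allocation, including its id, type, name, and so on. Represented by struct ion_heap.

Client: a user of ION. Both user space and kernel space must create a client before they can use ION buffers. A client is represented by struct ion_client and can own multiple buffers, each represented by struct ion_buffer.

Handle: an abstraction of a buffer. ION can be thought of as managing buffers through handles; users normally hold a handle rather than the buffer itself. Represented by struct ion_handle.

Heap types:

Because ION can use several memory allocation mechanisms, for example physically contiguous and non-contiguous ones, it uses enum ion_heap_type to tell them apart.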

/**
 * enum ion_heap_types - list of all possible types of heaps
 * @ION_HEAP_TYPE_SYSTEM:        memory allocated via vmalloc
 * @ION_HEAP_TYPE_SYSTEM_CONTIG: memory allocated via kmalloc
 * @ION_HEAP_TYPE_CARVEOUT:      memory allocated from a prereserved
 *                               carveout heap, allocations are physically
 *                               contiguous
 * @ION_HEAP_TYPE_IOMMU:         IOMMU memory
 * @ION_HEAP_TYPE_CP:            memory allocated from a prereserved
 *                               carveout heap, allocations are physically
 *                               contiguous. Used for content protection.
 * @ION_HEAP_TYPE_DMA:           memory allocated via DMA API
 * @ION_HEAP_END:                helper for iterating over heaps
 */
enum ion_heap_type {
    ION_HEAP_TYPE_SYSTEM,
    ION_HEAP_TYPE_SYSTEM_CONTIG,
    ION_HEAP_TYPE_CARVEOUT,
    ION_HEAP_TYPE_IOMMU,
    ION_HEAP_TYPE_CP,
    ION_HEAP_TYPE_DMA,
    ION_HEAP_TYPE_CUSTOM, /* must be last so device specific heaps always
                             are at the end of this enum */
    ION_NUM_HEAPS,
};

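The comments in the code state clearly which kind of memory each type allocates. Heaps of different types need different allocation methods, but all of them are expressed through struct ion_heap_ops, as in the following examples: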

static struct ion_heap_ops carveout_heap_ops = {
    .allocate = ion_carveout_heap_allocate,
    .free = ion_carveout_heap_free,
    .phys = ion_carveout_heap_phys,
    .map_user = ion_carveout_heap_map_user,
    .map_kernel = ion_carveout_heap_map_kernel,
    .unmap_user = ion_carveout_heap_unmap_user,
    .unmap_kernel = ion_carveout_heap_unmap_kernel,
    .map_dma = ion_carveout_heap_map_dma,
    .unmap_dma = ion_carveout_heap_unmap_dma,
    .cache_op = ion_carveout_cache_ops,
    .print_debug = ion_carveout_print_debug,
    .map_iommu = ion_carveout_heap_map_iommu,
    .unmap_iommu = ion_carveout_heap_unmap_iommu,
};

static struct ion_heap_ops kmalloc_ops = {
    .allocate = ion_system_contig_heap_allocate,
    .free = ion_system_contig_heap_free,
    .phys = ion_system_contig_heap_phys,
    .map_dma = ion_system_contig_heap_map_dma,
    .unmap_dma = ion_system_heap_unmap_dma,
    .map_kernel = ion_system_heap_map_kernel,
    .unmap_kernel = ion_system_heap_unmap_kernel,
    .map_user = ion_system_contig_heap_map_user,
    .cache_op = ion_system_contig_heap_cache_ops,
    .print_debug = ion_system_contig_print_debug,
    .map_iommu = ion_system_contig_heap_map_iommu,
    .unmap_iommu = ion_system_heap_unmap_iommu,
};
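
The ops-table idea can be tried out in user space. The following is only a minimal sketch, not ION code: struct toy_heap, struct toy_heap_ops and every toy_* function are hypothetical stand-ins, and the "carveout" here is just a static array handed out by a bump pointer. It shows how a generic caller reaches a type-specific allocator purely through function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_heap;

/* Hypothetical, much reduced counterpart of struct ion_heap_ops. */
struct toy_heap_ops {
    void *(*allocate)(struct toy_heap *heap, size_t len);
    void (*free)(struct toy_heap *heap, void *mem);
};

/* Hypothetical counterpart of struct ion_heap. */
struct toy_heap {
    const char *name;
    const struct toy_heap_ops *ops;
};

/* "System" heap: forwards to the C library allocator. */
static void *toy_system_allocate(struct toy_heap *heap, size_t len)
{
    (void)heap;
    return malloc(len);
}

static void toy_system_free(struct toy_heap *heap, void *mem)
{
    (void)heap;
    free(mem);
}

/* "Carveout" heap: a fixed region reserved up front. */
static unsigned char toy_arena[4096];
static size_t toy_arena_used;

static void *toy_carveout_allocate(struct toy_heap *heap, size_t len)
{
    void *mem;

    (void)heap;
    if (len > sizeof(toy_arena) - toy_arena_used)
        return NULL;                 /* the reserved region is exhausted */
    mem = toy_arena + toy_arena_used;
    toy_arena_used += len;
    return mem;
}

static void toy_carveout_free(struct toy_heap *heap, void *mem)
{
    (void)heap;
    (void)mem;                       /* bump allocator: nothing to release in this sketch */
}

static const struct toy_heap_ops toy_system_ops = {
    .allocate = toy_system_allocate,
    .free = toy_system_free,
};

static const struct toy_heap_ops toy_carveout_ops = {
    .allocate = toy_carveout_allocate,
    .free = toy_carveout_free,
};

/* Generic entry points: they know nothing about the heap type. */
void *toy_heap_alloc(struct toy_heap *heap, size_t len)
{
    assert(heap && heap->ops && heap->ops->allocate);
    return heap->ops->allocate(heap, len);
}

void toy_heap_free(struct toy_heap *heap, void *mem)
{
    assert(heap && heap->ops && heap->ops->free);
    heap->ops->free(heap, mem);
}
```

A caller would build a struct toy_heap with the ops table it wants and then use only toy_heap_alloc/toy_heap_free, which is roughly the role heap->ops plays for the real heaps.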

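Heap ID:

A heap of a given type can of course be divided into several chunks for different users, so ION also uses an ID to tell them apart. For example, both audio and display need heaps of type ION_HEAP_TYPE_CARVEOUT, and ION distinguishes them by ID.

Heap ids are represented by enum ion_heap_ids.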

/**
 * These are the only ids that should be used for Ion heap ids.
 * The ids listed are the order in which allocation will be attempted
 * if specified. Don't swap the order of heap ids unless you know what
 * you are doing!
 * Id's are spaced by purpose to allow new Id's to be inserted in-between (for
 * possible fallbacks)
 */
enum ion_heap_ids {
    INVALID_HEAP_ID = -1,
    ION_CP_MM_HEAP_ID = 8,
    ION_CP_MFC_HEAP_ID = 12,
    ION_CP_WB_HEAP_ID = 16, /* 8660 only */
    ION_CAMERA_HEAP_ID = 20, /* 8660 only */
    ION_SF_HEAP_ID = 24,
    ION_IOMMU_HEAP_ID = 25,
    ION_QSECOM_HEAP_ID = 26,
    ION_AUDIO_HEAP_BL_ID = 27,
    ION_AUDIO_HEAP_ID = 28,
    ION_MM_FIRMWARE_HEAP_ID = 29,
    ION_SYSTEM_HEAP_ID = 30,
    ION_HEAP_ID_RESERVED = 31 /** Bit reserved for ION_SECURE flag */
};
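
The comment above says the ids give the order in which allocation is attempted. A minimal sketch of that idea follows, under two assumptions that are not confirmed by the listing itself: that a request selects heaps with one bit per heap id, and that candidate heaps are tried in ascending id order. TOY_ION_HEAP, struct toy_heap_state and toy_pick_heap are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: one mask bit per heap id. */
#define TOY_ION_HEAP(bit) (1u << (bit))

/* A few ids copied from enum ion_heap_ids for the demo. */
enum {
    TOY_CP_MM_HEAP_ID = 8,
    TOY_SF_HEAP_ID = 24,
    TOY_AUDIO_HEAP_ID = 28,
    TOY_SYSTEM_HEAP_ID = 30,
};

/* Hypothetical per-heap state: just an id and how much room is left. */
struct toy_heap_state {
    int id;
    size_t free_bytes;
};

/*
 * Walk the heaps, which the caller keeps sorted by ascending id, and
 * return the id of the first heap that is both selected by the mask and
 * large enough. Returns -1 if no selected heap can satisfy the request.
 */
int toy_pick_heap(const struct toy_heap_state *heaps, size_t n,
                  unsigned int mask, size_t len)
{
    size_t i;

    for (i = 0; i < n; i++) {
        assert(heaps[i].id >= 0 && heaps[i].id < 32);
        if (!(mask & TOY_ION_HEAP(heaps[i].id)))
            continue;                /* caller did not ask for this heap */
        if (heaps[i].free_bytes >= len)
            return heaps[i].id;      /* first match wins */
    }
    return -1;
}
```

With a mask that names both the audio heap and the system heap, a request that does not fit in the audio heap falls through to the system heap, which is the kind of fallback the spacing between ids is meant to allow.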

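Heap definitions:

With heap types and ids understood, let's see how they are used. On this platform the file is board-qrd7627a.c, which contains the following definition: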

/**
 * These heaps are listed in the order they will be allocated.
 * Don't swap the order unless you know what you are doing!
 */
struct ion_platform_heap msm7627a_heaps[] = {
    {
        .id = ION_SYSTEM_HEAP_ID,
        .type = ION_HEAP_TYPE_SYSTEM,
        .name = ION_VMALLOC_HEAP_NAME,
    },
#ifdef CONFIG_MSM_MULTIMEDIA_USE_ION
    /* PMEM_ADSP = CAMERA */
    {
        .id = ION_CAMERA_HEAP_ID,
        .type = CAMERA_HEAP_TYPE,
        .name = ION_CAMERA_HEAP_NAME,
        .memory_type = ION_EBI_TYPE,
        .extra_data = (void *)&co_mm_ion_pdata,
        .priv = (void *)&ion_cma_device.dev,
    },
    /* AUDIO HEAP 1 */
    {
        .id = ION_AUDIO_HEAP_ID,
        .type = ION_HEAP_TYPE_CARVEOUT,
        .name = ION_AUDIO_HEAP_NAME,
        .memory_type = ION_EBI_TYPE,
        .extra_data = (void *)&co_ion_pdata,
    },
    /* PMEM_MDP = SF */
    {
        .id = ION_SF_HEAP_ID,
        .type = ION_HEAP_TYPE_CARVEOUT,
        .name = ION_SF_HEAP_NAME,
        .memory_type = ION_EBI_TYPE,
        .extra_data = (void *)&co_ion_pdata,
    },
    /* AUDIO HEAP 2 */
    {
        .id = ION_AUDIO_HEAP_BL_ID,
        .type = ION_HEAP_TYPE_CARVEOUT,
        .name = ION_AUDIO_BL_HEAP_NAME,
        .memory_type = ION_EBI_TYPE,
        .extra_data = (void *)&co_ion_pdata,
        .base = BOOTLOADER_BASE_ADDR,
    },
#endif
};

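ION Handle:

When an ION client allocates a buffer, a unique handle is assigned to it; a client can of course allocate ION buffers many times. What the allocation returns is an ION handle, but note that it is the ION buffer that is tied to the actual memory, including information such as size and address. struct ion_handle and struct ion_buffer are shown below: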

/**
 * ion_handle - a client local reference to a buffer
 * @ref:      reference count
 * @client:   back pointer to the client the buffer resides in
 * @buffer:   pointer to the buffer
 * @node:     node in the client's handle rbtree
 * @kmap_cnt: count of times this client has mapped to kernel
 * @dmap_cnt: count of times this client has mapped for dma
 *
 * Modifications to node, map_cnt or mapping should be protected by the
 * lock in the client. Other fields are never changed after initialization.
 */
struct ion_handle {
    struct kref ref;
    struct ion_client *client;
    struct ion_buffer *buffer;
    struct rb_node node;
    unsigned int kmap_cnt;
    unsigned int iommu_map_cnt;
};

/**
 * struct ion_buffer - metadata for a particular buffer
 * @ref:       reference count
 * @node:      node in the ion_device buffers tree
 * @dev:       back pointer to the ion_device
 * @heap:      back pointer to the heap the buffer came from
 * @flags:     buffer specific flags
 * @size:      size of the buffer
 * @priv_virt: private data to the buffer representable as
 *             a void *
 * @priv_phys: private data to the buffer representable as
 *             an ion_phys_addr_t (and someday a phys_addr_t)
 * @lock:      protects the buffers cnt fields
 * @kmap_cnt:  number of times the buffer is mapped to the kernel
 * @vaddr:     the kernel mapping if kmap_cnt is not zero
 * @dmap_cnt:  number of times the buffer is mapped for dma
 * @sg_table:  the sg table for the buffer if dmap_cnt is not zero
 */
struct ion_buffer {
    struct kref ref;
    struct rb_node node;
    struct ion_device *dev;
    struct ion_heap *heap;
    unsigned long flags;
    size_t size;
    union {
        void *priv_virt;
        ion_phys_addr_t priv_phys;
    };
    struct mutex lock;
    int kmap_cnt;
    void *vaddr;
    int dmap_cnt;
    struct sg_table *sg_table;
    int umap_cnt;
    unsigned int iommu_map_cnt;
    struct rb_root iommu_maps;
    int marked;
};
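
Both structures start with a struct kref, which suggests two levels of reference counting: a handle is one client's reference to a buffer, and the buffer survives as long as any handle still points at it. The sketch below models that relationship in user space with plain integer counters. It is an illustration under that assumption, not the kernel implementation, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ion_buffer: owns the memory. */
struct toy_buffer {
    int refcount;
    size_t size;
};

/* Hypothetical stand-in for struct ion_handle: one client's reference. */
struct toy_handle {
    int refcount;
    struct toy_buffer *buffer;
};

/* Number of buffers currently alive, so a demo can observe frees. */
static int toy_live_buffers;

/* Create a handle on an existing buffer (what an import would do). */
struct toy_handle *toy_handle_create(struct toy_buffer *buffer)
{
    struct toy_handle *handle = malloc(sizeof(*handle));

    if (!handle)
        return NULL;
    handle->refcount = 1;
    handle->buffer = buffer;
    buffer->refcount++;              /* each handle pins the buffer */
    return handle;
}

/* Allocate a fresh buffer and return the first handle on it. */
struct toy_handle *toy_alloc(size_t size)
{
    struct toy_buffer *buffer = malloc(sizeof(*buffer));
    struct toy_handle *handle;

    if (!buffer)
        return NULL;
    buffer->refcount = 0;
    buffer->size = size;
    toy_live_buffers++;
    handle = toy_handle_create(buffer);
    if (!handle) {
        free(buffer);
        toy_live_buffers--;
    }
    return handle;
}

/* Drop one reference on a handle; the last handle frees the buffer. */
void toy_handle_put(struct toy_handle *handle)
{
    assert(handle && handle->refcount > 0);
    if (--handle->refcount > 0)
        return;
    if (--handle->buffer->refcount == 0) {
        free(handle->buffer);
        toy_live_buffers--;
    }
    free(handle);
}
```

Dropping the first client's handle leaves the buffer alive as long as a second client still holds its own handle; only the last toy_handle_put releases the memory.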

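ION Client:

Both user space and kernel space can become clients, though the way a client is created differs slightly. Let's first go through the basic flow.

Kernel space:

First create a client: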

struct ion_client *ion_client_create(struct ion_device *dev,
                                     unsigned int heap_mask,
                                     const char *name)

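heap_mask: the heap types that may be allocated from, such as carveout, system heap, iommu, and so on.

Qualcomm wraps this in the msm_ion_client_create function.

Once you have a client, you can allocate memory: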

struct ion_handle *ion_alloc(struct ion_client *client, size_t len,
                             size_t align, unsigned int flags)

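flags: selects the heap id(s) to allocate from.

With a handle, that is, a buffer, in hand you are almost ready to use it. At this point, however, you only have physical memory, so it has to be mapped first: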

void *ion_map_kernel(struct ion_client *client, struct ion_handle *handle,
                     unsigned long flags)

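User space:

To use ION, user space must also create a client first, which it does by opening /dev/ion; in the end this also calls ion_client_create.

One difference from creating a client in kernel space is that user space cannot choose the heap type (the heap type is implied by the predefined heap id), whereas kernel space can.

In addition, user space allocates memory through an ioctl whose cmd is ION_IOC_ALLOC.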

ion_fd = open("/dev/ion", O_RDONLY | O_SYNC);
ioctl(ion_fd, ION_IOC_ALLOC, &alloc);

alloc is a struct ion_allocation_data; len is the length of the requested buffer and flags is the heap id.

/**
 * struct ion_allocation_data - metadata passed from userspace for allocations
 * @len:    size of the allocation
 * @align:  required alignment of the allocation
 * @flags:  flags passed to heap
 * @handle: pointer that will be populated with a cookie to use to refer
 *          to this allocation
 *
 * Provided by userspace as an argument to the ioctl
 */
struct ion_allocation_data {
    size_t len;
    size_t align;
    unsigned int flags;
    struct ion_handle *handle;
};

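Once the buffer is allocated, user space must mmap it before using it. With ION this is done by first calling the ION_IOC_SHARE/ION_IOC_MAP ioctl to obtain an fd that can be mmap'ed, and then calling mmap to get the buffer address.

You can then also pass this fd to another process, for example through binder. The other process uses the ION_IOC_IMPORT ioctl to obtain the shared buffer.

Here is an example: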

/* Process A: */
int ionfd = open("/dev/ion", O_RDONLY | O_DSYNC);
alloc_data.len = 0x1000;
alloc_data.align = 0x1000;
alloc_data.flags = ION_HEAP(ION_CP_MM_HEAP_ID);
rc = ioctl(ionfd, ION_IOC_ALLOC, &alloc_data);
fd_data.handle = alloc_data.handle;
rc = ioctl(ionfd, ION_IOC_SHARE, &fd_data);
shared_fd = fd_data.fd;

/* Process B: */
fd_data.fd = shared_fd;
rc = ioctl(ionfd, ION_IOC_IMPORT, &fd_data);
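
Binder and /dev/ion are not available on an ordinary Linux machine, so the fd-passing idea can be tried with a swapped-in technique: a Unix-domain socket carrying the descriptor as SCM_RIGHTS ancillary data, and a temporary file standing in for the ION buffer. This is only a sketch of the general mechanism, not of binder or ION. send_fd, recv_fd and share_roundtrip are hypothetical helper names, and both "processes" live in one process here to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define BUF_SIZE 4096

/* Control-message buffer, aligned for struct cmsghdr. */
union fd_ctrl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one descriptor over a Unix-domain socket. Returns 0 on success. */
int send_fd(int sock, int fd)
{
    char byte = 'F';                 /* at least one data byte must travel */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor. Returns the new fd, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);
    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* "A" maps a buffer and sends its fd; "B" maps the received fd.
 * Returns 0 if B sees what A wrote through its own mapping. */
int share_roundtrip(void)
{
    int sv[2] = { -1, -1 };
    int fd_b = -1;
    int rc = -1;
    char *a = MAP_FAILED;
    char *b = MAP_FAILED;
    FILE *backing = tmpfile();       /* stand-in for the buffer's file */

    if (!backing)
        return -1;
    if (ftruncate(fileno(backing), BUF_SIZE) == 0 &&
        socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == 0) {
        a = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
                 fileno(backing), 0);
        if (a != MAP_FAILED && send_fd(sv[0], fileno(backing)) == 0)
            fd_b = recv_fd(sv[1]);
        if (fd_b >= 0)
            b = mmap(NULL, BUF_SIZE, PROT_READ, MAP_SHARED, fd_b, 0);
    }
    if (b != MAP_FAILED) {
        strcpy(a, "hello from A");
        rc = strcmp(b, "hello from A") == 0 ? 0 : -1;
        munmap(b, BUF_SIZE);
    }
    if (a != MAP_FAILED)
        munmap(a, BUF_SIZE);
    if (fd_b >= 0)
        close(fd_b);
    if (sv[0] >= 0) {
        close(sv[0]);
        close(sv[1]);
    }
    fclose(backing);
    return rc;
}
```

The receiver gets a new descriptor number that refers to the same open file, so both mappings share the same pages. That is the property that makes a per-buffer fd convenient for handing buffers from one process to another.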

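The previous article, on ION basic concepts, covered heap types, heap ids, clients, handles, and how to use them. This article analyzes how ION operates under the hood.

The MSM8x25Q platform uses board-qrd7627a.c; the ION-related definitions are as follows: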

/**
 * These heaps are listed in the order they will be allocated.
 * Don't swap the order unless you know what you are doing!
 */
struct ion_platform_heap msm7627a_heaps[] = {
    {
        .id = ION_SYSTEM_HEAP_ID,
        .type = ION_HEAP_TYPE_SYSTEM,
        .name = ION_VMALLOC_HEAP_NAME,
    },
#ifdef CONFIG_MSM_MULTIMEDIA_USE_ION
    /* PMEM_ADSP = CAMERA */
    {
        .id = ION_CAMERA_HEAP_ID,
        .type = CAMERA_HEAP_TYPE,
        .name = ION_CAMERA_HEAP_NAME,
        .memory_type = ION_EBI_TYPE,
        .extra_data = (void *)&co_mm_ion_pdata,
        .priv = (void *)&ion_cma_device.dev,
    },
    /* AUDIO HEAP 1 */
    {
        .id = ION_AUDIO_HEAP_ID,
        .type = ION_HEAP_TYPE_CARVEOUT,
        .name = ION_AUDIO_HEAP_NAME,
        .memory_type = ION_EBI_TYPE,
        .extra_data = (void *)&co_ion_pdata,
    },
    /* PMEM_MDP = SF */
    {
        .id = ION_SF_HEAP_ID,
        .type = ION_HEAP_TYPE_CARVEOUT,
        .name = ION_SF_HEAP_NAME,
        .memory_type = ION_EBI_TYPE,
        .extra_data = (void *)&co_ion_pdata,
    },
    /* AUDIO HEAP 2 */
    {
        .id = ION_AUDIO_HEAP_BL_ID,
        .type = ION_HEAP_TYPE_CARVEOUT,
        .name = ION_AUDIO_BL_HEAP_NAME,
        .memory_type = ION_EBI_TYPE,
        .extra_data = (void *)&co_ion_pdata,
        .base = BOOTLOADER_BASE_ADDR,
    },
#endif
};

static struct ion_co_heap_pdata co_ion_pdata = {
    .adjacent_mem_id = INVALID_HEAP_ID,
    .align = PAGE_SIZE,
};

static struct ion_co_heap_pdata co_mm_ion_pdata = {
    .adjacent_mem_id = INVALID_HEAP_ID,
    .align = PAGE_SIZE,
};

static u64 msm_dmamask = DMA_BIT_MASK(32);

static struct platform_device ion_cma_device = {
    .name = "ion-cma-device",
    .id = -1,
    .dev = {
        .dma_mask = &msm_dmamask,
        .coherent_dma_mask = DMA_BIT_MASK(32),
    }
};

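Qualcomm warns against casually changing the order, because later code relies on this order being fixed; if you swap entries, the code will no longer work correctly.

Also, this system uses only two heap types, ION_HEAP_TYPE_CARVEOUT and ION_HEAP_TYPE_SYSTEM.

As we will see later, memory for ION_HEAP_TYPE_CARVEOUT is in fact allocated using the mem pool described previously.

The platform device is as follows; it is used in msm_ion.c.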

static struct ion_platform_data ion_pdata = {
    .nr = MSM_ION_HEAP_NUM,
    .has_outer_cache = 1,
    .heaps = msm7627a_heaps,
};

static struct platform_device ion_dev = {
    .name = "ion-msm",
    .id = 1,
    .dev = { .platform_data = &ion_pdata },
};

ION initialization

Turning to msm_ion.c, where some functions from ion.c are also wrapped. Everything starts with device matching:

static struct platform_driver msm_ion_driver = {
    .probe = msm_ion_probe,
    .remove = msm_ion_remove,
    .driver = { .name = "ion-msm" }
};

static int __init msm_ion_init(void)
{
    /* Registering the driver leads to msm_ion_probe being called. */
    return platform_driver_register(&msm_ion_driver);
}

static int msm_ion_probe(struct platform_device *pdev)
{
    /* i.e. ion_pdata from board-qrd7627a.c */
    struct ion_platform_data *pdata = pdev->dev.platform_data;
    int err;
    int i;

    /* number of heaps */
    num_heaps = pdata->nr;
    /* allocate the array of struct ion_heap pointers */
    heaps = kcalloc(pdata->nr, sizeof(struct ion_heap *), GFP_KERNEL);
    if (!heaps) {
        err = -ENOMEM;
        goto out;
    }
    /* Create the device node, ultimately /dev/ion, for user space to use. */
    idev = ion_device_create(NULL);
    if (IS_ERR_OR_NULL(idev)) {
        err = PTR_ERR(idev);
        goto freeheaps;
    }
    /* Ultimately allocates adjacent memory depending on whether
       adjacent_mem_id is defined. We do not use it, so ignore this function. */
    msm_ion_heap_fixup(pdata->heaps, num_heaps);
    /* create the heaps as specified in the board file */
    for (i = 0; i < num_heaps; i++) {
        struct ion_platform_heap *heap_data = &pdata->heaps[i];

        /* reserve memory for the ION heap */
        msm_ion_allocate(heap_data);
        heap_data->has_outer_cache = pdata->has_outer_cache;
        /* create the ION heap */
        heaps[i] = ion_heap_create(heap_data);
        if (IS_ERR_OR_NULL(heaps[i])) {
            heaps[i] = 0;
            continue;
        } else {
            if (heap_data->size)
                pr_info("ION heap %s created at %lx "
                        "with size %x\n", heap_data->name,
                        heap_data->base,
                        heap_data->size);
            else
                pr_info("ION heap %s created\n",
                        heap_data->name);
        }
        /* Add the created heap to idev for later use. */
        ion_device_add_heap(idev, heaps[i]);
    }
    /* check whether any heaps overlap */
    check_for_heap_overlap(pdata->heaps, num_heaps);
    platform_set_drvdata(pdev, idev);
    return 0;
freeheaps:
    kfree(heaps);
out:
    return err;
}

The /dev/ion node is created through ion_device_create:

struct ion_device *ion_device_create(long (*custom_ioctl)
                                     (struct ion_client *client,
                                      unsigned int cmd,
                                      unsigned long arg))
{
    struct ion_device *idev;
    int ret;

    idev = kzalloc(sizeof(struct ion_device), GFP_KERNEL);
    if (!idev)
        return ERR_PTR(-ENOMEM);
    /* it is a misc device */
    idev->dev.minor = MISC_DYNAMIC_MINOR;
    /* the node name is ion */
    idev->dev.name = "ion";
    /* fops is ion_fops, so operations on ion go through the function
       pointers in ion_fops. */
    idev->dev.fops = &ion_fops;
    idev->dev.parent = NULL;
    ret = misc_register(&idev->dev);
    if (ret) {
        pr_err("ion: failed to register misc device.\n");
        return ERR_PTR(ret);
    }
    /* create the debugfs directory, at /sys/kernel/debug/ion/ */
    idev->debug_root = debugfs_create_dir("ion", NULL);
    if (IS_ERR_OR_NULL(idev->debug_root))
        pr_err("ion: failed to create debug files.\n");
    idev->custom_ioctl = custom_ioctl;
    idev->buffers = RB_ROOT;
    mutex_init(&idev->lock);
    idev->heaps = RB_ROOT;
    idev->clients = RB_ROOT;
    /* Create a check_leaked_fds file in the ion directory, used to check
       whether ION usage leaks memory. If ION memory is allocated, is no
       longer needed, and is never freed, the result is a memory leak. */
    debugfs_create_file("check_leaked_fds", 0664, idev->debug_root, idev,
                        &debug_leak_fops);
    return idev;
}

msm_ion_allocate:

static void msm_ion_allocate(struct ion_platform_heap *heap)
{
    if (!heap->base && heap->extra_data) {
        unsigned int align = 0;

        switch (heap->type) {
        /* get the align parameter */
        case ION_HEAP_TYPE_CARVEOUT:
            align =
                ((struct ion_co_heap_pdata *) heap->extra_data)->align;
            break;
        /* we do not use this type */
        case ION_HEAP_TYPE_CP:
        {
            struct ion_cp_heap_pdata *data =
                (struct ion_cp_heap_pdata *)
                heap->extra_data;
            if (data->reusable) {
                const struct fmem_data *fmem_info =
                    fmem_get_info();
                heap->base = fmem_info->phys;
                data->virt_addr = fmem_info->virt;
                pr_info("ION heap %s using FMEM\n", heap->name);
            } else if (data->mem_is_fmem) {
                const struct fmem_data *fmem_info =
                    fmem_get_info();
                heap->base = fmem_info->phys + fmem_info->size;
            }
            align = data->align;
            break;
        }
        default:
            break;
        }
        if (align && !heap->base) {
            /* get the heap's base address */
            heap->base = msm_ion_get_base(heap->size,
                                          heap->memory_type,
                                          align);
            if (!heap->base)
                pr_err("%s: could not get memory for heap %s "
                       "(id %x)\n", __func__, heap->name, heap->id);
        }
    }
}

static unsigned long msm_ion_get_base(unsigned long size, int memory_type,
                                      unsigned int align)
{
    switch (memory_type) {
    /* We defined the EBI type. Note that this function was analyzed in the
       mem pool article: memory is managed and allocated through a mempool. */
    case ION_EBI_TYPE:
        return allocate_contiguous_ebi_nomap(size, align);
        break;
    case ION_SMI_TYPE:
        return allocate_contiguous_memory_nomap(size, MEMTYPE_SMI,
                                                align);
        break;
    default:
        pr_err("%s: Unknown memory type %d\n", __func__, memory_type);
        return 0;
    }
}

ion_heap_create:

struct ion_heap *ion_heap_create(struct ion_platform_heap *heap_data)
{
    struct ion_heap *heap = NULL;

    /* call the create function that matches the heap type */
    switch (heap_data->type) {
    case ION_HEAP_TYPE_SYSTEM_CONTIG:
        heap = ion_system_contig_heap_create(heap_data);
        break;
    case ION_HEAP_TYPE_SYSTEM:
        heap = ion_system_heap_create(heap_data);
        break;
    case ION_HEAP_TYPE_CARVEOUT:
        heap = ion_carveout_heap_create(heap_data);
        break;
    case ION_HEAP_TYPE_IOMMU:
        heap = ion_iommu_heap_create(heap_data);
        break;
    case ION_HEAP_TYPE_CP:
        heap = ion_cp_heap_create(heap_data);
        break;
#ifdef CONFIG_CMA
    case ION_HEAP_TYPE_DMA:
        heap = ion_cma_heap_create(heap_data);
        break;
#endif
    default:
        pr_err("%s: Invalid heap type %d\n", __func__,
               heap_data->type);
        return ERR_PTR(-EINVAL);
    }
    if (IS_ERR_OR_NULL(heap)) {
        pr_err("%s: error creating heap %s type %d base %lu size %u\n",
               __func__, heap_data->name, heap_data->type,
               heap_data->base, heap_data->size);
        return ERR_PTR(-EINVAL);
    }
    /* save the heap's name, id, and private data */
    heap->name = heap_data->name;
    heap->id = heap_data->id;
    heap->priv = heap_data->priv;
    return heap;
}
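
The functions above lean on the kernel's ERR_PTR / PTR_ERR / IS_ERR_OR_NULL idiom, in which a small negative errno is stored in the pointer value itself. The user-space sketch below re-creates that idiom under hypothetical toy_* names (the 4095 limit mirrors the kernel's MAX_ERRNO) and then mimics the probe loop's policy of recording a failed heap as NULL and carrying on. It is an illustration, not the kernel's own header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *toy_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode it again. Note that a NULL pointer decodes to 0. */
static inline long toy_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top TOY_MAX_ERRNO addresses. */
static inline int toy_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-TOY_MAX_ERRNO);
}

static inline int toy_is_err_or_null(const void *ptr)
{
    return !ptr || toy_is_err(ptr);
}

/* Hypothetical heap object and a tiny static pool to hand out. */
struct toy_heap {
    int type;
};

#define TOY_POOL_SIZE 4
static struct toy_heap toy_pool[TOY_POOL_SIZE];

/* Create a heap in the given slot; only types 0..2 are "supported". */
struct toy_heap *toy_heap_create(int slot, int type)
{
    assert(slot >= 0 && slot < TOY_POOL_SIZE);
    if (type < 0 || type > 2)
        return toy_err_ptr(-EINVAL);
    toy_pool[slot].type = type;
    return &toy_pool[slot];
}

/*
 * Mirror of the probe loop's policy: a heap that fails to create is
 * stored as NULL and skipped. Returns the number of usable heaps.
 * n must not exceed TOY_POOL_SIZE.
 */
int toy_create_all(const int *types, int n, struct toy_heap **out)
{
    int created = 0;
    int i;

    for (i = 0; i < n; i++) {
        out[i] = toy_heap_create(i, types[i]);
        if (toy_is_err_or_null(out[i])) {
            out[i] = NULL;
            continue;
        }
        created++;
    }
    return created;
}
```

One detail worth noticing is that decoding a NULL pointer yields 0. A caller that tests with an IS_ERR_OR_NULL-style check and then returns the decoded value may therefore want to substitute an explicit error code for the NULL case.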

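From the code below we can see that ION_HEAP_TYPE_SYSTEM_CONTIG is created using kmalloc, ION_HEAP_TYPE_SYSTEM uses vmalloc, and ion_carveout_heap_create uses a region of memory that the system has reserved in advance. When ION handles an allocation request, it operates on the heap->ops that matches the heap's type. Let's look at the three functions in turn: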

struct ion_heap *ion_system_contig_heap_create(struct ion_platform_heap *pheap)
{
struct ion_heap *heap;
heap = kzalloc(sizeof(struct ion_heap), GFP_KERNEL);
if (!heap)
return ERR_PTR(-ENOMEM);
/*使用的是kmalloc_ops，上篇有提到哦*/
heap->ops = &kmalloc_ops;
heap->type = ION_HEAP_TYPE_SYSTEM_CONTIG;
system_heap_contig_has_outer_cache = pheap->has_outer_cache;
return heap;
}
struct ion_heap *ion_system_heap_create(struct ion_platform_heap *pheap)
{
struct ion_heap *heap;
heap = kzalloc(sizeof(struct ion_heap), GFP_KERNEL);
if (!heap)
return ERR_PTR(-ENOMEM);
/*和上面函数的区别仅在于ops*/
heap->ops = &vmalloc_ops;
heap->type = ION_HEAP_TYPE_SYSTEM;
system_heap_has_outer_cache = pheap->has_outer_cache;
return heap;
}
struct ion_heap *ion_carveout_heap_create(struct ion_platform_heap *heap_data)
{
struct ion_carveout_heap *carveout_heap;
int ret;
carveout_heap = kzalloc(sizeof(struct ion_carveout_heap), GFP_KERNEL);
if (!carveout_heap)
return ERR_PTR(-ENOMEM);
/* Create a new pool. It is not clear to me why the global mempools is not used directly here. */
carveout_heap->pool = gen_pool_create(12, -1);
if (!carveout_heap->pool) {
kfree(carveout_heap);
return ERR_PTR(-ENOMEM);
}
carveout_heap->base = heap_data->base;
ret = gen_pool_add(carveout_heap->pool, carveout_heap->base,
heap_data->size, -1);
if (ret < 0) {
gen_pool_destroy(carveout_heap->pool);
kfree(carveout_heap);
return ERR_PTR(-EINVAL);
}
carveout_heap->heap.ops = &carveout_heap_ops;
carveout_heap->heap.type = ION_HEAP_TYPE_CARVEOUT;
carveout_heap->allocated_bytes = 0;
carveout_heap->total_size = heap_data->size;
carveout_heap->has_outer_cache = heap_data->has_outer_cache;
if (heap_data->extra_data) {
struct ion_co_heap_pdata *extra_data =
heap_data->extra_data;
if (extra_data->setup_region)
carveout_heap->bus_id = extra_data->setup_region();
if (extra_data->request_region)
carveout_heap->request_region =
extra_data->request_region;
if (extra_data->release_region)
carveout_heap->release_region =
extra_data->release_region;
}
return &carveout_heap->heap;
}
Once a heap has been created, it is saved into idev:
void ion_device_add_heap(struct ion_device *dev, struct ion_heap *heap)
{
struct rb_node **p = &dev->heaps.rb_node;
struct rb_node *parent = NULL;
struct ion_heap *entry;
if (!heap->ops->allocate || !heap->ops->free || !heap->ops->map_dma ||
!heap->ops->unmap_dma)
pr_err("%s: can not add heap with invalid ops struct.\n",
__func__);
heap->dev = dev;
mutex_lock(&dev->lock);
while (*p) {
parent = *p;
entry = rb_entry(parent, struct ion_heap, node);
if (heap->id < entry->id) {
p = &(*p)->rb_left;
} else if (heap->id > entry->id ) {
p = &(*p)->rb_right;
} else {
pr_err("%s: can not insert multiple heaps with "
"id %d\n", __func__, heap->id);
goto end;
}
}
/* Heaps are stored in a red-black tree. */
rb_link_node(&heap->node, parent, p);
rb_insert_color(&heap->node, &dev->heaps);
/* Create a debugfs file named after the heap, under the ion directory, e.g. vmalloc, camera_preview, audio. */
debugfs_create_file(heap->name, 0664, dev->debug_root, heap,
&debug_heap_fops);
end:
mutex_unlock(&dev->lock);
}
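
The descent loop in ion_device_add_heap is the usual pointer-to-pointer insertion idiom. Here is a minimal sketch of it, assuming a plain unbalanced binary search tree in place of the kernel's red-black tree (so there is no rebalancing step corresponding to rb_insert_color). demo_node, demo_insert, and demo_find are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the real code embeds struct rb_node in ion_heap. */
struct demo_node {
    int id;
    struct demo_node *left;
    struct demo_node *right;
};

/* Walk down from the root keeping a pointer to the link we would fill in.
 * Returns 0 on success, -1 if a node with the same id already exists. */
int demo_insert(struct demo_node **root, struct demo_node *node)
{
    struct demo_node **p = root;

    while (*p) {
        if (node->id < (*p)->id)
            p = &(*p)->left;
        else if (node->id > (*p)->id)
            p = &(*p)->right;
        else
            return -1; /* duplicate id: refuse, as the original does */
    }
    node->left = NULL;
    node->right = NULL;
    *p = node; /* link the new node into the empty slot we found */
    return 0;
}

/* Look a node up by id; returns NULL when it is not in the tree. */
struct demo_node *demo_find(struct demo_node *root, int id)
{
    while (root) {
        if (id < root->id)
            root = root->left;
        else if (id > root->id)
            root = root->right;
        else
            return root;
    }
    return NULL;
}
```

Because p always points at the link to be written, the empty-tree case and the leaf case need no special handling.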

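ION initialization is now complete. So how is it used? Through the misc device created earlier, that is, idev. Recall that its fops is ion_fops. We first look at how user space uses ION, and then at how kernel space does.

Using ION from user space

The ion_fops structure is as follows: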
static const struct file_operations ion_fops = {
.owner = THIS_MODULE,
.open = ion_open,
.release = ion_release,
.unlocked_ioctl = ion_ioctl,
};
User space controls everything through ioctl. First, ion_open:
static int ion_open(struct inode *inode, struct file *file)
{
struct miscdevice *miscdev = file->private_data;
struct ion_device *dev = container_of(miscdev, struct ion_device, dev);
struct ion_client *client;
char debug_name[64];
pr_debug("%s: %d\n", __func__, __LINE__);
snprintf(debug_name, 64, "%u", task_pid_nr(current->group_leader));
/* Create an ion client from idev, using the task pid as its name. */
client = ion_client_create(dev, -1, debug_name);
if (IS_ERR_OR_NULL(client))
return PTR_ERR(client);
file->private_data = client;
return 0;
}
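
ion_open recovers the enclosing struct ion_device from the embedded miscdevice with container_of. For readers who have not met that macro, here is a small user-space sketch of the idea. The demo_* types are hypothetical stand-ins, and demo_container_of is a simplified version of the kernel macro without its type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to the struct holding it. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for struct miscdevice and struct ion_device. */
struct demo_miscdev {
    int minor;
};

struct demo_ion_device {
    int heap_count;
    struct demo_miscdev dev; /* embedded member, not a pointer */
};

/* Given only the embedded member, find the outer device. */
struct demo_ion_device *demo_device_from_misc(struct demo_miscdev *misc)
{
    return demo_container_of(misc, struct demo_ion_device, dev);
}
```

The same trick appears again wherever a heap-specific struct embeds a struct ion_heap and is recovered from a pointer to it.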

As the previous article mentioned, an ion_client must be created before ION can be used, so the client is created when user space opens the ION device.

struct ion_client *ion_client_create(struct ion_device *dev,
unsigned int heap_mask,
const char *name)
{
struct ion_client *client;
struct task_struct *task;
struct rb_node **p;
struct rb_node *parent = NULL;
struct ion_client *entry;
pid_t pid;
unsigned int name_len;
if (!name) {
pr_err("%s: Name cannot be null\n", __func__);
return ERR_PTR(-EINVAL);
}
name_len = strnlen(name, 64);
get_task_struct(current->group_leader);
task_lock(current->group_leader);
pid = task_pid_nr(current->group_leader);
/* don't bother to store task struct for kernel threads,
they can't be killed anyway */
if (current->group_leader->flags & PF_KTHREAD) {
put_task_struct(current->group_leader);
task = NULL;
} else {
task = current->group_leader;
}
task_unlock(current->group_leader);
/* Allocate the ion client struct. */
client = kzalloc(sizeof(struct ion_client), GFP_KERNEL);
if (!client) {
if (task)
put_task_struct(current->group_leader);
return ERR_PTR(-ENOMEM);
}
/* Save the various parameters. */
client->dev = dev;
client->handles = RB_ROOT;
mutex_init(&client->lock);
client->name = kzalloc(name_len+1, GFP_KERNEL);
if (!client->name) {
put_task_struct(current->group_leader);
kfree(client);
return ERR_PTR(-ENOMEM);
} else {
strlcpy(client->name, name, name_len+1);
}
client->heap_mask = heap_mask;
client->task = task;
client->pid = pid;
mutex_lock(&dev->lock);
p = &dev->clients.rb_node;
while (*p) {
parent = *p;
entry = rb_entry(parent, struct ion_client, node);
if (client < entry)
p = &(*p)->rb_left;
else if (client > entry)
p = &(*p)->rb_right;
}
/* Add this client to idev's clients tree. */
rb_link_node(&client->node, parent, p);
rb_insert_color(&client->node, &dev->clients);
/* The debugfs file created under the ion directory is named after the pid. */
client->debug_root = debugfs_create_file(name, 0664,
dev->debug_root, client,
&debug_client_fops);
mutex_unlock(&dev->lock);
return client;
}

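With a client in hand, a user program can start requesting ION buffers, which is done through ioctl commands.

The ion_ioctl function handles several commands. ION_IOC_ALLOC and ION_IOC_FREE are a pair that allocate and free a buffer. Before using a buffer, a user-space program must call ION_IOC_MAP to get an fd that it can mmap to obtain the buffer address, and ION_IOC_IMPORT exists so that the memory can be shared with another user-space process.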

static long ion_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
struct ion_client *client = filp->private_data;
switch (cmd) {
case ION_IOC_ALLOC:
{
struct ion_allocation_data data;
if (copy_from_user(&data, (void __user *)arg, sizeof(data)))
return -EFAULT;
/* Allocate the buffer. */
data.handle = ion_alloc(client, data.len, data.align,
data.flags);
if (IS_ERR(data.handle))
return PTR_ERR(data.handle);
if (copy_to_user((void __user *)arg, &data, sizeof(data))) {
ion_free(client, data.handle);
return -EFAULT;
}
break;
}
case ION_IOC_FREE:
{
struct ion_handle_data data;
bool valid;
if (copy_from_user(&data, (void __user *)arg,
sizeof(struct ion_handle_data)))
return -EFAULT;
mutex_lock(&client->lock);
valid = ion_handle_validate(client, data.handle);
mutex_unlock(&client->lock);
if (!valid)
return -EINVAL;
ion_free(client, data.handle);
break;
}
case ION_IOC_MAP:
case ION_IOC_SHARE:
{
struct ion_fd_data data;
int ret;
if (copy_from_user(&data, (void __user *)arg, sizeof(data)))
return -EFAULT;
/* Check whether this cmd has already been called; if so return, otherwise set the flags. */
ret = ion_share_set_flags(client, data.handle, filp->f_flags);
if (ret)
return ret;
data.fd = ion_share_dma_buf(client, data.handle);
if (copy_to_user((void __user *)arg, &data, sizeof(data)))
return -EFAULT;
if (data.fd < 0)
return data.fd;
break;
}
case ION_IOC_IMPORT:
{
struct ion_fd_data data;
int ret = 0;
if (copy_from_user(&data, (void __user *)arg,
sizeof(struct ion_fd_data)))
return -EFAULT;
data.handle = ion_import_dma_buf(client, data.fd);
if (IS_ERR(data.handle))
data.handle = NULL;
if (copy_to_user((void __user *)arg, &data,
sizeof(struct ion_fd_data)))
return -EFAULT;
if (ret < 0)
return ret;
break;
}
case ION_IOC_CUSTOM:
~~snip
case ION_IOC_CLEAN_CACHES:
case ION_IOC_INV_CACHES:
case ION_IOC_CLEAN_INV_CACHES:
~~snip
case ION_IOC_GET_FLAGS:
~~snip
default:
return -ENOTTY;
}
return 0;
}

The following subsections explain how allocation and sharing work.

ION_IOC_ALLOC

struct ion_handle *ion_alloc(struct ion_client *client, size_t len,
size_t align, unsigned int flags)
{
~~snip
mutex_lock(&dev->lock);
/* Iterate over the device's heaps (kept in a red-black tree). */
for (n = rb_first(&dev->heaps); n != NULL; n = rb_next(n)) {
struct ion_heap *heap = rb_entry(n, struct ion_heap, node);
/* Create the buffer only if both the heap type and the heap id match. */
/* if the client doesn't support this heap type */
if (!((1 << heap->type) & client->heap_mask))
continue;
/* if the caller didn't specify this heap type */
if (!((1 << heap->id) & flags))
continue;
/* Do not allow un-secure heap if secure is specified */
if (secure_allocation && (heap->type != ION_HEAP_TYPE_CP))
continue;
buffer = ion_buffer_create(heap, dev, len, align, flags);
~~snip
}
mutex_unlock(&dev->lock);
~~snip
/* Once the buffer exists, create a handle to manage it. */
handle = ion_handle_create(client, buffer);
~~snip
}
Once a heap has been found, ion_buffer_create is called:
static struct ion_buffer *ion_buffer_create(struct ion_heap *heap,
struct ion_device *dev,
unsigned long len,
unsigned long align,
unsigned long flags)
{
struct ion_buffer *buffer;
struct sg_table *table;
int ret;
/* Allocate a struct ion_buffer to manage the buffer. */
buffer = kzalloc(sizeof(struct ion_buffer), GFP_KERNEL);
if (!buffer)
return ERR_PTR(-ENOMEM);
buffer->heap = heap;
kref_init(&buffer->ref);
/* Call the allocate op of the matching heap type. Recall the different ops
tables mentioned earlier, such as carveout_heap_ops and vmalloc_ops. */
ret = heap->ops->allocate(heap, buffer, len, align, flags);
if (ret) {
kfree(buffer);
return ERR_PTR(ret);
}
buffer->dev = dev;
buffer->size = len;
/*http://lwn.net/Articles/263343/*/
table = buffer->heap->ops->map_dma(buffer->heap, buffer);
if (IS_ERR_OR_NULL(table)) {
heap->ops->free(buffer);
kfree(buffer);
return ERR_PTR(PTR_ERR(table));
}
buffer->sg_table = table;
mutex_init(&buffer->lock);
/* Add this ion buffer to idev's buffers tree so that it is managed centrally. */
ion_buffer_add(dev, buffer);
return buffer;
}
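
The two mask tests in the ion_alloc loop are easy to misread, so here is a small user-space sketch of just that filter. It is a simplification under stated assumptions: demo_heap, demo_heap_matches, and demo_pick_heap are hypothetical names, the secure-allocation check is left out, and the sketch returns the first matching heap, whereas the real loop also has to cope with ion_buffer_create failing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down heap descriptor; both values must stay below 32. */
struct demo_heap {
    unsigned int type;
    unsigned int id;
};

/* A heap qualifies only if the client accepts its type AND the caller
 * named its id in flags. */
bool demo_heap_matches(const struct demo_heap *heap,
                       unsigned int client_heap_mask, unsigned int flags)
{
    assert(heap->type < 32 && heap->id < 32);
    if (!((1u << heap->type) & client_heap_mask))
        return false; /* the client does not support this heap type */
    if (!((1u << heap->id) & flags))
        return false; /* the caller did not ask for this heap id */
    return true;
}

/* Return the index of the first matching heap, or -1 if none matches. */
int demo_pick_heap(const struct demo_heap *heaps, size_t count,
                   unsigned int client_heap_mask, unsigned int flags)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (demo_heap_matches(&heaps[i], client_heap_mask, flags))
            return (int)i;
    return -1;
}
```

Since ion_open passes -1 as heap_mask, a user-space client accepts every heap type, and in practice only the id bits in flags narrow the choice.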

static struct ion_handle *ion_handle_create(struct ion_client *client,
struct ion_buffer *buffer)
{
struct ion_handle *handle;
/* Allocate the struct ion_handle. */
handle = kzalloc(sizeof(struct ion_handle), GFP_KERNEL);
if (!handle)
return ERR_PTR(-ENOMEM);
kref_init(&handle->ref);
rb_init_node(&handle->node);
handle->client = client; // store the client in the handle
ion_buffer_get(buffer); // take a reference on the buffer
handle->buffer = buffer; // store the buffer in the handle as well
return handle;
}

Take heap type ION_HEAP_TYPE_CARVEOUT as an example and look at how it allocates a buffer.

Its allocate op is ion_carveout_heap_allocate.

static int ion_carveout_heap_allocate(struct ion_heap *heap,
struct ion_buffer *buffer,
unsigned long size, unsigned long align,
unsigned long flags)
{
buffer->priv_phys = ion_carveout_allocate(heap, size, align);
return buffer->priv_phys == ION_CARVEOUT_ALLOCATE_FAIL ? -ENOMEM : 0;
}
ion_phys_addr_t ion_carveout_allocate(struct ion_heap *heap,
unsigned long size,
unsigned long align)
{
struct ion_carveout_heap *carveout_heap =
container_of(heap, struct ion_carveout_heap, heap);
/* Buffers are managed through the mem pool created earlier. The memory was
reserved at initialization, so we only need to take a region from it. */
unsigned long offset = gen_pool_alloc_aligned(carveout_heap->pool,
size, ilog2(align));
/* Allocation can fail because no memory is left or because of fragmentation. */
if (!offset) {
if ((carveout_heap->total_size -
carveout_heap->allocated_bytes) >= size)
pr_debug("%s: heap %s has enough memory (%lx) but"
" the allocation of size %lx still failed."
" Memory is probably fragmented.",
__func__, heap->name,
carveout_heap->total_size -
carveout_heap->allocated_bytes, size);
return ION_CARVEOUT_ALLOCATE_FAIL;
}
/* Bytes allocated so far. */
carveout_heap->allocated_bytes += size;
return offset;
}

Similarly, the system heaps have their own allocation functions. The one shown here, for heap type ION_HEAP_TYPE_SYSTEM_CONTIG, is ion_system_contig_heap_allocate.

static int ion_system_contig_heap_allocate(struct ion_heap *heap,
struct ion_buffer *buffer,
unsigned long len,
unsigned long align,
unsigned long flags)
{
/* Allocate with kzalloc. */
buffer->priv_virt = kzalloc(len, GFP_KERNEL);
if (!buffer->priv_virt)
return -ENOMEM;
atomic_add(len, &system_contig_heap_allocated);
return 0;
}

The other heap types are left for the reader to explore. Next, ion_buffer_add is called to add the buffer to the device's buffers tree.

static void ion_buffer_add(struct ion_device *dev,
struct ion_buffer *buffer)
{
struct rb_node **p = &dev->buffers.rb_node;
struct rb_node *parent = NULL;
struct ion_buffer *entry;
while (*p) {
parent = *p;
entry = rb_entry(parent, struct ion_buffer, node);
if (buffer < entry) {
p = &(*p)->rb_left;
} else if (buffer > entry) {
p = &(*p)->rb_right;
} else {
pr_err("%s: buffer already found.", __func__);
BUG();
}
}
/* A red-black tree again. */
rb_link_node(&buffer->node, parent, p);
rb_insert_color(&buffer->node, &dev->buffers);
}

At this point we have a client and a handle, and the buffer allocation is complete.

ION_IOC_MAP / ION_IOC_SHARE

int ion_share_dma_buf(struct ion_client *client, struct ion_handle *handle)
{
struct ion_buffer *buffer;
struct dma_buf *dmabuf;
bool valid_handle;
int fd;
mutex_lock(&client->lock);
valid_handle = ion_handle_validate(client, handle);
mutex_unlock(&client->lock);
if (!valid_handle) {
WARN(1, "%s: invalid handle passed to share.\n", __func__);
return -EINVAL;
}
buffer = handle->buffer;
ion_buffer_get(buffer);
/* Export the buffer as a dma_buf, which creates a new file. */
dmabuf = dma_buf_export(buffer, &dma_buf_ops, buffer->size, O_RDWR);
if (IS_ERR(dmabuf)) {
ion_buffer_put(buffer);
return PTR_ERR(dmabuf);
}
/* Turn the file into an fd that user space can use. */
fd = dma_buf_fd(dmabuf, O_CLOEXEC);
if (fd < 0)
dma_buf_put(dmabuf);
return fd;
}
struct dma_buf *dma_buf_export(void *priv, const struct dma_buf_ops *ops,
size_t size, int flags)
{
struct dma_buf *dmabuf;
struct file *file;
~~snip
/* Allocate the struct dma_buf. */
dmabuf = kzalloc(sizeof(struct dma_buf), GFP_KERNEL);
if (dmabuf == NULL)
return ERR_PTR(-ENOMEM);
/* Save the information into dmabuf. Note that ops is dma_buf_ops, which mmap calls into later. */
dmabuf->priv = priv;
dmabuf->ops = ops;
dmabuf->size = size;
/* Create the new file. */
file = anon_inode_getfile("dmabuf", &dma_buf_fops, dmabuf, flags);
dmabuf->file = file;
mutex_init(&dmabuf->lock);
INIT_LIST_HEAD(&dmabuf->attachments);
return dmabuf;
}

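Through the steps above, user space obtains a new fd. A fresh fd is generated to cover the case where two user-space processes want to share this heap memory. User space then calls mmap on the fd, which on the kernel side ends up in dma_buf_mmap_internal, taken from the file's dma_buf_fops.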

static const struct file_operations dma_buf_fops = {
.release = dma_buf_release,
.mmap = dma_buf_mmap_internal,
};
static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
{
struct dma_buf *dmabuf;
if (!is_dma_buf_file(file))
return -EINVAL;
dmabuf = file->private_data;
/* Check whether the size user space wants to map is larger than the dmabuf's
size (that is, the size of the exported buffer); if so, return -EINVAL. */
/* check for overflowing the buffer's size */
if (vma->vm_pgoff + ((vma->vm_end - vma->vm_start) >> PAGE_SHIFT) >
dmabuf->size >> PAGE_SHIFT)
return -EINVAL;
/* This calls the mmap function in dma_buf_ops. */
return dmabuf->ops->mmap(dmabuf, vma);
}
struct dma_buf_ops dma_buf_ops = {
.map_dma_buf = ion_map_dma_buf,
.unmap_dma_buf = ion_unmap_dma_buf,
.mmap = ion_mmap,
.release = ion_dma_buf_release,
.begin_cpu_access = ion_dma_buf_begin_cpu_access,
.end_cpu_access = ion_dma_buf_end_cpu_access,
.kmap_atomic = ion_dma_buf_kmap,
.kunmap_atomic = ion_dma_buf_kunmap,
.kmap = ion_dma_buf_kmap,
.kunmap = ion_dma_buf_kunmap,
};
static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
{
struct ion_buffer *buffer = dmabuf->priv;
int ret;
if (!buffer->heap->ops->map_user) {
pr_err("%s: this heap does not define a method for mapping "
"to userspace\n", __func__);
return -EINVAL;
}
mutex_lock(&buffer->lock);
/* now map it to userspace */
/* This calls the heap's map_user; for carveout_heap_ops that is
ion_carveout_heap_map_user, an ordinary mmap implementation that we will not trace further. */
ret = buffer->heap->ops->map_user(buffer->heap, buffer, vma);
if (ret) {
mutex_unlock(&buffer->lock);
pr_err("%s: failure mapping buffer to userspace\n",
__func__);
} else {
buffer->umap_cnt++;
mutex_unlock(&buffer->lock);
vma->vm_ops = &ion_vm_ops;
/*
* move the buffer into the vm_private_data so we can access it
* from vma_open/close
*/
vma->vm_private_data = buffer;
}
return ret;
}
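
The size check in dma_buf_mmap_internal works in pages, not bytes. Here is a small user-space sketch of the same arithmetic, assuming 4 KB pages. demo_mapping_fits and DEMO_PAGE_SHIFT are hypothetical names, and the parameters mirror the vma fields used above.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4 KB pages */

/* True when the requested window [pgoff, pgoff + pages) lies inside the buffer. */
bool demo_mapping_fits(unsigned long vm_start, unsigned long vm_end,
                       unsigned long vm_pgoff, unsigned long buf_size)
{
    unsigned long pages = (vm_end - vm_start) >> DEMO_PAGE_SHIFT;

    return vm_pgoff + pages <= (buf_size >> DEMO_PAGE_SHIFT);
}
```

For a four-page buffer, mapping four pages at offset 0 fits, while the same length at page offset 1 would run one page past the end and is rejected.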

At this point user space has the buffer address and can start using it.

ION_IOC_IMPORT

When another user-space process needs this buffer, ION_IOC_IMPORT comes into play. Note that the fd passed in is the one obtained from ION_IOC_SHARE.

struct ion_handle *ion_import_dma_buf(struct ion_client *client, int fd)
{
struct dma_buf *dmabuf;
struct ion_buffer *buffer;
struct ion_handle *handle;
dmabuf = dma_buf_get(fd);
if (IS_ERR_OR_NULL(dmabuf))
return ERR_PTR(PTR_ERR(dmabuf));
/* if this memory came from ion */
~~snip
buffer = dmabuf->priv;
mutex_lock(&client->lock);
/* if a handle exists for this buffer just take a reference to it */
/* Look up whether a handle already exists for this buffer; create one if not.
The other process has only called open, so it has a client but no handle yet.
*/
handle = ion_handle_lookup(client, buffer);
if (!IS_ERR_OR_NULL(handle)) {
ion_handle_get(handle);
goto end;
}
handle = ion_handle_create(client, buffer);
if (IS_ERR_OR_NULL(handle))
goto end;
ion_handle_add(client, handle);
end:
mutex_unlock(&client->lock);
dma_buf_put(dmabuf);
return handle;
}

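The other user-space process now has a handle for the buffer as well, so client, buffer, and handle are all linked up. That process can then use mmap to work on this heap buffer too.

Compared with ordinary single-process use of ION, the difference is that the sharing processes share one struct ion_buffer, while each has its own struct ion_handle.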
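
The "one buffer, one handle per client" relationship can be modelled in a few lines of user-space C. This is a simplified sketch, not the kernel's bookkeeping: plain integer counters stand in for kref, a linked list stands in for the per-client red-black tree, there is no locking, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for ion_buffer, ion_handle and ion_client. */
struct demo_buffer {
    int refs; /* how many handles currently point here */
    size_t size;
};

struct demo_handle {
    int refs;
    struct demo_buffer *buffer;
    struct demo_handle *next;
};

struct demo_client {
    struct demo_handle *handles;
};

/* Find this client's handle for the buffer, if it already has one. */
static struct demo_handle *demo_lookup(struct demo_client *client,
                                       struct demo_buffer *buffer)
{
    struct demo_handle *h;

    for (h = client->handles; h; h = h->next)
        if (h->buffer == buffer)
            return h;
    return NULL;
}

/* Reuse an existing handle (taking a reference) or create a new one. */
struct demo_handle *demo_import(struct demo_client *client,
                                struct demo_buffer *buffer)
{
    struct demo_handle *h = demo_lookup(client, buffer);

    if (h) {
        h->refs++;
        return h;
    }
    h = calloc(1, sizeof(*h));
    if (!h)
        return NULL;
    h->refs = 1;
    h->buffer = buffer;
    buffer->refs++; /* each handle pins the shared buffer */
    h->next = client->handles;
    client->handles = h;
    return h;
}

/* Drop one reference; the last one unlinks the handle and unpins the buffer. */
void demo_handle_put(struct demo_client *client, struct demo_handle *handle)
{
    struct demo_handle **p;

    assert(handle->refs > 0);
    if (--handle->refs > 0)
        return;
    for (p = &client->handles; *p; p = &(*p)->next) {
        if (*p == handle) {
            *p = handle->next;
            break;
        }
    }
    handle->buffer->refs--;
    free(handle);
}
```

When the buffer's count drops to zero nothing refers to it any more, which is the point at which a real implementation would return the memory to its heap. A handle that is never put keeps the buffer pinned.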

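As you can see, the ION usage flow is fairly clear. Just remember to release ION buffers when you are done with them; otherwise memory will leak.

Using ION from kernel space

Using ION from kernel space is much the same: it follows the same flow of creating a client, a buffer, and a handle. The only difference is that this use is invisible to user space.

For kernel-space callers, Qualcomm wraps ion_client_create: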

struct ion_client *msm_ion_client_create(unsigned int heap_mask,
const char *name)
{
return ion_client_create(idev, heap_mask, name);
}

The call flow is similar too, except that mapping goes through the heap's map_kernel() rather than map_user().

msm_ion_client_create -> ion_alloc ->ion_map_kernel

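Why ION is needed

Back in late 2011 [2], LWN reviewed the Android kernel patches [3] in the hope of getting them merged into the mainline kernel. PMEM (a memory allocator implemented by Android) dashed that hope; the reasons the Linux community would not accept PMEM are covered in [3]. From then on it was clear that PMEM would be dropped entirely and replaced by the ION memory manager. ION is a general-purpose memory manager that Google introduced in Android 4.0 (ICS) to deal with memory fragmentation, and it integrates better with the kernel. QCOM MSM, NVIDIA Tegra, TI OMAP, and MRVL PXA have all replaced PMEM with ION.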

How to get the source code

http://android.googlesource.com/kernel/common.git

ION codes reside in drivers/gpu/ion

Specific usage examples on omap4:

http://android.googlesource.com/kernel/omap.git

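ION framework [1]

ION defines four heap types, each implementing a different allocation strategy.

ION_HEAP_TYPE_SYSTEM: memory allocated via vmalloc
ION_HEAP_TYPE_SYSTEM_CONTIG: memory allocated via kmalloc
ION_HEAP_TYPE_CARVEOUT: memory allocated from a reserved memory region
ION_HEAP_TYPE_CUSTOM: defined by the vendor

The original figure (not reproduced here) shows two clients sharing memory. It contains two heaps (each with its own allocation strategy), and several buffers are allocated from each heap. Each client's handles refer to the corresponding buffers. The two clients share memory through a file descriptor (fd).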

ION APIs

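User-space API

Six ioctl commands are defined for interacting with user applications.

ION_IOC_ALLOC: allocate memory
ION_IOC_FREE: free memory
ION_IOC_MAP: get a file descriptor for mmap (? this definition is not used in the code)
ION_IOC_SHARE: create a file descriptor for sharing the memory
ION_IOC_IMPORT: import a shared file descriptor and obtain a handle for it
ION_IOC_CUSTOM: call a vendor-defined ioctl

ION_IOC_SHARE and ION_IOC_IMPORT are built on DMABUF, so once the sharing process has the file descriptor it can call mmap directly to access the shared memory. The mmap is carried out by the DMABUF subsystem calling the mmap callback in the ION subsystem.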

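Kernel-space API

A kernel driver can also register as an ION client and choose which heap type to allocate from.

ion_client_create: allocate a client.
ion_client_destroy: free a client and all ion handles attached to it.

ion handle: each ion handle maps to one buffer, and each buffer is associated with one heap. A client can therefore operate on several buffers.

Buffer allocation and release functions:

ion_alloc: allocate ion memory and return an ion handle
ion_free: free an ion handle

ION manages buffers through handles, but drivers need access to the buffer's address. ION provides the following functions for that purpose:

ion_phys: return the buffer's physical address and size
ion_map_kernel: create a kernel mapping for the given buffer
ion_unmap_kernel: destroy the kernel mapping of the given buffer
ion_map_dma: create a DMA mapping for the given buffer and return an sglist (scatter/gather list)
ion_unmap_dma: destroy the DMA mapping of the given buffer

ION shares memory between drivers through handles rather than buffer addresses; user-space sharing works on the same principle.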

ion_share: given a handle, obtain a buffer to pass to other clients
ion_import: given a buffer in another client, import it
ion_import_fd: given an fd obtained via ION_IOC_SHARE ioctl, import it

Heap API

Heap interface definition [drivers/gpu/ion/ion_priv.h]

These interfaces are not exposed to drivers or user applications.

/**
 * struct ion_heap_ops - ops to operate on a given heap
 * @allocate:           allocate memory
 * @free:               free memory
 * @phys                get physical address of a buffer (only define on physically contiguous heaps)
 * @map_dma             map the memory for dma to a scatterlist
 * @unmap_dma           unmap the memory for dma
 * @map_kernel          map memory to the kernel
 * @unmap_kernel        unmap memory to the kernel
 * @map_user            map memory to userspace
 */
struct ion_heap_ops {
        int (*allocate) (struct ion_heap *heap, struct ion_buffer *buffer, unsigned long len,unsigned long align, unsigned long flags);
        void (*free) (struct ion_buffer *buffer);
        int (*phys) (struct ion_heap *heap, struct ion_buffer *buffer, ion_phys_addr_t *addr, size_t *len);
        struct scatterlist *(*map_dma) (struct ion_heap *heap, struct ion_buffer *buffer);
        void (*unmap_dma) (struct ion_heap *heap, struct ion_buffer *buffer);
        void * (*map_kernel) (struct ion_heap *heap, struct ion_buffer *buffer);
        void (*unmap_kernel) (struct ion_heap *heap, struct ion_buffer *buffer);
        int (*map_user) (struct ion_heap *mapper, struct ion_buffer *buffer, struct vm_area_struct *vma);
};
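To make the role of the ops table concrete, here is a minimal user-space sketch of a heap that fills in three of the callbacks. Everything in it is an assumption for illustration: the types are cut down, the signatures are simplified, and the "carveout" is a toy bump allocator rather than the real ION carveout heap.

```c
#include <assert.h>
#include <stddef.h>

struct buffer;
struct heap;

/* Reduced ops table: only allocate/free/phys, with simplified signatures. */
struct heap_ops {
    int  (*allocate)(struct heap *heap, struct buffer *buf, unsigned long len);
    void (*free)(struct buffer *buf);
    int  (*phys)(struct heap *heap, struct buffer *buf,
                 unsigned long *addr, size_t *len);
};

struct heap   { const struct heap_ops *ops; unsigned long base, size, next; };
struct buffer { struct heap *heap; unsigned long addr; size_t len; };

/* Toy carveout: hand out consecutive ranges from one fixed region. */
static int carveout_allocate(struct heap *h, struct buffer *b, unsigned long len)
{
    if (len == 0 || len > h->size - (h->next - h->base))
        return -1;
    b->heap = h;
    b->addr = h->next;
    b->len = len;
    h->next += len;
    return 0;
}

static void carveout_free(struct buffer *b)
{
    /* A bump allocator can only reclaim the most recent allocation. */
    if (b->addr + b->len == b->heap->next)
        b->heap->next = b->addr;
    b->len = 0;
}

static int carveout_phys(struct heap *h, struct buffer *b,
                         unsigned long *addr, size_t *len)
{
    (void)h;
    *addr = b->addr;   /* contiguous heap: one address describes the buffer */
    *len = b->len;
    return 0;
}

static const struct heap_ops carveout_ops = {
    .allocate = carveout_allocate,
    .free     = carveout_free,
    .phys     = carveout_phys,
};
```

The core only ever calls through heap->ops, so a heap with a different strategy would plug in by supplying a different table.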

ION debug

ION exposes a debugfs interface under /sys/kernel/debug/ion/.

Each heap has its own debugfs entry; per-client memory usage is shown in /sys/kernel/debug/ion/<<heap name>>

$cat /sys/kernel/debug/ion/ion-heap-1 
          client              pid             size
        test_ion             2890            16384

Each client, identified by its pid, also has a debugfs entry at /sys/kernel/debug/ion/<<pid>>

$cat /sys/kernel/debug/ion/2890 
       heap_name:    size_in_bytes
      ion-heap-1:    40960 11

References

1. https://wiki.linaro.org/BenjaminGaignard/ion

2. http://lwn.net/Articles/480055/

3. http://lwn.net/Articles/472984/

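Analysis of the kernel memory pool implementation

I. Overview of Linux kernel memory management

Linux uses demand paging and supports a three-level paged memory management scheme. On a 32-bit system, each user process's 4 GB virtual address space is divided into fixed-size pages. The range from 0 to 3 GB is user space, private to each process; 3 GB to 4 GB is kernel space, shared by all processes but accessible only in kernel mode.

Linux also divides physical memory into fixed-size pages, each managed by a page structure; there are as many page structures as there are pages, and together they form the array mem_map[].

slab: While an operating system runs, it constantly creates, uses, and releases large numbers of objects of the same kinds, so improving how objects are created can improve overall system performance considerably. Typical examples in Linux are inode and task_struct. In general, the set of object types is fairly stable, but the number of objects of each type is huge, and initialization and destruction involve a lot of work, taking far longer than the memory allocation itself. These objects share one property, however: when created, their members are usually set to fixed values, and by the time an object is released those members have been restored to their unused state. So if an object can be reused from the same memory (or the same class of memory) with its basic data structure preserved, efficiency improves greatly. The slab algorithm is designed around these characteristics.

The most basic idea in the slab algorithm is object caching: preserve the invariant part of an object's initialized state so that the object does not have to be re-initialized (constructed) and torn down (destructed) on every use.

Object-oriented slab allocation uses the following terms:

- Cache: all instances of one object type live in the same cache. Different object types go into different caches even if they have the same size. Each cache holds a number of slabs, ordered full, partial, empty. In the slab model, all kernel memory can be viewed as organized into such caches, one per object type; the cache's manager records the object size and properties, the number of objects per slab, and the size of each slab.

- Slab: the slab is the interface between kernel memory allocation and page-level allocation. Each slab's size is an integer multiple of the page size, and it consists of a number of objects. Slabs fall into three classes:

Full: no free objects.

Partial: only some object slots are allocated; free objects remain.

Empty: no objects allocated; every slot in the slab is available.

When a new object is requested, the allocator first looks in the cache's partial slabs for a free slot; if that fails, it looks in the empty slabs; if that also fails, it allocates a new slab for the object.

- Object: the requested space is treated as an object; it is initialized by a constructor and then handed to the user.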
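The search order described above (partial slabs first, then empty slabs, then a new slab) can be sketched as follows. This is an illustrative model only; struct slab and pick_slab are hypothetical names, and a real cache keeps the three classes on separate lists instead of scanning an array.

```c
#include <assert.h>
#include <stddef.h>

struct slab { int inuse; int capacity; };

/* Pick the slab a new object should come from: partial first, then empty.
 * Returns NULL when every slab is full and the cache would have to grow. */
static struct slab *pick_slab(struct slab *slabs, size_t n)
{
    struct slab *empty = NULL;

    for (size_t i = 0; i < n; i++) {
        if (slabs[i].inuse > 0 && slabs[i].inuse < slabs[i].capacity)
            return &slabs[i];           /* a partial slab always wins */
        if (slabs[i].inuse == 0 && !empty)
            empty = &slabs[i];          /* remember the first empty slab */
    }
    return empty;
}
```

Preferring partial slabs keeps empty slabs empty for as long as possible, so they remain candidates for being returned to the page allocator.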

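II. Memory pool data structures

Linux memory pools were added in the 2.6 kernel. The main data structure is declared in include/linux/mempool.h, and the implementation is in mm/mempool.c.

typedef struct mempool_s {

spinlock_t lock;

int min_nr; /* number of slots in the elements array, i.e. the guaranteed minimum */

int curr_nr; /* number of free elements currently held in the elements array */

void **elements; /* array of min_nr pointers to preallocated objects; the size of each object depends on the type the pool was created for, since every object type gets its own pool */

void *pool_data; /* the pool works together with the slab caches (as the overview said, slab preallocates a cache per object type and objects are taken from it); this pointer usually points to the cache of this pool's object type */

mempool_alloc_t *alloc; /* allocation function supplied when the pool is created; it can be written by the user (who has full control over how the object obtains memory) or be one provided by the mempool code */

mempool_free_t *free; /* release function; otherwise as above */

wait_queue_head_t wait; /* wait queue of tasks waiting for an element */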

} mempool_t;

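III. Initializing the kernel cache and the memory pool

As noted above, a memory pool is used together with the cache of a particular object type. For example, when the system RPC service initializes, it creates a cache for rpc_buffers in advance with the following call: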

rpc_buffer_slabp = kmem_cache_create("rpc_buffers",

RPC_BUFFER_MAXSIZE,

0, SLAB_HWCACHE_ALIGN,

NULL, NULL);

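kmem_cache_create takes a cache descriptor from the system cache cache_cache and sets up a cache of objects of size RPC_BUFFER_MAXSIZE for rpc_buffer use. From then on, memory for these RPC buffers is requested from this cache; that is the essence of Linux's slab technique, and the memory pool operates on top of this cache.

Once the RPC service has its cache rpc_buffer_slabp, it can create a memory pool to manage that cache: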

rpc_buffer_mempool = mempool_create(RPC_BUFFER_POOLSIZE,

mempool_alloc_slab,

mempool_free_slab,

rpc_buffer_slabp);

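mempool_create is the pool constructor: it builds a memory pool for one class of objects. Its arguments are the pool size, an allocation function, a release function, and the cache pointer for the object type. Its implementation is shown below:

/**

* mempool_create - create a memory pool

* @min_nr: minimum number of elements allocated for the pool

* @alloc_fn: user-defined allocation function

* @free_fn: user-defined release function

* @pool_data: optional private data for the user-defined functions, usually the cache pointer

*/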

mempool_t * mempool_create(int min_nr, mempool_alloc_t *alloc_fn,

mempool_free_t *free_fn, void *pool_data)

{

mempool_t *pool;

/* allocate memory for the pool object itself */

pool = kmalloc(sizeof(*pool), GFP_KERNEL);

if (!pool)

return NULL;

memset(pool, 0, sizeof(*pool));

/* allocate the elements array according to the pool's minimum size */

pool->elements = kmalloc(min_nr * sizeof(void *), GFP_KERNEL);

if (!pool->elements) {

kfree(pool);

return NULL;

}

spin_lock_init(&pool->lock);

/* initialize the pool's parameters */

pool->min_nr = min_nr;

pool->pool_data = pool_data;

init_waitqueue_head(&pool->wait);

pool->alloc = alloc_fn;

pool->free = free_fn;

/* First preallocate min_nr elements for the pool; they hold objects of the pool's type. */

while (pool->curr_nr < pool->min_nr) {

void *element;

element = pool->alloc(GFP_KERNEL, pool->pool_data);

if (unlikely(!element)) {

free_pool(pool);

return NULL;

}

/* hang the newly allocated element on the elements array and update curr_nr */

add_element(pool, element);

}

/* on success, return the pool pointer; the pool can then be used through mempool_alloc and mempool_free */

return pool;

}

IV. Using the memory pool

To use a pool that has been created, call mempool_alloc to obtain memory from it and mempool_free to return the memory to it when done.

void * mempool_alloc(mempool_t *pool, int gfp_mask)

{

void *element;

unsigned long flags;

DEFINE_WAIT(wait);

int gfp_nowait = gfp_mask & ~(__GFP_WAIT | __GFP_IO);

repeat_alloc:

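/* Note the order (the original author found it puzzling): __GFP_WAIT and __GFP_IO are stripped from the caller's gfp mask and the user-supplied allocation function is tried against the cache first, instead of taking from the pool first. Only if that fails does the code fall back to the pool, which keeps the preallocated elements in reserve for when normal allocation fails. */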

element = pool->alloc(gfp_nowait|__GFP_NOWARN, pool->pool_data);

if (likely(element != NULL))

return element;

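/* If the number of free elements in the pool is at or below half of min_nr, make one more attempt to get memory from the system with the caller's full gfp mask rather than taking from the pool. When that memory is later released, mempool_free puts it into the pool, so the next request does not need a fresh allocation. */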

mb();

if ((gfp_mask & __GFP_FS) && (gfp_mask != gfp_nowait) &&

(pool->curr_nr <= pool->min_nr/2)) {

element = pool->alloc(gfp_mask, pool->pool_data);

if (likely(element != NULL))

return element;

}

spin_lock_irqsave(&pool->lock, flags);

/* if the pool is not empty, take an element from it and return it to the caller */

if (likely(pool->curr_nr)) {

element = remove_element(pool);

spin_unlock_irqrestore(&pool->lock, flags);

return element;

}

spin_unlock_irqrestore(&pool->lock, flags);

/* We must not sleep in the GFP_ATOMIC case */

if (!(gfp_mask & __GFP_WAIT))

return NULL;

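/* The rest involves the scheduler and was not analyzed by the original author: sleep on the pool's wait queue while the pool is empty, then retry from the top. */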

prepare_to_wait(&pool->wait, &wait, TASK_UNINTERRUPTIBLE);

mb();

if (!pool->curr_nr)

io_schedule();

finish_wait(&pool->wait, &wait);

goto repeat_alloc;

}

When the caller releases memory with mempool_free, the object is in fact put back into the pool. The source is as follows:

void mempool_free(void *element, mempool_t *pool)

{

unsigned long flags;

mb();

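/* if the pool is below min_nr, try to put the element back; if the pool is already full, fall through and return the element to the system with the user-supplied free function */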

if (pool->curr_nr < pool->min_nr) {

spin_lock_irqsave(&pool->lock, flags);

if (pool->curr_nr < pool->min_nr) {

/* the pool still has room: put the element into the pool and wake up any waiters */

add_element(pool, element);

spin_unlock_irqrestore(&pool->lock, flags);

wake_up(&pool->wait);

return;

}

spin_unlock_irqrestore(&pool->lock, flags);

}

pool->free(element, pool->pool_data);

}

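This function is simple and needs no further analysis.

V. Summary of the memory pool implementation

The analysis above shows that the Linux kernel's memory pool implementation is quite simple. The C++ STL, by comparison, implements a two-level allocation scheme. At initialization the pool is divided into size classes (each a multiple of 8 bytes, typically 8, 16, 24, ..., 128 bytes), and each class is preallocated 20 blocks. The basic idea is this: a request larger than the largest predefined class goes straight to malloc and is served from the heap, while a request of 128 bytes or less is served from the nearest size class. A 10-byte request, for example, takes a block from the 16-byte class. When a class's stock (initially 20) falls below a threshold, an algorithm called refill requests more memory from the heap and adds it to the pool, so that the pool always has some memory available.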
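The size-class arithmetic in that description can be written out directly. The helper names below are made up for this sketch; only the numbers (8-byte steps, 128-byte limit) come from the text above.

```c
#include <assert.h>
#include <stddef.h>

enum { ALIGN = 8, MAX_BYTES = 128 };

/* Round a request up to the next multiple of 8. */
static size_t round_up8(size_t n)
{
    return (n + ALIGN - 1) & ~(size_t)(ALIGN - 1);
}

/* Index of the free list serving n bytes (1..128): 0 for 8, 1 for 16, ..., 15 for 128. */
static size_t freelist_index(size_t n)
{
    return (n + ALIGN - 1) / ALIGN - 1;
}

/* Requests above 128 bytes bypass the pool and go to malloc. */
static int uses_pool(size_t n)
{
    return n > 0 && n <= MAX_BYTES;
}
```

For the 10-byte request mentioned above, round_up8 gives 16 and freelist_index gives 1, the second free list.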

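A Linux memory pool, by contrast, is tied to one particular object type. Each object type (task_struct, for example) has its own size and initialization method. This resembles the STL size classes somewhat, but the kernel sizes the objects in a pool according to the actual object.

A kernel memory pool initially requests a fixed number of blocks from the cache; when memory is needed, it looks for a free block in the pool and returns it to the caller. On release, the block is put straight back into the pool, or freed directly if the pool is already full. The pool cannot grow dynamically: once its memory is exhausted, requests can only be served directly from the cache, and the pool's capacity does not increase with usage.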
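The behaviour summarized here can be modelled in user space. The sketch below is not the kernel code: it uses malloc in place of the slab cache, has no locking or sleeping, and adds a hypothetical fail_alloc flag so that allocator failure can be simulated. It follows the order seen in mempool_alloc above, where the backing allocator is tried first and the reserve is used only when that fails.

```c
#include <assert.h>
#include <stdlib.h>

struct pool {
    int min_nr, curr_nr;
    void **elements;
    size_t elem_size;
    int fail_alloc;     /* test hook: pretend the backing allocator is out of memory */
};

static void *backend_alloc(struct pool *p)
{
    return p->fail_alloc ? NULL : malloc(p->elem_size);
}

/* Create a pool and fill its reserve with min_nr elements. */
static struct pool *pool_create(int min_nr, size_t elem_size)
{
    struct pool *p = calloc(1, sizeof(*p));
    if (!p)
        return NULL;
    p->elements = calloc((size_t)min_nr, sizeof(void *));
    if (!p->elements) {
        free(p);
        return NULL;
    }
    p->min_nr = min_nr;
    p->elem_size = elem_size;
    while (p->curr_nr < min_nr) {
        void *e = malloc(elem_size);
        if (!e) {
            while (p->curr_nr)
                free(p->elements[--p->curr_nr]);
            free(p->elements);
            free(p);
            return NULL;
        }
        p->elements[p->curr_nr++] = e;
    }
    return p;
}

/* Try the normal allocator first; dip into the reserve only if it fails. */
static void *pool_alloc(struct pool *p)
{
    void *e = backend_alloc(p);
    if (e)
        return e;
    if (p->curr_nr)
        return p->elements[--p->curr_nr];
    return NULL;        /* the kernel version may sleep and retry here */
}

/* Refill the reserve first; release to the system only when it is full. */
static void pool_free(struct pool *p, void *e)
{
    if (p->curr_nr < p->min_nr)
        p->elements[p->curr_nr++] = e;
    else
        free(e);
}
```

The reserve never exceeds min_nr elements, which matches the statement that the pool's capacity does not grow with usage.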

memory pool: principles and usage

Category: memory management. Posted 2013-01-23 16:23.

chipset: msm8x25

codebase: android4.1


I. Initialization:
int __init memory_pool_init(void)
{
int i;
alloc_root = RB_ROOT;
mutex_init(&alloc_mutex);
for (i = 0; i < ARRAY_SIZE(mpools); i++) {
mutex_init(&mpools[i].pool_mutex);
mpools[i].gpool = NULL;
}
return 0;
}
The mpools array is shown below. It holds at most 8 pools, and the platform decides which memory types they represent:
#define MAX_MEMPOOLS 8
struct mem_pool mpools[MAX_MEMPOOLS];
struct mem_pool {
struct mutex pool_mutex;
struct gen_pool *gpool;
unsigned long paddr; // may hold either a physical or a virtual address
unsigned long size; // total size of the pool
unsigned long free; // how much of the pool is still free
unsigned int id;
};
The types defined on this platform are:
enum {
MEMTYPE_NONE = -1,
MEMTYPE_SMI_KERNEL = 0,
MEMTYPE_SMI,
MEMTYPE_EBI0,
MEMTYPE_EBI1,
MEMTYPE_MAX,
};
The following function is platform-specific; it calls the kernel's initialize_memory_pool. You can follow the same pattern in your own code:
static void __init initialize_mempools(void)
{
struct mem_pool *mpool;
int memtype;
struct memtype_reserve *mt;
// reserved-memory information; in practice only the MEMTYPE_EBI1 entry has a size,
// because this platform's DDR sits on the EBI1 interface
mt = &reserve_info->memtype_reserve_table[0];
for (memtype = 0; memtype < MEMTYPE_MAX; memtype++, mt++) {
if (!mt->size)
continue;
// store each reserved-memory region the platform uses into an mpool, one by one
mpool = initialize_memory_pool(mt->start, mt->size, memtype);
if (!mpool)
pr_warning("failed to create %s mempool\n",
memtype_name[memtype]);
}
}
Now for the common function initialize_memory_pool:
struct mem_pool *initialize_memory_pool(unsigned long start,
unsigned long size, int mem_type)
{
int id = mem_type;
// return if the type is out of range, or the size is not larger than one page (4 KB) or not page-aligned
if (id >= MAX_MEMPOOLS || size <= PAGE_SIZE || size % PAGE_SIZE)
return NULL;
mutex_lock(&mpools[id].pool_mutex);
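mpools[id].paddr = start; // start address of the reserved memory; the original note calls it a virtual address, but the field is named paddr and __alloc passes it to ioremap as a physical address
mpools[id].size = size; // total usable size
mpools[id].free = size; // free size; initially equal to the total size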
mpools[id].id = id;
mutex_unlock(&mpools[id].pool_mutex);
pr_info("memory pool %d (start %lx size %lx) initialized\n",
id, start, size);
return &mpools[id];
}
II. Usage:
The platform provides two interfaces for allocating from the mempool, allocate_contiguous_ebi and allocate_contiguous_ebi_nomap; they differ only in whether the memory is mapped.
void *allocate_contiguous_ebi(unsigned long size,
unsigned long align, int cached)
{
return allocate_contiguous_memory(size, get_ebi_memtype(),
align, cached);
}
EXPORT_SYMBOL(allocate_contiguous_ebi);
unsigned long allocate_contiguous_ebi_nomap(unsigned long size,
unsigned long align)
{
return _allocate_contiguous_memory_nomap(size, get_ebi_memtype(),
align, __builtin_return_address(0));
}
EXPORT_SYMBOL(allocate_contiguous_ebi_nomap);
static int get_ebi_memtype(void)
{
/* on 7x30 and 8x55 "EBI1 kernel PMEM" is really on EBI0 */
if (cpu_is_msm7x30() || cpu_is_msm8x55())
return MEMTYPE_EBI0;
// this is what this platform returns
return MEMTYPE_EBI1;
}
These are thin wrappers around the kernel's contiguous-memory allocation interface, so the thing to look at is how allocate_contiguous_memory is implemented.
void *allocate_contiguous_memory(unsigned long size,
int mem_type, unsigned long align, int cached)
{
// align the size to the page frame size
unsigned long aligned_size = PFN_ALIGN(size);
struct mem_pool *mpool;
mpool = mem_type_to_memory_pool(mem_type);
if (!mpool)
return NULL;
return __alloc(mpool, aligned_size, align, cached,
__builtin_return_address(0));
}
First, mem_type_to_memory_pool:
static struct mem_pool *mem_type_to_memory_pool(int mem_type)
{
// get the mpool that corresponds to mem_type
struct mem_pool *mpool = &mpools[mem_type];
// only MEMTYPE_EBI1 has a size assigned here,
// so every other mpool returns NULL right away
if (!mpool->size)
return NULL;
mutex_lock(&mpool->pool_mutex);
// initialize the gpool on first use
if (!mpool->gpool)
mpool->gpool = initialize_gpool(mpool->paddr, mpool->size);
mutex_unlock(&mpool->pool_mutex);
if (!mpool->gpool)
return NULL;
return mpool;
}
static struct gen_pool *initialize_gpool(unsigned long start,
unsigned long size)
{
struct gen_pool *gpool;
// create the gpool first
gpool = gen_pool_create(PAGE_SHIFT, -1);
if (!gpool)
return NULL;
// add the memory region to the gen pool
if (gen_pool_add(gpool, start, size, -1)) {
gen_pool_destroy(gpool);
return NULL;
}
return gpool;
}
struct gen_pool *gen_pool_create(int min_alloc_order, int nid)
{
struct gen_pool *pool;
// straightforward: allocate space for the gen_pool
pool = kmalloc_node(sizeof(struct gen_pool), GFP_KERNEL, nid);
if (pool != NULL) {
spin_lock_init(&pool->lock);
INIT_LIST_HEAD(&pool->chunks);
// min_alloc_order is PAGE_SHIFT = 12
pool->min_alloc_order = min_alloc_order;
}
return pool;
}
static inline int gen_pool_add(struct gen_pool *pool, unsigned long addr,
size_t size, int nid)
{
return gen_pool_add_virt(pool, addr, -1, size, nid);
}
int gen_pool_add_virt(struct gen_pool *pool, unsigned long virt, phys_addr_t phys,
size_t size, int nid)
{
struct gen_pool_chunk *chunk;
// each PAGE_SIZE unit of the region is represented by one bit
int nbits = size >> pool->min_alloc_order;
// the nbits bits live in the bits[0] array of gen_pool_chunk and are managed as a bitmap
int nbytes = sizeof(struct gen_pool_chunk) +
(nbits + BITS_PER_BYTE - 1) / BITS_PER_BYTE;
// allocate space for struct gen_pool_chunk
if (nbytes <= PAGE_SIZE)
chunk = kmalloc_node(nbytes, __GFP_ZERO, nid);
else
chunk = vmalloc(nbytes);
if (unlikely(chunk == NULL))
return -ENOMEM;
if (nbytes > PAGE_SIZE)
memset(chunk, 0, nbytes);
chunk->phys_addr = phys; // store the physical address; -1 was passed in, meaning it is not known yet
chunk->start_addr = virt; // may be either a virtual or a physical address: with a physical address, use allocate_contiguous_memory, which ioremaps it; otherwise _allocate_contiguous_memory_nomap is enough
chunk->end_addr = virt + size; // end address of the chunk
atomic_set(&chunk->avail, size); // record the chunk's available size in avail
spin_lock(&pool->lock);
// add the chunk to the pool's chunks list, RCU style
list_add_rcu(&chunk->next_chunk, &pool->chunks);
spin_unlock(&pool->lock);
return 0;
}
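The bitmap sizing done in gen_pool_add_virt is plain arithmetic and can be checked in isolation. The two helpers below are illustrative names, not kernel functions; CHAR_BIT plays the role of BITS_PER_BYTE.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bitmap bits needed for a region of `size` bytes when each bit
 * covers (1 << order) bytes. */
static size_t chunk_bits(size_t size, unsigned order)
{
    return size >> order;
}

/* Bytes needed to store that many bits, rounded up to a whole byte. */
static size_t bitmap_bytes(size_t nbits)
{
    return (nbits + CHAR_BIT - 1) / CHAR_BIT;
}
```

With order 12 (4 KB units), an 8 MB region needs 2048 bits, that is, 256 bytes of bitmap on top of the chunk header.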
Now __alloc, where the real work happens:
static void *__alloc(struct mem_pool *mpool, unsigned long size,
unsigned long align, int cached, void *caller)
{
unsigned long paddr;
void __iomem *vaddr;
unsigned long aligned_size;
int log_align = ilog2(align);
struct alloc *node;
aligned_size = PFN_ALIGN(size);
// allocate the memory from the gen pool
paddr = gen_pool_alloc_aligned(mpool->gpool, aligned_size, log_align);
if (!paddr)
return NULL;
node = kmalloc(sizeof(struct alloc), GFP_KERNEL);
if (!node)
goto out;
// what comes back here is physical memory, so it must be ioremapped; _allocate_contiguous_memory_nomap skips this step
if (cached)
vaddr = ioremap_cached(paddr, aligned_size);
else
vaddr = ioremap(paddr, aligned_size);
if (!vaddr)
goto out_kfree;
node->vaddr = vaddr;
// save the corresponding parameters in the node
node->paddr = paddr;
node->len = aligned_size;
node->mpool = mpool;
node->caller = caller;
// insert the node into the red-black tree for bookkeeping
if (add_alloc(node))
goto out_kfree;
mpool->free -= aligned_size;
return vaddr;
out_kfree:
if (vaddr)
iounmap(vaddr);
kfree(node);
out:
gen_pool_free(mpool->gpool, paddr, aligned_size);
return NULL;
}
The key function is gen_pool_alloc_aligned:
unsigned long gen_pool_alloc_aligned(struct gen_pool *pool, size_t size,
unsigned alignment_order)
{
struct gen_pool_chunk *chunk;
unsigned long addr = 0, align_mask = 0;
int order = pool->min_alloc_order;
int nbits, start_bit = 0, remain;
#ifndef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
BUG_ON(in_nmi());
#endif
if (size == 0)
return 0;
if (alignment_order > order)
align_mask = (1 << (alignment_order - order)) - 1;
// number of bits that the requested size corresponds to
nbits = (size + (1UL << order) - 1) >> order;
rcu_read_lock();
// walk the pool's chunks list in order
list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
unsigned long chunk_size;
if (size > atomic_read(&chunk->avail))
continue;
// total size of this chunk, in units of (1 << order) bytes
chunk_size = (chunk->end_addr - chunk->start_addr) >> order;
retry:
// find the start bit of an unused area
start_bit = bitmap_find_next_zero_area_off(chunk->bits, chunk_size,
0, nbits, align_mask,
chunk->start_addr);
// if it lies beyond the chunk size, try the next chunk
if (start_bit >= chunk_size)
continue;
// otherwise set nbits bits to mark this part of the memory as in use
remain = bitmap_set_ll(chunk->bits, start_bit, nbits);
if (remain) {
remain = bitmap_clear_ll(chunk->bits, start_bit,
nbits - remain);
BUG_ON(remain);
goto retry;
}
// compute the address for this request; here it is a physical address
addr = chunk->start_addr + ((unsigned long)start_bit << order);
size = nbits << pool->min_alloc_order;
// subtract it from the size still available to other requests
atomic_sub(size, &chunk->avail);
break;
}
rcu_read_unlock();
return addr;
}
How the bitmap helpers work is not traced in detail here; the function names give a good enough idea.
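For readers who want a little more than the names, the following user-space sketch shows the general first-fit idea behind the allocation: find a run of clear bits at an aligned index, set them, and convert the bit index to an address. It is an assumption-laden simplification, not the kernel helpers: the function names are invented, there is no locking or atomic update, and the alignment offset that the kernel derives from chunk->start_addr is ignored.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

static int test_bit(const unsigned char *map, size_t i)
{
    return (map[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}

static void set_bits(unsigned char *map, size_t start, size_t nr)
{
    for (size_t i = start; i < start + nr; i++)
        map[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

/* Find nr consecutive zero bits in map[0..size), starting at an index that is
 * a multiple of (align_mask + 1). Returns size when no such run exists. */
static size_t find_zero_area(const unsigned char *map, size_t size,
                             size_t nr, size_t align_mask)
{
    size_t start = 0;

    while (start + nr <= size) {
        size_t i;
        for (i = 0; i < nr && !test_bit(map, start + i); i++)
            ;
        if (i == nr)
            return start;
        /* skip past the set bit, then round up to the alignment */
        start = (start + i + 1 + align_mask) & ~align_mask;
    }
    return size;
}

/* Allocate len bytes from a region whose bitmap tracks (1 << order)-byte units.
 * Returns 0 on failure, like gen_pool_alloc_aligned. */
static unsigned long region_alloc(unsigned char *map, size_t size_bits,
                                  unsigned long base, unsigned order,
                                  unsigned long len, size_t align_mask)
{
    if (len == 0)
        return 0;

    size_t nbits = (len + (1UL << order) - 1) >> order;
    size_t bit = find_zero_area(map, size_bits, nbits, align_mask);
    if (bit >= size_bits)
        return 0;

    set_bits(map, bit, nbits);
    return base + ((unsigned long)bit << order);
}
```

An align_mask of 1 forces the run to start at an even unit, which corresponds to requesting an alignment one order above the pool's minimum.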
Finally, look at _allocate_contiguous_memory_nomap; the only difference from the function above is whether the memory is remapped.
unsigned long _allocate_contiguous_memory_nomap(unsigned long size,
int mem_type, unsigned long align, void *caller)
{
unsigned long paddr;
unsigned long aligned_size;
struct alloc *node;
struct mem_pool *mpool;
int log_align = ilog2(align);
mpool = mem_type_to_memory_pool(mem_type);
if (!mpool || !mpool->gpool)
return 0;
aligned_size = PFN_ALIGN(size);
paddr = gen_pool_alloc_aligned(mpool->gpool, aligned_size, log_align);
if (!paddr)
return 0;
node = kmalloc(sizeof(struct alloc), GFP_KERNEL);
if (!node)
goto out;
node->paddr = paddr;
/* We search the tree using node->vaddr, so set
* it to something unique even though we don't
* use it for physical allocation nodes.
* The virtual and physical address ranges
* are disjoint, so there won't be any chance of
* a duplicate node->vaddr value.
*/
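// The difference is this step: no ioremap is done. The address from the gen pool is stored in vaddr as-is (only as a unique tree key, as the comment above explains) and returned to the caller unmapped.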
node->vaddr = (void *)paddr;
node->len = aligned_size;
node->mpool = mpool;
node->caller = caller;
if (add_alloc(node))
goto out_kfree;
mpool->free -= aligned_size;
return paddr;
out_kfree:
kfree(node);
out:
gen_pool_free(mpool->gpool, paddr, aligned_size);
return 0;
}
Mempool is fairly simple. We will see it in use later when looking at how ION is used.
2013/01/23

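3. Kernel buffers
(1) Terminal buffering

Terminal devices have input and output queue buffers. (The original figure is not reproduced here.)

Take the input queue as an example. Characters typed on the keyboard pass through the line discipline and enter the input queue, and the user program reads them from the queue in first-in, first-out order. Normally, characters typed while the input queue is full are lost, and the system rings the bell as a warning. A terminal can be configured in echo mode, in which every character in the input queue is delivered both to the user program and to the output queue; this is why a character typed at the command line can be read by the program and also appears on the screen.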
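Note that the above describes a user process (the shell included) calling unbuffered I/O functions such as read/write. C standard I/O functions such as printf/scanf are implemented on top of read/write and add a user-space buffer. When a program calls scanf to read keyboard input, the typed characters accumulate in the kernel's terminal input queue until a newline is entered; the read system call then copies the line into the C library's I/O buffer, from which scanf takes its input (standard input and standard output are line-buffered when attached to a terminal). When printf prints a string that contains a newline, the string held in the I/O buffer is immediately passed to write, placed in the kernel's output queue, and shown on the screen. If the string has no newline, it stays in the buffer until it is flushed, at the latest by the fflush performed when the program exits.

(2) Although the write system call sits beneath the C library's I/O buffer and is called an unbuffered I/O function, the kernel may keep an I/O buffer below write as well. A write therefore does not necessarily go straight to the file on disk; the data may sit in a kernel I/O buffer, and fsync can be used to force it out to the disk file. To processes it makes no difference whether the data has reached the disk or is still in the kernel buffer: if processes A and B open the same file, data that A has written into the kernel I/O buffer can be read by B, because kernel space is shared among processes. The C library's I/O buffer has no such property, because each process's user space is completely separate.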
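The difference between the two buffering layers can be observed with a small POSIX experiment. This is a sketch under stated assumptions: the function name and the file path passed to it are made up, and a second descriptor opened in the same process stands in for "process B".

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Returns how many bytes a second descriptor can read right after the write.
 * use_stdio != 0: write through a fully buffered FILE* without flushing.
 * use_stdio == 0: write with the write() system call. */
static long visible_after_write(const char *path, int use_stdio)
{
    const char msg[] = "hello";
    size_t len = strlen(msg);
    char buf[16];

    FILE *fp = fopen(path, "w");            /* creates or truncates the file */
    if (!fp)
        return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);      /* keep stdio data in user space */

    int ok = use_stdio
        ? fwrite(msg, 1, len, fp) == len              /* stays in the C library buffer */
        : write(fileno(fp), msg, len) == (ssize_t)len; /* goes into the kernel's buffer */
    if (!ok) {
        fclose(fp);
        unlink(path);
        return -1;
    }

    int rd = open(path, O_RDONLY);          /* the "other process" */
    if (rd < 0) {
        fclose(fp);
        unlink(path);
        return -1;
    }
    long n = (long)read(rd, buf, sizeof(buf));
    close(rd);

    fclose(fp);                             /* flushes any pending stdio data */
    unlink(path);
    return n;
}
```

Data written with write() is visible to the second descriptor immediately, without any fsync; fsync matters for getting the data onto the disk, not for making it visible to other readers. Data written with fwrite() is not visible until the stream is flushed.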

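(3) To reduce the number of disk reads, the kernel caches the directory tree structure; this is called the dentry (directory entry) cache.

(4) FIFOs and UNIX domain sockets are both IPC mechanisms that are identified by special files in the file system. A FIFO file has no data blocks on disk; it only identifies a channel inside the kernel. Processes can open the file and call read / write on it, and in doing so they are really reading and writing the kernel channel (the underlying reason is that the read and write functions that this file structure points to differ from those of a regular file). This is how inter-process communication is achieved. A UNIX domain socket works on a similar principle: it also needs a special socket file to identify the channel in the kernel, the file type s denotes a socket, and these files have no data blocks on disk either. UNIX domain sockets are among the most widely used IPC mechanisms today. (The original figure is not reproduced here.)

4. Stack overflow: infinite recursion, or defining an extremely large array on the stack, can exhaust the stack space that the operating system reserves for the program and crash it (segmentation fault).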