Linux Memory Management in Detail

MMU

MMU=Segmentation Unit+Paging Unit //MMU: Memory Management Unit

logical address=>Segmentation Unit=>linear address=>Paging Unit=>physical address

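Linux manages memory with virtual memory: each process gets its own virtual address space, 0 to 4G-1 on a 32-bit system (these addresses are virtual, not real). The range 0 to 3G-1 is called user space and the range 3G to 4G-1 is called kernel space. The operating system, through the Paging Unit, maps virtual addresses to real physical memory or to files. The same virtual address can therefore appear in different processes while referring to different physical memory or files. The address printed by %p is a virtual (linear) address, not a physical one.
Most programs run in user space and cannot access kernel space directly; the kernel provides functions (system calls) for that purpose.
Virtual memory also helps when physical memory runs short, e.g. a need for 4G of memory can be met with 1G of real physical memory plus 3G of disk.
Addresses are counted in bytes, while mappings are made in units of memory pages. On most mainstream operating systems a memory page is 4 KB (4096 bytes).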
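Segmentation Fault:
Operating on a memory region without the required permission may cause a segmentation fault, e.g. trying to modify data in the read-only constant area.
Accessing a virtual address that has not been mapped may cause a segmentation fault, e.g. reading data from an undefined (invalid) address.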

malloc()

#include <stdlib.h>
void *malloc(size_t size);

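When malloc() allocates dynamic memory, it reserves more than the requested size: roughly 12 extra bytes (the exact amount depends on the implementation) hold management information for the block, e.g. its size and whether it is free.
Take care not to access memory outside the bounds of a malloc() block. Doing so can corrupt the management information and lead to a segmentation fault.
In general, when a small block is requested with malloc(), the operating system maps 33 memory pages in one go (the figure seen with glibc's default settings), so that later malloc() calls need no new mapping, which improves efficiency.
On Linux, malloc() is implemented on top of sbrk() (glibc also uses mmap() for large requests).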

eg

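#include <stdlib.h>
int main(void){
    int *p1=(int*)malloc(sizeof(int));  //33 pages are mapped in one go; p1's management information goes in the 12 bytes after p1
    int *p2=(int*)malloc(sizeof(int));  //the 33 pages mapped for p1 are not used up, so p2 is placed right after p1's management information
    //The management area is writable, but writing to it makes free(p1) segfault, so do not write there
    //Memory beyond the 33 pages cannot be accessed at all: any access segfaults immediately
    free(p2);
    free(p1);
    return 0;
}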

#include<stdio.h>
#include<unistd.h>
#include<sys/types.h>
#include<stdlib.h>
int main(void){
    printf("Current process ID: %d\n",(int)getpid());
    int *pi=(int*)malloc(sizeof(int));
    printf("pi=%p\n",(void*)pi);   //0x21000 bytes is 33 memory pages; 0x1000 bytes is 1 memory page
    
    //Deliberately go out of bounds, staying within the 33 pages
    *(pi+1024*30)=250;
    printf("*(pi+1024*30)=%d\n",*(pi+1024*30));
    //No segmentation fault occurs
    
    //Deliberately go out of bounds, beyond the 33 pages
    *(pi+1024*33)=250;  //ATTENTION: pi is an int*, so pi+1024*33 is pi plus 4*1024*33 bytes
    printf("*(pi+1024*33)=%d\n",*(pi+1024*33));
    //A segmentation fault occurs
    while(1);
    return 0;
}
/*
$ ./a.out 
Current process ID: 2787
pi=0x9c40008
*(pi+1024*30)=250
Segmentation fault (core dumped)
*/

free()

#include <stdlib.h>
void free(void *ptr);

Frees the memory space pointed to by ptr, which must have been returned by a previous call to malloc(), calloc() or realloc().

Note:

Whatever amount free() releases is subtracted from the total mapped. When all dynamic memory has been released, the system may still keep 33 memory pages mapped to improve efficiency.
free() does not set the pointer to NULL before it returns, so assigning NULL to a pointer after freeing it is good practice. If the pointer passed to free() is NULL, no action is performed and the program continues without terminating abnormally.

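#include <stdio.h>
#include <stdlib.h>
//Frees *pp and resets it to NULL; does nothing if pp or *pp is NULL
void safefree(void **pp){
    if(pp!=NULL&&*pp!=NULL){
        free(*pp);
        *pp=NULL;
    }
}
//Wrapper macro: passes the address of p so that safefree() can reset p itself
#define safeFree(p) safefree((void**)&(p))
int main(void){
    int *pi;
    pi=(int*)malloc(sizeof(int));
    *pi=5;
    printf("Before:%p\n",(void*)pi);
    safeFree(pi);
    printf("After:%p\n",(void*)pi);
    safeFree(pi);   //safe: pi is already NULL
    return 0;
}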

getpagesize()

#include <unistd.h>
int getpagesize(void);

Returns the number of bytes in a memory page, where a "page" is a fixed-length block, the unit for memory allocation and file mapping performed by mmap(2).

sbrk()

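#include <unistd.h>
void *sbrk(intptr_t increment); //intptr_t is a signed integer type wide enough to hold a pointer; the _t suffix marks a typedef name, often (not always) an alias of an integer type such as int, short or long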

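Adjusts the size of the dynamic (virtual) memory: increments the program's data space by increment bytes. Calling sbrk() with an increment of 0 can be used to find the current location of the program break, i.e. the end of the currently allocated dynamic memory (the program break can be thought of as an offset marking where that memory ends). On success sbrk() returns the previous program break; on failure it returns (void*)-1.
increment>0 requests dynamic memory: the memory space grows

increment=0 returns the current end address of the dynamic memory: the memory space is unchanged

increment<0 releases dynamic memory: the memory space shrinks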

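#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
int main(void){
    //Use sbrk() to obtain a valid address (the current program break)
    void* pv=sbrk(0);
    if((void*)-1==pv)
        perror("sbrk"),exit(-1);
    //Use sbrk() to request memory for one int, starting at the current position
    void* p2=sbrk(sizeof(int));
    if((void*)-1==p2)
        perror("sbrk"),exit(-1);
    return 0;
}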

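Note: sbrk() can both request and release dynamic memory, but it is more convenient for requesting it. In general, when a small block is requested with sbrk(), the operating system maps one memory page; once the requested memory exceeds one page, the system maps one more page. When all dynamic memory has been released, the system keeps no dynamic memory mapping at all. Compared with malloc(), sbrk() is more economical with memory and requests no extra storage for bookkeeping, but it is less efficient than malloc() when allocations are frequent.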

brk()

#include <unistd.h>
int brk(void *addr);

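Adjusts the size of the dynamic (virtual) memory: sets the end of the data segment to the value specified by addr. Returns 0 on success; on failure returns -1 and sets errno to ENOMEM.

If addr > the previous end address, dynamic memory is requested and the memory space grows
If addr = the previous end address, the memory space is unchanged
If addr < the previous end address, dynamic memory is released and the memory space shrinks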

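//Release dynamic memory with brk()
#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
int main(void){
    void* pv=sbrk(0);   //remember the current program break
    if((void*)-1==pv||(void*)-1==sbrk(sizeof(int)))   //then request room for one int
        perror("sbrk"),exit(-1);
    if(-1==brk(pv))     //move the break back to pv, releasing that int
        perror("brk"),exit(-1);
    return 0;
}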

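Note: brk() can both request and release dynamic memory, but it is more convenient for releasing it, while sbrk() is more convenient for requesting it. The two functions are therefore usually used together: sbrk() for requesting and brk() for releasing.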

Memory Leak

A memory leak occurs when allocated memory is never used again but is not freed. A problem with memory leaks is that the memory cannot be reclaimed and used later. The amount of memory available to the heap manager is decreased. If memory is repeatedly allocated and then lost, the program may terminate when more memory is needed but malloc() cannot allocate it because it ran out of memory. In extreme cases, the operating system may crash.

Losing the address:

int *pi=(int*)malloc(sizeof(int));
*pi=5;
…
pi=(int*)malloc(sizeof(int));  //the block allocated earlier can no longer be freed, because its address has been lost

Hidden memory leaks: Memory leaks can also occur when the program should release memory but does not. A hidden memory leak occurs when an object is kept in the heap even though the object is no longer needed. This is frequently the result of programmer oversight. The primary problem with this type of leak is that the object is using memory that is no longer needed and should be returned to the heap. In the worst case, the heap manager may not be able to allocate memory when requested, possibly forcing the program to terminate. At best, we are holding unneeded memory.

Structure deallocation without freeing the pointers defined in it: When memory is allocated for a structure, the runtime system will not automatically allocate memory for any pointers defined within it. Likewise, when the structure goes away, the runtime system will not automatically deallocate memory assigned to the structure's pointers.
The correct way to allocate and deallocate a structure pointer with pointer fields:

#include <stdlib.h>
#include <string.h>
// Person layout inferred from the fields used below
typedef struct {
    char *firstName;
    char *lastName;
    char *title;
    unsigned int age;
} Person;
void initializePerson(Person *person, const char* fn, const char* ln, const char* title, unsigned int age){
    person->firstName=(char*)malloc(strlen(fn)+1);
    strcpy(person->firstName,fn);
    ...
    person->age=age;
}
void deallocatePerson(Person* person){
    free(person->firstName);
    free(person->lastName);
    free(person->title);
}
void processPerson(){
    Person* pPerson;
    pPerson=(Person*)malloc(sizeof(Person));
    initializePerson(pPerson,"Peter","Underwood","Manager",36);
    ...
    deallocatePerson(pPerson);
    free(pPerson);
}

Dangling Pointers

A dangling pointer still references the original memory after it has been freed. The use of a dangling pointer can result in:

Unpredictable behavior if the memory is accessed
Segmentation faults if the memory is no longer accessible
Potential security risks

Several approaches to pointer-induced problems:

Setting a pointer to NULL after freeing it.
Writing special functions to replace the free function.
Some systems (runtime/debugger) will overwrite data when it is freed.
Using third-party tools to detect dangling pointers and other problems.
Displaying pointer values can be helpful in debugging dangling pointers.

Note: When a pointer is passed to a function, it is always good practice to verify it is not NULL before using it.

Two ways to return a pointer that references heap memory:

Allocate memory within the function using malloc and return its address. The caller is responsible for deallocating the memory returned.
Pass an object to the function where it is modified. This makes the allocation and deallocation of the object's memory the caller's responsibility.

Several potential problems can occur when returning a pointer from a function:

Returning an uninitialized pointer
Returning a pointer to an invalid address
Returning a pointer to a local variable
Returning a pointer but failing to free it => memory leak

A pointer to one function can be cast to another type.

This should be done carefully, since the runtime system doesn't verify that the parameters used by a function pointer are correct. It is also possible to cast a function pointer to a different function pointer type and then back; the resulting pointer will be equal to the original pointer. The sizes of function pointers are not necessarily the same.
Note that conversion between function pointers and pointers to data is not guaranteed to work.

Always make sure you use the correct argument list for function pointers. Failure to do so will result in undefined behavior.