Linux virtual memory and physical memory management


First, let's look at virtual memory:

A first, simplified understanding:

1. Each process has its own independent 4 GB memory space, and every process's memory space has a similar layout.

2. When a new process is created, it sets up its own memory space, and the process's data and code are copied from disk into that space. Where each piece of data lives is recorded by the task_struct in the process table: a linked list inside task_struct records how the memory space is laid out, which addresses hold data and which do not, and which regions may be read or written.

3. The memory space allocated to each process is mapped to corresponding space on disk.

Problems with this naive picture:

The computer obviously does not have that much memory (n processes would need n * 4 GB).

Creating a process would require copying the program file from disk into that process's memory, and when multiple processes run the same program, memory is wasted.

How it actually works:

1. The 4 GB memory space of each process is only virtual address space. Every access to an address in this space must first be translated into an actual physical memory address.

2. All processes share the same physical memory; each process maps into physical memory only the parts of its virtual memory space that it currently needs.

3. The process needs to know which virtual addresses currently have their data in physical memory and, if so, where; this is recorded in the page table.

4. Each page table entry has two parts: the first records whether the page is present in physical memory, and the second records the address of the physical page frame (if it is present).

5. When the process accesses a virtual address, the page table is consulted; if the corresponding data is found not to be in physical memory, a page fault exception is raised.

6. Handling a page fault means copying the data the process needs from disk into physical memory. If memory is full and there is no space left, a page is chosen to be evicted; of course, if the evicted page has been modified, it must be written back to disk first.
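
To make the fault-and-fill cycle above concrete, here is a small user-space sketch in C (Linux). It is not the kernel's real page fault handler; it only imitates the pattern by reserving address space with no access rights and supplying one page from a SIGSEGV handler the first time the process touches it.

/* demand_fault_demo.c - a user-space imitation of the fault-and-fill idea.
 * Reserve address space with no access rights, catch the SIGSEGV raised by
 * the first touch, and only then make the page usable. */
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static size_t page_size;

static void on_segv(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    /* Page-align the faulting address and make that single page usable,
     * much as the kernel maps in one physical frame on demand. */
    char *page = (char *)((uintptr_t)si->si_addr & ~(uintptr_t)(page_size - 1));
    mprotect(page, page_size, PROT_READ | PROT_WRITE);
    write(STDERR_FILENO, "fault handled\n", 14);   /* async-signal-safe */
}

int main(void)
{
    page_size = (size_t)sysconf(_SC_PAGESIZE);

    struct sigaction sa = {0};
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_segv;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);

    /* Reserve 16 pages of address space that cannot be touched yet. */
    char *region = mmap(NULL, 16 * page_size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    region[0] = 'A';               /* faults; handler supplies page 0 */
    region[5 * page_size] = 'B';   /* faults; handler supplies page 5 */
    printf("%c %c\n", region[0], region[5 * page_size]);
    return 0;
}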


To sum up:

Advantages:

1. Since every process's memory space has the same, fixed layout, the linker can assign memory addresses when linking an executable without caring where the data will finally end up in physical memory; this is the benefit of an independent memory space.

2. When different processes use the same code, such as code in a shared library, only one copy of that code needs to be kept in physical memory; each process simply maps it into its own virtual memory, which saves memory.

3. When a program needs to allocate contiguous memory, it only needs contiguous space in the virtual address space, not contiguous physical memory, so physical memory fragments can still be used.

In addition, when a process is created and loaded, the kernel only "creates" the virtual memory layout for the process: it initializes the memory-related linked list in the process control structure. It does not immediately copy the program's data and code (such as the .text and .data sections) into physical memory at the corresponding virtual addresses; it merely establishes the mapping between the virtual memory and the disk file (called a memory mapping), and the data is copied in through page fault exceptions only when the program actually runs. Likewise, memory allocated dynamically while the process runs, for example with malloc, is at first only virtual memory: the corresponding page table entries are set up, and physical memory is supplied through a page fault exception only when the process actually accesses the data.
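
This lazy behaviour can be observed from user space. The sketch below assumes Linux and glibc (where large malloc() requests are themselves served by mmap(), so an anonymous mmap() is used directly) and uses mincore() to count how many pages of a large allocation are resident before and after they are touched.

/* lazy_alloc_demo.c - observe that a large anonymous allocation only gets
 * physical pages when it is actually touched. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static size_t resident_pages(void *addr, size_t len, size_t page)
{
    size_t npages = len / page, count = 0;
    unsigned char *vec = malloc(npages);

    mincore(addr, len, vec);            /* one residency byte per page */
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;
    free(vec);
    return count;
}

int main(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len  = 1024 * page;          /* 4 MB of address space */

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    printf("resident right after allocation: %zu pages\n",
           resident_pages(buf, len, page));

    for (size_t i = 0; i < len; i += page)   /* touch every page once */
        buf[i] = 1;

    printf("resident after touching:         %zu pages\n",
           resident_pages(buf, len, page));

    munmap(buf, len);
    return 0;
}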

Supplementary understanding:

Virtual memory involves three concepts: virtual address space, disk space, and physical memory space.


The virtual space can be thought of as mapped onto disk space (in fact it is mapped on demand, for example through mmap), and the mapping is recorded by the page table. When an address is accessed, the valid bit in the page table entry tells whether the data is in memory; if it is not, a page fault exception copies the corresponding data from disk into memory, and if there is no free memory, a victim page is chosen and replaced.

mmap establishes a mapping from virtual space to disk space: it maps a range of virtual addresses onto a disk file. If no address is specified, the system chooses one automatically, and the function returns the corresponding (virtual) memory address. When this address is accessed, the file content is copied from disk into memory, after which it can be read or written; finally, munmap writes the in-memory data back to disk and releases the mapping between the virtual space and memory. This is also a way of reading and writing disk files, and a way for processes to share data (shared memory).
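
A minimal sketch of this use of mmap()/munmap() on a file follows; "data.txt" is only a placeholder name, and error handling is kept to a minimum.

/* mmap_file_demo.c - map a file into virtual memory, modify it through
 * memory, and let the kernel write the change back. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.txt", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);

    /* MAP_SHARED: stores through the mapping reach the file and are visible
     * to other processes that map the same file (shared memory). */
    char *p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    printf("first byte: %c\n", p[0]);   /* first access faults the page in */
    p[0] = '#';                         /* modify the file through memory  */

    msync(p, st.st_size, MS_SYNC);      /* force write-back (optional)     */
    munmap(p, st.st_size);              /* drop the mapping                */
    close(fd);
    return 0;
}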

Next, let's discuss physical memory:

Requesting memory in kernel mode is more direct than in user mode: it does not use the deferred allocation technique applied to user space. The kernel assumes that once a kernel function asks for memory, the request must be satisfied immediately, and that the request is always correct and reasonable. For user-mode requests, by contrast, the kernel defers allocating physical memory as long as possible: a user process first obtains only the right to use a region of virtual memory, and acquires a real piece of physical memory later, through a page fault exception.

1. Kernel mapping of physical memory

On the IA-32 architecture the kernel virtual address space is only 1 GB (from 3 GB to 4 GB), so physical memory up to 1 GB (the so-called normal memory) can be mapped directly into the kernel address space, but physical memory beyond 1 GB (high memory) cannot be mapped into kernel space. The kernel therefore adopts the following approach so that it can still use all of the physical memory.

1) High memory cannot all be mapped into kernel space, which means this physical memory has no corresponding linear addresses. However, the kernel allocates a page frame descriptor for every physical page frame, and all page frame descriptors are stored in the mem_map array, so the linear address of every page frame descriptor always exists. The kernel can therefore use alloc_pages() and alloc_page() to allocate high memory, because these functions return the linear address of the page frame descriptor.

2) The last 128 MB of the kernel address space is dedicated to mapping high memory; otherwise high memory, having no linear addresses, could not be accessed by the kernel at all. These kernel mappings of high memory are obviously temporary, otherwise only 128 MB of high memory could ever be mapped. When the kernel needs to access high memory, it temporarily sets up an address mapping in this region and, once it is done, releases it so the region can be used to map other high memory.

Because the kernel mapping of high memory needs this space, the amount of physical memory that can be mapped directly is only 896 MB; this value is stored in high_memory. The linear address ranges of the kernel address space are illustrated in a figure in the original article (not reproduced here).


The kernel uses three mechanisms to map high memory into kernel space: permanent kernel mapping, fixed mapping, and the vmalloc mechanism.
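
As a rough illustration, kernel code on a 32-bit system with CONFIG_HIGHMEM might allocate a high-memory page frame and map it temporarily as in the sketch below, which uses the classic kmap()/kunmap() interface (newer kernels prefer kmap_local_page()). This is an illustrative sketch, not code taken from the kernel itself.

/* Sketch: allocate a page that may come from the high-memory zone and map
 * it into the kernel temporarily.  Error handling is trimmed. */
#include <linux/gfp.h>
#include <linux/highmem.h>
#include <linux/mm.h>
#include <linux/string.h>

static void highmem_example(void)
{
    /* The frame has a descriptor in mem_map even if it has no permanent
     * kernel linear address. */
    struct page *page = alloc_page(GFP_HIGHUSER);
    if (!page)
        return;

    void *vaddr = kmap(page);     /* create a kernel mapping for the frame */
    memset(vaddr, 0, PAGE_SIZE);  /* now the data can be touched           */
    kunmap(page);                 /* release the mapping slot for reuse    */

    __free_page(page);            /* return the frame to the buddy system  */
}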

2. Physical memory management mechanisms

Because of how physical memory is mapped into the kernel space, it is also managed in different ways. The main mechanisms for managing physical memory in the kernel are the buddy algorithm, the slab cache, and the vmalloc mechanism. The buddy algorithm and the slab cache allocate physical memory in the directly mapped region, while the vmalloc mechanism allocates physical memory in the high-memory mapping region.

Buddy algorithm

The buddy algorithm is responsible for allocating and freeing large blocks of contiguous physical memory, with the page frame as its basic unit. This mechanism avoids external fragmentation.

Per-CPU page frame cache

The kernel frequently requests and releases single page frames. This cache holds pre-allocated page frames used to satisfy single-page-frame requests issued by the local CPU.

Slab cache

The slab cache is responsible for allocating small blocks of physical memory; it also acts as a cache, mainly for objects that the kernel allocates and frees frequently.

vmalloc mechanism

The vmalloc mechanism lets the kernel access non-contiguous physical page frames through contiguous linear addresses, which makes the best possible use of high physical memory.

3. Allocation of physical memory

When the kernel issues a memory allocation request, different memory allocators are used depending on which kernel function interface is called.

3.1 Zoned page frame allocator

The zoned page frame allocator handles memory allocation requests for contiguous page frames. It consists of two main parts: the zone allocator at the front end and the buddy system behind it (illustrated with a figure in the original article, not reproduced here).


The zone allocator is responsible for finding a zone that can satisfy the size of the requested block of page frames. Within each zone, the actual page frame allocation is handled by the buddy system. For better system performance, requests for a single page frame are satisfied directly from the per-CPU page frame cache. The allocator offers several functions and macros for requesting page frames; they wrap one another in layers (again shown in a figure in the original article).


These functions and macros wrap the core allocation function __alloc_pages_nodemask(), forming allocation functions that satisfy different needs. Among them, the alloc_pages() family of functions returns the page frame descriptor of the first page of the allocated physical memory, while the __get_free_pages() family returns the linear address of the memory.
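
A minimal sketch of the two request styles (illustrative kernel-style code, with error handling trimmed):

/* Sketch of the descriptor-based and linear-address-based interfaces. */
#include <linux/gfp.h>
#include <linux/mm.h>

static void page_alloc_example(void)
{
    /* Descriptor interface: request 2^2 = 4 contiguous page frames. */
    struct page *pages = alloc_pages(GFP_KERNEL, 2);
    if (pages) {
        void *vaddr = page_address(pages);   /* linear address (low memory) */
        (void)vaddr;
        __free_pages(pages, 2);
    }

    /* Linear-address interface: the same request, returned as an address. */
    unsigned long addr = __get_free_pages(GFP_KERNEL, 2);
    if (addr)
        free_pages(addr, 2);
}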

3.2 Slab allocator

The slab allocator was originally proposed to deal with internal fragmentation of physical memory; it treats the data structures frequently used in the kernel as objects. The slab allocator creates a cache for each kind of object, and the kernel allocates and frees that kind of object from within that cache. (The original article shows the structure of one object's slab allocator in a figure.)


Each object's cache consists of a number of slabs, and each slab consists of a number of page frames. Although the slab allocator can hand out blocks smaller than a single page frame, all the memory it needs is obtained through the buddy algorithm. Slab caches come in two kinds: dedicated caches and general-purpose caches. A dedicated cache serves a specific object, for example the cache created for memory descriptors. General-purpose caches serve the general case and are suitable for allocating physical memory of arbitrary (small) sizes; their interface is kmalloc().
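
A sketch of the two kinds of slab cache follows; struct my_obj and the cache name are made-up examples, not real kernel objects.

/* Sketch of a dedicated slab cache and of the general-purpose kmalloc() caches. */
#include <linux/slab.h>

struct my_obj {                  /* hypothetical object type */
    int  id;
    char name[32];
};

static void slab_example(void)
{
    /* Dedicated cache: every object it hands out is sizeof(struct my_obj). */
    struct kmem_cache *cache =
        kmem_cache_create("my_obj_cache", sizeof(struct my_obj),
                          0, SLAB_HWCACHE_ALIGN, NULL);
    struct my_obj *obj = kmem_cache_alloc(cache, GFP_KERNEL);
    kmem_cache_free(cache, obj);
    kmem_cache_destroy(cache);

    /* General-purpose cache: kmalloc() picks a suitably sized slab. */
    void *buf = kmalloc(128, GFP_KERNEL);
    kfree(buf);
}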

3.3 Allocation of non-contiguous memory areas

The kernel uses vmalloc() to request physically non-contiguous memory. On success the function returns the starting address of a contiguous (virtual) memory area; otherwise it returns NULL. Memory obtained with vmalloc() differs from memory obtained with kmalloc(): kmalloc() memory is contiguous in both its linear and physical addresses, whereas vmalloc() memory is contiguous only in its linear addresses while its physical addresses are scattered, the two being connected through the kernel page tables. The way vmalloc() works is easy to understand:

1) Find a new contiguous range of linear addresses;

2) Allocate a set of non-contiguous page frames, one at a time;

3) Establish the mapping between the linear address range and the non-contiguous page frames, i.e. modify the kernel page tables.

The allocation principle of vmalloc() is similar to user-space memory allocation: discrete physical memory is accessed through contiguous virtual memory, with virtual and physical addresses connected by page tables, which makes effective use of physical memory. Note, however, that when vmalloc() requests physical memory it is allocated immediately, because the kernel regards the request as legitimate and urgent; for user-space requests, by contrast, the kernel defers allocation as long as possible, since user mode and kernel mode are not at the same privilege level.
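
A short sketch contrasting the two interfaces discussed here (illustrative kernel-style code):

/* Sketch: kmalloc() vs vmalloc(). */
#include <linux/slab.h>
#include <linux/vmalloc.h>

static void vmalloc_example(void)
{
    /* kmalloc(): physically and virtually contiguous, size limited. */
    void *k = kmalloc(64 * 1024, GFP_KERNEL);

    /* vmalloc(): virtually contiguous only; page frames are gathered one
     * by one and wired into the kernel page tables, so large sizes work. */
    void *v = vmalloc(4 * 1024 * 1024);

    kfree(k);
    vfree(v);
}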

The part above is reproduced from http://www.linuxidc.com/Linux/2015-02/113981.htm

4. Supplementary notes

MMU: memory management unit

Role: translates virtual addresses into physical addresses;

  it also sets the access attributes of memory: read-only, read-write, or no access.

Units of MMU address translation

The MMU does not translate virtual addresses to physical addresses one address at a time, but one region at a time.
The translation units are:
section        1 MB    ----> used by u-boot
large page     64 KB
small page     4 KB    ----> in Linux, page = 4 KB
tiny page      1 KB
How the MMU performs address mapping
1) By table lookup, not by evaluating a formula.
2) The table used for the lookup is the page table.
3) The page table stores physical addresses and access attributes.
4) The index used to look up the page table is the virtual address.
5) The page table lives in (DDR2) RAM; the Linux kernel must create the page table during initialization.
6) The page table can also be rewritten dynamically; ioremap() rewrites page table entries on the fly (see the sketch after this list).
7) The page table can also be built in advance.
8) The content of the page table: entries, each holding a physical address and access attributes; an entry is 4 bytes.
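
As referenced in item 6 above, here is a rough sketch of how ioremap() is typically used; the physical base address is a placeholder, not a real device.

/* Sketch of ioremap(): map a physical (device register) region into the
 * kernel's virtual address space, access it, then remove the mapping. */
#include <linux/io.h>
#include <linux/types.h>

#define DEV_PHYS_BASE 0x10000000UL   /* hypothetical register block */
#define DEV_REG_SIZE  0x1000

static void ioremap_example(void)
{
    void __iomem *regs = ioremap(DEV_PHYS_BASE, DEV_REG_SIZE);
    if (!regs)
        return;

    u32 val = readl(regs);            /* read the register at offset 0   */
    writel(val | 0x1, regs + 0x4);    /* write the register at offset 4  */

    iounmap(regs);                    /* tear the page-table entries down */
}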

The Linux kernel maps memory in units of a page (4 KB):
1) Two levels of page tables are needed.
2) The first-level page table holds the base addresses of the second-level tables; the second-level tables hold the physical addresses and access attributes.
3) When mapping in 4 KB pages, VA[31:20] indexes the first-level table to locate a second-level table; VA[19:12] indexes that second-level table to find the PA and access attributes.

    VA[11:0] = PA[11:0]
4) When mapping in units of a 4 KB page, the size of the page tables is:


First-level page table:   2^12 entries
Second-level page tables: 2^8 entries each
2^12 * 2^8 + 2^12 entries ---> (1M + 4K) * 4 B
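
A small user-space sketch that splits an example virtual address according to this layout and checks the table-size arithmetic (the address value is arbitrary):

/* Split a 32-bit virtual address under the two-level scheme above
 * (classic ARM short-descriptor layout, 4 KB pages). */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t va = 0x12345678;                  /* arbitrary example address */

    uint32_t l1_index = (va >> 20) & 0xFFF;    /* VA[31:20] -> 4096 L1 entries */
    uint32_t l2_index = (va >> 12) & 0xFF;     /* VA[19:12] -> 256 L2 entries  */
    uint32_t offset   =  va        & 0xFFF;    /* VA[11:0]  -> offset in page  */

    printf("L1 index %u, L2 index %u, page offset 0x%x\n",
           l1_index, l2_index, offset);

    /* Worst case, 4 bytes per entry:
     * 2^12 L1 entries + 2^12 * 2^8 L2 entries = (4K + 1M) entries. */
    unsigned long entries = (1UL << 12) + (1UL << 12) * (1UL << 8);
    printf("total entries: %lu (about %lu KB of page tables)\n",
           entries, entries * 4 / 1024);
    return 0;
}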

External fragmentation: there is enough free memory in total, but it is scattered into small contiguous pieces, so a request for a large block of contiguous memory cannot be satisfied. The buddy algorithm is used to reduce external fragmentation. (Related interface: vmalloc.)

Internal fragmentation: to satisfy a request for a small contiguous area, the system has to hand out a much larger region, wasting space. The slab, slub and slob allocators are used to reduce internal fragmentation. (Related interface: kmalloc.)


vmalloc: allocates kernel virtual memory in the region 3G+160M to 4G; the physical addresses are not necessarily contiguous, while the virtual addresses are contiguous; there is no particular limit on the requested size; page-aligned; it may sleep, so it must not be used in interrupt (atomic) context. Its page frames come from the buddy allocator (buddy ---> removes external, i.e. between-page, fragmentation).

kmalloc: allocates kernel virtual memory in the region 3G to 3G+160M, the physical memory mapping region (which also contains the kernel image, the physical page frame table mem_map, and so on); the physical addresses are contiguous and the virtual addresses are contiguous; a single request cannot exceed 128 KB; the memory is not zeroed; it may block. It is based on the slab allocator (slab ---> removes internal, i.e. within-page, fragmentation).

malloc: allocates user virtual memory, in the 1G-3G range.


Interaction between user space and kernel space:

1. Writing a kernel variable to user space:

put_user(x, ptr): writes the kernel variable x into the user-space location pointed to by ptr; it may sleep.

2. Reading a value from user space into the kernel:

get_user(x,ptr)

3. Copying data from kernel space to user space:

copy_to_user(void __user *to, const void *from, unsigned long n);

* This is what services the read() call made from user space.

* to is a user-space pointer, from is a kernel-space pointer, and n is the number of bytes to copy.

* Returns zero if the copy succeeds; otherwise returns the number of bytes that could not be copied.

4. Copying data from user space to kernel space:

copy_from_user(void __user *to, const void *from, unsigned long n);

5. Checking whether a block of user-space memory is accessible:

access_ok(type,addr,size) 

* type: the access type, either VERIFY_READ or VERIFY_WRITE.

  addr: a user-space pointer to the start of the memory block to be checked.
  size: the size of the memory block to be checked.

* Returns true (non-zero) if the block is accessible, and false (0) otherwise.
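
These helpers typically appear in a character driver's read()/write() methods. The sketch below is illustrative only: the buffer and function names are made up, and the file position argument is ignored for brevity.

/* Sketch of copy_to_user()/copy_from_user() in a character driver. */
#include <linux/fs.h>
#include <linux/uaccess.h>

#define DEV_BUF_SIZE 256
static char dev_buf[DEV_BUF_SIZE];            /* hypothetical device buffer */

static ssize_t demo_read(struct file *filp, char __user *buf,
                         size_t count, loff_t *ppos)
{
    if (count > DEV_BUF_SIZE)
        count = DEV_BUF_SIZE;

    /* kernel -> user; returns the number of bytes NOT copied (0 on success) */
    if (copy_to_user(buf, dev_buf, count))
        return -EFAULT;
    return count;
}

static ssize_t demo_write(struct file *filp, const char __user *buf,
                          size_t count, loff_t *ppos)
{
    if (count > DEV_BUF_SIZE)
        count = DEV_BUF_SIZE;

    /* user -> kernel; same 0-on-success convention */
    if (copy_from_user(dev_buf, buf, count))
        return -EFAULT;
    return count;
}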


 

Source: blog.csdn.net/hgz_gs/article/details/51923272