Learning ARM: the Memory Management Unit (MMU)

First question: why have a memory management unit at all?

 

Objectives of this article:

  • Understand the relationship between virtual addresses and physical addresses
  • Learn how to configure the MMU to control the translation of virtual addresses to physical addresses
  • Understand the MMU's memory access permission mechanism
  • Understand how the TLB, Cache, and Write Buffer work, and what to watch out for when using them
  • Consolidate the points above through a worked example

Introduction to the memory management unit (MMU)

S3C2410 / S3C2440 MMU features

The memory management unit (Memory Management Unit), MMU for short, is responsible for mapping virtual addresses to physical addresses and provides a hardware mechanism for checking memory access permissions.

(That sentence pretty much sums up what the MMU is for, right?)

Modern multi-user, multi-process operating systems use the MMU to give each user process its own independent address space:

  • The address mapping function lets every process "see" the same address space;
  • The memory access permission check protects the memory used by each process from being corrupted by other processes.

The S3C2410 / S3C2440 MMU has the following features:

  • Compatible with the ARM V4 mapping sizes, domains, and access permission check mechanism.
  • Four mapping sizes: section (1MB), large page (64KB), small page (4KB), tiny page (1KB).
  • Access permissions can be set for each section.
  • Access permissions can be set separately for each subpage (a quarter of the mapped page) of large pages and small pages.
  • 16 domains implemented in hardware. (What is a domain, though?)
  • One instruction TLB (64 entries) and one data TLB (64 entries).
  • Hardware page table walks (address mapping and permission checks are done automatically by hardware).
  • TLB entries are replaced using a round-robin algorithm (also known as a cyclic algorithm).
  • The entire TLB can be invalidated.
  • A single TLB entry can be invalidated.
  • Entries in the TLB can be locked; the instruction TLB and the data TLB are independent.

This article focuses on address mapping: the structure of the page tables, how to set them up, and the translation process. Access permissions, the TLB, and the Cache get only a brief introduction.

The S3C2410 / S3C2440 MMU address translation process

Kinds of addresses

Early programs were small and could be loaded into memory in their entirety. As technology developed, two situations arose:

(1) Some programs became so large that they needed more memory than the machine had, so they could not be loaded all at once;

(2) In a multiprogramming system many programs need to run at the same time, and together they need more memory than the machine has, so they cannot all be loaded into memory.

In fact, a program does not have to be loaded completely before it runs. Only the parts needed right now have to be in memory; the rest can be brought in from disk when they are used, and when memory runs short, parts that are not currently in use can be swapped out to disk.

This lets a large program run in a small amount of memory and lets more programs be loaded and run concurrently. From the user's point of view, the system appears to have much more memory than is physically installed. This kind of memory is called virtual memory.

Wow, this idea is really powerful. Memory is always limited, so this is how you run a practically unlimited amount of program in limited memory?

I think embedded systems could be used to teach the computer organization and operating system courses together, with hands-on practice on top.

Virtual memory extends memory logically; what the user sees is a large, virtual memory. On a 32-bit CPU the virtual memory address range is 0 ~ 0xFFFFFFFF. We call this range the virtual address space, and an address within it a virtual address. Corresponding to the virtual address space and virtual addresses are the physical address space and physical addresses, which refer to actual memory.

A virtual address must eventually be translated into a physical address before data can be read or written. This is done by dividing the virtual address space and the physical address space into small blocks of equal size (called sections or pages) and then establishing a mapping between the blocks of the two spaces. Because the virtual address space is much larger than the physical address space, several virtual blocks may be mapped to the same physical block, and some virtual blocks may not be mapped to any physical block at all (they are mapped later, when they are used).

Address translation on an ARM CPU involves three concepts:

  • Virtual address (VA, Virtual Address)
  • Modified virtual address (MVA, Modified Virtual Address)
  • Physical address (PA, Physical Address)

When the MMU is not enabled, every component (the CPU core, cache, MMU, and peripherals) uses physical addresses.

After the MMU is enabled, the CPU core issues a virtual address VA; the VA is converted into an MVA for use by the cache and the MMU, where the MVA is translated into a PA; finally the PA is used to read and write the actual device (an S3C2410 / S3C2440 internal register or an external device):

(1) The CPU core sees and uses only the virtual address VA; how the VA ends up as a physical address PA is of no concern to the core.

(2) The cache and MMU do not see the VA; they use the MVA to obtain the PA.

(3) The actual devices see neither the VA nor the MVA; they are read and written using the physical address PA.

The MVA is the virtual address seen by everything other than the CPU core. The relationship between VA and MVA is as follows:

If VA < 32M, the process identifier PID (obtained by reading CP15 register C13) is used to form the MVA; otherwise the MVA is the same as the VA.

The conversion from VA to MVA is done automatically by hardware: when VA < 32M, MVA = VA | (PID << 25); otherwise MVA = VA.

The purpose of forming the MVA from the PID is to reduce the cost of switching processes. If the VA were used directly without the MVA, then whenever two processes used overlapping virtual address ranges, switching between them would require the overlapping VAs to be mapped to different PAs: the page tables would have to be rebuilt and the caches and TLBs invalidated, which is very expensive.

With the MVA, process switching becomes much cheaper. Suppose processes 1 and 2 both run at VAs 0 ~ (32M-1). Their MVAs do not overlap, being 0x02000000 ~ 0x03FFFFFF and 0x04000000 ~ 0x05FFFFFF respectively, so there is no need to rebuild the page tables and so on.
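The VA to MVA step can be sketched in C as follows. This is only an illustration of what the hardware does on its own: the function name `va_to_mva` and the macro `PID_REMAP_LIMIT` are made up for this note, and `pid` is assumed to be small enough (7 bits) that the shift stays within 32 bits.

```c
#include <stdint.h>

/* Addresses below 32MB (1 << 25) are relocated by the process ID.
 * PID_REMAP_LIMIT is a name invented for this sketch. */
#define PID_REMAP_LIMIT 0x02000000u

/* Sketch of the VA -> MVA step that the hardware performs automatically.
 * `pid` stands for the process ID read from CP15 C13 (assumed to fit in 7 bits). */
uint32_t va_to_mva(uint32_t va, uint32_t pid)
{
    if (va < PID_REMAP_LIMIT)
        return va | (pid << 25);   /* VA < 32M: place the PID in bits [31:25] */
    return va;                     /* VA >= 32M: MVA is the VA itself */
}
```

With PID 1, VA 0x00001000 becomes MVA 0x02001000; with PID 2 the same VA becomes 0x04001000, which matches the two non-overlapping ranges above.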

This left me a bit dazed: VA -> MVA -> PA

In the rest of this article, unless stated otherwise, "virtual address" means the MVA.

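The process of translating virtual addresses to physical addresses

This is the key part!!!

There are generally two ways to translate a virtual address into a physical address: use a fixed mathematical formula, or use a table that stores the physical address corresponding to each virtual address. Such a table is called a page table (Page Table) and is made up of entries (Entry); each entry stores the physical address corresponding to a range of virtual addresses together with its access permissions, or the address of the next-level page table.

ARM CPUs use the table-based method. The S3C2410/S3C2440 uses at most two levels of page tables: translation by section (Section, 1MB) uses only the first-level page table, while translation by page (Page) uses two levels.

There are three page sizes: large page (64KB), small page (4KB), and tiny page (1KB).

Entries are also called "descriptors" (Descriptor). There are section descriptors, large page descriptors, small page descriptors, and tiny page descriptors, which hold the starting physical address of a section, large page, small page, or tiny page; and there are coarse page table descriptors and fine page table descriptors, which hold the physical address of a second-level page table.

The translation process is roughly as follows (one generic translation process, and one refined for ARM CPUs):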

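(1) Use the given virtual address to find an entry in the first-level page table;

(2) If the entry is a section descriptor, return the physical address; translation is complete;

(3) Otherwise, if the entry is a second-level page table descriptor, use the virtual address to find the next entry in that second-level page table;

(4) If this second entry is a page descriptor, return the physical address; translation is complete;

(5) Anything else is an error.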

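"TTB base" is the address of the first-level page table. It just needs to be written into register C2 of coprocessor CP15 (called the translation table base register), as shown in the figure below. The first-level page table address must be 16KB aligned. (My notes say bits [14:0] are 0. Isn't that a typo? 16KB alignment means bits [13:0] are 0, which also matches the base register bits [31:14] used further down.)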

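Start with the first-level page table. A 32-bit CPU has a 4GB virtual address space, and the first-level page table uses 4096 descriptors to cover it. Each descriptor corresponds to 1MB of virtual addresses and stores either the starting address of the corresponding 1MB of physical space or the address of a next-level page table. (So which one does it actually store? The lowest two bits of the descriptor decide.)

MVA[31:20] is used to index the first-level page table and yields a descriptor. Each descriptor occupies 4 bytes, in the format shown in the figure below:

Based on their lowest two bits, first-level descriptors fall into the following 4 kinds.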

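(1) 0b00: invalid

(2) 0b01: coarse page table (Coarse page table)

Bits [31:10] are called the coarse page table base address (Coarse page table base address). Filling the low 10 bits of the descriptor with 0 gives the physical address of a second-level page table. This second-level page table has 256 entries (so it is 1KB in size) and is called a coarse page table. Each entry represents 4KB of physical address space, so one coarse page table represents 1MB of physical address space.

(3) 0b10: section (Section)

Bits [31:20] are called the section base (Section base). Filling the low 20 bits of the descriptor with 0 gives the starting address of a 1MB block of physical address space. MVA[19:0] addresses within this 1MB. So bits [31:20] of the descriptor and MVA[19:0] together form the physical address corresponding to the virtual address MVA.

When mapping by section, the translation from virtual address MVA to physical address PA goes as follows:

1. Bits [31:14] of the translation table base register and MVA[31:20] form a 32-bit address whose low two bits are 0 (the base sits in bits [31:14] and the index in bits [13:2]); the MMU uses this address to fetch the section descriptor.

2. Take bits [31:20] of the section descriptor, i.e. the section base. Together with MVA[19:0] it forms a 32-bit physical address, which is the PA corresponding to the MVA.

(4) 0b11: fine page table (Fine page table)

Bits [31:12] are called the fine page table base address (Fine page table base address). Filling the low 12 bits of the descriptor with 0 gives the physical address of a second-level page table. This second-level page table has 1024 entries (so it is 4KB in size) and is called a fine page table. Each entry represents 1KB of physical address space, so one fine page table represents 1MB of physical address space.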
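To make the bit fields concrete, here is a minimal C sketch of the first-level lookup and decode described above. It only does the arithmetic on plain integers; all names (`l1_desc_addr`, `l1_desc_type`, `l1_desc_base`, `section_pa`) are invented for this note, and permission bits are ignored.

```c
#include <stdint.h>

/* Kinds of first-level descriptor, selected by bits [1:0]. */
enum l1_type { L1_INVALID = 0, L1_COARSE = 1, L1_SECTION = 2, L1_FINE = 3 };

/* Address of the first-level descriptor for `mva`:
 * TTB[31:14] is the table base, MVA[31:20] is the index, each entry is 4 bytes. */
uint32_t l1_desc_addr(uint32_t ttb, uint32_t mva)
{
    return (ttb & 0xFFFFC000u) | ((mva >> 20) << 2);
}

enum l1_type l1_desc_type(uint32_t desc)
{
    return (enum l1_type)(desc & 0x3u);
}

/* Base address held in a descriptor; returns 0 for an invalid descriptor. */
uint32_t l1_desc_base(uint32_t desc)
{
    switch (l1_desc_type(desc)) {
    case L1_COARSE:  return desc & 0xFFFFFC00u;  /* bits [31:10] */
    case L1_SECTION: return desc & 0xFFF00000u;  /* bits [31:20] */
    case L1_FINE:    return desc & 0xFFFFF000u;  /* bits [31:12] */
    default:         return 0;
    }
}

/* Section mapping: PA = section base (bits [31:20]) | MVA[19:0]. */
uint32_t section_pa(uint32_t desc, uint32_t mva)
{
    return (desc & 0xFFF00000u) | (mva & 0x000FFFFFu);
}
```

For example, with the table at 0x30000000 and MVA 0xB0004000, the index is 0xB00, so the descriptor lives at 0x30002C00. If that descriptor is a section descriptor with base 0x30000000, the resulting PA is 0x30004000.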

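Mapping by large page (64KB), small page (4KB), or tiny page (1KB) requires two levels of page tables.

There are two kinds of second-level page table, coarse and fine; "Coarse page table" and "Fine page table" refer to these two.

The format of the descriptors in a second-level page table is shown in the figure below:

Based on their lowest two bits, second-level descriptors fall into 4 cases.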

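(1) 0b00: invalid

(2) 0b01: large page descriptor

(3) 0b10: small page descriptor

(4) 0b11: tiny page descriptor

I won't type out the details for now; there are too many.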

 

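From the translation processes for sections, large pages, small pages, and tiny pages we can see that:

(1) When mapping by section, MVA[31:20] together with the page table gives the starting physical address of a section (1MB), and MVA[19:0] addresses within the section.

(2) When mapping by large page, MVA[31:16] together with the page tables gives the starting physical address of a large page (64KB), and MVA[15:0] addresses within the large page.

(3) When mapping by small page, MVA[31:12] together with the page tables gives the starting physical address of a small page (4KB), and MVA[11:0] addresses within the small page.

(4) When mapping by tiny page, MVA[31:10] together with the page tables gives the starting physical address of a tiny page (1KB), and MVA[9:0] addresses within the tiny page.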
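Putting the four cases together, the whole walk can be simulated on a PC. The sketch below is not the hardware's implementation, only a model under some assumptions: permission and domain checks are left out, the second-level index is taken as MVA[19:12] for a coarse table (256 entries) and MVA[19:10] for a fine table (1024 entries), and tiny pages are assumed to be valid only in fine tables. `sim_write`, `sim_read`, and `translate` are names invented here; the `sim_*` helpers are a toy stand-in for physical memory.

```c
#include <stdint.h>

/* Hypothetical callback type: returns the 32-bit word at a physical address. */
typedef uint32_t (*read_word_fn)(uint32_t pa);

/* Toy "physical memory" for host-side experiments: a few (address, value) pairs. */
#define SIM_WORDS 8
static struct { uint32_t pa, val; } sim_mem[SIM_WORDS];
static int sim_used;

void sim_write(uint32_t pa, uint32_t val)
{
    if (sim_used < SIM_WORDS) {
        sim_mem[sim_used].pa = pa;
        sim_mem[sim_used].val = val;
        sim_used++;
    }
}

/* Addresses never written read back as 0, i.e. an invalid descriptor. */
uint32_t sim_read(uint32_t pa)
{
    int i;
    for (i = 0; i < sim_used; i++)
        if (sim_mem[i].pa == pa)
            return sim_mem[i].val;
    return 0;
}

/* Address of the second-level descriptor, given a coarse (0b01) or fine (0b11)
 * first-level descriptor. */
uint32_t l2_desc_addr(uint32_t l1, uint32_t mva)
{
    if ((l1 & 0x3u) == 0x1u)   /* coarse: base [31:10], index MVA[19:12] */
        return (l1 & 0xFFFFFC00u) | (((mva >> 12) & 0xFFu) << 2);
    /* fine: base [31:12], index MVA[19:10] */
    return (l1 & 0xFFFFF000u) | (((mva >> 10) & 0x3FFu) << 2);
}

/* Walk the tables for `mva`. Returns 0 and stores the PA on success,
 * -1 when the walk hits an invalid descriptor. */
int translate(read_word_fn read_word, uint32_t ttb, uint32_t mva, uint32_t *pa)
{
    uint32_t l1 = read_word((ttb & 0xFFFFC000u) | ((mva >> 20) << 2));
    uint32_t l2;

    if ((l1 & 0x3u) == 0x0u)
        return -1;                                       /* invalid */
    if ((l1 & 0x3u) == 0x2u) {                           /* section */
        *pa = (l1 & 0xFFF00000u) | (mva & 0x000FFFFFu);
        return 0;
    }

    l2 = read_word(l2_desc_addr(l1, mva));
    switch (l2 & 0x3u) {
    case 0x1u:                                           /* large page, 64KB */
        *pa = (l2 & 0xFFFF0000u) | (mva & 0x0000FFFFu);
        return 0;
    case 0x2u:                                           /* small page, 4KB */
        *pa = (l2 & 0xFFFFF000u) | (mva & 0x00000FFFu);
        return 0;
    case 0x3u:                                           /* tiny page, 1KB */
        if ((l1 & 0x3u) != 0x3u)
            return -1;            /* assumption: tiny pages need a fine table */
        *pa = (l2 & 0xFFFFFC00u) | (mva & 0x000003FFu);
        return 0;
    default:
        return -1;                                       /* invalid */
    }
}
```

To try it, store a section descriptor at table index 0xB00 with `sim_write` and call `translate(sim_read, 0x30000000, 0xB0004000, &pa)`; unmapped addresses return -1 because the toy memory reads back 0.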

 

Memory access permission checks

The role of the TLB

The role of the Cache

S3C2410/S3C2440 control instructions for the MMU, TLB, and Cache

 

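MMU example: address mapping

Program design

The board's SDRAM occupies physical addresses 0x30000000~0x33FFFFFF, and the S3C2410/S3C2440 registers all lie in the range 0x48000000~0x5FFFFFFF.

When learning GPIO, the 4 LEDs were driven by writing specific values to the physical addresses 0x56000010 and 0x56000014 of the two registers GPBCON and GPBDAT.

The register addresses don't seem to match my board, and my board only has 3 LEDs. I was cheated, someone owes me an LED.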

 

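This example enables the MMU and maps virtual addresses 0xA0000000~(0xA0000000+1M-1) to the GPIO registers at physical addresses 0x56000000~(0x56000000+1M-1), and virtual addresses 0xB0000000~0xB3FFFFFF to the SDRAM at physical addresses 0x30000000~0x33FFFFFF. When linking, the run address of part of the code is set to 0xB0004000 (the value looks a bit odd, but it will make sense as you read on), to see whether the program can jump to 0xB0004000 and execute there.

The program uses only the first-level page table and maps addresses by section.

A 32-bit CPU has a 4GB virtual address space, and the first-level page table uses 4096 descriptors to cover it (each descriptor corresponds to 1MB of virtual addresses). Each descriptor takes 4 bytes, so the first-level page table takes 16KB.

This example stores the first-level page table in the first 16KB of SDRAM, so the remaining memory starts at physical address 0x30004000.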

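The program code is split into two parts:

  • The first part has run address 0. It initializes SDRAM, copies the second part of the code to SDRAM (stored starting at 0x30004000), sets up the page table, enables the MMU, and finally jumps into SDRAM (address 0xB0004000) to continue executing;
  • The second part has run address 0xB0004000. It drives the LEDs.

The program flowchart is as follows: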

 

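Note that once the MMU is enabled, both instruction fetches and data reads and writes by the CPU use virtual addresses.

The copy_2th_to_sdram function copies the second part of the code (i.e. the code compiled from leds.c) from the Steppingstone to SDRAM.

When the program is linked, the load address of the second part is set to 2048 and its relocation address to 0xB0004000.

So after the system boots from NAND Flash, the second part sits in the Steppingstone starting at address 2048 and has to be copied to 0x30004000 (the MMU is not yet enabled at this point; virtual address 0xB0004000 is later mapped to physical address 0x30004000). The Steppingstone is 4KB in total, so we may as well copy everything from address 2048 onward to SDRAM, which makes the end address of the source data 4096.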
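The copy itself is just a word-by-word loop. The following is a rough sketch of what copy_2th_to_sdram has to do, not the actual source: it is written as a generic helper (the name `copy_words` is made up) so it can be tried on a PC, with the board addresses shown only in the comment.

```c
#include <stdint.h>

/* Copy 32-bit words from [src, src_end) to dest. */
void copy_words(const uint32_t *src, const uint32_t *src_end, uint32_t *dest)
{
    while (src < src_end)
        *dest++ = *src++;
}

/* On the board, before the MMU is enabled, the call would use the
 * physical addresses from the text:
 *
 *   copy_words((const uint32_t *)2048,        // start of part two in Steppingstone
 *              (const uint32_t *)4096,        // end of the 4KB Steppingstone
 *              (uint32_t *)0x30004000);       // just past the 16KB page table
 */
```

Copying the whole upper 2KB is wasteful if the second part is smaller, but it avoids having to know the exact size of the code.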

 

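The remaining functions, create_page_table and mmu_init, are the heart of the example: the former sets up the page table and the latter enables the MMU.

First look at create_page_table. It sets up the address mappings for 3 regions.

(1) Map virtual addresses 0~(1M-1) to the same physical addresses; the Steppingstone (the 4KB of memory starting at address 0) is in this range.

Making the virtual address equal to the physical address means the code in the Steppingstone (head.S and init.c) does not have to care much about whether the MMU is on or off.

(2) The GPIO registers start at physical address 0x56000000; map virtual addresses 0xA0000000~(0xA0000000+1M-1) to physical addresses 0x56000000~(0x56000000+1M-1).

(3) The board's SDRAM occupies physical addresses 0x30000000~0x33FFFFFF; map virtual addresses 0xB0000000~0xB3FFFFFF to physical addresses 0x30000000~0x33FFFFFF.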
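The three mappings can be sketched as below. This is my own reconstruction, not the original create_page_table: it takes the table as a pointer so it can be tested on a PC (on the board it would be `(uint32_t *)0x30000000`), the macro names are invented, and the attribute bits follow the ARMv4 section descriptor layout as I understand it (AP in bits [11:10], domain in bits [8:5], bit 4 set, C in bit 3, B in bit 2). Marking the register region as non-cacheable and non-bufferable is also an assumption on my part.

```c
#include <stdint.h>

/* Section descriptor fields (names are made up for this sketch). */
#define SEC_AP_FULL    (3u << 10)  /* AP bits [11:10] = 0b11: read/write access */
#define SEC_DOMAIN0    (0u << 5)   /* domain bits [8:5]: use domain 0 */
#define SEC_BIT4       (1u << 4)   /* bit 4 is written as 1 */
#define SEC_CACHEABLE  (1u << 3)   /* C bit */
#define SEC_BUFFERABLE (1u << 2)   /* B bit */
#define SEC_TYPE       0x2u        /* bits [1:0] = 0b10: section */

#define SEC_DESC    (SEC_AP_FULL | SEC_DOMAIN0 | SEC_BIT4 | SEC_TYPE)
#define SEC_DESC_WB (SEC_DESC | SEC_CACHEABLE | SEC_BUFFERABLE)

/* Map `count` consecutive 1MB sections starting at `va` onto `pa`. */
static void map_sections(uint32_t *table, uint32_t va, uint32_t pa,
                         uint32_t count, uint32_t attrs)
{
    uint32_t i;
    for (i = 0; i < count; i++)
        table[(va >> 20) + i] = ((pa + (i << 20)) & 0xFFF00000u) | attrs;
}

/* `table` must point to 4096 entries (16KB, 16KB-aligned on real hardware). */
void create_page_table(uint32_t *table)
{
    /* (1) VA 0 ~ 1M-1 maps to itself, so Steppingstone code keeps running. */
    map_sections(table, 0x00000000u, 0x00000000u, 1, SEC_DESC_WB);
    /* (2) VA 0xA0000000 -> GPIO registers at PA 0x56000000, uncached. */
    map_sections(table, 0xA0000000u, 0x56000000u, 1, SEC_DESC);
    /* (3) VA 0xB0000000 ~ 0xB3FFFFFF -> 64MB of SDRAM at PA 0x30000000. */
    map_sections(table, 0xB0000000u, 0x30000000u, 64, SEC_DESC_WB);
}
```

With this table in place, GPBCON at physical address 0x56000010 is reached through virtual address 0xA0000010, and the code copied to 0x30004000 is reached through 0xB0004000. The domain access control register still has to allow domain 0 before any of this works; that belongs to mmu_init.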

 

 

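The linker script mmu.lds splits the program into two sections: first and second.

The former consists of head.o and init.o. Its load address and run address are both 0, so the code does not need to be moved before it runs.

The latter consists of leds.o. Its load address is 2048 and its relocation address is 0xB0004000. This means section second is stored at offset 2048 in the image file produced by the build and must be copied to address 0xB0004000 before it runs, which is done by the copy_2th_to_sdram function in init.c (note that this function copies the code to memory starting at 0x30004000, which is the physical address that virtual address 0xB0004000 corresponds to once the MMU is enabled).

Whoa, this process is a bit complicated!!

Still, being able to use the MMU counts as leveling up. Not that I'm saying I'm there yet!!

I need to digest this a bit more before analyzing the code.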

 


Origin www.cnblogs.com/tuhooo/p/11183817.html