Memory series learning (2): CP15 coprocessor in ARM processor

Registers of CP15 coprocessor in ARM processor

0 Preface

The earlier notes on the MMU touched on memory allocation and the CP15 coprocessor. This part first introduces the CP15 registers and the assembly instructions used to access them.

1 Instructions to access CP15 registers

The encoding format and syntax of the instruction to access the CP15 register are as follows:

[Figure: encoding format and syntax of the MCR and MRC instructions]
Where:

  • <opcode_1>: the coprocessor operation code. For CP15, <opcode_1> must always be 0b000; otherwise the result is unpredictable.
  • <rd>: the ARM register that supplies or receives the value. It cannot be r15/pc; otherwise the result is unpredictable.
  • <crn>: the coprocessor register that is the target of the access, numbered c0~c15.
  • <crm>: an additional target register or source operand register. If no additional information needs to be given, set <crm> to c0; otherwise the result is unpredictable.
  • <opcode_2>: provides additional information such as the version number or access type of the register, and is used to distinguish different physical registers that share the same number. When no such information is needed, <opcode_2> can be omitted or set to 0; otherwise the result is unpredictable.
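
The operands described above can be packed into a 32-bit instruction word. The sketch below is a minimal illustration in Python, assuming the standard ARM-state MCR/MRC bit layout (condition in bits [31:28], opcode_1 in [23:21], the load bit in [20], the coprocessor register in [19:16], the ARM register in [15:12], the coprocessor number in [11:8], opcode_2 in [7:5], and the additional coprocessor register in [3:0]). The function name `encode_cp15_access` and its parameter names are hypothetical and chosen only for this example.

```python
def encode_cp15_access(is_read, rd, crn, crm=0, opcode_1=0, opcode_2=0, cond=0xE):
    """Build the instruction word for MRC (is_read=True) or MCR (is_read=False) on p15.

    Assumed layout: cond | 1110 | opcode_1 | L | CRn | Rd | cp_num | opcode_2 | 1 | CRm
    """
    if rd == 15:
        raise ValueError("the ARM register must not be r15/pc")
    fields = (("rd", rd, 4), ("crn", crn, 4), ("crm", crm, 4),
              ("opcode_1", opcode_1, 3), ("opcode_2", opcode_2, 3), ("cond", cond, 4))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} out of range")
    return ((cond << 28) | (0b1110 << 24) | (opcode_1 << 21) | (int(is_read) << 20)
            | (crn << 16) | (rd << 12) | (15 << 8) | (opcode_2 << 5) | (1 << 4) | crm)


# mrc p15, 0, r0, c0, c0, 0 with the "always" condition
print(hex(encode_cp15_access(True, rd=0, crn=0)))   # 0xee100f10
# mcr p15, 0, r0, c1, c0, 0 with the "always" condition
print(hex(encode_cp15_access(False, rd=0, crn=1)))  # 0xee010f10
```

The default `cond=0xE` corresponds to an unconditional instruction; the coprocessor number is fixed at 15 because only CP15 is of interest here.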


2 CP15 register introduction

Register list of CP15:

[Table: register list of the CP15 coprocessor]

Register C0 in CP15 corresponds to two identifier registers. The physical register that is actually accessed is selected by <opcode_2> in the CP15 access instruction. The correspondence between <opcode_2> and the two identifier registers is as follows:

● Register C0 of CP15

  • <opcode_2>=0 selects the main identifier register;
  • <opcode_2>=1 selects the cache type identifier register.

1) Primary identifier register

The instruction format for accessing the main identifier register is as follows:

        mrc p15, 0, r0, c0, c0, 0      ; read the main identifier register (C0, opcode_2 = 0) into r0

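Field by field:

mrc: the instruction that transfers data from a coprocessor register to an ARM register
p15: CP15, the system control coprocessor used for memory management in ARM
First 0: opcode 1
r0: ARM register
c0: coprocessor register;
Basic role: ID code (read-only);
Role in the MMU: ID code and cache type
Last 0: opcode 2
Overall effect: this instruction reads the main identifier register of CP15 and places its contents in ARM register r0.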

The encoding format of the main identifier register in the processors of different versions of the ARM system is described as follows.

The main identifier register encoding format of the processor after ARM7 is as follows:

[Figure: main identifier register encoding format for processors after ARM7]
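
Since the figure is not reproduced here, the following sketch shows how such an ID word could be split into fields. It assumes the layout commonly documented for post-ARM7 processors: implementer in bits [31:24], variant in [23:20], architecture in [19:16], part number in [15:4], and revision in [3:0]. The function name `decode_main_id` and the sample value 0x41129200 are used purely for illustration.

```python
def decode_main_id(value):
    """Split a post-ARM7 main identifier word into its fields (assumed layout)."""
    return {
        "implementer": (value >> 24) & 0xFF,   # bits [31:24]
        "variant": (value >> 20) & 0xF,        # bits [23:20]
        "architecture": (value >> 16) & 0xF,   # bits [19:16]
        "part_number": (value >> 4) & 0xFFF,   # bits [15:4]
        "revision": value & 0xF,               # bits [3:0]
    }


fields = decode_main_id(0x41129200)  # sample value, not read from real hardware
print({name: hex(v) for name, v in fields.items()})
```

On a real system the input would be the value left in r0 by the mrc instruction shown earlier.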
The main identifier register encoding format of the ARM7 processor is as follows:

[Figure: main identifier register encoding format for ARM7 processors]

The main identifier register encoding format of processors before ARM7 is as follows:

[Figure: main identifier register encoding format for processors before ARM7]

2) cache type identifier register

The instruction format for accessing the cache type identifier register is as follows:

        mrc p15, 0, r0, c0, c0, 1      ; read the cache type identifier register (C0, opcode_2 = 1) into r0

The encoding format of the cache type identifier register in the ARM processor is as follows:

[Figure: encoding format of the cache type identifier register]

The meaning of control field bits [28:25] is as follows:

[Table: control field bits [28:25] of the cache type identifier register]

Control field bits [23:12] and control field bits [11:0] share the same encoding format, with the following meaning:

[Figure: encoding format of bits [23:12] and bits [11:0]]

The meaning of the cache capacity field, bits [8:6], is as follows:

[Table: cache capacity field]

The meaning of the cache associativity field, bits [5:3], is as follows:

[Table: cache associativity field]

The meaning of the cache block size field, bits [1:0], is as follows:

[Table: cache block size field]

● Register C1 of CP15

The instruction format for accessing register C1 is as follows:

        mrc p15, 0, r0, c1, c0{, 0}    ; read CP15 register C1 into r0
        mcr p15, 0, r0, c1, c0{, 0}    ; write r0 to CP15 register C1

The encoding format and meaning of register C1 in CP15 are as follows:

[Figure: encoding format of register C1 and the meaning of its control bits]

● Register C2 of CP15

Register C2 in CP15 stores the base address of the page table, that is, the base address of the first-level mapping descriptor table. Its encoding format is as follows:

[Figure: encoding format of register C2]

● Register C3 of CP15

Register C3 in CP15 defines the access rights of 16 domains of the ARM processor.

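[Figure: encoding format of register C3]

In ARM processors, the MMU divides the whole memory space into at most 16 domains, denoted D0~D15. Each domain corresponds to certain memory regions that share the same access control attributes. The access permission of each domain is set by two bits in CP15 register C3; C3 is 32 bits wide, which is exactly enough to hold the access permissions of 16 domains.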
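
The two-bits-per-domain layout can be sketched as follows. This is a minimal illustration that assumes domain n occupies bits [2n+1:2n] and uses the access encodings commonly documented for ARM (0b00 no access, 0b01 client, 0b10 reserved, 0b11 manager); neither detail is spelled out in the text above. The constant and function names are hypothetical.

```python
NO_ACCESS, CLIENT, RESERVED, MANAGER = 0b00, 0b01, 0b10, 0b11  # assumed encodings


def set_domain_access(c3_value, domain, access):
    """Return a new 32-bit C3 value with the 2-bit field of `domain` set to `access`."""
    if not 0 <= domain <= 15:
        raise ValueError("domain must be in 0..15")
    if access not in (NO_ACCESS, CLIENT, MANAGER):
        raise ValueError("unsupported access value")
    shift = 2 * domain
    cleared = c3_value & ~(0b11 << shift) & 0xFFFFFFFF  # clear the old field
    return cleared | (access << shift)


def get_domain_access(c3_value, domain):
    """Extract the 2-bit access field of `domain` from a C3 value."""
    return (c3_value >> (2 * domain)) & 0b11


value = set_domain_access(0, 0, CLIENT)        # D0 = client
value = set_domain_access(value, 15, MANAGER)  # D15 = manager
print(hex(value))  # 0xc0000001
```

The resulting value is what would be placed in an ARM register and written to C3 with an mcr instruction.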

● Register C5 of CP15

Register C5 in CP15 is the fault status register. Its encoding format is as follows:

[Figure: encoding format of register C5]

Here, the domain field, bits [7:4], identifies the domain of the memory access that caused the fault.

The status field, bits [3:0], indicates the type of memory access that caused the fault. The meaning of this field is shown in Table 4-3 (priority decreases from top to bottom).

[Table 4-3: meaning of the status field]
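
Extracting the two fields described above is a matter of shifting and masking. The sketch below only separates the domain and status fields and does not attempt to interpret the status codes, since the table is not reproduced here. The function name `decode_fault_status` and the sample value are hypothetical.

```python
def decode_fault_status(c5_value):
    """Split a fault status value into its domain (bits [7:4]) and status (bits [3:0])."""
    return {
        "domain": (c5_value >> 4) & 0xF,
        "status": c5_value & 0xF,
    }


print(decode_fault_status(0x35))  # {'domain': 3, 'status': 5}
```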

● Register C6 in CP15

Register C6 in CP15 is the fault address register. Its encoding format is as follows:

[Figure: encoding format of register C6]

● Register C7 in CP15

The C7 register of CP15 is used to control the cache and the write buffer. It is a write-only register; reading it has unpredictable results.

The instruction format for accessing the C7 register of CP15 is as follows:

        mcr p15, 0, <rd>, c7, <crm>, <opcode_2>
        ; different combinations of <rd>, <crm> and <opcode_2> select different operations

● Register C8 in CP15

The C8 register of CP15 is used to control operations that clear the contents of the TLB. It is a write-only register; reading it has unpredictable results.

The instruction format for accessing the C8 register of CP15 is as follows:

        mcr p15, 0, <rd>, c8, <crm>, <opcode_2>
        ; different combinations of <rd>, <crm> and <opcode_2> select different operations

● Register C9 in CP15

The C9 register of CP15 is used to control cache content locking.

The instruction format for accessing the C9 register of CP15 is as follows:

        mcr p15, 0, <rd>, c9, c0, <opcode_2>
        mrc p15, 0, <rd>, c9, c0, <opcode_2>

If the system has separate instruction and data caches, each of them has its own cache content lock register, and <opcode_2> selects between the two:

  • <opcode_2>=1 selects the content lock register of the instruction cache;
  • <opcode_2>=0 selects the content lock register of the data cache.

The C9 register of CP15 has two encoding formats, A and B. Encoding format A is as follows:

[Figure: encoding format A of register C9]

Here, index means that on the next cache miss, the fetched memory block is placed in the cache block numbered index within the cache set that the block maps to.

At that point, the cache blocks numbered 0~index-1 are locked. When a cache replacement occurs, the block to be replaced is chosen from the blocks numbered index to ASSOCIATIVITY-1.

Encoding format B is as follows:

[Figure: encoding format B of register C9 and the meaning of its fields]

● Register C10 of CP15

The C10 register of CP15 is used to control the TLB content locking.

The instruction format for accessing the C10 register of CP15 is as follows:

        mcr p15, 0, <rd>, c10, c0, <opcode_2>
        mrc p15, 0, <rd>, c10, c0, <opcode_2>

If the system has separate instruction and data TLBs, each of them has its own TLB content lock register, and <opcode_2> selects between the two:

  • <opcode_2>=1 selects the content lock register of the instruction TLB;
  • <opcode_2>=0 selects the content lock register of the data TLB.

The encoding format of the C10 register is as follows:
[Figure: encoding format of register C10 and the meaning of its fields]

● Register C13 of CP15

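The C13 register is used by the Fast Context Switch Extension (FCSE).

FCSE sits between the CPU and the MMU. If two processes use the same virtual address range, the CPU still sees both of them using that same range, while FCSE transforms the addresses before they reach the rest of the system.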

The instruction format for accessing the C13 register of CP15 is as follows:

        mcr p15, 0, <rd>, c13, c0, 0
        mrc p15, 0, <rd>, c13, c0, 0

The encoding format of the C13 register is as follows:

[Figure: encoding format of register C13]

Here, PID is the number of the process space block that holds the current process, that is, the process identifier of the current process. Its value is 0~127.

  • 0: MVA (modified virtual address) = VA (virtual address), so FCSE is disabled. PID is 0 after system reset;
  • Non-zero: FCSE is enabled.

3 FCSE

● FCSE Overview

FCSE (Fast Context Switch Extension) sits between the CPU and the MMU. If two processes use the same virtual address range, then from the CPU's point of view they really do use the same range.

The fast context switching mechanism transforms the virtual addresses of each process, so that every part of the system other than the CPU sees the transformed virtual addresses.

Because the mechanism maps the virtual address space of each process to a different region, virtual addresses do not need to be remapped to physical addresses when switching between processes.

Normally, if the virtual address ranges of two processes overlap, the system must remap virtual addresses to physical addresses when it switches between them.

That remapping involves rebuilding the page tables used by the MMU, and the contents of the cache and TLB must be invalidated (by setting the relevant bits of the coprocessor registers).

These operations are costly in two ways: rebuilding the page tables and invalidating the cache and TLB takes time, and so does refilling the cache and TLB afterwards.

By modifying the virtual addresses of the different processes in the system, FCSE avoids the virtual-to-physical remapping that a process switch would otherwise cause. This removes the overhead of rebuilding the page tables, invalidating the cache and TLB, and refilling them, and so improves system performance.

● FCSE principle

In the ARM system, the 4GB virtual address space is divided into 128 process space blocks of 32MB each.

Each process space block can hold one process, and that process uses the virtual address range 0x0~0x01FFFFFF. This is also the address range of the process as seen by the CPU.

The 128 process space blocks of the system are numbered 0~127.

The process in the process space block numbered X actually occupies the virtual address range (X × 0x02000000) to (X × 0x02000000 + 0x01FFFFFF). This is the virtual address range of the process as seen by every part of the system other than the CPU.
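
The block arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the names `BLOCK_SIZE` and `process_block_range` are hypothetical.

```python
BLOCK_SIZE = 0x02000000  # 32MB per process space block


def process_block_range(x):
    """Return (first, last) virtual address of process space block number x."""
    if not 0 <= x <= 127:
        raise ValueError("block number must be in 0..127")
    base = x * BLOCK_SIZE
    return base, base + BLOCK_SIZE - 1


for x in (0, 1, 127):
    first, last = process_block_range(x)
    print(f"block {x}: 0x{first:08X} .. 0x{last:08X}")
```

Block 127 ends at 0xFFFFFFFF, which confirms that 128 blocks of 32MB cover the 4GB space exactly.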

The fast context switching mechanism transforms each virtual address issued by the CPU according to the above rules, and then sends it to other parts of the system. The transformation process is as follows:

[Figure: transformation of VA into MVA by the fast context switching mechanism]

The conversion algorithm from address VA to MVA is as follows:

        if (VA[31:25] == 0b0000000) then
            MVA = VA | (PID << 25)
        else
            MVA = VA

Here, PID is the number of the process space block that holds the current process, that is, the process identifier of the current process. Its value is 0~127.

  • a. Every process in the system uses the virtual address range 0x0~0x01FFFFFF. When a process accesses its own instructions and data, the upper 7 bits of the virtual address VA it generates are 0. The fast context switching mechanism replaces those upper 7 bits with the process identifier, which gives the modified virtual address MVA. This MVA lies in the process space block that belongs to the process.

        PID << 25 |   PID   | 0.....0 |
        VA        | 0000000 |   VA    |    (bitwise OR)
        MVA       |   PID   |   VA    |

  • b. When the upper 7 bits of VA are not all 0, MVA=VA. Such a VA is the virtual address that this process uses to access data and instructions in another process. Note that the identifier of the accessed process cannot be 0.
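
The two cases above can be written as a small function. This is a minimal sketch of the VA to MVA rule described in this section, modelling 32-bit addresses as Python integers; the function name `va_to_mva` is hypothetical.

```python
def va_to_mva(va, pid):
    """Apply the FCSE rule: relocate VA into block `pid` only if VA[31:25] is zero."""
    if not 0 <= va <= 0xFFFFFFFF:
        raise ValueError("va must be a 32-bit address")
    if not 0 <= pid <= 127:
        raise ValueError("pid must be in 0..127")
    if (va >> 25) == 0:          # case a: address inside 0x0~0x01FFFFFF
        return va | (pid << 25)
    return va                    # case b: upper 7 bits not all zero, left unchanged


print(hex(va_to_mva(0x00001000, 5)))  # 0xa001000, relocated into block 5
print(hex(va_to_mva(0x04001000, 5)))  # 0x4001000, passed through unchanged
print(hex(va_to_mva(0x00001000, 0)))  # 0x1000, PID 0 leaves the address as it is
```

The last call shows why a PID of 0 is equivalent to FCSE being disabled: the OR with zero has no effect.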

Register C13 in CP15 is used for fast context switching. Its encoding format is as follows.

[Figure: encoding format of register C13]

Here, PID is the number of the process space block that holds the current process, that is, the process identifier of the current process. Its value is 0~127.

The instruction format for accessing register C13 is as follows.

        mcr p15, 0, <rd>, c13, c0, 0
        mrc p15, 0, <rd>, c13, c0, 0

For a read, bits [31:25] of the result hold the PID, and the values of the other bits are unpredictable. A write sets the value of PID.

  • When PID is 0, MVA=VA, which is equivalent to disabling FCSE. PID is 0 after system reset.
  • When PID is non-zero, FCSE is enabled.
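
Because only bits [31:25] of a value read from C13 are meaningful, code that handles the register value would typically mask the rest away. The sketch below models that packing and unpacking on plain integers; the function names `pid_to_c13` and `c13_to_pid` are hypothetical, and zeroing the low bits on a write is an assumption made for this example.

```python
def pid_to_c13(pid):
    """Build the value to write to C13: PID in bits [31:25], low bits left at 0 (assumed)."""
    if not 0 <= pid <= 127:
        raise ValueError("pid must be in 0..127")
    return pid << 25


def c13_to_pid(c13_value):
    """Extract the PID from a value read from C13, ignoring the unpredictable low bits."""
    return (c13_value >> 25) & 0x7F


raw = pid_to_c13(5) | 0x1234   # low bits set to simulate unpredictable content
print(c13_to_pid(raw))         # 5
```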

References

  • "Embedded System Linux Kernel Development Practical Guide"
  • https://blog.csdn.net/yinsexingkong/article/details/51291683

Origin blog.csdn.net/weixin_45264425/article/details/132248470