1. What is architecture
"Architecture" refers to the functional specification, and the ARM architecture is the functional specification of the ARM processor, including the following main contents:
- Instruction set: the function of each instruction, the representation method (encoding) of the instruction in memory;
- Register set: the number, size, function, and initial state of registers;
- Exception model: the privilege levels, the exception types, and what happens when an exception is taken and when execution returns from it;
- Memory model: the order in which memory accesses occur, how caches behave, and when software must perform explicit cache maintenance;
- Debug, trace, and profiling: how breakpoints are set and triggered, and what information trace tools can capture and how.
The architecture does not say how a processor is built or how it works internally; it only specifies the behavior that software can rely on from the hardware. The concrete construction and design of a processor is called its "micro-architecture", which includes:
- Pipeline Length and Layout
- Number and size of caches
- The number of cycles a single instruction takes (how many clock cycles one instruction needs)
- Other optional features
2. ARM architecture classification
ARM provides three architectural profiles:
- A-Profile (application): used in complex computing applications, such as servers, mobile phones, and in-vehicle infotainment head units;
- R-Profile (real-time): Used where real-time response is required, such as safety-critical applications or applications that require deterministic response, such as medical equipment, vehicle steering, braking and signaling, etc.;
- M-Profile (microcontroller): used in places with strong requirements for energy efficiency, power consumption, and size, such as deeply embedded chips, small sensors, communication modules, and smart home products.
The construction and design of a processor is called a "microarchitecture." The microarchitecture defines how the processor works, including: the length and layout of the pipeline, the number and size of caches, the number of cycles for a single instruction, and other optional features.
Arm-A architecture
Armv7-A
(1) Instruction set
The ARMv7-A architecture is a 32-bit load/store architecture: data processing instructions operate only on general-purpose registers, and only load/store instructions can access memory. Another major feature of the ARM instruction set is that almost all of its instructions can be executed conditionally by adding a condition code.
The ARM instruction set can be classified into the following four categories:
- Data processing operations (ALU operations such as ADD);
- memory operations (load/store);
- control flow (loops, jumps, condition codes, etc.);
- System (coprocessor, debug, mode switching, etc.).
Armv7-A supports the Arm (A32) and Thumb (T32) instruction sets.
An ARM core can only process data held in registers, never data directly in memory.
Data processing instructions generally have one destination register and two source operands. They can take the optional S suffix, in which case they update the condition flags in the CPSR. The basic format is as follows:
Operation{cond}{S} Rd, Rn, Operand2
- Operation: instruction mnemonic;
- cond: execution condition;
- S: suffix that selects whether the instruction updates the status flags in the CPSR register;
- Rd: destination register;
- Rn: the first operand register;
- Operand2: the second operand;
- {}: optional.
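As a rough illustration of what the S suffix does, the Python sketch below models a 32-bit ADDS and computes the N, Z, C and V flags it would write to the CPSR. The name `adds32` is made up for this example; it is a behavioral model that follows the usual two's-complement rules, not an excerpt from any ARM tool.

```python
MASK32 = 0xFFFFFFFF


def adds32(rn: int, operand2: int):
    """Model ADDS Rd, Rn, Operand2 on 32-bit registers.

    Returns (rd, flags), where flags holds the N, Z, C and V bits.
    """
    rn &= MASK32
    operand2 &= MASK32
    full = rn + operand2   # unbounded sum, used to detect the carry out
    rd = full & MASK32     # value written to Rd
    flags = {
        "N": rd >> 31,                                   # sign bit of the result
        "Z": int(rd == 0),                               # result is zero
        "C": int(full > MASK32),                         # unsigned carry out of bit 31
        "V": (((rn ^ rd) & (operand2 ^ rd)) >> 31) & 1,  # signed overflow
    }
    return rd, flags
```

Without the S suffix (plain ADD) the same result would be written to Rd, but the flags would be left untouched.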
(2) Processor mode
The ARMv7 architecture supports the Security Extensions. When they are implemented, an ARMv7-A processor has two security states: Secure state and Non-secure state.
In Non-secure state there are three privilege levels: PL0, PL1 and PL2.
Privilege level | Description |
---|---|
PL0 | User mode runs at PL0 and is used for applications. Code in this mode has restricted access to system resources. Corresponds to Linux user space. |
PL1 | All modes other than User mode and Hyp mode run at PL1, namely the System, SVC, FIQ, IRQ, UNDEF and Abort modes carried over from the ARMv6 architecture. The Linux kernel runs at PL1. In Secure state, Monitor mode also runs at PL1 and manages switching between Secure and Non-secure state. |
PL2 | PL2 is for virtualization. The hypervisor runs at PL2. |
Processor mode:
- User: runs at PL0, that is, unprivileged. This is the mode in which applications run under an OS. It cannot access system resources (the MMU and so on) and cannot switch modes by itself; it is left only when an interrupt or exception is taken (for example an SVC, formerly SWI, that triggers a system call);
- FIQ: fast interrupt mode, entered when an FIQ is taken;
- IRQ: interrupt mode, entered when an IRQ is taken;
- Supervisor: the default mode after reset, running at PL1. The SVC (formerly SWI) instruction generates a Supervisor Call exception that enters this mode, which is the mode operating systems commonly use;
- Monitor: used by the Security Extensions; not discussed in detail here;
- Abort: entered when a Data Abort or Prefetch Abort exception is taken;
- Hyp: available when the Virtualization Extensions are supported; not discussed in detail here;
- Undefined: entered when the processor attempts to execute an UNDEFINED instruction;
- System: also runs at PL1. Unlike Supervisor mode, System mode uses the same registers as User mode; most systems do not currently use it.
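The relationship between modes and privilege levels can be summarized as a lookup table. The Python sketch below maps the five-bit mode field M[4:0] of the CPSR to a mode name and privilege level, using the mode encodings defined by the ARMv7-A architecture. The names `MODES` and `describe_mode` are hypothetical helpers for this illustration.

```python
# M[4:0] encoding -> (mode name, privilege level it runs at)
MODES = {
    0b10000: ("User", "PL0"),
    0b10001: ("FIQ", "PL1"),
    0b10010: ("IRQ", "PL1"),
    0b10011: ("Supervisor", "PL1"),
    0b10110: ("Monitor", "PL1"),
    0b10111: ("Abort", "PL1"),
    0b11010: ("Hyp", "PL2"),
    0b11011: ("Undefined", "PL1"),
    0b11111: ("System", "PL1"),
}


def describe_mode(cpsr: int) -> str:
    """Return a readable description of the mode encoded in a CPSR value."""
    mode_bits = cpsr & 0x1F  # M[4:0] is the lowest five bits of the CPSR
    if mode_bits not in MODES:
        return f"reserved encoding {mode_bits:#07b}"
    name, level = MODES[mode_bits]
    return f"{name} mode ({level})"
```

For example, `describe_mode(0x10)` reports User mode at PL0, while any value whose low five bits are 0b11010 is reported as Hyp mode at PL2.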
(3) General registers
The ARMv7-A processor has 16 general-purpose registers: R0~R15, of which:
- R13: Usually used as a stack pointer SP;
- R14: Usually used as link register LR;
- R15: Usually used as a program counter PC;
Each privilege level has different rights to access system resources, and each processor mode runs at a particular privilege level. The registers visible in each mode also differ:
- R0~R7 and PC are shared by all modes;
- FIQ mode has its own dedicated (banked) copies of R8~R12, SP and LR; some materials call these "shadow registers";
- Likewise, Supervisor, Abort, Undefined and IRQ modes each have their own dedicated SP and LR, so on entry from another mode these two registers do not need to be saved and restored;
- FIQ has more dedicated registers than IRQ, which is part of what makes it "fast": an FIQ handler can use R8~R12 without first saving them.
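Register banking is easier to see in a small model. The following Python sketch keeps one shared set of registers plus per-mode copies of the banked ones, and routes each access to the right copy. The class `RegisterFile` and the short mode names are invented for this example, and Monitor and Hyp modes are left out to keep it short.

```python
ALL_REGS = [f"r{i}" for i in range(13)] + ["sp", "lr", "pc"]

# Registers that each mode has a private (banked) copy of.
# User and System mode are absent: they use the shared set.
BANKED = {
    "fiq": {"r8", "r9", "r10", "r11", "r12", "sp", "lr"},
    "irq": {"sp", "lr"},
    "svc": {"sp", "lr"},
    "abt": {"sp", "lr"},
    "und": {"sp", "lr"},
}


class RegisterFile:
    def __init__(self):
        self.shared = dict.fromkeys(ALL_REGS, 0)
        self.banks = {mode: dict.fromkeys(regs, 0) for mode, regs in BANKED.items()}

    def _bank(self, mode: str, reg: str) -> dict:
        # Pick the banked copy if this mode has one, else the shared register.
        if reg in BANKED.get(mode, ()):
            return self.banks[mode]
        return self.shared

    def read(self, mode: str, reg: str) -> int:
        return self._bank(mode, reg)[reg]

    def write(self, mode: str, reg: str, value: int) -> None:
        self._bank(mode, reg)[reg] = value & 0xFFFFFFFF
```

In this model a write to `sp` in IRQ mode leaves the User mode `sp` untouched, and a write to `r8` in FIQ mode is invisible to every other mode, which is why an FIQ handler does not need to save R8~R12 first.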
(4) Special registers
ARMv7-A also has a special register, the Current Program Status Register (CPSR). When an exception is taken, the current CPSR is saved to the Saved Program Status Register (SPSR) of the mode being entered. In User mode the CPSR is accessed as the APSR, which is a restricted view of the CPSR, because not all CPSR bits are accessible from User mode.
Field | Meaning |
---|---|
N | Set when the result of the ALU operation is negative |
Z | Set when the result of the ALU operation is 0 |
C | Set when the ALU operation produces a carry |
V | Set when the ALU operation overflows |
Q | Cumulative saturation flag |
J | Whether the processor is in Jazelle state |
E | Controls the byte order of load/store data accesses: E=1 means big-endian, E=0 means little-endian |
A | Asynchronous abort mask: A=1 disables asynchronous aborts. Cannot be changed from User mode |
I | IRQ mask: I=1 disables IRQ, I=0 enables IRQ. Cannot be changed from User mode |
F | FIQ mask: F=1 disables FIQ, F=0 enables FIQ. Cannot be changed from User mode |
T | Thumb state bit, which distinguishes ARM state from Thumb state |
GE[3:0] | Greater-than-or-equal flags used by some SIMD (Single Instruction, Multiple Data) instructions |
M[4:0] | Processor mode: User, FIQ, IRQ, SVC, ABT, UND, SYS, MON, HYP. Cannot be changed from User mode |
IT[7:0] | IT[7:2] and IT[1:0] are stored in separate bit ranges and together form IT[7:0], the execution state of the If-Then (IT) instruction |
Bits [31:28] hold the condition flags N, Z, C and V; bits [4:0] (M[4:0]) hold the processor mode encoding.
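To make the field layout concrete, here is a minimal Python sketch that splits a 32-bit CPSR value into the fields listed above. The bit positions follow the ARMv7-A CPSR layout (N, Z, C, V, Q in bits 31 to 27, IT[1:0] in bits 26:25, J in bit 24, GE in bits 19:16, IT[7:2] in bits 15:10, then E, A, I, F, T and M[4:0]); the function name `decode_cpsr` is an assumption made for this example.

```python
def decode_cpsr(cpsr: int) -> dict:
    """Split a 32-bit ARMv7-A CPSR value into its named fields."""

    def bit(n: int) -> int:
        return (cpsr >> n) & 1

    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),
        "Q": bit(27),
        "J": bit(24),
        "GE": (cpsr >> 16) & 0xF,
        "E": bit(9), "A": bit(8), "I": bit(7), "F": bit(6), "T": bit(5),
        "M": cpsr & 0x1F,
        # IT[7:2] lives in bits 15:10 and IT[1:0] in bits 26:25.
        "IT": (((cpsr >> 10) & 0x3F) << 2) | ((cpsr >> 25) & 0x3),
    }
```

A value such as `0x600001D3` decodes to Z=1 and C=1 with A, I and F all set and M equal to 0b10011, that is, Supervisor mode with interrupts masked.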
Armv8-A
The Armv8-A architecture is the generation of the Arm application-profile architecture that introduced 64-bit execution. It builds on ARMv7 and earlier processor technology. It keeps the existing 16/32-bit Thumb-2 instructions and remains backward compatible with the existing A32 (ARM 32-bit) instruction set. The 64-bit AArch64 execution state adds the new A64 (ARM 64-bit) instruction set; in addition, the existing A32 (ARM 32-bit) and T32 (Thumb-2) instruction sets have been extended, and support for cryptographic (CRYPTO) instructions has been added.
Registers
To stay backward compatible with Armv7, Armv8-A supports two Execution States, AArch32 and AArch64. The differences between them are as follows:
AArch32 | AArch64 |
---|---|
Provides 13 32-bit general-purpose registers R0-R12, a 32-bit program counter PC (R15), a stack pointer SP (R13) and a link register LR (R14) | Provides 31 64-bit general-purpose registers X0-X30 (W0-W30 for their 32-bit views), of which X30 is the procedure link register LR |
Provides a 32-bit exception link register ELR for exception return from Hyp mode | Provides a 64-bit program counter PC, stack pointers SP_ELx and exception link registers ELR_ELx |
Provides 32 64-bit registers for SIMD vector and scalar floating-point support | Provides 32 128-bit registers for SIMD vector and scalar floating-point support |
Provides two instruction sets, A32 (32-bit) and T32 (16/32-bit) | Defines the ARMv8 exception levels ELx (x<4); the larger x is, the higher the level and the greater the privilege |
Exception model compatible with ARMv7 | Defines a set of PSTATE fields that hold the state of the PE (Processing Element) |
The coprocessor interface only supports CP10, CP11, CP14 and CP15 | No concept of coprocessors |
General-purpose registers
Under the ARM64 architecture, the CPU provides 33 registers, of which the first 31 (0~30) are general-purpose integer registers.
Register | Description |
---|---|
X0 register | Holds the return value (and the first argument) |
X1 ~ X7 registers | Hold function arguments |
X8 register | Indirect result register: holds the address where a large return value is to be stored |
X9 ~ X28 registers | General-purpose registers with no dedicated role in the instruction set (the procedure call standard splits them into caller-saved and callee-saved groups) |
X29 (FP) register | Frame pointer: holds the base address of the current stack frame |
X30 (LR) register | Link register, holds the return address |
Each AArch64 64-bit general-purpose register X0-X30 has a corresponding 32-bit register: Wn is the lower 32 bits of Xn. Reading Wn returns those lower 32 bits and leaves Xn unchanged. Writing Wn sets the upper 32 bits of Xn to 0.
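The W/X relationship can be modeled in a few lines. This is a minimal Python sketch in which a register is just a 64-bit integer; `read_w` and `write_w` are hypothetical helper names, not part of any real toolchain.

```python
MASK32 = (1 << 32) - 1


def read_w(x: int) -> int:
    """Reading Wn yields the lower 32 bits of Xn; Xn itself is not modified."""
    return x & MASK32


def write_w(x_old: int, value: int) -> int:
    """Writing Wn returns the new Xn: lower 32 bits from value, upper 32 bits zeroed.

    x_old is accepted only to show that the previous upper half is discarded.
    """
    return value & MASK32
```

So if X0 holds `0xFFFFFFFFFFFFFFFF`, writing `0x12345678` to W0 leaves X0 equal to `0x0000000012345678`, not `0xFFFFFFFF12345678`.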
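Special registers
In addition to the 31 general-purpose registers, there are several special registers:
1. Zero register (XZR/WZR): writes are ignored and reads always return 0;
2. SP/WSP: the current stack pointer;
3. PC (program counter): the ARMv7 instruction sets use general-purpose register R15 as the PC, and manipulating the PC directly allows some clever programming tricks. In ARMv8 (A64) the PC cannot be accessed directly as a general-purpose register, which makes returns easier to predict and simplifies the ABI specification;
4. ELR / SPSR: when ARMv8 executes in AArch64, the state restored on an exception return from each ELn is determined by ELR and SPSR
ELR: the exception link register, holds the exception return address
SPSR: the saved program status register, holds the processor state from just before the exception is taken so that it can be restored on return from the exception
In ARMv8, an exception taken to EL1 uses SPSR_EL1, one taken to EL2 uses SPSR_EL2, and one taken to EL3 uses SPSR_EL3
ELR and SPSR come in pairs, one pair for each ELn
5. SP: each exception level has its own SP: SP_EL0, SP_EL1, SP_EL2 and SP_EL3.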
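The pairing of ELR and SPSR per exception level can be sketched as a small state machine. The Python model below is a deliberate simplification: the class `Core`, its fields, and the choice to keep only the NZCV flags as "processor state" are assumptions for illustration, and the caller supplies the return address and vector address rather than the model computing them.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    el: int = 0      # current exception level
    pc: int = 0
    nzcv: int = 0    # stand-in for the full PSTATE
    elr: dict = field(default_factory=lambda: {1: 0, 2: 0, 3: 0})
    spsr: dict = field(default_factory=lambda: {1: None, 2: None, 3: None})

    def take_exception(self, target_el: int, return_address: int, vector: int) -> None:
        # Exceptions are never taken to EL0 or to a lower level than the current one.
        if target_el not in self.elr or target_el < self.el:
            raise ValueError("invalid target exception level")
        self.elr[target_el] = return_address       # ELR_ELn <- where to resume
        self.spsr[target_el] = (self.el, self.nzcv)  # SPSR_ELn <- state before the exception
        self.el = target_el
        self.pc = vector

    def eret(self) -> None:
        # Return using the ELR/SPSR pair that belongs to the current level.
        if self.el == 0 or self.spsr[self.el] is None:
            raise ValueError("no saved state to return to")
        saved_el, saved_nzcv = self.spsr[self.el]
        self.pc = self.elr[self.el]
        self.el, self.nzcv = saved_el, saved_nzcv
```

Taking an exception from EL0 to EL1 fills the EL1 pair only; an `eret` executed at EL1 then reads that same pair, which is what "ELR and SPSR come in pairs" means in practice.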
Armv8 has 32 128-bit floating-point registers, V0-V31. These 32 registers are used for scalar floating-point operations and NEON instructions.
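Instruction set
A64 instructions have a fixed 32-bit encoding; A32 instructions also have a fixed 32-bit encoding; T32 instructions have a variable-length encoding of 16 or 32 bits.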
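Because T32 mixes 16-bit and 32-bit encodings, a decoder has to inspect the first halfword to know how long an instruction is. The sketch below applies the Thumb-2 rule that a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction; the function name `t32_length` is made up for this example.

```python
def t32_length(first_halfword: int) -> int:
    """Return the length in bytes (2 or 4) of the T32 instruction
    that starts with the given 16-bit halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    if top5 in (0b11101, 0b11110, 0b11111):
        return 4  # first half of a 32-bit encoding
    return 2      # complete 16-bit encoding
```

A64 and A32 need no such check: every instruction is exactly 4 bytes.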
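ARM instructions use a three-address format, as follows:
<opcode>{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
opcode: the operation code, that is, the mnemonic, which indicates the type of operation the instruction performs
cond: the condition code for executing the instruction; it occupies 4 bits in the encoding, 0b0000 to 0b1110 (in A64 only a few instructions, such as B.cond and CSEL, take a condition)
S: the flag-setting option, which decides whether this instruction updates the corresponding status flags in PSTATE
Rd: the destination register; A64 instructions can use X0-X30 or W0-W30
Rn: the register holding the first operand; as with Rd, different instructions have different requirements
shifter_operand: the second operand, which can be an immediate, a register Rm, or a shifted register (Rm, #shift)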
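The 4-bit cond field is evaluated against the N, Z, C and V flags. The following Python sketch lists the fifteen encodings from 0b0000 to 0b1110 with their standard ARM mnemonics and flag tests; the table name `CONDITIONS` and the helper `condition_passed` are hypothetical names used only here.

```python
# cond encoding -> (mnemonic, test on the N, Z, C, V flags)
CONDITIONS = {
    0b0000: ("EQ", lambda n, z, c, v: z == 1),             # equal
    0b0001: ("NE", lambda n, z, c, v: z == 0),             # not equal
    0b0010: ("CS", lambda n, z, c, v: c == 1),             # carry set / unsigned >=
    0b0011: ("CC", lambda n, z, c, v: c == 0),             # carry clear / unsigned <
    0b0100: ("MI", lambda n, z, c, v: n == 1),             # negative
    0b0101: ("PL", lambda n, z, c, v: n == 0),             # positive or zero
    0b0110: ("VS", lambda n, z, c, v: v == 1),             # overflow
    0b0111: ("VC", lambda n, z, c, v: v == 0),             # no overflow
    0b1000: ("HI", lambda n, z, c, v: c == 1 and z == 0),  # unsigned >
    0b1001: ("LS", lambda n, z, c, v: c == 0 or z == 1),   # unsigned <=
    0b1010: ("GE", lambda n, z, c, v: n == v),             # signed >=
    0b1011: ("LT", lambda n, z, c, v: n != v),             # signed <
    0b1100: ("GT", lambda n, z, c, v: z == 0 and n == v),  # signed >
    0b1101: ("LE", lambda n, z, c, v: z == 1 or n != v),   # signed <=
    0b1110: ("AL", lambda n, z, c, v: True),               # always
}


def condition_passed(cond: int, n: int, z: int, c: int, v: int) -> bool:
    """Return True if an instruction with this cond field would execute."""
    _, test = CONDITIONS[cond]
    return test(n, z, c, v)
```

For instance, after a compare that sets Z, `condition_passed(0b0000, 0, 1, 1, 0)` is true, so an instruction conditioned on EQ would execute.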
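Instruction classification
- Branch instructions: conditional and unconditional branches (to an #imm offset or to a register);
- Exception-generating instructions: system call instructions (SVC, HVC, SMC);
- System register instructions: read and write system registers; for example, the MRS and MSR instructions can access the PSTATE fields;
- Data processing instructions: arithmetic, logical, bit manipulation, and shift instructions;
- Load/store memory access instructions: load/store {multiple registers, single register, register pair, non-temporal, unprivileged, exclusive} as well as load-acquire and store-release instructions (A64 has no LDM/STM instructions);
- Coprocessor instructions: A64 has no coprocessor instructions.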
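Common instructions
add: adds the value of one register to the value of another register (or to an immediate) and stores the result in the destination register
add x0, x0, #1 ; add the constant 1 to the value of x0 and store the result in x0
add x0, x1, x2 ; add the values of x1 and x2 and store the result in x0
ldr x0, [x1, x2] ; add has no memory operand form; to use x1 + x2 as an address and load the contents of that address into x0, use ldr
mov: copies the value of one register into another register, or loads a constant (which must be representable as an immediate) into a register; the operand on the right is assigned to the one on the left. The examples below use A32 syntax
mov R1, R0 ; copy the value of R0 into R1
mov PC, R14 ; copy the value of R14 into the PC, commonly used to return from a subroutine
mov R1, R0, LSL#3 ; shift the value of R0 left by 3 bits and store it in R1 (that is, multiply by 8)
movs PC, R14 ; copy the value of R14 into the PC to return to the interrupted code, restoring the CPSR (including the flags) from the SPSR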
sub: subtracts operand 2 from operand 1 and stores the result in the destination register. Operand 1 must be a register; operand 2 may be a register, a shifted register, or an immediate value. This instruction can be used for subtraction of signed or unsigned numbers
sub R0, R1, R2 ; R0 = R1 - R2
sub R0, R1, #256 ; R0 = R1 - 256
sub R0, R2, R3, LSL#1 ; R0 = R2 - (R3 << 1)
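The arithmetic behind the shifted-register examples can be checked with a short model. This Python sketch treats registers as 32-bit unsigned values that wrap around on overflow; `lsl32` and `sub32` are illustrative names, not real instructions or library calls.

```python
MASK32 = 0xFFFFFFFF


def lsl32(value: int, amount: int) -> int:
    """Logical shift left within 32 bits (bits shifted out are lost)."""
    return (value << amount) & MASK32


def sub32(rn: int, operand2: int) -> int:
    """Model SUB Rd, Rn, Operand2: Rd = Rn - Operand2, wrapping at 32 bits."""
    return (rn - operand2) & MASK32


# sub R0, R2, R3, LSL#1 with R2 = 10 and R3 = 3
r0 = sub32(10, lsl32(3, 1))
```

The same bit pattern serves signed and unsigned subtraction: `sub32(0, 1)` gives `0xFFFFFFFF`, which is 4294967295 when read as unsigned and -1 when read as a two's-complement signed value.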
Exception Model and Handler Mode
The exception model
Armv8 has four Exception Levels: EL0, EL1, EL2 and EL3.
Exception level | Typical software |
---|---|
EL0 | Application |
EL1 | OS kernel (for example Linux) |
EL2 | Hypervisor |
EL3 | Secure Monitor |
Each level runs in one of two security states:
Security state | Description |
---|---|
Non-secure | Non-secure EL0/EL1/EL2, can only access Non-secure memory |
Secure | Secure EL0/EL1/EL3, can access Non-secure memory and Secure memory |
Note that the Exception Levels follow these rules:
- For ELx (x<4), the larger x is, the higher the level and the greater the execution privilege
- Execution at EL0 is called unprivileged execution
- EL2 exists only in Non-secure state, not in Secure state (Secure EL2 was added later, in Armv8.4-A)
- EL3 exists only in Secure state and handles switching EL0/EL1 between Secure and Non-secure state
- EL0 and EL1 must be implemented; EL2 and EL3 are optional
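These rules can be written down as a small validity check. The Python sketch below encodes only the rules listed above (EL2 is Non-secure only, EL3 is Secure only, EL2 and EL3 are optional); the function name and its `has_el2` / `has_el3` parameters are assumptions made for the example, and it ignores later architecture extensions and the question of which security states exist when EL3 is not implemented.

```python
def is_valid_state(el: int, secure: bool,
                   has_el2: bool = True, has_el3: bool = True) -> bool:
    """Check whether (exception level, security state) is a legal combination
    under the rules listed above."""
    if el not in (0, 1, 2, 3):
        return False
    if el == 2:
        return has_el2 and not secure  # EL2: Non-secure state only
    if el == 3:
        return has_el3 and secure      # EL3: Secure state only
    return True                        # EL0/EL1 exist in both states


# List every legal combination on a core that implements all four levels.
valid = [(el, "Secure" if s else "Non-secure")
         for el in range(4) for s in (False, True)
         if is_valid_state(el, s)]
```

On a full implementation this leaves six combinations: EL0 and EL1 in both states, Non-secure EL2, and Secure EL3.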