arm简介

British company ARM Holdings develops the architecture and licenses it to other companies, who design their own products that implement one of those architectures‍

Processors that have a RISC architecture typically require fewer transistors than those with a complex instruction set computing (CISC) architecture (such as the x86 processors found in most personal computers), which improves cost, power consumption, and heat dissipation. These characteristics are desirable for light, portable, battery-powered devices‍

Architecture versions ARMv3 to ARMv7 support 32-bit address space (pre-ARMv3 chips, made before ARM Holdings was formed, as used in the Acorn Archimedes, had 26-bit address space) and 32-bit arithmetic; most architectures have 32-bit fixed-length instructions. The Thumb version supports a variable-length instruction set that provides both 32- and 16-bit instructions for improved code density.

Released in 2011, the ARMv8-A architecture added support for a 64-bit address space and 64-bit arithmetic with its new 32-bit fixed-length instruction set.

Since 1995, the ARM Architecture Reference Manual[62] has been the primary source of documentation on the ARM processor architecture and instruction set, distinguishing interfaces that all ARM processors are required to support (such as instruction semantics) from implementation details that may vary. The architecture has evolved over time, and version seven of the architecture, ARMv7, defines three architecture "profiles":

  • A-profile, the "Application" profile, implemented by 32-bit cores in the Cortex-A series and by some non-ARM cores
  • R-profile, the "Real-time" profile, implemented by cores in the Cortex-R series
  • M-profile, the "Microcontroller" profile, implemented by most cores in the Cortex-Mseries

CPU modes[edit]

Except in the M-profile, the 32-bit ARM architecture specifies several CPU modes, depending on the implemented architecture features. At any moment in time, the CPU can be in only one mode, but it can switch modes due to external events (interrupts) or programmatically.[63]

  • User mode: The only non-privileged mode.
  • FIQ mode: A privileged mode that is entered whenever the processor accepts a fast interrupt request.
  • IRQ mode: A privileged mode that is entered whenever the processor accepts an interrupt.
  • Supervisor (svc) mode: A privileged mode entered whenever the CPU is reset or when an SVC instruction is executed.
  • Abort mode: A privileged mode that is entered whenever a prefetch abort or data abort exception occurs.
  • Undefined mode: A privileged mode that is entered whenever an undefined instruction exception occurs.
  • System mode (ARMv4 and above): The only privileged mode that is not entered by an exception. It can only be entered by executing an instruction that explicitly writes to the mode bits of the Current Program Status Register (CPSR) from another privileged mode (not from user mode).
  • Monitor mode (ARMv6 and ARMv7 Security Extensions, ARMv8 EL3): A monitor mode is introduced to support TrustZone extension in ARM cores.
  • Hyp mode (ARMv7 Virtualization Extensions, ARMv8 EL2): A hypervisor mode that supports Popek and Goldberg virtualization requirements for the non-secure operation of the CPU.[64][65]
  • Thread mode (ARMv6-M, ARMv7-M, ARMv8-M): A mode which can be specified as either privileged or unprivileged, while whether Main Stack Pointer (MSP) or Process Stack Pointer (PSP) is used can also be specified in CONTROL register with privileged access. This mode is designed for user tasks in RTOS environment but it's typically used in bare-metal for super-loop.
  • Handler mode (ARMv6-M, ARMv7-M, ARMv8-M): A mode dedicated for exception handling (except the RESET which are handled in Thread mode). Handler mode always uses MSP and works in privileged level.

Instruction set

The 32-bit ARM architecture (and the 64-bit architecture for the most part) includes the following RISC features:

  • Load/store architecture.
  • No support for unaligned memory accesses in the original version of the architecture. ARMv6 and later, except some microcontroller versions, support unaligned accesses for half-word and single-word load/store instructions with some limitations, such as no guaranteed atomicity.[66][67]
  • Uniform 16× 32-bit register file (including the program counter, stack pointer and the link register).
  • Fixed instruction width of 32 bits to ease decoding and pipelining, at the cost of decreased code density. Later, the Thumb instruction set added 16-bit instructions and increased code density.
  • Mostly single clock-cycle execution.

To compensate for the simpler design, compared with processors like the Intel 80286 and Motorola 68020, some additional design features were used:

  • Conditional execution of most instructions reduces branch overhead and compensates for the lack of a branch predictor.
  • Arithmetic instructions alter condition codes only when desired.
  • 32-bit barrel shifter can be used without performance penalty with most arithmetic instructions and address calculations.
  • Has powerful indexed addressing modes.
  • link register supports fast leaf function calls.
  • A simple, but fast, 2-priority-level interrupt subsystem has switched register banks.

Registers

Registers R0 through R7 are the same across all CPU modes; they are never banked.

Registers R8 through R12 are the same across all CPU modes except FIQ mode. FIQ mode has its own distinct R8 through R12 registers.

R13 and R14 are banked across all privileged CPU modes except system mode. That is, each mode that can be entered because of an exception has its own R13 and R14. These registers generally contain the stack pointer and the return address from function calls, respectively.

扫描二维码关注公众号,回复: 2174722 查看本文章

Aliases:

  • R13 is also referred to as SP, the Stack Pointer.
  • R14 is also referred to as LR, the Link Register.
  • R15 is also referred to as PC, the Program Counter.

The Current Program Status Register (CPSR) has the following 32 bits.[71]

  • M (bits 0–4) is the processor mode bits.
  • T (bit 5) is the Thumb state bit.
  • F (bit 6) is the FIQ disable bit.
  • I (bit 7) is the IRQ disable bit.
  • A (bit 8) is the imprecise data abort disable bit.
  • E (bit 9) is the data endianness bit.
  • IT (bits 10–15 and 25–26) is the if-then state bits.
  • GE (bits 16–19) is the greater-than-or-equal-to bits.
  • DNM (bits 20–23) is the do not modify bits.
  • J (bit 24) is the Java state bit.
  • Q (bit 27) is the sticky overflow bit.
  • V (bit 28) is the overflow bit.
  • C (bit 29) is the carry/borrow/extend bit.
  • Z (bit 30) is the zero bit.
  • N (bit 31) is the negative/less than bit.

The interrupts are enabled and disabled by setting a bit inthe Processor Status Registers (PSR or CPSR where C stands for current). TheCPSR also controls the processor mode (SVC, System, User etc.) and whether theprocessor is decoding Thumb instructions. The top 4 bits of the PSR are reservedfor the conditional execution flags. In a privileged mode the program has full readand write access to CPSR but in an non-privileged mode the program can only readthe CPSR. When an interrupt or exception occurs the processor will go into the correspondinginterrupt or exception mode and by doing so a subset of the main registerswill be banked or swapped out and replaced with a set of mode registers.

Note: In privileged modes there is another register for each mode called Saved ProcessorStatus Register (SPSR). The SPSR is used to save the current CPSR beforechanging modes.

The User Mode (denoted with *) is the only mode that is a non-privileged mode. Ifthe CPSR is set to User mode then the only way for the processor to enter a privilegedmode is to either execute a SWI/Undefined instruction or if an exceptionoccur during code execution. The SWI call is normally provided as a function ofthe RTOS. The SWI will have to set the CPSR mode to SVC or SYS and thenreturn to the halted program.

The processor automatically copies the CPSR into the SPSR for the mode beingentered when an exception occurs. This allows the original processor state to berestored when the exception handler returns to normal program execution.

Conditional execution

Almost every ARM instruction has a conditional execution feature called predication, which is implemented with a 4-bit condition code selector (the predicate). To allow for unconditional execution, one of the four-bit codes causes the instruction to be always executed. Most other CPU architectures only have condition codes on branch instructions.

Though the predicate takes up four of the 32 bits in an instruction code, and thus cuts down significantly on the encoding bits available for displacements in memory access instructions, it avoids branch instructions when generating code for small if statements. Apart from eliminating the branch instructions themselves, this preserves the fetch/decode/execute pipeline at the cost of only one cycle per skipped instruction.

Coprocessors

The ARM architecture (pre-ARMv8) provides a non-intrusive way of extending the instruction set using "coprocessors" that can be addressed using MCR, MRC, MRRC, MCRR and similar instructions. The coprocessor space is divided logically into 16 coprocessors with numbers from 0 to 15, coprocessor 15 (cp15) being reserved for some typical control functions like managing the caches and MMUoperation on processors that have one.

In ARM-based machines, peripheral devices are usually attached to the processor by mapping their physical registers into ARM memory space, into the coprocessor space, or by connecting to another device (a bus) that in turn attaches to the processor. Coprocessor accesses have lower latency, so some peripherals—for example, an XScale interrupt controller—are accessible in both ways: through memory and through coprocessors.

Debugging

All modern ARM processors include hardware debugging facilities, allowing software debuggers to perform operations such as halting, stepping, and breakpointing of code starting from reset. These facilities are built using JTAG support, though some newer cores optionally support ARM's own two-wire "SWD" protocol. In ARM7TDMI cores, the "D" represented JTAG debug support, and the "I" represented presence of an "EmbeddedICE" debug module. For ARM7 and ARM9 core generations, EmbeddedICE over JTAG was a de facto debug standard, though not architecturally guaranteed.

Thumb

To improve compiled code-density, processors since the ARM7TDMI (released in 1994[75]) have featured the Thumb instruction set, which have their own state. (The "T" in "TDMI" indicates the Thumb feature.) When in this state, the processor executes the Thumb instruction set, a compact 16-bit encoding for a subset of the ARM instruction set.[76] Most of the Thumb instructions are directly mapped to normal ARM instructions. The space-saving comes from making some of the instruction operands implicit and limiting the number of possibilities compared to the ARM instructions executed in the ARM instruction set state.

A processor in ARM state executes ARM instructions, and a processor in Thumb state executes Thumb instructions.
A processor in Thumb state can enter ARM state by executing any of the following instructions: BX, BLX, or an LDR
or LDM that loads the PC.
A processor in ARM state can enter Thumb state by executing any of the same instructions.
In ARMv7, a processor in ARM state can also enter Thumb state by executing an ADC, ADD, AND, ASR, BIC, EOR, LSL,
LSR, MOV, MVN, ORR, ROR, RRX, RSB, RSC, SBC, or SUB instruction that has the PC as destination register and does not set the
condition flags.

Thumb-2

Thumb-2 technology was introduced in the ARM1156 core, announced in 2003. Thumb-2 extends the limited 16-bit instruction set of Thumb with additional 32-bit instructions to give the instruction set more breadth, thus producing a variable-length instruction set. A stated aim for Thumb-2 was to achieve code density similar to Thumb with performance similar to the ARM instruction set on 32-bit memory.

TrustZone (for Cortex-A profile)

The Security Extensions, marketed as TrustZone Technology, is in ARMv6KZ and later application profile architectures. It provides a low-cost alternative to adding another dedicated security core to an SoC, by providing two virtual processors backed by hardware based access control. This lets the application core switch between two states, referred to as worlds (to reduce confusion with other names for capability domains), in order to prevent information from leaking from the more trusted world to the less trusted world. This world switch is generally orthogonal to all other capabilities of the processor, thus each world can operate independently of the other while using the same core. Memory and peripherals are then made aware of the operating world of the core and may use this to provide access control to secrets and code on the device.[91]

Typically, a rich operating system is run in the less trusted world, with smaller security-specialized code in the more trusted world, aiming to reduce the attack surface. Typical applications include DRM functionality for controlling the use of media on ARM-based devices,[92] and preventing any unapproved use of the device.


参考文档:

wiki:https://en.wikipedia.org/wiki/ARM_architecture

官方网站:https://www.arm.com/

http://infocenter.arm.com/help/index.jsp

http://www.ece.umd.edu/courses/enee447.S2016/ARM-Documentation/

猜你喜欢

转载自blog.csdn.net/whuzm08/article/details/80437432
今日推荐