ARM体系架构

ARM family ARM architecture ARM core Feature Cache (I / D), MMU Typical MIPS @ MHz
ARM1 ARMv1 ARM1 First implementation None  
ARM2 ARMv2 ARM2 ARMv2 added the MUL (multiply) instruction None 4 MIPS @ 8 MHz
0.33 DMIPS/MHz
ARMv2a ARM250 Integrated MEMC (MMU), graphics and I/O processor. ARMv2a added the SWP and SWPB (swap) instructions None, MEMC1a 7 MIPS @ 12 MHz
ARM3 ARMv2a ARM3 First integrated memory cache 4 KB unified 12 MIPS @ 25 MHz
0.50 DMIPS/MHz
ARM6 ARMv3 ARM60 ARMv3 first to support 32-bit memory address space (previously 26-bit).
ARMv3M first added long multiply instructions (32x32=64).
None 10 MIPS @ 12 MHz
ARM600 As ARM60, cache and coprocessor bus (for FPA10 floating-point unit) 4 KB unified 28 MIPS @ 33 MHz
ARM610 As ARM60, cache, no coprocessor bus 4 KB unified 17 MIPS @ 20 MHz
0.65 DMIPS/MHz
ARM7 ARMv3 ARM700   8 KB unified 40 MHz
ARM710 As ARM700, no coprocessor bus 8 KB unified 40 MHz
ARM710a As ARM710 8 KB unified 40 MHz
0.68 DMIPS/MHz
ARM7T ARMv4T ARM7TDMI(-S) 3-stage pipeline, Thumb, ARMv4 first to drop legacy ARM 26-bit addressing None 15 MIPS @ 16.8 MHz
63 DMIPS @ 70 MHz
ARM710T As ARM7TDMI, cache 8 KB unified, MMU 36 MIPS @ 40 MHz
ARM720T As ARM7TDMI, cache 8 KB unified, MMU with FCSE (Fast Context Switch Extension) 60 MIPS @ 59.8 MHz
ARM740T As ARM7TDMI, cache MPU  
ARM7EJ ARMv5TEJ ARM7EJ-S 5-stage pipeline, Thumb, Jazelle DBX, enhanced DSP instructions None  
ARM8 ARMv4 ARM810 5-stage pipeline, static branch prediction, double-bandwidth memory 8 KB unified, MMU 84 MIPS @ 72 MHz
1.16 DMIPS/MHz
ARM9T ARMv4T ARM9TDMI 5-stage pipeline, Thumb None  
ARM920T As ARM9TDMI, cache 16 KB / 16 KB, MMU with FCSE (Fast Context Switch Extension) 200 MIPS @ 180 MHz
ARM922T As ARM9TDMI, caches 8 KB / 8 KB, MMU  
ARM940T As ARM9TDMI, caches 4 KB / 4 KB, MPU  
ARM9E ARMv5TE ARM946E-S Thumb, enhanced DSP instructions, caches Variable, tightly coupled memories, MPU  
ARM966E-S Thumb, enhanced DSP instructions No cache, TCMs  
ARM968E-S As ARM966E-S No cache, TCMs  
ARMv5TEJ ARM926EJ-S Thumb, Jazelle DBX, enhanced DSP instructions Variable, TCMs, MMU 220 MIPS @ 200 MHz
ARMv5TE ARM996HS Clockless processor, as ARM966E-S No caches, TCMs, MPU  
ARM10E ARMv5TE ARM1020E 6-stage pipeline, Thumb, enhanced DSP instructions, (VFP) 32 KB / 32 KB, MMU  
ARM1022E As ARM1020E 16 KB / 16 KB, MMU  
ARMv5TEJ ARM1026EJ-S Thumb, Jazelle DBX, enhanced DSP instructions, (VFP) Variable, MMU or MPU  
ARM11 ARMv6 ARM1136J(F)-S 8-stage pipeline, SIMD, Thumb, Jazelle DBX, (VFP), enhanced DSP instructions, unaligned memory access Variable, MMU 740 @ 532–665 MHz (i.MX31 SoC), 400–528 MHz
ARMv6T2 ARM1156T2(F)-S 9-stage pipeline, SIMD, Thumb-2, (VFP), enhanced DSP instructions Variable, MPU  
ARMv6Z ARM1176JZ(F)-S As ARM1136EJ(F)-S Variable, MMU + TrustZone 965 DMIPS @ 772 MHz, up to 2,600 DMIPS with four processors
ARMv6K ARM11MPCore As ARM1136EJ(F)-S, 1–4 core SMP Variable, MMU  
SecurCore ARMv6-M SC000 As Cortex-M0   0.9 DMIPS/MHz
ARMv4T SC100 As ARM7TDMI    
ARMv7-M SC300 As Cortex-M3   1.25 DMIPS/MHz
Cortex-M ARMv6-M Cortex-M0 Microcontroller profile, most Thumb + some Thumb-2,hardware multiply instruction (optional small), optional system timer, optional bit-banding memory Optional cache, no TCM, no MPU 0.84 DMIPS/MHz
Cortex-M0+ Microcontroller profile, most Thumb + some Thumb-2,hardware multiply instruction (optional small), optional system timer, optional bit-banding memory Optional cache, no TCM, optional MPU with 8 regions 0.93 DMIPS/MHz
Cortex-M1 Microcontroller profile, most Thumb + some Thumb-2,hardware multiply instruction (optional small), OS option adds SVC / banked stack pointer, optional system timer, no bit-banding memory Optional cache, 0–1024 KB I-TCM, 0–1024 KB D-TCM, no MPU 136 DMIPS @ 170 MHz,(0.8 DMIPS/MHz FPGA-dependent)
ARMv7-M Cortex-M3 Microcontroller profile, Thumb / Thumb-2, hardware multiply and divide instructions, optional bit-banding memory Optional cache, no TCM, optional MPU with 8 regions 1.25 DMIPS/MHz
ARMv7E-M Cortex-M4 Microcontroller profile, Thumb / Thumb-2 / DSP / optional VFPv4-SP single-precision FPU, hardware multiply and divide instructions, optional bit-banding memory Optional cache, no TCM, optional MPU with 8 regions 1.25 DMIPS/MHz (1.27 w/FPU)
Cortex-M7 Microcontroller profile, Thumb / Thumb-2 / DSP / optional VFPv5 single and double precision FPU, hardware multiply and divide instructions 0−64 KB I-cache, 0−64 KB D-cache, 0–16 MB I-TCM, 0–16 MB D-TCM (all these w/optional ECC), optional MPU with 8 or 16 regions 2.14 DMIPS/MHz
ARMv8-M Baseline Cortex-M23 Microcontroller profile, Thumb-1 (most), Thumb-2 (some), Divide, TrustZone Optional cache, no TCM, optional MPU with 16 regions 0.99 DMIPS/MHz
ARMv8-M Mainline Cortex-M33 Microcontroller profile, Thumb-1, Thumb-2, Saturated, DSP, Divide, FPU (SP), TrustZone, Co-processor Optional cache, no TCM, optional MPU with 16 regions 1.50 DMIPS/MHz
Cortex-M35P Microcontroller profile, Thumb-1, Thumb-2, Saturated, DSP, Divide, FPU (SP), TrustZone, Co-processor Built-in cache (with option 2–16 KB), I-cache, no TCM, optional MPU with 16 regions 1.50 DMIPS/MHz
ARMv8.1-M Mainline Cortex-M55      
Cortex-R ARMv7-R Cortex-R4 Real-time profile, Thumb / Thumb-2 / DSP / optional VFPv3 FPU, hardware multiply and optional divide instructions, optional parity & ECC for internal buses / cache / TCM, 8-stage pipeline dual-core running lockstep with fault logic 0–64 KB / 0–64 KB, 0–2 of 0–8 MB TCM, opt. MPU with 8/12 regions 1.67 DMIPS/MHz
Cortex-R5 Real-time profile, Thumb / Thumb-2 / DSP / optional VFPv3 FPU and precision, hardware multiply and optional divide instructions, optional parity & ECC for internal buses / cache / TCM, 8-stage pipeline dual-core running lock-step with fault logic / optional as 2 independent cores, low-latency peripheral port (LLPP), accelerator coherency port (ACP) 0–64 KB / 0–64 KB, 0–2 of 0–8 MB TCM, opt. MPU with 12/16 regions 1.67 DMIPS/MHz[25]
Cortex-R7 Real-time profile, Thumb / Thumb-2 / DSP / optional VFPv3 FPU and precision, hardware multiply and optional divide instructions, optional parity & ECC for internal buses / cache / TCM, 11-stage pipeline dual-core running lock-step with fault logic / out-of-order execution / dynamic register renaming / optional as 2 independent cores, low-latency peripheral port (LLPP), ACP 0–64 KB / 0–64 KB, ? of 0–128 KB TCM, opt. MPU with 16 regions 2.50 DMIPS/MHz
Cortex-R8 TBD 0–64 KB / 0–64 KB L1, 0–1 / 0–1 MB TCM, opt MPU with 24 regions 2.50 DMIPS/MHz
ARMv8-R Cortex-R52 TBD 0–32 KB / 0–32 KB L1, 0–1 / 0–1 MB TCM, opt MPU with 24+24 regions 2.16 DMIPS/MHz
Cortex-R82 TBD 16–128 KB /16–64 KB L1, 64K–1MB L2, 0.16–1 / 0.16–1 MB TCM,opt MPU with 32+32 regions 3.41 DMIPS/MHz
Cortex-A(32-bit) ARMv7-A Cortex-A5 Application profile, ARM / Thumb / Thumb-2 / DSP / SIMD / Optional VFPv4-D16 FPU / Optional NEON / Jazelle RCT and DBX, 1–4 cores / optional MPCore, snoop control unit (SCU), generic interrupt controller (GIC), accelerator coherence port (ACP) 4−64 KB / 4−64 KB L1, MMU + TrustZone 1.57 DMIPS/MHz per core
Cortex-A7 Application profile, ARM / Thumb / Thumb-2 / DSP / VFPv4 FPU / NEON / Jazelle RCT and DBX / Hardware virtualization, in-order execution, superscalar, 1–4 SMP cores, MPCore, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), architecture and feature set are identical to A15, 8–10 stage pipeline, low-power design 8−64 KB / 8−64 KB L1, 0–1 MB L2, MMU + TrustZone 1.9 DMIPS/MHz per core
Cortex-A8 Application profile, ARM / Thumb / Thumb-2 / VFPv3 FPU / NEON / Jazelle RCT and DAC, 13-stage superscalar pipeline 16–32 KB / 16–32 KB L1, 0–1 MB L2 opt. ECC, MMU + TrustZone Up to 2000 (2.0 DMIPS/MHz in speed from 600 MHz to greater than 1 GHz)
Cortex-A9 Application profile, ARM / Thumb / Thumb-2 / DSP / Optional VFPv3 FPU / Optional NEON / Jazelle RCT and DBX, out-of-order speculative issue superscalar, 1–4 SMP cores, MPCore, snoop control unit (SCU), generic interrupt controller (GIC), accelerator coherence port (ACP) 16–64 KB / 16–64 KB L1, 0–8 MB L2 opt. parity, MMU + TrustZone 2.5 DMIPS/MHz per core, 10,000 DMIPS @ 2 GHz on Performance Optimized TSMC 40G (dual-core)
Cortex-A12 Application profile, ARM / Thumb-2 / DSP / VFPv4 FPU / NEON / Hardware virtualization, out-of-order speculative issue superscalar, 1–4 SMP cores, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), accelerator coherence port (ACP) 32−64 KB 3.0 DMIPS/MHz per core
Cortex-A15 Application profile, ARM / Thumb / Thumb-2 / DSP / VFPv4 FPU / NEON / integer divide / fused MAC / Jazelle RCT / hardware virtualization, out-of-order speculative issue superscalar, 1–4 SMP cores, MPCore, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), ACP, 15-24 stage pipeline 32 KB w/parity / 32 KB w/ECC L1, 0–4 MB L2, L2 has ECC, MMU + TrustZone At least 3.5 DMIPS/MHz per core (up to 4.01 DMIPS/MHz depending on implementation)[41]
Cortex-A17 Application profile, ARM / Thumb / Thumb-2 / DSP / VFPv4 FPU / NEON / integer divide / fused MAC / Jazelle RCT / hardware virtualization, out-of-order speculative issue superscalar, 1–4 SMP cores, MPCore, Large Physical Address Extensions (LPAE), snoop control unit (SCU), generic interrupt controller (GIC), ACP 32 KB L1, 256 KB–8 MB L2 w/optional ECC 2.8 DMIPS/MHz
ARMv8-A Cortex-A32 Application profile, AArch32, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, dual issue, in-order pipeline 8–64 KB w/optional parity / 8−64 KB w/optional ECC L1 per core, 128 KB–1 MB L2 w/optional ECC shared  
Cortex-A(64-bit) ARMv8-A Cortex-A34 Application profile, AArch64, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 2-width decode, in-order pipeline 8−64 KB w/parity / 8−64 KB w/ECC L1 per core, 128 KB–1 MB L2 shared, 40-bit physical addresses  
Cortex-A35 Application profile, AArch32 and AArch64, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 2-width decode, in-order pipeline 8−64 KB w/parity / 8−64 KB w/ECC L1 per core, 128 KB–1 MB L2 shared, 40-bit physical addresses 1.78 DMIPS/MHz
Cortex-A53 Application profile, AArch32 and AArch64, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 2-width decode, in-order pipeline 8−64 KB w/parity / 8−64 KB w/ECC L1 per core, 128 KB–2 MB L2 shared, 40-bit physical addresses 2.3 DMIPS/MHz
Cortex-A57 Application profile, AArch32 and AArch64, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 3-width decode superscalar, deeply out-of-order pipeline 48 KB w/DED parity / 32 KB w/ECC L1 per core; 512 KB–2 MB L2 shared w/ECC; 44-bit physical addresses 4.1–4.5 DMIPS/MHz
Cortex-A72 Application profile, AArch32 and AArch64, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 3-width superscalar, deeply out-of-order pipeline 48 KB w/DED parity / 32 KB w/ECC L1 per core; 512 KB–2 MB L2 shared w/ECC; 44-bit physical addresses 4.7 DMIPS/MHz
Cortex-A73 Application profile, AArch32 and AArch64, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 2-width superscalar, deeply out-of-order pipeline 64 KB / 32−64 KB L1 per core, 256 KB–8 MB L2 shared w/ optional ECC, 44-bit physical addresses 4.8 DMIPS/MHz
ARMv8.2-A Cortex-A55 Application profile, AArch32 and AArch64, 1–8 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 2-width decode, in-order pipeline 16−64 KB / 16−64 KB L1, 256 KB L2 per core, 4 MB L3 shared  
Cortex-A65 Application profile, AArch64, 1–8 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 2-wide decode superscalar, 3-width issue, out-of-order pipeline, SMT    
Cortex-A65AE As ARM Cortex-A65, adds dual core lockstep for safety applications 64 / 64 KB L1, 256 KB L2 per core, 4 MB L3 shared  
Cortex-A75 Application profile, AArch32 and AArch64, 1–8 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 3-width decode superscalar, deeply out-of-order pipeline 64 / 64 KB L1, 512 KB L2 per core, 4 MB L3 shared  
Cortex-A76 Application profile, AArch32 (non-privileged level or EL0 only) and AArch64, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 4-width decode superscalar, 8-way issue, 13 stage pipeline, deeply out-of-order pipeline 64 / 64 KB L1, 256−512 KB L2 per core, 512 KB−4 MB L3 shared  
Cortex-A76AE As ARM Cortex-A76, adds dual core lockstep for safety applications    
Cortex-A77 Application profile, AArch32 (non-privileged level or EL0 only) and AArch64, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 4-width decode superscalar, 6-width instruction fetch, 12-way issue, 13 stage pipeline, deeply out-of-order pipeline 1.5K L0 MOPs cache, 64 / 64 KB L1, 256−512 KB L2 per core, 512 KB−4 MB L3 shared  
Cortex-A78      
Cortex-A78AE As ARM Cortex-A78, adds dual core lockstep for safety applications    
Cortex-X1 Performance-tuned variant of Cortex-A78    
Cortex-A78C      
ARMv9-A Cortex-A710 First Armv9 generation “big” CPU    
Neoverse   Neoverse N1 Application profile, AArch32 (non-privileged level or EL0 only) and AArch64, 1–4 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 4-width decode superscalar, 8-way dispatch/issue, 13 stage pipeline, deeply out-of-order pipeline 64 / 64 KB L1, 512−1024 KB L2 per core, 2−128 MB L3 shared, 128 MB system level cache  
Neoverse E1 Application profile, AArch64, 1–8 SMP cores, TrustZone, NEON advanced SIMD, VFPv4, hardware virtualization, 2-wide decode superscalar, 3-width issue, 10 stage pipeline, out-of-order pipeline, SMT 32−64 KB / 32−64 KB L1, 256 KB L2 per core, 4 MB L3 shared  
{{o.name}}
{{m.name}}

猜你喜欢

转载自my.oschina.net/u/4702401/blog/5300533