『ARM』 and 『x86』 processor architecture analysis guide

Preface

If you ask people whether they know what a CPU is, I doubt anyone will say no. But if you go on to ask whether they understand the ARM and x86 architectures and how the two differ, I believe some will be left speechless.

At present, rapid iteration in deep learning, high-performance computing, NLP, AIGC, GLM, and AGI is driving the fast development of large models. Demand for combined computing power (CPU+GPU) keeps rising, so not understanding the CPU is no longer an option. This article therefore takes an in-depth look at CPU architecture and analyzes the two mainstream CPU architectures: ARM and x86.


Introduction

The Central Processing Unit (CPU) is the computing and control core of a computer. The CPU, internal memory, and input/output devices are the three core components of an electronic computer. The CPU's main function is to interpret computer instructions and process the data in computer software.

A CPU is composed of arithmetic units, controllers, and registers, together with the data, control, and status buses that connect them. The operation of almost all CPUs can be divided into four stages: Fetch, Decode, Execute, and Writeback. The CPU fetches instructions from memory or cache, places them in the instruction register, decodes them, executes them, and writes the results back.
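The four stages can be illustrated with a toy interpreter. This is only a sketch: the three-operand instruction format, the opcode names, and the register names are invented for illustration and do not correspond to any real instruction set.

```python
# Toy CPU loop: a hypothetical instruction format, not a real ISA.
def run(program, registers):
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        instruction = program[pc]              # Fetch
        op, dst, src1, src2 = instruction      # Decode
        a, b = registers[src1], registers[src2]
        if op == "add":                        # Execute
            result = a + b
        elif op == "sub":
            result = a - b
        elif op == "mul":
            result = a * b
        else:
            raise ValueError(f"unknown opcode: {op}")
        registers[dst] = result                # Writeback
        pc += 1
    return registers


program = [
    ("add", "r2", "r0", "r1"),  # r2 = r0 + r1
    ("mul", "r3", "r2", "r2"),  # r3 = r2 * r2
    ("sub", "r0", "r3", "r1"),  # r0 = r3 - r1
]
final = run(program, {"r0": 2, "r1": 3, "r2": 0, "r3": 0})
print(final)
```

Real processors overlap these stages across several instructions at once (pipelining), which this sequential loop does not attempt to model.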

Processor series

x86

Intel series: Celeron, Pentium and Core from low-end to high-end

AMD series: Sempron, Athlon, and Phenom, from low-end to high-end

Because Intel is the leader in x86 architecture processors, "Intel processor" is sometimes used below to mean an x86 architecture processor.

Note: These two companies mainly make x86 processors, but they have also made ARM processors. For example, in 2016 AMD launched the Opteron A1100, a processor based on the ARM architecture.

ARM

This covers the processor designs of the British company ARM and of all companies licensed by ARM, such as Qualcomm, Apple (whose early chips were manufactured by Samsung), Samsung, and Huawei.

Instruction set differences

To understand x86 and ARM, you first need to understand the Complex Instruction Set Computer (CISC) and the Reduced Instruction Set Computer (RISC).


Complex instruction set

In a CISC microprocessor, the instructions of a program are executed serially in order, and the operations within each instruction are also executed serially in order. The advantage of sequential execution is simple control, but utilization of the computer's various parts is low and execution is slow. CISC servers are mainly based on the IA-32 architecture (Intel Architecture), and most of them are mid- to low-end servers.

The instruction system is relatively rich, with special instructions dedicated to specific functions, so special tasks are processed more efficiently. There are many instructions that operate on memory, and they operate on it directly. The chip contains a wealth of circuit units, so it is powerful but has a large area and high power consumption.

Applicable fields: PCs and servers

Note: x86 architecture mainly uses a complex instruction set

Reduced instruction set

RISC refers to microprocessors that execute fewer types of computer instructions. It originated with the MIPS machines (i.e., RISC machines) of the 1980s. The microprocessors used in RISC machines are collectively called RISC processors.

RISC processors are able to perform operations faster (more millions of instructions per second, or MIPS). Because a computer requires additional transistors and circuit elements to execute each type of instruction, a larger instruction set makes the microprocessor more complex and slower at performing operations.

Because designers focus on the most frequently used instructions and try to make them simple and efficient, less common functions are completed by combining instructions. As a result, implementing special functions on RISC machines may be less efficient, but this can be compensated for with pipelining and superscalar techniques.

Memory operations are restricted, which simplifies control. The chip contains fewer circuit units, so it has a small area and low power consumption.
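The difference in memory handling can be sketched with a toy model. It assumes, as is typical, that a CISC-style instruction may read and write memory directly while a RISC-style design reaches memory only through separate load and store steps. The function names, the dictionary-based memory, and the scratch register `t0` are all hypothetical.

```python
# Hypothetical toy machine: memory maps address -> value, registers map name -> value.
def cisc_add_to_memory(memory, registers, addr, reg):
    """One CISC-style instruction: memory[addr] += registers[reg]."""
    memory[addr] = memory[addr] + registers[reg]
    return 1  # number of instructions issued


def risc_add_to_memory(memory, registers, addr, reg, tmp="t0"):
    """The same effect using load/store-style steps."""
    registers[tmp] = memory[addr]                     # LOAD  tmp <- [addr]
    registers[tmp] = registers[tmp] + registers[reg]  # ADD   tmp <- tmp + reg
    memory[addr] = registers[tmp]                     # STORE [addr] <- tmp
    return 3  # number of instructions issued


cisc_mem, cisc_regs = {0x10: 40}, {"r1": 2}
risc_mem, risc_regs = {0x10: 40}, {"r1": 2, "t0": 0}
cisc_count = cisc_add_to_memory(cisc_mem, cisc_regs, 0x10, "r1")
risc_count = risc_add_to_memory(risc_mem, risc_regs, 0x10, "r1")
print(cisc_mem[0x10], cisc_count)
print(risc_mem[0x10], risc_count)
```

Both versions leave the same value in memory. The RISC version issues more instructions, but each one is simple and uniform, which is what makes the control logic easier to build.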

Applicable fields: Mobile devices and embedded systems

Note: ARM architecture mainly uses a reduced instruction set

Summary

In terms of CPU power consumption, RISC and CISC have taken two different paths. CISC takes the performance route: performance comes first and power consumption second. Designers constantly work on heat dissipation, using metal heat sinks, fans, water cooling, and other devices, because high power consumption causes no obvious problem in a PC. RISC takes the low-power route and targets battery-powered scenarios: low power consumption is the first principle, followed by performance.

64-bit computing

x86

AMD was the first to develop a 64-bit version of the x86 instruction set. This 64-bit instruction set is called x86-64 (x64 for short).

Intel believed that evolving its 32-bit x86 architecture into a 64-bit architecture would be inefficient, so it created a new 64-bit processor project called IA64, thus creating the Itanium series of processors

Later, AMD knew that it could not build a processor compatible with IA64, so it extended x86 to include 64-bit addressing and 64-bit registers. The resulting architecture was AMD64, which eventually became the standard for 64-bit x86 processors. In the end, Intel completely abandoned the Itanium series and adopted AMD64.
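To give a sense of what 64-bit addressing buys, the short calculation below compares the theoretical address space of 32-bit and 64-bit pointers. It is a back-of-the-envelope sketch: the helper name is made up, and real x86-64 processors implement fewer than 64 address bits in hardware.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits):
    """Theoretical number of bytes addressable with the given pointer width."""
    return 2 ** address_bits


print(addressable_bytes(32) // GIB)  # 4 GiB
print(addressable_bytes(64) // GIB)  # 17179869184 GiB (16 EiB)
```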

ARM

After seeing the demand for 64-bit computing in mobile devices, ARM released the ARMv8 64-bit architecture in 2011. Based on the original principles and instruction sets, ARM developed a concise 64-bit architecture. ARMv8 uses two execution modes, AArch32 and AArch64

The ingenuity of the ARM design is that the processor can seamlessly switch between the two modes during operation. This means that the decoder for 64-bit instructions could be designed from scratch, without having to consider 32-bit instructions, while the processor remains backward compatible.
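In practice, software often needs to know which of these architectures it is running on. The sketch below uses Python's standard `platform.machine()` and maps its result to a family. The list of names is an assumption based on values commonly reported by operating systems, not an exhaustive table, and `classify_machine` is a hypothetical helper.

```python
import platform


def classify_machine(name):
    """Map a machine string to an architecture family (common names only)."""
    name = name.lower()
    if name in ("x86_64", "amd64"):
        return "x86 (64-bit)"
    if name in ("aarch64", "arm64"):
        return "ARM (64-bit, AArch64)"
    if name in ("i386", "i686", "x86"):
        return "x86 (32-bit)"
    if name.startswith("arm"):
        return "ARM (32-bit)"
    return "other"


# Reports the family of the machine this script runs on.
print(classify_machine(platform.machine()))
```

Note that the same 64-bit x86 hardware is reported as `x86_64` on some systems and `AMD64` on others, a visible trace of the history described above.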

Heterogeneous computing


ARM's big.LITTLE architecture addresses a challenge facing the industry today: how to create a System on Chip (SoC) that is both high-performance and extremely energy-efficient, in order to extend battery life.

In the big.LITTLE architecture, the cores in a processor can be of different types. A traditional dual-core or quad-core processor contains 2 or 4 identical cores. A dual-core Atom processor, for example, has two identical cores that provide the same performance and consume the same power. With big.LITTLE, ARM brings heterogeneous computing to mobile devices: the cores in a processor can differ in performance and power consumption. When the device is under normal load, the low-power cores are used, and when you run a complex game, the high-performance cores are used.

Efficient and seamless switching of workloads between the two processors in a big.LITTLE system is achieved through advanced ARM system IP, which ensures full cache and I/O coherency between the Cortex-A15 and Cortex-A7 processors.
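The core-selection idea can be sketched as a simple threshold rule. This is a deliberately simplified illustration, not how a real big.LITTLE scheduler works: the threshold value, the load figures, and the function name are all assumptions, and real schedulers use much richer signals.

```python
# Assumed load fraction above which a task is moved to a big core.
BIG_CORE_THRESHOLD = 0.6


def pick_core(load):
    """Return the core type for a task with the given load (0.0 to 1.0)."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    return "big" if load > BIG_CORE_THRESHOLD else "LITTLE"


# Hypothetical workloads with made-up load values.
workloads = {"idle screen": 0.05, "music playback": 0.2, "3D game": 0.9}
placement = {name: pick_core(load) for name, load in workloads.items()}
print(placement)
```

Light tasks stay on the LITTLE core to save power, and only the demanding task is placed on the big core.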

If you want to learn more about the underlying principles and mechanisms of big.LITTLE, you can view the following explanation on the official ARM website.

Official explanation:

Power consumption comparison

In the past, the stereotype was that low-power processors and high-computing-power processors were clearly separated: the x86 architecture was for high computing power, and the ARM architecture was for low power consumption.

But since Apple released the M1 chip (the current M2 chip far exceeds the performance of the equivalent x86 processor), and with the rapid development of other ARM processors, people have suddenly realized that ARM, which started with low power consumption, can also deliver high computing power, and can truly achieve higher performance with lower power consumption.

According to data provided by Ampere in 2022, its CPU performance is 3 times higher than that of traditional x86 processors, and its performance-to-power ratio is nearly 4 times ahead. Compared with x86 server CPUs, the Ampere Altra series can provide 200% of the performance while using 50% of the energy.
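The "nearly 4 times" performance-to-power figure follows from the other two numbers, as this small calculation shows. It treats the quoted 200% and 50% as values relative to an x86 baseline of 1.0 and assumes both are measured over the same workload. The function name is hypothetical.

```python
def relative_perf_per_watt(relative_performance, relative_energy):
    """Performance-per-energy ratio relative to a baseline of 1.0."""
    return relative_performance / relative_energy


# 200% of the performance at 50% of the energy, per the Ampere figures quoted above.
ratio = relative_perf_per_watt(2.0, 0.5)
print(ratio)  # 4.0
```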

Arm server CPUs will further widen the performance gap with x86 CPUs




Origin blog.csdn.net/m0_63748493/article/details/133934230