A summary of some basic knowledge about the CPU

The CPU and program execution

The CPU is the brain of the computer.

  1. Running a program really means executing a large number of instructions, both those the program itself involves and those it does not.
    Once the part of the program to be executed has been loaded into memory, the CPU fetches an instruction from memory, decodes it (to determine its type and operands; put simply, the CPU needs to know what instruction this is), and then executes it. It then fetches the next instruction, decodes it, executes it, and so on until the program exits.
  2. These three steps (fetch, decode, execute) make up one basic CPU cycle.
  3. Each CPU has its own specific set of instructions that it can execute (note that these instructions are provided by the CPU itself; tools such as CPU-Z can display them).
    Because different CPU architectures have different instruction sets, an x86 processor cannot execute ARM programs, and an ARM processor cannot execute x86 programs.
    Note on hardware-level versus software-level instruction sets: the hardware instruction set is the set of executable instructions that the CPU itself provides at the hardware level. A software instruction set is the set of commands provided by a language's libraries; once that language's libraries are installed, those commands can be executed.
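
The fetch, decode, execute cycle from items 1 and 2 can be sketched as a small interpreter loop. This is only a toy model: the instruction names (LOAD, ADD, STORE, HALT), the tuple encoding, and the four registers are assumptions made up for illustration and do not correspond to any real instruction set.

```python
# Toy machine: the instruction names and encoding below are hypothetical,
# invented for illustration; they do not match any real instruction set.

def run(program, memory):
    """Run `program` (a list of instruction tuples) until HALT."""
    registers = [0] * 4   # general-purpose registers R0..R3
    pc = 0                # program counter: index of the next instruction
    while True:
        instruction = program[pc]        # fetch
        pc += 1                          # PC now points at the next instruction
        opcode, *operands = instruction  # decode: operation type and operands
        if opcode == "LOAD":             # execute
            reg, addr = operands
            registers[reg] = memory[addr]
        elif opcode == "ADD":
            dst, a, b = operands
            registers[dst] = registers[a] + registers[b]
        elif opcode == "STORE":
            reg, addr = operands
            memory[addr] = registers[reg]
        elif opcode == "HALT":
            return registers, memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")


program = [
    ("LOAD", 0, 0),    # R0 = memory[0]
    ("LOAD", 1, 1),    # R1 = memory[1]
    ("ADD", 2, 0, 1),  # R2 = R0 + R1
    ("STORE", 2, 2),   # memory[2] = R2
    ("HALT",),
]
registers, memory = run(program, [2, 3, 0])
print(memory)  # [2, 3, 5]
```

A program written for a different toy machine, with different opcodes, would hit the "unknown opcode" branch here, which is roughly why an x86 processor cannot run an ARM binary.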

  4. Because accessing memory to obtain an instruction or data takes much longer than executing an instruction, the CPU provides some internal general-purpose registers that hold key variables, temporary data, and similar information.
    The CPU therefore has to provide specific instructions for loading data from memory into a register and for storing register data back to memory.
  5. In addition to the general-purpose registers, there are some special registers. Typical examples:
    • PC: the program counter. It holds the memory address of the next instruction to be fetched; after that instruction is fetched, the register is updated to point to the following instruction.
    • Stack pointer: points to the top of the current stack in memory. The stack contains one frame for each function being executed; a frame holds that function's input parameters, local variables, and temporary variables that are not kept in registers.
    • PSW: the program status word. This register holds control bits such as the CPU priority and the CPU operating mode (user mode or kernel mode).
  6. When the CPU switches processes, the state held in the registers for the current process must be saved to the corresponding location in memory (that process's stack space in the kernel); when the CPU switches back to that process, the state must be copied from memory back into the registers. In other words, a context switch has to save and later restore the execution context.
  7. To improve performance, a CPU no longer follows a single fetch --> decode --> execute path. Instead, it provides a separate fetch unit, decode unit, and execution unit for these three steps. This forms a pipeline.
    For example, while the last unit in the pipeline, the execution unit, is executing instruction n, the unit before it can decode instruction n+1, and the unit before that, the fetch unit, can read instruction n+2. This is a three-stage pipeline; longer pipelines are also possible.
  8. A more optimized CPU design is the superscalar architecture. It separates the fetch, decode, and execution units and provides many execution units, with each fetch+decode part running in parallel. For example, there may be two parallel fetch+decode lines, each placing its decoded instructions into a holding buffer, from which the execution units take them for execution.
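
The three-stage pipeline from item 7 can be illustrated by listing which instruction occupies each stage in every clock cycle. The sketch below assumes an ideal pipeline in which every stage takes exactly one cycle and nothing ever stalls; the function name and the output layout are made up for this example.

```python
# A sketch of an ideal three-stage pipeline with no stalls (an assumption;
# real pipelines stall on branches, cache misses, and data dependencies).

STAGES = ("fetch", "decode", "execute")

def pipeline_schedule(num_instructions):
    """Return, per clock cycle, which instruction occupies each stage."""
    total_cycles = num_instructions + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - depth  # instruction that entered `depth` cycles ago
            row[stage] = index if 0 <= index < num_instructions else None
        schedule.append(row)
    return schedule

schedule = pipeline_schedule(4)
for cycle, row in enumerate(schedule):
    print(cycle, row)
```

In cycle 2 the execution unit works on instruction 0 while instruction 1 is being decoded and instruction 2 is being fetched, which is the n, n+1, n+2 overlap described above. Under these assumptions, four instructions finish in 6 cycles instead of the 12 a strictly sequential fetch, decode, execute path would need.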

  9. Except in embedded systems, most CPUs have two operating modes: kernel mode and user mode. The mode is controlled by a bit in the PSW register.
  10. In kernel mode, the CPU can execute every instruction in the instruction set and use every feature of the hardware.
  11. In user mode, the CPU may execute only a subset of the instruction set. In general, all instructions related to I/O and memory protection are prohibited in user mode, as are some other privileged instructions; for example, a program in user mode cannot set the PSW mode bit to kernel mode.
  12. When the CPU is in user mode and a program needs a privileged operation, it has to make a system call asking the kernel to perform the operation on its behalf. In practice, making a system call causes the CPU to execute a trap instruction that traps into the kernel. Once the privileged operation is complete, another instruction returns the CPU to user mode.
  13. System calls are not the only way into the kernel; more often it is the hardware that causes a trap into the kernel, handing control of the CPU back to the operating system so that it can decide how to handle the hardware exception.
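
The relationship between the mode bit, privileged instructions, and system calls in items 9 to 12 can be modeled in a few lines. Everything here is hypothetical: the ToyCPU class, the instruction names, and the use of a Python exception to stand in for a hardware trap are illustrative assumptions, not a real CPU or operating system interface.

```python
# Toy model only: the class, method, and instruction names are hypothetical
# and do not correspond to any real CPU or operating system interface.

USER, KERNEL = 0, 1

class ToyCPU:
    PRIVILEGED = {"IO_WRITE", "SET_MODE"}

    def __init__(self):
        self.mode = USER   # stands in for the mode bit in the PSW
        self.log = []      # instructions that were actually executed

    def execute(self, instruction):
        if instruction in self.PRIVILEGED and self.mode != KERNEL:
            # A privileged instruction in user mode is refused.
            raise PermissionError(f"{instruction} is not allowed in user mode")
        self.log.append(instruction)

    def system_call(self, instruction):
        self.mode = KERNEL              # trap: switch to kernel mode
        try:
            self.execute(instruction)   # the kernel does the privileged work
        finally:
            self.mode = USER            # return from trap: back to user mode


cpu = ToyCPU()
cpu.execute("ADD")                # ordinary instruction, fine in user mode
try:
    cpu.execute("IO_WRITE")       # direct attempt is rejected
except PermissionError as error:
    print(error)
cpu.system_call("IO_WRITE")       # same operation, requested from the kernel
print(cpu.log)                    # ['ADD', 'IO_WRITE']
```

The point of the design is that user code never flips the mode bit itself; only the trap path does, and it always switches back before returning.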

About CPU multi-core and multi-threading

  1. The number of physical CPUs is determined by the number of sockets on the motherboard. Each CPU can have multiple cores, and each core may support multiple threads.
  2. Each core of a multi-core CPU (every core is a small chip of its own) appears to the OS as a separate CPU.
  3. On a hyper-threaded CPU, each core can have multiple threads (two, in practice: for example 1 core with 2 threads, 2 cores with 4 threads, 4 cores with 8 threads). Each thread is a virtual, logical CPU (Windows, for example, calls it a logical processor), and each thread also appears to the OS as an independent CPU.
    This effectively deceives the operating system: physically there is still only one core. From the hyper-threaded CPU's point of view, however, its hyper-threads speed up program execution.
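
The number of CPUs the OS sees follows directly from items 1 to 3: sockets times cores per socket times threads per core. A minimal sketch, with a made-up helper name and a single socket assumed for the examples from the text:

```python
import os

def logical_cpu_count(sockets, cores_per_socket, threads_per_core):
    """Number of logical CPUs the OS would see for this topology."""
    return sockets * cores_per_socket * threads_per_core

# The examples from the text, assuming a single socket.
print(logical_cpu_count(1, 1, 2))  # 2
print(logical_cpu_count(1, 2, 2))  # 4
print(logical_cpu_count(1, 4, 2))  # 8

# os.cpu_count() reports the logical CPUs of the current machine
# (it may return None if the count cannot be determined).
print(os.cpu_count())
```

On a hyper-threaded machine the last line typically prints twice the number of physical cores, since it counts logical rather than physical CPUs.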

  4. To take advantage of Hyper-Threading, the operating system needs specific optimizations for it.
  5. A multi-threaded core is more capable than a core without multi-threading, but an individual thread is no match for an independent CPU core.
  6. The threads on a core share that core's CPU resources.
    For example, suppose each core has only one "engine". Once thread 1, as a virtual CPU, is using that engine, thread 2 cannot use it and can only wait.
    The main purpose of Hyper-Threading is therefore to keep more independent instructions in the pipeline (see the earlier explanation of pipelines), so that thread 1 and thread 2 compete for the core's resources as little as possible. In this way, Hyper-Threading takes advantage of the superscalar architecture.
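  7. Multi-threading means that each core can hold the state of multiple threads. For example, on a given core thread 1 may be idle while thread 2 is running.
  8. Multi-threading does not provide true parallel processing. At any given moment each core can still run only one process, because thread 1 and thread 2 share that core's resources. Put simply, when it comes to executing a process independently, each core has one resource that is unique: once thread 1 has acquired it, thread 2 cannot.
    However, thread 1 and thread 2 can work in parallel in many respects, such as fetching, decoding, and executing instructions in parallel. So although a single core can execute only one process at a time, thread 1 and thread 2 can help each other and speed up that process.
    Moreover, suppose thread 1 holds the core's ability to execute a process at some moment and that process issues an I/O request. The ability held by thread 1 can then be taken over by thread 2; that is, the core switches to thread 2. This is a switch between execution threads and is very lightweight. (WIKI: if resources for one process are not available, then another process can continue if its resources are available)
  9. Multi-threading can lead to the following situation: on a 2-core, 4-thread CPU with two processes to schedule, only two threads will be in the running state. If both threads are on the same core, the other core sits completely idle and is wasted. The preferred outcome is for each core to have one logical CPU running one of the two processes.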
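
The preferred outcome in item 9 amounts to a placement rule: put each new process on a logical CPU whose core is currently the least busy. The sketch below is a simplified illustration, not how any particular OS scheduler works; the CORE_OF mapping and the numbering of logical CPUs are assumptions.

```python
# Hypothetical topology for a 2-core, 4-thread CPU: logical CPU -> core.
# The numbering is an assumption; real systems number logical CPUs differently.
CORE_OF = {0: 0, 1: 0, 2: 1, 3: 1}

def place(processes, core_of=CORE_OF):
    """Assign each process to a logical CPU, preferring the least busy core."""
    load = {core: 0 for core in set(core_of.values())}
    free = sorted(core_of)            # logical CPUs not yet assigned
    placement = {}
    for process in processes:
        if not free:
            raise RuntimeError("more processes than logical CPUs")
        # Pick the free logical CPU whose core runs the fewest processes so far.
        cpu = min(free, key=lambda c: (load[core_of[c]], c))
        free.remove(cpu)
        load[core_of[cpu]] += 1
        placement[process] = cpu
    return placement

placement = place(["A", "B"])
print(placement)  # {'A': 0, 'B': 2}
```

With two processes, A lands on core 0 and B on core 1, so neither core is left idle. A naive rule that simply takes logical CPUs 0 and 1 in order would put both processes on core 0.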

Caches on the CPU

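  1. The fastest storage is the CPU's registers. They are made of the same material as the CPU and sit closest to it, so accessing them involves no delay (<1 ns). Their capacity is very small, however: less than 1 KB.
    • 32-bit: 32 * 32 bits = 128 bytes
    • 64-bit: 64 * 64 bits = 512 bytes
  2. Below the registers are the CPU caches: L1, L2, and L3. With each level, speed drops by about an order of magnitude and capacity grows.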
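
The register capacities in item 1 are simple arithmetic, assuming (as the figures suggest) 32 registers of 32 bits on a 32-bit CPU and 64 registers of 64 bits on a 64-bit CPU. The helper name below is made up for the example.

```python
def register_file_bytes(num_registers, bits_per_register):
    """Total size of a register file in bytes."""
    return num_registers * bits_per_register // 8

print(register_file_bytes(32, 32))  # 128
print(register_file_bytes(64, 64))  # 512
```

Both results are well under 1 KB (1024 bytes), which matches the capacity bound stated above.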

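  3. Each core has its own L1 cache, which is split into two parts: the L1 instruction cache (L1-icache) and the L1 data cache (L1-dcache). The L1 instruction cache holds decoded instructions, and the L1 data cache holds very frequently accessed data.
  4. The L2 cache holds recently used memory data. More precisely, it holds data that the CPU is likely to use in the near future.
  5. On some Intel CPUs, multiple cores share the L2 cache.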
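
The idea in item 4, that a cache keeps recently used memory data because it is likely to be needed again, can be sketched with a least-recently-used policy. This is a deliberate simplification: real CPU caches work on fixed-size cache lines with set-associative lookup, and the ToyCache class and its fields are hypothetical names chosen for this example.

```python
from collections import OrderedDict

class ToyCache:
    """Fully associative cache with least-recently-used eviction (a simplification)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # address -> value, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, address, memory):
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)     # mark as most recently used
        else:
            self.misses += 1
            if len(self.lines) >= self.capacity:
                self.lines.popitem(last=False)  # evict the least recently used
            self.lines[address] = memory[address]
        return self.lines[address]


memory = list(range(100, 110))
cache = ToyCache(capacity=2)
for address in [0, 1, 0, 2, 1]:
    cache.read(address, memory)
print(cache.hits, cache.misses)  # 1 4
```

Only the second read of address 0 is a hit. Address 1 has already been evicted by the time it is read again, which shows how a small capacity limits the benefit of keeping recent data.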

Origin www.cnblogs.com/f-ck-need-u/p/11141636.html