Advanced CPU Design

1. Concept review

  • Cache: A small, fast block of memory inside the CPU that holds copies of data and instructions recently fetched from RAM.
  • Cache hit: the requested data is already in the cache.
  • Cache miss: the requested data is not in the cache and must be fetched from RAM.
  • Dirty bit: Each block in the cache carries a flag called the dirty bit. It is set when the cached copy has been modified and therefore no longer matches the copy in RAM.
  • Multi-core processor: a single CPU chip that contains multiple independent processing units (cores).

2. How modern CPUs improve performance:

In the early days, CPUs were made faster mainly by improving the switching speed of their transistors, but that approach soon ran into limits.

Later, CPUs gained dedicated circuits, such as a hardware divider, so that the complex operations needed by workloads like games and video decoding could be done directly in hardware instead of through many simpler instructions.

3. Cache:

To keep the CPU from waiting for data, a small piece of fast memory called a cache is placed inside the CPU. When the CPU requests an address, RAM can send a whole block of neighboring data at once rather than a single value, and the cache holds it. (Without a cache, the CPU has nowhere to keep that block close at hand.)

The cache can also serve as scratch space for intermediate values, which helps during long or complex calculations.

Why the CPU would otherwise sit idle: getting data from RAM to the CPU takes time. The request and the data travel over the bus, and RAM itself needs time to look up the address, retrieve the data, and configure itself for output.
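The hit and miss behavior described above can be sketched with a toy simulation. This is only an illustration under stated assumptions: the class name DirectMappedCache, the slot count, and the block size are all made up for the example, and a direct-mapped layout is just one simple way a cache can be organized.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: each RAM block can live in exactly one slot."""

    def __init__(self, num_slots, block_size):
        self.num_slots = num_slots
        self.block_size = block_size
        self.slots = [None] * num_slots  # block number currently held by each slot
        self.hits = 0
        self.misses = 0

    def access(self, address):
        block = address // self.block_size  # which RAM block contains this address
        slot = block % self.num_slots       # the single slot that block may occupy
        if self.slots[slot] == block:
            self.hits += 1
            return "hit"
        # Miss: the whole block is copied from RAM, replacing whatever was there.
        self.slots[slot] = block
        self.misses += 1
        return "miss"


# Hypothetical sizes chosen only for the demonstration.
cache = DirectMappedCache(num_slots=4, block_size=8)
results = [cache.access(addr) for addr in range(16)]
print(cache.hits, cache.misses)
```

Reading 16 consecutive addresses touches only two blocks, so only the first access to each block misses. This is why transferring a batch of data at a time pays off when a program works through neighboring addresses.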

4. Cache synchronization:

Cache synchronization typically happens when the cache is full and the CPU needs to bring in a new block. Before an old block is overwritten, its dirty bit is checked; if the block is dirty, it is first written back to RAM. This frees the space without losing the modified data, which would otherwise lead to incorrect results.
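A minimal sketch of this write-back step follows. It assumes a toy cache where each address maps to one slot and each slot holds a single value; the class name WriteBackCache and its methods are hypothetical, chosen only to show when the dirty bit triggers a write to RAM.

```python
class WriteBackCache:
    """Toy write-back cache: RAM is updated only when a dirty entry is evicted."""

    def __init__(self, ram, num_slots):
        self.ram = ram              # backing store, modeled as a list of values
        self.num_slots = num_slots
        self.slots = {}             # slot index -> [address, value, dirty]
        self.write_backs = 0

    def _load(self, address):
        slot = address % self.num_slots
        entry = self.slots.get(slot)
        if entry is not None and entry[0] == address:
            return entry            # hit: the address is already cached
        if entry is not None and entry[2]:
            # The entry being evicted is dirty, so save it to RAM first.
            self.ram[entry[0]] = entry[1]
            self.write_backs += 1
        entry = [address, self.ram[address], False]  # fresh copy, clean
        self.slots[slot] = entry
        return entry

    def read(self, address):
        return self._load(address)[1]

    def write(self, address, value):
        entry = self._load(address)
        entry[1] = value
        entry[2] = True             # cached copy now differs from RAM


ram = [0] * 8
cache = WriteBackCache(ram, num_slots=2)
cache.write(0, 42)          # only the cached copy changes
value_in_ram_before = ram[0]
cache.read(2)               # address 2 shares slot 0, so address 0 is evicted
value_in_ram_after = ram[0]
print(value_in_ram_before, value_in_ram_after, cache.write_backs)
```

Until the eviction, RAM still holds the old value. The dirty bit is what tells the cache that this particular block must be saved before its slot is reused, while clean blocks can simply be discarded.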

5. Instruction pipeline:

Purpose: overlap the three steps fetch → decode → execute so that different instructions occupy different stages at the same time. This parallelism improves CPU throughput.

Each instruction still takes 3 clock cycles from start to finish, but once the pipeline is full, one instruction completes every clock cycle instead of one every 3.
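The cycle counts can be worked out with a short sketch. It assumes an idealized 3-stage pipeline with no stalls; the function names are made up for this example.

```python
STAGES = ("fetch", "decode", "execute")


def cycles_unpipelined(num_instructions, stages=len(STAGES)):
    """Each instruction runs all stages before the next one starts."""
    return num_instructions * stages


def cycles_pipelined(num_instructions, stages=len(STAGES)):
    """The first instruction fills the pipeline; each later one adds one cycle."""
    if num_instructions == 0:
        return 0
    return stages + (num_instructions - 1)


def pipeline_timeline(num_instructions):
    """For each cycle, map stage name -> index of the instruction in that stage."""
    timeline = []
    for cycle in range(cycles_pipelined(num_instructions)):
        row = {}
        for depth, stage in enumerate(STAGES):
            instr = cycle - depth  # instruction that entered the pipeline `depth` cycles ago
            if 0 <= instr < num_instructions:
                row[stage] = instr
        timeline.append(row)
    return timeline


print(cycles_unpipelined(100), cycles_pipelined(100))
for cycle, row in enumerate(pipeline_timeline(4)):
    print(cycle, row)
```

For 100 instructions the idealized pipeline needs 102 cycles instead of 300, which approaches one instruction per cycle as the program gets longer. Real pipelines fall short of this because of stalls.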

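Design difficulty: instructions can depend on each other's data. If an instruction needs a result that an earlier instruction has not finished computing, the pipeline must stall until that result is ready.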
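To make the dependency problem concrete, here is a small sketch that flags instructions reading a register written by the instruction immediately before them. The three-field instruction format and the register names are hypothetical, invented only for this illustration.

```python
from collections import namedtuple

# Hypothetical instruction format for illustration: dest = src1 op src2
Instr = namedtuple("Instr", ["dest", "src1", "src2"])


def depends_on(later, earlier):
    """True if `later` reads the register that `earlier` writes."""
    return earlier.dest in (later.src1, later.src2)


def adjacent_hazards(program):
    """Indices of instructions that need the result of the one just before them."""
    return [
        i for i in range(1, len(program))
        if depends_on(program[i], program[i - 1])
    ]


program = [
    Instr("r1", "r2", "r3"),  # r1 = r2 op r3
    Instr("r4", "r1", "r5"),  # r4 = r1 op r5, needs r1 from the line above
    Instr("r6", "r7", "r8"),  # r6 = r7 op r8, independent of both
]
print(adjacent_hazards(program))
```

The second instruction cannot safely proceed until the first has produced r1, while the third has no such constraint and could run earlier without changing the result.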

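Handling dependencies:

Out-of-order execution: high-end CPUs dynamically reorder instructions so that independent ones run while dependent ones wait. Branch prediction (also on high-end CPUs) addresses a related hazard, conditional jumps: the CPU guesses which way the jump will go and starts filling the pipeline along that path.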
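As an illustration of branch prediction, the sketch below uses a 2-bit saturating counter, a classic textbook scheme. The notes do not say which scheme any particular CPU uses, so treat this as an assumed example; the class and function names are made up.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.state = 2  # start at "weakly taken" (an arbitrary choice)

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, without leaving 0..3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes the predictor guesses correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# A loop branch that is taken 9 times, then not taken once when the loop exits.
loop = [True] * 9 + [False]
print(accuracy(loop))
```

Because loop branches usually repeat the same outcome, even this simple predictor guesses right on every iteration except the final exit, so the pipeline rarely has to discard work.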

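6. Process multiple instructions at one time (superscalar)

A superscalar processor can fetch, decode, and execute more than one instruction per clock cycle, for example by duplicating heavily used circuits such as the ALU.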

7. Run multiple instruction streams simultaneously (multi-core CPU)

Multi-core processor: a single CPU chip contains multiple independent processing units (cores). Because the cores are tightly integrated, they can share some resources, such as cache.

8. Supercomputer (multiple CPUs)

A supercomputer combines a very large number of CPUs in one system to carry out enormous calculations, such as simulating the formation of the universe.

Article from: [Computer Science Crash Course] Notes

Origin blog.csdn.net/hellow_xqs/article/details/131124414