Architecture design needs to solve two main problems: (1) how to improve the processor's energy efficiency (performance / power consumption), for which the approach is to harden the algorithm into hardware; (2) how to improve the processor's programmability (generality), for which the reference point is the CPU.
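The energy-efficiency ratio in point (1) is simply performance divided by power. A minimal sketch of that arithmetic follows; the function name and both sets of numbers are hypothetical, chosen only to illustrate the ratio, and do not describe any real chip.

```python
def energy_efficiency(ops_per_second: float, watts: float) -> float:
    """Return performance per unit of power (operations per second per watt)."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return ops_per_second / watts


# Hypothetical figures, for illustration only.
general_purpose = energy_efficiency(ops_per_second=1e12, watts=100.0)
hardened = energy_efficiency(ops_per_second=4e12, watts=20.0)

# How many times more work per watt the hardened design delivers
# under these assumed numbers.
ratio = hardened / general_purpose
print(f"general purpose: {general_purpose:.3e} ops/s/W")
print(f"hardened:        {hardened:.3e} ops/s/W")
print(f"ratio:           {ratio:.1f}x")
```

The ratio can improve through a larger numerator (more operations per second), a smaller denominator (less power), or both, which is why hardening an algorithm tends to help on this metric while costing generality.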
1. Single-core deep learning processor (DLP-S)
1. Overall structure
(1) Architecture diagram
DMA (direct memory access) is a hardware mechanism that allows peripheral components to transfer their I/O data directly to and from main memory without involving the system processor. This greatly increases the throughput of device data transfers.
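The division of labor in DMA can be sketched as a toy software model: the processor only fills in a transfer descriptor and starts the engine, and the engine moves the bytes. Everything here (`DmaDescriptor`, `ToyDmaController`, the buffer sizes) is a hypothetical illustration, not a real driver or hardware interface, and the copy runs synchronously, whereas real hardware would work in the background and signal completion with an interrupt.

```python
from dataclasses import dataclass


@dataclass
class DmaDescriptor:
    """Hypothetical transfer descriptor: where to read, where to write, how much."""
    src_offset: int
    dst_offset: int
    length: int


class ToyDmaController:
    """Toy model of a DMA engine (an assumption for illustration, not a real API)."""

    def __init__(self, device_buffer: bytes, main_memory: bytearray):
        self.device_buffer = device_buffer
        self.main_memory = main_memory
        self.done = False

    def start(self, desc: DmaDescriptor) -> None:
        # Reject transfers that would run past either buffer.
        if desc.src_offset < 0 or desc.dst_offset < 0 or desc.length < 0:
            raise ValueError("offsets and length must be non-negative")
        if desc.src_offset + desc.length > len(self.device_buffer):
            raise ValueError("source range exceeds device buffer")
        if desc.dst_offset + desc.length > len(self.main_memory):
            raise ValueError("destination range exceeds main memory")
        # The engine, not the processor, performs the copy.
        src = self.device_buffer[desc.src_offset:desc.src_offset + desc.length]
        self.main_memory[desc.dst_offset:desc.dst_offset + desc.length] = src
        self.done = True  # stands in for a completion interrupt


# The "processor" side: program one descriptor, start the engine, check completion.
device_buffer = bytes(range(16))
main_memory = bytearray(32)
dma = ToyDmaController(device_buffer, main_memory)
dma.start(DmaDescriptor(src_offset=4, dst_offset=8, length=6))
print(dma.done, list(main_memory[8:14]))
```

The point of the model is that the processor's work is constant (one descriptor) regardless of transfer size, which is where the throughput gain comes from.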
(2) From DLP to DLP-S
Control module: multiple issue queues, support for instruction-level parallelism, and register renaming.
Computation module: ① add more operations to the arithmetic units so that they are executed efficiently in hardware; ② use low-bit-width arithmetic units (quantization) to improve execution energy efficiency; ③ support sparse operations to improve calculation