What are CPU, GPU, TPU, DPU, NPU, and BPU?

With the rapid development of deep learning, all kinds of chips have entered the public eye one after another: GPU, TPU, DPU, NPU, BPU... What are they, and how do they relate to the CPU? Let's go through them briefly.

First, let's introduce the full English names of these words:

  • CPU: Central Processing Unit;
  • GPU: Graphics Processing Unit;
  • TPU: Tensor Processing Unit;
  • DPU: Deep learning Processing Unit;
  • NPU: Neural network Processing Unit;
  • BPU: Brain Processing Unit.

Now let's take a look at each of these so-called "XPUs".

1、CPU

The CPU (Central Processing Unit) is the "brain" of the machine. It mainly comprises the arithmetic logic unit (ALU, Arithmetic and Logic Unit), the control unit (CU, Control Unit), registers, caches, and the data, control, and status buses that connect them. In other words, it consists of three parts: a computing unit, a control unit, and a storage unit. As shown below

2、GPU

To overcome the difficulties the CPU faces in large-scale parallel operations and to improve speed, the GPU emerged, built around a large number of computing units and very long pipelines.

Before discussing the GPU, one concept must be introduced first: parallel computing. Parallel computing means using multiple computing resources simultaneously to solve a computational problem, and it is an effective way to increase a computer system's speed and processing capacity. The basic idea is to have multiple processors jointly solve the same problem: the problem is decomposed into several parts, and each part is computed in parallel by an independent processor.
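The decomposition described above can be sketched in a few lines of Python (a toy illustration of the idea, not actual GPU code): the data is split into chunks, and each chunk is summed by an independent worker.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # each worker independently sums its own slice of the data
    return sum(chunk)

def parallel_sum(data, workers=4):
    # decompose the problem: split the list into roughly equal chunks
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # each chunk is processed by an independent worker in parallel,
    # then the partial results are combined
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

result = parallel_sum(list(range(1, 101)))  # 1 + 2 + ... + 100
```

The combine step (the final `sum` of partial results) is the only serial part; the more evenly the work splits, the closer the speedup gets to the number of processors.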

The GPU (Graphics Processing Unit) is a microprocessor originally used in personal computers, workstations, game consoles, and some mobile devices (such as tablets and smartphones) to run graphics operations. As shown below: 

Why are GPUs so good at processing image data? Because every pixel in an image needs to be processed, and the processing of each pixel follows almost identical steps, the GPU is extremely capable at image processing.
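A minimal illustration of this "same operation on every pixel" pattern, in plain Python standing in for what a GPU does with thousands of threads at once (the image here is a made-up 2x2 example):

```python
# a toy 2x2 RGB "image": each pixel is an (R, G, B) tuple
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (64, 64, 64)]]

def to_gray(pixel):
    # the SAME arithmetic (Rec. 601 grayscale weights) is applied to
    # every pixel -- exactly the data-parallel pattern a GPU excels at
    r, g, b = pixel
    return int(0.299 * r + 0.587 * g + 0.114 * b)

# a CPU walks through the pixels serially; a GPU would instead assign
# one thread per pixel and convert them all simultaneously
gray = [[to_gray(p) for p in row] for row in image]
```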

The figure below compares the CPU and GPU architectures. As can be seen, the GPU architecture is quite simple, but a GPU cannot work on its own: it must be controlled by a CPU.

3、TPU

The TPU is a programmable AI accelerator announced by Google in May 2016 for the TensorFlow platform. It provides high-throughput, low-precision computation, is used for model inference (the forward pass) rather than training, and has higher energy efficiency (TOPS/W). It is said that the TPU can deliver 15-30x higher performance and 30-80x higher efficiency (performance per watt) than CPUs and GPUs of the same period.

How did the TPU do it so fast?

(1) Custom-built for deep learning: the TPU is a chip Google developed specifically to accelerate the computation of deep neural networks; it is, in fact, an ASIC (application-specific integrated circuit).

(2) Large-scale on-chip memory: the TPU places up to 24 MB of local memory, 4 MB of accumulator memory, and memory for interfacing with the host processor directly on the chip.

(3) Low-precision (8-bit) computation: the TPU's high performance also comes from its tolerance for reduced precision. An 8-bit operation needs far fewer transistors than a 32-bit floating-point one, so the same chip area can pack in many more compute units.
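The idea behind 8-bit computation can be sketched with a generic symmetric quantization scheme (for illustration only; this is not the TPU's exact format): floats are mapped onto signed 8-bit integers via a single scale factor.

```python
def quantize(values, num_bits=8):
    # symmetric linear quantization: map floats onto signed 8-bit ints
    qmax = 2 ** (num_bits - 1) - 1            # 127 for int8
    scale = max(abs(v) for v in values) / qmax
    q = [round(v / scale) for v in values]    # integers in [-127, 127]
    return q, scale

def dequantize(q, scale):
    # recover approximate floats; the error is at most ~scale/2 per value
    return [x * scale for x in q]

weights = [0.4, -1.0, 0.25, 0.75]             # toy example weights
q, scale = quantize(weights)
approx = dequantize(q, scale)
```

Networks trained at full precision usually tolerate this rounding error well at inference time, which is why the TPU can trade precision for throughput.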

The following is a block diagram of the TPU's modules: 

Block diagram of the TPU's modules. The main computational part is the yellow matrix multiply unit at the top right. Its inputs are the blue weight FIFO and the blue Unified Buffer (UB); its output goes to the blue Accumulators (Acc). The yellow activation unit applies a non-linear function to the data in Acc as it flows back to UB.
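The dataflow described above can be mimicked in plain Python (a conceptual sketch only; the real matrix unit is a hardware systolic array): small-integer activations and weights are multiplied, the sums are held in wide accumulators, and a non-linear activation is applied on the way out.

```python
def matmul_int8(activations, weights):
    # multiply-accumulate: 8-bit inputs, results kept in wide accumulators
    # (the TPU likewise accumulates into 32-bit registers, because products
    # of 8-bit values quickly overflow 8 bits)
    rows, inner, cols = len(activations), len(weights), len(weights[0])
    acc = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):
            for j in range(cols):
                acc[i][j] += activations[i][k] * weights[k][j]
    return acc

def relu(acc):
    # the activation unit applies a non-linear function to the
    # accumulator contents before they flow back to the unified buffer
    return [[max(0, x) for x in row] for row in acc]

a = [[1, -2], [3, 4]]   # activations, as if read from the Unified Buffer
w = [[5, 6], [7, 8]]    # weights, as if streamed from the weight FIFO
out = relu(matmul_int8(a, w))
```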

4、DPU

The DPU (deep learning processing unit) was first proposed by the Chinese company DeePhi Tech. Building on the reconfigurability of Xilinx FPGAs, DeePhi designed a dedicated deep learning processing unit and abstracted a customized instruction set and compiler on top of it, enabling rapid development and product iteration. 

5、NPU

The NPU (neural network processor) simulates human neurons and synapses at the circuit level and uses a deep learning instruction set to process large numbers of neurons and synapses directly: a single instruction completes the processing of a whole group of neurons. Typical NPUs include China's Cambricon chips and IBM's TrueNorth. 

6、BPU

The BPU (brain processing unit) is an embedded artificial intelligence processor architecture proposed by Horizon Robotics. Traditional CPU chips have to handle everything, so they generally use a serial structure; the BPU, by contrast, is designed mainly to support deep neural networks for tasks such as image, speech, text, and control, rather than to do everything.

 

Welcome to follow my WeChat public account "Big Data and Artificial Intelligence Lab" (BigdataAILab) for more information
