With dedicated neural network processor chips available, can a CPU still run neural networks?

How does the "unified memory" in Apple products compare with conventional memory?

Unified memory places system memory, GPU memory and the neural network processor's cache in a single pool, connected to the CPU, GPU and neural network processor through a fabric interconnect.

The advantage is that a shared region sits between the CPU and the GPU, making data transfer between them easier. The disadvantage is that when the three share one unified pool (for example, 16GB), the memory actually available to the CPU is less than 16GB, because part of it must be allotted to the GPU and the neural processor.

This approach replaces the earlier distributed storage structure with a System on Chip (SoC). A system-on-chip is more efficient and can integrate larger caches, memory and customized components.

Besides saving board area, an SoC improves internal bus access efficiency. Earlier chips had to balance integration against die size: low integration meant a large footprint, while high integration meant high cost.

Since such a disk array works in a centralized mode, it supports file-level data access from host systems over an IP network, or block-level access over a Fibre Channel SAN.

iSCSI is likewise a general IP-based protocol, but it provides block-level access. Such a disk array is configured with a multi-port storage controller and a management interface, letting storage administrators create storage pools on demand and present them to host systems with different access types.

The most common protocol combinations are NAS plus FC, or iSCSI plus FC.

All three protocols can also be supported simultaneously, but storage administrators generally pick either FC or iSCSI for block-level access and combine it with file-level access (NAS mode) to form unified storage.


What is the mobile phone chip NPU?

The embedded neural network processor (NPU) adopts a "data-driven parallel computing" architecture and is especially good at processing massive multimedia data such as video and images.

Introduction: CPU = Central Processing Unit; NPU = Neural-network Processing Unit. The NPU is not a standalone chip but a network processor that can be regarded as a component (or subsystem); it is sometimes called an NPU coprocessor.


On June 20, 2016, the State Key Laboratory of Digital Multimedia Chip Technology at Vimicro announced in Beijing that it had successfully developed China's first embedded neural network processor (NPU) chip, the world's first embedded video capture and compression encoding system-level chip with deep learning capability, named "Starlight Intelligent No. 1".

This deep learning chip is used for face recognition, with an accuracy of up to 98%, exceeding the recognition rate of the human eye. The chip entered mass production on March 6 this year, and shipments currently exceed 100,000 units.

Zhang Yundong, executive director of the laboratory and chief technology officer of Vimicro, said in an interview that when a chip equipped with a neural network processor is applied to a surveillance camera, the camera is upgraded from an "eye" to an "eye with a brain".

The State Key Laboratory of Digital Multimedia Chip Technology was established in 2010 with the approval of the Ministry of Science and Technology, relying on Beijing Vimicro Electronics Co., Ltd.

According to public information, Vimicro was founded in 1999 with direct investment from the former Ministry of Information Industry and is part of the "national team" among companies specializing in chip technology.

The landing of artificial intelligence: "Starlight Intelligent No. 1" is an embedded NPU. The neural network processor (NPU, Neural Processing Unit) is not yet widely known, but it is a hot technology in the chip field.

Compared with a CPU based on the von Neumann architecture, it adopts a disruptive new architecture of "data-driven parallel computing".

If the data processing of the von Neumann architecture is compared to a single-lane road, then "data-driven parallel computing" is a 128-lane highway that can process 128 data elements at the same time, which suits massive multimedia data such as video and images.
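The "single lane vs. many lanes" analogy can be sketched in NumPy: a Python loop touches one value per step, while one vectorized operation applies the same computation to a whole batch at once. This is only an analogy (NumPy runs on the CPU, assuming NumPy is installed), not how an NPU is implemented internally.

```python
import numpy as np

# Serial "single lane": process one value at a time.
def brighten_serial(pixels, gain):
    out = []
    for p in pixels:          # one datum per step
        out.append(min(p * gain, 255))
    return out

# Parallel "many lanes": one vectorized operation over the whole batch,
# analogous to applying the same op to 128 data elements at once.
def brighten_parallel(pixels, gain):
    return np.minimum(np.asarray(pixels) * gain, 255)

pixels = list(range(128))     # a batch of 128 values, one per "lane"
assert brighten_serial(pixels, 2) == brighten_parallel(pixels, 2).tolist()
```

Both functions compute the same result; the vectorized form simply expresses the whole batch as one operation, which is the shape of computation that data-parallel hardware accelerates.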

In the industry, computing performance per unit of power consumption, i.e. the performance-per-watt ratio, is used to judge the merits of a processor architecture.

According to Zhang Yundong, executive director of the laboratory and chief technology officer of Vimicro, the performance-per-watt of "Starlight Intelligent No. 1" is "at least two or three orders of magnitude higher" than that of the traditional von Neumann architecture, that is, hundreds of times. Many top artificial intelligence technologies are criticized precisely for their high power consumption.
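The metric itself is simple arithmetic. The numbers below are entirely hypothetical, chosen only to illustrate what "two orders of magnitude" means; they are not measured figures for Starlight Intelligent No. 1 or any real chip.

```python
# Hypothetical numbers purely for illustration -- not real chip data.
def perf_per_watt(gflops, watts):
    return gflops / watts

conventional = perf_per_watt(gflops=500.0, watts=250.0)  # 2 GFLOPS/W
embedded_npu = perf_per_watt(gflops=400.0, watts=2.0)    # 200 GFLOPS/W

# A 100x ratio is exactly two orders of magnitude.
ratio = embedded_npu / conventional
print(f"{ratio:.0f}x better performance per watt")
```

Note that the embedded part can win on this metric even with lower absolute throughput, because its power draw is so much smaller.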

IBM's Deep Blue in the 20th century and Google's AlphaGo in 2016 both required huge amounts of computation: the former used a supercomputer, the latter a server cluster, and neither could leave a machine room with constant temperature and humidity.

AlphaGo costs 3,000 US dollars to play one game of Go. Zhang Yundong called such systems "scientific experiments", still a long way from practical deployment. This highlights the miniaturization, low power consumption and low cost of the embedded NPU, which accelerates the practical application of artificial intelligence technology.

For example, drones place strict requirements on a camera's weight and power consumption, which would otherwise affect take-off and endurance.

"Starlight Intelligent No. 1", however, is only the size of an ordinary postage stamp and weighs just tens of grams. Its arrival makes artificial intelligence possible in many small devices such as surveillance cameras, taking AI a step out of the mysterious machine room and into everyday applications.

What do the CPU, GPU and NPU units on mobile phones mean?

The CPU, GPU and NPU units on a mobile phone mean the following: 1. CPU. The CPU is a general-purpose processor comprising a computing unit, a control unit and a storage unit.

The structure of the CPU mainly includes an arithmetic logic unit (ALU, Arithmetic and Logic Unit), a control unit (CU, Control Unit), registers, caches, and the buses that carry data, control and status signals between them.

2. GPU. The GPU is a graphics processor; its full name is Graphics Processing Unit. GPUs were originally microprocessors for graphics workloads on personal computers, workstations, game consoles and some mobile devices (such as tablets and smartphones).

It is specialized for graphics computation and rendering, typically used for games, and can also run some AI algorithms. 3. NPU. The NPU is a neural network processor, a general term for a new type of processor built around neural network algorithms and their acceleration. The NPU is dedicated to AI-accelerated computing.

For example, the DianNao series from the Institute of Computing Technology, Chinese Academy of Sciences, is a line of chips specialized for deep learning. Extended information, the functions of the CPU, GPU and NPU units on a mobile phone: 1. As the core component of the phone, the CPU performs general instruction execution and runs various algorithms.

2. GPU. The GPU has many ALUs and little cache. Unlike on a CPU, the cache's purpose is not to hold data for later reuse, but to serve the threads.

If many threads need the same data, the cache merges those accesses before going to DRAM. 3. NPU. The NPU is constructed by imitating biological neural networks and consists of a number of interconnected artificial neuron nodes.

Neurons are connected pairwise through synapses, and a synapse records the strength of the connection between two neurons. Each neuron can be abstracted as an activation function whose input is determined by the outputs of the neurons connected to it and the strengths of the connecting synapses.

To represent specific knowledge, users usually adjust (via particular algorithms) the synapse values and the topology of the artificial neural network; this process is called "learning". After learning, the network can use the acquired knowledge to solve specific problems.
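The neuron model just described can be sketched in a few lines of pure Python: weighted inputs play the role of synapses, a sigmoid serves as the activation function, and "learning" adjusts the weights by gradient steps. This is a minimal single-neuron illustration, not how an NPU or any production framework is implemented.

```python
import math

def activation(x):
    # Sigmoid activation: squashes the weighted sum into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias):
    # Each weight acts as a synapse strength on one input connection.
    s = sum(i * w for i, w in zip(inputs, weights)) + bias
    return activation(s)

def train_step(inputs, weights, bias, target, lr=0.5):
    # "Learning": nudge each synapse weight to reduce the output error
    # (gradient step for one sigmoid neuron with squared-error loss).
    out = neuron_output(inputs, weights, bias)
    grad = (out - target) * out * (1.0 - out)
    weights = [w - lr * grad * i for w, i in zip(weights, inputs)]
    bias = bias - lr * grad
    return weights, bias

# Teach a single neuron the logical AND function.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = [0.0, 0.0], 0.0
for _ in range(5000):
    for inputs, target in data:
        weights, bias = train_step(inputs, weights, bias, target)

for inputs, target in data:
    assert round(neuron_output(inputs, weights, bias)) == target
```

After training, the adjusted weights encode the "knowledge" of the AND function, exactly the learn-then-apply process the paragraph above describes.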

Deep Learning's Hardware Requirements

I used to focus on theoretical knowledge, but now that I want to run code I don't know where to start: the environment set up on my own computer is basically decoration, because nothing will run on it. Today let's look at how to get started with deep learning.

First, some basics. 1. The difference between training on a CPU and on a GPU: (1) The CPU is mainly for serial operations, while the GPU is built for large-scale parallel operations. Since deep learning involves huge sample sizes and very large numbers of parameters, the GPU's role is to accelerate network computation.

(2) A CPU can also compute a neural network, and the resulting network works fine in practical applications, but training will be very slow. GPU computation is currently concentrated on matrix multiplication and convolution; for other logic operations the GPU is no faster than a CPU.

At present, there are three ways to train models. 1. Configure your own "local server", commonly known as a high-end PC. This is usually a desktop, because a "high-end" laptop is too expensive: at the same price you can buy a desktop configuration far better than any notebook.

If you will use it long-term for research in deep learning, this option is better and more flexible. ① A machine learning desktop build on a budget under 10,000. ② Li Feifei's course shows her machine's configuration, a baseline setup for machine learning:

Memory: 4 × 8GB. Graphics cards: two NVIDIA GTX 1070. Storage: one HDD, two SSDs. A good GPU can cut training time from weeks to days, so choose your GPU very carefully.

You can consult a GPU ranking list; the entries are relatively new models with strong performance. Nvidia's product lines include the consumer GeForce series, the professional-graphics Quadro series, and the high-performance-computing Tesla series. How to choose?

According to research papers, higher numerical precision does not improve deep learning error rates, and most frameworks support only single precision, so double-precision floating point is unnecessary; that rules out the Tesla series.

In terms of graphics card performance metrics, you want many CUDA cores, a fast GPU clock, large video memory and high bandwidth. By that measure, the latest Titan X is a cost-effective choice. CPU: after choosing a good GPU, you still need a suitable CPU.

As a high-speed serial processor, the CPU often acts as the "controller": sending and receiving instructions, parsing instructions, and so on.

Because of its internal structure, the GPU is better suited to high-speed parallel computation than to fast instruction control, and a lot of data must move between the GPU and the CPU. This control work requires the CPU, because that is its forte.

Memory: mainly for data exchange between the CPU and peripherals. Its access speed is many times faster than a hard disk's, but it is relatively expensive, with price usually proportional to capacity.

The memory capacity should be at least as large as the video memory of the GPU you choose (preferably twice the video memory; more is better if budget allows). Deep learning involves a large number of data exchange operations (such as reading data batch by batch).

Of course, you can instead keep the data on the hard disk and read a small batch each time, but then your training cycles will be very long.

The common solution is: "choose larger memory, read several batches of data from the hard disk into memory at a time, then process them", which keeps data flowing without interruption and completes the processing efficiently.
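The "read batches from disk while processing continues" pipeline above can be sketched with a background-thread prefetcher and a bounded queue. All names here are illustrative; real frameworks provide this ready-made (e.g. a data loader with worker processes), and `load_batch` stands in for an actual disk read.

```python
import queue
import threading

def prefetching_batches(load_batch, num_batches, buffer_size=4):
    """Load batches in a background thread so slow disk reads overlap
    with data processing on the consumer side."""
    buf = queue.Queue(maxsize=buffer_size)   # bounded in-memory buffer
    sentinel = object()                      # marks end of the stream

    def producer():
        for i in range(num_batches):
            buf.put(load_batch(i))           # blocks when buffer is full
        buf.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = buf.get()
        if batch is sentinel:
            break
        yield batch

# Illustrative stand-in for reading one batch from the hard disk.
def load_batch(i):
    return [i * 10 + j for j in range(4)]

batches = list(prefetching_batches(load_batch, num_batches=3))
assert batches == [[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]]
```

The bounded queue is the "larger memory" in the quoted advice: it holds a few batches ahead of the consumer, so the GPU is never left waiting on the disk.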

Power supply: one graphics card draws close to 300W, so for four cards a power supply above 1500W is recommended; choose a larger one if you plan to expand later. Mechanical hard drive: as "local storage", it mainly holds data of all kinds; because it is slower, it is naturally cheaper.

It is recommended to choose a drive with larger capacity, usually 1TB/2TB. A good idea: "reuse some old hard drives, since expanding disk storage is simple, and it saves some money."

Why Use GPUs to Train Neural Networks Instead of CPUs?

Many modern neural network implementations run on GPUs, specialized hardware originally developed for graphics applications; neural networks have thus benefited from the development of the game industry.

The central processing unit (CPU) is the computing and control core of a computer system and the final execution unit for information processing and program execution.

Since its inception, the CPU has made great progress in logical structure, operating efficiency and functionality. The CPU emerged in the era of large-scale integrated circuits; iterative processor architecture design and continuous improvement in integrated circuit technology have driven its ongoing development.

From originally being dedicated to mathematical computing, it came to be widely used for general computing; from 4-bit to 8-bit, 16-bit, 32-bit and finally 64-bit processors, and from mutually incompatible vendors to common instruction set architecture specifications, the CPU has developed rapidly since its birth.

The von Neumann architecture is the foundation of modern computers. Under this architecture, programs and data are stored together: instructions and data are fetched from the same storage space and transmitted over the same bus, so the two kinds of access cannot overlap.

According to the von Neumann model, the CPU's work divides into five stages: instruction fetch, instruction decode, execute, memory access, and result write-back.
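The five stages can be sketched as a toy interpreter for a made-up two-register instruction set (the instruction names and encoding here are invented for illustration; real pipelines overlap these stages across instructions rather than running them one instruction at a time).

```python
def run(program, memory):
    """Toy von Neumann cycle for a hypothetical ISA with registers r0, r1."""
    regs = {"r0": 0, "r1": 0}
    pc = 0
    while pc < len(program):
        instr = program[pc]                  # 1. instruction fetch
        pc += 1
        op, *args = instr.split()            # 2. instruction decode
        if op == "LOAD":                     # 3. execute / 4. memory access
            reg, addr = args
            result = memory[int(addr)]
        elif op == "ADD":
            reg, other = args
            result = regs[reg] + regs[other]
        elif op == "STORE":                  # memory access, no write-back
            reg, addr = args
            memory[int(addr)] = regs[reg]
            continue
        regs[reg] = result                   # 5. result write-back
    return memory

mem = [7, 5, 0]
run(["LOAD r0 0", "LOAD r1 1", "ADD r0 r1", "STORE r0 2"], mem)
assert mem[2] == 12   # 7 + 5 computed and written back to memory
```

Note how program and data share one `memory` list and one access path, which is exactly the shared-storage property of the von Neumann architecture described above.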

Which mobile phone is better for playing PlayerUnknown's Battlegrounds?

Honor Play is a good choice. It has a 6.3-inch full-view display and the Kirin 970 octa-core chip with micro smart core i7, a 10nm flagship processor, with 6GB of memory optional. It handles large 3D games easily and delivers a smooth, high-frame-rate experience.

Its neural network processor (NPU) lets complex AI algorithms run quickly. As the first phone equipped with GPU Turbo, it unleashes performance and delivers a sustained, stable high frame rate: the picture is smooth, with no jitter, smearing or stuttering.

It is recommended to log in to Huawei Mall to view more relevant parameters of the product.

What does TPU mean on a computer motherboard?

TPU is a control chip independently developed by ASUS. Through this chip, users can overclock the CPU via hardware control without consuming CPU performance.

The EPU energy-saving engine detects the current PC load and intelligently adjusts power in real time to provide system-wide energy saving.

EPU provides automatic phase switching for components (including the CPU, graphics card, memory, chipset, hard disk and system fans), and intelligently accelerates or overclocks to deliver the most appropriate power consumption, saving electricity and cost.

What mobile phone is best for playing games?

The Honor Play is a strong option. As Honor's first phone equipped with GPU Turbo, it unleashes performance and delivers a sustained, stable high-frame-rate gaming experience.

Its 10nm flagship processor and optional 6GB of memory handle large 3D games easily with a smooth, high-frame-rate experience. The neural network processor (NPU) lets complex AI algorithms run quickly, and a 3750mAh (typical value) high-density battery provides longer battery life.

You can log in to Huawei Mall to learn more about mobile phone parameters, and choose one according to your personal preferences and needs.

What brand of mobile phone is good for gaming and affordable?

Gaming generally demands a high-end configuration. Options include the X30 Pro & X30, the iQOO Neo 855 edition, NEX 3, X27/X27 Pro, S1/S1 Pro, iQOO Pro & iQOO/iQOO Neo, the NEX dual-screen edition, X23, and the NEX/NEX flagship edition. These phones have powerful hardware, which shows in both running speed and processing power.

You can visit the vivo official website to learn more.

 


Origin blog.csdn.net/aifamao3/article/details/127443666