Integrated Storage and Computing: Core Architecture Innovation Breaks the Limit of Computing Power and Energy Efficiency|In-depth Research Report

 

Author|Jiapan Wang and Sima Jie

This report is jointly released by Shicheng Capital and Lightcone Intelligence

In the post-Moore era, storage-computing integration is one of the disruptive technologies that could raise computing power per unit of power consumption tenfold. What are its underlying principles, application prospects, and feasibility? What is the current state of the industry, and where are the opportunities for innovation and entrepreneurship? Starting from the underlying technical principles and the changes in industrial demand, this article surveys the wave of innovation and the investment opportunities in the storage-computing integration industry:

 

1. Core Judgments and Opinions

1. Storage-computing integration is an innovation in the underlying architecture of the chip and is at a very early stage. The gaps in the industry chain, and the opportunities and challenges they bring, are no smaller than those the GPU faced when it was built from scratch 20 years ago.

2. Compared with other cutting-edge computing power solutions such as quantum computing, photonic chips, and non-silicon-based chips, storage-computing integrated chips benefit from the maturity of storage media and related technologies, and are more likely to be widely deployed within 3-5 years.

3. Storage-computing integration is a rare chip field in which China and other countries started at about the same time, so China has a better chance of producing world-leading products.

4. At present, the industry and investors believe that the upstream and downstream of the industry chain are still incomplete and that it will take another 5-10 years before the technology is put into use, but this also means broader opportunities for innovation.
5. Competition among players is currently focused on the choice of storage medium. In the long run, the choice of medium will not be the differentiator; all-round competition in design methodology, testing, mass production, software, and scenario selection is the long-term key.
6. The choice of the first and second application scenarios for a chip is very important. Being the first to achieve commercial validation and create a hit product is the key to success over the next three years.
7. Because this is an emerging technology, talent is concentrated in academia rather than in industry, so transferring technology and talent out of universities is critical.
8. Besides start-ups, universities and large companies are also pursuing research and development. In the long run, the truly strong competitors may be the giants currently watching from the sidelines.
9. Storage-computing integrated chips do not replace mainstream computing power such as CPUs and GPUs. They will become an important supplement to it, with a focus on energy-efficient computing.

2. Background and Principles of Storage-Computing Integration Technology

 

With global data volume growing exponentially and computing power falling short of AI demand, storage-computing integration mainly addresses the high energy consumption and cost that come with high computing power, and is expected to reduce energy consumption per unit of computing power by an order of magnitude. Its low power consumption, low latency, and high computing-power density are expected to pay off in tens of billions of power-sensitive AIoT devices, energy-hungry data centers, and autonomous driving.

Under existing mature architectures and technologies, raising computing power and lowering power consumption by increasing transistor density through process advances is gradually approaching physical limits, and costs keep rising;

 

Under the von Neumann architecture, data storage and computing units are separate, which limits gains in computing power and increases power consumption:

 

To address the separation of storage units and computing units, the idea of integrating storage and computing emerged. Storage and computing are merged in the same device unit, and the inherent bottleneck of the von Neumann architecture is resolved through innovation in the underlying architecture:

 

Storage media technology has seen continuous breakthroughs in recent years. At the same time, the AIoT era demands intelligence, low power consumption, small size, and low latency from devices, requirements that existing technical routes cannot meet well. Driven by both technological breakthroughs and market demand, storage-computing integration has reached the inflection point of industrialization:

 

Compared with the CPU, born more than 50 years ago, and the GPU, born more than 20 years ago, storage-computing integration is still in its early stage. With its better parallelism and energy efficiency, it is expected to become one of the mainstream computing power platforms of the intelligent era, complementing existing computing power solutions.
Given the scale of the opportunity created by architectural innovation and changing demand for computing power, the next 100-billion-dollar chip giant may be born in the field of storage-computing integration, and the industry is expected to lead the world.

 

Several related terms are in use alongside storage-computing integration (such as near-memory computing). Their structural differences are as follows:

Near-memory computing: the design and function of the computing unit and the storage unit are unchanged. Advanced packaging, careful hardware layout, and structural optimization increase the communication bandwidth and transfer rate between the two. It is still essentially a von Neumann architecture, and it mitigates the "storage wall" by shortening the distance between the storage unit and the computing unit.

In-memory computing: the storage unit and the computing unit are fully integrated and there is no independent computing unit. Computation is carried out by the storage cells inside the memory chip, with the algorithm embedded in the memory itself. The design is more difficult and the room for future improvement is larger, but it requires licensing support from the foundry. The storage-computing integration (in-memory computing) companies discussed in this article mainly fall into this category.
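
The difference between the two approaches can be made concrete with a rough count of how many data words cross the boundary between memory and compute for one matrix-vector product. The sketch below is a toy model, not a measurement: the function names and the counting rules (each weight fetched once in the conventional case, weights never moved in the in-memory case) are assumptions made for illustration.

```python
def words_moved_conventional(rows: int, cols: int) -> int:
    """Toy count for a separate memory and compute unit (assumed model).

    Every weight of the rows x cols matrix is fetched once, the input
    vector is fetched once, and the output vector is written back once.
    """
    weights = rows * cols
    inputs = cols
    outputs = rows
    return weights + inputs + outputs


def words_moved_in_memory(rows: int, cols: int) -> int:
    """Toy count when the weights stay inside the array that computes on them.

    Only the input vector goes in and the output vector comes out.
    """
    return cols + rows


if __name__ == "__main__":
    # Hypothetical square layer sizes, chosen only to show the trend.
    for size in (64, 256, 1024):
        conventional = words_moved_conventional(size, size)
        in_memory = words_moved_in_memory(size, size)
        print(size, conventional, in_memory, conventional / in_memory)
```

In this model the traffic of the conventional layout grows with the number of weights, while the in-memory layout grows only with the vector lengths. Near-memory computing keeps the first count and shortens the distance each word travels.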

 

3. Selection of the Storage Medium Technology Route

Among mature media, companies and institutions developing storage-computing integrated chips currently concentrate on SRAM, NOR Flash, and DRAM; some academic institutions have chosen to work on RRAM and other new media.

By type, storage media are divided into volatile memory and non-volatile memory.

 

The above picture is quoted from the research report of Founder Securities

Different storage media currently play distinct roles in computer architecture. SRAM sits closest to the CPU, with the fastest response time and a small capacity; DRAM, NAND Flash, and other media follow, each with its own characteristics in transfer rate and capacity:

1. Volatile memory: data is lost when the system shuts down, whether normally or unexpectedly; cost is high.

DRAM: used for main memory modules (each storage cell needs only one transistor and one small capacitor). It accounts for 58% of the semiconductor memory market and is currently moving past the 20nm node toward 10nm.

SRAM: used for CPU cache (each storage cell needs 4-6 transistors). It is the fastest medium (nanosecond level) and does not need to be refreshed continuously.

2. Non-volatile memory: data is retained when power is lost as described above; cost is low.

NAND Flash: used in solid-state drives, USB flash drives, and memory cards; large capacity, but slow read and write speeds.

NOR Flash: code storage memory, mainly holding instructions, for example the embedded code in set-top boxes, gateways, and routers; capacity is small and writes are very slow, but reads are fast.

 

In the long run, the rapid productization of storage-computing integrated chips depends on the maturing of new storage media. The following is a comparison of the principles of different new storage media:

 

RRAM (memristor) is a basic circuit element in addition to the resistor, capacitor, and inductor; its characteristics closely resemble those of a biological synapse, so it is also called an electronic synapse device.

 

The following is a performance comparison of the new storage media:

 

The following compares the storage principles and measured performance of different storage media. Mature media such as SRAM, DRAM, and Flash store data through the movement of charge; new media such as RRAM and MRAM store data through changes in resistance.

 

Besides the medium, the choice between digital and analog computing is another factor that affects the performance of a storage-computing integrated chip; of the two, digital computing offers higher precision.
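
One simplified way to see why precision matters is to limit each stored weight to a fixed number of distinguishable levels and watch the error of a multiply-accumulate result. The sketch below is illustrative only. The report states that digital computing has higher precision but does not say how analog precision is limited, so modeling a cell as a uniform quantizer with `bits` of resolution is an assumption, and `quantize` and `dot` are hypothetical helper names rather than part of any chip toolchain.

```python
def quantize(value: float, bits: int) -> float:
    """Round a value in [0, 1] to the nearest of 2**bits evenly spaced levels."""
    steps = 2 ** bits - 1
    clipped = min(max(value, 0.0), 1.0)  # values outside [0, 1] saturate
    return round(clipped * steps) / steps


def dot(weights, inputs):
    """Plain multiply-accumulate over two equal-length sequences."""
    return sum(w * x for w, x in zip(weights, inputs))


if __name__ == "__main__":
    weights = [0.12, 0.57, 0.33, 0.91]  # made-up example values
    inputs = [1.0, 0.5, 0.25, 0.75]     # made-up example values
    exact = dot(weights, inputs)
    for bits in (2, 4, 8):
        approx = dot([quantize(w, bits) for w in weights], inputs)
        # Print the resolution and the absolute error of the MAC result.
        print(bits, abs(approx - exact))
```

Under this assumed model, each weight's rounding error is bounded by half a level, so a cell with more distinguishable levels keeps the accumulated result closer to the exact value.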

 

4. Application Scenarios of Integrated Storage and Computing

1. The storage-computing integrated architecture closely matches the computing pattern of deep learning networks.

General-purpose computing chips are not cost-effective when serving specific AI algorithms, and chips customized for AI will become the core underlying technology of the artificial intelligence industry chain.

As an innovative chip architecture, in-memory computing breaks through the storage wall. In essence it accelerates Multiply Accumulate (MAC) operations, which makes it a close fit for the basic operators of deep learning networks. As a result, chips with an in-memory computing architecture achieve an order-of-magnitude improvement in computing efficiency (TOPS/W) over the AI acceleration chips currently on the market.
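
The multiply-accumulate operation described above can be sketched in a few lines. The example assumes a resistive crossbar array, a common way to build in-memory MAC that this report does not describe in detail: each cell stores a weight as a conductance, input voltages drive the rows, and each column current is the sum of voltage times conductance (Ohm's law per cell, Kirchhoff's current law per column). `crossbar_mac` is a hypothetical name, and the model is idealized, with no wire resistance, noise, or converter limits.

```python
def crossbar_mac(conductances, voltages):
    """Idealized crossbar: return the current collected on each column.

    conductances[i][j] is the conductance of the cell at row i, column j.
    voltages[i] is the voltage applied to row i.
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(voltages) != n_rows:
        raise ValueError("one voltage is required per row")
    currents = [0.0] * n_cols
    for i in range(n_rows):
        for j in range(n_cols):
            # Ohm's law gives the cell current; the column wire sums them.
            currents[j] += conductances[i][j] * voltages[i]
    return currents


if __name__ == "__main__":
    weights = [[1.0, 2.0],
               [3.0, 4.0]]   # made-up conductance values
    inputs = [1.0, 0.5]      # made-up row voltages
    print(crossbar_mac(weights, inputs))
```

Every column produces one complete dot product at once, and the weights never leave the array, which is why this structure lines up with the MAC-heavy operators of neural networks.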

In the smart age, from wearables to autonomous driving, computing efficiency in scenarios under power consumption constraints is an eternal theme. In-memory computing is one of the most powerful weapons to liberate computing power and improve energy efficiency.

 Source: "Advanced Storage and Computing Integrated Chip Design", Zhihu Chen Wei's exploration of the core

2. Applicable industries/scenarios for integrated storage and computing chips

(1) Low-computing-power scenarios: edge devices are highly sensitive to cost, power consumption, latency, and development difficulty

In the early and middle stages, storage-computing integrated chips offer modest computing power, starting at about 1 TOPS and scaling up. They serve on-device applications such as audio, health, and low-power vision, and address the chip performance and power consumption problems that hold back AI deployment.

We predict that the number of smart devices connected at the edge will grow rapidly, that smart products will cover more and more areas, and that the variety of product forms will grow explosively. Because of transmission latency and data security considerations, much data processing and inference will foreseeably take place on the device.

(2) High-computing-power scenarios: GPUs cannot match dedicated accelerator chips on both computing power and energy efficiency at once

In today's cloud computing power market, a single GPU architecture can no longer accommodate the divergence of algorithms across AI computing scenarios. Image processing, recommendation, and NLP, for example, each have their own mainstream algorithm architectures.
As the computing power of storage-computing integrated chips keeps improving, their scope of use is gradually expanding into high-computing-power applications. For scenarios above 100 TOPS, they can provide high-performance, cost-effective products for unmanned vehicles, general-purpose robots, intelligent driving, and cloud computing.
On mature process nodes, storage-computing integration can deliver computing power that traditional architectures achieve only on advanced nodes, which saves manufacturing cost and sidesteps problems such as process blockades.
The requirements of autonomous driving are very high: computing power, reliability, and stability must all meet the standard at the same time, which will take several years. Process challenges and iterations remain, and the technology has not yet reached data-center level.

 

3. Other extended applications of storage-computing integration: sensing-storage-computing integration and brain-inspired computing

Building on the same basic principle, storage-computing integration has given rise to further technical directions such as sensing-storage-computing integration and brain-inspired computing:

(1) Sensing-storage-computing integration:

Traditional systems use sensor chips to collect information, storage chips to store it, and computing chips to process it. Sensing-storage-computing integration adds sensing on top of storage-computing integration, and combining the three improves overall efficiency.
Computation is performed on the AI storage-computing integrated chip contained in the sensor itself, enabling intelligent processing with near-zero latency and ultra-low power consumption.
Research results so far fall into three categories: pressure, optical, and gas sensing. Current application directions include more efficient machine vision and brain-inspired computing.
(2) Brain-inspired computing:
Brain-inspired computing, also known as neuromorphic computing, is a general term for the computing theory, architecture, chip design, application models, and algorithms that draw on the information processing model and structure of the biological nervous system.
It borrows the physical structure and working characteristics of the human brain so that computers can complete specific computing tasks and process information at high speed. It belongs to the field of high computing power and high energy efficiency.
Because storage-computing integration combines storage and computing in the same place, it is naturally suited to brain-inspired computing and has become one of its key technical cornerstones.
 

5. Industry Status and Future Trends

1. Current challenges facing storage-computing integration:
Storage-computing integration is a highly complex, comprehensive innovation. The industry is not yet mature, and many challenges remain across the industry chain, such as insufficient upstream support and a mismatch with downstream applications. These same challenges, however, are also the comprehensive barriers to entry that today's innovators can build for the future.

 

2. The development trend of integrated storage and computing technology: higher precision, higher computing power, and higher energy efficiency.

 

3. Talent and ecosystem problems facing the industry:
(1) As a new field, storage-computing integrated chips face a shortage of interdisciplinary talent, and most of the talent is in academia.

Developing a storage-computing integrated chip product requires both strong academic originality (storage-computing integrated architecture and compiler design, quantization algorithms for storage-computing integration, and so on) and engineering capability (understanding of application scenarios and chip implementation).
(2) The incompleteness of the ecosystem from upstream to downstream is both a challenge and an opportunity.
Large-scale deployment of storage-computing integrated chips requires close joint development, promotion, and application with ecosystem partners such as chip manufacturers, software tool vendors, and application integrators.
A convenient, usable toolchain and software stack is needed to keep the customer's migration cost low.
The chips should be compatible with the existing software ecosystem so that customers can adopt them seamlessly, for example by directly using existing GPU training software frameworks.
Customers can then be guided to gradually adopt dedicated toolchains for model adaptation, compression, and similar tasks, making better use of the advantages of storage-computing integration and gradually building an ecosystem.
 

6. Industry-related enterprise analysis

At present, China's innovative storage-computing integrated chip companies and their overseas counterparts are advancing side by side, jointly exploring the industrialization and application scenarios of the technology. Given the huge application scenarios of the AIoT era, China's storage-computing integration field is expected to produce world-leading innovative enterprises.

Domestic storage-computing integrated chip companies include Apple Core Technology, Houmo Intelligence, Zhicun Technology, Yizhu Technology, Zhixin Technology, Qianxin Technology, Jiutian Ruixin, and other innovative companies; foreign companies include Mythic and Syntiant.
The following is a brief introduction to some domestic and foreign storage-computing integration companies:

 

 

 

 

 

 

 

 

Appendix: Product progress and performance of some major players in the field

 


Origin blog.csdn.net/GZZN2019/article/details/130974359