The financial industry is entering the era of large models, and building compute and storage infrastructure is the key to success

At the end of last year, ChatGPT launched, stunning users around the world with its powerful and accurate natural language understanding and generation capabilities.

Since then, industries of all kinds have joined the race to develop large models, setting off a new round of technological innovation. This is especially true in finance: how to build new compute and storage infrastructure for the era of large models, and bring large-model capabilities into the financial field, has become a hot topic among financial institutions.

In what scenarios are large financial models useful?

As a new form of AI infrastructure, large models have a wide range of application scenarios in the financial industry.

In the front office, intelligent customer service is one of the most common AI applications in finance. Remember Jarvis, the AI butler in the Iron Man films? A financial large model can greatly raise the professionalism and service capacity of account managers while significantly cutting their operating costs, giving everyone a Jarvis-like professional account manager who is online 24 hours a day.

In the middle office, large AI models have the opportunity to change how financial institutions acquire knowledge, create content, hold meetings and communicate, and develop and test code, improving internal office efficiency, potentially reshaping R&D and testing practices, and comprehensively raising the internal operational efficiency of financial institutions.

In the back office, large models will become a standard part of the intelligent technology base, significantly lowering the threshold for applying intelligent technology: only a small amount of annotated data is needed for it to cover a wide range of scenarios.

In short, large AI models excel at content generation and creation, information abstraction and summarization, knowledge understanding and question answering, and natural interaction and dialogue, and they have broad application prospects in the financial industry.

With 10,000-GPU clusters and trillions of parameters, large models have a “high threshold”

Rapid iteration of large models requires the support of efficient computing power and storage infrastructure.

On the one hand, computing power is the engine of large models. The capacity of language and vision models, and the compute they require, are expanding rapidly, and behind every financial large model stands enormous computing power. A useful yardstick is the "computing power equivalent" (PetaFLOP/s-day, PD): the total compute consumed by a machine running at one petaflop per second for one day. Measured this way, training a large model requires hundreds or even thousands of PD, which implies a huge compute bill.

Computing power is the core element in the development of large models

For example, GPT-3, launched by OpenAI in 2020, required a cluster of roughly ten thousand GPUs, and a single training run consumed approximately 3,640 PD. Likewise, the "Source" Chinese large language model released by Inspur Information has nearly 250 billion parameters and consumed about 4,000 PD of compute. The computing power equivalent of GPT-4 and PaLM-2 is already estimated at dozens of times that of GPT-3, and Gemini, the next-generation multimodal large model under development at Google, reportedly has a training compute budget more than five times that of GPT-4.
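To make these PD figures concrete, the sketch below converts the roughly 3,640 PD cited for GPT-3 into raw floating-point operations and estimates wall-clock training time on a hypothetical cluster. The GPU count, per-card throughput, and utilization rate here are illustrative assumptions, not published figures.

```python
# Back-of-the-envelope sketch: converting "computing power equivalent"
# (PD) into raw FLOPs and an estimated training time.
# Cluster size, per-GPU throughput, and utilization are assumptions.

PD_IN_FLOPS = 1e15 * 86_400           # 1 PD = 1 PFLOP/s sustained for one day

total_pd = 3_640                      # approximate figure cited for GPT-3
total_flops = total_pd * PD_IN_FLOPS  # ~3.1e23 FLOPs in total

# Hypothetical cluster: 1,000 A100s at ~312 TFLOPS peak tensor-core
# throughput, 30% sustained utilization -- all illustrative assumptions.
num_gpus = 1_000
peak_flops_per_gpu = 312e12
utilization = 0.30

cluster_flops = num_gpus * peak_flops_per_gpu * utilization
training_days = total_flops / cluster_flops / 86_400

print(f"Total compute: {total_flops:.2e} FLOPs")
print(f"Estimated training time: {training_days:.0f} days")  # roughly 39 days
```

Even under these generous assumptions, a single pre-training run ties up a thousand-GPU cluster for over a month, which is exactly the cost pressure described below.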

Rapidly rising AI compute consumption and limited IT budgets put most financial institutions in a dilemma: build a large model, and resources are scarce, cost pressure is high, and talent is hard to find; don't build one, and they can only watch the opportunity pass by.

In this regard, divide and conquer may be a feasible approach. The "division" is between general-purpose large models and industry large models: financial institutions need not build a general-purpose model themselves, but can start from a third party's general-purpose model and focus their effort on an industry model built on top of it. According to the "Industry Large Model Standard System and Capability Architecture Research Report" released by the China Academy of Information and Communications Technology (CAICT), general-purpose large models lack professional knowledge and industry data, and their construction and training costs are high, making commercial use difficult. Industry large models emerged to solve the problems of specific industries: they can meet the needs of specific scenarios, provide higher-quality services to the industry, and drive its intelligent transformation and upgrading.

Guo Lei, an AI server product expert at Inspur Information, put it this way: "Financial institutions can concentrate their resources on industry large models; rather than digging a trench one meter deep across a thousand meters of ground, dig a thousand meters deep in one spot."

Four stages of large model training

Specifically, the first stage of large model training is unsupervised pre-training. The training cycle often lasts from tens of days to several months and requires thousands of GPUs computing simultaneously, consuming enormous compute over a very long time; its output is a base language model. Financial institutions can obtain these base language capabilities through open-source platforms or third-party cooperation (such as Inspur Information's "Source" large model). The second to fourth stages are supervised fine-tuning, reward model training, and reinforcement learning. These three stages require only tens to hundreds of GPUs computing at the same time, and both compute consumption and training time drop significantly compared with the first stage, so financial institutions can train these three stages themselves and build large models with an edge in the financial industry.
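As a rough illustration of how these four stages chain together, the skeleton below mirrors the pre-train / fine-tune / reward-model / reinforcement-learning pipeline described above. Every function is a placeholder stub named for illustration only, not a real training API.

```python
# Conceptual skeleton of the four training stages described above.
# All functions are placeholder stubs for illustration, not a real API.

def pretrain(corpus):
    """Stage 1: unsupervised pre-training on massive unlabeled text.
    Thousands of GPUs, weeks to months -- the stage most financial
    institutions source from an open or third-party base model."""
    return "base_language_model"

def supervised_finetune(model, labeled_dialogues):
    """Stage 2: supervised fine-tuning on curated, labeled dialogue data."""
    return "sft_model"

def train_reward_model(model, ranked_responses):
    """Stage 3: train a reward model from human preference rankings."""
    return "reward_model"

def rlhf(sft_model, reward_model):
    """Stage 4: reinforcement learning against the reward model."""
    return "aligned_model"

# Stages 2-4 need only tens to hundreds of GPUs, which is why an
# institution can realistically own them while sourcing stage 1.
base = pretrain("web_scale_corpus")
sft = supervised_finetune(base, "financial_instruction_data")
rm = train_reward_model(sft, "human_preference_rankings")
final = rlhf(sft, rm)
print(final)
```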

On the other hand, computing power alone is far from enough; large models also depend on data scale and data quality.

The advantage of large models lies in their ability to collect, extract, and analyze massive amounts of information at a scale beyond human reach.

Evolution of large model parameter scales

In recent years, the parameter counts of general-purpose large models have grown rapidly. In 2016, OpenAI released the Gym reinforcement learning platform; in 2018, GPT-1 arrived with 117 million parameters; after continuous iteration, GPT-4's parameter count has reportedly reached 1.76 trillion. Since releasing the Transformer architecture (65 million parameters) in 2017, Google has successively released BERT (about 300 million parameters, 2018) and T5 (11 billion parameters, 2019), with parameter scale rising step by step. Recently, Google released the generalist model PaLM-E which, at 562 billion parameters, is the largest visual-language model to date.

In vertical industries, the dataset for a financial large model must add, on top of the general-purpose model, professional knowledge covering financial research reports, stocks, funds, banking, insurance, and so on. By feeding in large amounts of financial dialogue data during training and performing domain-specific pre-training and fine-tuning for financial fields, its performance in the financial vertical can be improved.
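A minimal sketch of what such domain fine-tuning might look like with the Hugging Face transformers library is shown below; the base model name and the financial dialogue dataset path are hypothetical placeholders, and a real pipeline would add distributed training, evaluation, and careful data cleaning.

```python
# Minimal domain fine-tuning sketch using Hugging Face transformers.
# "base-llm" and "financial_dialogues.jsonl" are hypothetical placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("base-llm")     # placeholder name
model = AutoModelForCausalLM.from_pretrained("base-llm")  # placeholder name

# One JSON line per sample, e.g. {"text": "Q: ... A: ..."}
dataset = load_dataset("json", data_files="financial_dialogues.jsonl")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="finllm-sft",
        per_device_train_batch_size=4,
        num_train_epochs=1,
        learning_rate=2e-5,
    ),
    train_dataset=tokenized,
    # mlm=False selects the standard causal-LM (next-token) objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```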

At the same time, multimodal and cross-modal data have become the norm, and the data types used by financial large models are ever richer. Unsupervised data, that is, raw data, may take the form of web pages, text, or speech; supervised data, that is, labeled data, may be stored in JSON or query formats. In addition, to offer investors services such as real-time market sentiment and risk prediction, financial institutions must efficiently process financial data ranging from industry news and stock transactions to social media comments. These huge, multimodal, real-time demands are difficult for traditional centralized storage to handle, and they call for a new, elastic, and flexible distributed storage architecture.
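For illustration, here is what one unsupervised (raw) record and one supervised (labeled) record might look like in JSON form; the field names are assumptions for the sketch, not a fixed industry standard.

```python
import json

# Illustrative record formats -- field names are assumptions, not a standard.

# Unsupervised / raw data: plain text scraped from news, filings, web pages.
raw_record = {
    "source": "news",
    "text": "The central bank kept benchmark rates unchanged this quarter...",
}

# Supervised / labeled data: an instruction-response pair for fine-tuning.
labeled_record = {
    "instruction": "Summarize the main risk factors in this fund prospectus.",
    "input": "...prospectus text...",
    "output": "The fund is primarily exposed to interest-rate and credit risk...",
}

print(json.dumps(labeled_record, ensure_ascii=False, indent=2))
```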

It is clear that as financial large models evolve, the entire data center architecture will change: full-stack solutions, from AI servers to storage to networking, must adapt to the needs of the large model era.

How can infrastructure “store enough, compute fast, and transmit reliably”?

Only when data can be stored at scale, computed quickly, and transmitted reliably can digital infrastructure fully unlock the value of data as a production factor, advance the application of large models, and drive the prosperity of new business formats.

To this end, guided by its intelligent computing strategy, Inspur Information drives product innovation along four dimensions: computing power, algorithms, data, and interconnection, building a solid foundation for large models.

In terms of computing power, through hands-on innovation with models of hundreds of billions of parameters, Inspur Information has built a full-stack large-model computing solution spanning cluster construction, compute scheduling and deployment, and algorithm and model development to support large model training. Its latest-generation converged-architecture AI training server, the NF5688G7, uses Hopper-architecture GPUs and delivers nearly 7 times the measured large-model performance of the previous-generation platform. It also supports the latest liquid cooling solution, achieving lower cluster energy consumption and operating costs with a PUE below 1.15. For a 4,000-GPU intelligent computing center, that translates to savings of 6.2 million kilowatt-hours of electricity and 1,700 tons of carbon per year.
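As a quick sanity check of those cited savings, dividing the two figures gives an implied emission factor of roughly 0.27 kg of CO2 per kWh, which is in line with typical grid averages.

```python
# Sanity check of the cited figures: 6.2 million kWh and 1,700 tons of
# carbon saved per year imply an emission factor of ~0.27 kg CO2 per kWh.
energy_saved_kwh = 6_200_000
carbon_saved_kg = 1_700 * 1_000
print(f"{carbon_saved_kg / energy_saved_kwh:.2f} kg CO2 per kWh")  # ~0.27
```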

In terms of storage, Inspur Information's generative AI storage solution uses a single set of AS13000 converged storage to support every stage of generative AI. It offers four media types (all-flash, hybrid-flash, tape library, and optical disc) and supports file, object, big data, video, and block protocols. Mapped to the five stages of AIGC data processing (data collection, preparation, training, inference, and archiving), the same set of storage provides end-to-end data flow support, meeting the storage and processing needs of multimodal data such as text, audio, image, video, and code.

Inspur Information storage products

At the level of high-speed cluster interconnect, Inspur Information achieves full line-rate networking across the cluster based on native RDMA and optimizes the network topology, effectively eliminating the communication bottlenecks of hybrid parallel computing and keeping the cluster in optimal condition throughout large model training.

At present, major state-owned banks, joint-stock banks, and some city commercial banks have already launched, or plan to launch, financial large-model R&D, and AI computing and data infrastructure are set for rapid growth. IDC predicts that China's intelligent computing power will grow at a compound annual rate of 52% over the next five years, and that distributed storage will grow at twice the rate of the overall Chinese storage market. In the era of large models, financial institutions need to take AI scenarios and architecture as their starting point and, combining each bank's own data characteristics, build a new generation of intelligent computing infrastructure.
