From large models to small models: who will be the big winner as ChatGPT is rolled out to B2B industries?

ChatGPT Gold Rush

The ChatGPT boom has erupted, drawing in one technology company after another. It looks very much like the California Gold Rush in the American West.

History has a way of repeating itself, and ChatGPT is the gold mine of the digital age, with technological gold diggers flocking to it from all over the world. When the tide finally recedes, the survivors may not be the gold diggers themselves but the people who sold them shovels, jeans, and the other basic tools of prospecting.

 

Standing at the forefront of the ChatGPT industry, one trend looks inevitable: alongside the continued evolution of super-large models, many more small models will emerge to serve vertical fields and reach thousands of households. From large models to small models, who will be the big winner? With that question in mind, let us tally the computing-power and economic ledger behind ChatGPT.

01

"It seems more than that"

ChatGPT drives server and GPU growth

The ChatGPT boom has not only given Internet, IT, and cloud-computing companies new storylines; it has also visibly stimulated market growth for servers and GPUs, since the GPU's parallel computing architecture is better suited to large-scale AI training and inference. In China, server vendors such as Inspur Information, Sugon, H3C, and Great Wall have all benefited.

And it seems to be more than that. A cloud-computing professional pointed out that ChatGPT has further promoted cloud financial management, the industry's currently much-discussed FinOps, a blend of "Finance" and "DevOps" that emphasizes cost management and resource optimization in operations. For FinOps to become truly intelligent, however, it too needs strong computing power behind it.

People in the industry like to joke: what is artificial intelligence? As the name implies, however much human labor goes in, that is how much intelligence comes out. Intelligence is not born out of thin air; it must be distilled from big data by deep learning algorithms, and training large language models (LLMs) depends on powerful computing support. While ChatGPT is being hyped, the computing power consumed by the technology companies already in the game keeps surging.

Before tallying that ledger, we need to understand what large language models (LLMs) are. The industry definition is by now clear: trained with deep learning algorithms on vast amounts of corpus data, an LLM learns the probability distribution and grammatical structure of text and can automatically generate large volumes of high-quality new text resembling its corpus; continued training further improves generation quality. Today LLMs power applications such as interactive question answering, text recognition, text classification, text generation, and code generation. But LLMs currently cannot identify inauthentic corpus data, so the need for correct corpus data has in turn stimulated the growth of data-annotation companies in the industry.
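To make "learning the probability distribution of text and generating new text" concrete, here is a minimal sketch, an illustration only since the article names no toolkit, that samples text from a small open-source model via the Hugging Face `transformers` library, with GPT-2 standing in for the far larger models discussed here:

```python
# Minimal sketch: sampling text from a small pretrained language model.
# Assumes the open-source Hugging Face `transformers` package is installed;
# GPT-2 stands in here for the far larger GPT-3/GPT-4 models discussed above.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The model predicts a probability distribution over the next token, then
# samples from it repeatedly to produce new text resembling its training corpus.
result = generator("Large language models are", max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])
```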

 

Classic LLMs include BERT, GPT-3, Megatron-Turing NLG, and GPT-4. These models are trained on very large datasets: GPT-3, for example, has about 175 billion parameters and was trained on some 570 gigabytes of text. GPT-4, released by OpenAI in March 2023, marks the rise of large-scale multimodal AI. Some in the industry had estimated that GPT-4 would exceed 1 trillion parameters. OpenAI has not disclosed its specific parameter count, but according to DeepMind's research, GPT-4 would be slightly larger than GPT-3 and would need roughly 5 trillion training tokens to reach compute-optimal training.

 

In fact, training at this astonishing scale demands extremely high floating-point capability from the chips. ChatGPT is trained on the GPT-3 family of large language models, and a single GPT-3 training run consumes roughly 3640 PF-days of compute; at a sustained one quadrillion operations per second, it would take nearly ten years. That implies an investment of hundreds of millions of dollars in several large data centers, each with 500P of computing power, to support it. DeepMind's analysis further suggests that to minimize training loss, training GPT-4 will require 10 to 20 times the FLOPs of GPT-3.
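As a rough sanity check on the 3640 PF-days figure, the arithmetic below uses the common "6 x parameters x tokens" estimate of training FLOPs; the 300-billion-token training count is an assumption taken from the GPT-3 paper, not from this article:

```python
# Back-of-the-envelope check of the ~3640 PF-days figure quoted above.
# Uses the common estimate: training FLOPs ~= 6 * parameters * training tokens.
params = 175e9          # GPT-3 parameter count
tokens = 300e9          # tokens seen during training (assumption, per the GPT-3 paper)
total_flops = 6 * params * tokens          # ~3.15e23 FLOPs

pf_day = 1e15 * 86400   # one PetaFLOP/s sustained for a day
pf_days = total_flops / pf_day
print(f"{pf_days:.0f} PF-days")            # ~3646, close to the ~3640 quoted

# At a sustained 1 PFLOP/s (one quadrillion operations per second),
# training would take pf_days days, i.e. roughly ten years:
print(f"{pf_days / 365:.1f} years at 1 PFLOP/s")
```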

 

On cost, Chuan Li, chief science officer of Lambda Labs, estimates that a single training run of the 175-billion-parameter GPT-3 costs millions of dollars; combined with DeepMind's research, a single GPT-4 training run is estimated to cost tens of millions.

According to SimilarWeb, ChatGPT's official website drew 616 million visits in January 2023, and according to Fortune, each user interaction costs about $0.01 in cloud computing power. ChatGPT is trained on the GPT-3.5 model, whose basic parameter count is naturally no smaller than GPT-3's. Assuming a fixed unit cost of computing power, ChatGPT's monthly operation is estimated to require about 4874.4 PFlop/s-days of compute, putting the cost of a single month of operation at several million dollars.
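The monthly figure can be reproduced with simple arithmetic; the sketch below assumes, simplistically, one billed interaction per recorded visit:

```python
# Rough estimate of ChatGPT's monthly serving cost from the figures above.
# Assumes (simplistically) one billed interaction per recorded visit.
visits_per_month = 616e6      # SimilarWeb, January 2023
cost_per_interaction = 0.01   # dollars, per Fortune's estimate

monthly_cost = visits_per_month * cost_per_interaction
print(f"${monthly_cost / 1e6:.1f} million per month")  # ~$6.2M, i.e. "several million dollars"
```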

It should also be noted that to support the large-scale training of GPT-3, GPT-3.5, and GPT-4, OpenAI built a dedicated supercomputer with tens of thousands of NVIDIA high-end A100 GPUs, at an infrastructure cost of hundreds of millions of dollars.

Tallying the computing-power and economic ledger behind ChatGPT this way, the blowout investments of the giant players are all startling. For now, the early development of ChatGPT-style training and inference on large language models (LLMs) may be dominated by only a few global technology giants.

Yet whatever form the training takes, every "gold rush" entrant shares the same rigid need: GPU computing power behind the training platform that is more efficient and more cost-effective, because that determines each entrant's up-front investment and return on research.

So who will provide better GPU support tools for the ChatGPT "gold rush" entrants? That is worth pondering.

02

"From Giant Players to Vertical Industry Applications"

Smaller models will mean more opportunities

For the "gold rush" into vertical industries, however, the mainstream players should not be Microsoft, Google, and the other technology giants now investing heavily in ChatGPT; they remain focused on large-model training for flagship applications such as search engines. Well-known Chinese technology companies such as Baidu, Tencent, Alibaba, ByteDance, JD.com, 360, and iFLYTEK have also joined in one after another, but they are concentrating on supporting ChatGPT within their existing business systems, and those enthusiastically laying out ChatGPT for vertical industries are still relatively few.

It follows that the protagonists of ChatGPT's vertical-industry development should be software developers with strong integration capabilities.

"In the next step, once ChatGPT focuses on the development of vertical industries and is applied in thousands of industries, it will inevitably make the model smaller." Wang Kun, CEO of VirtAI Tech, holds the same view as many experts in the industry.

Analyzed further, an "industrialized" ChatGPT may commercialize better. Some may ask: why does ChatGPT tend toward smaller models as it moves into vertical industries? Going from large models to small models has four major effects, all of which favor the popularization of the ChatGPT industry.

First, it lowers the training threshold, reducing the demands for high computing power and heavy investment so that more companies can participate. As noted above, training large models such as GPT-3, GPT-3.5, and GPT-4 requires extraordinary computing power and enormous spending, which poses great challenges to technological application and innovation. Only by lowering the threshold can the ChatGPT industry later become widespread.

Second, focusing on professional fields helps improve dataset quality and accelerates the improvement of ChatGPT training. Given correct data labeling, high-quality datasets determine the quality of a chatbot, and the larger the dataset, the higher the accuracy of ChatGPT training. According to OpenAI, the newly released multimodal GPT-4 has more parameters and a larger dataset than GPT-3.5, achieving huge improvements in safety and accuracy: it is 82% less likely to respond to requests for disallowed content and 60% less likely to fabricate content.

 

If technology giants such as Microsoft and Google want to build chatbots that span every field, they must train on super-large datasets covering all of them, and guaranteeing data quality means cleaning and standardizing the data for truthfulness, accuracy, completeness, and timeliness. That makes good data labeling a key part of achieving better training results: according to Time magazine, OpenAI worked with outsourcing companies that hired large numbers of people to label data for ChatGPT. Yet even with a massive dataset, the effects ChatGPT can achieve today still fall short of the needs of subdivided industries.

Once ChatGPT pushes into vertical industries such as healthcare, banking, securities, and transportation, the dataset for any one of them is far smaller than for industry and society as a whole: small but refined, small but specialized. The more focused the industry, the easier it is to raise dataset quality and achieve a leap in the quality of ChatGPT training, as the sketch below illustrates.
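To make "small but specialized" concrete, here is a minimal sketch of adapting a small pretrained model to a vertical-industry corpus, assuming the Hugging Face `transformers` and `datasets` libraries; the base model choice and the file name `domain_corpus.txt` are placeholders, not anything from the article:

```python
# Minimal sketch: fine-tuning a small pretrained model on a vertical-industry
# corpus (e.g. finance or medical text). The corpus file name is a placeholder.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model_name = "gpt2"  # a small base model stands in for a domain-scale LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# A focused, high-quality domain corpus is far smaller than a web-scale dataset.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-model", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # orders of magnitude cheaper than training a large model from scratch
```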

Third, it becomes easier to combine ChatGPT with the needs of vertical industries and bring out the value of industry applications. Rooted in the application needs of each industry, a relatively independent, logically clear, and accurate industry corpus pushes ChatGPT training toward better results, and more accurate training is in turn absorbed into vertical industries faster. Small vertical-industry models and professional-field datasets narrow the scope and intensity of enterprise training, cut overall training costs, and drive ChatGPT toward the B2B enterprise level. That will be ChatGPT's stage of commercial development, an era of a hundred schools of thought contending across industries.

Fourth, it promotes the cloudification of ChatGPT. Building cloud-based ChatGPT models and toolsets, and riding the public cloud's broad innovation channels, will accelerate ChatGPT's popularization. For example, Amazon Web Services is working with Hugging Face on BLOOM, a large model comparable to the one behind ChatGPT, and with Stability AI on image tools comparable to OpenAI's DALL-E; these models and tools will be released on the public cloud.

Microsoft, of course, which invested hundreds of millions of dollars in OpenAI, was the first to launch OpenAI services on its Azure public cloud platform, and then extended ChatGPT technology to Power Platform to help developers do low-code and no-code development. More public cloud vendors will combine ChatGPT with the public cloud, and more software developers will partner with public clouds to accelerate the cloudification of chatbot technology, further expanding ChatGPT's reach. This will touch not only individual C-end users but also B-end enterprise users, and bring an agile, effective way to optimize ChatGPT applications.

It is true that moving from a few giant players to many vertical industries, with ChatGPT-related models becoming smaller, conforms to the objective law of industry development. In Wang Kun's view, however, any industry grows through three stages: make it work, make it perform, make it cheap. ChatGPT is currently at the "make it work" stage; it will inevitably move toward being easy to use and affordable, so that it can truly integrate into vertical industries and enter ordinary homes.

The reason to stress this law of development, from usable to easy to use to affordable, is that only when ChatGPT truly reaches the "make it cheap" stage and professional chatbots meeting industry application needs are established can large-scale popularization happen and chatbots be genuinely commercialized. From the early enthusiasm for teasing ChatGPT to commercialized industry applications, it will demonstrate unprecedented industry value and let more companies walking the ChatGPT gold-rush road succeed.

03

"Software Defined AI Computing Power"

GPU pooling brings ChatGPT into thousands of industries

Since model size fundamentally determines the computing power and cost that ChatGPT training requires, delivering ChatGPT to thousands of industries means working hard on GPU computing power and cost.

Before the ChatGPT boom, VirtAI Tech, founded in 2019, had already brought its original "GPU resource pooling" technology to the industry, with large-scale deployments and successful applications in vertical industries such as banking, securities, energy, and education, delivering higher GPU utilization and lower application costs for users seeking to reduce costs and increase efficiency.

 

Perhaps this is the technological shovel: a powerful tool that will help more ChatGPT entrants go "gold panning" across thousands of industries.

Relevant data show that the average GPU utilization of users worldwide is under 15%. Figures released by Amazon Web Services at AWS re:Invent put the average utilization of its GPU products at only 10-30%, and many domestic users sit below 10%. For such users, $9,000 of a $10,000 chip may simply go to waste.
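For readers who want to measure this on their own machines, here is a minimal sketch using NVIDIA's NVML Python bindings (`pynvml`), one common way such utilization figures are collected; the package choice is an assumption, since the article does not specify any measurement tool:

```python
# Sketch: sampling GPU utilization with NVIDIA's NVML bindings (pynvml),
# the kind of measurement behind the "under 15% average utilization" figures.
# Assumes the `pynvml` package and an NVIDIA driver are installed.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first GPU

samples = []
for _ in range(60):                              # sample once a second for a minute
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)
    samples.append(util.gpu)                     # percent of time the GPU was busy
    time.sleep(1)

print(f"average GPU utilization: {sum(samples) / len(samples):.1f}%")
pynvml.nvmlShutdown()
```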

ChatGPT training requires strong support from the underlying AI servers: the higher their efficiency and the better their performance, the more valuable the training results, and the key is to get the most out of the GPU. To support ChatGPT model training in vertical industries, users need to build more efficient server and GPU computing platforms. For any industry user, once ChatGPT model training starts, the demand for computing power becomes acute, and cutting costs while raising compute efficiency is the inevitable choice.

Across the GPU's development, the evolution of GPU resource utilization falls into four stages: simple virtualization, arbitrary virtualization, remote invocation, and resource pooling. When GPU pooling is mentioned, some people think of traditional GPU virtualization, or GPU slicing. But traditional GPU virtualization is rooted in hardware thinking and can only virtualize GPUs on the local physical machine. GPU resource pooling takes the entire data center as its scope: it supports local GPU virtualization, but it also breaks the physical boundary of single-machine scheduling, letting users transparently consume any number of GPU resources, from any vendor, on any physical machine. As an innovative provider of AI computing-power pooling solutions, VirtAI Tech, with its OrionX product, fully supports complete resource pooling across bare metal, virtual machines, containers, and Kubernetes.
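The difference between per-machine virtualization and data-center-wide pooling can be illustrated with a toy allocator: jobs request fractional GPU capacity and may land on any card in the pool, local or remote. This is a conceptual sketch only, not the OrionX implementation or API:

```python
# Toy illustration of data-center-wide GPU pooling: jobs request fractional
# GPU capacity and the allocator places them on any card with room, rather
# than binding each job to one whole local GPU. Conceptual only; this is
# NOT the OrionX API, just a sketch of the scheduling idea.
from dataclasses import dataclass, field

@dataclass
class PooledGPU:
    node: str
    free: float = 1.0                 # fraction of the card still available
    jobs: list = field(default_factory=list)

class GPUPool:
    def __init__(self, gpus):
        self.gpus = gpus

    def allocate(self, job: str, fraction: float) -> PooledGPU:
        # First-fit across the whole pool: the job may land on any node.
        for gpu in self.gpus:
            if gpu.free >= fraction:
                gpu.free -= fraction
                gpu.jobs.append(job)
                return gpu
        raise RuntimeError("pool exhausted")

pool = GPUPool([PooledGPU("node-a"), PooledGPU("node-b")])
print(pool.allocate("train-job-1", 0.5).node)   # two half-GPU jobs can share
print(pool.allocate("train-job-2", 0.5).node)   # one physical card on node-a
print(pool.allocate("infer-job-3", 0.25).node)  # spills to node-b, transparently
```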

Global Cloud Observation's analysis points out that only by standing at the height of the entire data center and solving the problems of low GPU utilization, high cost, and difficult allocation and management can anyone truly support ChatGPT dataset training for vertical industries of every size.

To match users' different procurement needs, VirtAI Tech has adopted two flexible business models: private deployment, with software licensing similar to VMware's, and a public-cloud-based service model similar to Flink's.

In the first, privately deployed model, VirtAI Tech focuses on software-defined AI computing power, using its GPU pooling innovation to help ChatGPT industry-model training cut costs and raise efficiency, making it more feasible to deliver ChatGPT to thousands of industries. Its innovation goal in GPU software is explicit: to become the "VMware" of the GPU industry and take the road of standardized commercial software. VirtAI Tech lets customers build an elastic, dynamic, flexible, and efficient software-defined pool of GPU computing power, converting the static allocation of all GPUs into dynamic allocation according to each user's specific hardware and software requirements. This flexibly supports users in maximizing GPU efficiency, raising utilization by 3 to 5 times, and helps vertical-industry ChatGPT training reduce costs and increase efficiency.

In the second, Flink-like public-cloud service model, VirtAI Tech's Trend Cloud builds on the company's deep accumulation in computing-power resource pools and in development and training platforms to offer development, training, and inference services for enterprises, research institutions, and individual AI developers. By connecting computing power worldwide, Trend Cloud provides AI computing power that is low-cost, guaranteed on demand, and free of vendor lock-in; by optimizing the whole AI algorithm development process and building a global community for sharing developers and project resources, it helps AI developers reach best practices quickly. Relevant tests show that for the same model trained to the same accuracy, computing power pooled by Trend Cloud currently costs 60% less than the public cloud.

In the roughly five months since it went online, Trend Cloud has gained nearly 10,000 registered users and thousands of active users, most of them teachers and students in AI-related university programs worldwide, small and medium-sized enterprises with AI algorithm teams, and individual development enthusiasts. These early constituents of the AI developer ecosystem are cultivating the future user base for the GPU pooling model. Trend Cloud gives AI developers a move-in-ready experience and opens up large numbers of datasets and source-code collections, through which its reach will be further extended.

Some in the industry also say that the endgame of AI is computing power, that is, the GPU. Moore's Law has held in computing chips largely because the rapid rise of the GPU has made up for the slowdown of the CPU: GPU transistor counts have grown faster than CPU counts, and CPU transistors have begun to lag behind Moore's Law. This is the era of data intelligence, and it will be an era in which GPUs, which excel at floating-point computation, shine.

"A time will come to ride the wind and cleave the waves; I'll set my cloud-white sail and cross the deep sea." For VirtAI Tech, which has the user experience, technology accumulation, and industry ideals, this rare opportunity means strongly supporting ChatGPT and other AI technologies in reaching more industry applications, strengthening technological innovation in the GPU field, seizing unprecedented development opportunities, continuing to polish its tools for vertical-industry ChatGPT "gold panning," and further enriching its AI computing-power pooling solutions. On the road of China's industrial digitalization, it is sure to win its own future.

Now or never. Standing at the forefront of the ChatGPT industry, VirtAI Tech is riding the wind.

(by Aming)

- END-

What do you think?

Comments are welcome at the end of the article!

[Global Cloud Storage Observation | Amin Observation | Technology Explanation] Focused on analyzing technology companies, letting data speak, and showing you how to read the tech industry. This article and the author's replies represent personal opinions only and do not constitute investment advice.
