How difficult is it to land a vertical large model?


Original source: those things on the Internet


The current state of the large-model race: on one side, startups are building on open-source large models; on the other, the major vendors keep one-upping each other on parameter counts.

According to one institution's incomplete statistics, 79 large models with more than one billion parameters have already been released in China. Amid this parameter arms race, another voice has begun to emerge in the market: "scaling up parameters without a clear development direction is meaningless."

As a result, some large models have shifted their development toward applications in vertical fields. With thousands of models in development the underlying base may keep changing, but on reflection, someone will always break through from a vertical industry.

At the same time, although closed-source large models are currently higher in quality and relatively safer, the large-model ecosystem still needs a degree of competition, and open source can genuinely drive its prosperity. Seen from another angle, open source gives many companies a ticket into the race, yet some of them stumble at the very first hurdle: the shortage of computing power.

After all, the number of large models keeps multiplying, but focusing only on that growth overlooks, to some extent, the choices, struggles, and problems of the companies behind them, including the possibility that some will give up after committing.

As is well known, the three pillars of artificial intelligence are computing power, algorithms, and data. Open source only covers the algorithm layer; beyond that, enterprises still need massive computing power and data for training, and the cost behind this is high.


01

Vertical large models,

is there still hope for startups?

When selecting open-source large models, quite a few startups choose small-parameter models for reasons of cost and custom development; for such companies they are often even the first choice.

The first reason is the cost of pre-training.

Guosheng Securities once estimated that training GPT-3 costs roughly 1.4 million US dollars, and for some larger LLMs the training cost runs between 2 million and 12 million US dollars.

In January this year, for example, ChatGPT averaged about 13 million unique visitors per day. The corresponding chip demand is more than 30,000 NVIDIA A100 GPUs, implying an initial investment of roughly 800 million US dollars and a daily electricity bill of about 50,000 US dollars.
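As a rough sanity check on those figures, the back-of-envelope arithmetic below reproduces their order of magnitude. The server price, GPUs per server, power draw, and electricity rate are illustrative assumptions, not numbers from the source.

```python
# Back-of-envelope estimate of the ChatGPT serving figures quoted above.
# All unit prices below are illustrative assumptions, not sourced numbers.

total_gpus = 30_000             # A100 GPUs cited above
gpus_per_server = 8             # assumed: 8-GPU servers (DGX A100-class)
server_price_usd = 200_000      # assumed price per 8-GPU server
server_power_kw = 6.5           # assumed power draw per server under load
electricity_usd_per_kwh = 0.08  # assumed industrial electricity rate

servers = total_gpus / gpus_per_server
capex_usd = servers * server_price_usd
daily_energy_kwh = servers * server_power_kw * 24
daily_electricity_usd = daily_energy_kwh * electricity_usd_per_kwh

print(f"Servers needed:        {servers:,.0f}")
print(f"Initial hardware cost: ${capex_usd/1e6:,.0f}M")         # ~$750M, same order as ~$800M
print(f"Daily electricity:     ${daily_electricity_usd:,.0f}")  # ~$47K, close to ~$50K
```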

What's more, even before the money is spent, large volumes of data are needed to support model training. That points to the second reason: the data requirements of pre-training.

Industry insiders have voiced a similar view: "the generalization ability of a large model is ultimately still limited by its data."


If too little high-quality data is screened and used for training, problems with the model's output quality become obvious, and the user experience suffers sharply.

It is fair to say that in pre-training, a great deal of money and time goes into data accumulation alone.

Moreover, on the large-model track most startups build around a vertical industry domain. Although the scope of work is relatively smaller, it is by no means easy.

Specifically, if a large model is to change an industry's business model, the simplest test is whether that model has enough industry data. In risk control, for example, it must be able to analyze the black-market operations hiding in the shadows; only with sufficient understanding can it avoid being exploited by them and being left in a passive position on security.


Another criterion is the quality of the output the large model finally produces once it is processing data in production.

After all, breaking the model monopoly on the basis of an open-source model requires optimizing and refining large amounts of data and investing in sufficient infrastructure.

Today's open-source models are, in effect, more like Android in the Internet era. For startups that lack the big vendors' advantages in deployment scenarios and accumulated data, development is not easy, but opportunities remain.

In fact, Alibaba's DAMO Academy once listed "the co-evolution of large and small models" as one of the trends of the future.

The startup Zhuiyi Technology likewise believes that "the vertical large model is a solid opportunity, just as the discovery of the American continent was hardly the work of one person."


So we can now see many startups choosing to enter the large-model race, including large models such as DriveGPT Xuehu Hairuo from Haomo Zhixing, Qizhi Kongming from Chuangxin Qizhi, and ChatYuan from Yuanyu Intelligence.

However, although no domestic products have yet reached the consumer (C-end) market, on the business (B-end) side the major vendors have begun initial deployment.

It is reported that the major vendors currently plan to deliver large models through the cloud. Cloud computing has become the most practical route to deploying a large model, Model-as-a-Service (MaaS) is attracting growing attention, and this will also drive down the cost of large models.
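In practice, MaaS usually means calling a vendor-hosted model over an HTTP API instead of running it yourself. The sketch below is a generic, hypothetical example: the endpoint URL, model name, request fields, and credential are placeholders, not any specific vendor's real API.

```python
# Hypothetical MaaS call: a hosted large model exposed over a plain HTTP API.
# Endpoint, payload fields, and response shape are illustrative placeholders.
import os
import requests

API_URL = "https://maas.example.com/v1/chat"  # placeholder endpoint
API_KEY = os.environ.get("MAAS_API_KEY", "")  # placeholder credential

payload = {
    "model": "vertical-finance-7b",  # hypothetical vertical model name
    "messages": [{"role": "user", "content": "Summarize today's risk alerts."}],
    "temperature": 0.2,
}

resp = requests.post(API_URL, json=payload,
                     headers={"Authorization": f"Bearer {API_KEY}"},
                     timeout=30)
resp.raise_for_status()
print(resp.json())  # the hosted service returns the generated text as JSON
```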

So, is there still hope for startups? 


02

Does winning or losing come down to matching product experience with market demand?

The magazine Fast Company predicts that OpenAI's revenue will reach 200 million US dollars in 2023, including API access services and chatbot subscription fees.

Clearly, demand for large models exists across industries, but given security concerns and B-end customers' cautious attitude toward them, the safety assurances large models can offer today are limited. So, starting from relatively basic capabilities, the large Internet companies are prioritizing high-demand scenarios such as dialogue, document content generation, and question answering, including conversation and document generation within collaborative office tools.

For example, a person now only needs to feed the AI the product information; the AI automatically generates sales scripts in a variety of styles, and a digital-human host is then assigned to sell the goods on the company's behalf. According to Baidu, compared with human-hosted live streaming, digital-human live streaming can run uninterrupted around the clock (7x24 hours), and its conversion rate is twice that of an unattended live-stream room.

Since cloud infrastructure is a prerequisite for large-model entrepreneurship, the Internet giants that own cloud computing businesses hold certain advantages.

According to IDC's 2022 global cloud computing IaaS market tracking data, the top 10 players by market share are all large companies from China and the United States, including Amazon, Google, Microsoft, and IBM in the United States, and Alibaba, Huawei, Tencent, and Baidu in China.


The open-source versus closed-source dispute over large models will not be settled by the appearance of one or a few products; it will take more top talent, technical iteration, and financial backing.

Compared with their peers, though, many AI startups also lack the luck of the unicorn MiniMax. (The difference is that MiniMax focuses on general-purpose large models.)

On July 20, Tencent Cloud disclosed the latest progress in helping MiniMax develop its large models: MiniMax's thousand-GPU-scale training tasks now run stably on Tencent Cloud over long periods, with 99.9% availability.

Reportedly, starting in June 2022, Tencent Cloud drew on product capabilities such as compute clusters, cloud-native services, big data, and security to build a cloud architecture for MiniMax spanning the resource, data, and business layers.

Reality seems to prove once again that getting the admission ticket is only the first step; the next test is each player's ability to explore commercialization and keep upgrading the technology. Put bluntly, an AI startup that wants to reach the end of the track cannot afford to misstep at any stage.


To some extent, start-up companies are not without advantages in the development of large models. 

Although some major Internet companies have already landed initial scenarios or begun selling services for revenue, both the big players and MiniMax have their eyes mainly on general-purpose large models.

The vertical large-model space is still largely a vacuum. For traditional enterprise groups in particular, given their businesses' low IT intensity and the poor input-to-output ratio, the likelihood of building a large model in house is low.

For example, Chuangxin Qizhi focuses on the industrial large-model product Qizhi Kongming; Yuanyu Intelligence, with a certain data advantage, develops the ChatYuan language model; and Haomo Zhixing's flagship is the generative autonomous-driving large model DriveGPT Xuehu Hairuo.

That said, with different training data and directions, costs vary widely.

First, training the Yuanyu large language model from scratch can cost tens of millions of RMB. In generative autonomous driving the bar is higher still: a new "language" different from ChatGPT's has to be designed, and all real road-driving data must be "translated" into that unified language, which demands even greater investment.


To some extent, the large sums AI startups have been able to raise for large models owe much to ChatGPT's commercial and marketing success, which let people witness the feasibility of large models overnight rather than leaving them buried in long cycles of technical iteration.

For this reason, the first step toward real deployment is that a large model's training and inference costs must come in below those of search, while responsiveness is still guaranteed.

03

From concept to implementation,

How difficult is it?

One view holds that the Chinese large-model startups that manage to break through will likely be vertically integrated.

Simply put, while building the underlying large model, they identify the model's final primary application scenario, then collect user data and iterate rapidly.

Intuitively, Yuanyu Intelligence leans toward this category: for a long time it has focused on the business of large natural-language models.

Yuanyu COO Zhu Lei also said, "We will not blindly expand the image and video business just to follow suit. Good business focus is important." 

For other startups moving into vertical large models such as autonomous driving and industrial manufacturing, however, the gap may lie in knowledge of specialized industry data.


After all, on the vertical large-model track, a core factor in future competition is private data and private experience. When an individual company's processes are unknown to the model builders, it can hold a unique competitive edge.

In addition, while keeping the business focused, data accuracy must also be maintained all the way from source through pre-training to output.

Generative AI is also drawing more regulatory attention. China recently released the Measures for the Administration of Generative Artificial Intelligence Services (Draft for Comment), which explicitly require that generated content be free of discrimination, true and accurate, and that false information be prevented; where problems do arise, providers must remedy them through content filtering, model optimization, and similar measures.

However, where these issues are inherent defects of generative AI, they are technically difficult to guarantee against or fully resolve.
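As a minimal illustration of the content-filtering remedy mentioned above, the sketch below screens generated text against a blocklist before it is returned to users. A real compliance pipeline would layer classifier models and human review on top; the categories and terms here are placeholders.

```python
# Minimal post-generation content filter: a blocklist pass before output is shown.
# Real systems add classifiers and human review; these terms are placeholders.
import re

BLOCKLIST = {
    "discrimination": ["placeholder_slur"],
    "misinformation": ["guaranteed 10x returns"],
}

def screen_output(text: str) -> tuple[bool, list[str]]:
    """Return (allowed, matched categories) for a generated string."""
    hits = [cat for cat, terms in BLOCKLIST.items()
            if any(re.search(re.escape(t), text, re.IGNORECASE) for t in terms)]
    return (len(hits) == 0, hits)

allowed, reasons = screen_output("This fund offers guaranteed 10x returns.")
if not allowed:
    print(f"Blocked, categories: {reasons}")  # caller falls back to a safe refusal
```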


Moreover, as better open-source models emerge, more companies will rush in to try their hand. For startups, is that not yet more competition?

Take Llama 2: on July 18, Meta released Llama 2, the commercially licensed successor to its first open-source AI model, Llama. Some companies believe that, judging from the various evaluation reports so far, apart from weak coding ability it has begun to approach ChatGPT in many respects.
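For a concrete sense of what "usable open source" means here, the sketch below loads a Llama 2 chat checkpoint with the Hugging Face transformers library. It assumes Meta's license has been accepted and the gated weights are accessible; the prompt is just an example.

```python
# Minimal sketch: running a Llama 2 chat checkpoint locally via transformers.
# Assumes Meta's license has been accepted and the weights are accessible.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated checkpoint on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain in two sentences why domain data matters for a vertical model."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```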

Perhaps the open-source community's frenzy will eventually commoditize large models with baseline capabilities, and privately deployed large models will sell at cabbage prices. Put bluntly, companies will be able to run private models very cheaply.

More importantly, as Tang Daosheng once said: "A general-purpose large model is very capable, but it cannot solve many enterprises' specific problems. It may solve 70%-80% of the problems across 100 scenarios, yet still not 100% meet the needs of a particular enterprise scenario. If, however, an enterprise fine-tunes an industry large model with its own data, it can build a dedicated model and deliver highly usable intelligent services."
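The kind of fine-tuning Tang Daosheng describes is often done today with parameter-efficient methods such as LoRA. The sketch below uses the Hugging Face peft library on an open base model; the base checkpoint, dataset path, and hyperparameters are illustrative assumptions, not a description of any vendor's actual pipeline.

```python
# Sketch: parameter-efficient fine-tuning (LoRA) of an open base model on private data.
# Base model, dataset path, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base_id = "meta-llama/Llama-2-7b-hf"  # assumed open base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Wrap the base model with small trainable LoRA adapters instead of full fine-tuning.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Hypothetical private corpus: one "text" field per in-house document.
dataset = load_dataset("json", data_files="private_corpus.jsonl")["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                      remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out-lora", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("out-lora")  # only the small adapter weights are saved
```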

Of course, that era of cheap private models has not arrived yet; for the startups on this track, opportunity and difficulty will continue to coexist.
