8.23 Notes on China's Large Model "Top Group Chat"


What was discussed in the closed-door exchange between Alibaba Cloud and the companies that make up half of China's large model landscape?


Text | Zhang Peng

In the history of China's technological innovation, no technology has seen a "consensus in the technology community" form as quickly as large model technology did, in just a few months.

I entered the technology industry in 1998 and have witnessed the PC era, the Internet era, and the mobile Internet era. I have never seen consensus reached this fast. Take Founder Park, Geek Park's entrepreneur community, as an example: because it paid early attention to the technological shift in large models, it gained 150,000 new followers in just four months, and its community membership has grown to 7,000 or 8,000.

Just yesterday, the first batch of domestic large models passed the government filing process, igniting enthusiasm once again. A filing-based regime signals a more relaxed policy toward large model development, and it also means that the commercialization and industrialization of large models in China can truly begin.

However, a "consensus" reached too quickly is itself cause for concern, because the technology is still at an early stage of development and has not yet penetrated a wide range of fields.

Objectively speaking, if we believe that large model technology has brought the dawn of AGI, we must also honestly acknowledge that turning it into real commercial value and productive force is only just beginning to be explored. The know-how and problems experienced by front-line startups are precisely the sparks most worth gathering together.

Based on this idea, Alibaba Cloud and Founder Park invited more than 20 outstanding entrepreneurs in the model layer, tool layer, and application layer of China's large-scale model field to have a face-to-face closed-door exchange at Xixi Wetland in Hangzhou.

Alibaba Cloud Chairman Zhang Yong also gave this closed-door meeting a very fitting name, "Xixi Lun Tao" (Discussing the Tao at Xixi). During this closed-door meeting, which ran for seven hours, Zhang Yong sat next to me and took part in the entrepreneurs' group discussion from start to finish. I saw that his notes filled several pages.

On August 23, a group photo of participants of the Xixi forum

Clearly, how Alibaba Cloud, as the computing infrastructure layer, should connect and co-create with these layers, and how to support entrepreneurs at every layer in making good use of large models, are the topics Zhang Yong cares most about. It also shows an attitude quite different from other domestic companies: what Alibaba Cloud is most concerned with is how to promote a prosperous large model ecosystem.

These attendees are arguably the most active forces in China's large model field. They talked from two o'clock in the afternoon until nine o'clock in the evening, exchanging and colliding across multiple layers of the industry and sharing insightful perspectives drawn from their latest practices. In their words, they heard a lot of truth here, and a lot of "real feeling."

I have compiled some of the most striking points and share them with you in this article.

01

Focus on large models,

but pay even more attention to Infra

Anywhere in the world today, when building large models, the scarcest resource besides talent is GPUs.

Wang Xiaochuan, founder and CEO of Baichuan Intelligence, shared that when he visited Silicon Valley and talked with friends there, he heard that Nvidia ships about one million GPUs a year, yet OpenAI has talked about designing a supercomputer that connects ten million GPUs together.

So how many GPUs are enough? And is there a way around limited computing power?

Kai-Fu Lee, chairman of Sinovation Ventures and founder of Zero-One Everything, said that although tens of millions of GPUs is a fantasy, the brute-force aesthetic of "scale produces miracles" has a basis. Richard Sutton, the father of reinforcement learning, pointed out in "The Bitter Lesson" that over the past seventy years, attempts to build knowledge into AI, to add hand-crafted capabilities, or to tweak model architectures have ultimately proven almost worthless. The only force that has consistently driven AI progress is general, scalable computation. Growth in computing power pulls algorithms and data forward with it; that is the backdrop against which miracles happen.

Therefore, companies that want to emerge from this wave of large models must first reckon with their computing power "endowment." With only a handful of people and a few dozen cards, the more pragmatic choice may be to call on centralized large models instead.

" When there is relatively sufficient computing power, and under this premise, we can make good use of the computing power, we can make many things that cannot be made today using only open source and Llama2 (Meta's large language model) ." In the past, there was OpenAI, which set a new benchmark for models regardless of cost, and then there was Meta, which opened up the road to pave the way for everyone. In the turbulent and highly uncertain environment of large model entrepreneurship, this is Kai-fu Lee's thoughts on the new goals and new practices of large model companies. .

What does this play style look like? How can one GPU deliver the capability of two or even three? The answer may lie in team composition. Kai-Fu Lee believes the Infra (underlying hardware and systems) team must be stronger than the Modelling team. He said people will soon find that those who have done Infra for large models are more expensive and scarcer than those who have trained large models, and that those who can work with the Scaling Law (the law by which model capability grows as training compute increases) are scarcer still.

This is because an excellent Scaling team can avoid futile training: when it trains, it has a high probability of success, and if a run is failing, it has the ability, and the mathematical grounding, to stop immediately. There are also many subtle details and hard-won experiences. For example, reading papers critically saves many detours, because some papers deliberately describe approaches that do not actually work, and without enough reading you can easily be led astray.
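For reference (this formula was not part of the meeting), the Scaling Law that Lee alludes to is usually written in a form like the one below, following work such as Hoffmann et al.'s "Chinchilla" paper; the constants are fitted empirically for each training setup rather than being universal:

```latex
L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
\qquad C \approx 6\,N D
```

Here L is the pre-training loss, N the parameter count, D the number of training tokens, C the total training compute in FLOPs, and E, A, B, α, β are fitted constants. A "Scaling team" in Lee's sense fits such curves on small runs and then chooses N and D under a fixed compute budget before committing to an expensive full run, which is exactly how futile training gets avoided.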

In fact, objectively speaking, the GPU shortage is not just a problem for Chinese entrepreneurs; entrepreneurs all over the world face it. How well a company works with limited computing power will therefore become key to the competition among large model companies.

Kai-Fu Lee made a clear point: a large model team needs talent in every position. Pre-Train (pre-training), Post-Train (post-training), Multi-Modal, Scaling Up, Inference and so on all matter, and among them Infra talent is the scarcest and should be valued most.

In fact, beyond entrepreneurs deepening their own understanding of large models, innovation is needed along more technical dimensions. For example, Wang Wei, founder and CEO of Moxin and an infra-layer entrepreneur at the meeting, shared a computing approach: sparse computation. In it I saw the possibility of cloud and edge AI chip acceleration that fully sparsifies neural networks by optimizing the computation model, providing a general AI computing platform with very high compute and very low power consumption.

02

ChatGPT ignites passion,

Llama 2 keeps you grounded

If ChatGPT ignited the enthusiasm of many entrepreneurs, then Meta's open source LLaMA and Llama 2 put most entrepreneurs at something like an equal starting line for base models. But looking ahead, entrepreneurs will clearly set different missions and visions based on their own resource endowments and capability structures.

For entrepreneurs who still choose to build large base models, open source bases are only a starting point. Kai-Fu Lee pointed out that although benchmark comparisons show only a small gap between Llama 2 and SOTA (state of the art) models such as GPT-3 and GPT-3.5, in actual use Llama 2's capabilities today are still far from GPT-4 and the next version of Bard (Google's large language model).

Image source: Meta

This seems to leave room for large model companies to maneuver: in the future, entrepreneurs who are "really rich" and "really capable" will have the chance to switch to a New Bard or New GPT-4 style of play.

On the other hand, many entrepreneurs said Meta's open source release has had a huge impact on the industry. "Today xxx may still be the best model in China, but tomorrow it may be surpassed. One day you may even suddenly discover that the models you have trained are basically useless. When the technology shifts or a stronger open source model comes out, past investment may be completely wasted. For example, if an open source model has already seen a trillion English tokens in pre-training and your own model has to go through them again, that may be pointless." Mobvoi founder and CEO Li Zhifei believes the far-reaching impact of open source must be fully appreciated.

"Although everyone has great ideals and ambitions, it depends on whether there are enough funds to support that day. So you have to be down-to-earth and see that. Living may be more important than anything else." Zhou Ming, CEO of Lanzhou Technology, also believes, Many companies that originally wanted to be the "best big model" actually need to rethink the ecological niche of entrepreneurship, choose to embrace open source, and build things "for me" on the basis of open source . For example, English open source models are weak in Chinese language skills and have not been polished in industry scenarios and data. This happens to be an opportunity for entrepreneurial teams.

On this point, Lanzhou Technology treats the open source model as an L0 base and builds an L1 language model, L2 industry models, and L3 scenario models on top of it. Zhou Ming believes that by building layer by layer, interacting with customers through AI Agents to gather feedback, and iterating the model bit by bit, barriers gradually form. Even if a better open source model appears later, there are ways to retrain or keep iterating on top of it. "With open source models, a rising tide lifts all boats; you grow along with those who are better than you."

Making good use of open source models is itself a barrier and a threshold, which may not be how many people see it. Some even ask: if it is built on an open source model, does it still count as a large model? Meanwhile, many companies avoid mentioning that they use open source models at all.

In fact, building on an open source model still requires substantial follow-on investment and capability; using open source merely lowers the cost of a cold start, which is nothing for entrepreneurs to be ashamed of. As Li Zhifei analyzed, an open source model may already have seen a trillion tokens of data, saving you millions of dollars, but the model maker still has to keep training it and ultimately push it toward State of the Art (SOTA, the leading large models). Data cleaning, pre-training, fine-tuning, and reinforcement learning are all indispensable steps, and the annual computing bill may start in the tens of millions of dollars. The threshold does not suddenly disappear, and using an open source model does not mean you can stop investing.
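As a rough illustration of what "continuing to train on an open source base" looks like in practice (not code shown at the meeting), the sketch below does continued pre-training / supervised fine-tuning with the Hugging Face transformers stack. The base model name, data path, and hyperparameters are placeholders, and the reinforcement learning stage Li mentions would follow as a separate step.

```python
# Minimal sketch of continued training on an open-source base model.
# Assumes the Hugging Face transformers/datasets libraries; the model name,
# dataset path, and hyperparameters below are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE = "meta-llama/Llama-2-7b-hf"          # open-source L0 base (placeholder)
tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE)

# Cleaned, domain-specific corpus (e.g. Chinese or industry data); placeholder path.
raw = load_dataset("json", data_files="cleaned_domain_corpus.jsonl")["train"]

def tokenize(batch):
    # Assumes each JSON record has a "text" field.
    return tokenizer(batch["text"], truncation=True, max_length=2048)

train_ds = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

args = TrainingArguments(
    output_dir="ckpt-domain-continued",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    num_train_epochs=1,
    bf16=True,
    logging_steps=50,
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

Even in this simplified form, most of the real cost sits outside the script: cleaning the corpus, running the job across many GPUs, and the fine-tuning and reinforcement learning passes that come afterwards.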

From this perspective, choosing an open source model is the more pragmatic option, and optimizing and training a practical model is itself real skill. Built on open source, you still have the chance to make a good large model; the core is having relatively leading understanding and the ability to keep iterating the model.

03

The current state and practice of large model To B

Improving model capabilities is one thing; applying them in customer scenarios is another.

From the customer's perspective, "bigness" is not the only thing to pursue in a large model, and it may not be what customers want at all.

One entrepreneur shared a very realistic customer scenario: when you actually talk to B-side customers, they only need language understanding, multi-turn dialogue, and a certain amount of reasoning; they do not need any other AGI (artificial general intelligence) capabilities.

Customers told him that the extra capabilities caused trouble, and the "hallucination" problem could not be solved. Moreover, customers already had many AI 1.0 models that worked well, so why throw them away? AI 2.0 does not need to replicate 1.0's capabilities; it is enough if it can call them sensibly. This also explains why the RPA field, at home and abroad, has been the most active in adopting large models. Wang Guanchun, co-founder and CEO of Laiye Technology, has verified this year that domestic customers have clear needs of this kind.

In such cases, as long as the natural language is understood correctly, parameters are extracted and passed to AI 1.0 models and external databases, the results are reliable, and the cost stays relatively low; at the end, the large model assembles the results into a report. The large model plays the role of task dispatcher here: it breaks the request into subtasks and decides what each subtask calls. Some subtasks are handled by large models, some by existing statistical models, and some not even by our own models but by third-party ones. What the customer ultimately wants is simply for the task to be completed.
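To make the "large model as task dispatcher" pattern concrete, here is a minimal, hedged sketch. The `call_llm` function stands in for whatever chat-completion API the system uses, and the legacy forecast model, the SQL helper, and the tool names are hypothetical placeholders for the AI 1.0 models and databases a real customer system already has.

```python
# A minimal sketch of the "large model as task dispatcher" pattern described above.
import json

def call_llm(prompt: str) -> str:
    """Placeholder for a call to a hosted or self-trained large model."""
    raise NotImplementedError

def legacy_forecast_model(sku: str) -> dict:
    """Placeholder for an existing AI 1.0 statistical model the customer already runs."""
    return {"sku": sku, "forecast": 1200}

def run_sql(sql: str) -> list:
    """Placeholder for a query against the customer's external database."""
    return []

# The capabilities the dispatcher can route to: the large model, AI 1.0 models, databases.
TOOLS = {
    "forecast_demand": lambda args: legacy_forecast_model(args["sku"]),
    "query_sales":     lambda args: run_sql(args["sql"]),
    "summarize":       lambda args: call_llm("Summarize briefly: " + args["text"]),
}

def handle_request(user_request: str) -> str:
    # 1) The large model decomposes the request into subtasks and picks a tool for each.
    plan = json.loads(call_llm(
        'Split the request into subtasks. Answer as JSON: [{"tool": "<name>", "args": {...}}]\n'
        "Available tools: " + ", ".join(TOOLS) + "\nRequest: " + user_request
    ))
    # 2) Each subtask is executed by whichever component owns that capability.
    results = [TOOLS[step["tool"]](step["args"]) for step in plan]
    # 3) The large model assembles the pieces into the report the customer asked for.
    return call_llm("Write a short report from these results: " + json.dumps(results, default=str))
```

The model only needs to understand language, plan, and assemble; the reliability and most of the cost advantage come from the existing components it routes to.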

Having found this kind of PMF (Product Market Fit), if you only do this kind of To B work, the model capabilities needed are language understanding, multi-turn dialogue, and a modest amount of reasoning. The model does not need to be very large; something in the tens of billions up to around 100 billion parameters is relatively enough. Accordingly, a team needs to achieve good language understanding, multi-turn dialogue, and a certain level of reasoning on a few hundred cards; combined with AI Agents, that can basically meet customer needs in many scenarios.

A general model does not mean it can solve every problem; in many B-side customer scenarios, general large models simply will not work. This means more and more models are needed as scenarios converge, and it also means more players must get involved in aligning technology with scenarios, rather than one universal technology adapting to every scenario.

Zhou Ming, CEO of Lanzhou Technology, believes that user data, industry data, and even knowledge graphs or rules must be fed into the model for continued training; this is why industry large models need to exist. In industries that general large models cannot cover, adding such data can solve industry problems very well and also overcome many hallucination problems.

I remember Li Zhifei adding a similar perspective: general large models and vertical large models each have their uses, and you cannot have it both ways. A particularly large model means very high inference cost, and it is meaningless for a model built for chip design to also answer entertainment questions about movies and celebrities. In his view, To B is more about verticality and reliability, while generality is about IQ: strong reasoning, logic, and rich knowledge, which is not necessarily what To B needs at this stage.

At the same time, domestic industries across the board have very strong demand for adding large models to their businesses. After the products of two SaaS companies, Blue Lake (founder and CEO Ren Yanghui) and Moka (co-founder and CEO Li Guoxing), integrated large models, they won customer recognition and actually got paid.

Watching how these two entrepreneurs' states changed from February and March to July and August, I found that the sooner a SaaS company recognizes that the change large models bring is at the level of "redefining software," and dares to practice that "redefinition" with a "living toward death" mindset, the sooner, within a few months, the anxiety basically disappears and is replaced by hope.

Therefore, entrepreneurs who already hold customers and scenarios may be among the earlier beneficiaries of the technology dividends created by large model entrepreneurs.

Because in specific scenarios, large models are pursued for different reasons. For example, Peng Jian, founder and CEO of Huashen Intelligent Medicine, said the hallucination that large models bring may actually be beneficial to AI for Science fields such as drug design. To some extent, in certain fields the so-called hallucination is exactly what intelligence is for, because it can help design protein combinations no one would have imagined.

Zhipu AI has the most, and fastest-moving, large model deployment cases in China. Its CSO Zhang Kuo has come to believe through practice that of the value of future large models, "20% may be centralized, and 80% decentralized." In other words, the inevitable trend is to use a richer variety of large models to create value in concrete customer scenarios, rather than one large model with unlimited generalization solving every problem. Many of the entrepreneurs in the room agreed.

04

AGI is worth pursuing;

But don’t “risk your life”

Large models are a watershed in AI. In the past, artificial intelligence pursued deterministic goals in closed systems, such as a facial recognition system chasing 100% accuracy. Now, the "emergence" that large models bring is an open intelligence that generates all kinds of possibilities beyond what its designers expected. That is the real hallmark of intelligence, and the biggest change in artificial intelligence in sixty or seventy years.

With such a new intelligent system, everyone will eventually be able to obtain intelligence conveniently and at low cost, much as the electricity revolution did for power.

Huang Tiejun, president of the Zhiyuan Artificial Intelligence Research Institute, believes this technological change has spread very quickly, and everyone from big companies to startups has rapidly reached a consensus: this is the beginning of a new era. To do nothing in this era would feel like letting down both the era and the technology itself.

Baichuan Intelligence, founded in April, currently has the most intense release cadence of any large model company in China, keeping a pace of roughly one model every 28 days. Although founder and CEO Wang Xiaochuan does not see it as racing for its own sake, he shared tips for moving fast: for example, a team with accumulated search technology helps enormously with data processing, and introducing search augmentation, reinforcement learning, and other supporting full-stack technologies really can help the model do better. "If you look at the technical leadership of today's technology companies, you will find that many of those doing well technically have search backgrounds. It shows that the logic behind some of these technologies is gradually being understood."

However, Huang Tiejun believes that from a research perspective we are still at the early stage of a great era. If we compare it to the age of electricity, today's age of intelligence is roughly where Faraday was when he built a generator: spin it and current comes out; today we train intelligence out of big data. That is one stage. Later we will need another figure, a Maxwell, because it was the establishment of electromagnetic theory that made electricity reliable and usable across human society and let it drive the industrial revolution.

Much about today's large models is still a black box. On one hand, the "upper limit" still has huge room to rise, and AIGC often delivers big surprises; on the other hand, the "lower limit" is not stable enough. At this point it is necessary to understand the boundaries of the technology and set goals and problems sensibly: some people should work on pushing up the upper limit, others on stabilizing the lower limit.

For entrepreneurs, the dawn of AGI (artificial general intelligence) has appeared. It is a cause worth pursuing, but don't bet your life on it.

Image source: Visual China

Meanwhile, rather than simply waiting for large model technology to advance further, many middle-layer entrepreneurs are improving the environment for turning large models into applications.

Liu Cong, head of BentoML Asia Pacific, said that compared with traditional machine learning, overseas customers can now generally get some budget to build product prototypes or demos around large models, but these have not yet entered production environments to generate commercial value for their companies. Many middle-layer entrepreneurs have seen this as an opportunity.

Dify.ai founder and CEO Zhang Luyu's entrepreneurial insight comes from the same place. From a developer's perspective, he said, getting a model is not enough. He shared one data point: after analyzing more than 60,000 application samples, he found that barely 5% are in production or close to it. Some teams are unsatisfied with the model technology; others have workflows that have not yet adapted to AI application development. Accordingly, Zhang Luyu's team has built special capabilities for the kinds of applications that are more likely to reach production today. For example, they track an indicator of how much AI reduces friction for the end consumer in a given task, and provide capabilities accordingly.

Zilliz founder and CEO Xingjue added another perspective: an extremely simple development stack is a prerequisite for the democratization of AI. Based on this judgment, he proposed the CVP development stack (large model + vector database + prompt engineering).
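As a rough sketch (not code presented at the meeting), the CVP pattern usually amounts to retrieval-augmented generation: embed documents, store them in a vector database, retrieve the nearest chunks for a question, and put them into the prompt. A real deployment would use a vector database such as Zilliz's Milvus; here a tiny in-memory index stands in for it, and `embed` and `call_llm` are placeholders for whatever embedding model and chat model the application uses.

```python
# A minimal sketch of the CVP pattern: C (large model) + V (vector database) + P (prompt).
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: return a unit-length embedding vector for `text`."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Placeholder: call a hosted or self-trained large model."""
    raise NotImplementedError

class TinyVectorStore:
    """Stand-in for the 'V' in CVP: stores vectors and returns the nearest chunks."""
    def __init__(self):
        self.vectors, self.chunks = [], []

    def add(self, chunk: str):
        self.vectors.append(embed(chunk))
        self.chunks.append(chunk)

    def search(self, query: str, k: int = 3):
        q = embed(query)
        scores = np.array([float(q @ v) for v in self.vectors])  # cosine on unit vectors
        return [self.chunks[i] for i in np.argsort(-scores)[:k]]

def answer(store: TinyVectorStore, question: str) -> str:
    context = "\n".join(store.search(question))           # V: retrieve relevant chunks
    prompt = ("Answer using only the context below.\n"    # P: prompt engineering
              f"Context:\n{context}\n\nQuestion: {question}")
    return call_llm(prompt)                               # C: the large model
```

The appeal of the stack is that none of these three pieces requires training a model, which is what makes it "extremely simple" in Xingjue's sense.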

05

How do we get to AI Native?

What is the Killer App in the AI ​​era? When Microsoft released Copilot in March this year, many people's curiosity was instantly ignited. But at this closed-door meeting, Kai-fu Lee put forward a different perspective. Copilot is not an all-in large-model product .

He used WeChat, one of the most successful products of the mobile Internet, to argue that it is important to give up backward compatibility. MSN and QQ came first, but WeChat won because Zhang Xiaolong made a decision: since this is the mobile Internet era, there is no need for a PC version. From the start WeChat focused on what is distinctive about the mobile Internet, betting 100% on the new technology platform.

From this perspective, an AI Native application has one defining trait: remove the large model and the application collapses, because it relies entirely on the model's capabilities. Take Copilot away, and Office is still Office; the AI is just icing on the cake.

This view won broad agreement from the entrepreneurs on site and triggered discussion of AI Native applications under this definition.

Zhang Yueguang, the product manager of Miaoya, a popular product some time ago, believes that without large models, there would be no Miaoya. This is consistent with Kai-fu Lee’s thinking on AI first and AI native.

He believes that the most important thing about Miaoya, as the first application to break out, is controllability. The Miaoya team did not initially want to work on the underlying model; it focused on how to use the plug-ins and small models built by open source enthusiasts in the existing ecosystem to achieve controllability. Anchored on controllability as the top priority, Miaoya pushed average photo quality above 90 points and took off quickly.

"At the application layer, we pay special attention to how to make the model more controllable, and we find that there are already some relatively controllable technologies in the image track. Maybe in the language track, if something like this appears, it will be a big blow to upper-level application entrepreneurs. A moment of change." Zhang Yueguang's practice has given some inspiration to companies doing large-scale model applications. Controllability may be a condition for the birth of AI native applications. Stability. AI China Lead Zheng Yizhou has also observed this trend. After open source community contributors solved the problem of controllability, a large number of applications emerged .

In exploring the new generation of applications, Li Yan, founder of Yuanshi Technology, pointed out that the reasoning ability brought by large models is the essential difference of the new generation of products.

Social + Agent is a promising opportunity and will surely be among the first batch of AI Native products, but it likely requires entrepreneurs to have "end-to-end" build capability from the large model to the product. Li Zhifei shared that when he asked Character.ai why it built its own large model, the answer was that centralized models like OpenAI's or Google's will not answer "flirtatious" questions. That is a distinctive space Character.ai found for itself, and a barrier it can keep accumulating.

Lingxin Intelligence, working in the same field, has found its own niche in applying large models to social products. CEO Zhang Yijia shared that what they observed differed from their expectations: the social scenario where large models land today is not companionship, because it takes time for people to accept companionship from virtual personas. The scenario that works now is role-playing, with a user profile of online-novel fans; role-playing is a new form of the online novel.

As for AI Agents, the newest direction, whether they are truly "the whole village's hope" for large models, and whether they will eventually bring revolutions in interaction, devices, and business models, will likely depend on how multi-modal capabilities develop.

Tao Fangbo, founder and CEO of Xinshi Universe, explained that everyone had high expectations for Agents at the start, but under current technical conditions it is hard to show that an Agent solves more problems than ChatGPT. He believes that if you really want Agents to deliver, the answer is not to wire up ever more software APIs, because connecting software APIs is essentially about compatibility: old wine in a new bottle.

Is there a more native form for Agents to complete the last mile? There is a great deal still to do, including spatial perception and multi-modal capabilities, said Song Zhen, founder and CEO of Digital Life. When those conditions mature, a killer case may appear.

Li Zhifei firmly believes that multimodality now looks like it belongs at center stage, not as decoration. Agent input and output depend on multi-modal capabilities; without multimodality there is no Agent. Today's Agents still give feedback mainly through language models and text, but in the end an Agent will observe, perceive, and act multimodally. He predicts that cross-modal knowledge transfer will be the biggest contribution of large language models within two or three years.

06

In the era of large models,

Serve big B or small B?

A few months ago I happened to be in San Francisco during the developer conference of the data company Databricks. Databricks is a data platform company known for "data lakes," a "middle-tier" company that thrives on top of cloud computing platforms, and in just a few years its valuation has reached tens of billions of dollars and is still growing. Its customers range from large enterprises to small startups.

This year the company quickly embraced large models, acquired the large model company MosaicML, and began helping customers bring large models into their businesses. This momentum has kept pushing its valuation higher.

What I was very curious about at the time was why China seems to have no such "middle-tier" company built on cloud computing, and whether this wave of AI progress could spawn a group of outstanding "middle-layer" companies in China that turn cloud computing power into business competitiveness and bring digital progress to more industries.

Zhang Yong, chairman of Alibaba Cloud, believes the emergence of "middle-tier" companies is entirely possible, and cloud computing companies are happy to see it happen. But these companies still have to solve one core problem: clearly defining whom they serve and what problems they solve. The clearer the definition, the deeper the capability, and the more what they build can truly "converge" and achieve real business "penetration."

This triggered discussion among the entrepreneurs present. Large model technology has only just begun to enter industries, yet the old problems of "non-convergent," project-based enterprise services are already appearing. For example, a team does large model training for a B-end customer, but because the data belongs to the customer, it is hard to "close the loop" after the engagement ends: there is no data flywheel, revenue and gross margin are low, and before you know it you have become a "high-end technology construction crew." This is a problem technology companies commonly face on the B side, and some entrepreneurs have even begun to suspect that large model To B may simply lack fertile soil.

But Zhang Yong, who had been taking notes throughout the entrepreneurs' discussion, offered a different, systematic view here: "To B actually has another possibility, which is small B: the small, medium and micro enterprises. They may look inconspicuous, but they are enormous in number. Serving them alone was enough to create today's Internet giants."

For example, Alibaba's early "Yellow Pages" allowed small and medium-sized sellers to be seen by foreign buyers, bringing about the prosperity of cross-border trade; Taobao solved the problem of information and logistics circulation, and established the e-commerce category.

Moreover, unlike large companies, these small B customers do not care about technology or vision; whoever helps them solve their growth problems gets paid.

A main purpose of large companies' current digitalization is to "cut costs and raise efficiency," which bluntly means saving money. But there is always a limit to how much efficiency can be optimized, whereas the space for "opening new sources" of growth is comparatively unlimited. Zhang Yong believes that in enterprise services, opening new sources of revenue matters far more than cutting expenditure, and people are always willing to pay for growth.

He even believes the past emphasis of digital enterprise services on "cost reduction and efficiency improvement" may have been a misunderstanding, because those willing to pay for a few percentage points of efficiency are usually large companies: they are big enough that the improvement justifies the input-output calculation, which in turn pushed everyone into doing projects for big companies. Conversely, it is hard to activate small companies' demand through "cost reduction and efficiency improvement"; what they want is the ability to grow.

In fact, small B customers have a dual nature: if they are served through a "subscription" model, they can effectively be treated as "C-end users."

On this point, Zhang Yong's view resonated with the entrepreneurs present. Li Zhifei of Mobvoi, for example, once did To B business in speech recognition and was painfully ground down by competition with peers. Later, his AI dubbing tool "Magic Sound Workshop" served content creators, converging into a product that genuinely solved small B customers' common problems; those small B customers are what finally let him turn AI technology into a healthy, growing business.

Zhang Yong also suggested that startups should decide from the outset which customers they serve, whether C or B, small B or big B; it must be clearly defined. He even feels it will not work for an AI company to do To Big B, To Small B, and To C all at once.

Although AI technology keeps advancing and becoming more general, beyond the technical level there are also organizational "DNA issues." "Within the same company, the team serving big customers and the team serving Internet users may dress differently at work and even speak differently." Zhang Yong feels a company must clearly define whom it serves and what problems it solves, rather than chasing whatever order comes along.

07

What do large models

mean for the cloud?

During the previous AI wave a few years ago, many startups also raised large amounts of funding and produced well-known companies and entrepreneurs, yet several years on things remain very hard. I have kept in touch with many entrepreneurs from that wave, and in many of those conversations I would see tired faces and hear hoarse voices; when I asked, it was often because they had not yet recovered from drinking with a big client the night before.

Many of this wave's entrepreneurs witnessed how, in that era, technology failed to become standardized products and devolved into "high-end human outsourcing" and "high-tech construction crews" that could only take on projects. They are all determined not to repeat those mistakes.

At the same time, everyone is also very concerned about the changes cloud computing platforms like Alibaba Cloud will face in the era of large models. They asked Zhang Yong: in the era of large models, does he see the cloud itself as a technology or as a product?

Zhang Yong's answer was direct: "The cloud itself should be a product, and not one product but a series of products." Driven by the wave of large models and AI, one thing is certain: the industry and its customers are making brand-new demands on computing power. Meeting customers' growing needs for computing power has become Alibaba Cloud's basic starting point. Zhang Yong feels there are certainly technical problems to solve, but Alibaba Cloud also has to think about how to "converge" into products that genuinely solve problems for the industrial ecosystem, rather than merely outputting raw computing power.

Interestingly, although this exchange was jointly hosted by the Founder Park community and Alibaba Cloud, no session about "Tongyi Qianwen" was arranged. Entrepreneurs naturally wondered about the purpose of the cloud platform building its own large model. Zhang Yong's view: in an era of technological transition that tends to produce Hyper Scalers, no one dares to fall behind, and it is impossible not to touch the technology itself. But he feels that in such a time of great change, Alibaba Cloud must hold on to its more essential role, that of Cloud Service Provider.

"To do this role well, you must not understand the big model." Zhang Yong said: "If we don't do Tongyi Qianwen, we may not know how to help the entrepreneurs participating in the meeting today."

What excites Zhang Yong more is his conviction that human society's future demand for computing power is unlimited, and that requirements for its efficiency will keep rising. So Alibaba Cloud naturally hopes for "the more models the better, and the more scenarios the better": the more of both there are, the higher the demand for computing power and the higher the technical bar, which means the cloud has new problems to face and solve. Only a continuous supply of "hard problems" worth solving can open up greater room for the cloud's value to grow.

" Unprecedentedly, cloud computing platforms need an ecosystem, rather than doing everything themselves . At present, no company can use its own chips, cloud computing, data platforms, machine learning frameworks and large models to form a so-called "closed loop" ”, which is almost physically impossible. "

Zhang Yong feels that AI's development brings new possibilities to the ecosystem. He has one regret: the past ten years were a period of rapid growth for cloud computing in China, yet China's SaaS industry did not substantially improve along with the infrastructure. American SaaS companies are now exploring embedding AI into their platforms to upgrade themselves, taking a different path from domestic companies.

He believes that in the AI era a new generation of SaaS may appear in China as a brand-new kind of intelligent service. Unlike earlier process-driven SaaS, this new service will be driven by data and intelligence, and may not even be called SaaS.

Li Dahai, director and CEO of Face Wall Intelligence, pointed out that the domestic To B market is highly fragmented, which is why SaaS has struggled to take off. Now that large models introduce a new technical variable, it is worth watching whether that can change. He also hopes cloud vendors like Alibaba Cloud can provide good solutions and foundations so everyone can work together to smooth the path.

In Zhang Yong's view, many of China's past SaaS companies cannot really be called Cloud Native. But a new kind of service that grows natively on the cloud, or is intelligence native, has the opportunity to "replace" the non-native products of the previous era.

We often lament that the growth of China's SaaS industry over the last decade was unsatisfactory, but large models now give startups new opportunities and the possibility of shaping a new pattern within a new digital ecosystem. Zhang Yong concluded: these opportunities and challenges are shared by Alibaba Cloud and all entrepreneurs; everyone must find their own position in the future, form ecological partnerships, and create value together.

Hangzhou Xixi Wetland, famous for its ecology | Source: Visual China

These, then, are some excerpted notes from the seven-hour exchange. My strongest feeling is that the changes brought by large model technology have only just begun. After the extreme excitement and "over-imagination" of the first half of the year, a technological revolution that may run for ten years has now truly started its "Long March." Only after the fervor comes the real pioneering period, and only the "consensus" earned through enough time and hard work is true consensus.

I hope there will be more candid exchanges and collisions of ideas, in the "open source spirit," between entrepreneurs and the industrial ecosystem. The name Zhang Yong gave this exchange, "Xixi Lun Tao," is apt: having sat down to discuss the Tao, it matters even more to stand up and practice it.

I think this "Tao" should be the "Tao" of innovation from technology to products, from vision to value in the AGI era .

*Source of header image: Visual China

This article is an original article by Geek Park. For reprinting, please contact Geek Jun on WeChat geekparkGO


Origin blog.csdn.net/Datawhale/article/details/132784224