Making China's open source models free for commercial use is the right step

On the second day after China's regulations on generative AI took effect, just as LLaMA, currently the most powerful open source model, was about to permit commercial use, and the "secret recipe" of GPT-4 had been further "leaked," ChatGLM, the domestic open source large model most recognized by the global developer community, announced an important decision: its latest model, ChatGLM2-6B, which can run on a single GPU, is now free for commercial use by enterprise users.

This news was drowned out by headlines such as the release of Claude 2 and the founding of Musk's xAI, and it received little discussion. But it is in fact another important moment in the open-sourcing of China's foundation models.

On the evening of July 14, Zhipu AI and Tsinghua's KEG lab issued an announcement: to better support the domestic open source ecosystem for large models, the weights of ChatGLM-6B and ChatGLM2-6B are, effective immediately, fully open for academic research, and free commercial use is permitted after completing enterprise registration and obtaining authorization. The announcement provides a registration portal; the only information required is name, country, email address, organization, purpose, and the model being applied for (ChatGLM-6B or ChatGLM2-6B).

According to the announcement, since ChatGLM2-6B was released on June 25, its downloads on Hugging Face have exceeded 1.2 million in less than a month.

According to Zhipu AI's official website, privatized deployment of GLM2 with unlimited instances plus an unlimited inference or fine-tuning toolkit was previously priced at 300,000 yuan per year. One developer who asked Zhipu about pricing just before the announcement was told to wait a while, as they "will reduce the price."

"Then it's free," he said.

In fact, the move was not sudden.

According to recent reports, Meta is preparing to release a commercial version of its AI model LLaMA. Not long before, OpenLLaMA, an open source model that uses exactly the same preprocessing steps and training hyperparameters as the original LLaMA, had already announced full open source commercial use. Meanwhile, the person who had earlier surfaced Google's internal "no moat" memo went on to "reveal" the engineering and training details of GPT-4, and discussions among industry practitioners tend to find the account credible.

Everything is evolving rapidly, and the rapid evolution of foundation models means there are fewer and fewer secrets. On one hand, the large model itself is not that mysterious: once past the halo that initially astonished everyone, more people come to understand it, and the frequent movement of core technical talent among a handful of major companies will ultimately leave few secrets. More importantly, there is the astonishing energy of the open source community, where many talented people are optimizing models on top of open releases; this long-term stamina is beyond the reach of closed-source models, and these capabilities will eventually be combined. In the past few months alone, the open source community has produced Stanford Alpaca, which lets anyone instruction-tune LLaMA; GPT4All, a collection of various models that can be trained for around $100; Falcon, the UAE model whose performance is comparable to LLaMA's; RedPajama, a higher-quality dataset; and "cracked" reproductions like OpenLLaMA.

And the recent "secret revelation" also made many people feel that the non-disclosure of GPT-4 is not a security consideration, but it is too easy to learn - rumors such as the MoE model architecture have some calming meaning. So it seems that an important consensus has been formed, that is, any achievements of the models at a certain stage cannot become a moat. Therefore, for the most high-profile and popular open source basic models, allowing commercial licenses is a must, because this will further attract developers and allow these ingenuity to grow based on its ecology.

This also means that, in the long run, many analyses premised on short-term starting points such as "LLaMA will hardly allow commercial use" or "claims that open source models approach GPT-4 are media hype" are beside the point.

For model providers, this requires quickly adjusting strategy. Not only should they not agonize over whether to open source; even free commercial use must come fast and decisively. Zhipu, which has just made its latest 6B model free for commercial use, is a typical example. From long aspiring to build such a model, to finally finding the opportunity to engineer the 130B base version, to discovering that the 6B version's capability could even approach the older hundred-billion-parameter version, to seeing a model that runs on one's own computer attract so much attention from the open source community, Zhipu has in fact been adjusting along with these changes.

According to people familiar with the matter, Zhipu originally wanted to release its own model in February this year, along a route more like OpenAI's, but ultimately chose open source for various reasons. The model's performance after open-sourcing and the progress of the open source community changed many minds within the team. After its release on March 14, it topped GitHub's trending list on March 16, and it stayed at the top of Hugging Face's trending list for more than ten consecutive days.

According to insiders, how quickly the recognition came was a shock within the team.

According to a person close to Tang Jie, who leads ChatGLM's technology, Tang said internally after the release that the larger purpose of open source is to let Chinese scientists and industry better understand the training and operating mechanisms of large language models, rather than simply taking someone else's model and fine-tuning it. That, he said, is the essence of open source.

In the coming months, more and more models are destined to become free for commercial use.

In fact, if you think through what generative AI has actually changed, you will see this more clearly:

Today's AI, strengthened by large models, is not meant to replace humans but to replace the old mode of human-machine interaction. The old compute-based payment model can be understood as a business built on a monopoly over human-machine interaction; the large model uses natural language that everyone understands to break part of that monopoly held by a computing elite, letting everyone participate.

The logic of open source clearly fits this trend better.

"It would be very significant if a personalized language model could be fine-tuned on consumer-grade hardware in a matter of hours. In particular, it could integrate a lot of up-to-date and diverse knowledge in real-time .” It was written in Google’s internal “No Moat”.

A technical lead who has developed with multiple open source large models told me that not everyone needs to retrain a model, but most developers who want to use one are strongly motivated to try various optimization schemes, and in the end they will in fact concentrate their optimization on one or a few open source models.

Therefore, as closed source models and the open source ecosystem are destined to solve more and more of the same problems, and after closed source has demonstrated in spectacular fashion the feasibility and ceiling of the route, open source is what will truly make it accessible. As the technical principles of large models become less and less secret, the attractiveness of open source will only grow stronger. Whose open source model the community is built around becomes the key question, and offering free commercial use is the key to competing for that central role.

At present, a number of domestic foundation models already carry free commercial licenses. Besides Zhipu, the Baichuan-13B model trained by Baichuan Intelligence on 1.4 trillion tokens is also free for commercial use. Many developers compare Zhipu and Baichuan, and with both free for commercial use, the comparison becomes more direct, accurate, and meaningful.

These domestic models are still far from the world's top level, and if you follow these teams closely you will know that they are well aware of it. Commercially usable open source can push domestic models past the stage of benchmark scoring and into the stage of "walking the mule or horse out to see": how inference performs in real, concrete scenarios, how catastrophic forgetting is handled, and whether the data flywheel in real environments can accelerate once started will all become things everyone can actually observe.

In the end, the ecosystem genuinely attracted by all this is the real moat.

Origin blog.csdn.net/elinkenshujuxian/article/details/131779317