Kunlun Wanwei releases the Tiangong large language model

The Tiangong large language model

China's homegrown ChatGPT contenders have gained another member. On the afternoon of April 17, Kunlun Wanwei officially released "Tiangong", a hundred-billion-parameter language model. Tiangong is the first large language model of this scale in China positioned to benchmark against ChatGPT; it interacts with users through natural-language question and answer to meet diversified needs.

According to the official website, "As a large-scale language model, Tiangong has powerful natural language processing and intelligent interaction capabilities, can support application scenarios such as intelligent question answering, chat interaction, and text generation, and has a rich knowledge reserve covering science, technology, culture, art, history, and other fields."

Tiangong is not much different from the other large language models released in China: they all focus on Chinese application scenarios. On one hand, training data is easier to source domestically; on the other, it is easier to compete with similar products at home than with OpenAI overseas. Moreover, ChatGPT has already been banned or restricted in various countries, let alone domestic products.


Currently, only invited users can log in to the official "Tiangong" website to try it out.

Benchmarking GPT-3.5

According to Kunlun Wanwei's official statement, the Tiangong large model has a parameter scale in the hundreds of billions, and its capability is very close to that of OpenAI's ChatGPT. Since ChatGPT is based on the GPT-3.5 model, Kunlun Wanwei named this version "Tiangong 3.5".

The conclusion that Tiangong approaches GPT-3.5 is based on standardized testing. The GPT-3.5 and GPT-4 papers include corresponding large-scale evaluation datasets, currently covering nearly 20 categories of multi-dimensional capability tests for large models. Tiangong and other large models are evaluated against these datasets before release, and based on the results on these public benchmarks, Tiangong is judged to benchmark against GPT-3.5.
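The evaluation process described above can be sketched generically: score a model's answers against a public benchmark's reference answers and report per-category accuracy. This is a minimal illustration only; the dataset, categories, and exact-match scoring rule below are assumptions, not Kunlun's actual test suite.

```python
# Generic benchmark scoring sketch: compare model predictions with
# reference answers and report accuracy per category.
from collections import defaultdict

def score(examples):
    """examples: list of (category, prediction, reference) tuples."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for category, pred, ref in examples:
        totals[category] += 1
        # Exact match after normalization; real benchmarks use
        # task-specific metrics.
        if pred.strip().lower() == ref.strip().lower():
            hits[category] += 1
    return {c: hits[c] / totals[c] for c in totals}

# Toy data standing in for a real multi-category test set.
results = score([
    ("math", "4", "4"),
    ("math", "10", "12"),
    ("history", "1949", "1949"),
])
print(results)  # {'math': 0.5, 'history': 1.0}
```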

Dialogue ability

The current version supports text dialogues of more than 10,000 characters and more than 20 rounds of user interaction. Its ability to answer follow-up questions in context is already quite strong, and 20 rounds of interaction are enough for a user to keep correcting the dialogue until a satisfactory answer is obtained.
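Limits like these are typically enforced by trimming the conversation history before each model call. A minimal sketch, assuming a cap of 20 rounds and 10,000 characters (the limits come from the article; the drop-oldest trimming strategy itself is an assumption about how such a cap might be implemented):

```python
MAX_ROUNDS = 20      # user/assistant exchanges kept in context
MAX_CHARS = 10_000   # total characters kept in context

def trim_history(history):
    """history: list of (role, text) pairs, oldest first.
    Drop the oldest messages until both limits are satisfied."""
    # Keep at most MAX_ROUNDS exchanges (2 messages per round).
    history = history[-MAX_ROUNDS * 2:]
    # Drop oldest messages while the total text is too long.
    while sum(len(text) for _, text in history) > MAX_CHARS:
        history = history[1:]
    return history

chat = [("user", "hi"), ("assistant", "hello")] * 25  # 25 rounds
trimmed = trim_history(chat)
print(len(trimmed))  # 40 messages = 20 rounds
```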

Since I could not test it hands-on, Tiangong's real ability to track context is unclear, but its single-question, single-answer performance is quite acceptable.

The following is a salary table designed by Tiangong. It lists the common items on a payslip and calculates net pay, but every net-pay figure is incorrect, which is regrettable.

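The arithmetic the model got wrong is simple: net pay is gross earnings minus deductions. A minimal sketch of the calculation; the payslip items and amounts below are hypothetical examples, not figures from Tiangong's output:

```python
# Hypothetical payslip items (examples, not from the article).
earnings = {
    "base_salary": 10000,
    "performance_bonus": 2000,
    "meal_allowance": 500,
}
deductions = {
    "social_insurance": 1050,
    "housing_fund": 600,
    "income_tax": 430,
}

gross = sum(earnings.values())            # 12500
net = gross - sum(deductions.values())    # 12500 - 2080 = 10420
print(net)  # 10420
```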

"Drink more hot water" used to be the standard phrase for showing care for a girlfriend, but endless overuse has turned it into a meme. Many programmer friends will relate: not saying it is fine, but saying it may well anger your girlfriend. At times like this, ask Tiangong: if your girlfriend is not feeling well, what are the consequences of telling her to "drink more hot water"? Tiangong's answer is far better than what we straight men would come up with, a model warm-hearted man.


The dialogue above shows that Tiangong can generate text, but, like the 360 large model, it still lacks accuracy. In addition, although Kunlun's Tiangong AIGC algorithm and model series covers images, music, text, and programming, the large language model released this time has no image or music generation capability, nor does it demonstrate coding ability.

Multimodal applications

Before releasing the large language model, Kunlun Wanwei had already released a full series of AIGC algorithms and models in December 2022, covering multi-modal AI content generation for images, music, text, and programming: Tiangong Qiaohui SkyPaint, Tiangong Yuefu SkyMusic, Tiangong Miaobi SkyText, and Tiangong Zhicode SkyCode.


Tiangong Yuefu and Tiangong Zhicode are based on the self-developed Tiangong series models, while Tiangong Qiaohui is built downstream of the Stable Diffusion model. After the Tiangong 3.5 large model officially launched, Fang Han, CEO of Kunlun Wanwei, said it could be used to replace the underlying models of Tiangong's multi-modal applications.

It is foreseeable that Kunlun Wanwei will use the Tiangong large model as a base to upgrade and integrate the Tiangong application series and strengthen its entire line of generative AI. GPT-4 has image capabilities, and GPT-5 is expected to have video generation capabilities; if Tiangong wants to achieve comparable goals with Tiangong 4 and beyond, integrating image, audio, video, and programming capabilities is imperative.

Peer comparison

With OpenAI's ChatGPT lighting the torch of artificial intelligence, domestic ChatGPT-style products have sprung up like mushrooms after rain. Compared with its peers, Tiangong has no obvious advantage.

The first is computing power. GPT's capability comes from training, and three core elements determine how capable it can be: algorithms, data volume, and computing power. At present, Kunlun Wanwei has a training cluster of 200 GPUs, Baidu's Wenxin Yiyan has roughly 1,000 GPUs of training resources, while training ChatGPT required more than 10,000 NVIDIA A100 GPUs; adding other applications, the corresponding chip demand exceeds 30,000 GPUs. Clearly there is still a large gap between Tiangong and the leading players in core computing power.

The second is application. Building on Kunlun Wanwei's existing multi-modal AI content generation for images, music, text, and programming, plus the newly released Tiangong large language model, the company is clearly trying to build a large-model system like SenseTime's. At Kunlun's size, it obviously cannot support massive numbers of C-side users the way Baidu and Microsoft can, so like other domestic ChatGPT products it targets the B-side. In the B-end market, Alibaba and SenseTime, whose products are already released, are ahead of Kunlun in product maturity. In particular, Alibaba can acquire users by connecting all of its related apps, thereby collecting large amounts of usage data for iterative upgrades.

Postscript

Although the Tiangong large model still has many shortcomings and a long way to go, its successful release means Kunlun has earned its ticket to the AI feast, and the domestic large-model field has gained another player. As an ordinary user, I hope the competition is as fierce as possible: a dynamic, competitive market brings more opportunities and benefits to ordinary people.


Origin blog.csdn.net/NoBack7/article/details/130233404