Can Wenxin Yiyan, swamped by trial applications, stand up to GPT-4?

On February 7, Baidu announced its ChatGPT-style artificial intelligence product "Wenxin Yiyan", saying it would complete internal testing in March and then open it to the public. Today, more than a month later, at Baidu headquarters in Beijing and Zhangjiang Artificial Intelligence Island in Shanghai, Wenxin Yiyan arrived as scheduled.

Robin Li, chairman and CEO of Baidu, said that Wenxin Yiyan has outstanding abilities in business copywriting and mathematical calculation. At the same time, he admitted that Wenxin Yiyan "is not completely ready".

Carrying the expectations of the domestic market, why did Wenxin Yiyan go online in such a hurry? Industry insiders believe that domestic GPT development should first seize application scenarios while the opportunity exists, and then use them to drive the iteration of the underlying algorithms. Only by "learning while catching up" is there a chance of drawing level with the international AI giants.

A stronger understanding of Chinese

At the press conference, Robin Li demonstrated Wenxin Yiyan in five usage scenarios: literary creation, commercial copywriting, mathematical calculation, Chinese understanding and multimodal generation.

In the literary creation scenario, Wenxin Yiyan answered conversational questions by summarizing the core content of the well-known science fiction novel "The Three-Body Problem" and proposed five angles from which the story could be continued, showing a combined ability in dialogue-based question answering, summary and analysis, and content generation. In addition, Wenxin Yiyan accurately answered factual questions such as who wrote "The Three-Body Problem" and who stars in the TV adaptation. According to reports, AIGC output is prone to errors on factual questions, while Wenxin Yiyan continues Baidu's knowledge-enhanced large-model approach, which greatly improves accuracy on such questions.

In the commercial copywriting scenario, Wenxin Yiyan also completed the creative tasks of naming a company and writing slogans and press releases.

"To write a good manuscript, AI needs not only to understand our intentions accurately, but also to express itself clearly," Robin Li explained. Humans speak of "reading thousands of books", while an AI "reads hundreds of billions of books". The training data of the Wenxin Yiyan large model includes trillions of web pages, billions of search data and pictures, tens of billions of voice calls per day, and knowledge graphs of 550 billion facts. "Studies have shown that when the data scale is large enough and the parameters reach the hundreds of billions, the 'intelligent emergence' of large models may occur. Even in fields that have not been specially trained, knowledge understanding and logical reasoning capabilities can emerge."

Wenxin Yiyan also has some reasoning ability and can learn relatively complex tasks such as mathematical deduction and logical reasoning. Faced with classic puzzles such as "chickens and rabbits in the same cage", which train human logical thinking, Wenxin Yiyan can understand the question, find the right approach, and then work through the correct steps one at a time to reach the correct answer, much as a student would.
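For readers unfamiliar with the puzzle, the step-by-step reasoning a student would follow can be sketched in a few lines of Python. This is only an illustration of the classic puzzle, not of how Wenxin Yiyan solves it. The article does not give the numbers used in the demonstration, so the figures below (35 heads, 94 feet) are an assumption taken from the traditional textbook version, and the function name is hypothetical.

```python
def solve_chicken_rabbit(heads, feet):
    """Return (chickens, rabbits) for the classic cage puzzle, or None if impossible."""
    # Step 1: pretend every animal is a chicken, giving 2 feet per head.
    # Step 2: each rabbit contributes 2 feet more than that assumption.
    extra_feet = feet - 2 * heads
    if extra_feet < 0 or extra_feet % 2 != 0:
        return None  # no whole-number solution
    rabbits = extra_feet // 2
    chickens = heads - rabbits
    if chickens < 0:
        return None  # more rabbits than heads is impossible
    return chickens, rabbits


# Assumed example figures (not from the article): 35 heads and 94 feet.
print(solve_chicken_rabbit(35, 94))  # (23, 12)
```

With the assumed figures, treating all 35 animals as chickens accounts for 70 feet; the remaining 24 feet imply 12 rabbits and therefore 23 chickens.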

It is worth mentioning that Wenxin Yiyan is a large language model rooted in the Chinese market, so it has advanced natural language processing capabilities in Chinese and performs better on Chinese language and culture. In the on-site demonstration, Wenxin Yiyan correctly explained the meaning of the idiom "Luoyang Zhigui" and the economic theory behind it, and composed an acrostic poem from the four characters of "Luoyang Zhigui".

In addition to everyday conversation, Robin Li also demonstrated Wenxin Yiyan's ability to generate text, pictures, audio and video, as well as speech in dialects such as Sichuanese. Unfortunately, the video generation capability is not available to all users at this stage because of its high cost.

"Multimodality is a clear development trend of generative AI," Robin Li said. "In the future, as Baidu's ability to unify multiple modalities in large models increases, Wenxin Yiyan's multimodal generation capabilities will continue to improve."

Common functions work normally, but many bugs remain

"Ten months of pregnancy, one day of delivery." That is how Robin Li described the birth of Wenxin Yiyan at the press conference.

Reportedly, Wenxin Yiyan underwent stress tests for several consecutive days ahead of its launch. The largest single intelligent computing center in Asia, Shanxi Yangquan Baidu Intelligent Computing Center, has increased its computing power to 40 billion floating-point calculations per second and, together with several other intelligent computing centers across the country, provides computing power for Wenxin Yiyan.

In addition to daily conversations, Wenxin Yiyan also provides three functional templates for writing reports, drawing with AI, and checking knowledge points.

For a report titled "Convergence and Transformation of Traditional Media", Wenxin Yiyan not only explained the meaning of "media convergence", but also gave suggestions on digital transformation, content innovation, user analysis and other aspects. Brother Xiaojing found that this content is not lifted directly from search results, but is sorted and summarized by the large model. Wenxin Yiyan also works normally when checking knowledge points and generating AI paintings, and usually responds within one minute.

However, Brother Xiaojing discovered during the trial that Wenxin Yiyan still has many flaws in conversation, and its replies are often incoherent. It also fails to carry context from one turn to the next, so the exchange feels more like a series of isolated questions and answers.

Some internal test users also said that when they put the same question to Wenxin Yiyan and GPT-4, there was a noticeable gap between the two answers. For example, when asked to continue "The Three-Body Problem", Wenxin Yiyan's suggestions were more abstract, such as the meaning of life and the relationship between human beings and the universe, while GPT-4's were more concrete and conflict-driven, such as the rise of humanoid robots and challenges to the law of the dark forest.

Trial applications overwhelm the test page

On the 15th of this month, OpenAI launched GPT-4, which has been received even better than ChatGPT. With Wenxin Yiyan launching today, comparison with GPT-4 is inevitable.

Baidu launched the Wenxin language model as early as 2019, and Wenxin Yiyan, which is built on it, is the product of Baidu's years of accumulation. Even so, Robin Li himself admitted: "We cannot say that we are completely ready. Benchmarking Wenxin Yiyan against ChatGPT, or even against GPT-4, sets a very high bar, and in my own testing I still found many imperfections."

Perhaps because he was worried about the stability of Wenxin Yiyan, Robin Li did not give a live demonstration at the press conference, but used a pre-recorded video to show its abilities. He also said that the model still has shortcomings. "No matter which company it is, it is impossible to build such a large language model in a few months. Deep learning and natural language processing require years of persistence and accumulation, and there is no way to speed that up."

The capital market's response to Wenxin Yiyan was not ideal either. This afternoon, Baidu's Hong Kong-listed shares fell steadily, dropping nearly 10% at one point before the loss narrowed slightly. At the close, Baidu's shares stood at HK$125.1, down 6.36%, for a total market value of HK$345.8 billion.

Why did Wenxin Yiyan go online in a hurry? Two sets of data disclosed by Baidu today help explain it. In just one month, more than 650 partners announced that they would join the Wenxin Yiyan ecosystem. More than 30,000 enterprise users have applied for testing, the test application page has been overwhelmed many times, and traffic to the Baidu Smart Cloud official website has soared hundreds of times over.

"Everyone hopes to use the latest and most advanced large language model sooner." No wonder Robin Li said bluntly that Wenxin Yiyan is not perfect, but that it has to be launched because the market demands it.

Zhou Hongyi, the founder of 360, has also publicly endorsed the practice of "trading the market for time". "At present, the development of GPT technology in China must first occupy the application scenarios and develop the core algorithm technology at the same time." He said that GPT application scenarios require complex engineering and commercialization capabilities, as well as rich experience in data cleaning and manual labeling. If companies wait for domestic algorithms to catch up with GPT-4 before launching, the market opportunity will be missed.

Is the GPT concept a real opportunity or a bubble?

In addition to Wenxin Yiyan, many domestic institutions and companies have launched GPT-like large models. In February of this year, Xiaoice's ChatGPT-style application "X-Chain of Thought & Action" started a small-scale internal test. At the beginning of March, Qiu Xipeng's team from the School of Computer Science and Technology of Fudan University released the ChatGPT-like model MOSS, with the goal of creating a large-scale Chinese language model with Chinese characteristics. 360 has said that it will follow the example of New Bing, which combines the capabilities of Microsoft and OpenAI, to launch a new-generation intelligent search engine as well as artificial intelligence personal assistant products based on search scenarios. Alibaba DAMO Academy's ChatGPT-style product is already in internal testing, and JD Cloud will launch an industrial version of ChatGPT, ChatJD. More large models are also gradually moving into internal testing.

According to the International Data Corporation (IDC), global artificial intelligence market revenue reached US$85 billion in 2021 and will exceed the US$200 billion mark in 2025, a compound annual growth rate of 24.5%. According to a report by China Securities Construction Investment, China's artificial intelligence industry is developing rapidly. It ranks behind only the United States and the European Union in AI industrialization, accounting for about 9.6% of the global market, and China's artificial intelligence market is estimated at 272.9 billion yuan for 2022.
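The IDC figures quoted above can be checked against each other with a short compound-growth calculation. The sketch below only uses the numbers as they appear in this article (US$85 billion in 2021, a 24.5% compound annual growth rate, the US$200 billion mark in 2025); the function and variable names are illustrative, and treating 2021 to 2025 as four compounding years is an assumption.

```python
def project(start_value, annual_rate, years):
    """Compound start_value at annual_rate for the given number of years."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate needed to go from start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


revenue_2021 = 85.0  # US$ billion, as quoted in the article
years = 4            # assumption: 2021 -> 2025 is four compounding periods

projected_2025 = project(revenue_2021, 0.245, years)
threshold_rate = implied_cagr(revenue_2021, 200.0, years)

print(f"Projected 2025 revenue: US${projected_2025:.1f} billion")
print(f"Growth rate needed to reach US$200 billion: {threshold_rate:.1%}")
```

Under these assumptions, 24.5% annual growth takes US$85 billion to roughly US$204 billion, and a rate of just under 24% would be enough to reach US$200 billion, so the quoted figures are consistent with one another.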

Is the GPT concept a real opportunity or a bubble? Zhu Keli, the founding director of the National Research Institute of New Economic Research, told Brother Mingjing: "Technology companies such as Baidu have worked on large models for many years, and their AI technology is relatively mature, so they can move first to capture the market." He also advised investors not to follow the trend blindly: "Some companies lack technological advantages and only use ChatGPT as a gimmick to win the favor of the stock market. Once the bubble bursts, they will suffer heavy losses."

"The explosive demand growth in the AI market will unleash unprecedented, exponential commercial value." Robin Li predicted that large language models will bring three major industry opportunities: cloud computing, model fine-tuning, and application service providers. In scenarios such as image generation, audio generation, video generation, digital humans and 3D, many star start-ups have emerged, and "they may be the new giants of the future."

A Guosen Securities research report likewise expects AIGC application scenarios to take off across the board. As a productivity tool, AIGC will continue to drive the development of chatbots, digital humans, the metaverse and other fields. Of the three major elements driving artificial intelligence, the algorithms are still iterating, the accumulated data is not yet sufficient, and computing power has only just broken through. Breakthroughs in these "three brothers" will continue to create new business formats and applications.

Origin blog.csdn.net/weixin_42814075/article/details/129612430