Large models are "slimming down" to fit on phones: is the next iPhone moment coming?

A wave of "on-device large models" is coming. Huawei, Qualcomm, and other hardware giants are exploring how to embed large AI models directly in devices, potentially turning the smartphone into a new kind of species.

Unlike AI applications such as ChatGPT and Midjourney, which rely on cloud servers to deliver their services, on-device large models aim to provide intelligence locally. The advantages: privacy is better protected, the phone can learn from its user and become a genuinely personal assistant, and there is no need to worry about cloud server outages.

However, with today's hardware, phone performance falls far short of what running a large model requires. The industry's mainstream approach is to "slim down" the model through pruning, quantization, and distillation, cutting the resources and energy it needs while sacrificing as little accuracy as possible.

Qualcomm has already begun developing chips for on-device large models. Smartphones that run AI models locally are getting close.

Phone makers bring large models to the device

Large AI models are rushing from the cloud to smart terminals.

On August 4, at the 2023 Huawei Developer Conference, Huawei released HarmonyOS 4. Its most significant change over previous generations of the operating system is that large-model AI capabilities are built into the bottom layer of the system. Huawei is opening the curtain on AI models moving to the device side.

Today, AI applications such as ChatGPT and Midjourney deliver essentially all of their services through cloud servers. Take ChatGPT: the large model and the compute behind it sit on remote servers. Users interact with those servers in real time, sending text that the server processes and answers. The benefit is efficient, stable operation, since the servers carry powerful computing resources and can be scaled up at any time to absorb high load.
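For developers, that cloud pattern usually boils down to a remote API call. A minimal sketch, assuming the OpenAI Python client; the model name and prompt here are only placeholders:

```python
from openai import OpenAI

# The client sends the user's text to a remote server; all model
# weights and computation live in the cloud, not on this machine.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative model name
    messages=[{"role": "user", "content": "Summarize this meeting note for me."}],
)
print(response.choices[0].message.content)
```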

Now a different serving model has emerged. Huawei is trying to bring large models onto the terminal itself, which means all of the work above can be done locally: the phone's operating system has its own AI capabilities and no longer needs to call cloud AI services to get smarter.
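A minimal desktop analogue of that local approach, assuming the Hugging Face transformers library and a small open checkpoint (a real phone would use an NPU-optimized runtime and a compressed model instead):

```python
from transformers import pipeline

# Once the weights are downloaded, inference runs entirely on the
# local machine; no user text is sent to a remote server.
generator = pipeline("text-generation", model="gpt2")  # illustrative small model

out = generator("Draft a short reply to a meeting invitation:", max_new_tokens=40)
print(out[0]["generated_text"])
```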

Yu Chengdong, Huawei executive director and CEO of its Device BG, said HarmonyOS 4 is powered by Huawei's Pangu model and aims to bring users a new AI experience built on smarter device interaction, higher productivity, and personalized services.


HarmonyOS 4 introduces large AI models

HarmonyOS 4's AI capabilities currently show up mainly in Huawei's smart assistant, Xiaoyi. With the large model connected, Xiaoyi goes beyond voice interaction to accept text, images, and documents as input, and its natural language understanding improves. Xiaoyi can also connect to a range of services and scenarios on command, such as extracting text from images, drafting different kinds of business emails, or generating pictures.

The more important change is that Xiaoyi can remember and learn. The more it is used, the better it understands its owner, so it can intelligently propose travel itineraries and activity plans and make personalized recommendations based on the user's habits. Huawei said these new Xiaoyi capabilities will open for public beta in late August.

By building the AI model into the bottom layer of the phone's operating system, Huawei hopes to raise the phone's overall intelligence. None of Xiaoyi's functions above is especially "advanced", but achieving them today would typically require juggling ChatGPT, Midjourney, and several other applications at once. When the phone itself has AI capabilities, it acts like a more versatile assistant offering one comprehensive service.

Even before HarmonyOS 4, Huawei had experimented with putting large AI models on phones. The P60, released in March this year, ships with a smart image search feature built on multimodal large-model technology: by miniaturizing the model for the handset, it runs natural-language model inference on the phone itself.

Huawei is not the first company to bring AI models onto devices. At the 2023 World Artificial Intelligence Conference, Qualcomm demonstrated a large model running on-device: the generative AI model Stable Diffusion, on a phone powered by the second-generation Snapdragon 8, executed 20 inference steps in under 15 seconds and produced a 512x512 image whose quality was not noticeably different from cloud processing.
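For reference, the same workload can be sketched on a desktop or cloud GPU with the open-source diffusers library. The checkpoint name and hardware below are illustrative; Qualcomm's demo instead ran a quantized version of the model on the Snapdragon NPU:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load Stable Diffusion and run it with the same settings as the demo:
# 20 denoising steps and a 512x512 output image.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # illustrative checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a photo of a cat wearing sunglasses",
    num_inference_steps=20,  # 20 inference steps, as in the demo
    height=512,
    width=512,
).images[0]
image.save("demo.png")
```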

At MWC Shanghai 2023, Honor CEO Zhao Ming likewise said Honor will push on-device large models onto smartphones to achieve multimodal natural interaction, precise intent recognition, and closed-loop handling of complex tasks.

Apple is also drawing attention. A month ago it was reported to be quietly developing "Apple GPT", an artificial intelligence tool built on Apple's in-house Ajax framework. Details have not been disclosed, but the industry widely speculates that Apple will add a large model at the system level to make the voice assistant Siri smarter and finally shed its "artificial stupidity" label.

Hype or New Revolution?

It is hardly surprising that phone makers care about large models, but why take the on-device route? After all, the interaction and generation capabilities of Huawei's Xiaoyi could also be delivered from cloud servers, which looks cheaper and technically easier.

Is putting large AI models into smart terminals hype, or a real necessity? On this question, Yu Chengdong and Zhao Ming both pointed to the same two keywords: privacy and personalization.

Yu Chengdong emphasized that for Huawei, the first principle of all AI experience innovation and scenario design is security and privacy protection, in order to build more responsible AI, and he promised that AI-generated content will be labeled as such.

Compared with processing data in the cloud, the most obvious advantage of the device side is privacy and security. ChatGPT has repeatedly been caught up in data-leak controversies: in March this year, Samsung banned internal use of ChatGPT after semiconductor employees were suspected of leaking company secrets through it, and ChatGPT's developer has also faced claims as high as 3 billion U.S. dollars over the alleged use and leakage of personal data.

When data is processed on the device, personal data never has to be uploaded to a cloud server, which greatly reduces the risk of privacy leaks. That is also the precondition for a phone's AI assistant to become a true life steward: only when privacy is guaranteed will users feel comfortable handing their data over to an AI to learn from.

As Zhao Ming sees it, the mission of the on-device AI model is to understand users better: "Knowing what time I go to bed and what I like to eat, and meeting those needs in the moment, amounts to real insight into what I want." To do that, the AI must be trained on the user's personal data and habits. The smartphone could eventually become an all-round assistant, or a personal robot secretary, helping users across dining, booking, consulting, entertainment, office work, and other scenarios.

By contrast, ChatGPT and other mainstream AI applications are standardized products; without modification they make poor personal assistants. They do not understand the individual user, they simply respond to whatever instructions are typed in. The phone, meanwhile, is already a private personal smart device, so if an AI model that understands human language can run on it, its level of intelligence will rise considerably.

Cloud-dependent applications are also less reliable: network or server problems can slow responses to a crawl or bring the service down entirely, as has happened to ChatGPT many times. A localized large model greatly reduces reliance on the cloud and avoids this kind of "cloud lag".

Given all this, the "on-device revolution" in large models shows real potential, and some hope it will push phones, stuck in a development bottleneck for years, through another exciting leap of species evolution, much like the arrival of large-screen smartphones and the launch of the iPhone.

But there is an obvious obstacle to large models flexing their muscles on a phone: can the phone's chip handle it? Large models often contain tens or hundreds of billions of parameters and demand astronomical amounts of training and compute, far beyond what today's phone chips can deliver.

The industry's current mainstream answer is "model miniaturization".

Put simply, once the network architecture is fixed, the model is "slimmed down" while giving up as little accuracy as possible, so that it needs fewer resources and less energy. This usually involves three techniques: removing parameters that barely affect accuracy, known as "pruning"; running inference with lower-precision data types, known in the jargon as "quantization"; and extracting a similar but simpler model from a complex one, vividly called "distillation". The end goal is a smaller model.
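As a rough sketch only, using standard PyTorch utilities on a toy network (not Huawei's or Qualcomm's actual toolchain), the three techniques look roughly like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import prune

# A toy network standing in for a real large model.
teacher = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))

# 1) Pruning: zero out the 50% of weights with the smallest magnitude,
#    then make the pruning permanent.
prune.l1_unstructured(teacher[0], name="weight", amount=0.5)
prune.remove(teacher[0], "weight")

# 2) Quantization: run the Linear layers with int8 weights at inference time.
quantized = torch.quantization.quantize_dynamic(
    teacher, {nn.Linear}, dtype=torch.qint8
)

# 3) Distillation: train a smaller student to match the teacher's soft outputs.
student = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 10))

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL divergence between temperature-softened output distributions.
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

x = torch.randn(8, 256)
with torch.no_grad():
    t_logits = teacher(x)
loss = distillation_loss(student(x), t_logits)
loss.backward()
```

In practice the student is trained over a full dataset, and pruning and quantization are tuned and re-evaluated so that accuracy loss stays within an acceptable margin.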

Meanwhile, chipmakers such as Qualcomm are developing dedicated silicon for on-device large models. Qualcomm's Snapdragon 8 Gen 2 mobile platform integrated a dedicated Hexagon AI processor with its own independent power delivery, supporting micro tile inferencing, INT4 precision, and Transformer network acceleration, delivering higher performance while cutting energy consumption and memory use.
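INT4 matters because each weight shrinks from 16 or 32 bits down to 4. A toy per-tensor symmetric quantization scheme, purely illustrative and not Qualcomm's implementation, shows the basic idea:

```python
import numpy as np

def quantize_int4_symmetric(w: np.ndarray):
    """Map float weights to 4-bit signed integers in [-8, 7] with one shared scale."""
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 8).astype(np.float32)
q, scale = quantize_int4_symmetric(w)
w_hat = dequantize(q, scale)
# 4 bits per weight instead of 32, at the cost of a small rounding error.
print("max abs error:", np.abs(w - w_hat).max())
```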

On-device large models are setting off a new generation of smart-terminal revolution. IDC predicts that by 2026 nearly 50% of device processors in the Chinese market will ship with AI engine technology. Another great change that AI brings to how we live with technology may be on its way.

Origin blog.csdn.net/MBNews/article/details/132181783