Making generative AI "ubiquitous": Intel's software-plus-hardware approach

Since the beginning of this year, generative AI tools such as ChatGPT have taken the world by storm, and the computation behind them has put performance, cost, and energy efficiency in the spotlight. How can generative AI be made universal, so that every industry can use it?

Recently, at Intel's media briefing on generative AI technology and industry development, Dai Jinquan, Intel Fellow and Global CTO of Big Data Technologies, shared Intel's insights and thinking in detail.

01 Making generative AI "ubiquitous"

Dai Jinquan said that Intel has long been committed to the vision of "AI everywhere," and therefore also hopes to make generative AI ubiquitous.

Specifically, Intel hopes to support the ubiquity of generative AI on two fronts: improving compute through hardware acceleration, and optimizing software to make full use of that compute.

First, on dedicated AI hardware acceleration: whether through GPUs, dedicated AI accelerators such as Gaudi 2, or the newly released 4th Gen Intel Xeon Scalable processors, which add a built-in accelerator for matrix operations (Intel® AMX), Intel's hardware is designed to meet generative AI's requirements for performance, price, and sustainability.
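To see what AMX accelerates, consider the core operation it targets: multiplying small "tiles" of low-precision values (such as INT8) while accumulating results in 32-bit integers. The pure-Python sketch below is only a conceptual illustration of that tile-multiply-accumulate pattern, not Intel's implementation; the 2x2 tile size is illustrative (real AMX tiles are far larger).

```python
def tile_matmul_int8(a, b):
    """Multiply two small INT8 tiles, accumulating in INT32.

    Conceptual model of an AMX-style tile dot-product: each int8 * int8
    product is summed into a 32-bit accumulator so precision is not lost.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]  # INT32 accumulator tile
    for i in range(rows):
        for k in range(inner):
            for j in range(cols):
                c[i][j] += a[i][k] * b[k][j]  # int8 * int8 -> int32 accumulate
    return c

a = [[1, -2], [3, 4]]   # illustrative INT8 tile
b = [[5, 6], [7, -8]]
print(tile_matmul_int8(a, b))  # [[-9, 22], [43, -14]]
```

In hardware, one tile instruction performs this whole loop nest at once, which is why matrix-heavy generative AI workloads benefit so much.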

On the software side, Intel uses software to unlock the compute of its hardware: through collaboration with frameworks such as PyTorch and TensorFlow, work on OpenAI's AI compiler Triton, and extensive optimization of DeepSpeed, Microsoft's software stack for large-scale distributed training.
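One concrete way software "unlocks" hardware compute is operator fusion, a technique applied by compilers such as Triton and framework optimizations generally: several elementwise operations are combined into a single pass over memory, eliminating intermediate buffers. The pure-Python sketch below illustrates the idea only; it is not any library's actual code.

```python
def scale_add_relu_unfused(x, w, b):
    """Three separate passes over the data, as an unoptimized graph runs it."""
    t1 = [xi * w for xi in x]            # pass 1: scale (materializes a temp)
    t2 = [ti + b for ti in t1]           # pass 2: bias (another temp)
    return [max(ti, 0.0) for ti in t2]   # pass 3: ReLU

def scale_add_relu_fused(x, w, b):
    """One fused pass: same math, no intermediate buffers."""
    return [max(xi * w + b, 0.0) for xi in x]

x = [-1.0, 0.5, 2.0]
assert scale_add_relu_fused(x, 2.0, 1.0) == scale_add_relu_unfused(x, 2.0, 1.0)
print(scale_add_relu_fused(x, 2.0, 1.0))  # [0.0, 2.0, 5.0]
```

On real hardware the fused version touches memory once instead of three times, which is often the difference between being memory-bound and compute-bound.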

Intel is also committed to open source. Through oneAPI, its open software toolkit for the XPU architecture, Intel supports generative AI across different scenarios, on customers' laptops, data-center processors, and accelerators alike.

Intel pays close attention to open large language models and has done a great deal of work with the open-source community, for example with Hugging Face. In collaboration with Hugging Face, Intel optimized and benchmarked BLOOMZ 176B, possibly the largest open-source language model to date, on the Gaudi 2 accelerator. The results show that running inference on 8 Gaudi 2 cards was more than 20% faster than on 8 Nvidia A100s. Intel and Hugging Face have also collaborated on Stable Diffusion: on the latest 4th Gen Intel Xeon Scalable processors, using AMX (Advanced Matrix Extensions) for matrix acceleration, a user can fine-tune their own Stable Diffusion model in under 5 minutes and run inference in four to five seconds.

Through these initiatives, Intel supports generative AI applications with a full range of "intelligent computing" capabilities. "Intel's goal is to universalize generative AI and make AI ubiquitous. Only when every Intel chip can provide the intelligent computing capability to support generative AI can we truly achieve ubiquitous AI," Dai Jinquan said.

As is well known, Intel offers powerful compute across its product line, from laptop CPUs, integrated graphics, and discrete graphics to Xeon servers in the data center.

02 Ordinary laptops can run large models, and Intel has done it

At the event, Dai Jinquan showed a video of a large language model running on a personal laptop. A 7B-parameter (7 billion) entry-level large language model currently runs quite fast on a laptop; a 13B-parameter (13 billion) model can basically keep up with human reading speed during interaction; and a 65B-parameter (65 billion) model can also run on Xeon processors.
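A key enabler for fitting a 7B model on an ordinary laptop is low-bit weight quantization, a technique Intel's open-source tooling (for example, the BigDL-LLM library from Dai Jinquan's team) applies: at 4 bits per weight, 7 billion parameters need roughly 3.5 GB of memory instead of 28 GB at FP32. The sketch below shows a minimal group-wise symmetric quantization scheme in pure Python, purely illustrative and not any library's actual format.

```python
def quantize_int4(weights):
    """Symmetric INT4 quantization of one weight group: codes in [-7, 7]."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [round(w / scale) for w in weights]  # 4-bit integer codes
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP weights from codes and the group scale."""
    return [qi * scale for qi in q]

group = [0.12, -0.40, 0.03, 0.25]       # illustrative weight group
q, s = quantize_int4(group)
restored = dequantize(q, s)
# Each restored weight lies within one quantization step of the original.
assert all(abs(a - b) <= s for a, b in zip(group, restored))

# Memory math: 7e9 params at 4 bits vs 32 bits (in GB).
print(7e9 * 4 / 8 / 1e9)   # 3.5
print(7e9 * 32 / 8 / 1e9)  # 28.0
```

The accuracy cost of 4-bit weights is usually small for inference, which is why laptop-class memory and bandwidth become sufficient for interactive generation.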

Beyond large language models, Stable Diffusion can run on 12th Gen Core laptops without any discrete graphics card: using only the integrated GPU, it can generate an image in 20 to 30 seconds, which makes for a very good user experience. Dai Jinquan said that with an Intel Arc discrete graphics card, an image can be generated in three to four seconds.

Dai Jinquan then gave a live demonstration, using a laptop to run Stable Diffusion and generate a cat in the style of Chinese painting, which took just over 20 seconds. "Users can do this on any ordinary Intel notebook, even a thin-and-light one, which truly embodies 'generative AI everywhere.'"

Beyond letting consumer users easily experience generative AI, Dai Jinquan pointed to the technology's huge opportunities for improving productivity. Taking Stable Diffusion as an example, it can help designers generate final renderings from simple sketches and descriptions, greatly improving their efficiency. At the 2023 Intel On industry innovation summit, Intel partnered with Zhejiang Lingdi Digital Technology Co., Ltd. (Style3D), a 3D design software provider for the fashion industry, to build design applications that combine generative AI and 3D, which can be deployed to Style3D's customers on laptops.

In addition, AI for Science and related fields show strong prospects; applying large AI models to science is of great significance. Intel has announced Aurora genAI, a generative AI model with up to one trillion parameters, aimed primarily at scientific research fields including biology, medicine, atmospheric science, chemistry, and astronomy.

03 Responsible AI

Since the emergence of generative AI, security and privacy concerns have been impossible to ignore and have drawn industry-wide attention. Dai Jinquan explained that Intel approaches "responsible AI" from three aspects:

First, Intel has a "responsible AI" process covering the data, models, and applications involved in AI, which defines how to eliminate bias and use the right data.

Second, Intel has done extensive work on data security and privacy-preserving computation, creating hardware-level security technologies such as Intel TDX and Intel SGX, and building a privacy-preserving computing platform for big data analytics and machine learning at the software layer. These protect generative AI applications such as large language models and Stable Diffusion on both the data and model sides, ensuring data security and privacy.

Third, some generative AI content is machine-generated. Here, Intel Labs has also done a great deal of work on algorithms that determine whether content was produced by applications such as deepfake generators.

In addition, running models such as large language models and Stable Diffusion on laptops not only lowers the barrier to using AI, letting ordinary consumers use generative AI, but also greatly protects the privacy of data and models: the model is deployed locally, and the algorithms, applications, and data all stay on the device without being shared with anyone.

 

Origin blog.csdn.net/FL63Zv9Zou86950w/article/details/131429716