Li Hongyi Machine Learning 2023: Large Language Models - Two Human Expectations

Humans hold two different expectations for large language models: becoming domain experts, or becoming generalist "know-it-alls".


In recent years, as artificial intelligence has continued to advance, large language models have found increasingly broad application, and people have come to hold different expectations of this technology.

From a technical standpoint, people mainly expect large language models to become experts or generalists through two techniques, Finetune and Prompt, so as to better meet their needs.

Finetune and Prompt: two technical means of turning large models into domain experts or generalists

Finetune and Prompt are two techniques commonly used with large language models:

        Finetune is essentially gradient descent: it directly changes the model's parameters;

        A Prompt is an instruction expressed in natural language that steers the model, allowing a large language model to better adapt to human needs without changing its parameters.
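The contrast between the two techniques can be sketched in a toy example. This is purely illustrative (real fine-tuning uses frameworks such as PyTorch, and real prompting goes through a model's text interface); the one-parameter "model" and the names here are hypothetical.

```python
def toy_model(x, w):
    """A 'model' with a single parameter w."""
    return w * x

# --- Finetune: gradient descent changes the parameter itself ---
w = 1.0                  # pre-trained parameter
target = 2.0             # desired output for input x = 1.0
lr = 0.1                 # learning rate
for _ in range(100):
    pred = toy_model(1.0, w)
    grad = 2 * (pred - target)   # gradient of the squared error
    w -= lr * grad               # the parameter is updated in place
print(round(w, 3))       # prints 2.0 -- the weight has moved to fit the task

# --- Prompt: the parameters stay frozen; only the input text changes ---
frozen_w = 1.0
prompt = "Translate to French: Hello"  # instruction in natural language
# The model's weights are untouched; behaviour is steered by the input alone.
```

The key point the sketch makes: finetuning leaves you with a *different* model, while prompting leaves the model unchanged and varies only its input.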

Two human expectations for large models

Expectation 1: Become an expert

One expectation people have for a large language model is that it become an expert that solves one specific task.

For example, translation requires a large language model to translate text accurately between languages, which demands specialized linguistic knowledge and skill.

Through Finetune, a large language model can be fine-tuned for a specific task, improving its performance in that domain and making it more likely to excel at that particular job.

Expectation 2: Become a generalist

On the other hand, people would rather large language models be generalists that can do everything. This is also what humans expect of ChatGPT. To achieve this goal, the pre-trained model needs to be adapted to different tasks. Besides Finetune, Adapter technology can also be used.

An Adapter is a technique for modifying the pre-trained model: instead of adjusting all of the model's parameters as Finetune does, it inserts small additional plug-in modules into the language model.

This plug-in is typically a new layer that can be inserted at various points, depending on the specific method. With Adapter technology, different Adapters can be attached alongside the same model to suit different tasks.
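The standard adapter design is a small bottleneck module with a residual connection, inserted after a frozen layer. Below is a minimal plain-Python sketch of that idea; the functions and the tiny dimensions are hypothetical stand-ins, not a real framework API.

```python
def frozen_layer(h):
    """A pre-trained layer whose weights are never touched during adaptation."""
    return [2.0 * v for v in h]  # stand-in for the frozen computation

def adapter(h, w_down, w_up):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual add."""
    # Down-project the hidden vector to a 1-dimensional bottleneck, then ReLU.
    z = max(0.0, sum(v * w for v, w in zip(h, w_down)))
    # Up-project back to the original dimension.
    delta = [z * w for w in w_up]
    # Residual connection: the adapter only nudges the frozen output.
    return [v + d for v, d in zip(h, delta)]

hidden = [1.0, -0.5, 0.25]
w_down = [0.1, 0.2, 0.3]   # trainable adapter parameters (tiny vs. the backbone)
w_up = [0.5, 0.5, 0.5]

out = adapter(frozen_layer(hidden), w_down, w_up)
```

Only `w_down` and `w_up` would be trained for a new task; `frozen_layer` stays fixed, which is exactly what lets one backbone serve many tasks.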

What are the benefits of adapting a large model to different tasks by adding plug-ins?

If we want a large model to perform very well on one specific function, we need to adjust its internal parameters in a targeted way;

So, if I want the machine to excel at translation, painting, and writing, do I have to train three different sets of parameters? Wouldn't the compute and memory cost be enormous?

This is where using n different plug-ins shows its advantage:

with Adapters, only one large model and n small Adapters need to be stored in memory,

and what gets fine-tuned is not the big model itself but the plug-ins.
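The storage argument is easy to make concrete with back-of-the-envelope arithmetic. The parameter counts below are assumptions chosen for illustration (a 1B-parameter backbone and 2M-parameter adapters), not figures from the lecture.

```python
backbone_params = 1_000_000_000  # assumed size of the shared backbone
adapter_params = 2_000_000       # assumed size of one adapter plug-in
n_tasks = 3                      # e.g. translation, painting, writing

# Full fine-tuning: one complete copy of the model per task.
full_finetune = n_tasks * backbone_params            # 3_000_000_000 parameters

# Adapters: one shared backbone plus n tiny plug-ins.
with_adapters = backbone_params + n_tasks * adapter_params  # 1_006_000_000

print(full_finetune, with_adapters)
```

Under these assumptions, three tasks cost roughly 3B parameters with full fine-tuning but only about 1.006B with adapters, and the gap widens as n grows.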


This is the appeal of the Adapter: it reduces both storage and training costs, greatly improving the model's ability to handle different tasks, and thus lets it become a generalist that meets more of people's needs.


Summary

In general, people's expectations for large language models fall into two types: becoming experts and becoming generalists. Finetune and Prompt are the techniques commonly used to meet these expectations, while Adapter technology is a more efficient approach: through plug-ins, it helps a large language model adapt to different tasks at lower cost and become a true generalist.


Origin blog.csdn.net/fantastick99/article/details/130321483