Prompt words plus mysterious spells make large models smarter

Research by the Google team found that combining the "magic spell" "Take a deep breath" with the familiar "Let's think step by step" in the prompt improved a large model's performance on the benchmark dataset by 12%. Moreover, this most effective prompt was found by the AI itself.


Paper: Large Language Models as Optimizers

Paper source: https://arxiv.org/abs/2309.03409

The paper comes from the merged Google DeepMind department, but the authors are mainly from the original Google Brain team, including Quoc Le and Denny Zhou. The co-first authors are Chengrun Yang, a Fudan alumnus with a Ph.D. from Cornell University, and Xinyun Chen, a Shanghai Jiao Tong University alumna with a Ph.D. from UC Berkeley.

As is well known, the best prompt words differ from model to model. This paper found that prompts designed by large models themselves improved performance by up to 50% on the Big-Bench Hard dataset. Beyond prompt design, the paper also tested large models on classic optimization tasks such as linear regression and the traveling salesman problem.


Different models have different optimal prompt words.

Optimization problems are ubiquitous, and derivative- and gradient-based algorithms are powerful tools for them. In real applications, however, one often encounters situations where gradients are not available. To address this, the team developed a new method, OPRO: Optimization by PROmpting.

Instead of formally defining the optimization problem and solving it with a program, OPRO describes the problem in natural language and asks a large model to generate new solutions.
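To make this concrete, here is a minimal sketch of what such a natural-language problem description might look like for the paper's linear-regression task: past (w, b) proposals and their losses are listed in plain text, and the model is asked for a better pair. The exact wording is illustrative, not the paper's verbatim meta-prompt template.

```python
# Sketch of an OPRO-style meta-prompt for a toy (w, b) optimization problem.
# The phrasing is a hypothetical stand-in for the paper's actual template.

def build_meta_prompt(history):
    """history: list of ((w, b), loss) tuples; listed worst-first, best last."""
    lines = [
        "You are helping to minimize a function.",
        "Below are previous (w, b) pairs and their losses, from worst to best:",
    ]
    # Sort by descending loss so the best solution appears last (recency bias
    # tends to make models anchor on the most recent, i.e. best, example).
    for (w, b), loss in sorted(history, key=lambda h: -h[1]):
        lines.append(f"w={w}, b={b}, loss={loss:.2f}")
    lines.append(
        "Propose a new (w, b) pair, different from all pairs above, "
        "with a lower loss. Answer in the form: w=..., b=..."
    )
    return "\n".join(lines)

print(build_meta_prompt([((1, 1), 25.0), ((2, 3), 4.0)]))
```

A real run would send this string to the optimizer model and parse `w=..., b=...` out of the reply.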

Summarized in one diagram, the workflow is a recursive call to the large model.


At each optimization step, previously generated solutions and their scores are used as input; the large model generates new solutions, which are then scored and added to the prompt for the next optimization step.
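The loop described above can be sketched in a few lines. Since calling a real model is out of scope here, a random-perturbation function stands in for the LLM proposer, and a toy objective f(x) = -(x - 3)^2 stands in for the task score; everything except the loop structure itself is a hypothetical placeholder.

```python
import random

# Minimal sketch of the OPRO loop on a toy objective. `propose` is a
# stand-in for the LLM: a real implementation would build a meta-prompt
# from `history` and parse the model's reply.

def score(x):
    return -(x - 3.0) ** 2  # higher is better; maximum at x = 3

def propose(history, rng):
    # Placeholder "optimizer model": perturb the best solution seen so far.
    best_x, _ = max(history, key=lambda h: h[1])
    return best_x + rng.uniform(-1.0, 1.0)

def opro_loop(steps=200, seed=0):
    rng = random.Random(seed)
    history = [(0.0, score(0.0))]      # initial solution and its score
    for _ in range(steps):
        x = propose(history, rng)      # "model" reads the trajectory, proposes
        history.append((x, score(x)))  # score it, append for the next step
    return max(history, key=lambda h: h[1])

best_x, best_score = opro_loop()
print(best_x)  # should approach 3.0
```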


The paper mainly uses Google's PaLM 2 and the text-bison version behind Bard as the evaluated models, and uses them together with GPT-3.5 and GPT-4, four models in total, as optimizers. The results show that different models not only design prompts in different styles, but also respond best to different styles of prompts.

The optimal prompt word previously designed by AI on the GPT series is "Let's work this out in a step by step way to be sure we have the right answer."

That prompt was designed with the APE method, published at ICLR 2023, and on GPT-3 (text-davinci-002) it outperformed the human-designed version "Let's think step by step."

But this time, on Google's PaLM 2 and Bard, the APE version performs worse than the human version used as the baseline.


Among the new prompts designed with the OPRO method, "take a deep breath" and "break the problem down" work best for PaLM 2. The text-bison version of the Bard model instead prefers detailed prompts.

In addition, the paper also demonstrates the potential of large models as mathematical optimizers.

Linear regression as an example of a continuous optimization problem

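In the linear-regression setting, each (w, b) pair proposed by the optimizer model is scored by its squared error on the training points; the scoring side might look like the following sketch (toy data and function names are illustrative, not from the paper).

```python
# Scoring side of the linear-regression task: a proposed (w, b) is evaluated
# by its sum of squared errors over the training data.

def squared_error(w, b, points):
    return sum((w * x + b - y) ** 2 for x, y in points)

# Toy data generated from y = 2x + 1 (noise-free, for clarity).
points = [(x, 2 * x + 1) for x in range(5)]

print(squared_error(2, 1, points))  # exact fit -> 0
print(squared_error(1, 0, points))  # worse fit -> larger error
```

These scores are exactly what gets written back into the meta-prompt alongside each proposal.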

The traveling salesman problem serves as an example of a discrete optimization problem.


With prompting alone, large models can find good solutions, sometimes matching or exceeding hand-designed heuristic algorithms. However, the team also believes large models cannot replace traditional gradient-based optimization algorithms: when the problem scale is large (such as a traveling salesman instance with many nodes), the OPRO method performs poorly.

Regarding future directions, the team notes that current large models cannot make effective use of error cases: merely providing errors does not let the model capture why they occurred. A promising direction is to incorporate richer feedback about error cases and to summarize the key differences between high- and low-quality prompts in the optimization trajectory. Such information could help the optimizer model improve on past prompts more efficiently, and might further reduce the number of samples required for prompt optimization.

The paper also lists a large number of the optimal prompts found in the experiments, covering practical scenarios such as movie recommendation and humorous edits of movie names, from which readers can pick their own "magic spells."


Origin blog.csdn.net/specssss/article/details/132909747