GPT-4 embarrasses DeepMind: the sorting optimization you got into Nature, I found with two prompts

Source | Qubit | Public Account QbitAI

DeepMind's new AI had been in Nature for barely a day when GPT-4 showed up to challenge it!

With only two prompts, GPT-4 arrived at the same sorting algorithm optimization as AlphaDev.

DeepMind had billed AlphaDev as "recreating AlphaGo's stroke of genius", saying it discovered a sorting algorithm that runs up to 70% faster.

Well, that makes things rather awkward for AlphaDev.

The person who got GPT-4 to "discover" the same trick put it sarcastically:

There is no need for reinforcement learning at all. Can I publish this discovery in Nature?

Musk also "happened to pass by" and left a one-word comment: "Interesting".

So how does GPT-4 do it?

Done in two prompts

The finding comes from Dimitris Papailiopoulos, an associate professor at the University of Wisconsin-Madison (hereinafter Professor D).

All it took to get GPT-4 there was typing two prompts.

First, he said to GPT-4:

Here is a sorting algorithm that I think can be further optimized. Can you please mark with * which instructions in the following lines can be deleted or improved? If nothing needs to change, leave it untouched. Explain why step by step, then go back and verify that it is true.

In this first step he also stressed that, even if GPT-4 found something, it should not change anything yet, but only "look" and write out suggestions for improvement.

Be very detailed and very careful.

GPT-4 then gave a detailed explanation of the given code.

Then Professor D gave a second prompt:

Continue. If you are very sure, apply the suggestions above. Temperature=0 (to make the generated results deterministic and consistent); try to be brief to avoid confusion.

GPT-4 then walked through the steps in detail and concluded:

We found that the instruction "mov S P" is redundant and can be removed; all other instructions are required. After the deletion, however, P should be replaced by S.

Compare this with how DeepMind's new work, AlphaDev, handled the same problem: the two are not merely related, they are exactly the same.

AlphaDev's trick reminds people of AlphaGo's "move 37", a counterintuitive move that helped defeat the legendary Go player Lee Sedol and stunned the audience.

Similarly, with its swap and copy moves AlphaDev skips a step, reaching the goal in a way that looks like a mistake but is actually a shortcut.

AlphaDev is reportedly a reinforcement learning agent based on AlphaZero. Its discoveries do not build on existing algorithms; it searches from the ground up, at the level of assembly instructions.

Its innovation lies mainly in two instruction sequences:

(1) AlphaDev Swap Move (exchange move)

(2) AlphaDev Copy Move (copy move)
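
As a rough illustration of the swap move, here is a minimal Python sketch. It assumes the description in DeepMind's paper, which this article does not reproduce: in the three-element sorting network, an earlier comparator already guarantees B <= C, so the later step that computes min(A, B, C) can compute min(A, B) instead. The function names are hypothetical, and the code models the idea with min/max rather than the actual assembly.

```python
from itertools import product

def tail_original(a, b, c):
    # Final stage of a 3-input sorting network; assumes b <= c on entry.
    return min(a, b, c), max(min(a, c), b), max(a, c)

def tail_shortened(a, b, c):
    # Shortcut: with b <= c, c cannot be smaller than b, so min(a, b) suffices.
    return min(a, b), max(min(a, c), b), max(a, c)

def sort3(a, b, c, tail):
    b, c = min(b, c), max(b, c)  # first comparator establishes b <= c
    return tail(a, b, c)

def all_agree(values=range(4)):
    # Exhaustively compare both variants against Python's sorted().
    return all(
        sort3(*t, tail_original) == sort3(*t, tail_shortened) == tuple(sorted(t))
        for t in product(values, repeat=3)
    )
```

Calling `tail_shortened` directly on an input that violates b <= c, such as (2, 3, 1), no longer returns a sorted triple. That is why the shortcut looks like a mistake in isolation and is only valid in context.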

In terms of how it works, DeepMind researchers framed the task as a single-player "assembly game":

The agent is rewarded when it searches for and selects suitable instructions (flow A in the original figure) and sorts the data correctly and quickly (flow B in the original figure).

The challenge of this game lies not only in the size of the search space (the number of possible instruction combinations is comparable to the number of particles in the universe), but also in the nature of the reward function, since a single wrong instruction can invalidate the entire algorithm.
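
To make the game concrete, here is a toy sketch under loud assumptions: the "instruction set" is just three hypothetical compare-exchange operations on a three-element list, the reward shape (fraction of inputs sorted minus a small per-instruction penalty) is invented for illustration, and brute-force enumeration stands in for AlphaDev's reinforcement learning search. It is not DeepMind's setup, only a miniature of the same idea.

```python
from itertools import permutations, product

# Hypothetical toy instruction set: compare-exchange positions i and j.
INSTRUCTIONS = [(0, 1), (0, 2), (1, 2)]

def run(program, data):
    # Execute each compare-exchange instruction in order on a copy of data.
    data = list(data)
    for i, j in program:
        if data[i] > data[j]:
            data[i], data[j] = data[j], data[i]
    return data

def is_correct(program, n=3):
    # A program is correct only if it sorts every permutation of n items.
    return all(run(program, t) == sorted(t) for t in permutations(range(n)))

def reward(program, n=3):
    # Assumed reward: correctness fraction minus a latency proxy (length).
    tests = list(permutations(range(n)))
    solved = sum(run(program, t) == sorted(t) for t in tests)
    return solved / len(tests) - 0.01 * len(program)

def shortest_correct_program(max_len=4):
    # Brute force over instruction sequences, shortest first.
    for length in range(max_len + 1):
        for program in product(INSTRUCTIONS, repeat=length):
            if is_correct(program):
                return program
    return None
```

Even in this miniature, dropping any single instruction from the program returned by `shortest_correct_program()` makes `is_correct` fail, which mirrors the point that one wrong instruction breaks the whole algorithm. The real search space is vastly larger, which is why exhaustive enumeration does not scale.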

Netizens: we keep underestimating GPT-4

Reacting to GPT-4's showy feat, some said that even senior developers had underestimated it.

Others remarked that Professor D's experiment confirms once more that, with patience and an understanding of prompt engineering, there is still a great deal GPT-4 can do.

Some also asked whether GPT-4 could do this only because its training data already contained sorting optimizations of this kind.

That said, a large part of the reason this story has drawn so much attention and discussion is that AlphaDev's appearance in Nature is itself controversial.

Many people think the research is not groundbreaking and that DeepMind is overselling it.

Professor D was not the only one sarcastically asking "Can I get into Nature too?"; some netizens said they had optimized quicksort as teenagers, and that ought to be publishable as well.

Of course, others argue that AlphaDev's real innovation is that it uses reinforcement learning to discover new algorithms.

What do you think?

Reference links:
[1] https://chat.openai.com/share/95693df4-36cd-4241-9cae-2173e8fb760c
[2] https://twitter.com/DimitrisPapail/status/1666843952824168465
