GPT-4 successfully concluded that P≠NP, Terence Tao’s prediction came true! 97 rounds of "Socratic reasoning" dialogue to solve the world's mathematical problems

 Source | Machine Heart ID | almosthuman2014

For those who are in the field of scientific research, they have more or less heard of the P/NP problem. This problem was included in the Millennium Prize Problems by the Clay Mathematics Institute. There are seven major problems in it, which are well-known to everyone. Poincaré Hypothesis, Riemann Hypothesis, etc. are included. And the organization has offered millions of dollars in prizes to researchers who can solve the problem.

The P/NP problem was first proposed in 1971 by Stephen A. Cook and Leonid Levin respectively. Over the years, many people have devoted themselves to researching this problem. But some people say that a conservative estimate of the solution of P=NP may take another 100 years.

In recent years, many people have claimed to have proved that P is equal to or not equal to NP, but there are errors in the proof process. So far, no one has been able to answer this question.

Now, with the development of AI technology, especially the rapid iteration of large language models this year, some research has begun to try to use AI technology to solve these world problems.

In this article, researchers from Microsoft Research, Peking University, Beihang University and other institutions propose the use of large language models (LLM) to enhance and accelerate the research on P versus NP problems. 

Specifically, this article proposes a general framework that enables LLM to think deeply and solve complex problems: Socratic reasoning. Based on this framework, LLM can recursively discover, solve, and integrate problems while also self-evaluating and improving.

This paper's pilot study on the P vs. NP problem shows that GPT-4 successfully generated a proof pattern and conducted rigorous reasoning in 97 dialogue rounds, reaching the conclusion of "P≠ NP", which is consistent with (Xu and Zhou, 2023).

picture

Paper address: https://arxiv.org/pdf/2309.05689.pdf

The contributions of this article can be summarized as:

  • Use LLM as a collaborative partner with humans to address complex scientific challenges, and propose the "LLM for Science (LLM4Science)" paradigm.

  • Introducing a framework called "Socratic Reasoning" to encourage LLM to use deduction, transformation, decomposition and other modes to stimulate critical thinking.

  • A pilot study is conducted using GPT-4 and the Socratic reasoning framework to solve P versus NP problems in theoretical computer science.

  • GPT-4 successfully generates proof patterns and performs rigorous reasoning in 97 dialogue turns, reaching the conclusion that P ≠ NP, consistent with recent work by Xu and Zhou (2023).

  • This research demonstrates the potential ability of LLMs such as GPT-4 to infer new knowledge and explore complex expert-level problems in collaboration with humans.

  • This article emphasizes that LLM is a general innovation leader across fields, which is different from previous specialized AI models tailored for specific tasks.

  • LLM’s ability to use natural and mathematical language fluently is essential for interdisciplinary discovery.

  • This work reveals how LLMs can be leveraged as partners to enhance and accelerate scientific research processes across diverse fields.

The article stated that the reason why they named the framework "Socratic Reasoning" was inspired by the ancient Greek philosopher Socrates. Socrates once said: "I can't teach anyone anything. I can only make them think." The overall design idea of ​​the framework is also the same. It is a general problem-solving framework that allows LLM to be used in a wide range of Navigate the solution space and arrive at answers efficiently.

As shown in Table 1, "Socratic Reasoning" has five prompt modes: deduction, transformation, decomposition, verification, and integration. These patterns are used to discover new insights and perspectives, break complex problems into sub-problems or small steps, and engage in self-improvement by challenging responses to answers.

On smaller problems (atomic problems), LLM can directly give reasoning results. At this time, a deductive mode (for example, the prompt is let us think step by step...) is used to guide LLM to directly draw conclusions.

For more complex problems, this paper first requires LLM to transform the problem into a new problem or decompose it into several sub-problems. These patterns are then executed recursively until an atomic ji problem is reached.

When new questions arise or new conclusions are drawn, the verification mode is adopted and LLM's self-evaluation ability is used for verification and improvement.

Finally, the fusion mode requires LLM to synthesize conclusions based on the results of subproblems.

The LLM is motivated to continue the above process recursively through a series of conversations until the target problem is solved.

picture

In this work, "Socratic Reasoning" provides a systematic prompt framework for challenging questions.

The picture below is an example of a dialogue used to solve the P vs. NP problem in "Socratic Reasoning". The GPT-4 API is used in the case study, and in addition, the article sorts the processes based on round index. 

picture

During the exploration process, this paper introduces five different characters (e.g., mathematicians proficient in probability theory) as auxiliary provers. A total of 97 rounds of dialogue were conducted to complete this experiment, divided into 14 dialogue rounds in the first stage and 83 dialogue rounds in the last stage.

For example, the first round prompt: Can you find the fundamental problem behind P!=NP? From a philosophical perspective, not from a computer theory perspective.

picture

Other tips are as follows:

picture

picture

After that, the dialogue continued, and the last round of dialogue was like this: Finally, the conclusion P≠ NP was given.

picture

Interested readers can view the original paper to learn more.

Guess you like

Origin blog.csdn.net/lqfarmer/article/details/133181859