OpenAI's chief scientist has a plan to find ways to control superintelligent AI

OpenAI recently published a research paper describing an experiment designed to test whether a weaker AI model can guide a much stronger one without degrading its capabilities. Although the models involved do not yet surpass human abilities, the experiment anticipates a future in which humans must work with artificial intelligence systems that are smarter than they are.

According to reports on December 15, OpenAI promised as early as its founding to build artificial intelligence that benefits all of humanity, even if that AI may one day be smarter than its creators. Since the debut of ChatGPT, the company's commercial ambitions have become more prominent. Recently, OpenAI announced a new research team dedicated to studying future superintelligent AI, and that team has already begun to produce results.

Leopold Aschenbrenner, a researcher at OpenAI, put it this way: "Artificial general intelligence (AGI) is rapidly approaching. We will see superintelligent models with enormous capabilities that may also be very dangerous, and we have not yet found a way to control them." He is a member of the "Superalignment" research team, established in July this year. OpenAI has said it will devote one-fifth of its available computing power to the Superalignment project, which explores how to keep superintelligent AI safe and controllable.

In the experiments, OpenAI researchers examined the supervision process used to tune systems like GPT-4, the large language model behind ChatGPT, to make them more helpful and less harmful. Today this involves humans giving the AI system feedback on which answers are good and which are bad. As artificial intelligence advances, researchers are exploring how to automate this process, partly to save time, but also because they believe that as AI becomes more powerful, humans may no longer be able to provide useful feedback.
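
To make the feedback step concrete, here is a minimal, illustrative sketch of how "this answer is better than that one" judgments are commonly turned into a training signal, via a pairwise preference loss over a small reward model. This is not OpenAI's pipeline; the embeddings, dimensions, and model below are all assumptions invented for the example.

```python
# Illustrative sketch only: turn human "good vs. bad answer" judgments into a
# training signal by fitting a tiny reward model with a pairwise preference loss.
import torch
import torch.nn as nn

torch.manual_seed(0)

EMBED_DIM = 16  # assumed size of a pre-computed answer embedding


class RewardModel(nn.Module):
    """Scores an answer embedding with a single scalar reward."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, answer_emb: torch.Tensor) -> torch.Tensor:
        return self.score(answer_emb).squeeze(-1)


# Toy "human feedback": pairs of (preferred answer, rejected answer) embeddings.
good = torch.randn(64, EMBED_DIM) + 0.5
bad = torch.randn(64, EMBED_DIM) - 0.5

model = RewardModel(EMBED_DIM)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(200):
    # Pairwise loss: push the preferred answer's score above the rejected one's.
    loss = -torch.nn.functional.logsigmoid(model(good) - model(bad)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final preference loss: {loss.item():.3f}")
```

The learned scores can then stand in for human judgments when further tuning the model, which is why automating this loop matters once humans can no longer judge the answers themselves.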

In controlled experiments, the researchers used OpenAI's GPT-2 text generator, first released in 2019, to teach GPT-4, and tested two approaches. One trains progressively larger models step by step to reduce the performance lost at each step; the other is a tweak to GPT-4's training objective that lets the stronger model follow the weaker model's guidance without crippling its own performance. The second approach proved more effective. The researchers acknowledge that neither method guarantees the stronger model will behave perfectly, but they see the results as a starting point for further research.
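
The second approach is described above as a change to the training objective. The toy sketch below shows one way such an objective can look: a mix of "follow the weak supervisor's labels" and "stay confident in your own predictions when you disagree." The mixing weight `alpha`, the stand-in classifier, and the synthetic data are illustrative assumptions, not OpenAI's implementation.

```python
# Illustrative sketch of a weak-to-strong training objective: blend imitation of
# weak labels with cross-entropy against the strong model's own hardened predictions.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

NUM_FEATURES, NUM_CLASSES = 20, 2
strong_model = nn.Linear(NUM_FEATURES, NUM_CLASSES)  # stand-in for the "strong" model
opt = torch.optim.Adam(strong_model.parameters(), lr=1e-2)
alpha = 0.5  # assumed weight on the self-confidence term

# Toy batch: inputs plus (possibly noisy) labels produced by a weaker supervisor.
x = torch.randn(128, NUM_FEATURES)
weak_labels = torch.randint(0, NUM_CLASSES, (128,))

for step in range(100):
    logits = strong_model(x)
    # Term 1: imitate the weak supervisor's labels.
    weak_ce = F.cross_entropy(logits, weak_labels)
    # Term 2: cross-entropy against the model's own hardened predictions, so it is
    # not forced to copy weak labels it confidently disagrees with.
    self_targets = logits.detach().argmax(dim=-1)
    self_ce = F.cross_entropy(logits, self_targets)
    loss = (1 - alpha) * weak_ce + alpha * self_ce
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"weak-label CE: {weak_ce.item():.3f}, self CE: {self_ce.item():.3f}")
```

The design intuition is that a strong model supervised only by weak labels would otherwise learn to imitate the weak model's mistakes; the second term gives it permission to keep its own (presumably better) judgment.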

"It's great to see OpenAI proactively addressing the problem of controlling superintelligent AI, a challenge that will take years of hard work," said Dan Hendrycks, director of the Center for AI Safety, a San Francisco-based non-profit dedicated to managing the risks of artificial intelligence.

Aschenbrenner and two other members of the Superalignment team, Collin Burns and Pavel Izmailov, said in interviews that they are encouraged by this important first step and believe it could help tame a potential superintelligent AI. Izmailov offered an analogy: "It's like a sixth-grader: even though they know less mathematics than a college math major, they can still convey to the college student what they want to achieve. That's exactly the effect we're after."

The Superalignment team is co-led by Ilya Sutskever, OpenAI's chief scientist and co-founder. Sutskever was one of the board members who voted last month to fire CEO Sam Altman, though he later reversed course and threatened to resign if Altman was not reinstated. Sutskever is a co-author of the new paper, but OpenAI declined to make him available to discuss the project.

Last month, Altman reached an agreement with OpenAI under which most of the board resigned, and Sutskever's future at the company remains uncertain. Even so, Aschenbrenner said: "We are very grateful to Sutskever, who was the driving force behind this project."

OpenAI's researchers are not the first to try to use today's technology to test what might help tame future artificial intelligence systems. But as with earlier work in corporate and academic labs, there is no way to be certain that ideas which succeed in carefully designed experiments will remain practical in the future. The researchers describe the ability to have a weaker AI model train a stronger one as "a key component in solving the broader 'super-alignment' problem."
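
As a rough picture of that experimental setup, the sketch below has a small "weak" model trained on ground truth, uses its predictions as the only labels a larger "strong" model ever sees, and compares the result with a strong model trained directly on the real labels. The task, models, and sizes are all invented for illustration and say nothing about how well the actual GPT-2-to-GPT-4 experiments worked.

```python
# Illustrative weak-to-strong setup: the weak model's predictions replace ground
# truth as supervision for the strong model, and we compare against a strong
# model trained on the real labels (the "ceiling").
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)


def make_data(n):
    x = torch.randn(n, 10)
    y = (x[:, :3].sum(dim=-1) > 0).long()  # simple synthetic task
    return x, y


def train(model, x, y, steps=300):
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(steps):
        loss = F.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model


def accuracy(model, x, y):
    return (model(x).argmax(dim=-1) == y).float().mean().item()


x_train, y_train = make_data(2000)
x_test, y_test = make_data(1000)

weak = train(nn.Linear(10, 2), x_train, y_train)          # weak supervisor
weak_labels = weak(x_train).argmax(dim=-1)                 # its labels replace ground truth
strong_on_weak = train(nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 2)),
                       x_train, weak_labels)
strong_ceiling = train(nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 2)),
                       x_train, y_train)

print("weak supervisor:      ", accuracy(weak, x_test, y_test))
print("strong w/ weak labels:", accuracy(strong_on_weak, x_test, y_test))
print("strong ceiling:       ", accuracy(strong_ceiling, x_test, y_test))
```

The interesting question in such a setup is how much of the gap between the weak supervisor and the strong ceiling the weakly supervised model manages to recover.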

This AI alignment experiment also raises a key question: how much can the control system be trusted? At the heart of OpenAI's new technique is the idea that the more powerful AI system decides for itself which guidance from the weaker system it can ignore, and that choice could lead it to discard information that would prevent it from acting unsafely in the future. For such a system to be effective, progress is needed on guarantees of consistency. "You ultimately need a high level of trust," Burns emphasized.

Stuart Russell, a professor at the University of California, Berkeley who studies AI safety, said the idea of using a less powerful AI model to control a more powerful one has been around for some time. But he also points out that it remains unclear whether such methods of teaching AI behavior are feasible, because they have not yet made today's models behave reliably.

While OpenAI takes its first steps toward controlling more advanced artificial intelligence, the company is also seeking outside help. In partnership with former Google CEO Eric Schmidt, OpenAI announced $10 million in grants for outside researchers working on weak-to-strong supervision, the interpretability of advanced models, and strengthening models against prompts designed to circumvent their restrictions. The researchers involved in the new paper said OpenAI will also hold a conference on superalignment next year.

As a co-founder of OpenAI and co-lead of the Superalignment team, Sutskever leads many of the company's most important technical efforts. He is also among the prominent experts increasingly worried about how to control artificial intelligence as it becomes more powerful, an issue that has drawn new attention this year largely because of ChatGPT. Sutskever earned his PhD under deep-neural-network pioneer Geoffrey Hinton, who left Google in May this year and warned that artificial intelligence appears to be approaching human levels on some tasks.

Origin blog.csdn.net/leyang0910/article/details/135025565