AlphaDev Found a Sorting Algorithm That Is Up to 70% Faster

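Source: 机器之心 (Synced)
This article is about 3,700 characters long; suggested reading time is 5 minutes.
AlphaDev has discovered a brand-new, faster sorting algorithm.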

"With its swap and copy moves, AlphaDev skips a step and connects items in a way that looks like a mistake but is actually a shortcut." This never-before-seen, counterintuitive idea brings back memories of the spring of 2016.

Seven years ago, AlphaGo defeated the human world champion in Go, and now AI has taught us a lesson in programming.

Recently, a short post by Google DeepMind CEO Demis Hassabis set the computing world abuzz: "AlphaDev discovered a new and faster sorting algorithm, and we have open sourced it in the main C++ library for developers to use. This is just the beginning of AI's progress in improving code efficiency."

This time, AlphaDev, Google DeepMind's new reinforcement learning system, discovered faster sorting and hashing algorithms, both of them fundamental algorithms in computer science. The results have now been incorporated into the LLVM standard C++ library (libc++) and into Abseil, and have been open-sourced.

How important is this result? "We estimate that the sorting and hashing algorithms it discovered are invoked trillions of times a day around the world," said Daniel J. Mankowitz, a Google DeepMind research scientist and one of AlphaDev's lead authors.

AI seems to be algorithmically speeding up the world.

These algorithms improve the LLVM libc++ sorting library: sorting is up to 70% faster for shorter sequences and about 1.7% faster for sequences of more than 250,000 elements. Google DeepMind says this is the first change to this part of the sorting library in more than a decade. AI, it seems, can now help people not only write code but write better code.

In their latest blog post, the authors of the new system describe AlphaDev in detail.

New Algorithms Will Change the Basis of Computing

A digital society drives ever-increasing demand for computing and energy. Over the past fifty years, the digital age has relied on hardware improvements to keep up with that demand. But as microchips approach their physical limits, improving the code that runs on them becomes critical. This is especially true for the algorithms behind code that runs trillions of times a day.

The result is this research from Google DeepMind, published in Nature: AlphaDev, an AI system that uses reinforcement learning to discover algorithms that surpass those honed by scientists and engineers over decades.

Paper address:

https://www.nature.com/articles/s41586-023-06004-9

In short, AlphaDev found a faster sorting algorithm. Billions of people rely on sorting algorithms every day, yet few realized that there was still room to improve them. Sorting is used everywhere, from ranking online search results and social posts to all kinds of data processing on computers and mobile phones. Using AI to generate better algorithms will change the way humans program computers, with major implications for an increasingly digital society.

Because the new sorting algorithm has been open-sourced in a major C++ library, millions of developers and companies around the world can now use it in AI applications across industries as diverse as cloud computing, online shopping, and supply chain management. This is the first change to this part of the sorting library in more than a decade, and the first time an algorithm designed with reinforcement learning has been added to the library. The team sees this as a major milestone in using artificial intelligence to optimize the world's code step by step.

About sorting

A sorting algorithm is a method of arranging a number of items in a particular order. Examples include alphabetizing three letters, ordering five numbers from largest to smallest, or sorting a database of millions of records.

Sorting has a long history and has evolved considerably. One of the earliest examples dates back to the 2nd and 3rd centuries AD, when scholars alphabetized thousands of books by hand on the shelves of the Library of Alexandria. The Industrial Revolution brought machines that could help people sort, including the tabulating machines that stored information on punched cards and were used to compile the results of the 1890 U.S. census.

With the rise of commercial computers in the 1950s, the earliest computer science algorithms for sorting began to appear. Today, many different sorting techniques and algorithms are used in codebases around the world to process massive amounts of online data.

Figure: A sequence of unsorted numbers is fed into an algorithm, which outputs the sorted numbers.

After decades of research by computer scientists and programmers, today's sorting algorithms are so efficient that further improvement is very hard to achieve, a bit like trying to find a new way to save electricity or a more efficient mathematical method. These algorithms are also a cornerstone of computer science.

Exploring New Algorithms: Assembly Instructions

AlphaDev found faster algorithms by starting from scratch rather than refining existing ones, and it looked where most people do not: the computer's assembly instructions.

Assembly instructions can be used to create binary code that a computer executes. Developers write code in a high-level language such as C++, but it must be translated into "low-level" assembly instructions that computers can understand.

Google DeepMind sees room for many improvements at this level that might be hard to spot in higher-level programming languages. At this level, the storage and operation of computers are more flexible, which means that there are more potential improvements that can have a greater impact on speed and energy use.

Figure: Code is usually written in a high-level programming language such as C++. The compiler then translates it into low-level CPU instructions called assembly instructions. An assembler converts the assembly instructions into executable machine code that the computer can run.

Figure A: Example of a C++ algorithm that sorts up to two elements; Figure B: the corresponding assembly representation.

Finding the Best Algorithm Using AlphaGo's Method

AlphaDev builds on an earlier Google DeepMind achievement: AlphaZero, the reinforcement learning model that beat world champions at games such as Go, chess and shogi. AlphaDev shows how this model can transfer from games to scientific challenges, and from simulations to real-world applications.

To train AlphaDev to discover new algorithms, the team turned sorting into a single-player "assembly game." On each turn, AlphaDev observes the algorithm it has generated so far and the information held in the CPU, and then makes a move by choosing an instruction to add to the algorithm.

The assembly game is very hard because AlphaDev must search efficiently through an enormous number of possible instruction combinations to find an algorithm that sorts correctly and is faster than the current best one. The number of possible instruction combinations is comparable to the number of particles in the universe, or to the number of possible games of chess (10^120) and Go (10^700), and a single wrong move can invalidate the entire algorithm.

Figure A: The assembly game. The player, AlphaDev, receives the system state s_t as input and makes a move by selecting an assembly instruction to add to the algorithm generated so far. Figure B: Reward computation. After each move, the generated algorithm is fed test input sequences; for sort3, these are all combinations of three-element sequences. The algorithm's output is then compared with the expected sorted output. The agent is rewarded according to the algorithm's correctness and latency.

As it builds an algorithm one instruction at a time, AlphaDev checks that the algorithm is correct by comparing its output with the expected result. For a sorting algorithm, this means that unordered numbers go in and correctly sorted numbers come out. The team rewards AlphaDev both for sorting the numbers correctly and for doing so quickly and efficiently, and AlphaDev wins the game by discovering a correct, faster program.
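The reward check described above can be illustrated with a toy sketch. It is not AlphaDev's implementation: the one-instruction "cmpswap" language, the `run_program` and `reward` helpers, and the per-instruction penalty (a crude stand-in for measured latency) are all assumptions made for illustration, whereas the real system works on actual CPU assembly instructions.

```python
from itertools import permutations

# Hypothetical toy instruction: ("cmpswap", i, j) means
# "if values[i] > values[j], swap the two positions".
def run_program(program, values):
    values = list(values)
    for op, i, j in program:
        if op == "cmpswap" and values[i] > values[j]:
            values[i], values[j] = values[j], values[i]
    return values

def reward(program, n=3, length_penalty=0.01):
    """Toy reward: fraction of test inputs sorted correctly, minus a
    small penalty per instruction (assumed proxy for latency)."""
    tests = list(permutations(range(1, n + 1)))
    correct = sum(run_program(program, t) == sorted(t) for t in tests)
    return correct / len(tests) - length_penalty * len(program)

# A partial program (one move played) and a complete three-element sort.
partial = [("cmpswap", 0, 1)]
full = [("cmpswap", 0, 1), ("cmpswap", 1, 2), ("cmpswap", 0, 1)]

print(reward(partial), reward(full))
```

In this sketch the complete program earns a higher reward than the partial one because it sorts every test input, and a shorter program that was equally correct would score higher still.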

AlphaDev Found a Faster Sorting Algorithm

AlphaDev discovered new sorting algorithms that improve the LLVM libc++ sorting library: sorting is up to 70% faster for shorter sequences and about 1.7% faster for sequences of more than 250,000 elements.

The Google DeepMind team focused on improving the sorting algorithms for short sequences of three to five elements. These algorithms are among the most widely used because they are often called many times as part of a larger sort function, so improving them can speed up sorting for any number of items.
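To see why short fixed-size sorts matter, consider a minimal sketch in which a larger sort bottoms out in a three-element routine. This is an illustration only: the names `sort3`, `small_sort` and `merge_sort` are hypothetical, and merge sort is used purely for brevity; it is not how libc++ structures its sort.

```python
def sort3(a, b, c):
    """Sort three values with a fixed sequence of three compare-exchanges."""
    if b > c:
        b, c = c, b
    if a > c:
        a, c = c, a
    if a > b:
        a, b = b, a
    return a, b, c

def small_sort(items):
    """Base case for sequences of at most three elements."""
    if len(items) <= 1:
        return list(items)
    if len(items) == 2:
        a, b = items
        return [a, b] if a <= b else [b, a]
    return list(sort3(*items))

def merge_sort(items):
    """Illustrative larger sort that calls the small routine many times."""
    if len(items) <= 3:
        return small_sort(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 7, 3, 8]))
```

Every recursive branch ends in `small_sort`, so any instruction saved there is saved many times over in a single large sort.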

To make the new sorting algorithms more useful to people, the team reverse-engineered them and translated them into C++, one of the most popular programming languages among developers.

These algorithms are currently available in the LLVM libc++ standard sorting library (https://reviews.llvm.org/D118029), used by millions of developers and companies around the world.

"Swap and Copy Moves": Has the Divine Move Returned?

In fact, AlphaDev discovered not only faster algorithms but also new approaches. Its sorting algorithms contain new sequences of instructions that save one instruction each time they are applied, which adds up to a huge impact because these algorithms are used trillions of times a day. The team calls these the "AlphaDev swap and copy moves."

The novel approach is reminiscent of AlphaGo's "move 37", the counterintuitive move that stunned onlookers and led to the defeat of a legendary Go player. With the swap and copy moves, AlphaDev skips a step and connects items in a way that looks like a mistake but is actually a shortcut. This demonstrates AlphaDev's ability to uncover original solutions and challenges the way humans think about improving computer science algorithms.

Figure: Left: the original sort3 implementation, which computes min(A,B,C); Right: the AlphaDev swap move. AlphaDev found that min(A,B) is all that is needed.

Figure: Left: the original implementation of max(B,min(A,C,D)), used in a larger sorting algorithm that sorts eight elements; Right: AlphaDev found that, with its copy move, only max(B,min(A,C)) is needed.
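The two shortcuts in the captions above hold only under certain preconditions, which the captions do not spell out. The sketch below assumes that earlier comparisons already guarantee B <= C for the swap move and D >= min(A,C) for the copy move, and checks both identities by brute force over small values. It is a sanity check on the arithmetic, not AlphaDev's code.

```python
from itertools import product

values = range(4)

# Swap move: assuming B <= C is already guaranteed,
# min(A, B, C) equals min(A, B).
swap_ok = all(
    min(a, b, c) == min(a, b)
    for a, b, c in product(values, repeat=3)
    if b <= c
)

# Copy move: assuming D >= min(A, C) is already guaranteed,
# max(B, min(A, C, D)) equals max(B, min(A, C)).
copy_ok = all(
    max(b, min(a, c, d)) == max(b, min(a, c))
    for a, b, c, d in product(values, repeat=4)
    if d >= min(a, c)
)

# Without the precondition the shortcut fails, which is why it
# looks like a mistake at first glance.
counterexample = next(
    (a, b, c)
    for a, b, c in product(values, repeat=3)
    if min(a, b, c) != min(a, b)
)

print(swap_ok, copy_ok, counterexample)
```

Any counterexample found here necessarily has C smaller than B, which is exactly the case the assumed precondition rules out.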

Extended ability test: from "sorting" to "hashing"

After discovering a faster sorting algorithm, the team tested whether AlphaDev could generalize and improve on a different computer science algorithm: hashing. 

Hashing is a fundamental algorithm used in computing to retrieve, store, and compress data. Just as a librarian uses a classification system to locate a particular book, a hashing algorithm helps users know what they are looking for and exactly where to find it. These algorithms take the data for a specific key (such as the username "Jane Doe") and hash it, a process that turns the raw data into a compact, ideally unique string (such as 1234ghfty). Computers use this hash to retrieve the data associated with the key quickly, rather than searching through all the data.
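A minimal sketch of that idea follows: a key is hashed to pick a bucket, so a lookup inspects one bucket instead of every record. The `TinyHashTable` class, the `bucket_index` helper and the bucket count are hypothetical, and SHA-256 from Python's standard library stands in for the hash function; it is not the function used in Abseil.

```python
import hashlib

NUM_BUCKETS = 8  # assumed size, chosen only for the example

def bucket_index(key: str) -> int:
    """Hash the key and map the digest to a bucket number."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS

class TinyHashTable:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def put(self, key, value):
        bucket = self.buckets[bucket_index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        # Only the one bucket selected by the hash is searched.
        for existing_key, value in self.buckets[bucket_index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)

table = TinyHashTable()
table.put("Jane Doe", {"plan": "premium"})
print(table.get("Jane Doe"))
```

Because the hash function runs on every insert and lookup, even a small speedup in it is multiplied across every use of the table.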

The team applied AlphaDev to one of the most commonly used hashing algorithms for data structures to try to discover a faster version. When applied to the 9-16 byte range of the hashing function, the algorithm AlphaDev discovered was 30% faster.

This year, AlphaDev's new hashing algorithm was released into the open-source Abseil library, where it is available to millions of developers around the world, and it is now estimated to be used trillions of times a day.

Open source address:

https://github.com/abseil/abseil-cpp/commit/74eee2aff683cc7dcd2dbaa69b2c654596d8024e

Conclusion

By optimizing and rolling out improved sorting and hashing algorithms from Google DeepMind to developers around the world, AlphaDev demonstrated its ability to generalize and discover new algorithms with real-world impact. AlphaDev can be seen as a step toward developing general-purpose AI tools that can help optimize the entire computing ecosystem and solve other problems that benefit society.

While optimizing in the space of low-level assembly instructions is very powerful, AlphaDev runs into limitations as algorithms grow larger. The team is currently exploring whether it can optimize algorithms directly in high-level languages such as C++, which would be more useful for developers.

AlphaDev's discoveries, such as the swap and copy moves, show that it can not only improve algorithms but also find new solutions. These discoveries may inspire researchers and developers to create techniques and methods that further optimize fundamental algorithms, leading to a more robust and sustainable computing ecosystem.

References:

https://www.deepmind.com/blog/alphadev-discovers-faster-sorting-algorithms?utm_source=twitter&utm_medium=social&utm_campaign=OCS

https://news.ycombinator.com/item?id=36228125

https://twitter.com/DJ_Mankowitz/status/1666468646863130631

Editor: Wen Jing

Origin blog.csdn.net/tMb8Z9Vdm66wH68VX1/article/details/131297612