What did OpenAI do right?


Author | Li Jianzhong       

Source | WeChat public account "Li Jianzhong Yansi"

Through a series of breakthroughs in AI technology and products, OpenAI ignited the race toward artificial general intelligence (AGI), which Microsoft CEO Satya Nadella called "a technological wave comparable to the Industrial Revolution." The AGI route based on large language models that OpenAI pioneered has effectively sidelined competing AI approaches and single-handedly changed the course of the entire field, something nearly unprecedented in the history of technology. How did a start-up of two or three hundred people (when ChatGPT launched at the end of last year, the OpenAI team numbered about 270) overcome every obstacle in an arena where giants have competed for years and claim the holy grail of general artificial intelligence? In Silicon Valley and in China alike, many people are asking:

Why is it a start-up like OpenAI that stands behind an epic revolution like AGI? What did OpenAI do right?

I have been tracking and researching industrial developments in AI since 2016. Because I organize the Global Machine Learning Technology Conference (ML-Summit) every year, I have been in regular contact with many experts from OpenAI, including Ilya Sutskever (chief scientist), Lukasz Kaiser (co-inventor of the Transformer), Andrej Karpathy (co-founder), and Ian Goodfellow (father of the GAN). I noticed long ago that OpenAI is a maverick, an "odd one out" in the AI field.

Looking back at the development of the AI industry and at the many key choices OpenAI made along the way, I am almost amazed to find that, as a start-up, OpenAI unhesitatingly chose the "difficult but correct" option at every critical fork in the road. Revisiting these "difficult but correct" choices in OpenAI's history should, I believe, offer important inspiration to many of our colleagues in the AI field today.


Vision and Mission: Taking Aim at General Artificial Intelligence

"Our goal is to advance artificial intelligence in the way that most benefits humanity as a whole. Today's AI systems are impressive, but their capabilities are narrow; in the future, it is entirely possible that AI will match human performance on almost every intellectual task. The outcome of this venture is uncertain and the work is difficult, but we believe our goal and our structure are right."

This passage is excerpted from the "vision and mission" blog post published by OpenAI's founding team shortly after the company was established in December 2015. Reading it eight years later, it still feels sincere and stirring.

That OpenAI could put forward such a powerful vision and mission of "general artificial intelligence" in 2015, when the entire AI field was still shrouded in fog, rested on the founding team's firm belief in artificial intelligence, their deep understanding of it, and their grasp of the state of research. I call these collectively the "willpower" for general artificial intelligence. It is this willpower that has kept OpenAI on course, again and again, along the road of AI development.

"Vision and mission" in today's impetuous venture capital circle can easily be alienated as "fooling VCs and drawing cakes for employees". But if you study the history of human science and technology development, you will find that putting forward a strong "vision and mission" in a field is a distinctive feature of being a pioneer in a field. Conversely, all revolutionary things are extremely difficult. Without a strong "vision and mission" guidance, it is easy to give up and collapse when encountering difficulties. So, for those founders who have a strong belief, I encourage everyone to speak out about your "vision and mission." I also hope that our venture capital and media circles will encourage and support entrepreneurs' "vision and mission" instead of ridiculing them.

I often wonder: if we rewound to 2015 and two young men, 30-year-old Sam Altman and 29-year-old Ilya Sutskever, pitched the above vision and mission at one of our venture-capital events, wouldn't they have been drowned in the spittle of all the "big bosses"? The fact is that OpenAI received roughly 100 million US dollars in donations at its founding, and it was established as a non-profit organization.


Technical Route 1: Unsupervised Learning

Not long after OpenAI was founded, it bet on the path of unsupervised learning under Ilya Sutskever's leadership. Friends familiar with AI research know that a decision which looks unquestionably correct today was by no means obvious in 2015-2016. At that time, supervised learning on labeled data dominated the field and delivered better results in many vertical domains such as recommendation systems and machine vision.

However, "unsupervised learning" was very immature in terms of theoretical breakthroughs and engineering technology at that time, and the effect was greatly reduced, which was a typical "non-mainstream". However, "unsupervised learning" that does not require manual labeling of data has strong universality and is easy to expand. Through large-scale data pre-training, the model can learn the rich human knowledge contained in the data, so that it can perform well in various tasks. Show your skills. For the goal of "general artificial intelligence", "unsupervised learning" obviously has "task universality" and the ability to quickly "scale (expand)" based on massive data.

Looking back today, OpenAI's unsupervised learning has left many supervised-learning approaches far behind, but choosing it at the time was clearly a "difficult but correct" decision, one inseparable from the vision of AGI.


Technical Route 2: Generative Models

In 2016, when various recognition tasks (visual recognition, speech recognition, and so on) were all the rage, OpenAI opened its June 2016 blog post "Generative Models" with the famous line from physicist Richard Feynman: "What I cannot create, I do not understand." The post turned the focus of OpenAI's research toward generative tasks.

At that time, although the GAN (generative adversarial network) invented by Ian Goodfellow had produced some astonishing moments, given its lack of interpretability and its limited practical usefulness compared with recognition tasks, the mainstream AI industry's overall verdict on generative models was "difficult, but not very useful."

Yet reading through "Generative Models," one can see that the OpenAI team had already concluded that generative models were the only road to AGI, and one can appreciate the team's independent, against-the-grain character.


Technical Route 3: Natural Language

After deep learning entered industry and became the mainstream method in 2012, machine vision quickly matured into the field with the best results and the strongest monetization. And although Ilya Sutskever himself rose to fame in machine vision through AlexNet's performance in the ImageNet competition, OpenAI did not choose vision as its main direction after some early attempts; instead it bet on the more difficult and riskier field of natural language.

Compared with vision, speech, and other fields, natural language processing had long been considered relatively backward: natural language tasks have enormous complexity and a vast solution space, and many methods that worked well on one task performed poorly and unevenly on another. Hence the saying in the industry that natural language processing is the "holy grail" of artificial intelligence.

While experimenting with OpenAI Gym (an open-source reinforcement-learning platform) and OpenAI Five (reinforcement learning applied to the game Dota 2), OpenAI was pushing further and further on applying unsupervised learning to natural language. In particular, in 2017 its generative approach of predicting the next character of Amazon reviews achieved striking results.

Why did OpenAI choose to bet on natural language? To borrow from the philosopher Wittgenstein, "the limits of language are the limits of the world." In Ilya Sutskever's words, "language is a mapping of the world, and GPT is a compression of language." For human intelligence, natural language is the core of the core; vision, speech, and the rest are merely auxiliary material for language.

It was precisely because of this faith-like bet on natural language as the road to AGI that, when Google's foundational Transformer paper "Attention Is All You Need" came out on June 12, 2017, Ilya Sutskever, in his own words, saw it the day after publication and his first reaction was "that's it." The Transformer swept aside the previous generation of natural language methods such as RNNs and LSTMs, and cleared away some key obstacles on the OpenAI team's path through natural language.
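The paper's core computation is compact enough to sketch. Below is a minimal NumPy version of scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V as defined in "Attention Is All You Need"; the random inputs are purely illustrative, not anyone's production code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per the Transformer paper."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)            # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                                        # weighted sum of values

# Illustrative shapes: a sequence of 5 tokens, model width 8.
x = np.random.default_rng(0).normal(size=(5, 8))
out = scaled_dot_product_attention(x, x, x)   # self-attention: Q = K = V = x
print(out.shape)                              # (5, 8)
```

Unlike an RNN or LSTM, every token attends to every other token in a single parallel step, which is a large part of why the architecture trains so well at scale.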

Unfortunately, the Transformer did not receive enough attention inside Google itself, while the OpenAI team treasured it like a rare find. The scene recalls Steve Jobs's 1979 visit to Xerox PARC: after seeing the Alto computer's graphical user interface (GUI) and mouse, he returned to Apple, bet on the graphical interface, and opened the era of the personal computer, while Xerox PARC's leadership was slow to see the enormous computing potential the GUI would unleash for the general public. One after another, the co-inventors who built the Transformer have also left Google: some joined OpenAI (including Lukasz Kaiser, OpenAI research scientist and our keynote speaker at the 2021 Global Machine Learning Technology Conference), and others founded a new generation of AI companies with the backing of Silicon Valley VCs. It feels a bit like a replay of Fairchild Semiconductor's "traitorous eight" in Silicon Valley.


Technical Route 4: Decoder-Only

After the Transformer opened the theoretical window for large language models, the field developed along three routes: the Encoder-Only route, represented by Google's BERT and ELECTRA; the Encoder-Decoder route, represented by Google's T5 and Facebook's BART; and the Decoder-Only route, represented by OpenAI's GPT.

Of the three, the Encoder-Only route suits comprehension tasks but struggles with generative ones and does not scale or adapt well; although Google's BERT was once dominant in some sub-fields, it has now been all but abandoned by the mainstream. The Encoder-Decoder route suits specific scenarios, but its generality and scalability are limited. The Decoder-Only route, by contrast, is naturally suited to generative tasks, generalizes well across many kinds of tasks, and scales well in engineering terms, making it ideal for growing model size.
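What makes a model "Decoder-Only" is essentially the causal mask: each position may attend only to positions before it, which is exactly the constraint autoregressive generation needs. A minimal illustrative sketch (NumPy, not any particular model's code):

```python
import numpy as np

def causal_self_attention(x):
    """Self-attention where token t can only see tokens <= t (decoder-only style)."""
    T, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)   # True above the diagonal = "future"
    scores = np.where(mask, -np.inf, scores)           # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x

x = np.random.default_rng(1).normal(size=(4, 8))
print(causal_self_attention(x).shape)   # (4, 8); row t ignores tokens after t
```

Because the same next-token interface covers translation, summarization, question answering, coding, and more, this one mechanism buys both the generality and the scalability described above.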

[Figure: the evolutionary tree of large language models, showing the Encoder-Only, Encoder-Decoder, and Decoder-Only branches]

Given these characteristics, if general artificial intelligence is the destination, the Decoder-Only route is clearly the best choice. The evolutionary tree of large language models above shows that the Decoder-Only route chosen for GPT has clearly led the development and flourishing of large language models.


Technical Route 5: From Reinforcement Learning to Alignment

Having navigated the key technical forks above (unsupervised learning, generative models, natural language, and the decoder), the GPT models were clearly on the road to AGI. But GPT's power also raised new worries: could it endanger humanity, erode human values, abet wrongdoing, disrupt the social order, or even threaten human survival?

Any of these would gravely violate OpenAI's vision and mission. How do you align a powerful GPT model with human values and social norms? How does it remain useful to humanity once it is strong? Technical problems must be solved with technology, and here OpenAI's long accumulation of reinforcement-learning expertise from training Dota agents came in handy. By adding Reinforcement Learning from Human Feedback (RLHF) after pre-training, the model is taught to be a "good AI" that benefits humans, with guardrails against misuse. On this front OpenAI has thought far ahead and invested heavily, living up to its vision and mission.
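The heart of RLHF is a reward model trained on human preference comparisons, which is then used to fine-tune the policy (typically with a PPO-style update penalized by KL divergence from the pretrained model). Below is a minimal, illustrative PyTorch sketch of the reward-model step only, using a toy linear model and random tensors as stand-ins for response embeddings; it is a sketch of the general technique, not OpenAI's implementation.

```python
import torch
import torch.nn.functional as F

# Toy reward model: maps a (hypothetical) response embedding to a scalar reward.
reward_model = torch.nn.Linear(16, 1)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Stand-ins for embeddings of a human-preferred ("chosen") and a less-preferred
# ("rejected") response to the same batch of prompts.
chosen = torch.randn(8, 16)
rejected = torch.randn(8, 16)

# Pairwise preference loss: push r(chosen) above r(rejected).
r_chosen = reward_model(chosen)
r_rejected = reward_model(rejected)
loss = -F.logsigmoid(r_chosen - r_rejected).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Once trained, the reward model's score replaces the hand-written reward of a game like Dota, and the same reinforcement-learning machinery steers the language model toward outputs humans prefer.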


Engineering Wisdom: Scaling Laws

Looking back at OpenAI's series of technical choices, we find that almost all of them were made around one principle: does this help general artificial intelligence scale? They had nothing to do with whether the technology could be realized quickly at the time, whether it was mainstream, whether it was easy to use, or whether it produced immediate results.

Anyone who has done technical architecture or business strategy knows that "scales fast and easily" is the iron law of a good architecture or business model. The same iron law applies to developing general artificial intelligence, and the OpenAI team understood this clearly. In 2020 they published the well-known paper "Scaling Laws for Neural Language Models," summarizing the scaling relationships among model parameters, training-set size, compute budget (FLOPs, floating-point operations), and network architecture.
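The paper's headline result is that, when nothing else is the bottleneck, test loss falls as a smooth power law in each resource. Schematically, with N parameters, D training tokens, C compute, and fitted constants (the reported exponents are small, roughly in the 0.05-0.1 range):

```latex
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
```

The practical implication is that performance becomes predictable from scale alone, which is what turns "bet on scale" from a gamble into an engineering strategy.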

In fact, beyond the scaling law of the model itself, OpenAI has deep insight into, and has made wise choices about, the various forces of scale on the road to AGI.


Product Wisdom: From Super App to Ecosystem Platform

From launching GPT-1 in 2018 to GPT-3 in 2020, OpenAI had already won a trump card in the large language model, but how to play that card mattered just as much. History is full of players who held a great technological hand and played it badly. Given OpenAI's grand vision of general artificial intelligence, it could hardly avoid becoming a platform company. Yet most companies in tech history that set out to be platforms from day one died on that battlefield; the successful platform companies mostly started by building a "super app."

OpenAI's other soul, CEO Sam Altman, tempered by his time as president of YC, Silicon Valley's top incubator, is of course a master of product strategy. OpenAI chose to start with the super app ChatGPT, which in just a few months accumulated hundreds of millions of users, a vast amount of interaction data, and powerful brand appeal; only on that foundation could the subsequent platform moves, the ChatGPT API, Plugins, and so on, be rolled out with such confidence. Judging from the news coming through various channels, OpenAI still has big product moves ahead; let us wait and see.
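To make the platform move concrete: as of mid-2023 the ChatGPT API exposed the same model behind the consumer app through a few lines of code, roughly as below with the openai Python package of that era (the key and prompt are placeholders):

```python
import openai

openai.api_key = "sk-..."  # placeholder; use your own API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize why OpenAI bet on decoder-only models."}],
)
print(response["choices"][0]["message"]["content"])
```

Wrapping the super app's capability in an API like this is precisely what lets third-party developers, and later Plugins, build on top of it.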

Incidentally, in my earlier article "Product Layout and Paradigm in the AGI Era," I discussed my thinking on product innovation in the AGI era in more depth.


Equity Design: The Capped-Profit Company

OpenAI was founded as a non-profit, with its initial funds raised through donations. But the founding team clearly underestimated the hardware and talent investment required to develop AGI, and overestimated how much of the pledged donations would actually arrive (many early pledges were never fulfilled). So in March 2019 OpenAI redesigned its governance structure into a "capped-profit" company and accepted a US$1 billion investment from Microsoft.

"Restricted Profit" stipulates that shareholders who invest in OpenAI will receive a maximum of 100 times the amount of investment from OpenAI in the future. The excess will be controlled by the non-profit organization OpenAI Nonprofit.

This ingenious design both attracts the investment OpenAI needs and prevents an overwhelmingly powerful AGI from being captured for outsized profit, balancing the commercial support required to build AGI against the grand vision of AGI benefiting all of humanity. Looking back a few years from now, I think this equity design will be seen as a great invention in business history. Founder and CEO Sam Altman takes no equity and seeks no commercial return, and his devotion to pursuing AGI to change the world is likewise admirable.


Strategic Design: Vertical and Horizontal Alliances

If OpenAI is a small dinosaur of the AI era, then Google and Microsoft, trillion-dollar companies that have long invested heavily in AI, are clearly its two big dinosaurs. A "spoiler" like OpenAI would get badly burned if either of them turned on it. Evidently OpenAI had anticipated the Warring States-style contest that hit apps like ChatGPT would trigger, and had an exquisite strategic design ready.

First, through its strategic partnership with the big dinosaur Microsoft, OpenAI not only secured tens of billions of dollars in precious development funding, but also used GPT to power Microsoft's Bing search and thereby check the other big dinosaur, Google. Along the way, GPT earns a reasonable return in the B-end markets OpenAI cannot yet attend to itself (Azure cloud services, Office 365, and so on), while OpenAI concentrates on the C-end market as its entry point for building the ecosystem platform of the AGI era.

This deft exploitation of the giants' "innovator's dilemma" lets a start-up of barely 300 people, valued at under US$30 billion, simultaneously leverage two tech behemoths each worth trillions of dollars and employing nearly 200,000 people. Across the whole of business history, such a strategic layout is unprecedented in its sweep.


Team Structure: Academia + Engineering + Product + Business

Reading this far, many friends may ask: what is OpenAI's background, and what makes it so formidable? There is no secret other than this: the most valuable thing in a technology company is talent, and OpenAI has a founding team that can stand proud before the entire AI world.

In the No. 1 seat, CEO Sam Altman dropped out of Stanford at 20 to found Loopt, which he sold for $43 million in 2012. In 2014, Paul Graham, the founder of YC and godfather of Silicon Valley entrepreneurship, some twenty years his senior, persuaded him to take over as YC's president. Graham recognized Altman's extraordinary talent early on; in his eyes, Altman was a future Jobs of Silicon Valley. Altman's entrepreneurial and YC experience forged his first-rate instincts for product models, business strategy, and investment and financing.

In the No. 2 seat, chief scientist Ilya Sutskever is a close disciple of Geoffrey Hinton, the father of deep learning. He made his name in the ImageNet competition, then joined Google Brain, where he co-invented Seq2Seq, which dramatically improved machine translation, and contributed to TensorFlow and AlphaGo. He is a pioneering figure of academic deep learning.

President Greg Brockman joined the well-known payments company Stripe in its earliest days and served as its CTO; he brings strong engineering skills and zero-to-one team-building experience, and has long been OpenAI's engineering backbone. Add a constellation of stars such as Andrej Karpathy, John Schulman, and Lukasz Kaiser, and OpenAI's density of top AI talent is among the highest in the world. The team's structure also reflects OpenAI's view of the AGI endeavor: academia, engineering, product, and business, four pillars, none dispensable, each very strong.

Beyond their focus on AGI, OpenAI and Sam Altman have also invested in companies in nuclear fusion, quantum computing, cryptocurrency, and more, laying out positions at scale around future shifts in energy, computing power, and wealth distribution. Every one of these bets points toward the AGI future.

To sum up, OpenAI has played its hand well, at every key technical fork and in product, engineering, equity, strategy, and team alike. It is a company worth studying and watching, and a window through which we can glimpse the age of AGI.

About the Author

Li Jianzhong is the founder and chief technology expert of Boolan and chairman of the Global Machine Learning Technology Conference. He has extensive experience and deep research in artificial intelligence, product innovation, and business models. In recent years his research and consulting on artificial intelligence methods based on large language models have drawn strong attention from the industry. From 2005 to 2010 he was a Microsoft Most Valuable Professional and regional technical director. With nearly 20 years of experience in technology and products, he provides high-end product innovation and technology strategy consulting for well-known brands, including many Fortune 500 companies.

Note: This article is reproduced with authorization from the WeChat public account "Li Jianzhong Yansi". For reprints, please contact the original account for authorization.


[Event Sharing] The Global Machine Learning Technology Conference (ML-Summit) will be held October 20-21, 2023 at the Westin Jinmao Hotel in Beijing. Under the slogan "Embracing the Era of the AGI Revolution" and with a focus on engineering practice, the conference covers eight themes: evolution of cutting-edge large-model technology, large-model systems engineering practice, large-model application development practice, AIGC and machine vision, AIGC industry applications and practice, AIGC-enabled software engineering transformation, ML/LLM Ops for large-model operations, and AI Infra for large-model infrastructure. For details, see the official website: http://ml-summit.org/ (or click the original link).

