Dialogue with Jia Yangqing, Vice President of Alibaba: Pursuing large models is not a bad thing

Over its sixty-odd years of development, AI has been through repeated ups and downs. Having faced the early quip that "machine learning solves 80% of the problems 80% of the time, but you never know which 80% of the problems are solved 80% of the time," AI has now made great progress in standardization and accessibility. In New Programmer 004, we invited Jia Yangqing, Vice President of Alibaba, to share his thinking and methods, drawn from his own experience bringing algorithms and applications to large-scale use, on AI frameworks, models, algorithms and computing power, engineering, open source, and the qualities developers need.

Dialogue | Tang Xiaoyin Author | Tu Min

Interviewee | Jia Yangqing

Produced | " New Programmer " editorial department

The artificial intelligence revolution has not only shaped the intellectual currents of the new digital era, but also touched the technical lives of countless programmers, including Jia Yangqing, currently Vice President of Alibaba, senior researcher in the Alibaba Cloud Computing Platform Division, and head of the AI platform at the DAMO Academy.

Jia Yangqing does not consider himself especially high in IQ; he simply follows his interests. Nor does he consider himself especially diligent, but the work that needs doing always gets done. He no longer cares to dwell on past achievements: a gentleman attends to the work at hand, and what is past is prologue. What he wants is to "waste" his life on interesting problems, such as how to bring large AI models into real-world use and how to engineer AI.

Search the Internet and Jia Yangqing is everywhere. For the media, though, the defining date was March 19, 2019: on that day, the author of the mainstream deep learning framework Caffe, a co-founder of TensorFlow, a co-lead of PyTorch 1.0, and a founder of ONNX (Open Neural Network Exchange) added a new title, Vice President of Alibaba Technology. In just two years, Jia Yangqing delivered another piece of innovative work after Caffe and TensorFlow: leading the team to build Alibaba Lingjie, a big data + AI product system built around the 4S standards, using open source to push forward AI engineering and help AI become a weapon for the digital upgrading of industry.

Today, " New Programmer " had an in-depth dialogue with Jia Yangqing, and what we are deeply familiar with is the technical person who has gone through thousands of sails, but still insists on technology, and his original intention has never changed. In the new era of technology, he hopes that the AI ​​algorithms coming out of the laboratory can be scattered like stars, entering and illuminating all walks of life.

1. From automation to AI, Jia Yangqing's persistence and insights

As is widely known, Jia Yangqing did not come from a pure computer science background. In 2015, Programmer (the predecessor of New Programmer) first met him while he was immersed in AI at Google. Recalling how he came to deep learning, this Tsinghua graduate, who studied automation for both his bachelor's and master's degrees, once joked that automation mainly does two things: one is stoking boilers, the other is running elevators.

So after finishing his master's degree, he decided to go to the University of California, Berkeley to study computer science, which he liked more, and earned a Ph.D. there.

During his Ph.D., inspired by Alex Krizhevsky's success on ImageNet, he built Caffe, a fairly complete deep learning framework, while researching a psychology-flavored topic: how humans form concepts such as categories as they grow up. His reason for building it was just as dryly humorous as you would expect from an engineer: "I wrote Caffe because I didn't want to write my dissertation."

From that starting point, Jia Yangqing has gone ever deeper into artificial intelligence built on deep learning, neural networks, and large-scale training, and has formed his own views and ways of thinking about the field.

Jia Yangqing 

" New Programmer ": In the past two years, many practitioners said that artificial intelligence has not made major breakthroughs in theoretical research, and many artificial intelligence scientists chose to return to academia after entering the industry from scientific research. Under this phenomenon, will artificial intelligence enter the next cold winter again?

Jia Yangqing: I'm not sure "cold winter" is the right word. Today's AI feels more like temperatures returning to normal after a scorching summer. Whether in the Internet industry or in traditional industries, it is now routine to use more intelligent algorithms to solve long-standing problems. Once temperatures return to normal, people can truly calm down and do solid work. AI technology always moves from theoretical breakthroughs to large-scale deployment, and at present theory and practice are walking on two legs. The "ups and downs" of AI are largely a conclusion drawn by onlookers outside the field: seeing that many once fast-growing companies are now slowing down, they mistakenly conclude that AI's cold winter has arrived. In reality, that has little to do with the AI industry itself.

"New Programmer": Now many colleges and even primary and secondary schools have opened artificial intelligence majors or courses. What do you think of this phenomenon?

Jia Yangqing: I think it is a good thing for students to be exposed to cutting-edge technology early. Recall that Carnegie Mellon University set up its Machine Learning Department over a decade ago. That said, the trend may have some short-term, less visible downsides, such as forced ripening and the possibility that many people will fish in troubled waters. But the market economy has its own self-regulating mechanism that gradually filters out what is useless, so there is no need to worry too much about these drawbacks.

2. Under the trend of large models: Alibaba's AI engineering practice

" New Programmer ": The development of large models with larger computing power and larger data sets seems to have become a trend. Now many practitioners do not pay special attention to the design details of the model. How can we balance the relationship between the fineness and scale of the AI ​​model?

Jia Yangqing: I think this phenomenon is very typical, and it appeared in computer vision as well. Take AlexNet, the convolutional neural network that triumphed in the 2012 ImageNet Large Scale Visual Recognition Challenge: the model has roughly 60 million parameters. Its success gave many AI practitioners a rather simple idea, namely that the larger and deeper the model, or the more parameters it has, the better the results.

But by 2014, GoogLeNet, a deep neural network built on the Inception module, could achieve the same or even better results with only about 6 million parameters. In today's super-large-model space, many people chasing headline results likewise equate a larger parameter count with a better model. Over time, as people grow fatigued with sheer scale, they will find that details such as model structure and model interpretability matter more.
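For readers who want to see that contrast concretely, the following is a minimal Python sketch, assuming PyTorch and torchvision are installed, that counts the trainable parameters of the two architectures mentioned above. Exact numbers vary slightly across torchvision versions and output-class settings.

    # Minimal sketch: compare parameter counts of AlexNet (2012) and GoogLeNet (2014).
    # Assumption: torch and torchvision are installed; counts vary slightly by version.
    import torchvision.models as models

    def count_params(model):
        # Total number of trainable parameters in the model.
        return sum(p.numel() for p in model.parameters() if p.requires_grad)

    alexnet = models.alexnet(weights=None)                          # roughly 60 million parameters
    googlenet = models.googlenet(weights=None, init_weights=True)   # roughly 6-7 million parameters

    print(f"AlexNet   : {count_params(alexnet):,}")
    print(f"GoogLeNet : {count_params(googlenet):,}")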

This is, however, a typical pattern of technological iteration in research: an explosive technology draws crowds, and when everyone realizes the direction has become too one-sided, things swing back. That does not mean pursuing large models is a bad thing, because the larger the model, the higher the demands on the underlying algorithms and systems, which in turn helps drive many innovations at the systems level.

"New Programmer": AI developers have always seen model conversion as a very troublesome thing. Do you think ONNX may become an industry standard to ease the work of model conversion?

Jia Yangqing: That is an interesting question. When we first designed ONNX, we hoped it could serve as a standard spanning the various frameworks and software and hardware vendors, so that developers could connect models across them more easily.

After several years of practice, we have found that models keep changing; BERT, GPT-3 and other models keep emerging, and these models tend to be tightly coupled to the framework they were designed in. Many large-scale distributed models, for example, are implemented on top of TensorFlow. So in the newest application areas, model conversion has not become a core issue, and there is no real need to convert TensorFlow models to other frameworks.

In essence, ONNX works well for existing, fairly standard models; in computer vision, for instance, it gives hardware vendors a good entry point into the field. In the value chain, I don't see ONNX as supporting the newest model transitions; rather, it makes models easier to spread once they have become standardized. So I think ONNX will exist for a long time and become an industry standard, but it will not cover the whole industry, because new research will always be relatively fragmented and does not need a standard.
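As one illustration of the "typical standard model" case where ONNX works well, here is a minimal sketch, assuming torch, torchvision, and onnxruntime are installed, of exporting a standard vision model to ONNX and running it in an ONNX-compatible runtime. The choice of model and file name is only for illustration.

    # Minimal sketch: export a standard vision model from PyTorch to ONNX,
    # then run it with onnxruntime. Assumes torch, torchvision, onnxruntime installed.
    import torch
    import torchvision.models as models
    import onnxruntime as ort

    model = models.resnet18(weights=None).eval()
    dummy_input = torch.randn(1, 3, 224, 224)   # example input that defines the exported graph

    torch.onnx.export(
        model,
        dummy_input,
        "resnet18.onnx",
        input_names=["input"],
        output_names=["output"],
        dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},  # allow variable batch size
    )

    # Any ONNX-compatible runtime or hardware stack can now run the exported file.
    session = ort.InferenceSession("resnet18.onnx")
    outputs = session.run(None, {"input": dummy_input.numpy()})
    print(outputs[0].shape)   # (1, 1000) class logits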

"New Programmer": At the 2021 Ali Yunqi Conference, Ali Lingjie was officially released. You also mentioned that the essence of AI engineering includes cloud-native, large-scale and standard-benefiting, which will give AI developers a chance to What kind of changes will it bring?

Jia Yangqing: Alibaba Lingjie is a platform that enterprises and developers can use out of the box. It integrates Alibaba's big data + AI capabilities and is built around the core 4S standards: Scale, Speed, Simplicity, and Scenario.

  • Scale: elastic scaling of big data, large models, and large applications. Thanks to cloud infrastructure, large-scale training is becoming easier. In a cloud-native way, with the help of Alibaba's MaxCompute platform, developers get nearly unlimited elasticity at zero startup cost, and computing power is no longer the bottleneck of AI development.

  • Speed: extreme efficiency in development, operation, and maintenance. "How do we use cloud native to improve developers' development and iteration efficiency?" was the original motivation for this standard. Today, as GPUs, CPUs and new chips keep advancing, users are no longer overly concerned with raw hardware efficiency; what they care about is escaping the old routine of manually installing and tuning software. Cloud-native container management, cloud-native runtime environments, and cloud-native scheduling make the whole path from writing code and models to final deployment and rollout faster.

  • Simplicity: standardized, as easy to use as calling a function. Unlike in the past, when AI developers had to be full-stack, AI development today has a clear division of labor among engineers. Simplicity means that, from the perspective of cloud and AI algorithm developers, the industry can already provide a set of standardized APIs, so that upper-layer application, data, and business engineers do not need to worry about AI implementation details and can build AI applications or products as easily as calling a function (see the sketch after this list).

  • Scenario: born for the scenario. AI capability without a concrete scenario is of little use. Scale, Speed, and Simplicity all ultimately serve landing in different scenarios. Once developers deeply understand the needs of different industries and scenarios, they can be much more targeted when iterating AI algorithms and deploying AI applications.
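To illustrate the "as easy as calling a function" idea behind Simplicity, here is a hypothetical sketch of an application engineer calling a hosted AI capability through a standardized HTTP API. The endpoint URL, parameters, and response fields are invented for illustration and are not the actual Lingjie or Alibaba Cloud interface.

    # Hypothetical sketch: consuming an AI capability as if it were a function call.
    # The endpoint, parameters, and response format below are invented for illustration;
    # they are NOT the real Lingjie / Alibaba Cloud API.
    import requests

    def detect_objects(image_path: str) -> list:
        # Send an image to a hosted detection service and return labeled objects.
        with open(image_path, "rb") as f:
            resp = requests.post(
                "https://example.com/api/v1/vision/detect",   # placeholder endpoint
                files={"image": f},
                headers={"Authorization": "Bearer <YOUR_API_KEY>"},
                timeout=30,
            )
        resp.raise_for_status()
        return resp.json()["objects"]   # e.g. [{"label": "cat", "score": 0.97}, ...]

    # The caller treats the whole AI capability as a single function:
    # for obj in detect_objects("photo.jpg"):
    #     print(obj["label"], obj["score"])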

"New Programmer": Based on the above, some capabilities can be encapsulated as APIs for developers to call directly. Under this trend, what capabilities should AI developers have in the future?

Jia Yangqing: Today's AI developers should jump out of the shackles of the AI framework and of traditional thinking, and not limit their goals to simple optimization and parameter tuning. We don't need to keep sculpting yet another convolutional neural network to squeeze out a little more accuracy; what we need to learn is how to abstract the computer vision problem a given scenario requires into a deep learning problem.

In the future, model innovation will remain a research direction, but the bigger opportunity lies one layer up, in data modeling and model modeling.

3. What does open source bring to AI?

" New Programmer ": Today, we have ushered in an extremely favorable era of open source. On GitHub, the growth rate of developers from China has become the fastest in the world. How does open source benefit the development of AI?

Jia Yangqing: The development of AI is inseparable from open source. The AI frameworks and algorithms we have seen in recent years are essentially all open source, and open source makes it very easy for everyone to reproduce code and algorithms.

From the open source perspective, the domestic open source scene has developed very rapidly, and the enthusiasm of domestic developers for open source is no lower than that of their peers abroad. But I think the biggest challenge for open source today lies in top-down mechanism design and mindset.

Today, when many companies talk about open source, they often stop at publishing the code for everyone to admire. From the perspective of the global open source community, however, the most important point of open source is getting more people to participate in the project. So once the code is open, how to go further and build an open source community and ecosystem in which more developers participate in development and iteration together is a path the domestic open source community must also take.

"New Programmer": In the new environment, what kind of open source strategy does Ali have, and how to encourage more people to participate in open source? In addition, in the process of open source application, many companies even set KPIs, what do you think?

Jia Yangqing: That is a good question. First, Alibaba has set up an internal open source committee to ensure that the open source needs of different projects are visible within the company. Open source is not simply publishing code; it also needs a reasonably standardized model covering the code's license, community operations, and community interaction. With the committee in place, people inside the company can participate in open source more smoothly.

As for how to design KPIs for open source, I think that is very difficult, because community building is driven by the interest and enthusiasm of participants. Rather than thinking about how to set reasonable KPIs for open source, it is better to think about how to help open source jump out of the KPI framework and let more people explore things more openly. To some extent this is a matter of company culture: not everything needs to be driven by KPIs. With open source, we need to give engineers time to make their own passion-driven judgments. Once open source is turned into a KPI and open source developers are bound by standardized metrics, that may be the beginning of open source being deformed as it is promoted.

"New Programmer": What is your opinion on the commercialization of open source?

Jia Yangqing: On open source and commercialization, opinions differ. Recently many practitioners have come to believe that building open source software automatically means raising a lot of money, but that is not the result of rational thinking.

Turn the clock back five years, and the common refrain was the opposite: open source software could not possibly have commercial prospects. That view existed mainly because commercialization had not yet materialized at the time.

Open source itself has nothing intrinsically to do with commercialization. What happens is that once many open source projects see strong demand from commercial applications, some companies begin providing enterprise-grade services on top of the open source, which is a natural process. Using open source software demands a certain level of technical skill from users, so when a user cannot, or does not want to, keep a team dedicated to that groundwork, a company can step in and solve the problem directly, offering services such as elastic, hands-off operation and maintenance, along with enterprise-grade technical support. In short, open source and commercialization can be weighed on two levels: first, how to deliver better services; second, how much an enterprise is willing to pay for enterprise-grade capabilities, so that the two reach a relatively balanced state.

For open source software itself, however, we still hope it keeps a fairly strong purity, so that the industry retains a space for the kind of technology sharing and technology iteration that open source makes possible.

4. Breaking the stereotype that top programmers end up with no code to write

" New Programmer ": From fighting on the front line of research and development to becoming a manager and entrepreneur, many people believe that in the process of practice, excellent programmers gradually have no time to write code and do scientific research. Do you think so?

Jia Yangqing: First of all, we must break the notion that a really good programmer must eventually stop writing code and go into management. The industry today needs many kinds of talent; someone may become an excellent architect or an excellent manager, and many outstanding system architects hold very senior positions. Here I would also recommend The Mythical Man-Month, which discusses different divisions of labor and why a team needs both good managers and good architects.

In fact, excellent programmers should follow their own hearts. To get one thing done, they may wear different "hats" at different times, sometimes dealing with people and sometimes dealing with code. Getting something done is never driven by a single skill. So what everyone needs to do is find the direction they are good at, while also being able to adapt, to a certain extent, to different occasions and needs, playing different roles such as management and system design.

"New Programmer": Is this also a trait that a good programmer should have?

Jia Yangqing: Yes. The scale of the system software and IT infrastructure used in industry today makes it impossible for one person to do everything alone. So even a very good architect needs to communicate with people, iterate together, and solve problems together; these are skills every one of us needs today.

"New Programmer": For the younger generation of developers and practitioners who are engaged in AI, any guidance and advice on employment direction?

Jia Yangqing: Because the AI field is so broad, the relatively successful general-purpose AI talents are often those who move forward with curiosity and can define problems from broader needs. Take large models as an example: in daily practice, many developers train single-point models in pre-training scenarios simply to improve the accuracy of one model. Instead of that approach, try setting aside single-model or single-framework optimization and redefining the problem. For example, in a pre-training scenario, train one model independently on a large amount of existing but unrelated data so that it works well across multiple application scenarios; that then evolves into an improvement in algorithms or systems.
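As one hedged illustration of this "pre-train once, serve many scenarios" idea (a sketch, not code from the interview, assuming PyTorch and torchvision are installed), a single frozen pretrained backbone can be shared by several lightweight task-specific heads:

    # Minimal sketch: one pretrained backbone reused across multiple downstream tasks.
    # Assumes torch and torchvision are installed; weights download on first use.
    import torch
    import torch.nn as nn
    import torchvision.models as models

    # Backbone pretrained on a large, generic dataset (ImageNet weights here).
    backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    backbone.fc = nn.Identity()          # expose 2048-d features instead of class logits
    for p in backbone.parameters():
        p.requires_grad = False          # freeze: each task trains only its own head

    # Hypothetical application scenarios, each with its own small head.
    heads = {
        "defect_detection": nn.Linear(2048, 2),     # binary classification
        "product_tagging":  nn.Linear(2048, 500),   # multi-label tagging
    }

    image = torch.randn(1, 3, 224, 224)             # placeholder input
    features = backbone(image)                       # shared representation
    for task, head in heads.items():
        print(task, head(features).shape)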

5. AI New Challenges and New Opportunities

" New Programmer ": Can different deep learning frameworks be unified in the future?

Jia Yangqing: I don't think so. If a new framework emerges today, what we need to ask is what problem it solves for the industry as a whole. TensorFlow's emergence solved the problem of building large-scale systems; when super-large-scale systems proved inconvenient to use, developers began looking for a more Pythonic, easier-to-iterate way to develop and iterate on algorithms, and PyTorch was born. There are already plenty of frameworks on the market. If you merely build a framework that is slightly different from TensorFlow and PyTorch, it has no value.

I don't think we need to worry too much about whether frameworks can be unified. Unification is not the goal. What matters more is being able to solve the problems we face today in upper-level AI applications and research.

"New Programmer": From the programmer's point of view, AI code completion has also become a very common function. Will there be a program that can learn from another program in the future to write programs?

Jia Yangqing: I think it is possible to learn algorithms from existing programs, or even to write new ones, but don't be overly optimistic. Existing products on the market such as GitHub Copilot use a great deal of compute to retrieve over a large body of code and then complete a function's implementation. That is still somewhat different from AI genuinely writing programs.

AI algorithms clearly exceed humans in computing power and memory, so for simple repetitive work AI can complement human ability and energy very well. The point is to free people from simple repetitive tasks and let us spend our thinking on the more intelligent parts. So I think AI has a great deal of potential in this area.

"New Programmer": Under the tide of the metaverse, what are the scenarios or applications that AI can promote the implementation of?

Jia Yangqing: There are honestly too many concepts attached to the metaverse today; many people, for example, think it is built out of blockchain or other technologies. As I understand it, the metaverse contains two relatively important ideas:

  • One is VR/AR, which breaks away from the traditional presentation of content such as text, video, and voice, and brings a new form of presentation.

  • The other is that, because content is presented differently, the tools for content creation will also change dramatically. As AI and the metaverse collide and merge, and because AI is itself a tool, many of its algorithms are gradually making the metaverse a reality.

On content creation, the biggest bottleneck in VR/AR development a few years ago was the lack of content on devices. Now, with the progress of AI, deep models such as AliceMind and M6 have arrived, making content creation easier and more efficient.

At the tool layer, when a user wearing a VR device turns their head or looks at a certain position, the device needs to quickly capture the change in eye position and render at the same time. Behind this sits an AI algorithm for pupil recognition or pupil tracking. As that algorithm becomes faster and more efficient, more of each frame can be spent on rendering rather than on recognition, which makes the environment we see in VR/AR more realistic.

In both of these directions, AI can be of great help.

" New Programmer ": Facing the future, what do you think will make you feel the most fulfilling scene?

Jia Yangqing: To sum it up in one sentence: "AI landing", that is, the algorithms coming out of the laboratory being applied ever more widely across all walks of life.


" New Programmer 001-004 " is fully listed, talks with world-class masters, and reports on the innovation and creation of China's IT industry! 
