Alibaba's Ma Tao: Redefining the Open Source Operating System in the Cloud Era | Character


Author | Just

Produced by | CSDN (ID: CSDNnews)

As cloud computing has developed, the Linux platform has matured, and its ecosystem has steadily improved, more and more enterprises and cloud service providers have adopted Linux as the preferred operating system for their data centers.

However, because the operating system is the foundation of cloud infrastructure, demand for customizing and optimizing it for cloud products and environments has grown substantially. In response, the Alibaba operating system team developed Alibaba Cloud Linux to connect Alibaba Cloud's products with customer applications and make better use of the cloud's capabilities.

 

During the recent "Changsha 1024 Programmer's Festival", the CSDN "Character" column interviewed Ma Tao, a researcher at Alibaba Cloud and head of the operating system team in the Basic Software Department. He discussed the impact of 5G, cloud computing, and other technologies, and the challenges and opportunities facing cloud operating systems.

 

Ma Tao is also one of the founders of Alibaba Group's kernel team. He has worked on Linux and operating system kernel research and development at Oracle and Alibaba, and has contributed more than 300 commits to the mainline Linux kernel. He has also spent many years doing in-depth research on distributed file systems and distributed storage.

 

The following is the interview, edited by CSDN:

 

CSDN: With the development of new technologies such as 5G and cloud computing, new hardware architectures are emerging one after another. What challenges does this pose to the underlying operating system? How is Alibaba's open source cloud operating system adjusting technically to adapt to this change?

 

Ma Tao: Indeed, cloud computing and Internet scenarios have greatly changed what is required of operating systems compared with the traditional era, and they bring new challenges in both stability and performance.

 

First of all, cloud computing is a business built on economies of scale, so the fixed cost of cloud infrastructure is huge. Once that infrastructure is in place, the marginal cost of further expansion is relatively small. Therefore, as a cloud service provider grows, its average cost declines rapidly.
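
The cost argument above can be made concrete with a small sketch. It assumes a simplified model that is not from the interview: a single fixed cost spread over all deployed units plus a constant marginal cost per unit. The function name and all figures are hypothetical and chosen only for illustration.

```python
def average_cost(fixed_cost: float, marginal_cost: float, units: int) -> float:
    """Average cost per unit when one fixed cost is spread over `units`.

    Assumes a constant marginal cost per unit (a simplification).
    """
    if units <= 0:
        raise ValueError("units must be positive")
    return fixed_cost / units + marginal_cost


# Hypothetical figures, for illustration only.
FIXED = 1_000_000.0   # one-time infrastructure cost
MARGINAL = 10.0       # cost of each additional unit

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9} units -> average cost {average_cost(FIXED, MARGINAL, n):.2f}")
```

Under this model the average cost approaches the marginal cost as scale grows, which is why each additional optimization matters more at large scale.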

 

At this scale of deployment, stability is critical. Combined with the complexity of cloud workloads and high system load, this places higher demands on the stability of the operating system. At Alibaba Cloud we have millions of deployments, so the stability challenge is enormous.

 

In addition, as the scale of cloud computing expands, every bit of optimization and performance improvement brings considerable economic benefit. However, it is difficult to achieve significant performance gains by optimizing a single layer of the cloud architecture. Doing so requires full-stack collaboration from the hardware platform through the system software to the application layer.

 

The operating system sits right at the most critical point in the middle. Facing down, it abstracts physical IT resources; facing up, it uses those abstracted resources to support applications. Therefore, optimizing the cloud operating system to provide full-stack integration and optimization has become both a new opportunity for cloud operating systems and a core competitive strength of cloud platforms.

 

CSDN: Giants like Huawei have also entered the field and proposed new operating systems. How do Alibaba's adjustments to open source operating systems differ from those of other giants?

 

Ma Tao: Here I would mention what I see as the second challenge for operating systems in the cloud era: the rapid introduction of heterogeneous platforms and features.

 

Growing demand for cloud computing power, driven by intelligent workloads and the transformation of the IT industry, has accelerated the evolution of multiple CPU architectures. I don't think any cloud will run on a single computing architecture in the future. This requires the operating system to develop an open hardware ecosystem and fully support multiple architectures. At the same time, each hardware platform is iterating and gaining capabilities faster and faster, while some traditional operating system distributions are slow to release new versions. As a result, they cannot bring the capabilities of new hardware platforms to the rich set of applications on the cloud in time.

 

You often find that a hardware feature has existed for one to two years or even longer, yet the operating system releases that integrate it are still long overdue. This prevents applications from taking advantage of the hardware's benefits in time, wastes cloud platform resources, and greatly slows the pace of new hardware iteration. Therefore, we hope that through the cloud platform's operating system, customers can enjoy the technical dividends of new hardware more quickly.

 

From this point of view, we and hardware manufacturers such as Intel have different requirements for operating systems. They care about their own hardware ecosystem, such as the CPU, while we need to consider how to meet the needs of cloud customers better and faster.

CSDN: What is your next focus for the open source operating system?

 

Ma Tao: In addition to the stability, performance, and heterogeneity mentioned above, cloud native is also a focus of our efforts. Cloud native accelerates the decoupling of applications from infrastructure and fully unlocks the elasticity of the cloud. It lets developers concentrate on their own business applications, while the complexity of the infrastructure sinks to layers the user never sees. At the same time, the service boundary that cloud-native infrastructure exposes to applications is moving up, and its system software stack keeps moving down, which poses many new requirements and challenges for the cloud-native system layer.

 

First, resource granularity will be finer. Compared with workloads in the traditional virtual machine era, cloud-native applications need different kinds of resources: from traditional machine-level virtualization (virtual machines), to OS-level virtualization (containers), and then to runtime-level virtualization (FaaS, such as function computing). Applications demand a higher level of resource abstraction and finer granularity.

 

Second, scaling needs to be faster. Cloud-native services naturally come with a need for rapid elasticity and short-lived execution. Cloud infrastructure must be able to provide resources within a short time and withstand the pressure of resources being created and destroyed frequently.

 

Third, the performance requirements are higher. In cloud-native scenarios, users focus more on application logic, more of the system technology stack sinks into the infrastructure, and the system service boundary moves upward. This gives the underlying system more room to make a difference, because some of the original layer boundaries can be broken down, merged, and optimized vertically.

 

Fourth, security isolation must be stronger. As cloud native rapidly evolves toward serverless computing architectures, applications and infrastructure become completely decoupled, users only need to care about their applications, and the cloud-native system keeps moving downward. Containers, as a supporting technology for cloud native, differ from the earlier virtual machine computing architecture. In a serverless computing architecture, multi-tenant security on the cloud has become a new challenge.

 

In response to the new requirements and challenges of these cloud-native systems, our operating system must also undergo corresponding changes and make substantial innovations to meet the rapidly growing demand for cloud-native applications.

 

CSDN: The ecosystem is also a very important part of building an open source operating system. How is Alibaba building the corresponding ecosystem and community?

 

Ma Tao: The operating system is a crucial middle layer. It connects to the business above and the hardware below. Therefore, we hope to build an open operating system ecosystem that supports multiple architectures around the Alibaba Cloud Linux open source operating system.

We hope to work with IHV partners to bring new hardware platform capabilities to applications, fully exploit platform performance, and promote the rapid evolution and iteration of hardware platforms. We hope to work with OSVs to provide higher-quality operating system services suited to cloud computing. We hope to work with ISVs to help applications run better on the cloud, improve end-to-end performance, and promote business agility. And we hope to work with developers and users to create an open source operating system better suited to cloud computing, building a cloud-native system foundation that advances the development of cloud native.

 

CSDN: Drawing on Alibaba's experience building open source operating systems, what pitfalls has Alibaba run into when developing open source basic software?

 

Ma Tao: Open source basic software faces two main problems, one of which is building an ecosystem. The kernel team in Alibaba Group's Taobao division began open source work on the operating system kernel in 2010. Although our work achieved a great deal at home and abroad, other vendors were never able to participate deeply, so its development remained lukewarm. Having learned from that painful experience, we believe we must get the ecosystem right to ensure the healthy development of a basic software community.

 

CSDN: You have also worked at Oracle. Comparing open source operating systems in China and abroad, what are the strengths and weaknesses of domestic operating system development today?

 

Ma Tao: I worked at Oracle for more than four years on operating system research and development. By comparison, the domestic talent pool in operating systems is relatively thin, and we are still at a disadvantage in both the number of practitioners and their professional skills. Of course, I am also very pleased to see that many domestic companies, including Alibaba and Tencent, have increased their investment in operating systems in recent years, and we are catching up.

 

CSDN: What suggestions do you have for domestic companies working on open source basic software?

 

Ma Tao: Basic software is relatively hard to get started with, so attracting more developers to participate is very important. Without a large number of developers, we cannot draw on collective wisdom, and there is no way to achieve stable long-term development. The Linux kernel attracts new developers through some very simple entry-level fixes, and domestic open source basic software projects would do well to learn from this.

 

CSDN: From a developer's perspective, the operating system is one of the three major categories of basic software, and getting into it requires solid fundamentals. Can you share some experience about how developers can grow technically on the road to operating systems?

 

Ma Tao: This is a very interesting topic. I got my start in operating systems at Oracle, where I mainly contributed OCFS2 (Oracle Cluster File System V2) code to the Linux kernel. My first kernel patch was in 2007. I had discovered that a variable in one function of the kernel code was unused, so I wrote a patch that removed it and sent it to the kernel community, and it was soon accepted. This experience made me realize that the Linux community is very friendly to newcomers and that I, too, was capable of doing something for the operating system. It is also part of why I have kept working on operating systems for more than ten years.

 

Interestingly, because it was my first time writing a patch, I also mistyped my name and wrote "Tao Mao" in the committer field. So if you want to get into operating system research and development, the first step is not that difficult. As long as you have the will, you can start with small things and eventually make something of yourself.

 

After a good start, the second very important thing is interest. I have always felt that interest is the foundation that drives us to keep learning and improving. I remember very clearly that after I got into Linux kernel development, I spent basically every Chinese New Year writing code, because during that period I had relatively more free time and could also communicate well with colleagues working abroad.

 

In retrospect, the sense of participation and the joy that come from code have been a major driving force behind my continued work on operating system research and development.

 

CSDN: What do you think of the open source boom in "domestic" basic software in recent years? What challenges and opportunities do domestic developers face?

 

Ma Tao: With the development of cloud computing, IT infrastructure is undergoing a new transformation. In this wave of change, whether in public cloud or hybrid cloud, emerging cloud computing vendors represented by Alibaba have performed well in fierce global competition. Cloud computing has brought new business scenarios and new requirements for the underlying hardware, and cloud native means a revolution in programming models. These are new opportunities and challenges for our basic software. Today we are competing on the same stage as the top international vendors and striving to define the basic software suited to the next generation of cloud computing.

 

CSDN: To promote the development of China's open source operating systems and open source ecosystem, what consensus and actions would you most like to advance together with domestic developers? If you were to cooperate with a developer community like CSDN, what would you hope both parties could do for developers?

 

Ma Tao: The operating system is one of the core pieces of basic software and a core competitive strength. In the past, because domestic low-level hardware technology long lagged behind the top international level, our operating system research and development also fell somewhat behind. However, with the vigorous growth of cloud computing in recent years, we are in an era of rapid change. What a cloud operating system should look like is actually a very new question, and a huge opportunity for us.

 

I look forward to working with domestic developers to define operating system standards for the cloud era. I also hope to work with CSDN to train operating system developers for the cloud era and, through the cloud-era operating system, to influence the development of the upper-layer application software ecosystem.



Origin blog.csdn.net/csdnnews/article/details/109324231