From VMware to Alibaba's Shenlong: the 40-year evolution of virtualization technology

[CSDN Editor's Note] In recent years, more and more enterprises have moved their businesses to the cloud. Alibaba Cloud has launched Shenlong, a product that delivers the performance of a physical machine together with the experience of a virtual machine. What is the architecture of this server, and what makes it special? At the "CSDN Online Summit: Alibaba Cloud Core Technology Competitiveness", Zhang Xiantao, head of the Alibaba Cloud elastic computing team, who has more than ten years of experience in virtualization, gave an in-depth talk on Alibaba Cloud Intelligence's exploration of a new computing architecture, the Shenlong architecture, and on how Alibaba Cloud's computing architecture has evolved in practice. We hope it will inspire and benefit technical readers.

Copy the link below or click "Read Original" to watch Zhang Xiantao's talk for free:

https://edu.csdn.net/huiyiCourse/detail/1176

Author | Zhang Xiantao, head of Alibaba Cloud Elastic Computing

Editor-in-Chief | Xi Yan

Produced by | CSDN (ID: CSDNnews)

The following is a transcript of Zhang Xiantao's speech:

Hello everyone, my name is Zhang Xiantao, and I also go by Xu Qing. I joined Alibaba Cloud Intelligence in 2014 and currently lead the elastic computing team at Alibaba Cloud. Before that, I worked on virtualization at Intel. For more than ten years I have focused almost entirely on virtualization.

Today I want to share a new computing architecture that Alibaba Cloud Intelligence has been exploring over the past three or four years, the Shenlong computing architecture, along with how Alibaba Cloud's computing architecture has evolved in practice.

My talk is divided into four parts.

The first part covers the background of the Shenlong architecture: why Alibaba Cloud built Shenlong, and which business problems in the cloud it solves.

The second part covers the evolution of the Shenlong architecture: how the first, second, third, and fourth generations of Shenlong are alike and how they differ, which business demands each one addressed, and what core value each brings to users.

The third part covers the Shenlong architecture in practice: what value it brings to cloud computing customers. For example, as Mr. Shu Tong mentioned, combining Shenlong with containers can deliver better performance than a physical machine.

For a long time, engineers working on virtualization treated physical-machine performance as the optimization target. The research question was how to bring virtualized performance ever closer to the physical machine: 90%, 91%, 92%, 93%, 95%, 97%, 98%, at which point further progress may stall. It could only approach the physical machine, never exceed it. Today, a server built on the new Shenlong computing architecture, combined with Alibaba Cloud containers, can deliver performance tens of percent higher than a physical machine, which is a very large gain.

Finally, I will briefly discuss the future of the Shenlong architecture.

 

The background of the Shenlong architecture

Before introducing the background of the Shenlong architecture, let me share a few figures from a very large project. I believe many of you online have taken part in this project, which is worth hundreds of billions of yuan and comes around once a year: Tmall Double Eleven.

In Tmall Double Eleven 2019 we set many new records. Transaction volume for the day reached 268.4 billion yuan, a large increase over the previous year. The second figure is 544,000 transactions per second, the peak number of transactions processed per second when shoppers checked out their carts at midnight. The third, 1.292 billion parcels, means almost one parcel per person.

Behind these numbers, everything ran on the Alibaba Cloud public cloud. Last year we reached a milestone: the entire business of the Alibaba economy was moved to the public cloud, and 100% of the core transaction system ran on Alibaba Cloud.

Three or four years ago, before the Shenlong architecture I am talking about today, this could not have been done. Moving Double Eleven to the cloud is a huge challenge for a public cloud platform. Offline, one can simply pile up physical machines. Using public cloud infrastructure to carry the Alibaba economy's e-commerce, finance, and logistics transactions, at Double Eleven volumes, is a much greater challenge.

Today we have done it, and behind it is the Shenlong server at work.

 

Let's take a quick look at this picture. The photo on the left is from the Yunqi Conference in October 2017, where we released the first generation of the Shenlong server. Since then, the Shenlong architecture has been one of the core competitive strengths of Alibaba Cloud Intelligence.

Why can it carry a business as complex as Alibaba's Double Eleven?

First, Shenlong can make full use of the cloud infrastructure, which is the high elasticity shown here. A single offline server, or even a cluster, can hardly draw on the capabilities of cloud infrastructure, whereas a Shenlong cloud server is integrated with that infrastructure and provides highly elastic resources. If you need more storage, more network resources, or even more databases, you do not need to stop the machine or plug in a hard disk. The system scales resources automatically in response to your control commands.

The second is high stability, especially of performance output. This may not matter much for laptops and PCs, but for servers used by enterprise customers, performance output needs to be a stable value.

Why? Those who have done business planning or capacity planning will understand: if computing performance fluctuates, it is hard to plan how many servers a business needs. For an online live stream, for example, how many servers are needed for 8,000 viewers, and how many for 10,000? These questions require precise planning, so performance must be very stable. The Shenlong server meets this requirement fully, and its performance during Double Eleven was very stable.
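To make the capacity-planning point concrete, here is a minimal sketch. The per-server capacity and the size of the performance dip are hypothetical numbers chosen for illustration, not figures from the talk. The point is only that when per-server throughput fluctuates, planning has to use the worst case, which inflates the server count.

```python
import math

def servers_needed(concurrent_users: int, users_per_server: float) -> int:
    """Smallest whole number of servers that can carry the given load."""
    return math.ceil(concurrent_users / users_per_server)

# Hypothetical figures for illustration only (not from the talk).
NOMINAL_USERS_PER_SERVER = 500   # capacity when performance is steady
WORST_CASE_USERS_PER_SERVER = 350  # capacity if performance can dip by 30%

for users in (8_000, 10_000):
    stable = servers_needed(users, NOMINAL_USERS_PER_SERVER)
    fluctuating = servers_needed(users, WORST_CASE_USERS_PER_SERVER)
    print(f"{users} viewers: {stable} servers if stable, "
          f"{fluctuating} if planning for dips")
```

With stable performance the plan can be sized against the nominal figure. With fluctuating performance the extra servers are bought purely to absorb the uncertainty.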

The third is high performance, which is essential. If performance is not high enough, if it cannot match a physical machine or even a virtual machine, then building such a server and such a new computing architecture is pointless.

These are some of the considerations behind the Shenlong server. As the host said, I have worked on virtualization technology for more than ten years, and whenever Shenlong comes up, virtualization technology has to come up too.

History of virtualization technology

 

I started working on virtualization in 2004, while I was still studying for my PhD. At that time, few people anywhere were researching virtualization technology, perhaps fewer than one hundred in the whole world. I wondered whether I would be able to find a job after graduation, because the range of jobs in this field was so narrow.

At that point, virtualization was mostly being researched at universities such as Stanford and Cambridge, at companies such as Microsoft and VMware, and at my former employer, Intel.

Looking back, the history of virtualization starts much earlier than 2004, but the work was always done in research institutions and large IT companies.

The first milestone is 1974, when the earliest theory was established. That year's paper "Formal Requirements for Virtualizable Third Generation Architectures" laid the theoretical foundation for the rapid evolution of virtualization technology over the following 40 years: what kind of technology can be called virtualization, and what conditions must be met to satisfy the definition.

The second milestone is 1997. A professor at Stanford University founded VMware, and the founding of this company truly put the theoretical research on virtualization into practice.

VMware introduced an innovation called binary translation.

Why do this?

It has to be said that the Intel x86 architecture was not friendly to virtualization. x86 grew step by step, from 4, 8, and 16 bits to 32 and 64 bits. That path is quite different from the mainframe's. The mainframe was designed from the start to run multiple operating systems, so it was built for virtualization from the instruction set up to the device level.

x86 evolved gradually, starting from what was at first a very small PC market. PCs did not need virtualization, so the x86 instruction set was not designed with it in mind. Binary translation works around this to a degree: while instructions execute, it dynamically scans for those that behave badly under virtualization and replaces them one by one. The drawback of this approach is that performance is not very good.
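The idea behind binary translation can be illustrated with a toy sketch. This is not VMware's implementation. The instruction names and the `translate` and `run` helpers are invented for illustration and do not correspond to real x86 instructions. A translator scans a block of guest instructions before it runs, passes harmless instructions through unchanged, and rewrites instructions that would misbehave under virtualization into calls to an emulation routine.

```python
from typing import Dict, List

# Hypothetical toy instruction set; names are illustrative, not real x86.
SENSITIVE = {"READ_FLAGS", "LOAD_PAGE_TABLE"}

def translate(block: List[str]) -> List[str]:
    """Rewrite sensitive instructions into emulation calls; keep the rest."""
    return [f"EMULATE:{ins}" if ins in SENSITIVE else ins for ins in block]

def run(block: List[str], vcpu: Dict[str, int]) -> List[str]:
    """Execute a translated block against virtual CPU state, returning a trace."""
    trace = []
    for ins in block:
        if ins.startswith("EMULATE:"):
            # The monitor services the instruction from virtual state,
            # so the guest never touches the real hardware state.
            name = ins.split(":", 1)[1]
            trace.append(f"emulated {name} (virtual flags={vcpu['flags']})")
        else:
            trace.append(f"native {ins}")
    return trace

guest_block = ["ADD", "READ_FLAGS", "MUL", "LOAD_PAGE_TABLE"]
translated = translate(guest_block)
for line in run(translated, {"flags": 0}):
    print(line)
```

Every block has to be scanned, and every rewritten instruction takes a detour through emulation, which is consistent with the speaker's remark that this approach costs performance.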

At this stage, VMware mainly ran virtualization on PCs, which is not the cloud data center virtualization we talk about today. Virtualization for cloud data centers became practical around 2005. In those years Intel released its VT-x and VT-d technologies, and AMD released comparable extensions. Both chip companies recognized that the x86 architecture was unfriendly to virtualization, so they extended the instruction set and the CPU design to support it better.

This hardware support accelerated the move of virtualization into data centers, where it could serve cloud computing.

In 2009, Alibaba Cloud was established. Cloud computing was unlikely to be built on commercial software such as VMware, so we adopted Xen, the popular open source virtualization software at the time. Later, in 2014, we switched to KVM. Alibaba Cloud's production systems ran on deeply customized versions of Xen and KVM.

I began contributing to KVM development in 2007. Before 2014, everyone in the cloud computing industry was focused on making good use of what the CPU and virtualization software such as Xen and KVM already provided, and little changed. Apart from VMware's binary translation and the paravirtualization technique proposed at Cambridge University, which is not shown in the slides, there was not much innovation.

In 2014 and 2015, as Alibaba Cloud set out to serve large enterprise customers, we had to solve the cost problem and improve our service capabilities.

At that time our virtualization technology could not keep pace with the growth of cloud computing. We were also discussing how to move the Alibaba economy's business to the cloud, and ran into serious problems. We needed to change the virtualization technology.

So we began exploring in 2015 and formally started the project in 2016, and in 2017 the Shenlong (X-Dragon) architecture was launched. It truly adopts hardware-software integration and co-design, which resolves the mismatch between traditional virtualization technology and the existing computing architecture.

In the past, virtualization technology was designed on the premise that the server and the computing architecture were already fixed, and the question was how to adapt to that architecture in software.

With Shenlong we did the opposite. Virtualization is now very mature, so we asked how to design a new computing architecture that lets virtualization run better. This represents a revolution in cloud computing, cloud data centers, and virtualization technology, and it gives us firmer confidence in the rapid development of cloud computing and of Alibaba Cloud.

Defects of traditional virtualization architecture

To design such a new architecture, we first need to look briefly at the traditional virtualization architecture of cloud data centers: what its advantages and disadvantages are, and why it needs to change.

This picture is a very typical virtualization architecture diagram.

At the bottom is a large cluster of physical machines. Each physical machine runs a hypervisor, that is, the virtualization system software, together with a host system such as Domain0 in Xen. This is the basic model.

In cloud computing, customers buy virtual machines. The computing power, stability, and elasticity of those virtual machines are delivered by the virtualization software underneath, not by the virtual machines themselves. For compute, the hypervisor virtualizes the CPU, memory, and interrupts. Virtual storage is mostly implemented by virtualization modules in the host, that is, in software. The same goes for networking, where we introduce virtual network components such as virtual switches and virtual routers.

This is a very typical architecture. Everyone used it before Shenlong appeared, and it did not seem so bad. But when the Alibaba economy and some large enterprise customers moved their businesses to the cloud, we found many problems.

I will briefly introduce these problems.

 

1. Resource contention and weak isolation: In this picture, the virtual machine is the customer's system and the host is our virtualization management system. Because they share the same physical machine, they compete for resources, and that contention causes the virtual machine's computing power to fluctuate.

2. Lost computing power and high cost: Both the host and the hypervisor consume CPU and memory, so not all of the machine's resources can be given to customers, which raises costs. For example, if I buy a 32-core physical server, I may only be able to give 16 or 20 cores to customers, and the remaining 12 to 16 cores are in a sense wasted. Yet we cannot do without them, because they do the work of virtualizing storage and the network. This waste of resources increases costs.

3. Obvious performance bottlenecks: As mentioned above, storage and network virtualization are implemented entirely in software. A software implementation is flexible and reasonably scalable, but its drawbacks are just as clear: performance and stability are relatively poor. This is its biggest problem.

4. Difficulty supporting bare-metal services: Because the hypervisor software already runs on the physical machine, it is hard to offer bare metal in the cloud while still providing elastic storage and networking. Even as cloud computing grows more popular, some enterprises still need physical machines, and cloud vendors have had no way to provide them. If all a vendor can offer is physical machine hosting, that is not cloud computing but a return to traditional IT.
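The cost effect described in item 2 can be worked through with the speaker's own example of a 32-core server. In the sketch below only the core counts come from the talk. The server price is a hypothetical placeholder, chosen so that a core costs 1000 units when nothing is reserved.

```python
def cost_per_sellable_core(server_cost: float, sellable_cores: int) -> float:
    """Hardware cost attributed to each core that can actually be sold."""
    return server_cost / sellable_cores

TOTAL_CORES = 32
SERVER_COST = 32_000.0  # hypothetical price, not from the talk

for sellable in (16, 20, 32):
    reserved = TOTAL_CORES - sellable
    unit_cost = cost_per_sellable_core(SERVER_COST, sellable)
    print(f"{sellable} sellable cores: {reserved} reserved "
          f"({reserved / TOTAL_CORES:.1%} overhead), "
          f"cost per sold core {unit_cost:.0f}")
```

Under these assumptions, reserving cores for the host and hypervisor raises the hardware cost of each core sold, and the increase is steepest when half the machine is reserved.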

These architectural flaws created many product challenges, and we had to solve them.

With these problems in mind, let's first ask: what exactly do customers want?

Our analysis showed that when customers buy our computing products, they want stronger computing performance, faster network access, and higher storage read and write capability: higher storage IOPS, higher network PPS, and higher network and storage bandwidth. They also want better QoS (quality of service), meaning network and storage performance should not swing high and low. They want better security, and they want lower cost. This is what customers want.

With these demands in hand, we compared them against the shortcomings of virtualization technology and set about exploring and innovating to address each one.

Shenlong was born at this moment. It was born for the cloud, and it is a virtualization technology that integrates hardware and software.

The performance-critical parts of the Shenlong server are all implemented in hardware, on chips. The non-critical parts, such as the control plane, are all implemented in software. The result is a combination of flexibility and performance. Compared with traditional virtualization, it represents a new generation of cloud data center virtualization technology that can really solve the problems just described.

 

The evolution of Shenlong architecture

Now let's move on to the second part.

The first generation of Shenlong: the pioneer of bare metal virtualization

The first generation of Shenlong was mainly about how to support bare-metal services in the cloud. Put plainly, it is a physical machine, but not a traditional one. It has to be fully integrated with the cloud computing infrastructure so that it can, for example, make full use of pooled cloud storage, network resources, and databases.

The first generation of Shenlong was born to meet this need. We call it bare-metal virtualization, and we were the first in the industry to release a product of this kind.

The experience can be summed up in one sentence: performance beyond that of a physical machine, with the experience of a virtual machine.

What is the virtual machine experience? Operation and maintenance of virtual machines are fully automated, and all resources are pooled, which is a very good model. Traditional virtualization technology has both strengths and weaknesses. Our original goal for the first generation of Shenlong was to use new technology to remove the weaknesses while keeping the strengths, and we achieved it.

Under this architecture, we explored in depth and designed the Shenlong MOC card.

The MOC card carries a Shenlong chip for high-speed data plane forwarding, along with a chip acceleration engine. EBS storage, networking, and everything on the control plane are moved down onto the card. Lifecycle management and all interfaces are kept consistent with virtual machines: storage uses virtio-blk, networking uses virtio-net, and other devices match the virtual machine as well.

As a result, it is seamlessly compatible with virtual machines: it can sit in the same VPC as virtual machines and can mount cloud disks, just like a virtual machine.

We designed the X-Dragon Hypervisor and developed the Shenlong chip in-house to provide safe, reliable, and lossless computing services. Shenlong supports elastic storage: it can attach 16 EBS cloud disks, each up to 32G. It also provides 31 elastic network interfaces (ENIs), which can be plugged in dynamically, so the network interfaces of a physical machine can be added and removed on the fly.

The chip-accelerated IO engine provides seamless access to VPC and to EBS storage, and no installation step is needed: the machine boots directly through the chip, and a bare-metal service is delivered in a minute or two.

The defining feature of the first generation is that a single chip solves all of these problems. The physical machine can use cloud storage and the VPC network and can mount local disks, and all system management is done on this card. We also added a security chip. Together these give a combination of flexibility, stability, performance, and cost. Moving virtualization down onto this card is the essence of the design.

The second generation of Shenlong: the practitioner of converged virtualization

The second-generation Shenlong is what we call the practitioner of converged virtualization. It is the technology used at scale by Alibaba Cloud's sixth-generation instances now online.

Compared with the first generation, the Shenlong chip was further enhanced. It supports not only bare-metal systems but also virtual machines.

We designed an ultra-thin hypervisor for virtual machines called Dragonfly, the hypervisor that lets the dragon fly. It takes up almost no resources and can support many virtual machines, so all computing resources can be given to customers with no loss of resources or performance. Each virtual machine is isolated by hardware queues, so although virtual machines share the same host, each does its own work without interfering with the others.
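Per-VM queue isolation can be pictured with a small software model. This is only an analogy with invented class and method names. In Shenlong the queues are implemented in hardware, and the talk does not describe how they are scheduled. The sketch gives each virtual machine its own bounded queue and serves the queues round-robin, so a VM that floods its own queue cannot delay another VM's requests.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

class IsolatedQueues:
    """Toy model: one bounded FIFO per VM, served round-robin."""

    def __init__(self, vm_ids: List[str], depth: int) -> None:
        self.depth = depth
        self.queues: Dict[str, Deque[str]] = {vm: deque() for vm in vm_ids}

    def submit(self, vm: str, request: str) -> bool:
        """Enqueue a request; reject it if this VM's own queue is full."""
        q = self.queues[vm]
        if len(q) >= self.depth:
            return False  # back-pressure affects only the submitting VM
        q.append(request)
        return True

    def service_round(self) -> List[Tuple[str, str]]:
        """Serve at most one request from each VM's queue."""
        served = []
        for vm, q in self.queues.items():
            if q:
                served.append((vm, q.popleft()))
        return served

queues = IsolatedQueues(["vm-a", "vm-b"], depth=4)
# A noisy VM tries to submit far more than its queue can hold.
accepted = sum(queues.submit("vm-a", f"a{i}") for i in range(100))
# A quiet VM submits a single request afterwards.
queues.submit("vm-b", "b0")
print(accepted)                # how many vm-a requests were accepted
print(queues.service_round())  # requests served in the first round
```

Because the quiet VM has its own queue, its request is served in the first round regardless of how much the noisy VM submitted. With a single shared queue it would have waited behind the backlog.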

Here is what we achieved:

① Converged technology and pooled resources: a single set of software and hardware supports three kinds of service, namely containers, virtual machines, and bare metal.

② The Dragonfly Hypervisor is ultra-thin and lightweight, bringing the resource overhead of virtualization close to zero.

③ Direct access to hardware IO devices: for example, 512 network queues supporting up to 512 ENIs, and 512 storage queues supporting at least dozens of cloud disks. These capabilities will be enhanced further.

④ The hardest part was hot upgrade of all components, including all the chips and FPGA components. Solving hot upgrade solved the problem of rapid iteration across our R&D.

The Dragonfly Hypervisor just mentioned has the following characteristics:

Ultra-thin, with near-zero resource loss. Memory usage is less than 1 MB per virtual machine, and CPU usage is less than 0.1%.

Ultra-stable, with near-zero jitter: about one jitter event per million packets. The best elsewhere in the industry is probably at the level of one per 100,000.

Ultra-smooth: it is compatible with the original architecture and seamlessly compatible with KVM, which solves the problem of migrating between resource pools.


The third generation of Shenlong: the pursuit of extreme performance

The third-generation Shenlong architecture pursues ultimate performance. We released it at the Yunqi Conference last year. Its overall performance is the highest in the industry: compared with similar architectures from our peers, it is about 5 times higher, and key indicators such as storage and network performance are more than 5 times theirs.

The biggest changes in the third-generation Shenlong are:

① The entire data plane path is implemented in chips. Storage, network, and data processing all run on chips, which greatly improves performance.

② Hardware-level, carrier-grade QoS management. Limits such as packets per second and bandwidth for storage and the network are enforced precisely, down to a single packet. This was previously available only in carrier-grade equipment, and we have implemented it in cloud data center servers.

③ An enhanced converged network that provides low latency close to bare metal.

④ Enhanced hardware queues, supporting 1,024 storage queues and 1,024 network queues, with further strengthened isolation between queues.
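A packets-per-second limit of the kind described in item ② is commonly enforced with a token bucket. The sketch below is a software illustration under that assumption. The talk does not say which mechanism the Shenlong chip uses, and the class name, parameters, and rate figures here are invented. One token admits exactly one packet, so the limit is counted in whole packets.

```python
class PacketTokenBucket:
    """Token bucket counting whole packets: one token admits one packet."""

    def __init__(self, rate_pps: int, burst: int) -> None:
        self.rate_pps = rate_pps    # tokens added per second
        self.burst = burst          # maximum tokens that can accumulate
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the last refill, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a packet arriving at time `now` may pass."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_pps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Hypothetical limit: 1000 packets per second with a burst allowance of 10.
bucket = PacketTokenBucket(rate_pps=1000, burst=10)
# Offer 5000 packets spread evenly over one second, five times the limit.
passed = sum(bucket.allow(i / 5000) for i in range(5000))
print(f"offered 5000, passed {passed}")
```

The number passed stays close to the configured rate plus the burst allowance, however much traffic is offered. A hardware implementation would do the same accounting per queue at line rate.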

 

Three generations of Shenlong have expanded the boundaries of elastic computing products. The first generation was released in October 2017, the second in September 2018, and the third on the morning of September 26 last year.

Network performance now reaches 25 million PPS on a single machine, about 5 times that of our peers, and storage reaches 1 million IOPS. Existing elastic computing products already offer these capabilities, and on some instance types you can feel a significant improvement. Next, we will switch over fully to the third generation.

As of last year, not only had the business of the Alibaba economy moved entirely onto Shenlong, but the entire public cloud computing service had moved onto the Shenlong architecture as well. Every new server added by Alibaba and Alibaba Cloud is based on the Shenlong architecture. The general-purpose x86 instances, the G, T, and R series, have been fully upgraded, and all heterogeneous computing and high-performance computing instances have switched to Shenlong. They provide the third-generation capabilities just mentioned, such as about one jitter event per million packets.

The practice of Shenlong architecture

Finally, let me briefly introduce some current uses of Shenlong.

Just now Mr. Shu Tong explained that Shenlong and containers complement each other like a pair of swords.

We designed a dedicated container network called Terway, which integrates natively with VPC (virtual private cloud). Each container gets a VPC IP, and network and storage traffic between containers is isolated through hardware queues.

During Double Eleven last year, containers running on Shenlong performed very well for many typical applications.

Shown here is a critical application in our e-commerce business, one that is used every time someone places an order. Its overall QPS increased by 30%, and RT dropped by 96%.

It performs this well even under extreme pressure. CPU utilization of customer resources can be raised to 80%.

Why is there such an effect?

 

The blue line in the picture is a traditional physical machine, and the red line is a physical machine of the same specification, that is, with the same CPU, memory, and so on, built on the Shenlong architecture.

When both run the business on the cloud, look at the blue line: once CPU utilization reaches perhaps 40%, business latency starts to climb, and beyond 50% or 60% the whole business collapses. With Shenlong, business latency grows very little even as CPU utilization approaches 100%.

For an ordinary server, an average CPU utilization of 20% to 30% is already considered quite good. Shenlong can raise that to 50% to 60% with very linear, stable business performance. Many customers have started using our Shenlong servers.

 

Another use of Shenlong is moving VMware private clouds to the public cloud. Many offline data centers run VMware private clouds, and many customers use Shenlong to migrate them. Without Shenlong this would not be possible, because traditional cloud computing instances cannot run virtualization software such as VMware.

Likewise, if you use OpenStack with KVM underneath and want to move your entire offline OpenStack private cloud to the cloud, the Shenlong bare-metal server can handle that too.

Finally, after the recent outbreak of the novel coronavirus, Alibaba Cloud provided computing power directly to dozens of public research institutions. Behind this is high-performance computing capacity built on Shenlong bare metal.


The future of Shenlong architecture

 

A few days ago, our paper on the Shenlong architecture was presented at ASPLOS 2020, a top conference in computer systems.

The paper describes the overall design of bare-metal virtualization and what we plan to do next. If you are interested, you can download the paper at this address, where there are also video explanations.

We will continue to innovate on the Shenlong architecture. If you have questions or ideas in this field, please contact us.

Thank you all!

 

【END】


Origin blog.csdn.net/csdnnews/article/details/105548972