The virtualization technology behind 268.4 billion yuan: Double 11 "All on X-Dragon" | Getting to the Bottom of China's IT Technology Evolution

Author | Alibaba Cloud X-Dragon team: Yang Hang, Yao Jie

After smoothly passing the traffic peak of the 2019 Tmall Double 11, Alibaba officially announced that the Double 11 core systems now run 100% on Alibaba Cloud. Apsara, China's only self-developed cloud operating system, successfully withstood the world's largest traffic peak!

Just 1 minute and 36 seconds after midnight, total Tmall Double 11 turnover exceeded 10 billion yuan, faster than last year. This year's peak order rate set a new world record of 544,000 transactions per second, 1,360 times that of the first Double 11 in 2009.

It is reported that two months earlier, Alibaba had quietly completed this massive migration project, moving hundreds of thousands of physical servers from offline data centers onto the cloud. Consumers and merchants on Taobao and Tmall did not notice this process of "changing the engines on an aircraft in flight" at all.

As a result, Alibaba became the world's first large Internet company to run 100% of its core transaction systems on a public cloud. Even the cloud computing giants Amazon, Microsoft, and Google have yet to take that step.

The extreme performance of Double 11 would not be possible without X-Dragon (Shenlong), the server architecture developed in-house by Alibaba Cloud. In 2019, the Double 11 core systems ran entirely on X-Dragon, which delivered the computing power of millions of CPU cores and carried all e-commerce, big data, Ant Financial, Ele.me, Lazada, and other services at home and abroad.

This article first explains why Alibaba chose X-Dragon when moving the Double 11 core systems fully onto the cloud, then describes the architecture and operation of Double 11 "All on X-Dragon", and finally shares the lessons the X-Dragon team learned while preparing for Double 11.

The great challenge of moving fully onto the cloud

The measured scale of the 2019 Double 11 was as follows: the clusters held more than a million containers, a single cluster had more than 10,000 container nodes, the database peak exceeded 540,000 transactions per second (corresponding to 87 million queries per second), real-time computing processed a peak of more than 25 million messages per second, and the RocketMQ messaging system processed a peak of more than 150 million messages per second.
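As a quick sanity check on the database figures above, the ratio of queries to transactions can be worked out directly. The sketch below only divides the two numbers quoted in the text; because the transaction figure is a lower bound ("more than 540,000"), the result is an approximation, not a measured value.

```python
# Figures quoted in the article for the 2019 Double 11 database peak.
PEAK_DB_TRANSACTIONS_PER_SEC = 540_000    # "more than 540,000" (lower bound)
PEAK_DB_QUERIES_PER_SEC = 87_000_000      # "87 million queries per second"


def queries_per_transaction(queries_per_sec: float, transactions_per_sec: float) -> float:
    """Average number of database queries issued per transaction."""
    return queries_per_sec / transactions_per_sec


ratio = queries_per_transaction(PEAK_DB_QUERIES_PER_SEC, PEAK_DB_TRANSACTIONS_PER_SEC)
print(f"roughly {ratio:.0f} database queries per transaction")
```

In other words, each order at peak fans out into on the order of 160 database queries, which gives a sense of the load amplification behind a single transaction.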

Behind these numbers lie the huge challenges encountered in the process of moving onto the cloud.

X-Dragon solves the pain points of traditional virtualization

The Double 11 core systems chose X-Dragon as their cloud architecture because of its high performance and its support for nested (second-level) virtualization.

Before migrating the Double 11 core systems to the X-Dragon architecture at scale, the Alibaba Cloud team verified during the 618 and 99 promotions that Alibaba Group's e-commerce containers running on X-Dragon in the cloud actually performed 10% to 15% better than on non-cloud physical machines.

Double 11 stress tests also showed that, under high load, e-commerce applications achieved a 30% increase in QPS while response time (rt) dropped markedly, with the drop in long-tail rt being especially noticeable.

One key reason X-Dragon delivers such a significant performance gain is that it offloads the virtualization overhead of networking and storage to hardware accelerators, which reduces compute virtualization overhead by about 8%.

Traditional x86 virtualization systems have the following pain points:

(1) Traditional virtualization causes CPU features to be lost. For example, the VT hardware-assisted virtualization capability of Intel Xeon processors is "consumed" by the virtualization system itself, so customers cannot deploy a virtualization system again inside a public cloud VM instance. This has two consequences:

  1. Traditional OpenStack-based and VMware-based workloads cannot be deployed on the public cloud

  2. Innovation in cloud-native secure containers cannot be sustained, because it depends on Intel VT hardware-assisted virtualization being exposed

    The compute resources of an IaaS public cloud must expose the processor's complete ISA feature set, including VT hardware-assisted virtualization, in order to accelerate innovation in IaaS and cloud-native technologies such as Kata, Firecracker, and gVisor.
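Whether an instance still exposes hardware-assisted virtualization can be checked from inside a Linux guest. The following is a minimal sketch, not something from the original article: it assumes a Linux system where `/proc/cpuinfo` lists CPU feature flags, and looks for `vmx` (Intel VT-x) or `svm` (AMD-V). The function name `hardware_virt_support` is made up for illustration.

```python
from pathlib import Path

# CPU flag names as they appear in /proc/cpuinfo on Linux.
VIRT_FLAGS = {"vmx": "Intel VT-x", "svm": "AMD-V"}


def hardware_virt_support(cpuinfo_text: str) -> set:
    """Return the hardware-virtualization flags found in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Intersect the flags on this line with the ones we care about.
            found |= VIRT_FLAGS.keys() & set(value.split())
    return found


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    flags = hardware_virt_support(text)
    if flags:
        names = ", ".join(VIRT_FLAGS[f] for f in sorted(flags))
        print(f"hardware virtualization exposed: {names}")
    else:
        print("no vmx/svm flag visible; KVM-based guests cannot run here")
```

On a traditional cloud VM without nested virtualization enabled, the flag is typically absent, which is why hypervisor-based secure container runtimes such as Kata or Firecracker cannot start there.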

(2) The resource overhead of traditional virtualization is unavoidable. In a traditional KVM virtualization system, for example, cloud disk block storage and network packet forwarding both consume CPU and memory on the host side, which incurs a certain resource overhead.

(3) Traditional KVM virtualization leads to I/O performance bottlenecks

With DPDK and SPDK, software optimization of storage and network virtualization is already very close to its technical limit, yet it still cannot match chip-level hardware acceleration in performance, latency, or quality. This is especially true now, as network throughput evolves toward 100GbE and the gap between the processing power of Intel Xeon processors and the bandwidth of switched networks gradually widens.
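A back-of-the-envelope calculation makes this gap concrete. The sketch below is not from the original article. It assumes the worst case of minimum-size 64-byte Ethernet frames, each of which also occupies 20 bytes of preamble and inter-frame gap on the wire, and a hypothetical 2.5 GHz core clock chosen purely for illustration.

```python
LINK_BITS_PER_SEC = 100e9      # 100GbE line rate
MIN_FRAME_BYTES = 64           # minimum Ethernet frame size
WIRE_OVERHEAD_BYTES = 20       # 8 bytes preamble/SFD + 12 bytes inter-frame gap
ASSUMED_CORE_HZ = 2.5e9        # hypothetical core clock, for illustration only


def packets_per_second(link_bps: float, frame_bytes: int,
                       overhead_bytes: int = WIRE_OVERHEAD_BYTES) -> float:
    """Maximum packet rate at line rate for a given frame size."""
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def cycles_per_packet(pps: float, core_hz: float) -> float:
    """CPU cycles a single core can spend on each packet at that rate."""
    return core_hz / pps


pps = packets_per_second(LINK_BITS_PER_SEC, MIN_FRAME_BYTES)
budget_ns = 1e9 / pps
cycles = cycles_per_packet(pps, ASSUMED_CORE_HZ)
print(f"{pps / 1e6:.1f} Mpps, {budget_ns:.2f} ns per packet, "
      f"{cycles:.1f} cycles per packet on one core")
```

At roughly 149 million packets per second, a single core has under 7 ns per packet, far less than a single main-memory access. Software forwarding therefore has to spread the work across many host cores, which is exactly the overhead that hardware offload removes.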

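The Alibaba Cloud technical team solves these problems with dedicated chips. Bare metal servers based on the X-Dragon architecture differ completely from traditional KVM in their design and are extremely well suited to containers and similar products in the cloud-native wave. In terms of specific technical characteristics, X-Dragon has the following features: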

  • The storage and network VMM and the ECS control plane are deployed separately from compute virtualization;

  • Compute virtualization evolves further into the NearMetal Hypervisor;

  • The storage and network VMM is accelerated by custom IP implemented on dedicated chips;

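  • A merged resource pool supports the production of both elastic bare metal instances (which support secure containers) and ECS virtual machines.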

Double 11: X-Dragon + containers + Kubernetes

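In 2019, Alibaba's Double 11 systems moved onto the cloud in a cloud-native way, built on X-Dragon servers, lightweight cloud-native containers, and the new Kubernetes-compatible ASI (alibaba serverless infra.) scheduling platform. The Kubernetes Pod container runtime integrates closely with X-Dragon bare metal: Pods serve as the delivery unit for business workloads and run on X-Dragon instances.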

A Pod runs on X-Dragon in the following form:

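  • ASI Pods run on X-Dragon bare metal nodes. Network virtualization and storage virtualization are offloaded to the MOC card, an independent hardware node, and accelerated with FPGA chips, so both storage and network performance exceed those of ordinary physical machines and ECS. The MOC has its own operating system and kernel and can allocate dedicated CPU cores to AVS (network processing) and TDC (storage processing);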

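  • An ASI Pod consists of the Main container (the primary business container), an operations container (the star-agent side-car container), and other auxiliary containers (for example, an application's local cache container). Within the Pod, the Pause container is used to share the network namespace, UTS namespace, and PID namespace (ASI disables PID namespace sharing);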

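  • The Pod's Main container and operations container share data volumes. A cloud disk is declared through a PVC, and the data volume is mounted at the corresponding cloud disk mount point. Under the ASI storage architecture, every Pod has its own independent cloud disk space, which supports read/write isolation and disk size limits;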

  • The ASI Pod connects directly, through the Pause container, to an ENI (elastic network interface) on the MOC card;

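  • No matter how many containers it holds internally, an ASI Pod occupies only one independent set of resources externally, for example 16C (CPU) / 60G (memory) / 60G (disk).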
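The Pod layout described above can be approximated with standard Kubernetes objects. The following is an illustrative sketch only, not ASI's actual configuration: the image names, the mount path `/data`, the volume name, and the helper function names are all hypothetical. Standard Kubernetes attaches resource limits to individual containers, so the sketch places the example 16C/60G envelope on the main container; the article does not describe how ASI accounts for Pod-level resources.

```python
import json


def build_pvc_manifest(claim_name: str, size: str = "60Gi") -> dict:
    """PersistentVolumeClaim declaring the Pod's own cloud disk."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


def build_pod_manifest(name: str, claim_name: str) -> dict:
    """Pod with a main container and an operations side-car sharing one volume."""
    shared_mount = {"name": "data", "mountPath": "/data"}  # hypothetical path
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Containers in a Pod always share the network namespace.
            # PID namespace sharing is opt-in and stays off here, as in ASI.
            "shareProcessNamespace": False,
            "containers": [
                {
                    "name": "main",
                    "image": "example.invalid/app:latest",         # hypothetical
                    "volumeMounts": [shared_mount],
                    "resources": {"limits": {"cpu": "16", "memory": "60Gi"}},
                },
                {
                    "name": "star-agent",
                    "image": "example.invalid/star-agent:latest",  # hypothetical
                    "volumeMounts": [shared_mount],
                },
            ],
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }


pvc = build_pvc_manifest("demo-disk")
pod = build_pod_manifest("demo-app", "demo-disk")
print(json.dumps([pvc, pod], indent=2))
```

The printed JSON is accepted by `kubectl apply -f` in the same way as YAML, which makes it easy to experiment with the main container plus side-car pattern on an ordinary cluster.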

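For an innovative product like X-Dragon, the team has worked continuously to improve and optimize its related solutions in order to raise operational efficiency and the reliability of the machines themselves. This is the result of effort from many parties and of continuous improvement. The team will keep working toward the best possible performance and stability.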


Origin blog.csdn.net/csdnnews/article/details/104079154