Taobao Open Cloud R&D PaaS New Exploration - Jushi Tower Cloud Hosting Technology

The Taobao open platform is an important channel through which Taobao and Tmall communicate with the external ecosystem. Through open products and technology, it delivers the basic services of the Alibaba economy, which are as essential as water, electricity, and coal, to merchants, developers, community media, and other partners. This promotes customization, innovation, and evolution across the industry, and ultimately contributes to an ecosystem of new commercial civilization.

This is the third article in the series. The first two are:

Part 1: Evolution of Open Gateway Architecture

Part 2: Official Data Space Design of Metadata-Driven Architecture

Origin of the problem

Jushi Tower is regarded as an "antique-level" technology platform within Taobao and Tmall. It has carried the mission of supporting Taobao's open ecosystem since its inception, and has come through several generations of technological change to reach the present.

After we made the third-party development ecosystem cloud-native in 2019, the main business systems of most e-commerce ISVs completed the move to cloud native, and each ISV now maintains several ACK clusters of its own. Jushi Tower's responsibility is to help ISVs operate and maintain those clusters, and to build an application operations PaaS on top of k8s that handles CI/CD, monitoring and operations, and security management and control. This model has been well validated in the e-commerce scenario. With the help of cloud-native standardization and a thriving cloud-native ecosystem, it has helped e-commerce ISVs upgrade their technical architecture, and the platform side has used the opportunity to standardize application architecture and strengthen stability and security capabilities. At present, the day-to-day scale of the Jushi Tower container clusters has reached the 10,000-core level.

Also starting in 2019, the Taobao mini program ecosystem began to flourish, and more and more third-party developers poured into the open platform to serve merchants and consumers on Qianniu and Taobao. These applications were likewise placed on Jushi Tower's container PaaS, and new problems emerged.

▐ The first mountain for small and medium-sized developers

Compared with traditional e-commerce ISVs, mini program ISVs are much smaller and have a much simpler application architecture. Usually one or two applications provide everything a mini program needs. For developers of such small applications, the "container cluster + application" model is too complicated, the learning cost is too high, and resources are wasted. The problem is especially serious for developers with no Jushi Tower or cloud background. It is like opening the door of Taobao's open platform and finding a high mountain behind it: they still have to learn the related technologies and spend energy climbing it.

  • Complexity of Application & Cluster Creation

To bring a mini program's cloud application onto the platform, the usual initialization path is:

Apply for an open platform application --> purchase ECS resources in Jushi Tower --> create and initialize a cluster in Jushi Tower --> create and deploy the application in Jushi Tower --> purchase an SLB in Jushi Tower --> connect the application to the Mini Program Cloud Gateway

For a skilled developer, the time spent on each step of the process is as follows:

  Step                                        Time
  Buy ECS                                     minute level
  Cluster creation & initialization           hour level
  Create and deploy the application           hour level
  Purchase and configure SLB                  minute level
  Associate the application with the gateway  hour level
  Total time                                  65 min

For developers who are not familiar with the platform, the time will be longer. It takes about 3-7 days to learn the cloud infrastructure and become familiar with the entire operation and maintenance framework.

  • Extra cost to pay

In mini program business scenarios, most k8s clusters are small, with fewer than 50 cores.

For a 50-core cluster (six 8-core 16G ECS instances), the cluster control components occupy 3.5 cores and 10G of resources. For a 20-core cluster (five 4-core 8G ECS instances), the cluster components take up 3 cores and 8.5G. The smaller the cluster, the smaller the share of resources actually available to users, and the higher the business cost.

According to our statistics, most mini program developers run relatively small clusters and pay a considerable proportion of their cost (1/10 to 1/5) for cluster system component overhead.
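The overhead share can be worked out directly from the figures quoted above. The following is a minimal sketch of that arithmetic only; the `Cluster` class and its field names are illustrative and not part of any Jushi Tower tooling. It uses the node counts as given, so the larger cluster totals 48 cores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    nodes: int              # number of ECS instances
    cores_per_node: int     # cores per instance
    mem_gb_per_node: int    # memory per instance, in GB
    overhead_cores: float   # cores occupied by cluster components
    overhead_mem_gb: float  # memory occupied by cluster components, in GB

    def overhead_share(self) -> tuple[float, float]:
        """Return (cpu_share, memory_share) consumed by cluster components."""
        total_cores = self.nodes * self.cores_per_node
        total_mem = self.nodes * self.mem_gb_per_node
        return self.overhead_cores / total_cores, self.overhead_mem_gb / total_mem


# Figures quoted in the text above.
larger = Cluster(nodes=6, cores_per_node=8, mem_gb_per_node=16,
                 overhead_cores=3.5, overhead_mem_gb=10)
smaller = Cluster(nodes=5, cores_per_node=4, mem_gb_per_node=8,
                  overhead_cores=3, overhead_mem_gb=8.5)

for label, cluster in (("six 8-core 16G nodes", larger),
                       ("five 4-core 8G nodes", smaller)):
    cpu, mem = cluster.overhead_share()
    print(f"{label}: {cpu:.1%} of CPU and {mem:.1%} of memory go to cluster components")
```

On these numbers the smaller cluster gives up roughly twice the CPU share of the larger one, which is the trend the paragraph describes.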

At the same time, developers face operational difficulties during big promotions (capacity must be expanded in advance and resources prepared in multiple availability zones), as well as the complex day-to-day operation and maintenance of cluster components.


▐ Shackles of the open platform

In traditional e-commerce scenarios, Jushi Tower has defined a very complete set of open standards for letting traditional e-commerce developers "enter the tower": by entering the tower, service providers deploy their services on Jushi Tower's IaaS or application PaaS to obtain better business system stability and business data security, and in turn serve Taobao merchants and consumers better.

However, mini program scenarios are richer and include more vertical business scenarios, such as chat corpus scenarios that involve sensitive data and mini-game scenarios that involve essentially none. For these more finely divided scenarios, applying the same old control standards across the board is clearly inappropriate. How to offer management and control schemes of different granularity for different vertical business scenarios and for third-party applications of different scales is therefore an important problem that Jushi Tower needs to solve next.

The goal is to settle the platform's open standards for vertical business scenarios and so untie the shackles on platform business openness.

▐ Do not let entering the Jushi Tower become a bottleneck for business

As more and more platform business models appear across our various business tracks, Jushi Tower must not become a bottleneck for third-party developers who want to join Taobao's open business. So how can the problems above be solved? For the broad mini program scenario, "cloud hosting" is the answer Jushi Tower offers.

Cloud hosting is an "application cloud-native" solution: the platform fully hosts the user's computing resources (containers), database resources, and network resources, so users no longer need to care about operating the underlying infrastructure and only need to focus on developing business code. After opening the door of the open platform, the user no longer faces a mountain but a car, and can run the business using a familiar style of code development. In essence, cloud hosting is one form of serverless.

By centralizing the hosted compute and network architecture on the platform, the complexity of business traffic links is hidden from users (for example, users no longer need to purchase and manage SLBs for mini program cloud business traffic). Centralized management of IaaS resource preparation also means users need far less awareness of cluster and component operation and maintenance, which reduces their learning and usage costs, operation and maintenance costs, and resource costs.

At the same time, thanks to the naturally loosely coupled network plug-in capabilities of cloud-native k8s, different network management and control solutions can be customized for different business scenarios, meeting the platform's need for finer-grained network control across multiple business scenarios.

Technical realization

In general, cloud hosting divides the hosting of resources into two categories:

  1. Computing resource hosting, that is, the hosting of computing containers; using ASI (k8s) cluster operation and maintenance to manage resources;

  2. Hosting of cloud resources (databases, middleware) that applications depend on; the platform manages the purchase, use, and operation and maintenance of each resource.

Based on the two types of hosted resources and the management and control link model, we divide the overall network into four areas:

  1. ASI cluster VPC: a VPC on the public cloud. It contains the applications deployed by developers, add-on applications for business components, the management and control components in the cluster, and the components required to run the cluster.

  2. Tenant-managed VPC: the platform allocates a separate VPC to each developer, and developers are not aware of it. Its main purpose is to isolate user resources and to host the user's cloud resources.

  3. Alibaba Group intranet: the platform's management and control applications and the console application exposed to users are deployed in the group intranet and issue instructions to the ASI cluster over a network connection.

  4. Mini program cloud gateway: the cloud gateway receives traffic from the Mobile Taobao client and delivers it to the user's application pods in the hosting cluster through cross-network technology.

Through this basic path, users can develop serverless applications in Jushi Tower without having to understand the complicated concepts of the cloud. The platform packages and integrates the cloud products it hosts in a "resource cloud-native" way for users to consume.

In the above scheme, the main problems in the following aspects need to be solved:

  1. How to isolate multi-tenant applications and containers in the cluster;

  2. How to let a single tenant's applications and resources communicate over the network while remaining isolated, and how to give the platform network management and control for different business scenarios;

  3. Automatic elastic scaling solution for applications in serverless scenarios;

▐ Multi-tenant computing isolation model

When choosing the underlying container technology early on, we considered several container cluster hosting solutions, including ACK, a self-built cluster, and Alibaba Cloud ASI. In the end, considering how complete its capabilities are for second-party business, we selected ASI as the underlying k8s hosting solution.

However, Jushi Tower does not directly use ASI's multi-tenant cluster solution, but uses ASI's single-tenant cluster. Jushi Tower acts as a unified tenant to manage the cluster as a whole and divide resources among different developers (tenants) within the cluster.

Because runc containers cannot provide complete kernel-level isolation, Jushi Tower chose Kangaroo RunD containers as its underlying secure container solution.

RunD containers are built on X-Dragon bare metal servers. The allocation of tenant containers is in fact tied to switches; the section on network communication and control below describes the allocation rules for the tenant container network in detail.

▐ Network communication and control

Network communication and management and control mainly need to address two requirements:

  1. Allocate network for pod to realize application network communication;

  2. For multi-service scenarios, realize hierarchical network management and control;

  • Split service management traffic and user traffic by mounting multiple ENIs on the Pod

From an application perspective, we can divide traffic into two types:

  1. Platform business control traffic, such as business traffic imported from the front-end cloud gateway (north-south traffic)

  2. User application internal traffic, such as traffic access between applications of the same tenant, or access traffic between applications and rds (east-west traffic)

Given the requirement for two-way communication across VPCs, we chose to mount multiple ENI network cards on each pod. Mounting an ENI across tenants provides connectivity between the VPCs of different tenants, and the edge switch of the ENI is used to manage network access.

In the Jushi Tower scenario, the container's default ENI (eth0) carries north-south business control traffic. The user's ENI (eth1), which the container mounts by assuming a role, carries traffic between the user's applications and between applications and cloud resources. The switches and security groups behind the two network cards can therefore be managed separately.
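The split between the two network cards can be pictured as a routing decision made per destination. The sketch below is only a model of that idea: the CIDR range is a made-up placeholder for a tenant-managed VPC, and the real selection would be done by the pod's routing table rather than by application code.

```python
import ipaddress

# Hypothetical address range of the tenant-managed VPC, chosen for illustration.
TENANT_VPC_CIDR = ipaddress.ip_network("192.168.0.0/16")


def pick_interface(destination: str) -> str:
    """Return the network card a pod would use to reach `destination`."""
    if ipaddress.ip_address(destination) in TENANT_VPC_CIDR:
        # User ENI: same-tenant applications and cloud resources such as RDS.
        return "eth1"
    # Default ENI: platform-controlled north-south business traffic.
    return "eth0"


print(pick_interface("192.168.3.7"))  # an address inside the tenant VPC
print(pick_interface("10.0.0.5"))     # an address outside the tenant VPC
```

Because each card sits behind its own switch and security group, a rule applied to eth1 does not affect the platform traffic on eth0, and the reverse also holds.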

  • Network isolation between tenants

As described above, each pod has two ENI network cards. One is the management network card, which brings in platform business traffic; the other is the user network card, which communicates with the user's dedicated hosted VPC. How, then, are the management network cards of different tenants' applications isolated from each other? (The user network cards need no extra consideration, because the VPC isolates them naturally.)

There are two options:

Independent Switch + Security Group Solution

Allocate independent switches and security groups to each tenant. Pods of the same tenant are created from the same group of switches. By configuring rules in the tenant's dedicated security group that allow traffic only between that tenant's switch network segments, application access is confined to the same tenant.

This solution is simple, and it makes k8s service communication between a tenant's applications relatively easy to implement. But it has a fatal problem. A VPC has an upper limit on the number of switches (120), so once multi-availability-zone disaster recovery is taken into account, each VPC can hold at most 40 tenants, which falls short of the number of ISVs the platform needs to onboard.
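The limit of 40 tenants follows from the switch quota if each tenant needs one switch per availability zone. The count of three zones per tenant is an inference from the two numbers in the text (120 and 40), not something the text states, so treat it as an assumption in this small sketch.

```python
def max_tenants_per_vpc(switch_quota: int, switches_per_tenant: int) -> int:
    """Tenants that fit in one VPC when every tenant gets dedicated switches."""
    return switch_quota // switches_per_tenant


SWITCH_QUOTA = 120       # per-VPC switch limit quoted in the text
ZONES_PER_TENANT = 3     # assumed: one switch in each of three availability zones

print(max_tenants_per_vpc(SWITCH_QUOTA, ZONES_PER_TENANT))
```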

Shared Switch + Security Group

In the shared switch + security group solution, multiple tenants share a switch from which pod IPs are allocated. To prevent pods of different tenants from reaching each other, all pods are added to a single enterprise security group, whose members cannot communicate with each other by default; this default is what isolates the pods.

In fact, this method blocks mutual access between all pods in the cluster, including pods of the same tenant. Mutual access between pods of the same tenant has to go through the user ENI network card.

However, because of the default behavior of kcm (kube-controller-manager) and kube-proxy, a k8s service selects the eth0 IP by default, which makes k8s services unusable. Working around this limitation is relatively complicated: the k8s endpoint controller has to be customized so that the service's real servers are mounted dynamically, and host routing and iptables have to be customized so that network packets leave through the correct interface.
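The reachability rules of the shared switch option, and the reason services must publish eth1 addresses, can be modeled in a few lines. This is a simplified sketch, not the customized endpoint controller itself; the `Pod` class, the helper names, and all IP addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    tenant: str
    eth0_ip: str  # management ENI: shared switch, unified enterprise security group
    eth1_ip: str  # user ENI: inside the tenant's own hosted VPC


def can_reach(src: Pod, dst: Pod, nic: str) -> bool:
    """Model pod-to-pod reachability over the given network card."""
    if nic == "eth0":
        # Members of the enterprise security group cannot talk to each other,
        # so eth0 never carries pod-to-pod traffic, even within one tenant.
        return False
    if nic == "eth1":
        # Each tenant's user ENIs live in that tenant's VPC, which isolates them.
        return src.tenant == dst.tenant
    raise ValueError(f"unknown network card: {nic}")


def service_endpoints(pods: list[Pod], tenant: str) -> list[str]:
    """Addresses a customized endpoint controller would publish for a tenant's
    service: the eth1 IPs, rather than the eth0 IPs picked by default."""
    return [pod.eth1_ip for pod in pods if pod.tenant == tenant]


pods = [
    Pod("shop-a-1", "tenant-a", "10.0.0.11", "192.168.1.11"),
    Pod("shop-a-2", "tenant-a", "10.0.0.12", "192.168.1.12"),
    Pod("game-b-1", "tenant-b", "10.0.0.13", "192.168.9.21"),
]

print(can_reach(pods[0], pods[1], "eth0"))  # same tenant, management card
print(can_reach(pods[0], pods[1], "eth1"))  # same tenant, user card
print(service_endpoints(pods, "tenant-a"))
```

If the service kept the default eth0 addresses, every backend would be unreachable under the first rule, which is why the endpoint list has to be rebuilt from the user ENI addresses.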

  • Network management and control solution for multiple business scenarios

As mentioned at the beginning of the article, Jushi Tower needs fine-grained control of network rules for different open scenarios. We group business scenarios into three types:

  Scenario                 Data level   Public network access
  Closed environment       High         Not allowed by default. In special cases, access must go through an external link, and the request is audited.
  Controlled environment   Medium       Only approved public IP addresses can be accessed. Suited to scenarios with some encrypted sensitive data.
  Open environment         Low          Public network access can be configured freely. Suited to scenarios with no data risk.

For three typical scenarios, we have designed different network management and control solutions.

Thanks to the dual network cards, we can confine the user's own traffic to the user ENI, and then control that traffic through the switch and security group of the VPC where the user ENI resides.

For example, in a closed environment we add rules that prohibit all public network access to the security group of the user's ENI, which directly blocks public network access from user applications. In an open environment we apply no special network control or customization, and users can even attach their applications to an SLB to provide services externally.
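The three environments amount to three egress policies on the user ENI. The sketch below expresses them as a single decision function. It is a hypothetical model of the policy only: the real enforcement lives in security group rules, and the audited external link used by closed environments is not modeled here. All addresses are documentation placeholders.

```python
import ipaddress
from enum import Enum


class Environment(Enum):
    CLOSED = "closed"          # high data level
    CONTROLLED = "controlled"  # medium data level
    OPEN = "open"              # low data level


def egress_allowed(env: Environment, destination: str, vpc_cidr: str,
                   approved: tuple[str, ...] = ()) -> bool:
    """Decide whether traffic leaving the user ENI may reach `destination`."""
    dest = ipaddress.ip_address(destination)
    if dest in ipaddress.ip_network(vpc_cidr):
        return True  # traffic inside the tenant VPC is not public access
    if env is Environment.CLOSED:
        return False  # no direct public network access at all
    if env is Environment.CONTROLLED:
        # Only destinations on the approved list are reachable.
        return any(dest in ipaddress.ip_network(net) for net in approved)
    return True  # open environment: no special restriction


VPC = "192.168.0.0/16"
ALLOWLIST = ("203.0.113.0/24",)

print(egress_allowed(Environment.CLOSED, "203.0.113.10", VPC))
print(egress_allowed(Environment.CONTROLLED, "203.0.113.10", VPC, ALLOWLIST))
print(egress_allowed(Environment.CONTROLLED, "198.51.100.7", VPC, ALLOWLIST))
print(egress_allowed(Environment.OPEN, "198.51.100.7", VPC))
```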


▐ The silver bullet for application stability: automatic scaling

No discussion of a serverless solution is complete without automatic scaling. Automatic application scaling is a powerful tool for balancing application stability against application cost. In the Jushi Tower cloud hosting environment, we support automatic elastic scaling based on container metrics such as CPU and memory utilization, and we also support elastic scaling based on business metrics, such as the traffic passing through the mini program cloud gateway.

For scaling based on CPU and memory utilization, the native k8s metrics server is the data source, and the standard HPA makes scaling decisions by applying its algorithm to those metrics.
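For reference, the core calculation documented for the Kubernetes Horizontal Pod Autoscaler is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), skipped when the ratio is within a tolerance (0.1 by default). The sketch below reproduces only that formula; the function name, the replica bounds, and the sample numbers are illustrative, and it leaves out details such as stabilization windows and unready pods.

```python
import math


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Replica count suggested by the basic HPA formula."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: do not scale
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 4 pods averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))
# 4 pods averaging 63% CPU against a 60% target (inside the tolerance band).
print(desired_replicas(4, 63, 60))
# 4 pods averaging 20% CPU against a 60% target.
print(desired_replicas(4, 20, 60))
```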

Neither the official data source nor HPA supports scaling on custom business metrics well, so we introduced KEDA as a pre-decision tool for elastic scaling. With KEDA, the scaling of any container in Kubernetes can be driven by the number of events that need to be processed. We also defined a dedicated data collector, jst-externalscaler, to collect custom business metric data.
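KEDA talks to an external scaler over gRPC through a small contract whose unary calls are IsActive, GetMetricSpec, and GetMetrics. The internals of jst-externalscaler are not described here, so the following is only a plain-Python stand-in that mirrors those three calls without any gRPC code. The class name, the dictionary keys, the metric name `gateway_qps`, and the traffic numbers are all assumptions made for illustration.

```python
import math
from typing import Callable


class GatewayTrafficScaler:
    """Stand-in for an external scaler driven by cloud gateway traffic."""

    def __init__(self, read_qps: Callable[[str], int], target_qps_per_pod: int):
        self._read_qps = read_qps          # hypothetical metric source
        self._target = target_qps_per_pod  # traffic one pod is expected to absorb

    def is_active(self, app: str) -> bool:
        # Mirrors IsActive: is there any work that justifies running pods?
        return self._read_qps(app) > 0

    def get_metric_spec(self, app: str) -> dict:
        # Mirrors GetMetricSpec: the metric name and its per-pod target.
        return {"metric_name": "gateway_qps", "target_size": self._target}

    def get_metrics(self, app: str) -> dict:
        # Mirrors GetMetrics: the current value of the metric.
        return {"metric_name": "gateway_qps", "metric_value": self._read_qps(app)}


def replicas_for(scaler: GatewayTrafficScaler, app: str, min_replicas: int = 1) -> int:
    """Replica count implied by an average-value target: ceil(total / per-pod target)."""
    spec = scaler.get_metric_spec(app)
    value = scaler.get_metrics(app)["metric_value"]
    return max(min_replicas, math.ceil(value / spec["target_size"]))


# Fake traffic readings standing in for the mini program cloud gateway.
traffic = {"shop-a": 950, "idle-app": 0}
scaler = GatewayTrafficScaler(read_qps=lambda app: traffic[app], target_qps_per_pod=100)

print(scaler.is_active("shop-a"), replicas_for(scaler, "shop-a"))
print(scaler.is_active("idle-app"), replicas_for(scaler, "idle-app"))
```

In a real deployment the scaler would only report the metric and its target; the replica calculation shown in `replicas_for` is carried out by HPA on the values KEDA exposes to it.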

By combining the official HPA with KEDA as an extension, we arrive at the automatic elastic scaling infrastructure for cloud-hosted application pods.

Looking ahead

Within the Jushi Tower product matrix, cloud hosting is more like an "autonomous region". In the spirit of letting those who get rich first bring along those who follow, it starts from the lightweight Taobao mini program scenario, explores infrastructure capabilities that are friendlier to Taobao service providers and merchants, redefines the standard for "entering the tower", and ultimately drives third-party business across Jushi Tower as a whole to evolve in a simpler and more agile direction. Jushi Tower is an important and unique channel connecting the open ecosystem with the cloud, so it determines data security, developer experience, and business delivery efficiency. How to solve the problem of "entering the tower" well is an important proposition and a test. In the future, Jushi Tower will continue to invest in the application developer ecosystem, combining low-code, function computing, cloud hosting, and other means to build a business-oriented "application cloud-native" PaaS platform. You are welcome to join the discussion.

Team introduction

The Taobao technology open platform is an important channel through which Alibaba communicates with the external ecosystem. Through open products and technology, it delivers the basic services of the Alibaba economy, which are as essential as water, electricity, and coal, to merchants, developers, community media, and other partners. This promotes customization, innovation, and evolution across the industry, and ultimately contributes to an ecosystem of new commercial civilization.

We are a team with strong technical ability and a proud history and tradition. On the Double Eleven battlefield over the years, the team has delivered excellent results: it carries millions of business processes per second, 90% of orders are pushed to merchants' ERP systems in real time through the order push service to complete e-commerce operations, and the ERP-WMS scenario opened up by Qimen has become a standard in the warehousing industry. As the new retail business continues to explore and grow rapidly, we are eager for talented people from all fields to join us and take part in technically challenging work such as core system architecture design, performance tuning, and open model innovation.


Origin blog.csdn.net/Taobaojishu/article/details/131566423