Review | Kubernetes SIG-Cloud-Provider-Alibaba First Webinar (including PPT download)

Author | Tang Zhimin, Xie Yaoyao

Full video replay of the webinar: https://www.bilibili.com/video/av88668762

On February 12, Alibaba Cloud and CNCF jointly held a webinar that, for the first time, gave a complete picture of Alibaba Cloud's work in the Kubernetes community: 10 categories and more than 20 open source projects that together provide full Kubernetes lifecycle management. This article brings together the full video replay and the slide download, and answers the questions that could not be addressed during the session. We hope you find it helpful.

Follow the "Alibaba Cloud Native" official account and reply with "meeting" to download the PPT.

What is SIG Cloud Provider?

More and more enterprises are running Kubernetes in production. This adoption is inseparable from Kubernetes' sound design and thriving community. There are currently about 20 special interest groups (SIGs) around Kubernetes. SIG Cloud Provider is one of the important ones, committed to having all cloud vendors provide Kubernetes services with standard capabilities.

SIG-Cloud-Provider-Alibaba is the only sub-project of SIG Cloud Provider in China.

SIG Cloud Provider is the Kubernetes interest group for cloud vendors, dedicated to keeping the Kubernetes ecosystem evolving in a vendor-neutral direction. It coordinates different vendors so that developers' needs are met with a unified standard as far as possible. Seven cloud vendors currently participate in SIG Cloud Provider, including AWS, GCP, Alibaba Cloud, and IBM Cloud.

Why does Alibaba Cloud join SIG Cloud Provider?

1. Work with cloud vendors worldwide to promote multi-cloud standards and contribute Alibaba Cloud's best practices back to the community

In the era of full cloud adoption, the cloud has reshaped enterprise IT architecture. Cloud native computing is a set of best practices and methodologies for building scalable, robust, and loosely coupled applications in public cloud, private cloud, and multi-cloud environments, enabling faster innovation and low-cost trial and error.

As an internationally influential cloud vendor, Alibaba Cloud also hopes to promote the further standardization of Kubernetes, and to cooperate with peer cloud vendors such as AWS, Google, and Azure to optimize the connection between the cloud and Kubernetes and to unify standardized protocols for modularizing and integrating different components.

2. Bring transparency and control, collaborative development, and smooth evolution to Kubernetes developers on Alibaba Cloud

For Kubernetes developers and users, we hope to build the best operating environment for Kubernetes based on Alibaba Cloud, and open source the Alibaba Cloud plug-ins built around Kubernetes. Alibaba Cloud Container Service ACK will also reuse these components as much as possible.

  • Transparent and controllable: Developers who build their own clusters can base them on these plug-ins, and users of Container Service ACK can see more clearly how the relevant features are implemented;
  • Collaborative development: Developers who have requirements for running Kubernetes on Alibaba Cloud in areas such as compute, network, and storage can raise an issue, contribute to the open source components, and take part in shaping the roadmap;
  • Smooth evolution: Alibaba Cloud's open source Kubernetes plug-ins provide Day 1 deployment capabilities, but they place higher demands on an enterprise's own operations, upgrades, and stability control. If you need Day 2 expert services such as continuous upgrades, high availability guarantees, and troubleshooting recommendations, you can move smoothly to Container Service ACK.

Operation mechanism of SIG Cloud Provider Alibaba

  • Slack
  • Bi-monthly meeting
  • Meeting minutes: Google Docs, YouTube
  • Meeting languages: Chinese, English

Introduction to Alibaba Cloud Kubernetes product family

2.png

Alibaba Cloud Kubernetes Open Source Suite Family Portrait

3.png

As the application operating system of the cloud native era, Kubernetes has become the de facto standard. In the course of its Kubernetes practice, Alibaba Cloud has open sourced many projects: five lower-layer categories covering areas such as compute, storage, network, and security, and five upper-layer categories covering areas such as AI, application management, migration, and Serverless. Together they provide full-stack lifecycle management for user applications.

SIG-Cloud-Provider-Alibaba serves as a bridge for sharing K8s cloud native best practices on Alibaba Cloud. Through the interest group, participating individuals and organizations can understand how the cloud provider works and apply it in production to realize business value.

See below for details.

CloudController

Network

Storage

Elasticity

Security

Migration

AI

ServiceBroker

Serverless

Application management

Introduction to some open source components

CloudController

CloudController refers to the cloud-controller-manager component (CCM) of K8s, which provides the ability to connect Kubernetes with basic services of various cloud vendors (including network load balancing, VPC routing, ECS, DNS, etc.). It is mainly implemented by four controllers: NodeController, ServiceController, RouteController, and PVLController.

**NodeController** manages compute nodes. It handles the ECS node lifecycle and labels nodes with their availability zone, region, hostname, and other identifiers, giving the orchestration system the information it needs to schedule workloads across the compute pool. It also periodically polls each ECS instance's IP address and resource status (for example, whether the instance has been released) and updates node information dynamically, so that the orchestration system responds to compute node events in a timely manner.
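
The polling pass described above can be pictured with a small sketch. This is not the CCM's actual code: `Instance`, `Node`, and `sync_nodes` are hypothetical names, the cloud API is replaced by a plain dictionary, and the label keys are the well-known upstream Kubernetes topology labels, which a given CCM release may or may not use.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Well-known upstream label keys (assumed here; older releases used beta keys).
REGION_LABEL = "topology.kubernetes.io/region"
ZONE_LABEL = "topology.kubernetes.io/zone"
HOSTNAME_LABEL = "kubernetes.io/hostname"


@dataclass
class Instance:
    """Hypothetical view of a cloud instance as returned by the cloud API."""
    instance_id: str
    region: str
    zone: str
    hostname: str
    ip: str


@dataclass
class Node:
    """Hypothetical, simplified Kubernetes node object."""
    name: str
    instance_id: str
    labels: Dict[str, str] = field(default_factory=dict)
    address: str = ""


def sync_nodes(nodes: List[Node], cloud: Dict[str, Instance]) -> List[str]:
    """Run one polling pass.

    Refreshes labels and the address of every node whose instance still
    exists, and returns the names of nodes whose instance has been released.
    """
    released = []
    for node in nodes:
        instance = cloud.get(node.instance_id)
        if instance is None:
            # The backing instance is gone; report it to the caller.
            released.append(node.name)
            continue
        node.labels[REGION_LABEL] = instance.region
        node.labels[ZONE_LABEL] = instance.zone
        node.labels[HOSTNAME_LABEL] = instance.hostname
        node.address = instance.ip
    return released
```

A real controller would run this on a timer and act on the returned names, for example by removing the corresponding node objects.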

**ServiceController** manages application load balancing. By watching for changes to Kubernetes Service objects, it automatically configures and manages cloud load balancing for applications (SLB configuration, listener configuration, and virtual server group configuration), and it dynamically adjusts the load balancer's backend server group as application replicas change, without manual intervention. On top of this, we define a rich set of annotations for customizing an application's load balancing configuration, and we actively work with the community to standardize that configuration. We have also extended the K8s service discovery model with an elastic network interface (ENI) pass-through mode, which reduces the number of network hops in service discovery and improves overall application network performance by 10%.
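
As a rough illustration of how a backend server group can follow replica changes, the sketch below computes which backends to add and which to remove on each reconciliation. It is a minimal sketch under stated assumptions, not the ServiceController's real logic: `diff_backends` is a hypothetical helper, and backends are represented simply as address strings.

```python
from typing import Iterable, List, Tuple


def diff_backends(
    desired: Iterable[str], current: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Compare the Service's current endpoints with the load balancer's backends.

    Returns (to_add, to_remove), both sorted so the result is deterministic.
    """
    desired_set, current_set = set(desired), set(current)
    to_add = sorted(desired_set - current_set)
    to_remove = sorted(current_set - desired_set)
    return to_add, to_remove


# Example: one replica was replaced and one was added.
to_add, to_remove = diff_backends(
    desired=["10.0.0.2", "10.0.0.3", "10.0.0.4"],
    current=["10.0.0.1", "10.0.0.2"],
)
```

Computing a difference rather than rewriting the whole group keeps the number of cloud API calls proportional to the change, which is one plausible reason to structure a reconciler this way.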

High-performance network components Terway

Terway implements the Kubernetes CNI specification. It is optimized for the Alibaba Cloud environment and supports rich enterprise features, with several modes including VPC routing mode, ENI mode, and ENI multi-IP mode. Performance is excellent: ENI mode performs about 10% better than native VPC.

Terway's deep integration with Alibaba Cloud's underlying IaaS network makes Pods first-class citizens of the cloud network, able to use network products such as CEN and SLB seamlessly. Using elastic network interfaces (ENIs) achieves zero network performance loss, so containerization does not degrade experience or performance. Terway also supports advanced features such as Kubernetes network policy and QoS traffic control.

High-performance container storage CSI

The Alibaba Cloud CSI plug-in implements lifecycle management of container storage volumes in Kubernetes and supports dynamic creation, mounting, and use of cloud data volumes. The current CSI implementation requires K8s 1.14 or later. Supported Alibaba Cloud storage types include cloud disk, NAS, CPFS, OSS, and LVM.

High-performance log collection Log-Pilot

Log-Pilot is an efficient, intelligent container log collection tool. It can conveniently collect the standard output logs of containers, and it can also dynamically discover and collect log files inside containers. It uses declarative configuration and automatically adjusts log collection by sensing the state of containers in the cluster. It also has many advanced features, such as automatic log checkpoints, a file handle retention mechanism, automatic tagging of log data, and custom tags. Data can be shipped to various log storage backends, such as ElasticSearch, Kafka, Logstash, Redis, and Graylog.

Arena, a lightweight solution for machine learning

Arena is a lightweight machine learning solution based on Kubernetes. It supports the complete lifecycle of data preparation, model development, model training, and model prediction, improving the productivity of data scientists. It makes it easy for data scientists and algorithm engineers to start using Alibaba Cloud resources (including ECS cloud servers, GPU cloud servers, the distributed storage services NAS and CPFS, object storage OSS, Elastic MapReduce, and load balancing) for tasks such as data preparation, model development, model training, evaluation, and prediction. It can also easily turn deep learning capabilities into service APIs to accelerate integration with business applications. Beyond improving data scientists' efficiency, it raises the utilization of cluster GPU resources through visual management of GPU resources and shared scheduling of devices.

Welcome to SIG Cloud Provider

This webinar introduced Alibaba Cloud's community work in Kubernetes for the first time. Time and space did not allow us to cover every open source component in detail, but we hope to have taught people how to fish, so that developers interested in Kubernetes can find the corresponding open source projects. We welcome more developers to join in, whether by submitting a PR or an issue or by making suggestions for the roadmap. In the future, SIG Cloud Provider Alibaba will also share principles and best practices for specific components.

Q & A

Q1: Can each feature of Alibaba Cloud's K8s Cloud Provider be switched on or off with a parameter?

A1: You can enable specific features by configuring annotations. For details, please refer to the documentation.

Q2: If we want to make modifications on top of the Alibaba CCM, will we run into K8s version problems? We want to use our own specific version of Kubernetes.

A2: You can do that; CCM does not depend on a specific K8s version.

Q3: Is the open source CCM used directly by Alibaba Cloud's Kubernetes-based container service? If so, what adjustments were made internally before going live? Also, what is the exact format of provider_id?

A3: Yes, completely based on the open source version of CCM. The provider_id format is ${regionid}.${nodeid}.
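
To make the `${regionid}.${nodeid}` format concrete, here is a minimal sketch of building and parsing such a value. The helper names and the sample values are hypothetical, and the sketch assumes that the region ID itself contains no dot, so the first dot separates the two parts.

```python
from typing import NamedTuple


class ProviderID(NamedTuple):
    """The two components of a provider_id in the `${regionid}.${nodeid}` format."""
    region_id: str
    node_id: str


def format_provider_id(region_id: str, node_id: str) -> str:
    """Join the region ID and node ID with a dot."""
    return f"{region_id}.{node_id}"


def parse_provider_id(value: str) -> ProviderID:
    """Split a provider_id at the first dot (assumes region IDs contain no dot)."""
    region_id, sep, node_id = value.partition(".")
    if not sep or not region_id or not node_id:
        raise ValueError(f"expected '<regionid>.<nodeid>', got {value!r}")
    return ProviderID(region_id, node_id)


# Sample values are illustrative only.
example = parse_provider_id("cn-hangzhou.i-example123")
```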

Q4: Does CCM require the K8s nodename to be the same as the Alibaba Cloud instance ID? Our operations team previously said it must be, but such a meaningless nodename is unpleasant to work with.

A4: Not required. Currently, only the providerid parameter needs to be configured.

Q5: How does Terway accelerate the underlying network? At the kernel level, or with DPDK?

A5: Terway offers several network modes, and the network configuration differs between modes.

  • In exclusive ENI mode, the IaaS-layer network interface is used directly as the Pod's network interface. No virtualization is involved on the host, and user Pods can use DPDK to accelerate application networking. Traffic outside the node relies on the high-performance IaaS network developed by Alibaba Cloud;
  • In shared ENI mode, IPVLAN is used as a lightweight virtualization solution within the node, and the performance loss compared with host networking is extremely low.

Q6: Can a Pod's underlying kernel parameters be namespaced?

A6: Whether a Pod's underlying kernel parameters can be namespaced depends on kernel support. On newer kernels, such as the 4.19 kernel in Aliyun Linux 2, most kernel parameters can be specified and modified at the Pod level.

Q7: What secure container products does Alibaba currently offer?

A7: Alibaba Cloud's container service currently offers a security sandbox as an optional container engine, and some Alibaba Cloud serverless products, such as SAE and ECI, are also built on secure containers.

Q8: Does Arena support multi-tenancy and virtual GPU?

A8: Arena reuses Kubernetes' existing user authorization and multi-tenancy mechanisms. Different users can be given different kubeconfigs for authentication, while namespaces provide resource isolation and sharing. In Arena, a user sees only the training and inference tasks in their own namespace; tasks in other namespaces are not visible.

The virtual GPU here refers to NVIDIA's virtual GPU technology. Alibaba Cloud currently supports virtual GPUs on P4, and this support has been integrated with Alibaba Cloud Container Service for Kubernetes, where you can try it. From Arena's point of view, a virtual GPU is not a special GPU resource, so Arena can schedule and orchestrate it.

Q9: Does the multi-container shared GPU solution support resource isolation? Can you limit the memory?

A9: First of all, thank you for your interest in our GPU sharing solution. Alibaba Cloud Container Service has contributed the only open source GPU sharing solution in the industry. At present, the solution implements multi-container GPU sharing at the scheduling level, and it can be combined with frameworks such as TensorFlow to enforce GPU resource limits at the application level. Current usage is described in our documentation.

However, we are also developing a secure and high-performance GPU isolation solution with Alibaba Cloud's underlying team. I believe that in the near future, everyone will be able to experience a complete solution from GPU shared scheduling to isolation.
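
Sharing "at the scheduling level" means the scheduler only does the bookkeeping of which container lands on which GPU; it does not enforce limits at runtime. The sketch below illustrates that idea with a best-fit policy over free GPU memory. The policy, the function names, and the units are assumptions made for illustration and are not taken from the Alibaba Cloud implementation.

```python
from typing import Dict, List, Optional, Tuple


def pick_gpu(free_mem: Dict[int, int], request: int) -> Optional[int]:
    """Return the GPU index with the least free memory that still fits the request.

    Best fit is an assumed policy; it packs containers tightly so that
    larger requests can still find a GPU with enough room.
    """
    candidates = [(free, idx) for idx, free in free_mem.items() if free >= request]
    if not candidates:
        return None
    _, idx = min(candidates)
    return idx


def schedule(
    free_mem: Dict[int, int], requests: List[Tuple[str, int]]
) -> Dict[str, Optional[int]]:
    """Place containers one by one, deducting memory from the chosen GPU."""
    remaining = dict(free_mem)  # do not mutate the caller's view
    placement: Dict[str, Optional[int]] = {}
    for name, request in requests:
        gpu = pick_gpu(remaining, request)
        placement[name] = gpu
        if gpu is not None:
            remaining[gpu] -= request
    return placement


# Two GPUs with 16 and 8 units of free memory, four containers.
placement = schedule({0: 16, 1: 8}, [("a", 6), ("b", 4), ("c", 10), ("d", 8)])
```

Container `d` is left unplaced because no single GPU has 8 units left, which shows why scheduling-level accounting alone is not isolation: nothing here stops a placed container from using more memory than it requested.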

Q10: Does ExternalDNS currently support Alibaba Cloud's DNS service, and to what extent?

A10: Alibaba Cloud's DNS service PrivateZone is currently supported. Records for Services/Pods in the K8s cluster can be synchronized into the DNS service, which reduces the overhead of CoreDNS deployed in the cluster.

Q11: What is the main difference between Alibaba Cloud's version of NGINX Ingress and the official community version?

A11: Alibaba Cloud has implemented more advanced features on top of the community version, such as dynamic updates of the NGINX server configuration and support for mixed grayscale (canary) release strategies based on headers, cookies, request parameters, and weights.
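
A mixed grayscale strategy can be thought of as a chain of checks that decides, per request, whether to send traffic to the new version. The sketch below is an illustration only: the rule structure, the names, and the precedence order (header, then cookie, then weight) are assumptions, and request-parameter matching is omitted for brevity.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CanaryRule:
    """Hypothetical grayscale rule: exact-match header/cookie plus a weight."""
    header: Optional[Tuple[str, str]] = None  # (name, value) that forces canary
    cookie: Optional[Tuple[str, str]] = None  # (name, value) that forces canary
    weight: int = 0                           # percentage of remaining traffic, 0-100


def choose_backend(
    rule: CanaryRule,
    headers: Dict[str, str],
    cookies: Dict[str, str],
    request_key: str,
) -> str:
    """Return "canary" or "stable" for one request."""
    if rule.header and headers.get(rule.header[0]) == rule.header[1]:
        return "canary"
    if rule.cookie and cookies.get(rule.cookie[0]) == rule.cookie[1]:
        return "canary"
    # Hash a stable key (for example a client ID) into 100 buckets so the
    # same client consistently lands on the same side of the split.
    bucket = zlib.crc32(request_key.encode("utf-8")) % 100
    return "canary" if bucket < rule.weight else "stable"


rule = CanaryRule(header=("x-release", "gray"), cookie=("beta", "1"), weight=0)
```

Hashing a stable key instead of drawing a random number per request is a design choice that keeps each client on one version during the rollout.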

Q12: What is the release cycle of Alibaba Cloud Kubernetes and these open source kits?

A12: For major K8s versions, a stable version is released every six months. Bug fixes and security fixes are released as needed in between.

Q13: Has a stable commercial version of the edge edition, ACK@Edge, been released, and are there users running it?

A13: ACK@Edge can already be used in production environments, and users in fields such as online education, video, IoT, and CDN are already using it. The commercial version is expected to launch before June 2020.

Q14: Have host worker nodes run into cgroup memory leaks that cause Pods to fail with "cannot allocate memory"? If so, how was it solved?

A14: The cgroup driver used by the container service is the systemd cgroup driver, and this problem has not been encountered so far.

Q15: Are a Pod's CPU and memory resources isolated from the host? How is the isolation done?

A15: The kubelet can be used to reserve resources for the host, so that the resources of the Pod will be limited to the remaining resource space to achieve isolation.
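
The arithmetic behind this reservation is simple: what Pods can use is the node's capacity minus what is set aside. The sketch below follows the upstream Kubernetes node allocatable model (capacity minus kube-reserved, system-reserved, and the hard eviction threshold); the function name and the sample numbers are illustrative assumptions.

```python
def allocatable(
    capacity: int,
    kube_reserved: int = 0,
    system_reserved: int = 0,
    eviction_threshold: int = 0,
) -> int:
    """Resources left for Pods after reservations, never below zero.

    All arguments must use the same unit, for example MiB of memory
    or millicores of CPU.
    """
    remaining = capacity - kube_reserved - system_reserved - eviction_threshold
    return max(remaining, 0)


# A 16 GiB node with 1 GiB for Kubernetes daemons, 512 MiB for system
# daemons, and a 100 MiB hard eviction threshold (sample values, in MiB).
pod_memory_mib = allocatable(16384, kube_reserved=1024, system_reserved=512,
                             eviction_threshold=100)
```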

Q16: AWS has eksctl. Does Alibaba Cloud have a corresponding tool, something like ackctl?

A16: See the documentation.

Q17: How well does Alibaba Cloud support Windows containers?

A17: Windows 1809 is currently supported, and 1903 will be supported soon. Adding Windows nodes to a Linux cluster is also supported.

Q18: Can an open source component be used on its own and integrated into an existing K8s cluster?

A18: Yes, as long as the existing K8s cluster fully passes the K8s Conformance tests.

" Alibaba Cloud Native focuses on micro-services, serverless, containers, Service Mesh and other technical fields, focuses on popular cloud-native technology trends, and implements large-scale implementation of cloud-native, and is the technology circle that understands cloud-native developers best."
