K8s from Confusion to Proficiency: Cluster Networking in Detail

Author | Shengdong, Alibaba Cloud after-sales technical expert

Overview: Alibaba Cloud K8S clusters currently offer two networking options: one is the flannel solution; the other is the terway solution, based on calico and elastic network interfaces (ENI). Terway is similar to flannel; the difference is that terway supports elastic network interfaces for Pods and supports the NetworkPolicy feature. Using flannel as the example, this article analyzes in depth how Alibaba Cloud K8S cluster networking is implemented. The analysis is based on the current version, 1.12.6.

Bird's-eye view

Overall, once the network of an Alibaba Cloud K8S cluster has been configured, it looks like the figure below. It includes the cluster CIDR, the VPC route table, the node network, each node's podCIDR, the virtual bridge cni0 on each node, and the veth devices that connect Pods to the bridge.

[Figure 1]

You may have seen similar diagrams in many articles, but because the configuration is so complex, they are hard to understand. Here we look at the logic behind these configurations.

Basically, these configurations can be understood in three parts: cluster configuration, node configuration, and Pod configuration. These three parts correspond to three levels of dividing the cluster's IP address space: first the cluster CIDR; then a podCIDR (a subnet of the cluster CIDR) assigned to each node; and finally, within each podCIDR, an IP address assigned to each Pod.
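The three-level nesting can be checked with a few lines of Python. This is only a sketch: the cluster CIDR 172.16.0.0/16 and the Pod IP 172.16.0.130 are assumed example values that do not come from the article, and describe is a hypothetical helper name.

```python
import ipaddress

# Assumed example values. Only the /25 node subnet appears later in the article.
CLUSTER_CIDR = ipaddress.ip_network("172.16.0.0/16")     # assumption
NODE_POD_CIDR = ipaddress.ip_network("172.16.0.128/25")  # a node's podCIDR
POD_IP = ipaddress.ip_address("172.16.0.130")            # assumption


def describe(pod_ip, pod_cidr, cluster_cidr):
    """Check that the three levels nest: Pod IP in podCIDR in cluster CIDR."""
    return {
        "pod_cidr_in_cluster": pod_cidr.subnet_of(cluster_cidr),
        "pod_ip_in_pod_cidr": pod_ip in pod_cidr,
        "pod_ip_in_cluster": pod_ip in cluster_cidr,
    }


if __name__ == "__main__":
    for check, ok in describe(POD_IP, NODE_POD_CIDR, CLUSTER_CIDR).items():
        print(check, ok)
```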

[Figure 2]

Building the cluster network

Initial stage

A cluster is created on top of the cloud resources VPC and ECS. After the VPC and the ECS instances have been created, we have roughly the resource layout in the figure below: one VPC whose CIDR block is 192.168.0.0/16, and several ECS instances whose IP addresses are allocated from the VPC CIDR block.

[Figure 3]

Cluster stage

On top of these initial resources, we use the cluster creation console to obtain the cluster CIDR. This value is passed as a parameter to the node provisioning script, which in turn passes it to the cluster bootstrap tool kubeadm. Finally, kubeadm writes the parameter into kube-controller-manager.yaml, the static Pod manifest of the cluster controller.
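In the manifest, the parameter ends up as a command-line flag of kube-controller-manager. The sketch below pulls such flags out of a command list. The flag names --cluster-cidr, --allocate-node-cidrs and --node-cidr-mask-size are real kube-controller-manager options, but the values, and the assumption that all three are set in this particular cluster, are illustrative only. parse_flags and COMMAND are hypothetical names.

```python
def parse_flags(command):
    """Collect --key=value arguments from a container command list."""
    flags = {}
    for arg in command:
        if arg.startswith("--") and "=" in arg:
            key, value = arg[2:].split("=", 1)
            flags[key] = value
    return flags


# Hypothetical excerpt of the command list in kube-controller-manager.yaml.
# The values are assumptions chosen to match the /25 node subnets in the text.
COMMAND = [
    "kube-controller-manager",
    "--allocate-node-cidrs=true",
    "--cluster-cidr=172.16.0.0/16",
    "--node-cidr-mask-size=25",
]

if __name__ == "__main__":
    flags = parse_flags(COMMAND)
    print(flags["cluster-cidr"], flags["node-cidr-mask-size"])
```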

[Figure 4]

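With this parameter, when the kubelet on a node registers the node with the cluster, the cluster controller carves out a subnet for each registering node, that is, it assigns a podCIDR to each node. As shown in the figure above, Node B's subnet is 172.16.8.1/25 and Node A's subnet is 172.16.0.128/25. This value is recorded in the podCIDR field of the cluster's node object.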
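The subnet carving can be imitated with Python's ipaddress module. This is a simplified sketch, not the controller's actual allocator: it assumes a cluster CIDR of 172.16.0.0/16 and hands out /25 blocks in registration order, which the real controller is not guaranteed to do. The normalize helper shows that an address written with host bits set, such as 172.16.8.1/25, belongs to the network 172.16.8.0/25.

```python
import ipaddress


def allocate_pod_cidrs(cluster_cidr, node_names, mask_size=25):
    """Give each node the next free subnet of the cluster CIDR (simplified)."""
    cluster = ipaddress.ip_network(cluster_cidr)
    capacity = 2 ** (mask_size - cluster.prefixlen)
    if len(node_names) > capacity:
        raise ValueError("cluster CIDR too small for this many nodes")
    return dict(zip(node_names, cluster.subnets(new_prefix=mask_size)))


def normalize(cidr):
    """Return the network that an address/prefix pair belongs to."""
    return ipaddress.ip_network(cidr, strict=False)


if __name__ == "__main__":
    # Assumed cluster CIDR and hypothetical node names.
    nodes = ["node-1", "node-2", "node-3"]
    for name, subnet in allocate_pod_cidrs("172.16.0.0/16", nodes).items():
        print(name, subnet)
    print(normalize("172.16.8.1/25"))
```

With a /16 cluster CIDR and /25 node subnets there is room for 512 nodes, each with 128 addresses in its podCIDR.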

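Node stage

After the cluster stage, K8S has the cluster CIDR and a podCIDR for each node. On this basis, the cluster deploys flanneld to every node and goes on to build, on each node, the network framework that Pods can use. There are two main operations here:

  • First, the cluster configures VPC route table entries through the Cloud Controller Manager. There is one entry per node, and each entry means: if the VPC router receives a packet whose destination IP address belongs to a node's podCIDR, it forwards the packet to the corresponding ECS instance;
  • Second, the virtual bridge cni0 and the routes related to cni0 are created. The effect of this configuration is that packets entering the node from outside whose destination IP belongs to the podCIDR are forwarded by the node to the cni0 virtual LAN.

Note: in the actual implementation, cni0 is created by the flannel cni described in the next section, when the first Pod that uses the Pod network is scheduled onto the node. Logically, however, cni0 belongs to the node network rather than the Pod network, so it is described here.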
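The per-node route entries behave like an ordinary longest-prefix-match table. The sketch below models that lookup in Python. The table contents are assumptions: the instance names are hypothetical, and Node B's podCIDR is written in its normalized form 172.16.8.0/25.

```python
import ipaddress

# Hypothetical VPC route table: one entry per node, podCIDR -> ECS instance.
VPC_ROUTES = {
    "172.16.0.128/25": "ecs-node-a",
    "172.16.8.0/25": "ecs-node-b",
}


def next_hop(dst_ip, routes):
    """Return the ECS instance whose podCIDR contains dst_ip, else None."""
    dst = ipaddress.ip_address(dst_ip)
    best = None
    for cidr, target in routes.items():
        net = ipaddress.ip_network(cidr)
        # Prefer the most specific (longest) matching prefix.
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, target)
    return best[1] if best else None


if __name__ == "__main__":
    for ip in ("172.16.0.200", "172.16.8.5", "10.0.0.1"):
        print(ip, "->", next_hop(ip, VPC_ROUTES))
```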

[Figure 5]

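Pod stage

In the previous three stages, the cluster has in effect built the trunk road for network communication between Pods. At this point, if the cluster schedules a Pod onto a node, the kubelet uses flannel cni to create a network namespace and veth devices for the Pod, adds one of the veth devices to the cni0 virtual bridge, and configures an IP address on the veth device inside the Pod. The Pod is then connected to the trunk.

It is worth emphasizing that flanneld from the previous section and flannel cni in this section are two entirely separate components. flanneld is a pod deployed to every node by a daemonset, and its job is to build the network (the trunk). flannel cni is a CNI plugin installed through the kubernetes-cni rpm package when the node is created; it is invoked by the kubelet to set up networking for an individual pod (the branches). Understanding the difference helps us understand what the configuration files related to flanneld and flannel cni are for. For example, /run/flannel/subnet.env is an environment variable file created by flanneld that provides input to flannel cni. Likewise, /etc/cni/net.d/10-flannel.conf is a subnet configuration file for flannel cni that the flanneld pod (more precisely, the install-cni script in the pod) copies from the pod into the node's directory.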
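To make the role of /run/flannel/subnet.env more concrete, here is a small parser for a file of that shape. The variable names FLANNEL_NETWORK, FLANNEL_SUBNET, FLANNEL_MTU and FLANNEL_IPMASQ are the ones flanneld writes, but the sample values are assumptions, and parse_subnet_env and bridge_and_network are hypothetical helper names.

```python
import ipaddress

# Assumed sample contents. Real values depend on the cluster and the node.
SAMPLE = """\
FLANNEL_NETWORK=172.16.0.0/16
FLANNEL_SUBNET=172.16.0.129/25
FLANNEL_MTU=1500
FLANNEL_IPMASQ=true
"""


def parse_subnet_env(text):
    """Parse KEY=VALUE lines, skipping blanks and comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


def bridge_and_network(env):
    """Split FLANNEL_SUBNET into the bridge address and the node's podCIDR."""
    iface = ipaddress.ip_interface(env["FLANNEL_SUBNET"])
    return str(iface.ip), str(iface.network)


if __name__ == "__main__":
    env = parse_subnet_env(SAMPLE)
    print(bridge_and_network(env))
```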

[Figure 6]

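Communication

This completes the Pod network environment. On top of it, Pods can carry out four kinds of communication: local communication; communication between Pods on the same node; communication between Pods on different nodes; and communication between a Pod and entities outside the Pod network.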

[Figure 7]

Local communication means communication between different containers inside a Pod. Because the containers in a Pod share one network stack, they can communicate through the loopback device.

Communication between Pods on the same node takes place inside the cni0 virtual bridge, which is equivalent to communication between devices on a layer-2 LAN.

Communication between Pods on different nodes is slightly more complex, but still intuitive. On the sending side, the packet goes through the gateway of the cni0 bridge to the node, and is then sent to the VPC router through the node's eth0. No encapsulation is applied to the packet along the way. When the VPC router receives the packet, it looks up its route table to determine the destination and sends the packet to the ECS instance of the corresponding node. Once the packet reaches that node, the cni0 route created by flanneld sends it into the destination cni0 LAN and on to the destination Pod.

In the last case, communication between a Pod and entities outside the Pod network, packets need SNAT performed by iptables rules on the node. These rules are configured by flanneld according to its --ip-masq command-line option.
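The four cases can be summarized as a decision on the destination address, seen from a Pod on a given node. The following is a rough model rather than how the kernel actually makes the decision. The CIDR values are the assumed examples 172.16.0.128/25 for the local podCIDR and 172.16.0.0/16 for the cluster CIDR, and classify is a hypothetical function name.

```python
import ipaddress


def classify(src_ip, dst_ip, local_pod_cidr, cluster_cidr):
    """Classify traffic sent by a Pod (src_ip) running on the local node."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    pod_cidr = ipaddress.ip_network(local_pod_cidr)
    cluster = ipaddress.ip_network(cluster_cidr)

    if dst.is_loopback or dst == src:
        return "local"        # containers in one Pod, via loopback
    if dst in pod_cidr:
        return "same-node"    # layer-2 forwarding on the cni0 bridge
    if dst in cluster:
        return "cross-node"   # eth0 -> VPC route table, no encapsulation
    return "external"         # leaves the Pod network, SNAT on the node


if __name__ == "__main__":
    args = ("172.16.0.128/25", "172.16.0.0/16")  # assumed CIDRs
    for dst in ("127.0.0.1", "172.16.0.131", "172.16.8.5", "192.168.0.10"):
        print(dst, classify("172.16.0.130", dst, *args))
```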

Summary

That covers how the Alibaba Cloud K8S cluster network is built and how communication works in it. We analyzed the cluster network mainly from two angles, network construction and communication. Network construction consists of the initial stage, the cluster stage, the node stage, and the Pod stage; this breakdown helps in understanding these complex configurations. Once each piece of configuration is understood, the principles of cluster communication become much easier to grasp.

Origin www.cnblogs.com/alisystemsoftware/p/11635248.html