[Cloud Native] K8S Binary Deployment II: Deploying CNI Network Components

1. K8S provides three major interfaces


1.1 Container Runtime Interface (CRI)

What problem does it solve?
Container images (files that bundle an application and its specification) must be launched in a standardized, secure, and isolated manner:

  • Standardized, because the same operating rules must apply no matter where the containers run.

  • Secure, because you don't want anyone who shouldn't have access to be able to manipulate them.

  • Isolated, because you don't want an application to affect other applications, or be affected by them (for example, a crash of another application on the same node should not bring it down). Isolation essentially acts as protection. The runtime must also enforce resource constraints such as CPU, storage, and memory for the application (see the sketch below).
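As a concrete illustration of those resource constraints, here is a minimal sketch (the Pod name, image, and values are hypothetical, and the spec is standard Kubernetes rather than anything specific to this deployment) that asks the runtime to cap a container's CPU and memory:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: limits-demo            # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx:1.21
    resources:
      requests:                # the scheduler reserves at least this much
        cpu: "100m"
        memory: "64Mi"
      limits:                  # the runtime enforces these upper bounds
        cpu: "250m"
        memory: "128Mi"
EOF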

Tools

  • docker
  • containerd
  • podman
  • cri-o
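If a CRI-compatible runtime such as containerd is in use, the crictl tool talks to it directly over the CRI. A small sketch, assuming containerd listens on its default socket (adjust the endpoint for your runtime):

#Pull an image, list images, and list containers through the CRI
crictl --runtime-endpoint unix:///run/containerd/containerd.sock pull nginx:1.21
crictl --runtime-endpoint unix:///run/containerd/containerd.sock images
crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a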

1.2 Container Network Interface (CNI)

What is cloud-native networking?

  • It creates a virtual network dedicated to application communication on top of the existing network, called an overlay network.

What problem does it solve?

  • It provides a dedicated network over which independent containers communicate with each other privately.
  • It uses software to control, inspect, and modify data flows, managing and securing the connections between containers and meeting their isolation requirements.
  • The programmability and declarative nature of cloud-native networking make it possible to extend container networking with network policy (see the sketch below).
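As an example of that declarative nature, a minimal NetworkPolicy sketch (the policy name and labels are hypothetical; note that such a policy is only enforced by plug-ins with policy support, such as Calico or Cilium, not by plain Flannel):

kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend    # hypothetical policy name
spec:
  podSelector:
    matchLabels:
      app: backend                # hypothetical label: the Pods being protected
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend           # only Pods with this label may connect
    ports:
    - protocol: TCP
      port: 8080
EOF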

How is it solved?

  • By using tools such as Flannel, Calico, Cilium, etc. (the sketch below shows where a CNI plug-in lives on a node).
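On a node you can see which CNI plug-in is active by looking at the two conventional CNI paths (in this article, these directories are populated later, when Flannel or Calico is deployed):

ls /opt/cni/bin          #plugin binaries: bridge, flannel, host-local, ...
cat /etc/cni/net.d/*     #the network configuration that kubelet hands to the plugin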

1.3 Container Storage Interface (CSI)

What is storage?

  • Storage is where an application's persistent data resides, often referred to as a persistent volume. Easy access to persistent volumes is critical for an application to run reliably. Generally, persistent data means any data that must not disappear when the application is restarted.

What problem does it solve?

  • To store data you need hardware, specifically disks. Disks, like any other piece of hardware, are subject to infrastructure constraints. This is the first challenge.

  • The second challenge is the storage interface. Previously, each infrastructure had its own storage solution with its own interfaces, which made portability very difficult.

  • A third challenge is that today's applications must provision storage in an automated fashion in order to benefit from the cloud's elasticity.

  • Cloud-native storage is tailor-made for these new cloud-native challenges.

How is it solved?

Cloud-native storage tools help to (a sketch follows this list):

a) Provide cloud-native storage options for containers

b) Standardize the interface between containers and storage providers

c) Provide data protection through backup and restore operations
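Points a) and the automated-provisioning challenge above look like this in practice: a PersistentVolumeClaim declares the storage an application needs and a provisioner satisfies it. A minimal sketch, assuming the cluster already has a StorageClass (the name nfs-client here is hypothetical; adjust to your cluster):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim               # hypothetical claim name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: nfs-client   # assumed StorageClass; adjust to your cluster
  resources:
    requests:
      storage: 5Gi               # the provisioner creates a volume of this size
EOF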

Tools

  • Ceph
  • NFS
  • GFS
  • S3

2. Flannel network plug-in


2.1 Pod network communication in K8S


Container-to-container communication within a Pod

Containers in the same Pod (a Pod's containers never span hosts) share the same network namespace, so it is as if they were on the same machine, and they can reach each other's ports through the localhost address (see the sketch below).
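A minimal sketch of this shared namespace (the Pod name and images are hypothetical): two containers in one Pod, where the sidecar reaches nginx over localhost.

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: shared-net-demo          # hypothetical name
spec:
  containers:
  - name: web
    image: nginx:1.21
  - name: sidecar
    image: busybox:1.34
    command: ["sleep", "3600"]
EOF

#Both containers share the Pod's network namespace, so 127.0.0.1 reaches the nginx container
kubectl exec shared-net-demo -c sidecar -- wget -qO- http://127.0.0.1:80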

Communication between Pods in the same Node

Each Pod has a real, cluster-wide IP address. Different Pods on the same Node can communicate directly using each other's IP addresses: Pod1 and Pod2 are both attached to the same docker0 bridge through veth pairs and sit on the same network segment, so they can talk to each other directly (a quick check is sketched below).
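A quick way to verify this (the Pod name and IP below are hypothetical placeholders, and the exec assumes the image contains ping, e.g. busybox):

#List Pod IPs and the Nodes they landed on
kubectl get pods -o wide

#From one Pod, reach another Pod on the same Node directly by its IP
kubectl exec pod1 -- ping -c 2 172.17.0.3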

Communication between Pods on different Nodes

Pod addresses are on the same network segment as docker0, but the docker0 segment and the host's network card are two different segments, and different Nodes can only reach each other through the host's physical network card. To let Pods on different Nodes communicate, the traffic must therefore be addressed and carried via the IP of the host's physical network card. Two conditions must be met: Pod IPs must not conflict, and each Pod's IP must be associated with the IP of the Node it lives on, so that Pods on different Nodes can communicate directly through this association (the route inspection sketched below shows the result once Flannel is running).
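Once a network plug-in such as Flannel satisfies these two conditions, the association is visible in the host routing table. A sketch of what to look for on any node after the deployment later in this article (the interface name follows Flannel's vxlan convention):

#Each remote Node's Pod subnet gets a route via the flannel interface
ip route

#Details of the tunnel device that carries cross-Node Pod traffic
ip -d link show flannel.1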

2.2 Overlay Network

An overlay network is a virtual network technology layered on top of a layer-2 or layer-3 underlay network; the hosts in it are connected through virtual link tunnels (similar to a VPN).

2.3 VXLAN

VXLAN encapsulates the source data packet in UDP, wraps it in an outer header built from the underlay network's IP/MAC addresses, and transmits it over the Ethernet. When it arrives, the tunnel endpoint decapsulates the packet and delivers the data to the target address.
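For illustration only (Flannel automates all of this), a VXLAN tunnel endpoint can be created by hand with iproute2; the VNI, multicast group, and subnet below are arbitrary:

#Create a VXLAN device with VNI 100 on top of eth0, using the standard VXLAN port 4789
ip link add vxlan100 type vxlan id 100 group 239.1.1.1 dev eth0 dstport 4789
#Give the overlay its own layer-3 address and bring it up
ip addr add 10.0.100.1/24 dev vxlan100
ip link set vxlan100 up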

2.4 Flannel

Flannel's job is to give the Docker containers created on the cluster's different node hosts virtual IP addresses that are unique across the whole cluster. Flannel is a kind of overlay network: it encapsulates the source data packet inside another network packet for routing, forwarding, and communication. It currently supports three data forwarding modes: udp, vxlan, and host-gw (the backend is selected in Flannel's configuration, sketched below).
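The forwarding mode is chosen in the net-conf.json section of the kube-flannel.yml manifest used later in this article. A typical fragment (10.244.0.0/16 is Flannel's common default and must match your cluster's Pod CIDR):

net-conf.json: |
  {
    "Network": "10.244.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }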

3. Working Principle of the Flannel UDP Mode

  • After data leaves the source container of a Pod on node01, it is forwarded through the host's docker0 virtual interface to the flannel0 TUN interface (the udp mode uses a TUN device named flannel0; flannel.1 is the vxlan-mode device), and the flanneld service listens on the other end of that interface.
  • Flannel maintains a routing table between nodes through the etcd service. The flanneld service on the source host node01 encapsulates the original data in UDP and, according to its own routing table, delivers it through the physical network card to the flanneld service on the destination node node02. After the data arrives it is unpacked, enters the destination node's flannel0 interface directly, is then forwarded to the destination host's docker0 interface, and is finally delivered by docker0 to the target container, just like local container communication.

3.1 What etcd provides for Flannel

  • It stores and manages the IP address segment resources that Flannel can allocate (each node also records its allocation locally; see the sketch after this list).
  • It watches the actual address of each Pod in etcd, and builds and maintains the Pod-to-node routing table in memory.
  • Because the udp mode forwards in user space and adds an extra step of tunnel encapsulation, its performance is worse than that of the vxlan mode, which forwards in kernel space.
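After flanneld has claimed a subnet for its node, it records the result locally. The values below are illustrative, but the file path is Flannel's standard location:

cat /run/flannel/subnet.env
#FLANNEL_NETWORK=10.244.0.0/16    (the cluster-wide Pod network)
#FLANNEL_SUBNET=10.244.1.1/24     (the slice allocated to this node)
#FLANNEL_MTU=1450                 (reduced to leave room for the tunnel header)
#FLANNEL_IPMASQ=true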

4. VXLAN Mode


VXLAN is an overlay (virtual tunnel communication) technology that builds a virtual layer-2 network on top of a layer-3 network. Its concrete implementation differs from the udp mode:

  • (1) The udp mode is implemented in user space: the data first passes through the TUN device to the application, the application performs the tunnel encapsulation, and the packet then enters the kernel protocol stack once more. vxlan, by contrast, is implemented in the kernel and passes through the protocol stack only once, with the vxlan packet assembled inside the protocol stack.
  • (2) The TUN device used by the udp mode forwards at layer 3; using TUN builds a layer-3 network on top of the physical network, i.e. IP-in-UDP. The vxlan mode is a layer-2 implementation whose overlay carries layer-2 frames, i.e. MAC-in-UDP.
  • (3) Because vxlan uses the MAC-in-UDP approach, its implementation involves layer-2 mechanics such as MAC address learning and ARP broadcasts, while the udp mode mainly deals with routing.

4.1 Working Principle of the Flannel VXLAN Mode

  • vxlan is implemented in the kernel. When a data packet is sent out through the vxlan device it is tagged with a VXLAN header; after it is sent, the peer decapsulates it, and the flannel.1 interface delivers the original packet to the destination server (the commands below show the kernel-side artifacts).
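These kernel-side artifacts can be inspected on any node once Flannel is running; flannel.1 and the defaults mentioned in the comments are Flannel's conventions:

#The VXLAN device itself (Flannel uses VNI 1 and UDP port 8472 by default)
ip -d link show flannel.1
#MAC-to-VTEP forwarding entries that flanneld maintains for the peer nodes
bridge fdb show dev flannel.1
#ARP entries for the peers' flannel.1 addresses
ip neigh show dev flannel.1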

5. Deploy Flannel

5.1 Operations on the node01 node

#Upload cni-plugins-linux-amd64-v0.8.6.tgz and flannel.tar to the /opt directory
cd /opt/
docker load -i flannel.tar

mkdir -p /opt/cni/bin
tar zxvf cni-plugins-linux-amd64-v0.8.6.tgz -C /opt/cni/bin
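Before moving on, a quick sanity check that both pieces are in place:

#Verify the Flannel image was loaded and the CNI plugin binaries were unpacked
docker images | grep flannel
ls /opt/cni/bin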


5.2 Operations on the master01 node

#Upload the kube-flannel.yml file to the /opt/k8s directory and deploy the CNI network
cd /opt/k8s
kubectl apply -f kube-flannel.yml 

kubectl get pods -n kube-system
NAME                    READY   STATUS    RESTARTS   AGE
kube-flannel-ds-hjtc7   1/1     Running   0          7s

kubectl get nodes
NAME            STATUS   ROLES    AGE   VERSION
192.168.80.11   Ready    <none>   81m   v1.20.11
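Optionally, confirm on a node that Flannel created its tunnel interface with an address from the Pod network:

#The flannel.1 interface should now exist, with an address from this node's Pod subnet
ip addr show flannel.1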


6. Calico

6.1 Comparison of K8s networking solutions

The Flannel solution

On each node, the packets sent to containers must be encapsulated and then carried through a tunnel to the Node running the target Pod. That Node then removes the encapsulation and delivers the inner packet to the target Pod. Data communication performance is therefore significantly affected.

The Calico solution

Calico does not use tunnels or NAT for forwarding. Instead, it treats each host as a router on the Internet, uses BGP to synchronize routes, and uses iptables to implement security access policies, completing cross-host forwarding.

6.2 How Calico works

  • Calico maintains the communication of each Pod through the routing table. Calico's CNI plug-in sets up a veth pair device for each container and connects the other end to the host network namespace. Since there is no bridge, the CNI plug-in also has to configure, on the host, a routing rule for each container's veth pair device so that incoming IP packets can be received.

  • With such a veth pair device, an IP packet sent by the container reaches the host through the veth pair; the host then sends it to the correct gateway according to the next-hop address in the routing rules, from where it reaches the target host and finally the target container.

  • These routing rules are maintained and configured by Felix, while the routing information is distributed over BGP by Calico's BIRD component. Calico effectively treats all the nodes in the cluster as border routers: they form a fully interconnected network and exchange routes with each other through BGP. These nodes are called BGP Peers (a quick peering check is sketched after this list).

  • At present, Flannel and Calico are the most commonly used plug-ins. Flannel's functionality is comparatively simple, and it cannot configure complex network policies. Calico is an excellent network management plug-in, but complex network configuration capability often means that its own configuration is also more complex. Relatively small and simple clusters therefore tend to use Flannel; if future expansion is expected, with more devices to add and more network policies to configure, Calico is the better choice.
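If the calicoctl command-line tool is installed (it is distributed separately from the calico.yaml manifest deployed below), the BGP peering between nodes can be checked directly:

#Shows this node's BGP peers and whether the sessions are Established
calicoctl node status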

6.3 Deploy Calico

Operate on the master01 node

#Upload the calico.yaml file to the /opt/k8s directory and deploy the CNI network
cd /opt/k8s
vim calico.yaml
#Modify the Pod network definition (CALICO_IPV4POOL_CIDR) so that it matches the cluster-cidr network segment specified earlier in the kube-controller-manager configuration file
    - name: CALICO_IPV4POOL_CIDR
      value: "192.168.0.0/16"
  
kubectl apply -f calico.yaml

kubectl get pods -n kube-system
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-659bd7879c-4h8vk   1/1     Running   0          58s
calico-node-nsm6b                          1/1     Running   0          58s
calico-node-tdt8v                          1/1     Running   0          58s

#Once the Calico Pods are all Running, the nodes will also become Ready
kubectl get nodes

Deploying the node02 node

//Operate on the node01 node
cd /opt/
scp kubelet.sh proxy.sh [email protected]:/opt/
scp -r /opt/cni [email protected]:/opt/

//Operate on the node02 node
#Start the kubelet service
cd /opt/
chmod +x kubelet.sh
./kubelet.sh 192.168.80.12

//Operate on the master01 node
kubectl get csr
NAME                                                   AGE  SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-BbqEh6LvhD4R6YdDUeEPthkb6T_CJDcpVsmdvnh81y0   10s  kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending
node-csr-duiobEzQ0R93HsULoS9NT9JaQylMmid_nBF3Ei3NtFE   85m  kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued

#Approve the CSR request
kubectl certificate approve node-csr-BbqEh6LvhD4R6YdDUeEPthkb6T_CJDcpVsmdvnh81y0

kubectl get csr
NAME                                                   AGE  SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-BbqEh6LvhD4R6YdDUeEPthkb6T_CJDcpVsmdvnh81y0   23s  kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued
node-csr-duiobEzQ0R93HsULoS9NT9JaQylMmid_nBF3Ei3NtFE   85m  kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued

#Load the ipvs modules
for i in $(ls /usr/lib/modules/$(uname -r)/kernel/net/netfilter/ipvs|grep -o "^[^.]*");do echo $i; /sbin/modinfo -F filename $i >/dev/null 2>&1 && /sbin/modprobe $i;done

#Use the proxy.sh script to start the proxy service
cd /opt/
chmod +x proxy.sh
./proxy.sh 192.168.80.12

#Check the status of the nodes in the cluster
kubectl get nodes
