The k8s network plug-in Calico

Introduction to Calico

Calico official documentation: https://projectcalico.docs.tigera.io/getting-started/kubernetes/quickstart

Calico is an open-source networking and network security solution for connecting containers, virtual machines, and hosts. It is a pure Layer 3 virtual network solution: it treats each node as a virtual router, treats the Pods on a node as terminal devices sitting behind that node's router, and assigns each of them an IP address. The node routers exchange routing information through the BGP protocol and generate routing rules, thereby enabling communication between Pods on different nodes. As shown below:
[figure: Calico pure Layer 3 routing model]

Compared with Flannel, a significant advantage of Calico is its support for network policies, which let users define access control rules for traffic entering and leaving Pods and thus enforce security policies on Pod-to-Pod communication.
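
Calico enforces the standard Kubernetes NetworkPolicy API (and additionally provides its own projectcalico.org policy resources). As a minimal sketch, assuming hypothetical labels app=nginx and app=client, the following policy only allows Pods labelled app=client to reach nginx Pods on TCP port 80:

kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-client-to-nginx
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: client
      ports:
        - protocol: TCP
          port: 80
EOF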

BGP is a decentralized autonomous routing protocol that achieves reachability between autonomous systems by maintaining IP routing tables or "prefixes". It is usually used as a path-vector routing protocol in large-scale data centers to maintain routing information between different autonomous systems. Linux supports BGP routing through routing daemons such as BIRD, so a Linux host can be configured to act as a border gateway.

Calico regards the network formed by the Pods on each node as an autonomous system (AS), with each node acting as the border gateway of its AS. Nodes exchange routing information and generate routing rules through the BGP protocol. However, not every network environment supports BGP, and the BGP routing model requires all nodes to be in the same Layer 2 network, so Calico also supports overlay network models based on IPIP and VXLAN.

In addition, similar to Flannel's VXLAN backend with DirectRouting enabled, Calico also supports mixing the routing and overlay network models: the BGP routing model is used for high-performance communication within a Layer 2 network, while IPIP or VXLAN is used to forward Pod packets across nodes in different Layer 2 networks, as shown in the following figure:
[figure: Calico hybrid routing/overlay network model]

Calico Architecture

[figure: Calico architecture]
As shown in the above figure, the components of Calico mainly include Felix, etcd storage system, BIRD and BGP route reflector, etc. The functions of each component are as follows:

  • Felix: a daemon process running on each node. It is mainly responsible for core tasks such as interface management, route programming, ACL rules, and status reporting, thereby providing connectivity for each endpoint (VM or container):
    1. Interface management: responsible for creating network interfaces and writing interface configuration into the kernel so that the kernel can handle each endpoint's traffic; in particular, it ensures that the node's own MAC address is used to answer ARP requests from the workloads on the node, and it enables forwarding on the interfaces managed by Felix. Interface management also monitors interface changes to ensure that the rules remain correctly applied
    2. Route programming: responsible for writing routes for each endpoint on the current node into the kernel FIB (Forwarding Information Base), so that packets arriving at the node are correctly forwarded to the corresponding endpoint
    3. ACL rules: responsible for generating ACL rules in the Linux kernel to allow only compliant traffic between endpoints and to ensure that traffic cannot bypass the security rules
    4. Status reporting: responsible for reporting the health of the network, especially errors and problems on the nodes managed by Felix; these reports are stored in etcd for other components or administrators to use
  • etcd storage system: with etcd, Calico has a consistent view of cluster state and can scale out to handle growing access pressure without becoming a bottleneck itself. In addition, etcd also serves as the communication bus between Calico's components
  • BIRD: the BGP protocol client, responsible for distributing the routing information that Felix writes into the kernel, announcing it to the rest of the network
  • BGP route reflector: Calico's BGP routing model uses the node-to-node mesh mode by default. As the number of nodes grows, the number of connections between nodes increases rapidly, which puts considerable pressure on the cluster network. Large clusters are therefore generally advised to use the BGP route reflector mode for route learning, converting BGP's point-to-point communication into a model centered on the reflector nodes. For redundancy, multiple BGP route reflectors should be configured in production environments. In Calico, the BGP client program can also be configured to act as a route reflector in addition to being a client (a configuration sketch follows below).
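
As a hedged sketch based on the projectcalico.org/v3 API, converting a cluster to route reflector mode typically means marking one or more nodes as reflectors, disabling the default node-to-node mesh, and having all nodes peer with the reflector nodes. The node name node-01, the cluster ID, and the route-reflector label below are illustrative:

# Mark one node as a route reflector (values are illustrative)
calicoctl patch node node-01 -p '{"spec": {"bgp": {"routeReflectorClusterID": "244.0.0.1"}}}'
kubectl label node node-01 route-reflector=true

# Disable the default full node-to-node mesh
calicoctl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false
  asNumber: 64512
EOF

# Let every node peer with the route reflector nodes
calicoctl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: peer-with-route-reflectors
spec:
  nodeSelector: all()
  peerSelector: route-reflector == 'true'
EOF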

In addition, Calico abstracts its key configuration into resource types and allows users to define resource objects on demand to configure the system. These resource objects are stored in the Datastore, which can be a standalone etcd or the etcd used by the k8s cluster. There are more than a dozen Calico-specific resource types, including IPPool (IP address pool), NetworkPolicy (network policy), BGPConfiguration (BGP configuration parameters), and so on. Similar to Kubernetes API resources, these resources are defined in YAML or JSON using top-level fields such as apiVersion, kind, metadata, and spec, and can be managed with the calicoctl client tool. With the help of CRDs, they can also be managed through kubectl.
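
For example, a sketch of an IPPool resource, using the CIDR and block size that match the example Pod addresses later in this post (assumed values; adjust for your own cluster), managed here with calicoctl:

calicoctl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 10.244.0.0/16      # Pod address pool
  blockSize: 26            # mask length of the per-node address block
  ipipMode: Always         # IPIP mode: Always / CrossSubnet / Never
  vxlanMode: Never
  natOutgoing: true        # SNAT traffic leaving the Pod network
EOF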

Calico deployment

When actually deployed on a k8s cluster, Calico is split into two components, calico-node and calico-kube-controllers, which read their own resource definitions from the Datastore to complete their configuration:

  • calico-node: the agent that Calico runs on each node of the k8s cluster, responsible for running daemons such as Felix, bird/bird6 and confd
  • calico-kube-controllers: Calico's custom controllers running on the k8s cluster; they are the plug-in through which Calico cooperates with k8s
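
A hedged sketch of how the two components can be inspected, using the labels from the official manifest (with the calico.yaml manifest install they run in the kube-system namespace):

kubectl get pods -n kube-system -l k8s-app=calico-node -o wide
kubectl get pods -n kube-system -l k8s-app=calico-kube-controllers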

Download the deployment file from the official website:
https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml

curl https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml -O

Modify the deployment file to adapt it to the k8s cluster environment.
First, modify the value of the CALICO_IPV4POOL_CIDR variable in the deployment file and set it to the pod CIDR that was specified when the k8s cluster was initialized, for example:
[screenshot: CALICO_IPV4POOL_CIDR set to the cluster's pod CIDR]
Then modify the value of the CALICO_IPV4POOL_BLOCK_SIZE variable, which specifies the mask length of the address blocks Calico assigns to each node (default 26):
[screenshot: CALICO_IPV4POOL_BLOCK_SIZE setting]
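
Both variables sit in the env list of the calico-node container in calico.yaml; depending on the manifest version they may be commented out or absent and need to be uncommented or added. A hedged example, using a pod CIDR that matches the example Pod addresses used later in this post:

# Locate the variables in the manifest before applying it
grep -nE -A1 "CALICO_IPV4POOL_CIDR|CALICO_IPV4POOL_BLOCK_SIZE" calico.yaml

# After editing, the entries should look roughly like this:
#   - name: CALICO_IPV4POOL_CIDR
#     value: "10.244.0.0/16"
#   - name: CALICO_IPV4POOL_BLOCK_SIZE
#     value: "26"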

Calico's deployment file uses the IPIP tunnel mode by default, so the default is kept here and no modification is made. If you want to use the pure BGP routing mode or the hybrid mode, modify the value of the CALICO_IPV4POOL_IPIP variable; the available values are as follows:

  • Always: use only the IPIP tunnel network (the default value)
  • Never: do not use the IPIP tunnel network
  • CrossSubnet: enable the hybrid network mode (BGP routing within a subnet, IPIP across subnets)
    [screenshot: CALICO_IPV4POOL_IPIP setting in the deployment file]

If you want to use a VXLAN tunnel network instead of the IPIP tunnel network, modify the value of the CALICO_IPV4POOL_VXLAN variable; its available values and logic are the same as those of CALICO_IPV4POOL_IPIP.
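
A hedged example of switching to the hybrid mode by editing the corresponding env entries in calico.yaml before applying it (only one of the two tunnel mechanisms should be enabled for the pool):

grep -nE -A1 "CALICO_IPV4POOL_IPIP|CALICO_IPV4POOL_VXLAN" calico.yaml

# For the hybrid mode, for example:
#   - name: CALICO_IPV4POOL_IPIP
#     value: "CrossSubnet"
#   - name: CALICO_IPV4POOL_VXLAN
#     value: "Never"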

For more variables and their descriptions, refer to the official documentation: https://projectcalico.docs.tigera.io/reference/node/configuration

Apply the deployment file to the cluster and wait until the Calico-related Pods are running without errors, which means Calico has been deployed successfully.

kubectl apply -f calico.yaml

[screenshot: Calico Pods running successfully in the cluster]
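
A hedged verification sketch (calicoctl must be installed separately; pool and node names depend on the cluster):

kubectl get pods -n kube-system | grep calico   # all Calico Pods should be Running
kubectl get nodes                               # nodes should report Ready once the CNI is up
calicoctl get ippool -o wide                    # shows the pool CIDR, block size and IPIP/VXLAN mode
calicoctl node status                           # BGP peering status with the other nodes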

IPIP tunnel network

When working in IPIP mode, Calico creates a tunl0 interface on each node as the tunnel ingress/egress device used to encapsulate IPIP tunnel packets. Calico also creates a veth pair for each Pod: one end serves as the Pod's network interface, while the other end (named calixxx) stays in the host's network namespace; no virtual bridge is used. As shown in the figure below:
[figure: IPIP mode with tunl0 and per-Pod calixxx veth interfaces]
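
A sketch of how these devices can be inspected on a node (the calixxx names vary per Pod):

ip -d link show tunl0        # the detail line identifies this as an ipip tunnel device
ip -o link show | grep cali  # one calixxx interface per Pod running on this node
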
The IPIP tunnel network still relies on BGP to maintain the nodes' routing information. After deployment, Calico generates routes on each node to the Pod subnets of the other nodes via the BGP protocol. For example, the routing information on node-01 below was produced by BIRD on each node advertising to the other nodes in a point-to-point manner and learning their advertisements in turn.
[screenshot: routes on node-01 to the Pod subnets of the other nodes]

For each Pod, Calico also generates a dedicated route entry on the node, ensuring that packets destined for the Pod's IP are delivered through the corresponding calixxx interface, since Calico, unlike Flannel, does not use a virtual bridge to forward packets. The relevant route entries are as follows:
[screenshot: per-Pod host routes via the calixxx interfaces]
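
A sketch of how these routes can be inspected on node-01 (the subnets and addresses follow the examples used in this post):

ip route | grep tunl0   # e.g. 10.244.89.0/26 via 192.168.211.12 dev tunl0 proto bird onlink
ip route | grep cali    # e.g. 10.244.169.3 dev calixxx scope link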

Pod communication process analysis
In the cluster, an nginx-pod (10.244.89.5) is running on node-02, and a client-pod (10.244.169.3) on node-01 sends requests to the nginx-pod. The process is roughly as follows:

  1. The client-pod sends a request; according to the default route inside the Pod, the packet is delivered to the corresponding calixxx interface on the host. At this point:
    src-ip : 10.244.169.3    dst-ip : 10.244.89.5
    src-mac : b2:65:39:08:87:b0    dst-mac : ee:ee:ee:ee:ee:ee

    Capturing packets on the calixxx interface shows the following:
    [screenshot: packet capture on the calixxx interface]

    At this point we notice something odd: checking the routing table inside the client-pod shows that the default gateway is 169.254.1.1, as shown in the figure below. But the address 169.254.1.1 does not exist anywhere, so what is going on?
    [screenshot: default route inside the client-pod pointing to 169.254.1.1]
    By normal networking logic, when a packet's destination address is not local, the sender looks up the gateway in its routing table, issues an ARP request to resolve the gateway's MAC, and then rewrites the packet's destination MAC. So it does not matter whether the gateway address really exists, as long as the ARP request can obtain a MAC for it.

    Therefore, check the MAC address corresponding to 169.254.1.1 inside the Pod, as shown below:
    [screenshot: ARP entry for 169.254.1.1 inside the Pod]

    The MAC address corresponding to 169.254.1.1 is ee:ee:ee:ee:ee:ee, which is the MAC address of the calixxx interface on the host. But this seems illogical: the calixxx interface on the host is not configured with the address 169.254.1.1, so why does the ARP request return its MAC?

    [screenshot: MAC address of the calixxx interface on the host]

    This is because the ARP proxy function is enabled on the calixxx interface, as shown below:

    [screenshot: proxy_arp enabled on the calixxx interface]
    Proxy ARP is a variant of the ARP protocol: when an ARP request targets an address beyond the local network segment, the gateway device that receives the request answers it with its own MAC address. Therefore, when the ARP request sent from the Pod reaches the calixxx interface, the calixxx interface simply replies with its own MAC.

  2. According to the host routing table, the packet is forwarded to the tunl0 interface for IPIP encapsulation.

    A packet capture on the tunl0 interface looks like this:
    [screenshot: packet capture on the tunl0 interface]
    At this point the outer MAC header has already been stripped; the packet is then IPIP-encapsulated.

  3. The encapsulated IPIP packet is sent out through the host interface ens33. At this point:
    inner-src-ip : 10.244.169.3    inner-dst-ip : 10.244.89.5
    outer-src-ip : 192.168.211.11    outer-dst-ip : 192.168.211.12
    src-mac : 00:0c:29:51:81:05    dst-mac : 00:0c:29:99:52:7b
    [screenshot: IPIP packet captured on ens33]

  4. After receiving the packet, node-02 decapsulates it and delivers it to the nginx-pod. (The commands that can be used to observe each of these steps are sketched below.)
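
A hedged sketch of the commands used to observe the steps above. The Pod names, the calixxx interface name, the host interface ens33 and the addresses all follow the examples in this post, and the client image is assumed to ship the ip tool:

kubectl exec client-pod -- ip route              # default via 169.254.1.1 dev eth0
kubectl exec client-pod -- ip neigh              # 169.254.1.1 resolves to ee:ee:ee:ee:ee:ee
cat /proc/sys/net/ipv4/conf/calixxx/proxy_arp    # 1 = proxy ARP enabled on the host-side veth
tcpdump -nn -i tunl0 host 10.244.89.5            # inner packet before IPIP encapsulation
tcpdump -nn -i ens33 ip proto 4                  # encapsulated IPIP packets between the nodes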

For the configuration and use of other modes of Calico, please refer to the official website: https://projectcalico.docs.tigera.io/about/about-calico

Origin blog.csdn.net/weixin_43266367/article/details/128018625