Study notes on Docker network solutions

Preface

No matter what new technology or new framework appears, there are two problems that we can never get around; even the operating system cannot avoid them. These two very basic issues are networking and storage.
Think back over the frameworks you have learned, and over operating system principles: can any of them bypass these two points? They are the foundation of all programs, and they are also key problems Docker has to solve. Today we will study Docker's network solutions together.

Network solutions in Docker

In Docker, there are three main solutions to the network problem, as follows:

  • Flannel
  • Weave
  • Calico

Their purpose is ultimately to solve the same problem: how do containers communicate with each other? Raise the problem one level and look at a kubernetes cluster: it solves exactly the same problem. kubernetes, however, adds a constraint: every network implementation must follow the CNI standard (a specification defined by the kubernetes community so that vendors can extend networking in a uniform way). The CNI specification can be summarized as three rules and four goals, as follows (a minimal config example follows the two lists below):

Three rules:

  • Pods can communicate with each other directly, without explicit use of NAT
  • Nodes can communicate with pods directly, without explicit use of NAT
  • The IP address a pod sees for itself is the same IP address others use to reach it; no explicit address translation in between

Four goals:

  • Container-to-container communication
  • Pod-to-pod communication
  • Pod-to-service communication
  • External-to-service communication
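
To make the standard concrete: a CNI-compliant runtime discovers networks from small JSON config files and then invokes the plugin binary named in them. Below is a minimal sketch of such a config, emitted from Python for illustration; the /etc/cni/net.d path follows the kubelet convention, the file name is made up, and flannel is just one possible plugin type:

```python
# Writes a minimal CNI network config. Assumptions: the flannel CNI
# plugin is installed, and the runtime (e.g. kubelet) scans
# /etc/cni/net.d for configs; the file name here is illustrative.
import json

cni_conf = {
    "cniVersion": "0.3.1",  # CNI spec version this config conforms to
    "name": "mynet",        # network name (chosen for this example)
    "type": "flannel",      # plugin binary the runtime will invoke
}

with open("/etc/cni/net.d/10-mynet.conf", "w") as f:
    json.dump(cni_conf, f, indent=2)
```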

In addition, the major cloud platforms generally combine these with their own networking to offer their own solutions; we will ignore those for now. Let's study the three solutions above one by one.

Flannel

flannel was developed by the CoreOS team. It was originally designed to re-plan the IP address usage rules for all nodes in a cluster, so that containers on different nodes obtain internal IP addresses that belong to the same internal network and never overlap, letting containers on different nodes communicate directly via those internal IPs!

The overall structure is as follows:
[Figure: Flannel overall architecture]

  • Flannel uses etcd to store configuration data and subnet allocation information. After flannel starts, the background process first retrieves the configuration and the list of subnets already in use, selects an available subnet, and tries to register it. etcd also stores the host IP corresponding to each subnet. flanneld uses etcd's watch mechanism to monitor all changes under /coreos.com/network/subnets and maintains a routing table based on that information.
  • Each host is assigned an IP segment with a fixed subnet size. For example, an overlay network can be configured to use the 10.100.0.0/16 segment, with each host holding a /24 subnet: host A might get 10.100.5.0/24 and host B might get 10.100.18.0/24. flannel uses etcd to maintain the mapping between the assigned subnets and the actual host addresses (a toy allocation sketch follows this list). On the data path, flannel encapsulates IP datagrams in UDP and forwards them to the remote host; UDP was chosen because it can penetrate firewalls.
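
As a toy illustration of this allocation scheme (the 10.100.0.0/16 pool and /24 host subnets are taken from the example above; a plain dict stands in for the etcd registry, and the function name is made up):

```python
# Toy sketch of flannel-style subnet allocation. The real flanneld
# registers its pick in etcd under /coreos.com/network/subnets with a
# lease; here a dict plays the role of that registry.
import ipaddress

pool = ipaddress.ip_network("10.100.0.0/16")
registry = {}  # subnet -> host IP

def acquire_subnet(host_ip: str) -> ipaddress.IPv4Network:
    """Pick the first /24 not yet registered and claim it for this host."""
    for subnet in pool.subnets(new_prefix=24):
        if subnet not in registry:
            registry[subnet] = host_ip   # in flannel: an etcd write with a TTL
            return subnet
    raise RuntimeError("address pool exhausted")

print(acquire_subnet("192.168.1.10"))   # 10.100.0.0/24
print(acquire_subnet("192.168.1.11"))   # 10.100.1.0/24
```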

A complete communication process is as follows:

  1. The data packet is sent from the source container, and docker0 forwards it to flannel0
  2. The flanneld daemon on the source host picks up this packet from flannel0. To work out which flanneld the packet should be delivered to, the source flanneld queries etcd for the routing information of the destination address
  3. The source flanneld then encapsulates the packet in UDP and, according to the routing table, delivers it to the flanneld on the peer host (a minimal sketch of this lookup-and-forward step follows this list)
  4. The peer flanneld receives the packet, decapsulates it, hands it to the flannel0 interface, and then forwards it to the docker0 bridge
  5. Finally docker0 routes it to the destination container
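
Here is a minimal sketch of steps 2 and 3: look up which host owns the destination subnet, then ship the raw IP datagram as the payload of a UDP packet. The routes dict stands in for the information flanneld learns from etcd; port 8285 is flannel's default UDP backend port, and everything else is illustrative:

```python
# Lookup-and-forward, flannel UDP backend style (illustrative sketch).
import socket
import ipaddress

# subnet -> host IP, as learned from etcd in the real flanneld
routes = {
    ipaddress.ip_network("10.100.18.0/24"): "192.168.1.11",
}

def forward(ip_datagram: bytes, dst_container_ip: str) -> None:
    dst = ipaddress.ip_address(dst_container_ip)
    for subnet, host in routes.items():
        if dst in subnet:
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            # the whole captured IP datagram rides as the UDP payload
            sock.sendto(ip_datagram, (host, 8285))
            sock.close()
            return
    raise LookupError(f"no route for {dst_container_ip}")

forward(b"\x45...captured IP datagram...", "10.100.18.7")
```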

Flannel summary:
flannel is essentially a kind of "overlay network": a network running on top of another network (an application-layer network). It does not rely on IP addresses alone to deliver messages; instead it uses a mapping mechanism between IP addresses and endpoint identifiers to locate resources. In other words, the original packet is wrapped inside another network's packets for forwarding and routing. flannel currently supports UDP, VxLAN, AWS VPC, and GCE route forwarding modes.

Weave

weave was developed by the company weaveworks. Its purpose is to solve Docker cross-host communication by connecting the containers on multiple nodes into one local area network. It requires no KV store.

A weave network is composed of a series of peers (WRouters) that live on different hosts. WRouters on different hosts are connected through the weave connect command, which means you can specify the cluster's network topology yourself.

The overall architecture diagram is as follows:
[Figure: Weave overall architecture]

  • On each host where Docker is deployed (whether a physical machine or a virtual machine), one WRouter runs, and it can also be deployed as a container. The weave network is composed of these weave routers as peer endpoints (peers), and the weave network topology can be customized through the command line.
  • Between every two WRouters, two connections are established (a minimal sketch of this wiring follows this list):
  1. A TCP connection, used for the handshake and for exchanging network topology information. Default port 6783.
  2. A UDP connection, used for data-plane traffic. Default port 6783.
  • On the data plane, weave implements an L2 overlay through UDP encapsulation. Encapsulation supports two modes:
  1. sleeve mode: user space. Packets intercepted on the Linux bridge through a pcap device are UDP-encapsulated by the WRouter. This mode supports L2 traffic encryption and partial connectivity, but at a significant performance cost.
  2. fastpath mode: kernel space. Encapsulation (VxLAN, MPLS) is done through OVS's ODP, and the WRouter is not directly involved in forwarding. This significantly improves throughput, but does not support encryption.
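
A minimal sketch of the per-peer wiring described above: one TCP connection for the control plane (handshake and topology exchange) and one UDP socket for the data plane, both on weave's default port 6783. The message contents and peer address are made up; this shows the shape of the wiring, not weave's actual wire protocol:

```python
# Shape-of-the-protocol sketch: control over TCP, data over UDP.
import socket

PEER = ("192.168.1.11", 6783)   # another host running a WRouter (assumed address)

# control plane: TCP handshake + topology exchange
ctl = socket.create_connection(PEER)
ctl.sendall(b"HELLO peer=192.168.1.10 topology-version=1\n")  # made-up message
print(ctl.recv(4096))

# data plane: UDP datagrams carrying encapsulated L2 frames
data = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
data.sendto(b"<encapsulated ethernet frame>", PEER)
```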

Sleeve mode looks like this:
[Figure: sleeve mode]

Fastpath mode looks like this:
[Figure: fastpath mode]

The above covers weave's basic information and general structure. Let's move on to a more detailed flow diagram:
[Figure: weave data path in detail]

  • weave requires each container to have two network cards: one connected to the weave bridge to handle L2 traffic, and the other connected to docker0
  • weave-bridge: the bridge created by weave; one end connects to the containers, the other end to the WRouter
  • docker0: Docker's native bridge, used for communication between the host and its containers; behind docker0 this is still implemented with iptables NAT

The steps for communication between containers in the above figure are as follows:

  1. container1 passes its traffic through veth1 to the host's weave-bridge
  2. The WRouter uses pcap to intercept packets on the bridge, excluding the traffic that the kernel forwards directly through the bridge (traffic within the local subnet, local container-to-container, and container-to-host). The captured packets are encapsulated by the WRouter into UDP packets and forwarded to the other hosts (a capture-and-forward sketch follows this list)
  3. On the other host, the WRouter receives the packet, decapsulates it, injects it through pcap into the bridge interface, and the bridge then forwards the traffic through the veth to the container
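
A stripped-down sketch of step 2: pull L2 frames off the bridge and tunnel them to a peer inside UDP. The real WRouter uses pcap and filters out locally bridged traffic; here a Linux AF_PACKET raw socket stands in (root required), and the bridge name and peer address are assumptions:

```python
# Capture frames from the bridge, ship them to a peer inside UDP.
import socket

ETH_P_ALL = 0x0003
cap = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
cap.bind(("weave-bridge", 0))           # bridge name taken from the figure above

tun = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
PEER = ("192.168.1.11", 6783)           # peer WRouter (assumed address)

while True:
    frame, _ = cap.recvfrom(65535)      # one ethernet frame from the bridge
    # real weave excludes traffic the kernel already bridges locally
    tun.sendto(frame, PEER)             # ship the frame inside a UDP datagram
```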

Calico

calico treats each operating system's protocol stack as a router and regards all containers as terminals connected to that router. The routers run the standard BGP routing protocol among themselves and learn the network topology, i.e. how to forward, on their own.

The calico solution is therefore a pure layer-3 solution (no overlay required): layer 3 on each node guarantees layer-3 connectivity both between two containers on the same machine and between two containers across hosts. It needs etcd to store network metadata.

The overall structure is as follows:
[Figure: Calico overall architecture]

  • Felix: the calico agent, responsible for configuring routes and ACLs (access control), etc.
  • etcd: network metadata storage, ensuring a consistent view of the calico network state
  • BGP Client (BIRD): mainly responsible for distributing the routing information that Felix writes into the kernel across the current calico network, ensuring that communication between workloads stays valid
  • BGP Route Reflector (BIRD): used in large-scale deployments; instead of a full mesh of interconnected nodes, one or more BGP Route Reflectors perform centralized route distribution
  • calico-ipam: mainly used by the kubernetes CNI plugin; not covered here

Each node runs two main programs. One is Felix: it watches the central etcd store and receives events from it. For example, when a user assigns an IP to this machine or schedules a container here, a container is created on this machine, Felix sets up its network card, IP, and MAC, and then writes an entry into the kernel routing table saying that this IP must go out through this network card. The other, the BGP client, is a standard routing program: it learns from the kernel which routes have changed and propagates them to all the other hosts through the standard BGP protocol, so that the outside world knows this IP lives here and routes traffic here accordingly (a toy simulation of this division of labor follows).
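
The division of labor can be sketched as a toy simulation: Felix reacts to a workload event by programming a host route, and the BGP client advertises that route to its peers. The dicts and function names here are invented stand-ins; real Felix writes kernel routes (roughly `ip route replace <ip>/32 dev <caliXXX>`) and BIRD speaks actual BGP:

```python
# Toy simulation of the two per-node programs. All names and data
# structures here are invented stand-ins for kernel routes and BGP.
kernel_routes = {}       # prefix -> outgoing interface (this node's table)
bgp_announcements = []   # prefixes advertised to BGP peers

def felix_on_workload_added(ip: str, ifname: str) -> None:
    """Felix: a container landed here; program a /32 host route."""
    prefix = f"{ip}/32"
    kernel_routes[prefix] = ifname   # real Felix: ip route replace <ip>/32 dev <ifname>
    bgp_client_on_route_change(prefix)

def bgp_client_on_route_change(prefix: str) -> None:
    """BGP client (BIRD): tell every peer the prefix lives on this node."""
    bgp_announcements.append(prefix)
    print(f"BGP UPDATE: {prefix} via this node")

felix_on_workload_added("10.244.1.5", "cali0ef24b1")  # made-up IP and veth name
print(kernel_routes)   # {'10.244.1.5/32': 'cali0ef24b1'}
```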

Calico network modes:

calico supports two network modes, as follows:

  1. IPinIP: roughly equivalent to a basic bridge between hosts; it is simply an IP tunnel (an encapsulation sketch follows this list)
  2. BGP: Border Gateway Protocol, the core decentralized autonomous routing protocol of the Internet
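
To show what IPinIP means on the wire, here is a hedged sketch that hand-builds an inner IPv4 header and lets the kernel wrap it in an outer header via a raw IPPROTO_IPIP socket (protocol number 4). Calico itself programs kernel ipip tunnel devices rather than crafting packets like this; the addresses are made up and root is required:

```python
# Illustrative IP-in-IP encapsulation (Linux, run as root).
import socket
import struct

def checksum(data: bytes) -> int:
    """Standard internet checksum over 16-bit words."""
    s = 0
    for i in range(0, len(data), 2):
        s += (data[i] << 8) + data[i + 1]
    while s >> 16:
        s = (s >> 16) + (s & 0xFFFF)
    return ~s & 0xFFFF

def ipv4_header(src: str, dst: str, payload_len: int, proto: int) -> bytes:
    """Build a minimal 20-byte IPv4 header for the inner packet."""
    hdr = struct.pack("!BBHHHBBH4s4s",
                      (4 << 4) | 5,         # version 4, IHL 5 (20 bytes)
                      0,                    # TOS
                      20 + payload_len,     # total length
                      0, 0,                 # ID, flags/fragment offset
                      64,                   # TTL
                      proto,                # inner protocol number
                      0,                    # checksum placeholder
                      socket.inet_aton(src),
                      socket.inet_aton(dst))
    return hdr[:10] + struct.pack("!H", checksum(hdr)) + hdr[12:]

# inner packet between two made-up container addresses; protocol 253 is
# reserved for experimentation, so nothing tries to parse the payload
payload = b"hello"
inner = ipv4_header("10.244.1.5", "10.244.2.9", len(payload), 253) + payload

# opening the raw socket with IPPROTO_IPIP makes the kernel prepend the
# outer IP header (protocol 4) itself; we only supply the inner packet
sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_IPIP)
sock.sendto(inner, ("192.168.1.20", 0))   # 192.168.1.20: the peer node
```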

Calico summary:

Because Calico is a pure layer-3 implementation, it avoids the packet-encapsulation work that layer-2 schemes require. There is no NAT in the middle and no overlay at all, so its forwarding efficiency may be the highest of all these schemes: packets go straight through the native TCP/IP protocol stack. Isolation is also easy to achieve for the same reason: the TCP/IP stack provides a complete set of firewall hooks, so more complex isolation logic can be implemented through iptables rules.

Summary

The advantages and disadvantages of the three solutions:

Flannel
  Advantages:
  1. Simple and stable
  2. No NAT needed
  3. Offers both an overlay mode and a pure layer-3 mode
  Disadvantages:
  1. No DNS; containers can only be reached by IP
  2. Requires etcd
  3. No network policy support

Weave
  Advantages:
  1. Supports access by hostname
  2. No NAT needed
  3. Peers exchange network information among themselves; no external store needed
  4. Supports encryption
  Disadvantages:
  1. Overlay network
  2. More complex configuration; weave launch or weave connect is required to join the network

Calico
  Advantages:
  1. Pure layer 3, no overlay, high efficiency
  2. Supports access by hostname
  3. No NAT needed
  4. Full-featured network policy
  Disadvantages:
  1. Requires a store (etcd)
  2. Because it works at layer 3, currently only TCP and UDP are supported
  3. Has requirements on the underlying network: layer-2 (MAC) reachability between nodes is needed for communication

Source: blog.csdn.net/Free_time_/article/details/107647236