Docker Bridge Network 02: Communication Between Containers and the Outside

This section describes how a container communicates with the outside world in bridge network mode.

1. Introduction
2. Container access to the outside
  2.1 Access to the Internet
  2.2 Principle
  2.3 Summary in one picture
  2.4 Packet capture
3. External access to the container
  3.1 Create an nginx container and access it from the outside
  3.2 Principle
  3.3 Summary in one picture
  3.4 Packet capture
  3.5 docker-proxy
4. Summary

1. Introduction

The previous section described how containers communicate with each other. Services are deployed inside containers, and they have to be reachable from the outside. So how does a container communicate with the outside?

2. Container access to the outside

In this section we continue to use the bbox1 container for the experiments. First, make sure that the virtual machine Host1 can reach the Internet; the "Preparing the Docker Environment" section showed that the virtual machine can ping Baidu directly.
Note that the external network does not necessarily mean the Internet: external and internal are relative concepts. Any network other than the container network can be called an external network, for example the network that Host1 is on.

2.1 Access to the Internet

[root@docker1 ~]# docker exec -it bbox1 sh
/ # ping www.baidu.com
PING www.baidu.com (180.101.49.11): 56 data bytes
64 bytes from 180.101.49.11: seq=0 ttl=51 time=21.308 ms
64 bytes from 180.101.49.11: seq=1 ttl=51 time=24.613 ms

2.2 Principle

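Recall how the container is currently connected: bbox1 (172.17.0.2) is attached to the docker0 bridge (172.17.0.1), and Host1 reaches the outside through its NIC enp0s3 (192.168.0.11).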

Let's analyze what happens when bbox1 sends a ping packet. According to bbox1's routing table, the packet is sent to docker0. But how are docker0 and enp0s3 connected?
Linux has built-in routing and forwarding, so a Linux system can itself act as a router. Run cat /proc/sys/net/ipv4/ip_forward; a value of 1 indicates that IP forwarding is enabled.
So once the packet arrives at docker0, the kernel forwards it according to the host's routing table. The host's default route points to enp0s3, so the packet is sent out through enp0s3.
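
The forwarding decision described above can be sketched as a longest-prefix-match lookup. This is only a minimal Python sketch, not the kernel's implementation. The ROUTES table is a hypothetical copy of Host1's routing table, the /24 mask on 192.168.0.0 is an assumption (the article does not state the netmask), and lookup is a name made up for illustration.

```python
import ipaddress

# Hypothetical copy of Host1's routing table: (destination network, outgoing device).
# The /24 mask for the 192.168.0.0 network is an assumption.
ROUTES = [
    (ipaddress.ip_network("172.17.0.0/16"), "docker0"),
    (ipaddress.ip_network("192.168.0.0/24"), "enp0s3"),
    (ipaddress.ip_network("0.0.0.0/0"), "enp0s3"),  # default route
]


def lookup(dst: str) -> str:
    """Return the outgoing device for dst using longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, dev) for net, dev in ROUTES if addr in net]
    # The most specific route (longest prefix) wins; the default route always matches.
    _, dev = max(matches, key=lambda item: item[0].prefixlen)
    return dev


print(lookup("180.101.49.11"))  # Baidu's address from the ping output
print(lookup("172.17.0.2"))     # bbox1
```

Only the default route matches Baidu's address, so the packet leaves through enp0s3, while traffic for the container network stays on docker0.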
My own understanding is that if nothing else were done, the packet leaving enp0s3 would have bbox1's address as its source IP and Baidu's as its destination IP, and it could still reach Baidu. But when Baidu sent back the reply, no router along the way would know where this private IP is, because private addresses are not routable on the Internet. That is why SNAT has to be performed before the packet leaves enp0s3.
I do not fully understand the details of the return path. If any reader understands it well, please explain it. Thanks.
Speaking of SNAT, we have to mention the Linux kernel's powerful tool: iptables.
Run iptables -t nat -vnL in the virtual machine to view the nat table.

Chain POSTROUTING (policy ACCEPT 21 packets, 1479 bytes)
 pkts bytes target     prot opt in   out      source            destination         
    3   202 MASQUERADE  all  --  *   !docker0  172.17.0.0/16     0.0.0.0/0 

After the packet arrives at docker0, the Linux kernel routes it toward enp0s3. Before it leaves through enp0s3, it passes through the iptables POSTROUTING chain, where the rule above performs SNAT. The rule says: for packets whose source IP is in 172.17.0.0/16, whose destination IP is arbitrary, and whose outgoing interface is not docker0, apply MASQUERADE, that is, replace the source IP with the IP of the outgoing interface, here enp0s3. The source MAC also becomes that of enp0s3, but this is a result of routing (the host builds a new Ethernet frame on enp0s3), not of the NAT rule.
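
One way to picture both the rule and the return path is the simplified Python model below. It is a sketch under stated assumptions, not the kernel's code: the Packet class, the conntrack dictionary, and the function names are all hypothetical. What it relies on is that the kernel's connection tracking remembers each translation, so a reply that belongs to a tracked flow can have its destination translated back to the container's address. Real connection tracking keys on more fields than this model does.

```python
import ipaddress
from dataclasses import dataclass, replace

CONTAINER_NET = ipaddress.ip_network("172.17.0.0/16")
ENP0S3_IP = "192.168.0.11"


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    icmp_id: int  # ICMP echo identifier, used here to pair requests with replies


# Simplified stand-in for the connection-tracking table:
# (remote address, ICMP id) -> original container address.
conntrack = {}


def postrouting(pkt: Packet, out_dev: str) -> Packet:
    """Mimic the MASQUERADE rule: source in 172.17.0.0/16, out device not docker0."""
    if out_dev != "docker0" and ipaddress.ip_address(pkt.src) in CONTAINER_NET:
        conntrack[(pkt.dst, pkt.icmp_id)] = pkt.src  # remember the translation
        return replace(pkt, src=ENP0S3_IP)
    return pkt


def reverse_nat(reply: Packet) -> Packet:
    """Restore the container address on a reply that matches a tracked flow."""
    original = conntrack.get((reply.src, reply.icmp_id))
    if original is not None and reply.dst == ENP0S3_IP:
        return replace(reply, dst=original)
    return reply


request = Packet("172.17.0.2", "180.101.49.11", 9728)
sent = postrouting(request, "enp0s3")
reply = Packet("180.101.49.11", "192.168.0.11", 9728)
delivered = reverse_nat(reply)
print(sent)
print(delivered)
```

In this model the request leaves with source 192.168.0.11, and the reply addressed to 192.168.0.11 is handed back to 172.17.0.2.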

2.3 Summary in one picture

  1. bbox1 pings Baidu, and the packet is sent to docker0;
  2. The kernel routes the packet from docker0 toward enp0s3 and applies SNAT, so the source IP is replaced with the IP of enp0s3, and the frame leaves with the MAC of enp0s3 as its source MAC;
  3. enp0s3 sends the packet out to Baidu.

2.4 Packet capture

The current network devices and addresses are as follows:

Device             IP            MAC
Container bbox1    172.17.0.2    02:42:ac:11:00:02
Bridge docker0     172.17.0.1    02:42:a8:64:6c:32
VM NIC enp0s3      192.168.0.11  08:00:27:70:b6:ef

Run ping www.baidu.com inside bbox1.

  • Capture on docker0:
[root@docker1 ~]# tcpdump -nei docker0 icmp
listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:49:11.669684 02:42:ac:11:00:02 > 02:42:a8:64:6c:32, ethertype IPv4 (0x0800), length 98: 172.17.0.2 > 180.101.49.11: ICMP echo request, id 9728, seq 0, length 64
12:49:11.697374 02:42:a8:64:6c:32 > 02:42:ac:11:00:02, ethertype IPv4 (0x0800), length 98: 180.101.49.11 > 172.17.0.2: ICMP echo reply, id 9728, seq 0, length 64
  • Capture on enp0s3:
[root@docker1 ~]# tcpdump -nei enp0s3 icmp
listening on enp0s3, link-type EN10MB (Ethernet), capture size 262144 bytes
12:49:11.669700 08:00:27:70:b6:ef > 48:0e:ec:3b:b3:41, ethertype IPv4 (0x0800), length 98: 192.168.0.11 > 180.101.49.11: ICMP echo request, id 9728, seq 0, length 64
12:49:11.697340 48:0e:ec:3b:b3:41 > 08:00:27:70:b6:ef, ethertype IPv4 (0x0800), length 98: 180.101.49.11 > 192.168.0.11: ICMP echo reply, id 9728, seq 0, length 64

The packets on enp0s3 show that the source IP and MAC are already those of enp0s3.
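
The comparison can also be done mechanically. The Python sketch below parses the two echo request lines from the captures above and prints the fields side by side. The regular expression and the parse helper are hypothetical; they only handle the tcpdump -ne line format shown in this article.

```python
import re

# Start of a `tcpdump -ne` line: time, src MAC > dst MAC, ..., src IP > dst IP:
LINE = re.compile(
    r"^\S+ (?P<smac>[0-9a-f:]{17}) > (?P<dmac>[0-9a-f:]{17}),"
    r".*?: (?P<sip>\d+(?:\.\d+){3}) > (?P<dip>\d+(?:\.\d+){3}):"
)


def parse(line: str) -> dict:
    """Extract source/destination MAC and IP from one capture line."""
    m = LINE.match(line)
    if m is None:
        raise ValueError("unrecognized tcpdump line")
    return m.groupdict()


# The two echo request lines, copied from the captures above.
on_docker0 = parse(
    "12:49:11.669684 02:42:ac:11:00:02 > 02:42:a8:64:6c:32, ethertype IPv4 (0x0800), "
    "length 98: 172.17.0.2 > 180.101.49.11: ICMP echo request, id 9728, seq 0, length 64"
)
on_enp0s3 = parse(
    "12:49:11.669700 08:00:27:70:b6:ef > 48:0e:ec:3b:b3:41, ethertype IPv4 (0x0800), "
    "length 98: 192.168.0.11 > 180.101.49.11: ICMP echo request, id 9728, seq 0, length 64"
)

for field in ("smac", "sip", "dip"):
    print(field, on_docker0[field], "->", on_enp0s3[field])
```

The source MAC and source IP differ between the two interfaces, while the destination IP stays the same.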

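3. External access to the container

3.1 Create an nginx container and access it from the outside

Create an nginx container by running docker run -it -d --name=nginx01 -p 8081:80 nginx. This command maps port 80 of the container to port 8081 of the host.
Running curl 127.0.0.1:8081 on this virtual machine returns the nginx welcome page. Running curl 192.168.0.11:8081 on another virtual machine also returns the nginx welcome page. This shows that the external network can reach the nginx container.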

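3.2 Principle

This relies on the DNAT function of iptables.
Run iptables -t nat -vnL to view the iptables rules. You can see that packets with destination port 8081 have their destination rewritten to 172.17.0.4:80.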

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0           
    0     0 DNAT       tcp  --  !docker0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8081 to:172.17.0.4:80
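
The two rules in the DOCKER chain can be read as the following decision. This is a minimal Python sketch for illustration only; the TcpPacket class and the docker_chain function are hypothetical names, and the model ignores everything except the two rules listed above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TcpPacket:
    src: str
    sport: int
    dst: str
    dport: int


def docker_chain(pkt: TcpPacket, in_dev: str) -> TcpPacket:
    """Mimic the two rules of the DOCKER chain shown above."""
    # Rule 1: packets that arrive on docker0 RETURN without being translated.
    if in_dev == "docker0":
        return pkt
    # Rule 2: TCP packets to port 8081 from any other device are DNATed.
    if pkt.dport == 8081:
        return replace(pkt, dst="172.17.0.4", dport=80)
    return pkt


# A request from another virtual machine (192.168.0.12) arriving on enp0s3.
incoming = TcpPacket("192.168.0.12", 48392, "192.168.0.11", 8081)
translated = docker_chain(incoming, "enp0s3")
print(translated)
```

Only the destination changes: the source address and port of the client are left as they were.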

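3.3 Summary in one picture

  1. Running curl 192.168.0.11:8081 on Host2 sends a packet that arrives at enp0s3 on Host1;
  2. After enp0s3 on Host1 receives the packet, the kernel applies DNAT, replacing the destination IP and port with those of nginx01, and forwards the packet to docker0;
  3. docker0 looks up the bridge port for the destination MAC and delivers the packet to nginx01.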

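3.4 Packet capture

Use tcpdump to capture on Host1's NIC and on docker0 to verify.

  • enp0s3 on Host1 (tcpdump prints port 8081 by its service name, tproxy):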
[root@docker1 ~]# tcpdump -nei enp0s3 tcp port 8081
listening on enp0s3, link-type EN10MB (Ethernet), capture size 262144 bytes
13:25:14.725366 08:00:27:d4:6f:d1 > 08:00:27:70:b6:ef, ethertype IPv4 (0x0800), length 74: 192.168.0.12.48392 > 192.168.0.11.tproxy: Flags [S], seq 1433025043, win 29200, options [mss 1460,sackOK,TS val 4294829465 ecr 0,nop,wscale 7], length 0
13:25:14.725559 08:00:27:70:b6:ef > 08:00:27:d4:6f:d1, ethertype IPv4 (0x0800), length 74: 192.168.0.11.tproxy > 192.168.0.12.48392: Flags [S.], seq 404119170, ack 1433025044, win 28960, options [mss 1460,sackOK,TS val 59419808 ecr 4294829465,nop,wscale 7], length 0
  • docker0 on Host1:
[root@docker1 ~]# tcpdump -nei docker0 tcp
listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:26:07.014766 02:42:a8:64:6c:32 > 02:42:ac:11:00:04, ethertype IPv4 (0x0800), length 74: 192.168.0.12.48396 > 172.17.0.4.http: Flags [S], seq 1839700092, win 29200, options [mss 1460,sackOK,TS val 4294881751 ecr 0,nop,wscale 7], length 0
13:26:07.015357 02:42:ac:11:00:04 > 02:42:a8:64:6c:32, ethertype IPv4 (0x0800), length 74: 172.17.0.4.http > 192.168.0.12.48396: Flags [S.], seq 3312305882, ack 1839700093, win 28960, options [mss 1460,sackOK,TS val 59472098 ecr 4294881751,nop,wscale 7], length 0

As can be seen, the destination IP and MAC of the packets received on docker0 are already those of nginx01.

3.5 docker-proxy

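Note: besides iptables DNAT, there is another mechanism for external access to containers: docker-proxy. Many articles online are incomplete and mention only docker-proxy. In fact, as the analysis above shows, DNAT alone is enough for external access to the container. So when does docker-proxy come into play?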

[root@docker1 ~]# ps -ef|grep proxy
root      4073   927  0 13:07 ?        00:00:00 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8081 -container-ip 172.17.0.4 -container-port 80
root      4143  1521  0 13:32 pts/0    00:00:00 grep --color=auto proxy

The process list shows that docker-proxy is also running on our host, forwarding the host's 0.0.0.0:8081 to the container's 172.17.0.4:80.
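
To give a feel for what forwarding in user space means, as opposed to DNAT in the kernel, here is a minimal TCP forwarder in Python. It is not docker-proxy's code, only a sketch of the general technique: accept a connection on the host side, open a second connection to the container, and copy bytes in both directions. The function names are made up, and the parameter names simply mirror the command-line flags visible in the ps output.

```python
import socket
import threading


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        # Pass the end of stream on to the peer; ignore errors if already closed.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client: socket.socket, target: tuple) -> None:
    """Open a second connection to the target and relay in both directions."""
    try:
        upstream = socket.create_connection(target)
    except OSError:
        client.close()
        return
    back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
    back.start()
    pipe(client, upstream)
    back.join()
    client.close()
    upstream.close()


def start_proxy(host_ip: str, host_port: int,
                container_ip: str, container_port: int) -> socket.socket:
    """Listen on host_ip:host_port and forward each connection to the container."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host_ip, host_port))
    server.listen()

    def accept_loop() -> None:
        while True:
            try:
                client, _ = server.accept()
            except OSError:
                return  # the listening socket was closed
            threading.Thread(
                target=handle,
                args=(client, (container_ip, container_port)),
                daemon=True,
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server
```

With the addresses from this article, the call would be start_proxy("0.0.0.0", 8081, "172.17.0.4", 80). Unlike DNAT, the container then sees the connection as coming from the forwarder rather than from the original client, because the forwarder opens its own connection.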
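As for when docker-proxy and DNAT each take effect, there is an article that analyzes this well and that I would like to share: 《docker-proxy存在合理性分析》 (an analysis of why docker-proxy exists).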

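4. Summary

  • Container access to the outside is implemented by iptables SNAT
  • External access to the container is implemented by iptables DNAT; in some scenarios, traffic is forwarded by docker-proxy instead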

In the next section, we introduce user-defined bridge networks.

 

This is an original article. If you repost it, please credit the source.

Origin www.cnblogs.com/sunqingliang/p/12731601.html