[Cloud Native] Introduction to Pause Containers

Pause container

The Pause container is also known as the Infra container.

The kubelet configuration has a parameter that selects its image:

KUBELET_POD_INFRA_CONTAINER=--pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest

The value above is the one used in OpenShift. The default in Kubernetes is:

KUBELET_POD_INFRA_CONTAINER=--pod-infra-container-image=gcr.io/google_containers/pause-amd64:3.0

You can also supply your own pause image. The source code of the official gcr.io/google_containers/pause-amd64:3.0 container is available on GitHub and is written in C.

Pause Container Features

  • The image is very small, currently around 700KB
  • It is always in the "pause" state

Pause container background

A Pod is a logical concept. How is it actually realized on a machine? That is the question this section answers.

The core of the problem is how multiple containers in a Pod can share certain resources and data as efficiently as possible.

Containers are by design isolated from one another by Linux namespaces and cgroups, so the real task is to break through this isolation selectively and share specific resources and information. This is the core problem that the Pod design sets out to solve.

The solution has two parts: network and storage.

The pause container exists to solve the network part.

Pause container implementation

How do multiple containers in a Pod share the network? Here is an example:

Suppose a Pod contains container A and container B, and the two need to share a network namespace. Kubernetes solves this by creating one additional small Infra container in each Pod, which holds the network namespace for the entire Pod.

The Infra container uses a very small image, about 700KB. It runs a program written in C that stays in the "pause" state permanently. Once this Infra container exists, every other container in the Pod joins its network namespace.

As a result, all containers in a Pod have exactly the same network view: the network devices, IP addresses, MAC addresses and other network-related information they see all exist as a single copy, and that copy comes from the Infra container, which is created first when the Pod is set up. This is how a Pod shares its network.
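One way to check the shared network view on a Linux host is to compare namespace identities. The kernel exposes each process's namespaces as symbolic links under /proc/<pid>/ns/, and two processes are in the same namespace when the link targets match. The sketch below assumes a Linux host with /proc mounted; the function name same_ns is hypothetical and chosen only for illustration.

```shell
# same_ns TYPE PID_A PID_B
# Succeeds (exit 0) when both processes are in the same namespace of the
# given type (net, ipc, pid, ...). Returns 1 when they differ and 2 when a
# namespace link cannot be read. Linux only: relies on /proc/<pid>/ns/.
same_ns() {
    ns_type=$1
    a=$(readlink "/proc/$2/ns/$ns_type") || return 2
    b=$(readlink "/proc/$3/ns/$ns_type") || return 2
    [ "$a" = "$b" ]
}

# Example: a process always shares its network namespace with itself.
same_ns net "$$" "$$" && echo "same network namespace"
```

For real containers, the host PIDs could be obtained with docker inspect --format '{{.State.Pid}}' NAME, and reading the namespace links of another user's processes usually requires root.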

A Pod must have an IP address. It is the address that belongs to the Pod's network namespace, and it is also the IP address of the Infra container. Every container sees this same single copy; all other network resources are likewise allocated once per Pod and shared by all containers in it. This is how Pod networking is implemented.

Because this intermediate container is required, the Infra container must be the first container to start in the Pod. The life cycle of the whole Pod is therefore tied to the life cycle of the Infra container and is independent of containers A and B. This is why Kubernetes allows a single image in a Pod to be updated individually: the operation does not rebuild or restart the entire Pod. This is a very important design decision.

The role of the Pause container

When we inspect a node, we find many pause containers running on it, for example:

$ docker ps
CONTAINER ID        IMAGE                                                                                                                    COMMAND                  CREATED             STATUS              PORTS               NAMES
2c7d50f1a7be        docker.io/jimmysong/heapster-grafana-amd64@sha256:d663759b3de86cf62e64a43b021f133c383e8f7b0dc2bdd78115bc95db371c9a       "/run.sh"                3 hours ago         Up 3 hours                              k8s_grafana_monitoring-influxdb-grafana-v4-5697c6b59-76zqs_kube-system_5788a3c5-29c0-11e8-9e88-525400005732_0
5df93dea877a        docker.io/jimmysong/heapster-influxdb-amd64@sha256:a217008b68cb49e8f038c4eeb6029261f02adca81d8eae8c5c01d030361274b8      "influxd --config ..."   3 hours ago         Up 3 hours                              k8s_influxdb_monitoring-influxdb-grafana-v4-5697c6b59-76zqs_kube-system_5788a3c5-29c0-11e8-9e88-525400005732_0
9cec6c0ef583        jimmysong/pause-amd64:3.0                                                                                                "/pause"                 3 hours ago         Up 3 hours                              k8s_POD_monitoring-influxdb-grafana-v4-5697c6b59-76zqs_kube-system_5788a3c5-29c0-11e8-9e88-525400005732_0
54d06e30a4c7        docker.io/jimmysong/kubernetes-dashboard-amd64@sha256:668710d034c4209f8fa9a342db6d8be72b6cb5f1f3f696cee2379b8512330be4   "/dashboard --inse..."   3 hours ago         Up 3 hours                              k8s_kubernetes-dashboard_kubernetes-dashboard-65486f5fdf-lshl7_kube-system_27c414a1-29c0-11e8-9e88-525400005732_0
5a5ef33b0d58        jimmysong/pause-amd64:3.0  
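In the listing above, the container names appear to follow the layout k8s_<container>_<pod>_<namespace>_<uid>_<attempt>, with the pause container using the fixed container name POD. Assuming that layout, the following sketch picks out the pause containers and prints the Pod each one belongs to. The function name list_pause_pods is hypothetical.

```shell
# list_pause_pods: read container names on stdin, print "namespace/pod"
# for every pause (infra) container.
# Assumed name layout: k8s_<container>_<pod>_<namespace>_<uid>_<attempt>
list_pause_pods() {
    awk -F_ '$1 == "k8s" && $2 == "POD" { print $4 "/" $3 }'
}

# Example with two names from the listing above; on a live node the input
# could come from: docker ps --format '{{.Names}}'
printf '%s\n' \
    'k8s_grafana_monitoring-influxdb-grafana-v4-5697c6b59-76zqs_kube-system_5788a3c5-29c0-11e8-9e88-525400005732_0' \
    'k8s_POD_monitoring-influxdb-grafana-v4-5697c6b59-76zqs_kube-system_5788a3c5-29c0-11e8-9e88-525400005732_0' |
    list_pause_pods
# prints: kube-system/monitoring-influxdb-grafana-v4-5697c6b59-76zqs
```

Splitting on underscores is only safe under the assumption that Pod and namespace names contain no underscores.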

In Kubernetes, the pause container mainly provides the following for each application container:

  • It serves as the basis for Linux namespace sharing in the Pod;
  • It enables PID namespace sharing and runs as the init process (PID 1).

The role of the pause container can be seen from the following example.

[Figure omitted]

We first run a pause container on the node.

$ docker run -d --name pause -p 8880:80 --ipc=shareable jimmysong/pause-amd64:3.0

Then run an nginx container, configured to proxy requests to localhost:2368.

$ cat <<EOF > nginx.conf
error_log stderr;
events { worker_connections  1024; }
http {
    access_log /dev/stdout combined;
    server {
        listen 80 default_server;
        server_name example.com www.example.com;
        location / {
            proxy_pass http://127.0.0.1:2368;
        }
    }
}
EOF
$ docker run -d --name nginx -v "$(pwd)/nginx.conf:/etc/nginx/nginx.conf" --net=container:pause --ipc=container:pause --pid=container:pause nginx

Then create an application container running ghost, a blogging application.

$ docker run -d --name ghost --net=container:pause --ipc=container:pause --pid=container:pause ghost

Now visit http://localhost:8880/ to see the ghost blog interface.
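The nginx and ghost commands above repeat the same three flags to join the pause container's namespaces. As a sketch of that pattern, the hypothetical helper pod_join_cmd below only prints the docker command for adding one more container to the same group; it does not run anything. It leaves out per-container options such as the -v mount used for nginx.

```shell
# pod_join_cmd INFRA NAME IMAGE
# Print (do not run) the docker command that starts IMAGE as container NAME
# inside the network, IPC and PID namespaces of the container named INFRA.
pod_join_cmd() {
    infra=$1
    name=$2
    image=$3
    printf 'docker run -d --name %s --net=container:%s --ipc=container:%s --pid=container:%s %s\n' \
        "$name" "$infra" "$infra" "$infra" "$image"
}

pod_join_cmd pause ghost ghost
# prints: docker run -d --name ghost --net=container:pause --ipc=container:pause --pid=container:pause ghost
```

Printing the command instead of executing it keeps the sketch safe to try on a machine without Docker.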

Analysis

The pause container maps its internal port 80 to port 8880 on the host. Once the pause container has set up the network namespace, the nginx container joins it: note the --net=container:pause flag given when nginx starts. The ghost container joins the same network namespace in the same way, so the three containers share one network and can talk to each other directly over localhost. The --ipc=container:pause and --pid=container:pause flags likewise put the three containers in the same IPC and PID namespaces, with pause as the init process. Now enter the ghost container and check the process list.

# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   1024     4 ?        Ss   13:49   0:00 /pause
root         5  0.0  0.1  32432  5736 ?        Ss   13:51   0:00 nginx: master p
systemd+     9  0.0  0.0  32980  3304 ?        S    13:51   0:00 nginx: worker p
node        10  0.3  2.0 1254200 83788 ?       Ssl  13:53   0:03 node current/in
root        79  0.1  0.0   4336   812 pts/0    Ss   14:09   0:00 sh
root        87  0.0  0.0  17500  2080 pts/0    R+   14:10   0:00 ps aux

Inside the ghost container you can see the processes of both the pause container and the nginx container, and the pause process has PID 1. In Kubernetes, by contrast, the process with PID 1 in a container is that container's own application process.
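The same check can be scripted. The sketch below assumes the ps aux column layout shown above, where PID is the second column and COMMAND starts at the eleventh, and prints the command of the process with PID 1. The function name pid1_command is hypothetical.

```shell
# pid1_command: read `ps aux` output on stdin and print the first word of
# the COMMAND column for the process whose PID is 1.
pid1_command() {
    awk '$2 == 1 { print $11; exit }'
}

# Example using three lines from the output above.
printf '%s\n' \
    'USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND' \
    'root         1  0.0  0.0   1024     4 ?        Ss   13:49   0:00 /pause' \
    'root         5  0.0  0.1  32432  5736 ?        Ss   13:51   0:00 nginx: master p' |
    pid1_command
# prints: /pause
```

Inside a container that shares the pause container's PID namespace, ps aux | pid1_command would be expected to print /pause.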

Origin blog.csdn.net/ljx1528/article/details/131441575