Kubernetes Pause container


1. Introduction

The Pause container is also known as the Infra container. This article explores what it does and how it works.

We know that there is such a parameter in the kubelet configuration:

KUBELET_POD_INFRA_CONTAINER=--pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest

The above is the configuration used in OpenShift. The default configuration in Kubernetes is:

KUBELET_POD_INFRA_CONTAINER=--pod-infra-container-image=gcr.io/google_containers/pause-amd64:3.0

You can also supply your own Pause container image. The source code of the officially used gcr.io/google_containers/pause-amd64:3.0 image is available on GitHub and is written in C.

2. Pause container features

  • The image is very small, currently around 700KB;
  • The container is always in the Pause state.
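
To make the "always paused" behavior concrete, here is a minimal Python sketch of a process that does nothing except sleep until it is told to stop. It is an illustration only and does not reproduce the official C source; the function names install_shutdown_handlers and pause_forever are invented for this example.

```python
import signal
import sys


def install_shutdown_handlers():
    """Exit cleanly on SIGINT or SIGTERM instead of being killed abruptly."""
    def shutdown(signum, frame):
        sys.exit(0)

    signal.signal(signal.SIGINT, shutdown)
    signal.signal(signal.SIGTERM, shutdown)


def pause_forever():
    """Sleep until a signal arrives, handle it, then go back to sleep."""
    while True:
        # signal.pause() blocks without consuming CPU until any signal
        # is delivered to this process.
        signal.pause()
```

Calling install_shutdown_handlers() and then pause_forever() from a script yields a process that stays idle until it receives SIGINT or SIGTERM, which is why such a container can remain in the background at almost no cost.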

3. Pause Container Background

A Pod is itself a logical concept. How is it actually implemented on a machine? That is the question this section answers.

To implement a Pod, the core problem is how to share certain resources and data efficiently among the multiple containers in it.

Containers are normally isolated from one another by Linux namespaces and cgroups, so the real task is to break through this isolation and share selected resources and information. This is the core problem that the Pod design has to solve.

The concrete solution has two parts: network and storage.

The Pause container exists to solve the network part.

4. Pause container implementation

How do multiple containers in a Pod share the network? Consider an example.

Suppose a Pod contains container A and container B, and the two need to share a Network Namespace. Kubernetes solves this by creating an additional Infra container in every Pod, which holds the Network Namespace for the entire Pod.

The Infra container uses a very small image, about 700KB. It is written in C and is always in the "pause" state. Once the Infra container exists, all other containers join its Network Namespace.

As a result, all containers in a Pod see exactly the same network view. The network devices, IP addresses, MAC addresses, and other network-related information they see are one and the same, and all of it comes from the Infra container that was created first in the Pod. This is how a Pod shares the network.

A Pod must have an IP address, which is the address of the Pod's Network Namespace and therefore the IP address of the Infra container. Every container sees this same address, and every other network resource exists once per Pod and is shared by all containers in the Pod. This is how the Pod's network is implemented.
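
One practical consequence of a shared Network Namespace is that the containers of a Pod share both the loopback interface and the port space. The Python sketch below imitates this with sockets inside a single process, on the assumption that sockets in one namespace behave the same way whether they belong to one process or to several containers. The helper names are hypothetical.

```python
import socket


def listen_on_loopback(port=0):
    """Open a TCP listener on 127.0.0.1; port 0 lets the kernel pick one."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def demo_shared_loopback():
    """Return (port_conflict, received_bytes) for two peers in one namespace."""
    server = listen_on_loopback()
    port = server.getsockname()[1]

    # A second listener on the same port is rejected, just as two
    # containers in one Pod cannot both listen on the same port.
    try:
        listen_on_loopback(port).close()
        port_conflict = False
    except OSError:
        port_conflict = True

    # A peer in the same namespace reaches the listener via localhost.
    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = server.accept()
    client.sendall(b"ping")
    received = conn.recv(16)

    for sock in (client, conn, server):
        sock.close()
    return port_conflict, received
```

This is also why containers in the same Pod have to agree on distinct ports, while containers in different Pods can reuse the same port freely.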

Because an intermediate container is required, the Infra container must be the first container to start in the Pod. The life cycle of the Pod is tied to the life cycle of the Infra container and is independent of containers A and B. This is why Kubernetes allows an individual image in a Pod to be updated: the operation does not rebuild or restart the entire Pod, which is a very important design property.

5. The role of the Pause container

When we inspect a node, we find many pause containers running on it, for example:

$ docker ps
CONTAINER ID        IMAGE                           COMMAND ...
...
3b45e983c859        gcr.io/google_containers/pause-amd64:3.0    "/pause" ...
...
dbfc35b00062        gcr.io/google_containers/pause-amd64:3.0    "/pause" ...
...
c4e998ec4d5d        gcr.io/google_containers/pause-amd64:3.0    "/pause" ...
...
508102acf1e7        gcr.io/google_containers/pause-amd64:3.0    "/pause" ...    

The pause container in Kubernetes mainly provides the following functions for each business container:

  • Serves as the basis for Linux namespace sharing in the Pod;
  • When PID namespace sharing is enabled, serves as the init process (PID 1) of the Pod.
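
The init role matters because an init process is expected to collect the exit status of children that have terminated; otherwise they linger as zombies. A minimal Python sketch of that reaping step might look like the following. The helper name reap_exited_children is hypothetical, and the code illustrates the general technique rather than the pause program itself.

```python
import os


def reap_exited_children():
    """Collect every child that has already exited and return their PIDs."""
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG returns immediately
            # instead of blocking when no child has exited yet.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

An init-like process would typically call such a function whenever it receives SIGCHLD.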


6. Shared namespace

In Linux, a new process inherits its namespaces from its parent process. To run a process in new namespaces, you "unshare" the namespaces from the parent, which creates new ones. Here is an example that uses the unshare tool to run a shell in new PID, UTS, IPC, and mount namespaces.
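
The inheritance rule can be observed without any privileges, because Linux exposes each process's namespaces as symbolic links under /proc/<pid>/ns/. The Python sketch below compares the identifiers seen by a parent and by a child it starts. The function names are made up for this example, and it assumes a Linux system with /proc mounted.

```python
import os
import subprocess
import sys

NAMESPACES = ("pid", "uts", "ipc", "mnt", "net")


def namespace_ids(pid="self"):
    """Map each namespace name to its identifier link target."""
    return {ns: os.readlink(f"/proc/{pid}/ns/{ns}") for ns in NAMESPACES}


def child_namespace_ids():
    """Start a child Python process and return the identifiers it reports."""
    script = (
        "import os\n"
        f"for ns in {NAMESPACES!r}:\n"
        "    print(ns, os.readlink('/proc/self/ns/' + ns))\n"
    )
    output = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True, text=True, check=True,
    ).stdout
    # Each output line is "<name> <identifier>".
    return dict(line.split() for line in output.splitlines())
```

If namespace_ids() and child_namespace_ids() return the same mapping, the child has inherited every namespace from its parent. A process started through unshare would be expected to report different identifiers for the namespaces it unshared.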

sudo unshare --pid --uts --ipc --mount -f chroot rootfs /bin/sh

Once a process is running, you can add other processes to its namespaces to form a pod. New processes can be added to existing namespaces using the setns system call.

Containers in a pod share namespaces with one another. Docker automates part of this process, so let's look at an example of how to create a pod from scratch using a pause container and shared namespaces. First, we need to start the pause container with Docker so that we can add other containers to the pod.
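
For readers who want to see the system call itself, the following Python sketch calls setns through ctypes to join the network namespace of another process. It is a hedged illustration: join_network_namespace is a hypothetical helper, the call normally requires root privileges (CAP_SYS_ADMIN), and Python 3.12 and later also provide os.setns, which would make the ctypes part unnecessary.

```python
import ctypes
import os

CLONE_NEWNET = 0x40000000  # namespace type flag for network namespaces

_libc = ctypes.CDLL(None, use_errno=True)


def join_network_namespace(pid):
    """Move the calling thread into the network namespace of `pid`."""
    # The namespace is referenced through a file descriptor on its
    # /proc entry, which is what setns expects as its first argument.
    fd = os.open(f"/proc/{pid}/ns/net", os.O_RDONLY)
    try:
        if _libc.setns(fd, CLONE_NEWNET) != 0:
            errno_value = ctypes.get_errno()
            raise OSError(errno_value, os.strerror(errno_value))
    finally:
        os.close(fd)
```

Run as an unprivileged user, the call is expected to fail with a permission error. Container runtimes perform the equivalent step on the user's behalf when a container is asked to join another container's namespace.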

docker run -d --name pause -p 8080:80 gcr.io/google_containers/pause-amd64:3.0

Then we can run the containers for our pod. First, we'll run nginx. This will set up nginx to proxy requests to localhost on port 2368.

Note that host port 8080 was mapped to port 80 on the pause container rather than on the nginx container, because the pause container sets up the initial network namespace that nginx will join.

$ cat <<EOF >> nginx.conf
error_log stderr;
events { worker_connections  1024; }
http {
    access_log /dev/stdout combined;
    server {
        listen 80 default_server;
        server_name example.com www.example.com;
        location / {
            proxy_pass http://127.0.0.1:2368;
        }
    }
}
EOF
$ docker run -d --name nginx -v `pwd`/nginx.conf:/etc/nginx/nginx.conf --net=container:pause --ipc=container:pause --pid=container:pause nginx

Then create the application container for ghost, a blogging application.

$ docker run -d --name ghost --net=container:pause --ipc=container:pause --pid=container:pause ghost

Now visit http://localhost:8080/ to see the ghost blog interface.

Analysis

The pause container maps its internal port 80 to port 8080 on the host. After the pause container has set up the network namespace, the nginx container joins it: the nginx container was started with --net=container:pause, and the ghost container joined the same network namespace in the same way. The three containers therefore share the network and can communicate with each other directly over localhost. The options --ipc=container:pause --pid=container:pause put the three containers in the same IPC and PID namespaces, with pause as the init process. If we enter the ghost container and list the processes, we see:

# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   1024     4 ?        Ss   13:49   0:00 /pause
root         5  0.0  0.1  32432  5736 ?        Ss   13:51   0:00 nginx: master p
systemd+     9  0.0  0.0  32980  3304 ?        S    13:51   0:00 nginx: worker p
node        10  0.3  2.0 1254200 83788 ?       Ssl  13:53   0:03 node current/in
root        79  0.1  0.0   4336   812 pts/0    Ss   14:09   0:00 sh
root        87  0.0  0.0  17500  2080 pts/0    R+   14:10   0:00 ps aux

In the ghost container, you can see the processes of the pause and nginx containers as well, and the pause process has PID 1. In Kubernetes, by contrast, the process with PID 1 in a container is normally the container's own business process.
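
The same check can be scripted. The Python sketch below, an illustration with a hypothetical helper name, lists every process visible through /proc. Run inside a container that shares the pause container's PID namespace, it would be expected to report the pause process under PID 1.

```python
import os


def list_processes():
    """Return {pid: command name} for every process visible in /proc."""
    processes = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(f"/proc/{entry}/comm") as handle:
                processes[int(entry)] = handle.read().strip()
        except OSError:
            continue  # the process exited while we were scanning
    return processes
```

Looking up key 1 in the returned mapping gives the name of whatever process acts as init in the current PID namespace.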


Origin blog.csdn.net/xixihahalelehehe/article/details/124457212