1. Introduction
The Pause container is also known as the Infra container. This article explores what it does and how it works.
The kubelet configuration includes the following parameter:
KUBELET_POD_INFRA_CONTAINER=--pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest
The above is the setting used in OpenShift; the default setting in Kubernetes is:
KUBELET_POD_INFRA_CONTAINER=--pod-infra-container-image=gcr.io/google_containers/pause-amd64:3.0
You can also supply your own Pause container image. The source code of the officially used gcr.io/google_containers/pause-amd64:3.0 container is available on GitHub; it is written in C.
2. Pause container features
- The image is very small, currently around 700KB.
- It is always in the "pause" state.
3. Pause Container Background
A Pod is itself a logical concept. How is it actually implemented on a machine? That is the question this section answers.
The core of the problem is how multiple containers in a Pod can share certain resources and data as efficiently as possible.
Containers are isolated from one another by Linux namespaces and cgroups, so the task is to break through this isolation and share certain things and certain information. This is the core problem the Pod design has to solve.
The solution has two parts: network and storage.
The Pause container exists to solve the network part.
4. Pause container implementation
How do multiple containers in a Pod share the network? Here is an example:
Suppose a Pod contains container A and container B, and the two need to share a network namespace. Kubernetes solves this by creating an additional Infra container in every Pod, which holds the network namespace for the entire Pod.
The Infra container uses a very small image, about 700KB. Its program is written in C and is always in the "pause" state. All other containers in the Pod then join the network namespace of the Infra container.
As a result, all containers in a Pod see exactly the same network view: the network devices, IP addresses, MAC addresses, and other network-related information they see are one and the same, and they all come from the Infra container that was created first for the Pod. This is how a Pod shares its network.
A Pod has a single IP address, which belongs to the Pod's network namespace and is therefore the IP address of the Infra container. Every container sees this same address, and all other network resources likewise exist once per Pod and are shared by all of its containers. This is how the Pod's network is implemented.
Because this intermediate container is required, the Infra container must be the first container started in the Pod. The life cycle of the Pod is tied to the life cycle of the Infra container and is independent of containers A and B. This is why Kubernetes allows the image of a single container in a Pod to be updated on its own: the operation does not rebuild or restart the whole Pod, which is a very important design decision.
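One way to check the sharing described above is to compare namespace identifiers under /proc. The following is a minimal shell sketch, not part of Kubernetes: the helper names ns_id and same_ns are made up for illustration, and it assumes a Linux host where /proc/<pid>/ns is readable (reading the entries of another user's process usually requires root).

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# ns_id PID TYPE: print the namespace identifier of a process.
# TYPE is one of net, ipc, pid, uts, mnt.
ns_id() {
    readlink "/proc/$1/ns/$2"
}

# same_ns PID1 PID2 TYPE: report whether two processes are in the
# same namespace of the given type. Returns 2 if either lookup fails.
same_ns() {
    a=$(ns_id "$1" "$3") || return 2
    b=$(ns_id "$2" "$3") || return 2
    if [ "$a" = "$b" ]; then
        echo "shared"
    else
        echo "separate"
    fi
}
```

For example, `same_ns 1234 5678 net` prints "shared" when the host PIDs 1234 and 5678 (hypothetical values) are in the same network namespace, which is what you would expect for the processes of two containers in one Pod.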
5. The role of the Pause container
When we inspect the nodes, we find many pause containers running on each one, for example:
$ docker ps
CONTAINER ID IMAGE COMMAND ...
...
3b45e983c859 gcr.io/google_containers/pause-amd64:3.0 "/pause" ...
...
dbfc35b00062 gcr.io/google_containers/pause-amd64:3.0 "/pause" ...
...
c4e998ec4d5d gcr.io/google_containers/pause-amd64:3.0 "/pause" ...
...
508102acf1e7 gcr.io/google_containers/pause-amd64:3.0 "/pause" ...
The pause container in Kubernetes mainly provides the following functions for the application containers in a pod:
- Serve as the basis for Linux namespace sharing in the pod;
- With PID namespace sharing enabled, act as the init process (PID 1) of the pod and reap zombie processes.
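Part of being an init process is reaping child processes that have exited; otherwise they linger as zombies. As a rough illustration of what that means, the following shell sketch lists zombie processes by reading /proc. The function name list_zombies is hypothetical, and the sketch assumes a Linux /proc filesystem and an available awk.

```shell
#!/bin/sh
# list_zombies: print "PID PPID NAME" for every process in state Z.
# Hypothetical helper for illustration; it only reads /proc.
list_zombies() {
    for status in /proc/[0-9]*/status; do
        # A process may exit while we iterate, so ignore unreadable files.
        awk '
            /^Name:/  { name = $2 }
            /^State:/ { state = $2 }
            /^Pid:/   { pid = $2 }
            /^PPid:/  { ppid = $2 }
            END { if (state == "Z") print pid, ppid, name }
        ' "$status" 2>/dev/null
    done
    return 0
}
```

A zombie whose PPID column shows 1 is waiting for the init process of its PID namespace to collect it, which is the job the pause process takes on when it runs as PID 1.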
6. Shared namespace
In Linux, a new process inherits its namespaces from its parent process. To run a process in new namespaces, you "unshare" them from the parent, which creates new namespaces. Here is an example that uses the unshare tool to run a shell in new PID, UTS, IPC, and mount namespaces.
sudo unshare --pid --uts --ipc --mount -f chroot rootfs /bin/sh
Once a process is running, you can add other processes to its namespaces to form a pod. New processes can be added to existing namespaces using the setns system call.
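From the shell, the nsenter tool from util-linux is a convenient wrapper around setns. The sketch below is an illustration under stated assumptions: the function name enter_pod and the DRY_RUN switch are invented here, and actually entering the namespaces of another process normally requires root, hence the sudo.

```shell
#!/bin/sh
# enter_pod PID CMD...: run CMD inside the network, IPC, and PID
# namespaces of the process with the given host PID.
# Hypothetical helper: with DRY_RUN=1 it only prints the command.
enter_pod() {
    target_pid=$1
    shift
    # Rebuild the argument list as the full nsenter command line.
    set -- nsenter --target "$target_pid" --net --ipc --pid "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        sudo "$@"
    fi
}
```

For instance, `enter_pod 1234 ip addr` would show the network interfaces as seen from the network namespace of host PID 1234 (a made-up value). Setting DRY_RUN=1 first lets you inspect the command line without needing root.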
Containers in a pod share namespaces with one another. Docker lets you automate this process a bit, so let's look at an example of how to create a pod from scratch using a pause container and shared namespaces. First, we start the pause container with Docker so that we can add other containers to the pod.
docker run -d --name pause -p 8080:80 gcr.io/google_containers/pause-amd64:3.0
Then we can run containers for our pods. First, we'll run nginx. This will set up nginx to proxy requests to localhost on port 2368.
Note that we also map the host port 8080 to port 80 on the pause container instead of the nginx container, because the pause container sets the initial network namespace that nginx will join.
$ cat <<EOF >> nginx.conf
error_log stderr;
events { worker_connections 1024; }
http {
    access_log /dev/stdout combined;
    server {
        listen 80 default_server;
        server_name example.com www.example.com;
        location / {
            proxy_pass http://127.0.0.1:2368;
        }
    }
}
EOF
$ docker run -d --name nginx -v `pwd`/nginx.conf:/etc/nginx/nginx.conf --net=container:pause --ipc=container:pause --pid=container:pause nginx
Then create an application container for Ghost, a blogging platform.
$ docker run -d --name ghost --net=container:pause --ipc=container:pause --pid=container:pause ghost
Now visit http://localhost:8080/ to see the Ghost blog interface.
Analysis
The pause container maps its internal port 80 to port 8080 on the host. After the pause container has set up the network namespace, the nginx container joins it: we started nginx with --net=container:pause. The ghost container joins the same network namespace in the same way, so the three containers share the network and can communicate with each other directly over localhost. Because of --ipc=container:pause --pid=container:pause, the three containers are also in the same IPC and PID namespaces, with pause as the init process. Now we enter the ghost container to view the processes.
# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1024 4 ? Ss 13:49 0:00 /pause
root 5 0.0 0.1 32432 5736 ? Ss 13:51 0:00 nginx: master p
systemd+ 9 0.0 0.0 32980 3304 ? S 13:51 0:00 nginx: worker p
node 10 0.3 2.0 1254200 83788 ? Ssl 13:53 0:03 node current/in
root 79 0.1 0.0 4336 812 pts/0 Ss 14:09 0:00 sh
root 87 0.0 0.0 17500 2080 pts/0 R+ 14:10 0:00 ps aux
In the ghost container, you can see the processes of the pause and nginx containers as well, and the pause process has PID 1. In Kubernetes, by contrast, the process with PID 1 in a container is normally the application process of that container itself.