1. The core technologies behind containers
First of all, container technology is not a Docker technology; it is a technology of the Linux kernel.
1 Revisiting the process
1.1 Program
Suppose you want to write a small program that performs addition. The program reads its input from one file and writes the result of the calculation to another file.
Since a computer only understands 0 and 1, whichever language the code is written in, it must somehow be translated into binary before it can run on the operating system.
For the code to run properly, we often have to provide it with data as well, such as the input file our addition program needs. This data, together with the binary of the code itself, stored on disk, is what we usually call a "program", also known as the executable image of the code.
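As a minimal sketch of the addition program described above, the shell function below reads integers from one file and writes their sum to another. The name add_numbers and the file names are hypothetical, chosen only for illustration. Note that a shell script is interpreted rather than compiled, so the binary that actually runs here is the shell (and awk) itself.

```shell
# add_numbers INPUT OUTPUT
# Reads whitespace-separated numbers from INPUT and writes their sum to OUTPUT.
add_numbers() {
    input=$1
    output=$2
    # Add up every field on every line; "sum + 0" prints 0 for an empty file.
    awk '{ for (i = 1; i <= NF; i++) sum += $i } END { print sum + 0 }' \
        "$input" > "$output"
}
```

For example, if input.txt contains the numbers 1, 2 and 3, then add_numbers input.txt output.txt leaves 6 in output.txt.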
1.2 Process
When the program runs, the operating system first learns from the "program" that the input data is stored in a file, so it loads that data into memory for later use. It then reads the instruction that performs the addition and directs the CPU to carry it out. The CPU works together with memory to do the calculation, using registers to hold values and the stack in memory to hold the instructions and variables being executed. Meanwhile, files are open and various I/O devices are constantly being called and changing their state.
In this way, once the "program" is executed, it changes from a binary file on disk into a collection of information: data in memory, values in registers, instructions on the stack, open files, and the state of various devices. The sum of this execution environment of a running program is the process.
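Part of the state described above can be inspected on a live system. The sketch below assumes a Linux machine with /proc mounted; the function name show_process is hypothetical. It prints the executable behind a process, its scheduler state, and how many files and memory mappings it currently holds.

```shell
# show_process PID
# Prints a few pieces of the running state the kernel keeps for a process.
# Assumes Linux with /proc mounted and permission to read /proc/PID.
show_process() {
    pid=$1
    echo "executable:  $(readlink "/proc/$pid/exe")"
    echo "state:       $(awk '/^State:/ { print $2 }' "/proc/$pid/status")"
    echo "open files:  $(ls "/proc/$pid/fd" | wc -l)"
    echo "memory maps: $(wc -l < "/proc/$pid/maps")"
}
```

Calling show_process $$ describes the current shell. The same program on disk yields different values each time it runs, which is exactly the difference between a program and a process.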
The core function of container technology is to create a "boundary" for a process by constraining and modifying its dynamic behavior.
For most Linux containers, such as Docker, Cgroups is the main mechanism for creating constraints, and Namespaces are the main mechanism for modifying the process's view.
2 Namespace isolation
Namespaces isolate a process directly: they ensure that the process can see only the resources we have specified.
For example, try the following experiment to see this first-hand.
mkdir -p container/{lib64,tmp}
cp `ldd /usr/bin/bash | grep -P '/lib64/.*so.\d+' -o` container/lib64
cp -n `ldd /usr/bin/ls | grep -P '/lib64/.*so.\d+' -o` container/lib64/
cp -n `ldd /usr/bin/pwd | grep -P '/lib64/.*so.\d+' -o` container/lib64
cp /usr/bin/{bash,ls,pwd} container/
Then run the following command:
chroot container /bash
The chroot command does what its name says, "change root file system": it changes the root directory of a process to the location you specify.
Now run /pwd and /ls inside the new shell.
Something curious happens: /pwd reports / and /ls lists only the contents of the container directory. More importantly, the chrooted process has no sense that its root directory has been "modified".
This is much the same effect that the Mount Namespace, one of several Linux Namespaces, provides for a container.
In fact, the Mount Namespace was invented through the continual refinement of chroot. It was also the first Namespace in the Linux operating system.
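The manual copy steps of the chroot experiment can be generalized so that they do not depend on libraries living under /lib64. The sketch below is one possible approach, not the original author's method. It assumes GNU cp (for the --parents option) and ldd, and the function name copy_with_libs is hypothetical. Unlike the experiment above, it preserves each file's original absolute path inside the new root.

```shell
# copy_with_libs ROOTFS BINARY
# Copies BINARY and every shared library reported by ldd into ROOTFS,
# keeping the original absolute paths (assumes GNU cp and ldd).
copy_with_libs() {
    rootfs=$1
    bin=$2
    mkdir -p "$rootfs" || return 1
    # -L follows symlinks so the real file is copied, not a dangling link.
    cp -L --parents "$bin" "$rootfs" || return 1
    # ldd prints absolute library paths; keep only tokens that start with "/".
    ldd "$bin" 2>/dev/null | grep -o '/[^ ]*' | while read -r lib; do
        [ -e "$lib" ] && cp -L --parents "$lib" "$rootfs"
    done
    return 0
}
```

After running copy_with_libs container /usr/bin/bash as a normal user, root can start the shell with chroot container /usr/bin/bash, because the binary keeps its original path inside the new root.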
Of course, to make the container's root directory look more "real", we generally mount the file system of a complete operating system under it, such as the directory structure of CentOS 7. Then, after the container starts, running "ls /" inside it shows all the directories and files of CentOS 7.
Namespace technology actually modifies the application process's "view" of the whole computer: its "line of sight" is restricted by the operating system, so it can "see" only certain specified content. To the host, however, these "isolated" processes are not much different from any other process.
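Because an isolated process is still an ordinary process on the host, its namespaces can be inspected from outside. The sketch below assumes Linux with /proc mounted; the function names ns_id and same_namespace are hypothetical. Each entry under /proc/PID/ns is a symbolic link whose target names the namespace, so two processes share a namespace exactly when the targets match.

```shell
# ns_id PID TYPE
# Prints the namespace identifier of a process, in the form "type:[inode]".
ns_id() {
    readlink "/proc/$1/ns/$2"
}

# same_namespace PID_A PID_B TYPE
# Succeeds if both processes are in the same namespace of the given type.
# Returns 2 if either identifier cannot be read (for example, no permission).
same_namespace() {
    a=$(ns_id "$1" "$3") || return 2
    b=$(ns_id "$2" "$3") || return 2
    [ "$a" = "$b" ]
}
```

Run as root, same_namespace $$ PID mnt would be expected to fail when PID belongs to a process started by a container runtime, and to succeed for an ordinary host process.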
The file system that is mounted on the container's root directory to provide an isolated execution environment for the container process is the so-called "container image". It also has a more technical name: rootfs (root file system).
Note: In the Linux kernel, many resources and objects cannot be namespaced. The most typical example is time: the wall-clock time is shared with the host (the time namespace added in Linux 5.6 covers only the monotonic and boot-time clocks).
3 Cgroups resource limits
Having introduced the "isolation" technology of containers, let's look at how containers are "limited".
Linux Cgroups is the main Linux kernel feature for setting resource limits on processes.
The full name of Linux Cgroups is Linux Control Groups. Its main function is to cap the resources that a group of processes can use, including CPU, memory, disk, network bandwidth, and so on.
In addition, Cgroups can set priorities for processes, audit them, and suspend and resume them.
Here I will focus only on the "limiting" capability, which is the one most closely related to containers, and introduce Cgroups through a hands-on exercise.
In Linux, Cgroups exposes its interface to users as a file system: it is organized as directories and files under the path /sys/fs/cgroup. You can display the mounts with:
mount -t cgroup
As you can see, /sys/fs/cgroup contains many subdirectories such as cpuset, cpu, and memory, which are also called subsystems. These are the resource types that Cgroups can limit on my machine. Under each subsystem you can see the specific ways in which that resource type can be limited. For example, the cpu subsystem has the following configuration files:
[root@localhost ~]# ls /sys/fs/cgroup/cpu
cgroup.clone_children cpuacct.usage_percpu cpu.stat
cgroup.event_control cpu.cfs_period_us notify_on_release
cgroup.procs cpu.cfs_quota_us release_agent
cgroup.sane_behavior cpu.rt_period_us tasks
cpuacct.stat cpu.rt_runtime_us
cpuacct.usage cpu.shares
[root@localhost ~]#
Pay attention to cpu.cfs_period_us and cpu.cfs_quota_us. These two parameters are used together: within each period of length cpu.cfs_period_us, the processes in the group can be given at most cpu.cfs_quota_us of CPU time in total.
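The relationship between the two parameters is simple arithmetic: the share of one CPU that a group may use is the quota divided by the period. A minimal sketch, with the hypothetical function name cpu_limit_percent:

```shell
# cpu_limit_percent QUOTA_US PERIOD_US
# Prints the CPU share (as a percentage of one CPU) implied by the two values.
# A negative quota (the default is -1) means no limit is applied.
cpu_limit_percent() {
    quota=$1
    period=$2
    if [ "$quota" -lt 0 ]; then
        echo "unlimited"
    else
        echo $(( quota * 100 / period ))
    fi
}
```

With a quota of 20000 and a period of 100000 this prints 20. A quota larger than the period yields more than 100, which corresponds to more than one CPU's worth of time.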
Next, we will use them in a small hands-on experiment.
Now enter the /sys/fs/cgroup/cpu directory:
[root@localhost ~]# cd /sys/fs/cgroup/cpu
Then create a directory named container:
[root@localhost cpu]# mkdir container
Then look at the contents of this directory:
[root@localhost cpu]# ls container/
cgroup.clone_children cpuacct.usage_percpu cpu.shares
cgroup.event_control cpu.cfs_period_us cpu.stat
cgroup.procs cpu.cfs_quota_us notify_on_release
cpuacct.stat cpu.rt_period_us tasks
cpuacct.usage cpu.rt_runtime_us
[root@localhost cpu]#
This directory is called a "control group". You will find that the operating system automatically generates the subsystem's resource limit files in the newly created container directory.
Now run the following busy loop in the background:
while : ; do : ; done &
[root@localhost cpu]# jobs -l
[1]+ 1752 Running while :; do
:;
done &
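Then observe the CPU usage with the top command. As the output shows, the CPU usage of the loop stays at 100%.
Looking at the files in the container control group, the CPU quota is still unrestricted (-1) and the period is the default 100 ms (100000 us):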
[root@localhost cpu]# cat /sys/fs/cgroup/cpu/container/cpu.cfs_quota_us
-1
[root@localhost cpu]# cat /sys/fs/cgroup/cpu/container/cpu.cfs_period_us
100000
[root@localhost cpu]#
Next, we can set limits by modifying the contents of these files. For example, write 20 ms (20000 us) to the cpu.cfs_quota_us file of the container group:
echo 20000 > /sys/fs/cgroup/cpu/container/cpu.cfs_quota_us
Given the earlier introduction, the meaning of this operation should be clear: in every 100 ms period, the processes limited by this control group may use only 20 ms of CPU time, that is, 20% of one CPU's bandwidth. Next, we write the PID of the process to be limited into the tasks file of the container group, and the setting takes effect for that process:
echo 1752 > /sys/fs/cgroup/cpu/container/tasks
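One way to confirm that the PID really landed in the control group is to read /proc/PID/cgroup, where each line has the form hierarchy-ID:controller-list:path. The helper below is a sketch with the hypothetical name cgroup_path. The optional second argument exists only so that the parsing can be tried on a sample file.

```shell
# cgroup_path CONTROLLER [FILE]
# Prints the control group path for CONTROLLER found in FILE
# (default: /proc/self/cgroup). Prints nothing if the controller is absent.
cgroup_path() {
    controller=$1
    file=${2:-/proc/self/cgroup}
    awk -F: -v c="$controller" '{
        # The second field is a comma-separated list such as "cpu,cpuacct".
        n = split($2, names, ",")
        for (i = 1; i <= n; i++)
            if (names[i] == c) {
                # Drop the first two fields and print the rest of the line.
                sub(/^[^:]*:[^:]*:/, "")
                print
                exit
            }
    }' "$file"
}
```

On the machine used in this experiment, cgroup_path cpu /proc/1752/cgroup would be expected to print /container once the PID has been written to the tasks file.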
Check top again.
As you can see, the CPU usage of the loop immediately drops to about 20%.
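The whole experiment can be collected into one small helper. This is a sketch under stated assumptions: it uses the cgroup v1 file layout shown in this section (cpu.cfs_period_us, cpu.cfs_quota_us, tasks) and must run as root against a real control group directory. On systems that use cgroup v2 the files differ (cpu.max and cgroup.procs), so it would not apply as written. The function name limit_cpu is hypothetical.

```shell
# limit_cpu GROUP_DIR PID PERCENT
# Creates the control group directory if needed, sets the CFS quota to
# PERCENT of the period, and moves PID into the group (cgroup v1 layout).
limit_cpu() {
    group_dir=$1
    pid=$2
    percent=$3
    mkdir -p "$group_dir" || return 1
    # Use the period the kernel reports; fall back to the default 100 ms.
    period=100000
    if [ -r "$group_dir/cpu.cfs_period_us" ]; then
        period=$(cat "$group_dir/cpu.cfs_period_us")
    fi
    echo $(( period * percent / 100 )) > "$group_dir/cpu.cfs_quota_us" || return 1
    echo "$pid" > "$group_dir/tasks"
}
```

For the loop above, limit_cpu /sys/fs/cgroup/cpu/container 1752 20 performs the same two writes as the manual steps.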