Container technology: the essence of Docker

1. The core technology of container technology

First of all, container technology is not a technology invented by Docker; it is built on features of the Linux kernel.

1 Revisiting the process

1.1 Program

Suppose you want to write a small program that performs addition. Its input comes from one file, and the result is written to another file.

Since a computer only understands 0 and 1, whatever language the code is written in, it must be translated into binary form in some way before it can run on the operating system.

For the code to run properly, we usually also have to supply data, such as the input file needed by our addition program. This data, together with the binary file of the code itself, stored on disk, is what we usually call a "program", also known as the executable image of the code.
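As a minimal sketch of the addition program described above, here is a version written in shell for brevity (a script is interpreted, so the shell itself plays the role of the binary). The helper name add_from_file and the file names input.txt and output.txt are assumptions; the text does not name them.

```shell
# add_from_file: hypothetical helper, not named in the original text.
# Reads two integers from the first line of INPUT and writes their sum to OUTPUT.
add_from_file() {
    read -r a b < "$1"
    echo "$((a + b))" > "$2"
}

# The "data" part of the program lives in a file on disk.
workdir=$(mktemp -d)
printf '3 4\n' > "$workdir/input.txt"

# The "code" part reads that file and writes the result to another file.
add_from_file "$workdir/input.txt" "$workdir/output.txt"
cat "$workdir/output.txt"   # prints 7
```

Until the script is actually run, both the code and input.txt are just files on disk, which is the static "program" in the sense used here.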

1.2 Process

When the program is run, the operating system first learns from the "program" that the input data is stored in a file, so it loads that data into memory. At the same time, it reads the instruction to perform the addition and directs the CPU to carry it out. The CPU works with memory to perform the calculation, using registers to hold values and the stack to hold the instructions being executed and their variables. Meanwhile, files are opened and various I/O devices are called on and change their state. In this way, once the "program" is executed, it changes from a binary file on disk into a collection of information: data in memory, values in registers, instructions on the stack, open files, and the state of various devices. The sum of this execution environment after the program starts running is the process.
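On Linux, part of this dynamic state can be inspected through the /proc file system. The following is a small sketch, assuming a Linux host with /proc mounted; process_summary is a hypothetical helper name.

```shell
# process_summary: hypothetical helper that prints part of a process's
# runtime state from /proc (Linux only).
process_summary() {
    pid=$1
    [ -d "/proc/$pid" ] || { echo "no such process: $pid"; return 1; }
    # Name and scheduler state as recorded by the kernel.
    grep -E '^(Name|State):' "/proc/$pid/status"
    # Each entry in fd/ is one file the process currently has open.
    echo "OpenFiles: $(ls "/proc/$pid/fd" | wc -l)"
}

# Inspect the current shell, which is itself a running process.
process_summary $$
```

None of this information exists in the binary on disk; it only appears once the program is running.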

The core function of container technology is to create a "boundary" for a process by constraining and modifying its dynamic behavior.

For most Linux containers such as Docker, Cgroups is the main mechanism used to create constraints, and Namespaces are the main mechanism used to modify the process's view.

2 Namespace isolation

A Namespace isolates a process directly: it ensures that the process can only see the resources we have specified.

The following experiment gives a hands-on feel for this.

mkdir -p container/{lib64,tmp}
cp `ldd /usr/bin/bash | grep -P '/lib64/.*so.\d+' -o` container/lib64
cp -n `ldd /usr/bin/ls | grep -P '/lib64/.*so.\d+' -o` container/lib64/
cp -n `ldd /usr/bin/pwd | grep -P '/lib64/.*so.\d+' -o` container/lib64
cp /usr/bin/{bash,ls,pwd} container/
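The grep pattern above only picks up libraries under /lib64. On distributions that keep shared libraries elsewhere (for example /lib/x86_64-linux-gnu on Debian-based systems), a more general filter may be needed. The sketch below is one possible approach; ldd_paths is a hypothetical helper, and the sample ldd output is canned so the example runs anywhere.

```shell
# ldd_paths: hypothetical filter that reads ldd output on stdin and prints
# the first absolute path found on each line (lines without one are skipped).
ldd_paths() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }'
}

# Canned ldd-style output, used here instead of calling ldd directly.
printf '%s\n' \
    '	linux-vdso.so.1 (0x00007ffd0b5f2000)' \
    '	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c2a000000)' \
    '	/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a3f0000)' | ldd_paths
```

With GNU cp, something like `cp --parents $(ldd /usr/bin/bash | ldd_paths) container/` would then copy each library to the same path under container/, which is where the dynamic loader will look for it after the chroot.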

Then execute the following command:

chroot  container  /bash

The chroot command, as its name suggests, helps you "change root file system": it changes the root directory of a process to the location you specify.

Then run the /pwd and /ls commands inside the new shell.

The result is surprising: the process now treats the container directory as /. More importantly, the chrooted process has no sense that its root directory has been "modified".

This is the kind of effect that the Mount Namespace, one of several Linux Namespaces, provides.

In fact, Mount Namespace was invented based on the continuous improvement of chroot. It is also the first Namespace in the Linux operating system.
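The changed root of a chrooted process can be observed from the host through the /proc/<pid>/root symlink. The sketch below assumes a Linux host; root_of is a hypothetical helper name.

```shell
# root_of: hypothetical helper that prints the root directory of a process
# by resolving the /proc/<pid>/root symlink (Linux only; inspecting another
# user's process usually requires root).
root_of() {
    readlink "/proc/$1/root"
}

# A process always sees its own root as "/".
root_of $$
```

Run on the host against the PID of the chrooted bash, the same call would print the path of the container directory, while the chrooted process itself still sees only /.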

Of course, to make the container's root directory look more "real", we generally mount the file system of a complete operating system under it, such as the directory structure of CentOS 7. Then, after the container starts, running "ls /" inside it shows all the directories and files of CentOS 7.

Namespace technology actually modifies the application process's "view" of the whole computer: its "line of sight" is restricted by the operating system so that it can only "see" certain specified content. For the host, however, these "isolated" processes are not much different from any other process.
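Which namespaces a process belongs to is visible under /proc/<pid>/ns, where each entry is a symlink such as "mnt:[4026531840]". As a rough sketch, assuming a Linux host, two processes can be compared like this; same_namespace is a hypothetical helper name.

```shell
# same_namespace: hypothetical helper. Succeeds when two processes share the
# namespace of the given type (mnt, pid, net, uts, ipc, user), by comparing
# their /proc/<pid>/ns/<type> symlinks.
same_namespace() {
    a=$(readlink "/proc/$2/ns/$1") || return 2
    b=$(readlink "/proc/$3/ns/$1") || return 2
    [ "$a" = "$b" ]
}

# The shell and a child it starts normally share the same mount namespace.
if same_namespace mnt $$ self; then
    echo "same mount namespace"
fi
```

A containerized process would show different identifiers here, yet on the host it still appears as an ordinary entry under /proc.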

The file system that is mounted on the container's root directory and provides the isolated execution environment for the container process is the so-called "container image". It also has a more technical name: rootfs (root file system).

Note: In the Linux kernel, many resources and objects cannot be namespaced. The most typical example is time: the system clock is shared by the host and all containers. (A time namespace was added in Linux 5.6, but it only covers the monotonic and boot-time clocks, not the wall-clock time.)

3 Cgroups resource limits

Having introduced the "isolation" technology of containers, let's look at how containers are "limited".

Linux Cgroups is the main kernel feature used to set resource limits for processes.

The full name of Linux Cgroups is Linux Control Groups. Its main function is to cap the resources that a group of processes can use, including CPU, memory, disk, network bandwidth, and so on.

In addition, Cgroups can set priorities for processes, audit them, and suspend and resume them.

Here I will focus only on the "limiting" capability that is most closely related to containers, and introduce Cgroups through a set of hands-on steps.

In Linux, the interface that Cgroups exposes to the user is the file system: it is organized as files and directories under the path /sys/fs/cgroup. You can display them with the mount command:

mount -t  cgroup  

As you can see, /sys/fs/cgroup contains many subdirectories such as cpuset, cpu, and memory, which are also called subsystems. These are the types of resources that Cgroups can limit on my machine. Under each subsystem, you can see the specific ways in which that type of resource can be limited. For example, the CPU subsystem has the following configuration files:

[root@localhost ~]# ls  /sys/fs/cgroup/cpu
cgroup.clone_children  cpuacct.usage_percpu  cpu.stat
cgroup.event_control   cpu.cfs_period_us     notify_on_release
cgroup.procs           cpu.cfs_quota_us      release_agent
cgroup.sane_behavior   cpu.rt_period_us      tasks
cpuacct.stat           cpu.rt_runtime_us
cpuacct.usage          cpu.shares
[root@localhost ~]#
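The listing above reflects the per-subsystem (v1) layout of Cgroups. Newer distributions often mount the unified (v2) hierarchy instead, where these file names differ. A quick way to tell which one a host uses is to check the file system type of /sys/fs/cgroup; the sketch below assumes GNU stat, and cgroup_version is a hypothetical helper name.

```shell
# cgroup_version: hypothetical helper that reports which cgroup layout is
# mounted at the given path (default /sys/fs/cgroup).
#   cgroup2fs -> unified hierarchy (v2)
#   tmpfs     -> per-subsystem directories as in the listing above (v1)
# The tmpfs check is a heuristic, not a guarantee.
cgroup_version() {
    fstype=$(stat -fc %T "${1:-/sys/fs/cgroup}" 2>/dev/null) || fstype=none
    case $fstype in
        cgroup2fs) echo v2 ;;
        tmpfs)     echo v1 ;;
        *)         echo unknown ;;
    esac
}

cgroup_version
```

If this prints v2, the cpu.cfs_quota_us and cpu.cfs_period_us files used in this article will not be present under /sys/fs/cgroup.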

Note cpu.cfs_period_us and cpu.cfs_quota_us.

These two parameters are used together: they limit the processes in the group to a total of cpu.cfs_quota_us of CPU time within each period of length cpu.cfs_period_us.
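The relationship between the two values is a simple ratio. As an illustration, the following sketch converts a quota and period into a percentage of one CPU; cpu_limit_percent is a hypothetical helper name, not part of any Cgroups tooling.

```shell
# cpu_limit_percent: hypothetical helper. Given cfs_quota_us and cfs_period_us,
# prints the share of one CPU the group may use, as an integer percentage.
cpu_limit_percent() {
    quota=$1
    period=$2
    if [ "$quota" -lt 0 ]; then
        echo unlimited          # -1 means no quota is enforced
    else
        echo $(( quota * 100 / period ))
    fi
}

cpu_limit_percent 20000 100000   # prints 20
cpu_limit_percent -1 100000      # prints unlimited
```

A quota larger than the period is also valid on a multi-core machine: 200000 with a period of 100000 allows the equivalent of two full CPUs.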

Next, we will use them in a small hands-on experiment.

Now enter the /sys/fs/cgroup/cpu directory:

[root@localhost ~]# cd /sys/fs/cgroup/cpu

Then create a directory named container:

[root@localhost cpu]# mkdir container

Then look at the contents of this directory:

[root@localhost cpu]# ls container/
cgroup.clone_children  cpuacct.usage_percpu  cpu.shares
cgroup.event_control   cpu.cfs_period_us     cpu.stat
cgroup.procs           cpu.cfs_quota_us      notify_on_release
cpuacct.stat           cpu.rt_period_us      tasks
cpuacct.usage          cpu.rt_runtime_us
[root@localhost cpu]#

This directory is called a "control group". You will find that the operating system automatically generates the resource limit files for the subsystem in the newly created container directory.

Now we run the following loop in the background:

while : ; do : ; done &
[root@localhost cpu]# jobs -l
[1]+  1752 Running                 while :; do
    :;
done &

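Then observe the CPU usage with the top command.

In the output, the CPU usage of the loop stays at 100%. Checking the files in the container directory shows that the quota is not yet limited (-1), and the period is the default 100 ms (100000 us):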

[root@localhost cpu]# cat /sys/fs/cgroup/cpu/container/cpu.cfs_quota_us
-1
[root@localhost cpu]# cat /sys/fs/cgroup/cpu/container/cpu.cfs_period_us
100000
[root@localhost cpu]#

Next, we can set a limit by modifying the contents of these files. For example, write 20 ms (20000 us) to the cpu.cfs_quota_us file of the container group:

echo 20000 > /sys/fs/cgroup/cpu/container/cpu.cfs_quota_us

Given the introduction above, the meaning of this operation should be clear: in every 100 ms, the processes limited by this control group can use only 20 ms of CPU time, that is, only 20% of the CPU bandwidth. Next, we write the PID of the process to be limited into the tasks file of the container group, and the setting takes effect for that process:

echo 1752 > /sys/fs/cgroup/cpu/container/tasks 
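The commands above use the v1 file names. On a host with the unified (v2) hierarchy, the same limit is expressed through a single cpu.max file, and there is no tasks file. The following is only a sketch under that assumption: limit_cpu_v2 is a hypothetical helper, and it is exercised here against a scratch directory so that nothing on the host is changed.

```shell
# limit_cpu_v2: hypothetical helper for hosts using the unified (v2) hierarchy.
# Usage: limit_cpu_v2 GROUP_DIR PID QUOTA_US PERIOD_US
# On a real system GROUP_DIR would be something like /sys/fs/cgroup/container,
# the call needs root, and the cpu controller must be enabled in the parent's
# cgroup.subtree_control.
limit_cpu_v2() {
    mkdir -p "$1" || return 1
    # v2 keeps quota and period together in one file: "<quota> <period>".
    echo "$3 $4" > "$1/cpu.max" || return 1
    # v2 has no tasks file; a process is moved by writing its PID to cgroup.procs.
    echo "$2" > "$1/cgroup.procs"
}

# Dry run against a scratch directory instead of the real cgroup file system.
scratch=$(mktemp -d)
limit_cpu_v2 "$scratch/container" 1752 20000 100000
cat "$scratch/container/cpu.max"   # prints 20000 100000
```

The PID 1752 is the one from the jobs output earlier in this article; on another machine it will differ.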

Check the top command again.

As you can see, the CPU usage of the loop immediately drops to 20%.

Origin blog.csdn.net/qq_22648091/article/details/115310046