Docker: The Charming Open Source Container Engine

Docker is an open source application container engine that allows developers to package an application and its dependencies into a portable image and then run it on any machine with a mainstream Linux or Windows operating system, while also providing a form of virtualization. Containers are fully sandboxed and have no interfaces to one another. More importantly, the performance overhead of containers is extremely low.

This article covers an introduction to Docker technology, the basic concepts and components of Docker, and the development history of deployment technology.



1. Introduction to Docker technology

1. Introduction to Docker

As summarized above, Docker is an open source application container engine: developers package an application together with its dependencies into a portable image, which can then be run on any machine with a mainstream Linux or Windows operating system. Containers are fully sandboxed from one another, and their performance overhead is extremely low.


The Docker open source project was born in early 2013 as a side project within dotCloud. It is implemented in the Go language, which was created by Google. The project later joined the Linux Foundation, is licensed under the Apache 2.0 license, and its code is maintained on GitHub. Docker attracted wide attention and discussion after it was open sourced, so much so that dotCloud was eventually renamed Docker Inc.

2. Virtualization technology

In computing, virtualization is a resource management technology that abstracts and transforms a computer's physical resources, such as servers, networks, memory, and storage, and presents them in a new form. It breaks down the barriers imposed by the physical layout of those resources and lets users work with them in ways better suited to their needs than the original configuration.

3. Advantages of Docker

As an emerging virtualization approach, Docker has many advantages over traditional virtualization. First, a Docker container starts in seconds, far faster than a traditional virtual machine. Second, Docker uses system resources very efficiently: thousands of Docker containers can run simultaneously on a single host.

Apart from the application running inside it, a container consumes almost no additional system resources, so application performance is high while system overhead stays as small as possible. In the traditional virtual machine model, running 10 different applications means starting 10 virtual machines, whereas Docker only needs to start 10 isolated applications.

Specifically, Docker has great advantages in the following aspects:

  1. Faster delivery and deployment: Docker can create containers quickly and iterate on applications rapidly, and it makes the whole process visible, so other members of the team can more easily understand how an application is built and how it works. Docker containers are light and fast: container startup takes seconds, which saves a great deal of development, testing, and deployment time;
  2. More efficient virtualization: Docker containers do not require an additional hypervisor; they are kernel-level virtualization, so they achieve higher performance and efficiency;
  3. Easier migration and scaling: Docker containers run on almost any platform, including physical machines, virtual machines, public clouds, private clouds, PCs, and servers. This compatibility lets users move an application directly from one platform to another;
  4. Simpler management: with Docker, small modifications replace the large amounts of update work that were needed in the past. All changes are distributed and applied incrementally, enabling automated and efficient management.

2. Basic concepts of Docker

1. Docker components

(Figure: Docker architecture diagram)

A complete Docker setup mainly consists of the following components:

  1. Docker host: the host (Host) can be a physical or a virtual machine and is where the Docker service process and containers run;
  2. Docker server: the server (Server), also called the Docker Daemon, is the server side of Docker. It listens for Docker API requests (for example, from a Docker Client) and manages Docker objects (Docker Objects) such as images, containers, networks, and data volumes;
  3. Docker client: the client (Client) is the main way users interact with Docker. When a Docker command is entered in a terminal, the corresponding action is carried out on the server and the result is returned to the client. Besides connecting to the local server, a Docker Client can connect to a remote server by changing or specifying DOCKER_HOST (see the illustration after this list);
  4. Docker image: an image (Image) can be understood as a template for creating Docker containers;
  5. Docker container: a container (Container) is an independently running application or group of applications; it is the runtime instance of an image;
  6. Docker registry: a registry (Registry) is a repository for storing images, somewhat like version control systems such as Git and SVN.
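A small illustration of how the client, server, and host fit together, assuming a working Docker installation (the remote address below is a placeholder used only for illustration):

```bash
# The client sends an API request to the local Docker daemon (server)
# and prints both the client's and the daemon's version information.
docker version

# Summarize the objects managed by the daemon: images, containers, volumes, etc.
docker info

# Point the client at a remote Docker daemon instead of the local one.
export DOCKER_HOST=tcp://192.168.1.100:2375   # placeholder address
docker ps    # now lists containers running on the remote host
```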

2. Docker image

The Linux operating system consists of two parts: the Linux kernel and user space. When a Linux system boots, it mounts the root file system to provide support for user space. A Docker image is similar to such a root file system: it contains the file system and contents required to start a container, so an image is mainly used to create and start Docker containers.

A Docker image is made up of a stack of file system layers managed by a union file system (UnionFS). A union file system can mount multiple directories together to form a single virtual file system whose directory structure looks just like an ordinary Linux directory tree. Docker combines these file systems with the host's kernel to form a Linux virtual environment.
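To make the idea of a union mount concrete, here is a minimal sketch using OverlayFS, the union file system used by most modern Docker installations (the directory names are made up for illustration, and the commands assume root privileges on a Linux host):

```bash
# Two "layers" plus the directories OverlayFS needs for writes and metadata.
mkdir -p lower upper work merged
echo "from the lower layer" > lower/a.txt
echo "from the upper layer" > upper/b.txt

# Mount both layers together as one virtual file system at ./merged.
sudo mount -t overlay overlay \
  -o lowerdir=lower,upperdir=upper,workdir=work merged

ls merged          # a.txt and b.txt appear as a single directory tree
sudo umount merged
```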

In a union file system, each constituent file system is called a layer. The union file system can assign one of three permissions to each layer: read-only (readonly), read-write (readwrite), and whiteout-able.

In a Docker image, however, every file system layer is read-only. When an image is built, construction starts from a basic operating system, and each build step applies a set of modifications by adding another file system layer. The layers are stacked one on top of another, and changes in an upper layer hide the corresponding content in the layers beneath it. When we use an image we only see the complete, merged result; we neither know nor need to know how many file system layers it contains.
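A rough sketch of how build steps map to layers (the Dockerfile, script, and image name below are made up for illustration and assume a standard Docker installation):

```bash
# Each instruction in this Dockerfile produces one read-only layer.
cat > Dockerfile <<'EOF'
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y curl
COPY app.sh /usr/local/bin/app.sh
CMD ["/usr/local/bin/app.sh"]
EOF

echo 'echo hello from the container' > app.sh
docker build -t layer-demo .

# List the image's layers; each line roughly corresponds to one build instruction.
docker history layer-demo
```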

3. Docker container

A Docker container is like a runtime instance created from a template: a container can be created, copied, paused, and deleted. When a container runs, it uses the image as its base layer and adds a container storage layer on top of it. The life cycle of this storage layer is tied to the container, so when a container is deleted, the data in its storage layer is deleted along with it.
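A minimal sketch of this life cycle using the Docker CLI (the nginx image and container name are used purely as examples):

```bash
docker create --name web nginx:latest   # create a container from an image
docker start web                        # start it; the image layers stay read-only
docker pause web                        # suspend all processes in the container
docker unpause web                      # resume them
docker stop web                         # stop the container
docker rm web                           # delete it; the writable storage layer is removed too
```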

4. Docker registry

A Docker registry is a centralized service for storing and distributing Docker images. The client's docker pull and docker push commands interact directly with the registry.

A registry can contain multiple repositories, and each repository can contain multiple tags, with each tag corresponding to one image. Typically, a repository holds images of different versions of the same piece of software (such as Nginx 1.18 and 1.20), and tags are used to distinguish these versions. The client specifies an image in the format <repository>:<tag>; if no tag is given, latest is used as the default.
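For example, pulling a specific tag versus relying on the default (nginx is a real public repository on Docker Hub):

```bash
docker pull nginx:1.20   # <repository>:<tag> — a specific version
docker pull nginx        # no tag given, so nginx:latest is pulled
docker images nginx      # both tags now appear under the same repository
```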

Depending on whether its images are public, a Docker registry can be either public or private. Docker Hub, Docker's official public registry, hosts a large number of commonly used images that can be used directly; if you prefer not to use a public registry, you can also run a private registry using the registry image officially provided by Docker.


3. Deployment technology development history

1. The era of physical machines

Before virtual machines appeared in business environments, applications were usually deployed directly on physical machines, but neither Windows nor Linux servers offered adequate technical means to guarantee that multiple applications could run stably and securely on the same server at the same time. The drawbacks of this deployment model are that idle resources are hard to reuse, deploying heterogeneous systems requires buying new physical hardware, and operating a large number of small and medium capacity machines drives up operations and maintenance costs.

Under such circumstances, reducing the cost of managing infrastructure became an urgent need.

2. The era of VMware

To solve these problems, VMware launched its product: the virtual machine. Virtual machines allow users to run multiple isolated systems independently on one physical machine. By abstracting resources, host resources can be reused effectively, which is very beneficial for enterprise IT management.


However, virtual machines also bring some problems:

  • Running a large number of independent operating systems incurs a lot of extra overhead and consumes host resources, and competition for resources can seriously affect system responsiveness;
  • In addition, every time a new virtual machine is brought up, its environment has to be configured again, essentially the same work as on a physical machine. These repeated environment configuration tasks consume the working time of development and operations staff.

At this point, the demand shifted to reducing the resource cost of virtualization while preserving isolation and shortening the application launch cycle, and this demand drove the development of container technology.

3. The era of containerization

Modern container technology originated on Linux and is the product of long-term, continuous contributions from many people. Since 2000, various Unix-like operating system vendors have launched container-related projects one after another.

After Google contributed Cgroups to Linux kernel 2.6.24 in 2008, LXC (Linux Containers) was created, making it possible to run multiple independent Linux environments (containers) on the same kernel. A complete, independent operating environment needs three key ingredients: environment isolation, resource control, and a file system. In LXC, these capabilities are provided by Namespaces, Cgroups, and rootfs respectively (a small command-line sketch follows the list):

  • Namespace - environment isolation: the kernel's global resources are wrapped in namespaces, and each namespace holds its own independent instance of a resource, so processes in different namespaces do not interfere with each other when they use the same kind of resource and cannot affect processes in other namespaces, which achieves process isolation;
  • Cgroups - resource control: LXC controls resources through Cgroups, which limit and isolate the system resources used by a group of processes. Before Cgroups appeared, the OS could only limit the resources of a single process; Cgroups can group processes arbitrarily, with the grouping defined by the user, and thereby manage and schedule resources for a namespace;
  • rootfs - file system: rootfs is mounted at the container's root directory and provides the file system in which the container's processes execute after isolation. rootfs contains the files, configuration, and directories of an operating system. When the Linux kernel starts, it first mounts a read-only rootfs, and after the system verifies its integrity it decides whether to switch it to read-write mode.
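A small command-line sketch of the first two building blocks, using standard Linux tools rather than LXC itself (assumes root privileges on a Linux host with cgroup v2; the paths and limits are illustrative):

```bash
# Namespace isolation: run `ps` in new PID and mount namespaces;
# it only sees the handful of processes that belong to this namespace.
sudo unshare --fork --pid --mount-proc bash -c 'ps -ef'

# Resource control with cgroup v2: create a group, cap it at half a CPU,
# and move the current shell into it.
sudo mkdir /sys/fs/cgroup/demo
echo "50000 100000" | sudo tee /sys/fs/cgroup/demo/cpu.max
echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs
```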


Once containers can be built with LXC, a single host can run many isolated applications. At the same time, because the containers share the host kernel, each container is very lightweight, which removes the heavy resource consumption that virtual machines incur when running a large number of isolated applications.

However, although LXC solves the problem of application isolation, it is only a lightweight container technology; it does not solve the lack of a common software delivery standard across platforms, such as differing delivery tools, inconsistent application runtime conventions, complex environment dependencies, and the configuration overhead they cause. These problems kept the adoption of container technology relatively limited until Docker appeared.

4. The era of Docker

Early Docker was built on top of LXC, so Docker containers share LXC's characteristics and need few resources to start. Unlike LXC, however, Docker is more than a container runtime: it is also a platform for packaging, distributing, and running applications. Docker packages an application together with the environment it depends on, and the resulting package (the image) can be distributed to any node and executed there without first setting up and configuring an environment. In this way Docker solves the environment configuration problem in application development and deployment, standardizes application delivery and deployment, reduces the complexity of deployment testing and the coupling between development and operations, greatly improves container portability, and makes it easier to build automated deployment and delivery pipelines.
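A sketch of this build-once, run-anywhere workflow (the image name and registry address are placeholders used only for illustration):

```bash
# On the build machine: package the application and its environment into an image.
docker build -t registry.example.com/myteam/myapp:1.0 .
docker push registry.example.com/myteam/myapp:1.0

# On any other node with Docker installed: no environment setup is required.
docker pull registry.example.com/myteam/myapp:1.0
docker run -d --name myapp registry.example.com/myteam/myapp:1.0
```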

Both Docker and virtual machines are products of the development of resource virtualization, but they differ in architecture:

  • A virtual machine virtualizes the host's hardware resources through a hypervisor and then builds a guest operating system on top of them, managed by the host's hypervisor;
  • Docker containers run directly on the host kernel; applications perform their tasks independently in the user space of the host operating system and do not need to build up an environment starting from an operating system, which gives applications independence all the way from delivery through deployment to operations.


A virtual machine may take minutes to start, while a Docker container is created in seconds. In terms of disk usage, Docker images are generally measured in megabytes, far smaller than the gigabytes a virtual machine needs for a full operating system. And on a single operating system, the number of Docker containers that can be run is far greater than the number of virtual machines.

The advantages of Docker's container technology in rapid deployment, environment standardization, and isolation were widely recognized by developers, but measured against a complete PaaS platform they were not enough. First, Docker provides the isolated environment called a "container", but Docker alone still struggles with scenarios where multiple containers are topologically related; second, although containers solve the problem of standardized application delivery, they do not by themselves provide full application hosting; in addition, as the infrastructure grows, deployments inevitably become distributed across many machines, and scheduling becomes a problem that has to be solved.

In July 2014, Docker announced the acquisition of Fig, a single-host container orchestration tool (later renamed Docker Compose), and in December of the same year it launched its own container cluster orchestration project, Docker Swarm. Together these provided two key capabilities:

  1. Multi-container orchestration: a multi-container application can be declared in a YAML file, which also defines the relationships between the containers (see the sketch after this list);
  2. Distributed scheduling: containers can be scheduled across the nodes of a cluster.
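As a sketch of the first capability, here is a two-service application declared in a Compose file (the service names and images are illustrative; the file follows the standard Compose YAML format):

```bash
cat > docker-compose.yml <<'EOF'
services:
  web:
    image: nginx:1.20
    ports:
      - "8080:80"
    depends_on:
      - db          # declares the relationship between the two containers
  db:
    image: postgres:13
    environment:
      POSTGRES_PASSWORD: example
EOF

docker compose up -d   # start both containers with their declared relationship
```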

Docker Compose and Docker Swarm could basically satisfy developers' needs for a PaaS platform, and they became the cornerstones of Docker's plans for its own platform. Up to this point Docker's future looked bright, but with Google entering the field, the container market changed significantly.

5. Kubernetes

In 2014, Google open sourced a project called Kubernetes (K8s for short), a container cluster management system that grew out of Google's internal Borg system. Kubernetes inherits Google's rich experience in operating large-scale clusters and can provide complex, large-scale container orchestration and management services. In 2015, the first production-ready version of Kubernetes was released, marking Kubernetes' entry into production-grade container management and the start of its competition with Docker over the future shape of the PaaS platform.


A Kubernetes cluster consists of two types of nodes: Master Nodes and Worker Nodes. Kubernetes adopts a declarative design: every operation communicates with the Master through a declarative API. The Master Node responds to API declarations to manage the cluster and schedule containers. Containers run on the Worker Nodes, which respond to the Master's instructions and carry out maintenance operations such as starting and stopping containers.

Beyond tightly coupled containers, a production-oriented orchestration system has to support many more kinds of relationships between containers, so Kubernetes provides objects such as Deployment (stateless multi-replica workloads), StatefulSet (stateful multi-replica workloads), and Job (one-off tasks) to meet diverse orchestration needs. The basic scheduling unit behind these objects is the Pod, and controllers drive the state of Pod objects to realize the declared orchestration relationships.
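A minimal sketch of declaring a stateless multi-replica workload with kubectl (the deployment name and image are illustrative, and the commands assume access to a running cluster):

```bash
# Declare a Deployment; its controller keeps three nginx Pods running.
kubectl create deployment web --image=nginx:1.20
kubectl scale deployment web --replicas=3

# The Pods are the basic scheduling units placed on the Worker Nodes.
kubectl get pods -l app=web
```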

Kubernetes equips developers and engineers with the management tools and infrastructure they need to quickly tackle large projects. Kubernetes clusters can manage everything from load testing or creating staging environments, to moving business and live applications into production.
