5 minutes to understand Docker

I had been hearing the name Docker for a long time, but, being a bit slow, I could never quite figure out what it actually was.

The official website introduces it like this:

Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications....

Frankly, after reading that sentence I still didn't understand what it was, so I will explain it slowly below. But to make a long story short: picturing it as an ultra-lightweight virtual machine implemented in a novel way is roughly correct. Of course, it still differs enormously from a VM in both implementation principle and usage; the proper term is application container.


Why use a container?


So what does an application container look like? A finished application container behaves like a virtual machine with a specific set of applications installed. For example, if I want to use MySQL, I find a container with MySQL installed, run it, and then I can use MySQL.
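As a concrete sketch of what that looks like in practice (assuming Python and the docker-py SDK, neither of which this post actually uses; the image name, password, and port mapping are only illustrative), running MySQL as a container might look like this:

    # Minimal sketch using the docker-py SDK (pip install docker); assumes a local
    # Docker daemon is running. Image name, password, and port are examples only.
    import docker

    client = docker.from_env()               # connect to the local Docker daemon

    mysql = client.containers.run(
        "mysql",                              # the official MySQL image
        detach=True,                          # run in the background
        environment={"MYSQL_ROOT_PASSWORD": "example"},
        ports={"3306/tcp": 3306},             # map MySQL's default port onto the host
    )
    print(mysql.status)

The equivalent command line would be roughly docker run -d -e MYSQL_ROOT_PASSWORD=example -p 3306:3306 mysql; either way, no MySQL installation or dependency wrangling happens on the host itself.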

Then why not just install MySQL directly? Why do I need this strange container concept? Well, if you actually try to install MySQL yourself, you may have to install a pile of dependency libraries, configure everything for your particular OS platform and version, and sometimes build from source and wade through a bunch of baffling errors; it is rarely as smooth as it sounds. And if your machine dies, everything has to start over, and the configuration may have to be redone. With a container, it is as if you have a ready-to-run virtual machine: as long as you can run the container, the MySQL configuration inside it is preserved. If you want to switch machines, just pick the container up and put it on another one; hardware, operating system, runtime environment and so on no longer need to be considered.

A big use inside a company is ensuring that the offline development environment, the test environment, and the online production environment are consistent. Back at Baidu, this sort of thing happened all the time: development finished the work and the testing, then handed over a pile of code plus a launch checklist describing the deployment steps. Then the code wouldn't run on the test machine, development came over to look, and after a while it turned out a config file had never been committed, or a command on the checklist was written wrong. Or a bug was found and a fix deployed, and the developer realized the relevant command had been left off the checklist. Similar things happened when going live: "the version of your software is different from the one on my machine..." At Amazon, a single developer covered all three roles and there was a set of automated deployment machinery, so there were fewer problems, but people were still scared at launch time.

If you use containers, development happens directly inside the container, testing exercises the entire container, and whatever needs changing after testing is changed inside the container before going online. With containers, the development, test, and production environments can be kept highly consistent.
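A rough sketch of that workflow (again assuming the docker-py SDK; the path, tag, and registry name are hypothetical) would be to build one image and push it, so every environment runs the exact same artifact:

    # Sketch: build the project directory into an image and push it to a registry,
    # so development, test, and production all pull the identical environment.
    # The path, tag, and registry name are made-up examples.
    import docker

    client = docker.from_env()

    image, build_logs = client.images.build(path=".", tag="registry.example.com/myapp:1.0")
    client.images.push("registry.example.com/myapp", tag="1.0")

    # Any machine with Docker can now run exactly what was developed and tested:
    client.containers.run("registry.example.com/myapp:1.0", detach=True)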

In addition, containers provide isolation much like VMs do. Data and memory spaces of different containers are isolated from each other, which gives a certain degree of security.
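A tiny sketch of that isolation (docker-py again; the image and command are arbitrary): two containers started from the same image still get their own hostname, process space, and file system.

    # Sketch: two containers from the same image each get an isolated view of the
    # system (their own hostname here, via separate namespaces).
    import docker

    client = docker.from_env()

    a = client.containers.run("ubuntu", "hostname", remove=True)
    b = client.containers.run("ubuntu", "hostname", remove=True)
    print(a, b)   # two different auto-generated hostnames; the containers never see each other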


Then why not use a VM?

So if containers and VMs are this similar, why not just use VMs instead of inventing the concept of a container? Docker containers have several advantages over VMs:

  • Fast startup: a container typically starts in under a second, while a VM usually takes much longer
  • High resource utilization: an ordinary PC can run thousands of containers; try running thousands of VMs
  • Low performance overhead: a VM needs extra CPU and memory just to run a full guest OS, which eats additional resources

Why do two things with similar functionality differ so hugely in performance? It comes down to their design. The architecture of a VM looks like this:

[VM architecture diagram]

A VM's hypervisor has to virtualize the hardware and carry a complete guest operating system, so it naturally pays a considerable price in startup speed, resource utilization, and performance. Docker's architecture, by contrast, looks like this:

[Docker architecture diagram]

Docker virtualizes almost nothing and reuses the host's OS directly; the scheduling and isolation done at the Docker Engine level is several grades lighter. Docker's containers are built on LXC, using namespaces to control and isolate permissions, cgroups to allocate and limit resources, and aufs to improve the file system's resource utilization.
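To make the cgroups part concrete, here is a hedged sketch (docker-py again; the image, command, and limit values are arbitrary) of starting a container with capped memory and CPU, limits that Docker enforces through cgroups:

    # Sketch: resource limits that Docker applies via cgroups. Values are examples.
    import docker

    client = docker.from_env()

    limited = client.containers.run(
        "ubuntu",
        "sleep 3600",             # placeholder workload
        detach=True,
        mem_limit="256m",         # memory cap: the container cannot use more than 256 MB
        nano_cpus=500_000_000,    # CPU cap: at most half of one CPU
        pids_limit=100,           # cap on the number of processes inside the container
    )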

aufs, a kind of UnionFS, is a particularly interesting piece. Its idea is somewhat like git: changes to the file system are stacked layer by layer like commits, so multiple containers can share file-system layers. Below each container sit the shared layers, and on top sits a layer holding that container's own changes. This greatly reduces storage requirements and also speeds up container startup.
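You can see those layers directly. A small sketch (docker-py once more; the image name is just an example) that lists the read-only layers an image is built from:

    # Sketch: inspect an image's layer history; UnionFS stacks these like git commits,
    # and containers started from the same image share all of them.
    import docker

    client = docker.from_env()

    image = client.images.pull("mysql", tag="latest")
    for layer in image.history():          # one entry per layer
        print(layer["Size"], layer["CreatedBy"][:60])

Each running container then only adds a thin writable layer on top of these shared read-only layers.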

Next step

With the introduction above, you should have some idea of what Docker is. Docker is written in Go, its source code is hosted on GitHub, and it implements all of this in only about 10,000 lines. If you want to try it, you can read the official introduction, which should make it easy to get started. I'm a novice myself; if anything here is wrong, please feel free to correct me.


http://www.csdn.net/article/2014-07-02/2820497-what's-docker
