The syntax of a Dockerfile is simple, but how to speed up image builds and how to reduce image size is far less intuitive and takes practical experience to learn. This article helps you quickly master the skills of writing a good Dockerfile.
Goals

- Faster build speed
- Smaller Docker image size
- Fewer Docker image layers
- Full use of the image layer cache
- Better Dockerfile readability
- Docker containers that are easier to use
Summary

- Write a .dockerignore file
- Run only a single application per container
- Combine multiple RUN instructions into one
- Do not use the latest tag for the base image
- Delete redundant files after each RUN instruction
- Choose an appropriate base image (an alpine variant is best)
- Set WORKDIR and CMD
- Use ENTRYPOINT (optional)
- Use exec in the entrypoint script
- Prefer COPY over ADD
- Order the COPY and RUN instructions sensibly
- Set default environment variables, mapped ports, and data volumes
- Use LABEL to set image metadata
- Add a HEALTHCHECK
- Use multi-stage builds

Let's start with a bad example Dockerfile:
FROM ubuntu
ADD . /app
RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y nodejs ssh mysql
RUN cd /app && npm install
# this should start three processes, mysql and ssh
# in the background and node app in foreground
# isn't it beautifully terrible? <3
CMD mysql & sshd & npm start
Build the image:
docker build -t wtf .
Can you find all the errors in the above Dockerfile? No? Then let us improve it step by step.
Optimization
1. Write the .dockerignore file
.git/
node_modules/
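A slightly fuller .dockerignore sketch for a node project (the entries beyond .git and node_modules are illustrative; adjust them to your project):

```text
# version control metadata
.git/
.gitignore
# dependencies are installed inside the image, not copied in
node_modules/
# local logs and environment files
npm-debug.log
.env
```

Lines starting with # are comments, and every listed path is excluded from the build context sent to the Docker daemon.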
2. Containers only run a single application
Technically, you can run multiple processes in a Docker container. You can run the database, front-end, back-end, ssh, and supervisor all in the same Docker container. However, this will cause you a lot of pain:
- Very long build times (after modifying the frontend, the entire backend also needs to be rebuilt)
- Very large image size
- Logs from multiple applications are hard to handle (you cannot use stdout directly, or the logs of multiple applications will get mixed together)
- Wasted resources when scaling horizontally (different applications need different numbers of containers)
- Zombie process problems - you need to choose a proper init process
Therefore, it is recommended that you build a separate Docker image for each application, and then use Docker Compose to run multiple Docker containers.
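For example, a docker-compose.yml along these lines could run the node app and its database as separate containers (a sketch; the service names, ports, and the mysql image tag are assumptions, not from the original article):

```yaml
version: "3"
services:
  app:
    build: .               # the node image built from our Dockerfile
    ports:
      - "3000:3000"
    depends_on:
      - db
  db:
    image: mysql:5.7       # the database runs in its own container
    environment:
      MYSQL_ROOT_PASSWORD: example
```

Each service builds and scales independently, and `docker compose up` starts them together.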
Now, I remove the unnecessary packages from the Dockerfile; SSH can also be replaced with docker exec. For example:
FROM ubuntu
ADD . /app
RUN apt-get update
RUN apt-get upgrade -y
# we should remove ssh and mysql, and use
# separate container for database
RUN apt-get install -y nodejs # ssh mysql
RUN cd /app && npm install
CMD npm start
3. Combine multiple RUN instructions into one
Docker images are layered, and the following points are very important:

- Each instruction in a Dockerfile creates a new image layer
- Image layers are cached and reused
- A layer's cache is invalidated when the Dockerfile instruction changes, a copied file changes, or the variables specified at build time differ
- Once one layer's cache is invalidated, the caches of all layers after it are invalidated too
- Image layers are immutable: if we add a file in one layer and delete it in the next layer, the file is still included in the image (it is just not visible inside the Docker container)
Docker images are like onions. They all have many layers. In order to modify the inner layer, the outer layer needs to be deleted. Keeping this in mind, the rest will be easy to understand.
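A sketch that makes the onion point concrete (the file name and size are illustrative): deleting a file in a later layer only hides it, while deleting it inside the same RUN instruction keeps it out of the image entirely:

```dockerfile
# BAD: /big.bin is committed in the first layer; the second layer only
# hides it, so the image still carries the full 100MB
RUN dd if=/dev/zero of=/big.bin bs=1M count=100
RUN rm /big.bin

# GOOD: the file is created and removed in one layer, so it is never
# committed into the image
RUN dd if=/dev/zero of=/big.bin bs=1M count=100 && rm /big.bin
```

Comparing the two images with docker history would show the 100MB layer only in the first case.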
Now, we combine all the RUN instructions into one. We also delete apt-get upgrade, because it makes image builds non-deterministic (we only need to rely on updates of the base image):
FROM ubuntu
ADD . /app
RUN apt-get update \
&& apt-get install -y nodejs \
&& cd /app \
&& npm install
CMD npm start
Remember, we can only combine instructions with the same frequency of change. If you put node.js installation and npm module installation together, you need to reinstall node.js every time you modify the source code, which is obviously inappropriate. Therefore, the correct way of writing is this:
FROM ubuntu
RUN apt-get update && apt-get install -y nodejs
ADD . /app
RUN cd /app && npm install
CMD npm start
4. Do not use the latest tag for the base image
When no tag is specified for an image, the latest tag is used by default. So the instruction FROM ubuntu is equivalent to FROM ubuntu:latest. Later, when the base image is updated, the latest tag will point to a different image, and the build may break. If you really do need the newest version of the base image, you can use the latest tag; otherwise, it is best to pin a specific image tag.
The example Dockerfile should use 16.04 as the tag.
# it's that easy!
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y nodejs
ADD . /app
RUN cd /app && npm install
CMD npm start
5. Delete redundant files after each RUN instruction
Suppose we update the apt-get sources, then download, unpack, and install some packages; all of these files are kept in the /var/lib/apt/lists/ directory. However, these files are not needed to run the application, so we'd better remove them, as they only make the Docker image bigger.
In the example Dockerfile, we can delete the files in the /var/lib/apt/lists/ directory (they were generated by apt-get update).
FROM ubuntu:16.04
RUN apt-get update \
&& apt-get install -y nodejs \
# added lines
&& rm -rf /var/lib/apt/lists/*
ADD . /app
RUN cd /app && npm install
CMD npm start
6. Choose an appropriate base image (an alpine variant is best)
In the example, we chose ubuntu as the base image. But we only need to run a node program; is a general-purpose base image really necessary? The node image is a better choice.
FROM node
ADD . /app
# we don't need to install node
# anymore and use apt-get
RUN cd /app && npm install
CMD npm start
A better choice is the alpine version of the node image. Alpine is a minimal Linux distribution, only 4MB, which makes it very suitable as a base image.
FROM node:7-alpine
ADD . /app
RUN cd /app && npm install
CMD npm start
apk is Alpine's package management tool. It is a bit different from apt-get, but very easy to pick up. In addition, it has some very useful features, such as the --no-cache and --virtual options, which can help us reduce the image size.
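A sketch of those two options (the package names are illustrative): --no-cache avoids keeping the apk index in the layer, and --virtual groups build-time packages under one name so they can all be removed in the same RUN instruction:

```dockerfile
RUN apk add --no-cache --virtual .build-deps gcc musl-dev make \
    # native npm modules can compile while the toolchain is present
    && npm install \
    # deleting the named group removes all build-time packages at once
    && apk del .build-deps
```

Because everything happens in one RUN instruction, the toolchain never survives into a committed layer.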
7. Set WORKDIR and CMD
The WORKDIR instruction sets the default directory in which the RUN / CMD / ENTRYPOINT instructions run.
The CMD instruction sets the default command executed when a container is created. In addition, you should write the command in exec (array) form, where each element of the array is one word of the command (refer to the official documentation).
FROM node:7-alpine
WORKDIR /app
ADD . /app
RUN npm install
CMD ["npm", "start"]
8. Use ENTRYPOINT (optional)
The ENTRYPOINT instruction is not strictly necessary, because it adds complexity. ENTRYPOINT points at a script that is executed by default and receives the specified command as its arguments. It is commonly used to build executable Docker images. entrypoint.sh is as follows:
#!/usr/bin/env sh

# $0 is the script name,
# $1, $2, $3 etc are the passed arguments
# $1 is our command
CMD=$1

case "$CMD" in
  "dev" )
    npm install
    export NODE_ENV=development
    exec npm run dev
    ;;

  "start" )
    # we can modify files here, using ENV variables passed in
    # the "docker create" command. It can't be done during build process.
    echo "db: $DATABASE_ADDRESS" >> /app/config.yml
    export NODE_ENV=production
    exec npm start
    ;;

  * )
    # Run custom command. Thanks to this line we can still use
    # "docker run our_image /bin/bash" and it will work
    exec $CMD ${@:2}
    ;;
esac
Example Dockerfile:
FROM node:7-alpine
WORKDIR /app
ADD . /app
RUN npm install
ENTRYPOINT ["./entrypoint.sh"]
CMD ["start"]
The image can be run with the following command:
# run the development version
docker run our-app dev
# run the production version
docker run our-app start
# run bash
docker run -it our-app /bin/bash
9. Use exec in entrypoint script
In the entrypoint script above, I used the exec command to run the node application. Without exec, we cannot shut down the container gracefully, because the SIGTERM signal is swallowed by the shell script process. A process started with exec replaces the script process, so all signals work as expected.
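The effect of exec can be seen outside Docker too; a minimal sketch showing that an exec'd command keeps the PID of the shell that launched it, instead of running as a child process:

```shell
# $$ in the outer shell and $$ in the exec'd shell print the same PID,
# because exec replaces the process image without forking
out=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
pid_before=$(echo "$out" | head -n1)
pid_after=$(echo "$out" | tail -n1)
[ "$pid_before" = "$pid_after" ] && echo "same PID: exec did not fork"
```

Inside a container, this is exactly why an exec'd application inherits PID 1 from the entrypoint script and receives SIGTERM directly.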
Here is an extended look at how a Docker container stops:
(1) For a container, an init system is not strictly necessary. When you stop a container with the docker stop mycontainer command, the docker CLI sends the TERM signal to the process with PID 1 inside mycontainer.

- If PID 1 is an init process - PID 1 forwards the TERM signal to its child processes, the child processes shut down, and finally the container terminates.
- If there is no init process - the application process in the container (the one specified by ENTRYPOINT or CMD in the Dockerfile) is PID 1 and is directly responsible for responding to the TERM signal. Two situations can then arise:
- The application does not handle SIGTERM - if the application does not listen for the SIGTERM signal, or implements no logic to handle it, the application will not stop and the container will not terminate.
- The container takes a long time to stop - after running docker stop mycontainer, Docker waits 10s; if the container has not terminated by then, Docker bypasses the container application and sends SIGKILL directly via the kernel, which forcibly kills the application and thereby terminates the container.

(2) If the process in the container never receives the SIGTERM signal, it is most likely because the application process is not PID 1: PID 1 is a shell, and the application process is only a child of that shell. A shell does not perform the duties of an init system, so it does not forward operating-system signals to its child processes. This is a common reason why the application in a container never receives SIGTERM.
The root of the problem comes from the Dockerfile, for example:
FROM alpine:3.7
COPY popcorn.sh .
RUN chmod +x popcorn.sh
ENTRYPOINT ./popcorn.sh
CMD ["start"]
The ENTRYPOINT instruction uses shell form, so Docker runs the application inside a shell, and the shell is PID 1.
The solutions are as follows:
Solution 1: Use exec form for the ENTRYPOINT instruction
Instead of the shell form, use the exec form, for example:
FROM alpine:3.7
COPY popcorn.sh .
RUN chmod +x popcorn.sh
ENTRYPOINT ["./popcorn.sh"]
Now PID 1 is ./popcorn.sh, and it is responsible for responding to all signals sent to the container. Whether ./popcorn.sh can actually catch system signals is another matter.
For example, suppose the above Dockerfile is used to build the image, and the popcorn.sh script prints the date every second:
#!/bin/sh
while true
do
date
sleep 1
done
Build the image and create the container:
docker build -t truek8s/popcorn .
docker run -it --name corny --rm truek8s/popcorn
Open another terminal to execute the command to stop the container and time it:
time docker stop corny
Because popcorn.sh does not implement any logic to catch and handle SIGTERM, it takes about 10s to stop the container.
To fix this, add signal-handling code to the script so that it exits when it catches the SIGTERM signal:
#!/bin/sh
# catch the TERM signal and then exit
trap "exit" TERM
while true
do
date
sleep 1
done
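The behaviour of trap can be checked outside Docker as well; a minimal sketch (plain shell, not a container) that sends TERM to such a loop and verifies the trap handler actually runs:

```shell
# run a loop that traps TERM and exits with a recognizable status
sh -c 'trap "exit 7" TERM; while true; do sleep 1; done' &
pid=$!
sleep 1              # give the loop time to install the trap
kill -TERM "$pid"
wait "$pid"
status=$?
# status 7 proves the trap handler ran; an untrapped TERM would
# terminate the shell with status 143 instead
echo "loop exited with status $status"
```

Note that the handler only runs after the current foreground command (sleep 1) finishes, so the loop still exits within about a second.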
Note: the shell-form ENTRYPOINT above is equivalent to the following exec-form instruction:
ENTRYPOINT ["/bin/sh", "-c", "./popcorn.sh"]
Solution 2: Use the exec command directly
If you still want to use the shell form of the ENTRYPOINT instruction, that is also possible: simply prefix the startup command with exec, for example:
FROM alpine:3.7
COPY popcorn.sh .
RUN chmod +x popcorn.sh
ENTRYPOINT exec ./popcorn.sh
Solution 3: Use an init system
If the application in the container cannot handle SIGTERM by default and you cannot modify its code, solutions 1 and 2 will not work; the only option left is to add an init system to the container. There are many init systems; Tini is recommended here. It is a lightweight init system designed specifically for containers, and it is very simple to use:

- Install tini
- Set tini as the container's default application
- Pass popcorn.sh as an argument to tini
The specific Dockerfile is as follows:
FROM alpine:3.7
COPY popcorn.sh .
RUN chmod +x popcorn.sh
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--", "./popcorn.sh"]
Now tini is PID 1, and it forwards the system signals it receives to its child process popcorn.sh.
10. Prefer COPY over ADD
The COPY instruction is very simple: it only copies files into the image. ADD is more complex: it can also download remote files and extract compressed archives (refer to the official documentation).
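A sketch of the difference (the file names and URL are illustrative):

```dockerfile
# COPY does exactly one thing: copy local files or directories
COPY config.json /app/config.json

# ADD additionally extracts local tar archives...
ADD project.tar.gz /app/
# ...and can fetch remote URLs (remote files are NOT extracted)
ADD https://example.com/file.txt /app/file.txt
```

Because COPY's behaviour is predictable, it is the better default; reach for ADD only when you actually need extraction or a remote fetch.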
FROM node:7-alpine
WORKDIR /app
COPY . /app
RUN npm install
ENTRYPOINT ["./entrypoint.sh"]
CMD ["start"]
11. Order the COPY and RUN instructions sensibly
We should put the parts that change least at the top of the Dockerfile, so the image build cache can be fully utilized.
When building an image, Docker executes the instructions in the Dockerfile in order. For each instruction, Docker checks its cache for an existing image it can reuse, rather than creating a new one.
If you don't want to use the build cache, you can pass the --no-cache=true option to docker build. When relying on the image cache, it is important to understand when the cache takes effect and when it is invalidated. The basic rules of the build cache are as follows:

- If the parent image is in the build cache, the next instruction is compared against all child images derived from that parent image. If a child image was built with the same instruction, the cache hits; otherwise the cache is invalidated.
- In most cases, comparing the Dockerfile instruction against the child images is sufficient, but some instructions require further inspection.
- For the ADD and COPY instructions, the contents of the files are examined and a checksum is calculated for each file. The files' last-modified and last-accessed times are not part of the checksum. During the build, Docker compares the checksum against existing images; if the file contents or metadata have changed, the cache is invalidated.
- Apart from ADD and COPY, cache checking does not inspect the files in the container. For example, when processing a RUN apt-get -y update instruction, the files updated inside the container are not examined; only the command string itself is compared to find a cache hit.
In the example, the source code changes frequently, so the NPM modules would be reinstalled on every image build, which is obviously not what we want. So we copy package.json first, then install the NPM modules, and only copy the rest of the source code at the end. This way, even when the source code changes, the NPM modules do not need to be reinstalled.
FROM node:7-alpine
WORKDIR /app
COPY package.json /app
RUN npm install
COPY . /app
ENTRYPOINT ["./entrypoint.sh"]
CMD ["start"]
Similarly, for a Python project, we can copy requirements.txt first, then run pip install -r requirements.txt, and COPY the code last.
FROM python:3.6
# create app directory
WORKDIR /app
# install app dependencies
COPY src/requirements.txt ./
RUN pip install -r requirements.txt
# bundle app source
COPY src /app
EXPOSE 8080
CMD [ "python", "server.py" ]
12. Set default environment variables, map ports and data volumes
Most likely some environment variables are required when running a Docker container. It is a good way to set default environment variables in Dockerfile. Also, we should setup mapped port and data volume in Dockerfile. An example is as follows:
FROM node:7-alpine
ENV PROJECT_DIR=/app
WORKDIR $PROJECT_DIR
COPY package.json $PROJECT_DIR
RUN npm install
COPY . $PROJECT_DIR
ENV MEDIA_DIR=/media \
    NODE_ENV=production \
    APP_PORT=3000
VOLUME $MEDIA_DIR
EXPOSE $APP_PORT
ENTRYPOINT ["./entrypoint.sh"]
CMD ["start"]
Environment variables specified with the [ENV](https://docs.docker.com/engine/reference/builder/#env) instruction are available inside the container. If you only need a variable while building the image, use the [ARG](https://docs.docker.com/engine/reference/builder/#arg) instruction instead.
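A minimal sketch of the ENV vs ARG distinction (the variable names and registry URL are illustrative):

```dockerfile
# ARG exists only at build time; override it with:
#   docker build --build-arg NPM_REGISTRY=https://my.mirror.example .
ARG NPM_REGISTRY=https://registry.npmjs.org

# ENV is available both at build time and in the running container
ENV APP_PORT=3000

RUN npm config set registry $NPM_REGISTRY && npm install
```

Inspecting the resulting container would show APP_PORT set, while NPM_REGISTRY leaves no trace in the final image's environment.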
13. Use LABEL to set image metadata
Using the LABEL directive, you can set metadata for an image, such as the image creator or image description. The old Dockerfile syntax used the MAINTAINER directive to specify the image creator, but it has been deprecated. Sometimes, some external programs need to use the metadata of the image, for example, nvidia-docker needs to use com.nvidia.volumes.needed. An example is as follows:
FROM node:7-alpine
LABEL maintainer "[email protected]"
...
14. Add HEALTHCHECK
When running a container, you can specify the --restart always option, so that the Docker daemon restarts the container when it crashes. This option is very useful for long-running containers. However, what if the container is running but unavailable (stuck in an infinite loop, misconfigured)? The HEALTHCHECK instruction lets Docker periodically check the container's health. We just need to specify a command that returns 0 if everything is fine and 1 otherwise. If you are interested in HEALTHCHECK, you can refer to this blog. An example is as follows:
FROM node:7-alpine
LABEL maintainer "[email protected]"
ENV PROJECT_DIR=/app
WORKDIR $PROJECT_DIR
COPY package.json $PROJECT_DIR
RUN npm install
COPY . $PROJECT_DIR
ENV MEDIA_DIR=/media \
NODE_ENV=production \
APP_PORT=3000
VOLUME $MEDIA_DIR
EXPOSE $APP_PORT
HEALTHCHECK CMD curl --fail http://localhost:$APP_PORT || exit 1
ENTRYPOINT ["./entrypoint.sh"]
CMD ["start"]
The curl --fail command returns a non-zero status when the request fails.
15. Multi-stage build
Reference: https://docs.docker.com/develop/develop-images/multistage-build/
In the era when docker did not support multi-stage builds, we usually used the following two methods when building docker images:
Method A. Write the whole build process in a single Dockerfile, including compiling, testing, and packaging the project and its dependent libraries. This can lead to the following problems:

- The Dockerfile becomes particularly bloated
- The image has very many layers
- There is a risk of leaking source code

Method B. Compile and test the project and its dependent libraries outside Docker, then copy the artifacts into the build directory and build the image.
Method B is slightly more elegant than method A and avoids its risks, but it still requires two or more Dockerfiles, or scripts to glue the two stages together automatically. For example, if several projects are related and depend on each other, we must maintain multiple Dockerfiles or write even more complex scripts, which drives up maintenance costs.
To solve the above problems, Docker v17.05 introduced support for multi-stage builds. With multi-stage builds we can easily solve the problems above using just one Dockerfile.
You can use multiple FROM statements in a Dockerfile. Each FROM instruction can use a different base image and marks the start of a new build stage. You can easily copy files from one stage to another, keeping only what you need in the final image.
By default, build stages are unnamed; we refer to them by index, starting at 0 for the first FROM instruction. We can also use the AS keyword to name a build stage.
Case 1
FROM golang:1.7.3
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]
After running docker build, the end result is a small production image with significantly reduced complexity. You don't need to create any intermediate images, and you don't need to extract any build artifacts to the local system.
How does it work? The key is the COPY --from=0 instruction. The second FROM instruction in the Dockerfile starts a new build stage with alpine:latest as the base image, and COPY --from=0 copies only the build artifact from the previous stage into this stage. The Go SDK and all intermediate layers produced in the first stage are discarded rather than saved in the final image.
Case 2
By default, build stages are unnamed, and you refer to them by integer index, starting at 0 for the first FROM instruction. For convenience, you can also name a build stage by adding AS NAME to the FROM instruction. The following example names the build stages and uses the name in the COPY instruction.
The advantage is that even if the instructions in the Dockerfile are reordered later, the COPY instruction can still find the corresponding build stage.
FROM golang:1.7.3 as builder
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]
Case 3: Stop at a specific build stage
When building an image, you don't necessarily need to build every stage in the Dockerfile; you can specify a target stage. For example, to build only the stage named builder:
$ docker build --target builder -t alexellis2/href-counter:latest .
This feature is suitable for scenarios such as:

- Debugging a specific build stage
- A debug stage with all debugging switches or tools enabled, and a production stage kept as lean as possible
- A testing stage in which the application is filled with test data, while the production stage uses production data
Case 4: Use an external image as a stage
With multi-stage builds, you are not limited to copying from stages created earlier in the same Dockerfile. The COPY --from instruction can also copy from a separate image, referenced by a local image name, a tag available locally or on a Docker registry, or a tag ID. For example:
COPY --from=nginx:latest /etc/nginx/nginx.conf /nginx.conf
Case 5: Continue from a previous stage as a new stage
A FROM instruction can pick up where a previous stage left off by referring to that stage. This is also convenient for different roles in a team: base images can be provided level by level in a pipeline-like fashion, making it quick to reuse the base images of other people on the team. For example:
FROM alpine:latest as builder
RUN apk --no-cache add build-base
FROM builder as build1
COPY source1.cpp source.cpp
RUN g++ -o /binary source.cpp
FROM builder as build2
COPY source2.cpp source.cpp
RUN g++ -o /binary source.cpp
The following example uses multiple stages to build a python application:
# ---- base python image ----
FROM python:3.6 AS base
# create app directory
WORKDIR /app
# ---- dependencies ----
FROM base AS dependencies
COPY gunicorn_app/requirements.txt ./
# install app dependencies
RUN pip install -r requirements.txt
# ---- copy files and build ----
FROM dependencies AS build
WORKDIR /app
COPY . /app
# build or compile here if needed
# ---- release with Alpine ----
FROM python:3.6-alpine3.7 AS release
# create app directory
WORKDIR /app
COPY --from=dependencies /app/requirements.txt ./
COPY --from=dependencies /root/.cache /root/.cache
# install app dependencies
RUN pip install -r requirements.txt
COPY --from=build /app/ ./
CMD ["gunicorn", "--config", "./gunicorn_app/conf/gunicorn_config.py", "gunicorn_app:app"]