Tips for optimizing Docker images

Docker images are the core artifact that Docker runs. They are the "blueprint of the container" and provide the instructions for creating it. In this article, I will introduce some often overlooked concepts that help optimize how Docker images are developed and built.

How do you build a Docker image?

Let's start with the Docker build process. A build is triggered by running the docker build command in the Docker CLI.

The docker build command builds a Docker image according to the instructions in a Dockerfile. A Dockerfile is a text document that contains, in order, all the commands needed to assemble the image.

A Docker image consists of read-only layers. Each layer represents a Dockerfile instruction. The layers are stacked, and each one is a delta of the changes from the previous layer. I think of these layers as a form of caching: only the layers that have changed are rebuilt, rather than the whole image on every change.

The following example shows the contents of a Dockerfile:

FROM ubuntu:18.04
COPY . /app
RUN make /app
CMD python /app/app.py

Each instruction in this file is one step of the image build. The following is a brief description of each instruction:

FROM creates a layer from the ubuntu:18.04 Docker image.

COPY adds files from the Docker client's current directory.

RUN builds your application with the make command.

CMD specifies which command to run in the container.

When these four instructions are executed during the build, FROM, COPY and RUN contribute filesystem layers to the image, while CMD only records metadata.

If you want to learn more about images and layers, you can read about them here.

Optimize the image building process

Now that we have covered the Docker build process, I would like to share some optimization suggestions that help you build images efficiently.

  1. Ephemeral containers

The image defined by your Dockerfile should generate containers that are as short-lived as possible.

Here, an ephemeral container is one that can be destroyed, rebuilt, and replaced with a new container. Ephemeral containers can be considered disposable: each instance is new and unrelated to any previous container instance.

When developing Docker images, you should rely on this ephemeral pattern as much as possible.

  2. Don't install unnecessary packages

Avoid installing files and software packages that you do not need.

Docker images should be kept lean, which improves portability, shortens build time, and reduces complexity and file size. For example, in most cases you should not install a text editor in the container, nor any other non-essential application or service.
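
As a minimal sketch of keeping an image lean, the Dockerfile below installs only what the application is assumed to need. The package names (make, python3) are assumptions based on the earlier example, not requirements stated by it. The --no-install-recommends option tells apt-get to skip optional recommended packages, and removing the apt package lists in the same RUN keeps them out of the resulting layer.

```dockerfile
FROM ubuntu:18.04

# Install only what the build and runtime need (package names are assumed
# for illustration). No editors or other convenience tools are added.
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    make \
    python3 \
 && rm -rf /var/lib/apt/lists/*
```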

  3. Use a .dockerignore file

The .dockerignore file declares files and directories that are excluded from the build context, and therefore from the image. This helps you avoid packaging unnecessarily large or sensitive files and keeps them out of public images.

If you want to exclude files that are not relevant to the build without restructuring your source repository, use a .dockerignore file. It supports exclusion patterns similar to those of .gitignore files.
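
The effect of a .dockerignore file is easiest to see next to a COPY instruction. The sketch below is only illustrative: the .dockerignore entries shown in the comments are hypothetical examples, and the Dockerfile reuses the base image and /app path from the earlier example.

```dockerfile
# Assumed contents of a .dockerignore file placed next to this Dockerfile
# (the entries are hypothetical examples):
#
#   .git
#   .env
#   tmp
#   **/*.log
#
FROM ubuntu:18.04

# With the entries above, this COPY never sees .git, .env, tmp or any
# .log file, because they are left out of the build context.
COPY . /app
```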

  4. Sort multi-line arguments

Wherever possible, simplify future changes by sorting multi-line arguments alphanumerically. This helps avoid duplicate packages and makes the list easier to update. It also makes PRs easier to read and review. Adding a space before the backslash (\) helps as well.
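
A short sketch of what sorted multi-line arguments can look like in practice. The package list is assumed purely for illustration; what matters is one package per line, alphabetical order, and a space before each backslash.

```dockerfile
FROM ubuntu:18.04

# One package per line, sorted alphabetically, so that adding or removing
# a package shows up as a single-line change in a diff.
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    curl \
    git \
    make \
    python3 \
 && rm -rf /var/lib/apt/lists/*
```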

  5. Decouple applications

Applications that depend on other applications are considered "coupled."

In non-container deployments, coupled applications are commonly hosted on the same host or compute node. For microservices, however, each application should live in its own container. Decoupling applications into multiple containers makes it easier to scale horizontally and to reuse containers. For example, a decoupled web application stack may consist of three separate containers, each with its own unique image: one for the web application, one for the database, and one for the cache.

Limiting each container to one process is a good rule of thumb. Use your best judgment to keep the container as clean and modular as possible.

If the containers depend on each other, you can make sure they can communicate by using Docker container networks.

  6. Minimize the number of layers

In a Docker build, only the RUN, COPY and ADD instructions create layers. Other instructions create temporary intermediate images and do not increase the size of the final build.

Where possible, use multi-stage builds and copy only the required components into the final image. This lets you include additional tools or debugging information in the intermediate build stages without increasing the size of the final image.
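
One way to copy only the required components into the final image is a multi-stage build. The sketch below adapts the earlier example under a few assumptions: the project has a Makefile in its root, the application runs with python3, and the stage name build is arbitrary. It is an illustration, not the only possible layout.

```dockerfile
# Build stage: contains make and any other build-time tools.
FROM ubuntu:18.04 AS build
RUN apt-get update \
 && apt-get install -y --no-install-recommends make \
 && rm -rf /var/lib/apt/lists/*
COPY . /app
# Assumes a Makefile in the project root; -C runs make inside /app.
RUN make -C /app

# Final stage: starts again from a clean base image and receives only the
# built application directory, not the build tools installed above.
FROM ubuntu:18.04
RUN apt-get update \
 && apt-get install -y --no-install-recommends python3 \
 && rm -rf /var/lib/apt/lists/*
COPY --from=build /app /app
CMD ["python3", "/app/app.py"]
```

Only the last stage ends up in the image that docker build tags, so anything installed in the build stage does not add to its size.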

  7. Use the build cache

When building an image, Docker steps through the instructions in the Dockerfile and executes each one in order. For each instruction, Docker looks in its cache for an existing image it can reuse instead of creating a new one. These are the basic rules Docker follows:

Starting from a parent image that is already in the cache, the next instruction is compared against all child images derived from that base image to see whether one of them was built using exactly the same instruction. If not, the cache is invalidated.

For ADD and COPY instructions, the contents of the files being copied are examined and compared with those in the existing images. If anything in the files has changed (such as contents or metadata), the cache is invalidated.

Aside from the ADD and COPY instructions, cache checking does not look at the files in the container to determine a cache match. For example, when processing a RUN apt-get -y update command, the files updated in the container are not examined to determine whether a cache hit exists. Only the command string itself is used to find a match.

Once the cache is invalidated, all subsequent Dockerfile instructions generate new images and the cache is not used.
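
Because invalidation affects every instruction that follows, the order of instructions matters. The sketch below shows one common way to take advantage of this, assuming a Python application whose dependencies are listed in a file named requirements.txt (an assumed name, not something the original example specifies): instructions that rarely change come first, and the frequently changing source code is copied last.

```dockerfile
FROM ubuntu:18.04

RUN apt-get update \
 && apt-get install -y --no-install-recommends python3 python3-pip \
 && rm -rf /var/lib/apt/lists/*

# Copy only the dependency manifest first (requirements.txt is an assumed
# file name). This step and the install below stay cached until the
# manifest itself changes.
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r /app/requirements.txt

# Source files change most often, so copy them last. Editing app.py
# invalidates the cache from this instruction onward only.
COPY . /app
CMD ["python3", "/app/app.py"]
```

With this ordering, a change to the application source reuses the cached dependency installation instead of repeating it on every build.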

Optimize Docker image builds in the CI pipeline

First of all, every optimization mentioned above also applies when building an image in a CI pipeline. Even if the Dockerfile changes, using the cache is still the best way to reduce build time.

When building Docker images is a regular part of the CI pipeline, the Docker layer caching (DLC) feature of CircleCI can be used to speed up the build. DLC is a great feature. It saves the image layers created within a job and reuses the unchanged layers in subsequent builds, instead of rebuilding the entire image every time.

DLC can be used with the machine executor or with the remote Docker environment (setup_remote_docker). Note that DLC is only useful when you create Docker images with commands such as docker build and docker compose.

If you want to learn more about DLC, you can read about it in the documentation.

Summary

In this article, I introduced optimization techniques for building Docker images that will help you develop images efficiently and speed up your CI pipeline.

Origin blog.csdn.net/liuxingjiaoyu/article/details/112346554