Docker image slimming

Docker is an open platform for developing, shipping, and running applications. It separates applications from infrastructure so that the development, testing, and deployment environments stay consistent, which enables rapid delivery. In real projects, however, modules and services are split finely, which results in too many images (50+) that are too large in total (the packaged and compressed images exceed 50G). This creates considerable risk for deployment, especially for private (on-premises) deployment, where the images are copied over on removable media. Starting from several articles on image slimming, this article verifies them in practice and, combined with the official Dockerfile best practices, summarizes 4 methods for shrinking images plus several techniques for day-to-day work.

Image build

Construction method

There are two ways to build an image: one is docker build, which executes the instructions in a Dockerfile; the other is docker commit, which packages an existing container into an image. We usually use the first method. The difference between the two is like the difference between batch processing and single-step execution.

Size analysis

A Docker image is composed of many layers (up to 127). Each instruction in a Dockerfile creates a layer, but only RUN, COPY, and ADD increase the image size. You can view the size of each layer with docker history image_id.
Here we take the official alpine:3.12 as an example and look at its layers.

FROM scratch
ADD alpine-minirootfs-3.12.0-x86_64.tar.gz /
CMD ["/bin/sh"]

Figure: layers of the alpine:3.12 image
Comparing the Dockerfile with the layer history shows that the ADD layer takes 5.57M, while the CMD layer takes no space.
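
To total the per-layer sizes instead of reading them off one by one, the output of docker history can be piped through a small helper. The sketch below is one possible way to do it. The function name sum_layer_bytes is hypothetical, the sample numbers are made up, and it assumes docker history is run with --human=false so that sizes are printed as plain byte counts.

```shell
#!/bin/sh
# sum_layer_bytes: hypothetical helper that adds up one byte count per line
# read from standard input and prints the total.
sum_layer_bytes() {
    awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Usage with a real image (requires a running Docker daemon):
#   docker history --human=false --format '{{.Size}}' alpine:3.12 | sum_layer_bytes

# Offline demonstration with two made-up layer sizes, in bytes:
printf '%s\n' 5570000 0 | sum_layer_bytes
```

A layer that reports 0 bytes, such as the CMD layer, adds nothing to the total.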

An image layer is like a Git commit: it stores the difference between the previous version of the image and the current one. So when we pull an image from a public or private hub with docker pull, only the layers we do not already have are downloaded.
This is a very efficient way to share images, but it can be misused, for example by committing the same file repeatedly.
Figure: repeated commit example
As the figure above shows, the base image alpine:3.12 takes 5.57M and the file idps_sm.tar.gz takes 4.52M. However, the instruction RUN rm -f ./idps_sm.tar.gz does not reduce the image size: the image is still the size of the base image plus the file ADDed twice.
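
The effect can be reproduced without Docker by modelling each layer as a directory that holds only what one instruction changed. This is a rough sketch under that assumption, not a description of how the storage driver is implemented; the directory names, the layer_bytes helper, and the 4096-byte payload are made up for illustration. Deleting a file in a later layer only records a whiteout marker (an empty file with a .wh. prefix in the OCI layer format), so the bytes stored by the earlier layer still count toward the total.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
mkdir "$work/layer1_add" "$work/layer2_rm"

# Layer 1 (ADD idps_sm.tar.gz): the file's bytes are stored in this layer.
dd if=/dev/zero of="$work/layer1_add/idps_sm.tar.gz" bs=4096 count=1 2>/dev/null

# Layer 2 (RUN rm -f ./idps_sm.tar.gz): only an empty whiteout marker is stored.
: > "$work/layer2_rm/.wh.idps_sm.tar.gz"

# layer_bytes: hypothetical helper, total bytes of all files in one layer directory.
layer_bytes() {
    find "$1" -type f -exec cat {} + | wc -c | tr -d ' '
}

size1=$(layer_bytes "$work/layer1_add")
size2=$(layer_bytes "$work/layer2_rm")
total=$((size1 + size2))
echo "layer 1: $size1 bytes, layer 2: $size2 bytes, image total: $total bytes"
rm -rf "$work"
```

The total stays at the size of the first layer, which is why a cleanup step in its own RUN instruction does not shrink the image.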

Slimming approach

Once you understand why an image grows during the build, you can apply the right remedy: reduce the number of layers or reduce the size of each layer.

Image slimming

To put image slimming into practice, we take packaging a redis image as an example. Before packaging, we pull the official redis images and
find that the image tagged 6 is 104M and the image tagged 6-alpine is 31.5M. The packaging process is as follows:

  1. Select the base image, update the package sources, and install the build tools
  2. Download the source code, then build and install it
  3. Clean up unnecessary installation files

Following this process, we write the Dockerfile below. Building it with docker build --no-cache -t optimize/redis:multiline -f redis_multiline . produces a 441M image.

FROM ubuntu:focal

ENV REDIS_VERSION=6.0.5
ENV REDIS_URL=http://download.redis.io/releases/redis-$REDIS_VERSION.tar.gz

# update source and install tools
RUN sed -i "s/archive.ubuntu.com/mirrors.aliyun.com/g; s/security.ubuntu.com/mirrors.aliyun.com/g" /etc/apt/sources.list 
RUN apt update 
RUN apt install -y curl make gcc

# download source code and install redis
RUN curl -L $REDIS_URL | tar xzv
WORKDIR redis-$REDIS_VERSION
RUN make
RUN make install
 
# clean up
RUN rm  -rf /var/lib/apt/lists/* 

CMD ["redis-server"]

Merging RUN instructions

Merging instructions is the simplest and most convenient way to reduce the number of image layers. It saves space because the "cache" and the build tools are cleaned up in the same layer that created them.
We package redis again. The Dockerfile with merged instructions is shown below, and the resulting image is 292M.

FROM ubuntu:focal

ENV REDIS_VERSION=6.0.5
ENV REDIS_URL=http://download.redis.io/releases/redis-$REDIS_VERSION.tar.gz

# update source and install tools
RUN sed -i "s/archive.ubuntu.com/mirrors.aliyun.com/g; s/security.ubuntu.com/mirrors.aliyun.com/g" /etc/apt/sources.list &&\
    apt update &&\
    apt install -y curl make gcc &&\
    # download source code and install redis
    curl -L $REDIS_URL | tar xzv &&\
    cd redis-$REDIS_VERSION &&\
    make &&\
    make install &&\
    # clean up
    apt remove -y --auto-remove curl make gcc &&\
    apt clean &&\
    rm -rf /var/lib/apt/lists/*

CMD ["redis-server"]

Using docker history to analyze the optimize/redis:multiline and optimize/redis:singleline images gives the following:
Figure: merged instructions
The figure shows that in optimize/redis:multiline, cleaning up data in a separate layer does not reduce the image size, which is the layer-sharing problem described above. Merging instructions therefore works by cleaning up the cache and unused tools in the same layer, which reduces the image size.

Multi-stage builds

Multi-stage builds are the officially recommended practice for packaging images and the most thorough way to cut down layers. In general, the build is split into two stages. One stage is for development and packaging and contains everything needed to build the application; the other is for production and contains only the application and what it needs to run. This is called the "builder pattern". The relationship between the two stages is a bit like the relationship between the JDK and the JRE.
Multi-stage builds reduce the image size, but how much depends on the programming language. The effect is greater for compiled languages, because the redundant dependencies of the build environment are dropped and only the compiled binaries or jar packages are used. For interpreted languages, the effect is less pronounced.

We package the redis image once more, this time with a multi-stage Dockerfile, and the resulting image is 135M.

FROM ubuntu:focal AS build

ENV REDIS_VERSION=6.0.5
ENV REDIS_URL=http://download.redis.io/releases/redis-$REDIS_VERSION.tar.gz

# update source and install tools
RUN sed -i "s/archive.ubuntu.com/mirrors.aliyun.com/g; s/security.ubuntu.com/mirrors.aliyun.com/g" /etc/apt/sources.list &&\
    apt update &&\
    apt install -y curl make gcc &&\
    # download source code and install redis
    curl -L $REDIS_URL | tar xzv &&\
    cd redis-$REDIS_VERSION &&\
    make &&\
    make install

FROM ubuntu:focal
# copy
ENV REDIS_VERSION=6.0.5
COPY --from=build /usr/local/bin/redis* /usr/local/bin/

CMD ["redis-server"]

Compared with optimize/redis:singleline, there are three changes:

  1. AS build is added to the first line, in preparation for the later COPY
  2. The first stage has no cleanup step, because only the compiled artifacts (binaries or jar packages) from that stage are used and everything else is discarded
  3. The second stage copies the artifacts directly from the first stage

Likewise, use docker history to view the image sizes:
Figure: multi-stage build

Comparing our multi-stage image with the official redis:6 (it cannot be compared with redis:6-alpine, because redis:6 and ubuntu:focal are both Debian-based images) shows a difference of about 30M. Looking into the Dockerfile of redis:6 reveals the following "clever trick":

serverMd5="$(md5sum /usr/local/bin/redis-server | cut -d' ' -f1)"; export serverMd5; \
find /usr/local/bin/redis* -maxdepth 0 \
		-type f -not -name redis-server \
		-exec sh -eux -c ' \
			md5="$(md5sum "$1" | cut -d" " -f1)"; \
			test "$md5" = "$serverMd5"; \
		' -- '{}' ';' \
		-exec ln -svfT 'redis-server' '{}' ';' \

Compiling the redis source shows that the binaries redis-server, redis-check-aof (AOF persistence), redis-check-rdb (RDB persistence), and redis-sentinel (Redis Sentinel) are identical files of 11M each. The official image uses the script above to replace the latter three with links (ln) to redis-server.
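
The same idea can be tried outside an image build. The following is a minimal sketch that runs against throwaway files in a temporary directory rather than real redis binaries. It compares contents with cmp instead of the md5sum check used by the official script, and it replaces each identical copy with a symbolic link to redis-server.

```shell
#!/bin/sh
set -eu
bindir=$(mktemp -d)
cd "$bindir"

# Stand-ins for the compiled binaries: three identical files and one that differs.
printf 'server-binary' > redis-server
cp redis-server redis-sentinel
cp redis-server redis-check-aof
printf 'cli-binary' > redis-cli

for f in redis-*; do
    if [ "$f" = redis-server ]; then
        continue
    fi
    # Replace byte-identical copies with a symlink to redis-server.
    if cmp -s "$f" redis-server; then
        ln -sf redis-server "$f"
    fi
done

# The temporary directory is left in place so the result can be inspected.
ls -l
```

Only the files whose contents match redis-server become links; redis-cli differs and stays a regular file.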

Use a suitable base image

Alpine is the recommended base image. Alpine is a highly streamlined, lightweight Linux distribution that includes basic tools. The base image is only 4.41M. Every major language and framework offers an Alpine-based image, and using it is strongly recommended. Advanced users can try building a base image from scratch or busybox. The official images redis:6 (104M) and redis:6-alpine (31.5M) show that the Alpine image is only about 1/3 the size of the Debian image.

One thing to note about Alpine images is that they are based on musl libc (an alternative to the glibc standard library). Both libraries implement the same kernel interface.
Of the two, glibc is more common and faster, while musl uses less space and focuses on security.
Most applications are compiled against a specific libc. To use them with another libc, they must be recompiled. In other words, building a container on the Alpine base image may cause unexpected behavior because the standard C library is different.
However, this situation is rarely encountered, and there are solutions when it does occur.

Deleting cache files created by RUN

Most package managers on Linux need to refresh their package index, which leaves cache files behind. Common cleanup methods are recorded here.

  • Debian-based images

    # Switch to a mirror in China and update the package index
    sed -i "s/deb.debian.org/mirrors.aliyun.com/g" /etc/apt/sources.list && apt update
    # --no-install-recommends is very useful
    apt install -y --no-install-recommends a b c && rm -rf /var/lib/apt/lists/*
    
  • Alpine images

    # Switch to a mirror in China
    sed -i 's/dl-cdn.alpinelinux.org/mirrors.tuna.tsinghua.edu.cn/g' /etc/apk/repositories
    # --no-cache means the package index is not cached
    apk add --no-cache a b c && rm -rf /var/cache/apk/*
    
  • CentOS images

    # Switch to a mirror in China and rebuild the cache
    curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo && yum makecache
    yum install -y a b c && yum clean all
    

Dockerfile practice

Best practice points

  • Write a .dockerignore file
  • Run only a single application per container
  • Do not use the latest tag for base images or production images
  • Set WORKDIR and CMD
  • Use ENTRYPOINT and start the command with exec (optional)
  • Prefer COPY over ADD
  • Set default environment variables, exposed ports, and data volumes
  • Use LABEL to set image metadata
  • Add a HEALTHCHECK
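
Several of these points can be seen together in one place. The sketch below writes a .dockerignore and a Dockerfile into a scratch directory. Every concrete value in it (the ignore patterns, the label values, the TZ variable, the health-check timings, the optimize/redis:practice tag, and the choice of redis:6-alpine as the base) is an assumption made for illustration rather than something prescribed by this article.

```shell
#!/bin/sh
set -eu
ctx=$(mktemp -d)

# .dockerignore keeps unneeded files out of the build context.
cat > "$ctx/.dockerignore" <<'EOF'
.git
*.log
tmp/
EOF

cat > "$ctx/Dockerfile" <<'EOF'
# Pin the base image tag instead of using latest.
FROM redis:6-alpine
LABEL maintainer="ops@example.com" description="illustrative redis image"
# Default environment variable, port, and data volume.
ENV TZ=UTC
EXPOSE 6379
VOLUME ["/data"]
WORKDIR /data
HEALTHCHECK --interval=30s --timeout=3s CMD redis-cli ping || exit 1
CMD ["redis-server"]
EOF

echo "wrote build context to $ctx"
# To build it (requires Docker): docker build -t optimize/redis:practice "$ctx"
```

The official redis image already sets some of these values itself; they are repeated here only so that each best-practice point has a visible line.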

Multi-stage build example

FROM golang:1.11-alpine AS build

# Install the tools required by the project
# Run `docker build --no-cache .` to update dependencies
RUN apk add --no-cache git
RUN go get github.com/golang/dep/cmd/dep

# Install the project's library dependencies (Go uses Gopkg.toml and Gopkg.lock)
# These layers are only re-built when Gopkg files are updated
COPY Gopkg.lock Gopkg.toml /go/src/project/
WORKDIR /go/src/project/
# Install library dependencies
RUN dep ensure -vendor-only

# Copy the project and build it
# This layer is rebuilt when a file changes in the project directory
COPY . /go/src/project/
RUN go build -o /bin/project

# Minimal production environment
FROM scratch
COPY --from=build /bin/project /bin/project
ENTRYPOINT ["/bin/project"]
CMD ["--help"]

Common problems

Using the Alpine base image

  1. Resolving glibc issues

    ENV ALPINE_GLIBC_VERSION="2.31-r0"
    ENV LANG=C.UTF-8
    
    RUN set -x \
        && sed -i 's/dl-cdn.alpinelinux.org/mirrors.tuna.tsinghua.edu.cn/g' /etc/apk/repositories \
        && apk add --no-cache wget \
        && wget -q -O /etc/apk/keys/sgerrand.rsa.pub https://alpine-pkgs.sgerrand.com/sgerrand.rsa.pub \
        && wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/$ALPINE_GLIBC_VERSION/glibc-$ALPINE_GLIBC_VERSION.apk \
        && wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/$ALPINE_GLIBC_VERSION/glibc-bin-$ALPINE_GLIBC_VERSION.apk \
        && wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/$ALPINE_GLIBC_VERSION/glibc-i18n-$ALPINE_GLIBC_VERSION.apk \
        && apk add --no-cache glibc-$ALPINE_GLIBC_VERSION.apk  \
                        glibc-bin-$ALPINE_GLIBC_VERSION.apk \
                        glibc-i18n-$ALPINE_GLIBC_VERSION.apk \
        && ( /usr/glibc-compat/bin/localedef --force --inputfile POSIX --charmap UTF-8 "$LANG" || true ) \
        && echo "export LANG=$LANG" > /etc/profile.d/locale.sh \
        && apk del glibc-i18n \
        && rm glibc-$ALPINE_GLIBC_VERSION.apk glibc-bin-$ALPINE_GLIBC_VERSION.apk glibc-i18n-$ALPINE_GLIBC_VERSION.apk
    

References

  1. Dockerfile best practices
  2. Docker multi-stage builds
  3. Three tips to reduce Docker image size by 90%
  4. Five general methods to streamline Docker images
  5. Optimizing Dockerfile best practices
  6. alpine:3.12 image

Origin blog.csdn.net/haojunyu2012/article/details/112690552