Optimizing image pull speed with Harbor and Kraken

1. Brief introduction of P2P image distribution

As cloud-native architecture is adopted by more and more enterprises, the scale of container clusters in enterprise applications keeps growing. Once a cluster reaches a certain size and a single containerized application runs enough replicas, distributing container images across the cluster becomes a challenge.
P2P (Peer-to-Peer) image distribution borrows the idea of Internet P2P file transfer. It aims to improve the efficiency of image distribution in container clusters and to optimize Kubernetes clusters with faster image pulls.
This article mainly covers the theory behind Kraken + Harbor; deployment and usage will be described in detail in the next article.

1.1 The principle of P2P image distribution

When image distribution reaches a certain scale, the bandwidth of the Harbor service or its backend storage is the first to come under pressure. If 100 nodes pull an image at the same time, the compressed image is 500 MB, and the pull must complete within 10 seconds, then the backend storage has to sustain 5 GB/s of bandwidth. In larger clusters there may be thousands of nodes pulling an image simultaneously, and the bandwidth pressure can be imagined.
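The arithmetic above can be checked with a quick back-of-envelope calculation (a sketch using the example figures from the text: node count, image size, and time window):

```python
# Back-of-envelope estimate of registry bandwidth under concurrent pulls.
nodes = 100          # nodes pulling simultaneously
image_mb = 500       # compressed image size in MB
window_s = 10        # required pull completion time in seconds

total_mb = nodes * image_mb             # total data leaving the registry
bandwidth_gb_s = total_mb / window_s / 1000

print(f"total data: {total_mb} MB")                   # 50000 MB
print(f"required bandwidth: {bandwidth_gb_s} GB/s")   # 5.0 GB/s
```

Doubling the node count or the image size doubles the required backend bandwidth, which is why a single central registry stops scaling.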
There are many ways to mitigate this, such as partitioning clusters, adding caches, and load balancing, but a better solution may be P2P image distribution technology.

Note: Harbor's default timeout for image replication and image pull is 900 s. The image pull timeout can be changed in Harbor's Nginx configuration file /etc/nginx/nginx.conf.

$ cat /etc/nginx/nginx.conf
http {
  ...
  proxy_connect_timeout  600;   # connection timeout
  proxy_send_timeout     600;   # send timeout
  proxy_read_timeout     600;   # read timeout
  ...
}

After the modification, restart the Harbor and Nginx services for the configuration to take effect.

P2P image distribution splits the file to be distributed into pieces and generates a seed file describing them; each P2P node then downloads pieces according to the seed file. Because the file is sharded, the download can be split into multiple tasks executed in parallel: different nodes pull different pieces from the seed node, and once a node finishes downloading, it can itself act as a seed for other nodes. With this decentralized pull model, traffic is spread evenly across the nodes of the P2P network, which significantly increases distribution speed.
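The sharding idea can be sketched minimally as follows (the 4 MiB piece size and SHA-256 digests are illustrative assumptions, not any particular system's actual seed format):

```python
import hashlib

PIECE_SIZE = 4 * 1024 * 1024  # illustrative 4 MiB piece size

def make_seed_info(data: bytes, piece_size: int = PIECE_SIZE) -> list[str]:
    """Split a blob into fixed-size pieces and record each piece's digest,
    the way a torrent-style seed file describes a file."""
    pieces = [data[i:i + piece_size] for i in range(0, len(data), piece_size)]
    return [hashlib.sha256(p).hexdigest() for p in pieces]

# Each node can now fetch different pieces from different peers in
# parallel and verify every piece against the recorded digest.
seed = make_seed_info(b"x" * (9 * 1024 * 1024))  # 9 MiB blob -> 3 pieces
print(len(seed))
```

Because pieces are independently verifiable, any node that already holds a piece can serve it to others, which is what turns each downloader into a new seed.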

1.2 What are the P2P image distribution technologies?

The following are some common P2P image distribution technologies:

Kraken: an open-source P2P image distribution project developed by Uber in Go.
Dragonfly: Alibaba's open-source P2P image and file distribution system, which addresses the distribution problems faced by cloud-native applications.
BitTorrent: a widely used P2P file distribution protocol that can also be used to distribute image files. The BitTorrent protocol divides a file into multiple pieces, each shared by multiple nodes; downloaders fetch pieces from multiple nodes at the same time, increasing download speed.
IPFS: IPFS (InterPlanetary File System) is a distributed file system that can be used to store and distribute image files. IPFS splits files into blocks stored across multiple nodes, and downloaders can fetch blocks from several nodes, improving download speed and reliability.
Docker Swarm: the container orchestration tool officially provided by Docker, which can also be used to distribute images. Docker Swarm distributes images to multiple nodes, and downloaders can fetch them from several nodes, improving download speed and reliability.
Alibaba Cloud P2P image service: Alibaba Cloud offers a P2P image service for distributing image files. It spreads an image across multiple nodes, and downloaders fetch it from the nearest node, improving download speed and reliability.
It should be noted that P2P image distribution requires sharing and downloading files among multiple nodes, so security and copyright issues must be considered. In addition, different P2P image distribution technologies have different implementations and characteristics, and an appropriate one should be chosen according to the actual situation.

2. Kraken P2P image distribution

Project address: https://github.com/uber/kraken

2.1 What is Kraken?

Kraken is a P2P-driven Docker registry focused on scalability and availability. It is designed for Docker image management, replication, and distribution in hybrid cloud environments. With pluggable backend support, Kraken can easily be integrated into existing Docker registry setups as the distribution layer.

Kraken has been in production at Uber since early 2018. On its busiest cluster, Kraken distributes over 1 million blobs per day, including 100,000 blobs of 1 GB or more. During peak production load, Kraken has distributed 20K 100 MB to 1 GB blobs in 30 seconds.
Kraken is an open-source P2P image distribution project developed by Uber in Go. It adopts a shared-nothing architecture, which makes it simple to deploy and highly fault-tolerant, so the overall operation and maintenance cost of the system is relatively low.

The core of Kraken is P2P file distribution. Its main function is the P2P distribution of container images; it does not currently support the distribution of general files.

2.2 Kraken components

Kraken consists of five components: Proxy, Build-Index, Tracker, Origin, and Agent:

Proxy: the entry point of Kraken; implements the Docker Registry V2 interface.
Build-Index: connects to the backend storage and is responsible for mapping image tags to digests.
Origin: connects to the backend storage, stores file objects, and acts as a seed node during distribution.
Tracker: the central service of P2P distribution; records which content each node holds and is responsible for forming the P2P distribution network.
Agent: deployed on each node; implements the Docker Registry V2 interface and pulls image files via P2P.
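To make the Agent's role concrete, here is a hypothetical sketch of how a client addresses the Agent's Docker Registry V2 endpoint. The port 16000 and the repository name are assumptions for illustration; check your agent configuration for the actual port:

```python
AGENT = "localhost:16000"  # assumed agent registry port

def manifest_url(agent: str, repo: str, tag: str) -> str:
    # Docker Registry V2 manifest endpoint: GET /v2/<name>/manifests/<reference>
    return f"http://{agent}/v2/{repo}/manifests/{tag}"

print(manifest_url(AGENT, "library/nginx", "latest"))
# With the Docker daemon pointed at the agent, a pull would look like:
#   docker pull localhost:16000/library/nginx:latest
```

Because the Agent speaks the same Registry V2 protocol as any registry, the Docker daemon needs no special P2P support; it simply pulls from the local agent, which transparently fetches the layers over the P2P network.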

2.3 Features

Here are some highlights of Kraken:

Highly scalable: Kraken can distribute Docker images at more than 50% of the maximum download speed limit on every host, and cluster size and image size have no significant effect on download speed. Each cluster supports at least 15k hosts. Arbitrarily large blobs/layers are supported; the maximum size is normally limited to 20G for best performance.
Highly available: no component is a single point of failure.
Secure: supports uploader authentication and data integrity protection via TLS.
Pluggable storage options: instead of managing data itself, Kraken plugs into reliable blob storage such as S3, GCS, ECR, HDFS, or other registries. The storage interface is simple, and new options are easy to add.
Lossless cross-cluster replication: Kraken supports rule-based asynchronous replication between clusters.
Minimal dependencies: besides pluggable storage, Kraken has only an optional dependency on DNS.

2.4 Three options for Kraken+Harbor

Combining Harbor with Kraken's P2P image distribution, the basic idea is to use Harbor as the image management system and Kraken as the image distribution tool. There are three options:

Option 1: use Harbor as the unified external interface, with Kraken as the backend connected to the Registry;
Option 2: use Kraken as the unified external interface, with Harbor as the backend;
Option 3: use a shared Registry as the unified backend, with both Harbor and Kraken using this Registry.

In the third option, the two systems are decoupled: if Harbor or Kraken fails, the other can still work normally. This option requires that Harbor and Kraken can access the same Registry; it is best to deploy them in the same Kubernetes cluster and reach the same Registry through a Service.

If they cannot be deployed in the same cluster, cross-cluster access is required and efficiency suffers. Therefore, when Harbor and Kraken are deployed in different clusters, the second option can also be used, in which case Kraken depends strongly on Harbor: Kraken accesses Harbor through the Docker Registry V2 interface that Harbor exposes. Users can still manage images through Harbor and distribute them through Kraken.
In the shared-Registry solution, Harbor and Kraken share the Registry service, and only one copy of image data is saved in the Registry. Through Kraken's image preheating function, an image can be cached in Kraken as soon as it is pushed to Harbor. Users can configure the size and expiration policy of the Kraken cache.

In a P2P distribution system, when distribution of a file begins, the system first generates the file's seed (torrent) file, which contains the file's basic information, its piece information, the address of the Tracker server (the P2P central service), and so on.

When a node in the P2P network needs to download a file, it first obtains the file's seed file, then asks the P2P central service which pieces can be downloaded from which nodes, and downloads the pieces from different nodes accordingly. After all pieces are downloaded, they are assembled into the complete file. The concrete implementation differs between P2P distribution systems, but the core mechanism is the same.
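The verify-and-reassemble step at the end can be sketched as follows (SHA-256 piece digests are an illustrative assumption; real systems vary in their hash choice and piece layout):

```python
import hashlib

def assemble(piece_digests: list[str], fetched_pieces: list[bytes]) -> bytes:
    """Verify pieces fetched from different peers against the seed file's
    digests, then concatenate them in order into the complete file."""
    assert len(piece_digests) == len(fetched_pieces)
    blob = b""
    for digest, piece in zip(piece_digests, fetched_pieces):
        if hashlib.sha256(piece).hexdigest() != digest:
            raise ValueError("corrupt piece, re-download from another peer")
        blob += piece
    return blob

pieces = [b"hello ", b"world"]
digests = [hashlib.sha256(p).hexdigest() for p in pieces]
print(assemble(digests, pieces))
```

Per-piece verification is what makes it safe to download from untrusted or flaky peers: a bad piece is detected and re-fetched without restarting the whole download.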

As far as Kraken is concerned, when a node needs a file that is not yet cached in the Kraken system, Kraken first fetches the file from the backend storage, then computes the file's seed information, and then distributes it in pieces.

Generating the seed file requires downloading the file first, which is time-consuming. To save this time, Kraken provides an image preheating mechanism: through an interface Kraken exposes, users can tell Kraken which images will be distributed. Kraken downloads the corresponding images from the backend storage in advance and computes the seed files for distribution, which greatly shortens distribution time.

2.5 Kraken's warm-up process

Kraken's warm-up solution is implemented on top of the Registry's notification mechanism, as shown in Figure 1-2; the process is triggered as follows.
(1) The user pushes an image to Harbor.
(2) The Registry triggers its notification mechanism and sends a notification request to Kraken-proxy.
(3) Kraken-proxy parses the received request, which contains all the information about the pushed image.
(4) Kraken-proxy asks Kraken-origin for the Manifest file corresponding to the image.
(5) Kraken-proxy parses the Manifest file to obtain the information of all the image's layer files.
(6) Kraken-proxy asks Kraken-origin to download each layer file one by one.
(7) When Kraken-origin downloads a layer file from the Registry, it caches the layer file locally and computes the seed information for each layer file.
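A hypothetical sketch of steps (3) and (5): parsing a Registry push notification and extracting layer digests from a manifest. The payload shapes loosely follow the Docker Registry notification and image-manifest formats, but all concrete values are made up for illustration:

```python
import json

# A (simplified) Registry push-event notification, as received in step (3).
notification = json.loads("""
{"events": [{"action": "push",
             "target": {"repository": "library/nginx",
                        "tag": "latest",
                        "digest": "sha256:0000"}}]}
""")

def pushed_images(n: dict) -> list[tuple[str, str]]:
    """Extract (repository, tag) pairs for every push event."""
    return [(e["target"]["repository"], e["target"].get("tag", ""))
            for e in n["events"] if e["action"] == "push"]

# A (simplified) image manifest, as parsed in step (5).
manifest = {"layers": [{"digest": "sha256:aaaa"}, {"digest": "sha256:bbbb"}]}

def layer_digests(m: dict) -> list[str]:
    """List the layer digests that Kraken-origin would pre-fetch and seed."""
    return [layer["digest"] for layer in m["layers"]]

print(pushed_images(notification))  # [('library/nginx', 'latest')]
print(layer_digests(manifest))      # ['sha256:aaaa', 'sha256:bbbb']
```

Once the layer digests are known, each layer can be downloaded and seeded independently, which is why the warm-up happens per layer rather than per image.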

Kraken's preheating scheme causes all images to be cached in Kraken-origin, which puts heavy disk throughput and space pressure on Kraken-origin. To avoid this, it is usually recommended to separate the test and production image repositories and use two Harbors: the test repository is used independently, while the production environment integrates Kraken for P2P distribution, and images are synchronized between the two repositories through Harbor's remote replication function.

Because usually only images that pass verification in the test environment are released to the production Harbor, the pressure on the production repository is greatly reduced, and the number of images cached in the Kraken system can also be reduced. Separating the test and production environments also allows roles and permissions to be clearly divided, enabling multi-party collaboration.

In its future version planning, Harbor has designed a policy-based preheating function that can preheat matching images into Kraken according to configured policies, which is the best solution to this problem.

Rather than complaining about the darkness, it is better to carry a lamp and walk forward.

Origin blog.csdn.net/qq_50573146/article/details/131042236