Kubelet garbage collection (exited containers and unused images): source code analysis

1 Overview:

1.1 Code environment

The version information is as follows:
a. Kubernetes cluster: v1.15.0

1.2 Brief description of garbage collection

To reduce resource consumption on the node, the kubelet component runs two goroutines. One cleans up exited containers on the current node every 1 minute; the other cleans up unused images on the current node every 5 minutes. Neither period can be changed, because both are hard-coded.


2 Parameters affecting garbage collection:

2.1 Parameters affecting garbage collection that cannot be configured by the user

ContainerGCPeriod is fixed at 1 minute. It is a constant, and the periodic task uses it directly when it is scheduled.
ImageGCPeriod is fixed at 5 minutes. It is also a constant and is used directly in the same way.

const (
	// ContainerGCPeriod is the period for performing container garbage collection.
	ContainerGCPeriod = time.Minute
	// ImageGCPeriod is the period for performing image garbage collection.
	ImageGCPeriod = 5 * time.Minute
)
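
How a fixed-period task like this runs can be sketched with the standard library alone. This is a minimal sketch, not the kubelet's code: runUntil is a hypothetical, simplified stand-in for wait.Until, and demoPeriod is shortened to one millisecond only so that the example finishes quickly.

```go
package main

import (
	"fmt"
	"time"
)

// demoPeriod stands in for ContainerGCPeriod / ImageGCPeriod. It is shortened
// to one millisecond so the example finishes quickly (assumption).
const demoPeriod = time.Millisecond

// runUntil is a hypothetical, simplified stand-in for wait.Until: it calls f,
// waits for period, and repeats until stop is closed.
func runUntil(f func(), period time.Duration, stop <-chan struct{}) {
	for {
		select {
		case <-stop:
			return
		default:
		}
		f()
		select {
		case <-stop:
			return
		case <-time.After(period):
		}
	}
}

// runThreeTimes runs a counting task until it has executed three times and
// returns the number of executions.
func runThreeTimes() int {
	count := 0
	stop := make(chan struct{})
	runUntil(func() {
		count++
		if count == 3 {
			close(stop)
		}
	}, demoPeriod, stop)
	return count
}

func main() {
	fmt.Println("runs:", runThreeTimes())
}
```

In the kubelet the stop channel is wait.NeverStop, so the task keeps running for the lifetime of the process.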

2.2 User-configurable parameters affecting garbage collection

2.2.1 Three parameters related to container garbage collection

(These can only be set as kubelet command-line flags, not in its configuration file, because the KubeletConfiguration object has no corresponding fields.)

a, --maximum-dead-containers-per-container:
the maximum number of exited instances to retain for each container (each pod UID and container name pair); the default is 1;
b, --maximum-dead-containers:
the maximum number of exited containers to retain on the current node; the default is -1, which means no limit;
c, --minimum-container-ttl-duration:
the minimum age an exited container must reach before it can be garbage collected; the default is 0s;

fs.DurationVar(&f.MinimumGCAge.Duration, "minimum-container-ttl-duration", f.MinimumGCAge.Duration, "Minimum age for a finished container before it is garbage collected.  Examples: '300ms', '10s' or '2h45m'")
fs.Int32Var(&f.MaxPerPodContainerCount, "maximum-dead-containers-per-container", f.MaxPerPodContainerCount, "Maximum number of old instances to retain per container.  Each container takes up some disk space.")
fs.Int32Var(&f.MaxContainerCount, "maximum-dead-containers", f.MaxContainerCount, "Maximum number of old instances of containers to retain globally.  Each container takes up some disk space. To disable, set to a negative number.")

The values of --maximum-dead-containers-per-container, --maximum-dead-containers and --minimum-container-ttl-duration are eventually stored in the following struct, which is passed as an argument to the garbage collection method GarbageCollect().


type ContainerGCPolicy struct {
	// Minimum age at which a container can be garbage collected, zero for no limit.
	MinAge time.Duration

	// Max number of dead containers any single pod (UID, container name) pair is
	// allowed to have, less than zero for no limit.
	MaxPerPodContainer int

	// Max number of total dead containers, less than zero for no limit.
	MaxContainers int
}
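
The path from flags to policy can be illustrated with a small, self-contained sketch. It is an illustration under stated assumptions, not the kubelet's own wiring: it uses the standard library flag package rather than the flag library the kubelet uses, and containerGCPolicy and parsePolicy are hypothetical names. Only the flag names and defaults come from the text above.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// containerGCPolicy mirrors the three fields of ContainerGCPolicy.
type containerGCPolicy struct {
	MinAge             time.Duration
	MaxPerPodContainer int
	MaxContainers      int
}

// parsePolicy is a hypothetical helper: it registers the three flags with the
// defaults listed above and copies the parsed values into a policy struct.
func parsePolicy(args []string) (containerGCPolicy, error) {
	fs := flag.NewFlagSet("kubelet-sketch", flag.ContinueOnError)
	minAge := fs.Duration("minimum-container-ttl-duration", 0,
		"minimum age of a finished container before it is garbage collected")
	perContainer := fs.Int("maximum-dead-containers-per-container", 1,
		"maximum number of old instances to retain per container")
	total := fs.Int("maximum-dead-containers", -1,
		"maximum number of old instances to retain globally; negative means no limit")
	if err := fs.Parse(args); err != nil {
		return containerGCPolicy{}, err
	}
	return containerGCPolicy{
		MinAge:             *minAge,
		MaxPerPodContainer: *perContainer,
		MaxContainers:      *total,
	}, nil
}

func main() {
	p, err := parsePolicy([]string{
		"--maximum-dead-containers=20",
		"--minimum-container-ttl-duration=1m",
	})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```

Flags that are not passed keep their defaults, so in this example MaxPerPodContainer stays at 1.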

2.2.2 Three main parameters related to image garbage collection

--image-gc-high-threshold: when disk usage reaches this threshold, kubelet starts reclaiming images; the default is 85%;
--image-gc-low-threshold: kubelet frees space until disk usage drops to this threshold and then stops reclaiming images; the default is 80%;
--minimum-image-ttl-duration: the minimum age of an unused image before it can be reclaimed; the default is 2m0s;

fs.DurationVar(&c.ImageMinimumGCAge.Duration, "minimum-image-ttl-duration", c.ImageMinimumGCAge.Duration, "Minimum age for an unused image before it is garbage collected.  Examples: '300ms', '10s' or '2h45m'.")
fs.Int32Var(&c.ImageGCHighThresholdPercent, "image-gc-high-threshold", c.ImageGCHighThresholdPercent, "The percent of disk usage after which image garbage collection is always run. Values must be within the range [0, 100], To disable image garbage collection, set to 100. ")
fs.Int32Var(&c.ImageGCLowThresholdPercent, "image-gc-low-threshold", c.ImageGCLowThresholdPercent, "The percent of disk usage before which image garbage collection is never run. Lowest disk usage to garbage collect to. Values must be within the range [0, 100] and should not be larger than that of --image-gc-high-threshold.")

These three parameters can also be set in /var/lib/kubelet/config.yaml, using the following fields:
imageGCHighThresholdPercent
imageGCLowThresholdPercent
imageMinimumGCAge
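
The constraints stated in the flag help text can be written down as a small check. This is a sketch under assumptions: validateImageGCThresholds and imageGCEnabled are hypothetical helpers that encode only what the help text says (both values within [0, 100], the low threshold not larger than the high one, and 100 disabling image garbage collection). The kubelet's own validation may be stricter.

```go
package main

import (
	"errors"
	"fmt"
)

// validateImageGCThresholds checks the constraints stated in the flag help
// text. It is a hypothetical helper, not the kubelet's validation code.
func validateImageGCThresholds(high, low int32) error {
	if high < 0 || high > 100 {
		return errors.New("image-gc-high-threshold must be within [0, 100]")
	}
	if low < 0 || low > 100 {
		return errors.New("image-gc-low-threshold must be within [0, 100]")
	}
	if low > high {
		return errors.New("image-gc-low-threshold should not be larger than image-gc-high-threshold")
	}
	return nil
}

// imageGCEnabled reports whether image garbage collection would be started:
// a high threshold of 100 disables it.
func imageGCEnabled(high int32) bool {
	return high != 100
}

func main() {
	fmt.Println(validateImageGCThresholds(85, 80)) // the default values
	fmt.Println(validateImageGCThresholds(70, 80)) // low above high
	fmt.Println(imageGCEnabled(100))
}
```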


3 How kubelet reclaims containers:

1) Kubelet first reclaims the reclaimable application containers in all pods. It lists the containers that can be reclaimed and then calls the killContainer and removeContainer methods of the kubeGenericRuntimeManager object. The values in the ContainerGCPolicy object determine which containers are selected and which code branches are taken.

2) Kubelet then reclaims the reclaimable sandboxes in all pods. It lists the sandboxes that can be reclaimed, then stops and removes each one through the container runtime service. Unlike step 1, this step does not take the ContainerGCPolicy object as input.

3) Kubelet finally reclaims the log directories of pods and containers (the /var/log/pods/ directory and the /var/log/containers/ directory).
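
How a policy of this shape can drive container selection in step 1 is sketched below. It is a deliberately simplified illustration, not the kubelet's algorithm: gcPolicy, deadContainer and selectForRemoval are hypothetical names, and the sketch only applies the three limits in order (minimum age, per-container limit, node-wide limit), always keeping the newest instances.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// gcPolicy mirrors the three fields of ContainerGCPolicy.
type gcPolicy struct {
	MinAge             time.Duration
	MaxPerPodContainer int
	MaxContainers      int
}

// deadContainer is a hypothetical record for one exited container.
type deadContainer struct {
	ID        string
	PodUID    string
	Name      string
	CreatedAt time.Time
}

type unitKey struct{ podUID, name string }

// newestFirst sorts containers so that the most recently created come first.
func newestFirst(cs []deadContainer) {
	sort.Slice(cs, func(i, j int) bool { return cs[i].CreatedAt.After(cs[j].CreatedAt) })
}

// selectForRemoval returns the sorted IDs of the containers to remove.
func selectForRemoval(dead []deadContainer, p gcPolicy, now time.Time) []string {
	// Only containers at least MinAge old are eligible; group them by
	// (pod UID, container name).
	units := map[unitKey][]deadContainer{}
	for _, c := range dead {
		if now.Sub(c.CreatedAt) < p.MinAge {
			continue
		}
		k := unitKey{c.PodUID, c.Name}
		units[k] = append(units[k], c)
	}
	var removed []string
	var kept []deadContainer
	// Per-container limit: keep the newest MaxPerPodContainer in each group.
	for _, cs := range units {
		newestFirst(cs)
		keep := len(cs)
		if p.MaxPerPodContainer >= 0 && keep > p.MaxPerPodContainer {
			keep = p.MaxPerPodContainer
		}
		kept = append(kept, cs[:keep]...)
		for _, c := range cs[keep:] {
			removed = append(removed, c.ID)
		}
	}
	// Node-wide limit: of what is left, keep only the newest MaxContainers.
	if p.MaxContainers >= 0 && len(kept) > p.MaxContainers {
		newestFirst(kept)
		for _, c := range kept[p.MaxContainers:] {
			removed = append(removed, c.ID)
		}
	}
	sort.Strings(removed)
	return removed
}

// demoRemovals builds a small sample and returns the selected IDs.
func demoRemovals() []string {
	now := time.Date(2020, 1, 1, 12, 0, 0, 0, time.UTC)
	dead := []deadContainer{
		{"a1", "pod-a", "app", now.Add(-50 * time.Minute)},
		{"a2", "pod-a", "app", now.Add(-30 * time.Minute)},
		{"a3", "pod-a", "app", now.Add(-10 * time.Second)}, // younger than MinAge
		{"b1", "pod-b", "web", now.Add(-40 * time.Minute)},
	}
	p := gcPolicy{MinAge: time.Minute, MaxPerPodContainer: 1, MaxContainers: 1}
	return selectForRemoval(dead, p, now)
}

func main() {
	fmt.Println(demoRemovals()) // prints [a1 b1]
}
```

In the sample, a3 is too young to be considered, a1 exceeds the per-container limit, and b1 is removed by the node-wide limit because it is older than a2.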



4 How kubelet reclaims images:

Kubelet obtains the disk information (capacity, usage, and so on) for the partition where images are stored and checks whether disk usage has reached the threshold configured by the user (the value of --image-gc-high-threshold). If it has, kubelet tries to delete unused images until usage falls to the low threshold (the value of --image-gc-low-threshold). Put more bluntly, it is as if kubelet ran the df command to query disk usage and, when usage exceeded the threshold set in the command-line flags or the configuration file, ran docker rmi to delete unused images.
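
The threshold check can be worked through with concrete numbers. The sketch below reuses the two formulas from the GarbageCollect code in section 6; planImageGC is a hypothetical wrapper, and the capacity and available values are made-up byte counts chosen to keep the arithmetic easy to follow.

```go
package main

import "fmt"

// planImageGC is a hypothetical wrapper around the two formulas used by the
// image GarbageCollect method. It returns the current usage percentage, the
// number of bytes to free, and whether reclamation should run at all.
func planImageGC(capacity, available int64, highPercent, lowPercent int) (usagePercent int, amountToFree int64, run bool) {
	if capacity <= 0 {
		// Guard against division by zero; this sketch simply skips reclamation.
		return 0, 0, false
	}
	usagePercent = 100 - int(available*100/capacity)
	if usagePercent < highPercent {
		return usagePercent, 0, false
	}
	// Free enough space to bring usage down to the low threshold.
	amountToFree = capacity*int64(100-lowPercent)/100 - available
	return usagePercent, amountToFree, true
}

func main() {
	// 1000 bytes capacity, 100 available: usage is 90%, above the 85% default.
	fmt.Println(planImageGC(1000, 100, 85, 80)) // prints 90 100 true
	// 300 available: usage is 70%, so nothing is reclaimed.
	fmt.Println(planImageGC(1000, 300, 85, 80)) // prints 70 0 false
}
```

In the first case, reaching 80% usage requires 200 bytes to be available; 100 already are, so 100 more must be freed.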

5 Important structures

type Kubelet struct {

	/*
		other fields
	*/

	// Policy for handling garbage collection of dead containers.
	// Used to reclaim exited containers.
	// The implementation is the struct realContainerGC, which contains the kubeGenericRuntimeManager object; see k8s.io/kubernetes/pkg/kubelet/container/container_gc.go
	containerGC kubecontainer.ContainerGC

	// Manager for image garbage collection.
	// Used to reclaim unused images.
	// The implementation is the struct realImageGCManager; see k8s.io/kubernetes/pkg/kubelet/images/image_gc_manager.go
	imageManager images.ImageGCManager
}

type kubeGenericRuntimeManager struct {

	/*
		other fields
	*/

	// Container GC manager
	containerGC *containerGC

	// gRPC service clients
	// The two fields below hold gRPC clients; they ultimately call the container and image interfaces of the container runtime (usually docker)
	runtimeService internalapi.RuntimeService
	imageService   internalapi.ImageManagerService
}

6 Core code



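// The garbage collection goroutines are started by StartGarbageCollection() while the kubelet object is being created (that is, before the kubelet runs its main loop)
func createAndInitKubelet(...) {
    k, err = kubelet.NewMainKubelet(
       // code omitted
    )
    if err != nil {
        return nil, err
    }
    // code omitted

    k.StartGarbageCollection()
    return k, nil
}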


func (kl *Kubelet) StartGarbageCollection() {
	loggedContainerGCFailure := false

	// 1) Start the container garbage collection service.
	// A goroutine runs kl.containerGC.GarbageCollect() periodically; the call is wrapped in wait.Until(), which provides the periodic execution.
	go wait.Until(func() {
		if err := kl.containerGC.GarbageCollect(); err != nil {
			// this code can be ignored
		} else {
			// this code can be ignored
			klog.V(vLevel).Infof("Container garbage collection succeeded")
		}
	}, ContainerGCPeriod, wait.NeverStop)

	// 2) When ImageGCHighThresholdPercent is 100, return immediately, i.e. do not start the image garbage collection goroutine.
	if kl.kubeletConfiguration.ImageGCHighThresholdPercent == 100 {
		return
	}

	// 3) Start the image garbage collection service.
	// A goroutine runs kl.imageManager.GarbageCollect() periodically; the call is wrapped in wait.Until(), which provides the periodic execution.
	go wait.Until(func() {
		if err := kl.imageManager.GarbageCollect(); err != nil {
			// this code can be ignored
		} else {
			// this code can be ignored
			klog.V(vLevel).Infof("Image garbage collection succeeded")
		}
	}, ImageGCPeriod, wait.NeverStop)
}



// kl.containerGC.GarbageCollect() above takes no arguments; it is a wrapper that ends up in this method, passing in the configured ContainerGCPolicy.
func (cgc *containerGC) GarbageCollect(gcPolicy kubecontainer.ContainerGCPolicy, allSourcesReady bool, evictTerminatedPods bool) error {
    errors := []error{}
    // Reclaim the containers in pods,
    // using the killContainer and removeContainer methods of the kubeGenericRuntimeManager object
    if err := cgc.evictContainers(gcPolicy, allSourcesReady, evictTerminatedPods); err != nil {
        errors = append(errors, err)
    }

    // Reclaim the sandboxes in pods;
    // each sandbox is stopped and removed through the container runtime service
    if err := cgc.evictSandboxes(evictTerminatedPods); err != nil {
        errors = append(errors, err)
    }

    // Reclaim the log directories of pods and containers (the /var/log/pods/ directory and the /var/log/containers/ directory)
    if err := cgc.evictPodLogsDirectories(allSourcesReady); err != nil {
        errors = append(errors, err)
    }
    return utilerrors.NewAggregate(errors)
}
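
The error handling in this method follows a "run every step, report all failures" pattern: a failure in one step does not prevent the later steps from running. A minimal standard-library sketch of that pattern follows. runAll is a hypothetical helper that joins error messages into a single error; it is a stand-in for, not a copy of, utilerrors.NewAggregate.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// runAll runs every step in order, even when an earlier step fails, and
// returns one error summarising all failures (nil if every step succeeded).
func runAll(steps ...func() error) error {
	var msgs []string
	for _, step := range steps {
		if err := step(); err != nil {
			msgs = append(msgs, err.Error())
		}
	}
	if len(msgs) == 0 {
		return nil
	}
	return errors.New(strings.Join(msgs, "; "))
}

// demoRunAll runs three steps, the second of which fails, and reports how
// many steps actually ran.
func demoRunAll() (ran int, err error) {
	err = runAll(
		func() error { ran++; return nil },
		func() error { ran++; return errors.New("evict sandboxes failed") },
		func() error { ran++; return nil },
	)
	return ran, err
}

func main() {
	ran, err := demoRunAll()
	fmt.Println(ran, err) // prints 3 evict sandboxes failed
}
```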


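func (im *realImageGCManager) GarbageCollect() error {
    // Get the disk information of the partition that holds the container image storage directory
    fsStats, err := im.statsProvider.ImageFsStats()

	/*
	checks for and handling of some error cases
	(capacity and available are read from fsStats in the omitted code)
	*/

    // If the partition's usage has reached HighThresholdPercent, enter the if statement and reclaim unused images
    usagePercent := 100 - int(available*100/capacity)
    if usagePercent >= im.policy.HighThresholdPercent {
        // Compute how much disk space must be freed to get back to the low threshold set by the user
        amountToFree := capacity*int64(100-im.policy.LowThresholdPercent)/100 - available

        // Call im.freeSpace() to actually reclaim unused images; underneath it calls the RemoveImage() method of the kubeGenericRuntimeManager object
        freed, err := im.freeSpace(amountToFree, time.Now())
        if err != nil {
            return err
        }

		/*
			some logging-related code,
			omitted here
		*/
    }

    return nil
}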
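
What a freeSpace-style step has to decide can be sketched as follows. This is a simplified illustration under assumptions, not the kubelet's implementation: imageRecord and pickImagesToFree are hypothetical names, and the sketch only selects unused images, least recently used first, until the requested number of bytes is covered, skipping images detected less than minAge ago.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// imageRecord is a hypothetical record for one image on the node.
type imageRecord struct {
	ID            string
	Size          int64
	InUse         bool
	FirstDetected time.Time
	LastUsed      time.Time
}

// pickImagesToFree returns the IDs of the unused images to delete, least
// recently used first, together with the number of bytes they would free.
// Images in use, or detected less than minAge ago, are never selected.
func pickImagesToFree(images []imageRecord, bytesToFree int64, minAge time.Duration, now time.Time) ([]string, int64) {
	var candidates []imageRecord
	for _, img := range images {
		if img.InUse || now.Sub(img.FirstDetected) < minAge {
			continue
		}
		candidates = append(candidates, img)
	}
	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].LastUsed.Before(candidates[j].LastUsed)
	})
	var ids []string
	var freed int64
	for _, img := range candidates {
		if freed >= bytesToFree {
			break
		}
		ids = append(ids, img.ID)
		freed += img.Size
	}
	return ids, freed
}

// demoFreeSpace asks for 100 bytes from a small sample of images.
func demoFreeSpace() ([]string, int64) {
	now := time.Date(2020, 1, 1, 12, 0, 0, 0, time.UTC)
	images := []imageRecord{
		{ID: "busy", Size: 500, InUse: true, FirstDetected: now.Add(-48 * time.Hour), LastUsed: now},
		{ID: "old", Size: 60, FirstDetected: now.Add(-48 * time.Hour), LastUsed: now.Add(-24 * time.Hour)},
		{ID: "older", Size: 50, FirstDetected: now.Add(-72 * time.Hour), LastUsed: now.Add(-36 * time.Hour)},
		{ID: "recent", Size: 40, FirstDetected: now.Add(-time.Hour), LastUsed: now.Add(-30 * time.Minute)},
		{ID: "new", Size: 300, FirstDetected: now.Add(-time.Minute), LastUsed: now.Add(-time.Minute)},
	}
	return pickImagesToFree(images, 100, 2*time.Minute, now)
}

func main() {
	ids, freed := demoFreeSpace()
	fmt.Println(ids, freed) // prints [older old] 110
}
```

The 2-minute minAge in the demo matches the default of --minimum-image-ttl-duration; the image sizes and timestamps are invented for the example.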

7 Summary

Kubelet starts periodic tasks that delete the exited containers in pods and the unused images on the node. Some command-line flags and configuration file fields affect how these two periodic tasks behave. In essence, this is no different from a user running docker rm and docker rmi from a Linux crontab: both aim to reduce resource consumption on the node.


Origin blog.csdn.net/nangonghen/article/details/109271380