A brief source-code analysis of the pod deletion process in Kubernetes

1 Overview

1.1 Code environment

The version information is as follows:
a. Kubernetes cluster: v1.15.4

1.2 Brief description of the Pod deletion process

When a user runs kubectl delete pod (by default with grace period = 30 s), kubectl calls the DELETE interface of kube-apiserver. At this point the handler only updates the metadata of the Pod object (the DeletionTimestamp and DeletionGracePeriodSeconds fields); the record is not removed from etcd. The kubectl command blocks and reports that the pod is being deleted. When the kubelet observes the update event for the Pod object, it runs the corresponding callback; because the DeletionTimestamp field is set, that logic calls killPod(). Shortly afterwards, once the kubelet sees that the pod can be removed, its statusManager module calls the DELETE interface of kube-apiserver again, this time with grace-period = 0, and on this second call kube-apiserver deletes the pod object from etcd. From then on kubectl get pod no longer shows the pod, because the record has really been deleted.


The first time the DELETE interface of kube-apiserver is triggered, Store.Delete enters the if statement (if !deleteImmediately || err != nil) and returns directly.
[Figure: debugger screenshot of the first DELETE call]
The second time the DELETE interface of kube-apiserver is triggered, the if statement is not entered and execution continues with the subsequent logic, which deletes the pod object from etcd.
[Figure: debugger screenshot of the second DELETE call]
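
The two calls described above can be modeled with a small in-memory sketch. This is only an illustration under stated assumptions: the pod and store types and the deletePod and status helpers are hypothetical stand-ins, not kube-apiserver code, and the DeletionTimestamp field is left out for brevity.

```go
package main

import "fmt"

// pod is a hypothetical, stripped-down stand-in for the real Pod object.
type pod struct {
	Name                       string
	DeletionGracePeriodSeconds *int64 // nil until a graceful delete is requested
}

// store is a hypothetical in-memory stand-in for etcd.
type store map[string]*pod

// deletePod mimics the two behaviours of the DELETE interface described in
// the text: a positive grace period only marks the pod, a zero grace period
// removes the record. It reports whether the record was actually removed.
func (s store) deletePod(name string, gracePeriodSeconds int64) (removed bool, err error) {
	p, ok := s[name]
	if !ok {
		return false, fmt.Errorf("pod %q not found", name)
	}
	if gracePeriodSeconds > 0 {
		p.DeletionGracePeriodSeconds = &gracePeriodSeconds
		return false, nil
	}
	delete(s, name)
	return true, nil
}

// status mimics what a client would display for the pod.
func (s store) status(name string) string {
	p, ok := s[name]
	switch {
	case !ok:
		return "NotFound"
	case p.DeletionGracePeriodSeconds != nil:
		return "Terminating"
	default:
		return "Running"
	}
}

func main() {
	s := store{"podA": {Name: "podA"}}
	fmt.Println(s.status("podA"))

	// First call: what kubectl delete pod does by default (30 s).
	s.deletePod("podA", 30)
	fmt.Println(s.status("podA"))

	// Second call: what the kubelet does with grace period 0.
	s.deletePod("podA", 0)
	fmt.Println(s.status("podA"))
}
```

Run as is, the program prints Running, Terminating and NotFound, which matches the three states a user sees with kubectl get pod during a deletion.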


2 Core source code analysis

2.1 Handler of the kube-apiserver DELETE interface

kube-apiserver is a web server that registers its HTTP handlers at startup. The DELETE handler is registered as follows:

func (a *APIInstaller) registerResourceHandlers(path string, storage rest.Storage, ws *restful.WebService) (*metav1.APIResource, error) {

    switch action.Verb {

    case "DELETE": // Delete a resource.
        /*
        other code
        */
        // The main logic of the handler lives in restfulDeleteResource().
        handler := metrics.InstrumentRouteFunc(action.Verb, group, version, resource, subresource, requestScope, metrics.APIServerComponent,
            restfulDeleteResource(gracefulDeleter, isGracefulDeleter, reqScope, admit))

        route := ws.DELETE(action.Path).To(handler).
            /*
            other code
            */
            Returns(http.StatusOK, "OK", versionedStatus).
        /*
        other code
        */
        addParams(route, action.Params)
        routes = append(routes, route)
    }
}

The logic of the DELETE interface actually lives in a package-level function located in staging/src/k8s.io/apiserver/pkg/endpoints/handlers/delete.go

func restfulDeleteResource(r rest.GracefulDeleter, allowsOptions bool, scope handlers.RequestScope, admit admission.Interface) restful.RouteFunction {
    return func(req *restful.Request, res *restful.Response) {
        // Calls a package-level function from staging/src/k8s.io/apiserver/pkg/endpoints/handlers/delete.go
        handlers.DeleteResource(r, allowsOptions, &scope, admit)(res.ResponseWriter, req.Request)
    }
}


// When the user runs kubectl delete pod PODA, this function is triggered twice.
// The first time it is triggered by the request from kubectl.
// The second time it is triggered by the request from the statusManager module of the kubelet.
func DeleteResource(r rest.GracefulDeleter, allowsOptions bool, scope *RequestScope, admit admission.Interface) http.HandlerFunc {
    return func(w http.ResponseWriter, req *http.Request) {
        trace := utiltrace.New("Delete " + req.URL.Path)
        /*
        other code
        */
        options := &metav1.DeleteOptions{}
        trace.Step("About to delete object from database")

        result, err := finishRequest(timeout, func() (runtime.Object, error) {
            // The key call is r.Delete(...)
            obj, deleted, err := r.Delete(ctx, name, rest.AdmissionToValidateObjectDeleteFunc(admit, staticAdmissionAttrs, scope), options)
            /*
            other code
            */
            return obj, err
        })
        /*
        validation code
        */
        trace.Step("Object deleted from database")

        status := http.StatusOK
        /*
        other code
        */
        // Return the response to the client.
        transformResponseObject(ctx, scope, trace, req, w, status, outputMediaType, result)
    }
}

func (e *Store) Delete(ctx context.Context, name string, deleteValidation rest.ValidateObjectFunc, options *metav1.DeleteOptions) (...) {
    key, err := e.KeyFunc(ctx, name)
    /*
    validation code and other code that is not relevant here
    */
    if graceful || pendingFinalizers || shouldUpdateFinalizers {
        // Update the metadata of the pod object.
        err, ignoreNotFound, deleteImmediately, out, lastExisting = e.updateForGracefulDeletionAndFinalizers(ctx, name, key, options, preconditions, deleteValidation, obj)
    }

    // On the first call, execution returns directly here.
    // !deleteImmediately covers all cases where err != nil. We keep both to be future-proof.
    if !deleteImmediately || err != nil {
        return out, false, err
    }

    // Only the second call reaches this point.
    klog.V(6).Infof("going to delete %s from registry: ", name)
    // Delete the object from etcd.
    e.Storage.Delete(...)
}

func (e *Store) updateForGracefulDeletionAndFinalizers(...) (...){
    /*
    other code
    */
    graceful, pendingGraceful, err := rest.BeforeDelete(e.DeleteStrategy, ctx, existing, options)
    /*
    other code
    */
}

func BeforeDelete(...) (...){
    // Modify the metadata of the target object: the DeletionTimestamp and DeletionGracePeriodSeconds fields.
    objectMeta.SetDeletionTimestamp(&now)
    objectMeta.SetDeletionGracePeriodSeconds(options.GracePeriodSeconds)
}

2.2 Kubelet processing flow

2.2.1 The kubelet observes the update event of the pod object

syncPod() is executed from the kubelet's main loop, and the syncPod() logic calls the kl.killPod(...) method.


func (kl *Kubelet) syncPod(o syncPodOptions) error {
    /*
    other code
    */
    // If the pod object has the DeletionTimestamp field set, the if statement is entered.
    if !runnable.Admit || pod.DeletionTimestamp != nil || apiPodStatus.Phase == v1.PodFailed {
        // killPod(...) calls the container runtime to stop the containers in the pod.
        if err := kl.killPod(pod, nil, podStatus, nil); err != nil {
            /*
            other code
            */
        } else {
            /*
            other code
            */
        }
        return syncErr
    }

}
func (kl *Kubelet) killPod(pod *v1.Pod, runningPod *kubecontainer.Pod, status *kubecontainer.PodStatus, gracePeriodOverride *int64) error {
	var p kubecontainer.Pod
	/*
	other code
	*/
	// Call the container runtime to stop the containers in the pod.
	if err := kl.containerRuntime.KillPod(pod, p, gracePeriodOverride); err != nil {
		return err
	}
	if err := kl.containerManager.UpdateQOSCgroups(); err != nil {
		klog.V(2).Infof("Failed to update QoS cgroups while killing pod: %v", err)
	}
	return nil
}

2.2.2 The kubelet issues the final deletion of the pod object

The statusManager goroutine executes m.kubeClient.CoreV1().Pods(pod.Namespace).Delete(pod.Name, deleteOptions), so that kube-apiserver deletes the pod object from etcd.


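// The kubelet has a statusManager module that calls syncPod() in a for loop.
// Inside the method, the DELETE interface of kube-apiserver may be called (forced, non-graceful deletion).
func (m *manager) syncPod(uid types.UID, status versionedPodStatus) {
    /*
    other code
    */
    // The if statement is entered only when several conditions hold: the pod has the DeletionTimestamp field set,
    // its containers have been removed, its persistent volumes have been cleaned up, and so on.
    if m.canBeDeleted(pod, status.status) {
        deleteOptions := metav1.NewDeleteOptions(0)
        deleteOptions.Preconditions = metav1.NewUIDPreconditions(string(pod.UID))

        // Force-delete the pod object, equivalent to: kubectl delete pod podA --grace-period=0
        err = m.kubeClient.CoreV1().Pods(pod.Namespace).Delete(pod.Name, deleteOptions)

        /*
        other code
        */
    }
}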

3 Official English documentation: termination of Pods

Because Pods represent running processes on nodes in the cluster, it is important to allow those processes to gracefully terminate when they are no longer needed (vs being violently killed with a KILL signal and having no chance to clean up).
Users should be able to request deletion and know when processes terminate, but also be able to ensure that deletes eventually complete.
# Note: when a user sends a delete pod request, the system records a grace period and then sends a TERM signal to the main process of each container in the Pod.
When a user requests deletion of a Pod, the system records the intended grace period before the Pod is allowed to be forcefully killed, and a TERM signal is sent to the main process in each container.
# Note: when the grace period expires, a KILL signal is sent to the main process of each container in the Pod, and the API server also deletes the Pod object.
Once the grace period has expired, the KILL signal is sent to those processes, and the Pod is then deleted from the API server.
If the Kubelet or the container manager is restarted while waiting for processes to terminate, the termination will be retried with the full grace period.

An example flow:
	1. User sends command to delete Pod, with default grace period (30s)
	2. The Pod in the API server is updated with the time beyond which the Pod is considered "dead" along with the grace period.
	3. Pod shows up as "Terminating" when listed in client commands
	4. (simultaneous with 3) When the Kubelet sees that a Pod has been marked as terminating because the time in 2 has been set, it begins the pod shutdown process.
		4.1. If the pod has defined a preStop hook, it is invoked inside of the pod. If the preStop hook is still running after the grace period expires, step 2 is then invoked with a small (2 second) extended grace period.
		4.2. The processes in the Pod are sent the TERM signal.
	5. (simultaneous with 3) Pod is removed from endpoints list for service, and are no longer considered part of the set of running pods for replication controllers. Pods that shutdown slowly cannot continue to serve traffic as load balancers (like the service proxy) remove them from their rotations.
	6. When the grace period expires, any processes still running in the Pod are killed with SIGKILL.
	7. The Kubelet will finish deleting the Pod on the API server by setting grace period 0 (immediate deletion). The Pod disappears from the API and is no longer visible from the client.

By default, all deletes are graceful within 30 seconds. The kubectl delete command supports the --grace-period=<seconds> option which allows a user to override the default and specify their own value. 
The value 0 force deletes the pod. In kubectl version >= 1.5, you must specify an additional flag --force along with --grace-period=0 in order to perform force deletions.

Origin blog.csdn.net/nangonghen/article/details/109305635