In-depth Kubernetes: Persistent Volumes (PV), Persistent Volume Claims (PVC), and source code analysis

Starting with an example of PV and PVC

The Kubernetes project introduced a set of API objects called Persistent Volume Claim (PVC) and Persistent Volume (PV) to manage storage volumes.

Let's take a look at an example, which comes from "Kubernetes in Action":

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-pvc
spec:
  resources:
    requests:
      storage: 1Gi
  accessModes:
  - ReadWriteOnce
  storageClassName: ""

The storage field defined in the YAML is 1Gi, indicating the capacity the PVC requests;

accessModes indicates the required access mode of the volume; ReadWriteOnce means the volume can be mounted read-write by a single node only.

storageClassName is empty here; it refers to the name of a StorageClass, which we will discuss below.

Then get the status of PVC:

$ kubectl get pvc
NAME                   STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mongodb-pvc            Pending                                                               2m25s

At this point the PVC is still in the Pending state, because there is no PV yet for it to bind to.

Then define a PV:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  gcePersistentDisk:
    pdName: mongodb
    fsType: ext4

This PV object specifies that the backing storage is a GCE persistent disk and that its capacity is 1Gi. storageClassName is not declared here, so it defaults to the empty string "".

Then we look at the status of PV and PVC:

$ kubectl get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                          STORAGECLASS   REASON   AGE
mongodb-pv          1Gi        RWO,ROX        Retain           Bound       default/mongodb-pvc                                    77m

$ kubectl get pvc
NAME                   STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mongodb-pvc            Bound    mongodb-pv          1Gi        RWO,ROX                       7m7s

You can see that PV and PVC have been bound to each other.

PVC and PV are like an "interface" and its "implementation": a PVC must be bound to a PV before it can be used, and for the two to be bound the following conditions must be met:

  1. The spec of the PV must satisfy the PVC; for example, the PV's storage capacity must be at least what the PVC requests, and its access modes must cover the requested ones (a simplified sketch of these checks follows the list).
  2. The storageClassName fields of the PV and the PVC must be identical. storageClassName refers to the name of a StorageClass object.
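
As a rough illustration only (the real matching logic lives in the controller's findBestMatchForClaim / pvutil.FindMatchingVolume and also considers volumeMode, label selectors, and more), the binding rules can be sketched in Go like this; claimMatchesVolume is a hypothetical helper written for this article, not a function from the Kubernetes code base:

// claimMatchesVolume is an illustrative sketch of the binding rules above.
func claimMatchesVolume(volume *v1.PersistentVolume, claim *v1.PersistentVolumeClaim) bool {
    // 1. The PV must offer at least as much storage as the PVC requests.
    requested := claim.Spec.Resources.Requests[v1.ResourceStorage]
    capacity := volume.Spec.Capacity[v1.ResourceStorage]
    if capacity.Cmp(requested) < 0 {
        return false
    }

    // 2. Every access mode requested by the PVC must be offered by the PV.
    for _, want := range claim.Spec.AccessModes {
        offered := false
        for _, have := range volume.Spec.AccessModes {
            if want == have {
                offered = true
                break
            }
        }
        if !offered {
            return false
        }
    }

    // 3. The storageClassName of both objects must be identical;
    //    an empty "" on the PVC only matches a PV whose class is also "".
    claimClass := ""
    if claim.Spec.StorageClassName != nil {
        claimClass = *claim.Spec.StorageClassName
    }
    return volume.Spec.StorageClassName == claimClass
}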

If we want to use this PVC in Pod, then we can do this:

apiVersion: v1
kind: Pod
metadata:
  name: mongodb
spec:
  containers:
  - image: mongo
    name: mongodb
    volumeMounts:
    - name: mongodb-data
      mountPath: /data/db
    ports:
    - containerPort: 27017
      protocol: TCP
  volumes:
  - name: mongodb-data
    persistentVolumeClaim:
      claimName: mongodb-pvc

You only need to reference the PVC by name in the Pod. After the Pod is created, the kubelet mounts the PV bound to this PVC, in this case a GCE persistent disk volume, at the declared path inside the Pod's container.

The PersistentVolumeController continuously checks whether each current PVC is in the Bound state. If it is not, it traverses all available PVs and tries to bind a suitable one to this PVC. The source code of this PersistentVolumeController is analyzed below. So the question is: why does k8s split a storage volume into two objects?

Because in practice, developers and cluster administrators play separate roles. Developers only consume storage and do not care which storage technology is used underneath, so all a developer needs to do is declare a PVC stating how much storage is needed and with which access mode.

Dynamic Provisioning with StorageClass

The PV/PVC binding process described above is called Static Provisioning, and it requires PVs to be created manually. In practice we may run into the situation where the administrator has not yet created a matching PV for us; do we just keep waiting? This is where StorageClass comes in: it provides a Dynamic Provisioning mechanism that creates PVs from a template.

The StorageClass object defines the following two parts:

  1. The attributes of PV. For example, storage type, volume size, etc.
  2. The storage plug-in needed to create this kind of PV. For example, Ceph and so on.

In this way, k8s can find a corresponding StorageClass according to the PVC submitted by the user, and then call the storage plug-in declared by the StorageClass to create the required PV.

For example, declare the following StorageClass:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: block-service
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd

A StorageClass named block-service is defined here. The provisioner field is kubernetes.io/gce-pd, one of Kubernetes' built-in storage plugins, and the type parameter is passed to that provisioner. Kubernetes ships with a number of built-in plugins that support Dynamic Provisioning by default.

Then you can set storageClassName to block-service in the PVC. After the PVC object is created, k8s calls the API of the corresponding storage plugin to create a PV object.

For example:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: block-service
  resources:
    requests:
      storage: 30Gi

This automatic PV creation mechanism is Dynamic Provisioning. Kubernetes can find a corresponding StorageClass based on the PVC submitted by the user, and then call the storage plugin declared by the StorageClass to create the required PV.

It should be noted that if storageClassName is not declared in the PVC, its value is "", which also means it can only be bound to a PV whose storageClassName is also "".

Life cycle of PV and PVC

The interaction between PV and PVC follows this life cycle:

Provisioning -> Binding -> Using -> Reclaiming

Provisioning

k8s provides two ways of provisioning PVs: statically or dynamically.

Statically: PVs are created by the administrator and carry the details of the real storage available to cluster users. They exist in the Kubernetes API and are available for consumption.

Dynamically: when none of the administrator's static PVs matches the user's PersistentVolumeClaim, the cluster may try to provision a volume for the PVC dynamically. This provisioning is based on StorageClasses: the PVC must request a StorageClass, and the administrator must have created and configured that class, for dynamic provisioning to happen.

Binding

After a PersistentVolumeClaim is created by the user, the PersistentVolumeController continuously checks whether each current PVC is already in the Bound state. If it is not, it traverses all available PVs and tries to bind a suitable one to this PVC.

Using

After a Pod declares the PVC as a volume, the cluster looks up the PVC; if it is already bound to a PV, the volume is mounted into the Pod.

Reclaiming

When the user no longer needs the volume, the PVC can be deleted so that the resources can be reclaimed. What happens to the PV after its PVC is deleted is controlled by the reclaim policy, which can be Retain, Recycle, or Delete and is set in the spec.persistentVolumeReclaimPolicy field (the corresponding API constants are shown after the list below).

  • Retain: this policy allows resources to be reclaimed manually. When the PVC is deleted, the PV still exists and the administrator can delete it by hand; the storage resource bound to the PV is not removed, so if you also want to get rid of the data, you have to delete it on the storage backend yourself.
  • Delete: after the PVC is deleted, this policy deletes both the PV and the storage resource the PV manages.
  • Recycle: equivalent to running rm -rf /thevolume/* on the volume so that it can be reused.
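
For reference, these policies correspond to the PersistentVolumeReclaimPolicy constants defined in the k8s.io/api/core/v1 package; the excerpt below paraphrases the upstream comments:

type PersistentVolumeReclaimPolicy string

const (
    // Recycle: basic scrub (equivalent to "rm -rf /thevolume/*") before reuse.
    PersistentVolumeReclaimRecycle PersistentVolumeReclaimPolicy = "Recycle"
    // Delete: remove the volume together with its backing storage asset.
    PersistentVolumeReclaimDelete PersistentVolumeReclaimPolicy = "Delete"
    // Retain: keep the volume; the administrator reclaims it manually.
    PersistentVolumeReclaimRetain PersistentVolumeReclaimPolicy = "Retain"
)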

Under normal circumstances, we follow this deletion process:

  1. Delete the Pod using this PV;
  2. Remove the local disk from the host (for example, umount it);
  3. Delete PVC;
  4. Delete the PV.

Source code analysis

The PV and PVC processing logic lives in two files, pv_controller_base.go and pv_controller.go. Let's look directly at the core code.

First, let's take a look at the Run method of PersistentVolumeController. This is the entry point:

func (ctrl *PersistentVolumeController) Run(stopCh <-chan struct{}) {
    ... 
    go wait.Until(ctrl.resync, ctrl.resyncPeriod, stopCh)
    go wait.Until(ctrl.volumeWorker, time.Second, stopCh)
    go wait.Until(ctrl.claimWorker, time.Second, stopCh)
    ...
}

This code starts three goroutines that run different methods. The resync method is very simple: its job is to list the PVs and PVCs and put them into the volumeQueue and claimQueue queues for volumeWorker and claimWorker to consume (a simplified sketch follows). Below we mainly look at volumeWorker and claimWorker.
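
Roughly, resync looks like this (a lightly simplified sketch based on pv_controller_base.go; error handling is trimmed):

func (ctrl *PersistentVolumeController) resync() {
    klog.V(4).Infof("resyncing PV controller")

    // Re-enqueue every PVC so that claimWorker revisits it.
    pvcs, err := ctrl.claimLister.List(labels.NewSelector())
    if err != nil {
        klog.Warningf("cannot list claims: %s", err)
        return
    }
    for _, pvc := range pvcs {
        ctrl.enqueueWork(ctrl.claimQueue, pvc)
    }

    // Re-enqueue every PV so that volumeWorker revisits it.
    pvs, err := ctrl.volumeLister.List(labels.NewSelector())
    if err != nil {
        klog.Warningf("cannot list persistent volumes: %s", err)
        return
    }
    for _, pv := range pvs {
        ctrl.enqueueWork(ctrl.volumeQueue, pv)
    }
}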

volumeWorker

volumeWorker continuously consumes keys from the volumeQueue, looks up the corresponding PV, and then performs the updateVolume operation on it (the worker loop is sketched below).
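
A condensed sketch of the worker loop (simplified; the real code also cleans up internal caches when the PV has been deleted):

func (ctrl *PersistentVolumeController) volumeWorker() {
    workFunc := func() bool {
        keyObj, quit := ctrl.volumeQueue.Get()
        if quit {
            return true
        }
        defer ctrl.volumeQueue.Done(keyObj)
        key := keyObj.(string)

        _, name, err := cache.SplitMetaNamespaceKey(key)
        if err != nil {
            klog.V(4).Infof("error getting name of volume %q: %v", key, err)
            return false
        }
        // Look the PV up in the informer cache and sync it.
        volume, err := ctrl.volumeLister.Get(name)
        if err == nil {
            ctrl.updateVolume(volume)
            return false
        }
        // Not found in the informer cache: the volume was deleted,
        // so the deletion path (omitted here) cleans up the caches.
        return false
    }
    for {
        if quit := workFunc(); quit {
            klog.Infof("volume worker queue shutting down")
            return
        }
    }
}

The updateVolume method it calls looks like this: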

func (ctrl *PersistentVolumeController) updateVolume(volume *v1.PersistentVolume) {
    // Store the new volume version in the cache and do not process it if this
    // is an old version.
    // Update the cache
    new, err := ctrl.storeVolumeUpdate(volume)
    if err != nil {
        klog.Errorf("%v", err)
    }
    if !new {
        return
    }
    // Core method: bind or unbind the PV and PVC according to the current PV object's spec
    err = ctrl.syncVolume(volume)
    if err != nil {
        if errors.IsConflict(err) {
            // Version conflict error happens quite often and the controller
            // recovers from it easily.
            klog.V(3).Infof("could not sync volume %q: %+v", volume.Name, err)
        } else {
            klog.Errorf("could not sync volume %q: %+v", volume.Name, err)
        }
    }
}

The updateVolume method will call the syncVolume method to execute the core process.

We continue:

func (ctrl *PersistentVolumeController) syncVolume(volume *v1.PersistentVolume) error {
    klog.V(4).Infof("synchronizing PersistentVolume[%s]: %s", volume.Name, getVolumeStatusForLogging(volume)) 
    ...
    // If spec.claimRef is not set, the PV has never been used; call updateVolumePhase to set its phase to Available
    if volume.Spec.ClaimRef == nil { 
        klog.V(4).Infof("synchronizing PersistentVolume[%s]: volume is unused", volume.Name)
        if _, err := ctrl.updateVolumePhase(volume, v1.VolumeAvailable, ""); err != nil { 
            return err
        }
        return nil
    } else /* pv.Spec.ClaimRef != nil */ { 
        // The volume is in the process of being bound; update its phase to Available
        if volume.Spec.ClaimRef.UID == "" { 
            klog.V(4).Infof("synchronizing PersistentVolume[%s]: volume is pre-bound to claim %s", volume.Name, claimrefToClaimKey(volume.Spec.ClaimRef))
            if _, err := ctrl.updateVolumePhase(volume, v1.VolumeAvailable, ""); err != nil { 
                return err
            }
            return nil
        }
        klog.V(4).Infof("synchronizing PersistentVolume[%s]: volume is bound to claim %s", volume.Name, claimrefToClaimKey(volume.Spec.ClaimRef))
        // Get the PVC by _name_
        var claim *v1.PersistentVolumeClaim
        // Get the PVC from the PV's claimRef
        claimName := claimrefToClaimKey(volume.Spec.ClaimRef)
        obj, found, err := ctrl.claims.GetByKey(claimName)
        if err != nil {
            return err
        }
        // If the claim is not found in the local cache, it may have been deleted or the cache may be stale; re-fetch the PVC
        if !found && metav1.HasAnnotation(volume.ObjectMeta, pvutil.AnnBoundByController) { 
            if volume.Status.Phase != v1.VolumeReleased && volume.Status.Phase != v1.VolumeFailed {
                obj, err = ctrl.claimLister.PersistentVolumeClaims(volume.Spec.ClaimRef.Namespace).Get(volume.Spec.ClaimRef.Name)
                if err != nil && !apierrors.IsNotFound(err) {
                    return err
                }
                found = !apierrors.IsNotFound(err)
                if !found {
                    obj, err = ctrl.kubeClient.CoreV1().PersistentVolumeClaims(volume.Spec.ClaimRef.Namespace).Get(context.TODO(), volume.Spec.ClaimRef.Name, metav1.GetOptions{})
                    if err != nil && !apierrors.IsNotFound(err) {
                        return err
                    }
                    found = !apierrors.IsNotFound(err)
                }
            }
        }
        if !found {
            klog.V(4).Infof("synchronizing PersistentVolume[%s]: claim %s not found", volume.Name, claimrefToClaimKey(volume.Spec.ClaimRef)) 
        } else {
            var ok bool
            claim, ok = obj.(*v1.PersistentVolumeClaim)
            if !ok {
                return fmt.Errorf("Cannot convert object from volume cache to volume %q!?: %#v", claim.Spec.VolumeName, obj)
            }
            klog.V(4).Infof("synchronizing PersistentVolume[%s]: claim %s found: %s", volume.Name, claimrefToClaimKey(volume.Spec.ClaimRef), getClaimStatusForLogging(claim))
        }
        if claim != nil && claim.UID != volume.Spec.ClaimRef.UID { 
            klog.V(4).Infof("synchronizing PersistentVolume[%s]: claim %s has different UID, the old one must have been deleted", volume.Name, claimrefToClaimKey(volume.Spec.ClaimRef))
            // Treat the volume as bound to a missing claim.
            claim = nil
        }
        // The claim may have been deleted, or the PV may have been deleted
        if claim == nil { 
            if volume.Status.Phase != v1.VolumeReleased && volume.Status.Phase != v1.VolumeFailed {
                // Also, log this only once:
                klog.V(2).Infof("volume %q is released and reclaim policy %q will be executed", volume.Name, volume.Spec.PersistentVolumeReclaimPolicy)
                if volume, err = ctrl.updateVolumePhase(volume, v1.VolumeReleased, ""); err != nil { 
                    return err
                }
            }
            // Handle according to persistentVolumeReclaimPolicy: Retain / Delete / Recycle
            if err = ctrl.reclaimVolume(volume); err != nil { 
                return err
            }
            if volume.Spec.PersistentVolumeReclaimPolicy == v1.PersistentVolumeReclaimRetain {
                // volume is being retained, it references a claim that does not exist now.
                klog.V(4).Infof("PersistentVolume[%s] references a claim %q (%s) that is not found", volume.Name, claimrefToClaimKey(volume.Spec.ClaimRef), volume.Spec.ClaimRef.UID)
            }
            return nil
        } else if claim.Spec.VolumeName == "" {
            if pvutil.CheckVolumeModeMismatches(&claim.Spec, &volume.Spec) { 
                volumeMsg := fmt.Sprintf("Cannot bind PersistentVolume to requested PersistentVolumeClaim %q due to incompatible volumeMode.", claim.Name)
                ctrl.eventRecorder.Event(volume, v1.EventTypeWarning, events.VolumeMismatch, volumeMsg)
                claimMsg := fmt.Sprintf("Cannot bind PersistentVolume %q to requested PersistentVolumeClaim due to incompatible volumeMode.", volume.Name)
                ctrl.eventRecorder.Event(claim, v1.EventTypeWarning, events.VolumeMismatch, claimMsg)
                // Skipping syncClaim
                return nil
            }

            if metav1.HasAnnotation(volume.ObjectMeta, pvutil.AnnBoundByController) { 
                klog.V(4).Infof("synchronizing PersistentVolume[%s]: volume not bound yet, waiting for syncClaim to fix it", volume.Name)
            } else { 
                klog.V(4).Infof("synchronizing PersistentVolume[%s]: volume was bound and got unbound (by user?), waiting for syncClaim to fix it", volume.Name)
            } 
            ctrl.claimQueue.Add(claimToClaimKey(claim))
            return nil
        // Already bound; update the status phase to Bound
        } else if claim.Spec.VolumeName == volume.Name {
            // Volume is bound to a claim properly, update status if necessary
            klog.V(4).Infof("synchronizing PersistentVolume[%s]: all is bound", volume.Name)
            if _, err = ctrl.updateVolumePhase(volume, v1.VolumeBound, ""); err != nil {
                // Nothing was saved; we will fall back into the same
                // condition in the next call to this method
                return err
            }
            return nil
        // The PV is bound to the PVC, but the PVC is bound to another PV; reset
        } else {
            // Volume is bound to a claim, but the claim is bound elsewhere
            if metav1.HasAnnotation(volume.ObjectMeta, pvutil.AnnDynamicallyProvisioned) && volume.Spec.PersistentVolumeReclaimPolicy == v1.PersistentVolumeReclaimDelete {

                if volume.Status.Phase != v1.VolumeReleased && volume.Status.Phase != v1.VolumeFailed { 
                    klog.V(2).Infof("dynamically volume %q is released and it will be deleted", volume.Name)
                    if volume, err = ctrl.updateVolumePhase(volume, v1.VolumeReleased, ""); err != nil { 
                        return err
                    }
                }
                if err = ctrl.reclaimVolume(volume); err != nil { 
                    return err
                }
                return nil
            } else { 
                if metav1.HasAnnotation(volume.ObjectMeta, pvutil.AnnBoundByController) { 
                    klog.V(4).Infof("synchronizing PersistentVolume[%s]: volume is bound by controller to a claim that is bound to another volume, unbinding", volume.Name)
                    if err = ctrl.unbindVolume(volume); err != nil {
                        return err
                    }
                    return nil
                } else { 
                    klog.V(4).Infof("synchronizing PersistentVolume[%s]: volume is bound by user to a claim that is bound to another volume, waiting for the claim to get unbound", volume.Name) 
                    if err = ctrl.unbindVolume(volume); err != nil {
                        return err
                    }
                    return nil
                }
            }
        }
    }
}

This method is a bit long, let’s analyze it step by step:

This method first checks whether ClaimRef is set, because once a PV is bound its ClaimRef field is filled in. We can run kubectl edit pv mongodb-pv to inspect the current PV, and we will find something like:

  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: mongodb-pvc
    namespace: default
    resourceVersion: "824043"
    uid: 5cf34ad0-2181-4d99-9875-0d4559e58f42

So if this attribute is empty, then the status of the PV needs to be updated to Available.

If ClaimRef is not empty, the UID field is checked next. If the UID is empty, the PV is pre-bound to the PVC but the PVC has not yet been bound back to the PV, so the PV's phase is set to Available;

Then the PVC referenced by the PV is fetched. If it is not found in the local PVC cache, the controller queries the apiserver again in case the cache is stale, and records the result in the found variable;

If the PVC is found, its UID is compared with the one recorded in ClaimRef. If they are not equal, it is not the PVC this PV was bound to, so the original PVC can be considered deleted; the PV then has to be released, and its phase is changed to Released;

Then the reclaimVolume method is called, which acts according to the persistentVolumeReclaimPolicy setting:

PersistentVolumeController#reclaimVolume

func (ctrl *PersistentVolumeController) reclaimVolume(volume *v1.PersistentVolume) error {
    ...
    switch volume.Spec.PersistentVolumeReclaimPolicy {
    // Retain allows manual reclamation: when the PVC is deleted, the PV still exists and the administrator can delete it manually
    case v1.PersistentVolumeReclaimRetain:
        klog.V(4).Infof("reclaimVolume[%s]: policy is Retain, nothing to do", volume.Name)
    // Recycle the PV: if no Pod is using it, set its phase back to Available
    case v1.PersistentVolumeReclaimRecycle:
        ...
        ctrl.scheduleOperation(opName, func() error {
            ctrl.recycleVolumeOperation(volume)
            return nil
        })
    // Delete removes the PV and the storage it manages after the PVC is deleted
    case v1.PersistentVolumeReclaimDelete:
        ...
        ctrl.scheduleOperation(opName, func() error {
            _, err := ctrl.deleteVolumeOperation(volume)
            if err != nil { 
                metrics.RecordMetric(volume.Name, &ctrl.operationTimestamps, err)
            }
            return err
        })

    default:
        ...
    }
    return nil
}

This method uses a switch statement on the PersistentVolumeReclaimPolicy. For Retain, the PV has to be deleted manually, so only a log line is written here; for Recycle, recycleVolumeOperation is scheduled to perform the scrub-and-unbind operation; for Delete, deleteVolumeOperation is scheduled to delete the corresponding PV. Both operations are dispatched through scheduleOperation (sketched below).
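
scheduleOperation itself is a thin wrapper; roughly (simplified, with metrics and exponential-backoff handling trimmed), it hands the operation to a goroutine map that ensures only one operation with a given name runs at a time:

func (ctrl *PersistentVolumeController) scheduleOperation(operationName string, operation func() error) {
    klog.V(4).Infof("scheduleOperation[%s]", operationName)

    // runningOperations is a goroutinemap.GoRoutineMap: it runs the operation in a
    // new goroutine unless an operation with the same name is already in flight.
    err := ctrl.runningOperations.Run(operationName, operation)
    if err != nil {
        if goroutinemap.IsAlreadyExists(err) {
            klog.V(4).Infof("operation %q is already running, skipping", operationName)
        } else {
            klog.Errorf("error scheduling operation %q: %v", operationName, err)
        }
    }
}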

Let's pick deleteVolumeOperation to see the specific implementation of this method:

func (ctrl *PersistentVolumeController) deleteVolumeOperation(volume *v1.PersistentVolume) (string, error) {
    klog.V(4).Infof("deleteVolumeOperation [%s] started", volume.Name)

    // First read the latest PV object
    newVolume, err := ctrl.kubeClient.CoreV1().PersistentVolumes().Get(context.TODO(), volume.Name, metav1.GetOptions{})
    if err != nil {
        klog.V(3).Infof("error reading persistent volume %q: %v", volume.Name, err)
        return "", nil
    }
    // If it is already being deleted, return directly
    if newVolume.GetDeletionTimestamp() != nil {
        klog.V(3).Infof("Volume %q is already being deleted", volume.Name)
        return "", nil
    }
    // Check whether the corresponding PVC can still be found
    needsReclaim, err := ctrl.isVolumeReleased(newVolume)
    if err != nil {
        klog.V(3).Infof("error reading claim for volume %q: %v", volume.Name, err)
        return "", nil
    }
    // If a PVC is still associated with it, the PV must not be deleted
    if !needsReclaim {
        klog.V(3).Infof("volume %q no longer needs deletion, skipping", volume.Name)
        return "", nil
    }
    // Call the corresponding plugin to delete the PV
    pluginName, deleted, err := ctrl.doDeleteVolume(volume) 
    ...
    return pluginName, nil
}

As you can see, a series of checks is performed before the deletion is carried out: whether the PV is already being deleted, and whether the PVC referenced by the PV still exists (isVolumeReleased, sketched below); only then is the corresponding plugin called to perform the deletion.
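
The idea behind isVolumeReleased can be sketched as follows; this is a heavily simplified illustration, not the actual implementation, which also has to account for in-flight dynamic provisioning:

// The volume counts as "released" when its claimRef no longer points at a live PVC.
func (ctrl *PersistentVolumeController) isVolumeReleased(volume *v1.PersistentVolume) (bool, error) {
    if volume.Spec.ClaimRef == nil {
        return false, fmt.Errorf("volume %q has no ClaimRef", volume.Name)
    }
    if volume.Spec.ClaimRef.UID == "" {
        // Pre-bound by the user; the claim was never actually bound, so nothing to reclaim.
        return false, nil
    }
    obj, found, err := ctrl.claims.GetByKey(claimrefToClaimKey(volume.Spec.ClaimRef))
    if err != nil {
        return false, err
    }
    if !found {
        // The claim is gone, so the volume is released.
        return true, nil
    }
    claim, ok := obj.(*v1.PersistentVolumeClaim)
    if !ok {
        return false, fmt.Errorf("cannot convert object from claim cache: %+v", obj)
    }
    // A claim with the same name but a different UID is a new claim;
    // the original one was deleted, so the volume is released.
    return claim.UID != volume.Spec.ClaimRef.UID, nil
}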

We now go back to PersistentVolumeController's syncVolume. After the claim has been verified, it goes on to check whether the claim's VolumeName is empty, which indicates the binding is still in progress;

If the PVC's VolumeName equals the PV's name, the two are already bound, and the PV's phase is updated to Bound; otherwise the PV points at a PVC that is bound to some other PV. In that case the controller checks whether the PV was dynamically provisioned and automatically generated: if so, the PV is released; if it was created manually, unbindVolume is called to unbind it (sketched below).
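
unbindVolume roughly does the following (a simplified sketch; the real code also writes the result back into the controller's internal cache):

func (ctrl *PersistentVolumeController) unbindVolume(volume *v1.PersistentVolume) error {
    klog.V(4).Infof("updating PersistentVolume[%s]: rolling back binding from %q", volume.Name, claimrefToClaimKey(volume.Spec.ClaimRef))

    // Work on a copy so the shared informer cache is not mutated.
    volumeClone := volume.DeepCopy()
    if metav1.HasAnnotation(volume.ObjectMeta, pvutil.AnnBoundByController) {
        // The binding was created by the controller: remove it completely.
        volumeClone.Spec.ClaimRef = nil
        delete(volumeClone.Annotations, pvutil.AnnBoundByController)
        if len(volumeClone.Annotations) == 0 {
            volumeClone.Annotations = nil
        }
    } else {
        // The volume was pre-bound by the user: keep the reference,
        // only clear the UID that the controller filled in.
        volumeClone.Spec.ClaimRef.UID = ""
    }

    newVol, err := ctrl.kubeClient.CoreV1().PersistentVolumes().Update(context.TODO(), volumeClone, metav1.UpdateOptions{})
    if err != nil {
        return err
    }
    // The volume is unbound again, so mark it Available.
    _, err = ctrl.updateVolumePhase(newVol, v1.VolumeAvailable, "")
    return err
}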

That completes volumeWorker; now let's look at claimWorker.

claimWorker

Like volumeWorker, claimWorker continuously fetches PVCs in a loop and then calls the updateClaim method, which enters syncClaim for the actual work:

PersistentVolumeController#syncClaim

func (ctrl *PersistentVolumeController) syncClaim(claim *v1.PersistentVolumeClaim) error {
    klog.V(4).Infof("synchronizing PersistentVolumeClaim[%s]: %s", claimToClaimKey(claim), getClaimStatusForLogging(claim))

    newClaim, err := ctrl.updateClaimMigrationAnnotations(claim)
    if err != nil { 
        return err
    }
    claim = newClaim
    // Decide which path to take based on the annotations on the current object
    if !metav1.HasAnnotation(claim.ObjectMeta, pvutil.AnnBindCompleted) {
        // Handle an unbound PVC
        return ctrl.syncUnboundClaim(claim)
    } else {
        // Handle an already bound PVC
        return ctrl.syncBoundClaim(claim)
    }
}

This method first refreshes the claim's migration annotations, and then dispatches based on the pvutil.AnnBindCompleted annotation: claims without it go to syncUnboundClaim, and claims that already carry it go to syncBoundClaim.

Let's start with syncUnboundClaim. The method is relatively long, so it is shown here in two parts:

func (ctrl *PersistentVolumeController) syncUnboundClaim(claim *v1.PersistentVolumeClaim) error {
        // The PVC is in the Pending state and has not completed binding
    if claim.Spec.VolumeName == "" {
        // User did not care which PV they get.
        // Check whether delayed binding is in effect
        delayBinding, err := pvutil.IsDelayBindingMode(claim, ctrl.classLister)
        if err != nil {
            return err
        }

        // [Unit test set 1]
        // Find a matching PV based on the fields declared in the PVC
        volume, err := ctrl.volumes.findBestMatchForClaim(claim, delayBinding)
        if err != nil {
            klog.V(2).Infof("synchronizing unbound PersistentVolumeClaim[%s]: Error finding PV for claim: %v", claimToClaimKey(claim), err)
            return fmt.Errorf("Error finding PV for claim %q: %v", claimToClaimKey(claim), err)
        }
        // Case: no suitable volume was found
        if volume == nil {
            klog.V(4).Infof("synchronizing unbound PersistentVolumeClaim[%s]: no volume found", claimToClaimKey(claim)) 
            switch {
            case delayBinding && !pvutil.IsDelayBindingProvisioning(claim):
                if err = ctrl.emitEventForUnboundDelayBindingClaim(claim); err != nil {
                    return err
                }
            // A StorageClass is specified: look it up
            case v1helper.GetPersistentVolumeClaimClass(claim) != "":
                // Create a PV via the corresponding plugin
                if err = ctrl.provisionClaim(claim); err != nil {
                    return err
                }
                return nil
            default:
                ctrl.eventRecorder.Event(claim, v1.EventTypeNormal, events.FailedBinding, "no persistent volumes available for this claim and no storage class is set")
            }

            // Wait for the next loop to look for a matching PV and bind
            if _, err = ctrl.updateClaimStatus(claim, v1.ClaimPending, nil); err != nil {
                return err
            }
            return nil
        // A volume was found: perform the binding
        } else /* pv != nil */ { 
            claimKey := claimToClaimKey(claim)
            klog.V(4).Infof("synchronizing unbound PersistentVolumeClaim[%s]: volume %q found: %s", claimKey, volume.Name, getVolumeStatusForLogging(volume))
            // Perform the binding
            if err = ctrl.bind(volume, claim); err != nil { 
                metrics.RecordMetric(claimKey, &ctrl.operationTimestamps, err)
                return err
            } 
            metrics.RecordMetric(claimKey, &ctrl.operationTimestamps, nil)
            return nil
        }
    }
    ...
}

This first half checks whether the claim's VolumeName is empty; if it is, the controller first determines whether delayed binding is in effect.

Then it searches the PV collection for a PV that satisfies the claim. If no suitable PV is available, it checks whether dynamic provisioning applies (i.e. a StorageClass is set); if so, it asynchronously creates a PV via provisionClaim, sets the PVC status to Pending, and waits for the next loop to find and bind the matching PV;

If a matching PV is found, the bind method is called to perform the binding. The bind method is not reproduced in full here; it updates the PV's ClaimRef field, both objects' status phases, the PVC's VolumeName, and so on (a simplified sketch follows).
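
Roughly, bind chains together four updates (a simplified sketch; logging, retries, and cache bookkeeping are omitted):

func (ctrl *PersistentVolumeController) bind(volume *v1.PersistentVolume, claim *v1.PersistentVolumeClaim) error {
    var err error

    // 1. Write the claim reference into the PV's spec.claimRef.
    if volume, err = ctrl.bindVolumeToClaim(volume, claim); err != nil {
        return err
    }
    // 2. Mark the PV as Bound.
    if volume, err = ctrl.updateVolumePhase(volume, v1.VolumeBound, ""); err != nil {
        return err
    }
    // 3. Write the PV's name into the claim's spec.volumeName.
    if claim, err = ctrl.bindClaimToVolume(claim, volume); err != nil {
        return err
    }
    // 4. Mark the PVC as Bound.
    if _, err = ctrl.updateClaimStatus(claim, v1.ClaimBound, volume); err != nil {
        return err
    }
    return nil
}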

Next, let's look at the second half of syncUnboundClaim:

func (ctrl *PersistentVolumeController) syncUnboundClaim(claim *v1.PersistentVolumeClaim) error {
    ...
    } else /* pvc.Spec.VolumeName != nil */ { 
        klog.V(4).Infof("synchronizing unbound PersistentVolumeClaim[%s]: volume %q requested", claimToClaimKey(claim), claim.Spec.VolumeName)
        // VolumeName is not empty, so look up the corresponding PV
        obj, found, err := ctrl.volumes.store.GetByKey(claim.Spec.VolumeName)
        if err != nil {
            return err
        }
        // The corresponding PV no longer exists; update the status to Pending
        if !found { 
            klog.V(4).Infof("synchronizing unbound PersistentVolumeClaim[%s]: volume %q requested and not found, will try again next time", claimToClaimKey(claim), claim.Spec.VolumeName)
            if _, err = ctrl.updateClaimStatus(claim, v1.ClaimPending, nil); err != nil {
                return err
            }
            return nil
        } else {
            volume, ok := obj.(*v1.PersistentVolume)
            if !ok {
                return fmt.Errorf("Cannot convert object from volume cache to volume %q!?: %+v", claim.Spec.VolumeName, obj)
            }
            klog.V(4).Infof("synchronizing unbound PersistentVolumeClaim[%s]: volume %q requested and found: %s", claimToClaimKey(claim), claim.Spec.VolumeName, getVolumeStatusForLogging(volume))
            // The PV's ClaimRef is empty, so call bind to perform the binding
            if volume.Spec.ClaimRef == nil { 
                klog.V(4).Infof("synchronizing unbound PersistentVolumeClaim[%s]: volume is unbound, binding", claimToClaimKey(claim))
                if err = checkVolumeSatisfyClaim(volume, claim); err != nil {
                    klog.V(4).Infof("Can't bind the claim to volume %q: %v", volume.Name, err) 
                    msg := fmt.Sprintf("Cannot bind to requested volume %q: %s", volume.Name, err)
                    ctrl.eventRecorder.Event(claim, v1.EventTypeWarning, events.VolumeMismatch, msg) 
                    if _, err = ctrl.updateClaimStatus(claim, v1.ClaimPending, nil); err != nil {
                        return err
                    }
                } else if err = ctrl.bind(volume, claim); err != nil { 
                    return err
                } 
                return nil
            // Check whether the volume is already bound to another PVC; if not, perform the binding
            } else if pvutil.IsVolumeBoundToClaim(volume, claim) { 
                klog.V(4).Infof("synchronizing unbound PersistentVolumeClaim[%s]: volume already bound, finishing the binding", claimToClaimKey(claim))

                if err = ctrl.bind(volume, claim); err != nil {
                    return err
                } 
                return nil
            } else {
                // The PV is bound to another PVC; wait for the next loop and retry
                ...
            }
        }
    }
}

This branch handles the case where VolumeName is not empty, so the corresponding PV is simply looked up by name. If that PV no longer exists, the claim is set back to Pending and the binding is retried on the next loop;

If the corresponding PV is found and its ClaimRef field is empty, checkVolumeSatisfyClaim first verifies that the PV satisfies the claim, and then bind is called to perform the binding;

If ClaimRef is not empty, IsVolumeBoundToClaim is called to check whether the PV is already bound to a different PVC; if it is not, the binding is completed.

IsVolumeBoundToClaim

func IsVolumeBoundToClaim(volume *v1.PersistentVolume, claim *v1.PersistentVolumeClaim) bool {
    if volume.Spec.ClaimRef == nil {
        return false
    }
    if claim.Name != volume.Spec.ClaimRef.Name || claim.Namespace != volume.Spec.ClaimRef.Namespace {
        return false
    }
    if volume.Spec.ClaimRef.UID != "" && claim.UID != volume.Spec.ClaimRef.UID {
        return false
    }
    return true
}

As you can see, this method simply checks whether the relevant fields match; if they do not, it returns false, which means the PV is bound to a different PVC, and the controller waits for the next loop to try again.

Let's take a look at what syncBoundClaim does:

func (ctrl *PersistentVolumeController) syncBoundClaim(claim *v1.PersistentVolumeClaim) error { 
    if claim.Spec.VolumeName == "" { 
        // The claim was bound before, but the PV reference is now gone, which means data is lost; emit a warning event while updating the status
        if _, err := ctrl.updateClaimStatusWithEvent(claim, v1.ClaimLost, nil, v1.EventTypeWarning, "ClaimLost", "Bound claim has lost reference to PersistentVolume. Data on the volume is lost!"); err != nil {
            return err
        }
        return nil
    }
    obj, found, err := ctrl.volumes.store.GetByKey(claim.Spec.VolumeName)
    if err != nil {
        return err
    }
    // Case: bound to a PV that no longer exists
    if !found { 
        // The claim was bound before, but its PV can no longer be found, which means data is lost; emit a warning event while updating the status
        if _, err = ctrl.updateClaimStatusWithEvent(claim, v1.ClaimLost, nil, v1.EventTypeWarning, "ClaimLost", "Bound claim has lost its PersistentVolume. Data on the volume is lost!"); err != nil {
            return err
        }
        return nil
    // Case: the PV exists
    } else {
        volume, ok := obj.(*v1.PersistentVolume)
        if !ok {
            return fmt.Errorf("Cannot convert object from volume cache to volume %q!?: %#v", claim.Spec.VolumeName, obj)
        }

        klog.V(4).Infof("synchronizing bound PersistentVolumeClaim[%s]: volume %q found: %s", claimToClaimKey(claim), claim.Spec.VolumeName, getVolumeStatusForLogging(volume))
        // Fix the binding: the PVC is bound, but the PV is not
        if volume.Spec.ClaimRef == nil { 
            klog.V(4).Infof("synchronizing bound PersistentVolumeClaim[%s]: volume is unbound, fixing", claimToClaimKey(claim))
            if err = ctrl.bind(volume, claim); err != nil {
                // Objects not saved, next syncPV or syncClaim will try again
                return err
            }
            return nil
        // The binding is already correct; refresh it
        } else if volume.Spec.ClaimRef.UID == claim.UID { 
            klog.V(4).Infof("synchronizing bound PersistentVolumeClaim[%s]: claim is already correctly bound", claimToClaimKey(claim))
            if err = ctrl.bind(volume, claim); err != nil { 
                return err
            }
            return nil
        } else { 
            // Two PVCs are bound to the same PV; this one is bound incorrectly
            if _, err = ctrl.updateClaimStatusWithEvent(claim, v1.ClaimLost, nil, v1.EventTypeWarning, "ClaimMisbound", "Two claims are bound to the same volume, this one is bound incorrectly"); err != nil {
                return err
            }
            return nil
        }
    }
}

This method mainly handles the various abnormal situations of a PVC that is supposedly already bound: it checks whether the VolumeName field is empty, whether the referenced PV can still be found, whether that PV is actually bound to this PVC, whether several PVCs ended up bound to the same PV, and so on.

Summary

This article first explained the use of PV and PVC through an example, then covered Dynamic Provisioning and some basic PV and PVC concepts. We then walked through the source code to see how PV and PVC are processed and the details of how they are bound to each other. One gap remains: how the AD (attach/detach) controller performs volume attach and detach is not covered here and will be filled in later when there is an opportunity.


Origin: blog.csdn.net/weixin_45784983/article/details/108340193