kubeflow-11 Persistent storage operation: VolumeOp

1 kfp.dsl.VolumeOp

class VolumeOp(kfp.dsl._resource_op.ResourceOp)
Represents an op which will be translated into a resource template that creates a PVC.
Parameters:
(1) resource_name (required): A desired name for the PVC which will be created.
(2) size (required): The size of the PVC which will be created,
e.g. size="1Gi".
(3) storage_class (optional): The storage class to use for the dynamically created PVC.
(4) modes (optional): The access modes for the PVC.
The user may find the following modes built-in:
* `VOLUME_MODE_RWO`: `["ReadWriteOnce"]`
* `VOLUME_MODE_RWM`: `["ReadWriteMany"]`
* `VOLUME_MODE_ROM`: `["ReadOnlyMany"]`
Defaults to `VOLUME_MODE_RWM`.
(5) annotations (optional): Annotations to be patched into the PVC.
(6) data_source (optional): May be a V1TypedLocalObjectReference,
in which case it is used in the data_source field of the PVC as is.
Can also be a string/PipelineParam,
in which case it is used as a VolumeSnapshot name (Alpha feature).
(7) volume_name (optional): The binding reference to the PersistentVolume backing this claim.
(8) kwargs: See :py:class:`kfp.dsl.ResourceOp`,
e.g. name, the display name of the op shown in the graph.
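The built-in access-mode constants listed above are plain lists of Kubernetes access-mode strings. A minimal sketch with the values copied from the list above (stand-in variables, not imported from kfp; in real code use kfp.dsl.VOLUME_MODE_RWO etc. directly):

```python
# Stand-ins mirroring the kfp.dsl constants listed above.
VOLUME_MODE_RWO = ["ReadWriteOnce"]
VOLUME_MODE_RWM = ["ReadWriteMany"]
VOLUME_MODE_ROM = ["ReadOnlyMany"]

# VOLUME_MODE_RWM is the default used when `modes` is not given.
print(VOLUME_MODE_RWM)  # ['ReadWriteMany']
```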
****************************************************
Raises: ValueError:
if k8s_resource is provided along with other arguments
if k8s_resource is not a V1PersistentVolumeClaim
if size is None
if size is an invalid memory string (when not a PipelineParam)
if data_source is not one of (str, PipelineParam, V1TypedLocalObjectReference)
****************************************************
__init__(self, 
resource_name:str=None, 
size:str=None, 
storage_class:str=None, 
modes:List[str]=None, 
annotations:Dict[str, str]=None, 
data_source=None, 
volume_name=None, 
**kwargs)
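The "invalid memory string" ValueError above refers to the Kubernetes quantity format that `size` must follow. A rough sketch of such a check (this regex is our approximation for illustration; the real validation inside kfp/Kubernetes is more permissive, e.g. it also accepts exponent notation):

```python
import re

# Approximation of a Kubernetes quantity: digits, optional decimal part,
# optional binary (Ki, Mi, Gi, ...) or decimal (k, M, G, ...) suffix.
_QUANTITY_RE = re.compile(r'^[0-9]+(\.[0-9]+)?(Ki|Mi|Gi|Ti|Pi|Ei|[kMGTPE])?$')

def looks_like_k8s_quantity(size: str) -> bool:
    """Return True if `size` resembles a Kubernetes quantity like '1Gi'."""
    return bool(_QUANTITY_RE.match(size))

print(looks_like_k8s_quantity("1Gi"))   # valid binary-suffix quantity
print(looks_like_k8s_quantity("big"))   # would trigger ValueError in VolumeOp
```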

 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from kfp.dsl._resource_op.ResourceOp:
 |
 |  delete(self, flags:Union[List[str], NoneType]=None)
 |      Returns a ResourceOp which deletes the resource.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from kfp.dsl._resource_op.ResourceOp:
 |
 |  resource
 |      `Resource` object that represents the `resource` property in
 |      `io.argoproj.workflow.v1alpha1.Template`.
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from kfp.dsl._container_op.BaseOp:
 |
 |  __repr__(self)
 |      Return repr(self).
 |
 |  add_affinity(self, affinity:kubernetes.client.models.v1_affinity.V1Affinity)
 |      Add K8s Affinity
 |
 |      Args:
 |        affinity: Kubernetes affinity
 |        For detailed spec, check affinity definition
 |        https://github.com/kubernetes-client/python/blob/master/kubernetes/client/models/v1_affinity.py
 |
 |      Example::
 |
 |          V1Affinity(
 |              node_affinity=V1NodeAffinity(
 |                  required_during_scheduling_ignored_during_execution=V1NodeSelector(
 |                      node_selector_terms=[V1NodeSelectorTerm(
 |                          match_expressions=[V1NodeSelectorRequirement(
 |                              key='beta.kubernetes.io/instance-type', operator='In', values=['p2.xlarge'])])])))
 |
 |  add_init_container(self, init_container:kfp.dsl._container_op.UserContainer)
 |      Add an init container to the Op.
 |
 |      Args:
 |        init_container: UserContainer object.
 |
 |  add_node_selector_constraint(self, label_name, value)
 |      Add a constraint for nodeSelector. Each constraint is a key-value pair label. For the
 |      container to be eligible to run on a node, the node must have each of the constraints present
 |      as labels.
 |
 |      Args:
 |        label_name: The name of the constraint label.
 |        value: The value of the constraint label.
 |
 |  add_pod_annotation(self, name:str, value:str)
 |      Adds a pod's metadata annotation.
 |
 |      Args:
 |        name: The name of the annotation.
 |        value: The value of the annotation.
 |
 |  add_pod_label(self, name:str, value:str)
 |      Adds a pod's metadata label.
 |
 |      Args:
 |        name: The name of the label.
 |        value: The value of the label.
 |
 |  add_sidecar(self, sidecar:kfp.dsl._container_op.Sidecar)
 |      Add a sidecar to the Op.
 |
 |      Args:
 |        sidecar: SideCar object.
 |
 |  add_toleration(self, tolerations:kubernetes.client.models.v1_toleration.V1Toleration)
 |      Add K8s tolerations
 |
 |      Args:
 |        tolerations: Kubernetes toleration
 |        For detailed spec, check toleration definition
 |        https://github.com/kubernetes-client/python/blob/master/kubernetes/client/models/v1_toleration.py
 |
 |  add_volume(self, volume)
 |      Add K8s volume to the container
 |
 |      Args:
 |        volume: Kubernetes volumes
 |        For detailed spec, check volume definition
 |        https://github.com/kubernetes-client/python/blob/master/kubernetes/client/models/v1_volume.py
 |
 |  after(self, *ops)
 |      Specify explicit dependency on other ops.
 |
 |  apply(self, mod_func)
 |      Applies a modifier function to self. The function should return the passed object.
 |      This is needed to chain "extension methods" to this class.
 |
 |      Example::
 |
 |          from kfp.gcp import use_gcp_secret
 |          task = (
 |              train_op(...)
 |                  .set_memory_request('1G')
 |                  .apply(use_gcp_secret('user-gcp-sa'))
 |                  .set_memory_limit('2G')
 |          )
 |
 |  set_display_name(self, name:str)
 |
 |  set_retry(self, num_retries:int, policy:str=None)
 |      Sets the number of times the task is retried until it's declared failed.
 |
 |      Args:
 |        num_retries: Number of times to retry on failures.
 |        policy: Retry policy name.
 |
 |  set_timeout(self, seconds:int)
 |      Sets the timeout for the task in seconds.
 |
 |      Args:
 |        seconds: Number of seconds.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from kfp.dsl._container_op.BaseOp:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  inputs
 |      List of PipelineParams that will be converted into input parameters
 |      (io.argoproj.workflow.v1alpha1.Inputs) for the argo workflow.
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from kfp.dsl._container_op.BaseOp:
 |
 |  attrs_with_pipelineparams = ['node_selector', 'volumes', 'pod_annotati...

2 kfp.dsl.ContainerOp

kfp.dsl.ContainerOp
represents an op implemented by a container image.

class ContainerOp(BaseOp)
Represents an op implemented by a container image.
Parameters:
(1) name (commonly used): the name of the op.
It does not have to be unique within a pipeline,
because the pipeline will generate a unique new name in case of conflicts.
(2) image (commonly used): the container image name, such as 'python:3.5-jessie'.
(3) command (commonly used): the command to run in the container.
If None, uses the default CMD defined in the container image.
(4) arguments (commonly used): the arguments of the command.
The command can include "%s" and supply a PipelineParam as the string replacement.
 
For example, ('echo %s' % input_param).
At container run time the argument will be 'echo param_value'.
(5)init_containers: the list of `UserContainer` objects 
describing the InitContainer to deploy before the `main` container.
(6)sidecars: the list of `Sidecar` objects 
describing the sidecar containers to deploy together with the `main` container.
(7)container_kwargs: the dict of additional keyword arguments to pass to the op's `Container` definition.

(8)artifact_argument_paths: Optional. Maps input artifact arguments (values or references) to the local file paths where they'll be placed. 

At pipeline run time, the value of the artifact argument is saved to a local file with specified path. 

This parameter is only needed when the input file paths are hard-coded in the program. 

Otherwise it's better to pass input artifact placement paths by including artifact arguments in the command-line using the InputArgumentPath class instances.

(9) file_outputs (commonly used): Maps output names to container-local output file paths.

The system will take the data from those files and make it available for passing to downstream tasks.

For each output in the file_outputs map there will be a corresponding output reference available in the task.outputs dictionary.

These output references can be passed to the other tasks as arguments.

The following output names are handled specially by the frontend and backend: "mlpipeline-ui-metadata" and "mlpipeline-metrics".
(10) output_artifact_paths: Deprecated; use file_outputs instead, which now supports big data outputs. Maps output artifact labels to local artifact file paths.
(11) is_exit_handler: Deprecated. This is no longer needed.
(12) pvolumes: Dictionary for the user to match a path on the op's filesystem with a V1Volume or an inherited type.
E.g. {"/my/path": vol, "/mnt": other_op.pvolumes["/output"]}.
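The "%s" substitution described for the arguments parameter can be sketched without kfp: at compile time the PipelineParam serializes to a placeholder string, which the backend replaces with the actual value at run time. The two-stage split below is illustrative, not kfp's actual internals:

```python
# Illustrative view of argument substitution: the DSL serializes a
# PipelineParam into a placeholder, and the backend resolves it later.
placeholder = '{{pipelineparam:op=;name=input}}'         # serialized PipelineParam
argument = 'echo %s' % placeholder                       # what the DSL produces
resolved = argument.replace(placeholder, 'param_value')  # backend resolution
print(resolved)  # 'echo param_value'
```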
**********************************************************
Example:
from kfp import dsl
from kubernetes.client.models import V1EnvVar, V1SecretKeySelector

@dsl.pipeline(
  name='foo',
  description='hello world')
def foo_pipeline(tag: str, pull_image_policy: str):

  # any attributes can be parameterized (both serialized string or actual PipelineParam)
  op = dsl.ContainerOp(name='foo',
	  image='busybox:%s' % tag,
	  # pass in init_container list
	  init_containers=[dsl.UserContainer('print', 'busybox:latest', command='echo "hello"')],
	  # pass in sidecars list
	  sidecars=[dsl.Sidecar('print', 'busybox:latest', command='echo "hello"')],
	  # pass in k8s container kwargs
	  container_kwargs={'env': [V1EnvVar('foo', 'bar')]},
  )
  # set `imagePullPolicy` property for `container` with `PipelineParam`
  op.container.set_image_pull_policy(pull_image_policy)

  # add sidecar with parameterized image tag
  # sidecar follows the argo sidecar swagger spec
  op.add_sidecar(dsl.Sidecar('redis', 'redis:%s' % tag).set_image_pull_policy('Always'))
********************************************************
__init__(self, 
name:str, 
image:str, 
command:Union[str, List[str], NoneType]=None, 
arguments:Union[str, List[str], NoneType]=None, 
init_containers:Union[List[kfp.dsl._container_op.UserContainer], NoneType]=None, 
sidecars:Union[List[kfp.dsl._container_op.Sidecar], NoneType]=None, 
container_kwargs:Union[Dict, NoneType]=None, 
artifact_argument_paths:Union[List[kfp.dsl._container_op.InputArgumentPath], NoneType]=None, 
file_outputs:Union[Dict[str, str], NoneType]=None, 
output_artifact_paths:Union[Dict[str, str], NoneType]=None, 
is_exit_handler:bool=False, 
pvolumes:Union[Dict[str,kubernetes.client.models.v1_volume.V1Volume], NoneType]=None)

Initialize self.

add_pvolumes(self, 
pvolumes:Dict[str,kubernetes.client.models.v1_volume.V1Volume]=None)
Updates the existing pvolumes dict, extends volumes and volume_mounts, and redefines the pvolume attribute.
 |
 |      Args:
 |          pvolumes: Dictionary. Keys are mount paths, values are Kubernetes
 |                    volumes or inherited types (e.g. PipelineVolumes).
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  arguments
 |
 |  command
 |
 |  container
 |      `Container` object that represents the `container` property in
 |      `io.argoproj.workflow.v1alpha1.Template`. Can be used to update the
 |      container configurations.
 |
 |      Example::
 |
 |          import kfp.dsl as dsl
 |          from kubernetes.client.models import V1EnvVar
 |
 |          @dsl.pipeline(name='example_pipeline')
 |          def immediate_value_pipeline():
 |              op1 = (dsl.ContainerOp(name='example', image='nginx:alpine')
 |                      .container
 |                          .add_env_variable(V1EnvVar(name='HOST', value='foo.bar'))
 |                          .add_env_variable(V1EnvVar(name='PORT', value='80'))
 |                          .parent # return the parent `ContainerOp`
 |                      )
 |
 |  env_variables
 |
 |  image
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from BaseOp:
 |
 |  __repr__(self)
 |      Return repr(self).
 |
 |  add_affinity(self, affinity:kubernetes.client.models.v1_affinity.V1Affinity)
 |      Add K8s Affinity
 |
 |      Args:
 |        affinity: Kubernetes affinity
 |        For detailed spec, check affinity definition
 |        https://github.com/kubernetes-client/python/blob/master/kubernetes/client/models/v1_affinity.py
 |
 |      Example::
 |
 |          V1Affinity(
 |              node_affinity=V1NodeAffinity(
 |                  required_during_scheduling_ignored_during_execution=V1NodeSelector(
 |                      node_selector_terms=[V1NodeSelectorTerm(
 |                          match_expressions=[V1NodeSelectorRequirement(
 |                              key='beta.kubernetes.io/instance-type', operator='In', values=['p2.xlarge'])])])))
 |
 |  add_init_container(self, init_container:kfp.dsl._container_op.UserContainer)
 |      Add an init container to the Op.
 |
 |      Args:
 |        init_container: UserContainer object.
 |
 |  add_node_selector_constraint(self, label_name, value)
 |      Add a constraint for nodeSelector. Each constraint is a key-value pair label. For the
 |      container to be eligible to run on a node, the node must have each of the constraints present
 |      as labels.
 |
 |      Args:
 |        label_name: The name of the constraint label.
 |        value: The value of the constraint label.
 |
 |  add_pod_annotation(self, name:str, value:str)
 |      Adds a pod's metadata annotation.
 |
 |      Args:
 |        name: The name of the annotation.
 |        value: The value of the annotation.
 |
 |  add_pod_label(self, name:str, value:str)
 |      Adds a pod's metadata label.
 |
 |      Args:
 |        name: The name of the label.
 |        value: The value of the label.
 |
 |  add_sidecar(self, sidecar:kfp.dsl._container_op.Sidecar)
 |      Add a sidecar to the Op.
 |
 |      Args:
 |        sidecar: SideCar object.
 |
 |  add_toleration(self, tolerations:kubernetes.client.models.v1_toleration.V1Toleration)
 |      Add K8s tolerations
 |
 |      Args:
 |        tolerations: Kubernetes toleration
 |        For detailed spec, check toleration definition
 |        https://github.com/kubernetes-client/python/blob/master/kubernetes/client/models/v1_toleration.py
 |
 |  add_volume(self, volume)
 |      Add K8s volume to the container
 |
 |      Args:
 |        volume: Kubernetes volumes
 |        For detailed spec, check volume definition
 |        https://github.com/kubernetes-client/python/blob/master/kubernetes/client/models/v1_volume.py
 |
 |  after(self, *ops)
 |      Specify explicit dependency on other ops.
 |
 |  apply(self, mod_func)
 |      Applies a modifier function to self. The function should return the passed object.
 |      This is needed to chain "extension methods" to this class.
 |
 |      Example::
 |
 |          from kfp.gcp import use_gcp_secret
 |          task = (
 |              train_op(...)
 |                  .set_memory_request('1G')
 |                  .apply(use_gcp_secret('user-gcp-sa'))
 |                  .set_memory_limit('2G')
 |          )
 |
 |  set_display_name(self, name:str)
 |
 |  set_retry(self, num_retries:int, policy:str=None)
 |      Sets the number of times the task is retried until it's declared failed.
 |
 |      Args:
 |        num_retries: Number of times to retry on failures.
 |        policy: Retry policy name.
 |
 |  set_timeout(self, seconds:int)
 |      Sets the timeout for the task in seconds.
 |
 |      Args:
 |        seconds: Number of seconds.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from BaseOp:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  inputs
 |      List of PipelineParams that will be converted into input parameters
 |      (io.argoproj.workflow.v1alpha1.Inputs) for the argo workflow.
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from BaseOp:
 |
 |  attrs_with_pipelineparams = ['node_selector', 'volumes', 'pod_annotati...

3 Application

3.1 Example 1

import kfp
@kfp.dsl.pipeline(
    name="VolumeOp Basic",
    description="A Basic Example on VolumeOp Usage."
)
def my_pipeline():
    vop = kfp.dsl.VolumeOp(
        name="create-pvc",
        resource_name="my-pvc",
        modes=kfp.dsl.VOLUME_MODE_RWO,
        size="1Gi"
    )

    cop = kfp.dsl.ContainerOp(
        name="cop",
        image="library/bash:4.4.23",
        command=["sh", "-c"],
        arguments=["echo foo > /mnt/file1"],
        pvolumes={"/mnt": vop.volume}
    )

if __name__ == '__main__':
    # (1) Compile the pipeline
    pipeline_func = my_pipeline
    pipeline_filename = pipeline_func.__name__ + '.yaml'
    kfp.compiler.Compiler().compile(pipeline_func, pipeline_filename)
    # (2) Create the experiment
    client = kfp.Client()
    try:
        experiment = client.create_experiment("mnist experiment")
    except Exception:
        experiment = client.get_experiment(experiment_name="mnist experiment")

    # (3) Run the job
    run_name = pipeline_func.__name__ + ' test_run'
    run_result = client.run_pipeline(experiment.id, run_name, pipeline_filename)

3.1.1 First step: the VolumeOp

vop = kfp.dsl.VolumeOp(
    name="create-pvc",
    resource_name="my-pvc",
    modes=kfp.dsl.VOLUME_MODE_RWO,
    size="1Gi"
)

name is the name displayed in the graph.
pipeline.name combines with vop.name to form the name of the PVC.
The backing PV and PVC are created automatically according to the StorageClass configured in the system.

(1-1) View the PV
(1-2) View the PVC

3.1.2 Second step: the ContainerOp

cop = kfp.dsl.ContainerOp(
    name="cop",
    image="library/bash:4.4.23",
    command=["sh", "-c"],
    arguments=["echo foo > /mnt/file1"],
    pvolumes={"/mnt": vop.volume}
)
name is the name displayed in the graph.
pvolumes={"/mnt": vop.volume} mounts the created volume at /mnt.

Check the node the PVC is scheduled to:
/mnt is equivalent to /opt/local-path-provisioner/pvc-6f3c34ae-7de9-4f3d-bfe9-b7a80299f994,
where vop.volume represents the following content:

{'aws_elastic_block_store': None,
 'azure_disk': None,
 'azure_file': None,
 'cephfs': None,
 'cinder': None,
 'config_map': None,
 'csi': None,
 'downward_api': None,
 'empty_dir': None,
 'fc': None,
 'flex_volume': None,
 'flocker': None,
 'gce_persistent_disk': None,
 'git_repo': None,
 'glusterfs': None,
 'host_path': None,
 'iscsi': None,
 'name': 'create-pvc',
 'nfs': None,
 'persistent_volume_claim': {'claim_name': '{{pipelineparam:op=create-pvc;name=name}}',
                             'read_only': None},
 'photon_persistent_disk': None,
 'portworx_volume': None,
 'projected': None,
 'quobyte': None,
 'rbd': None,
 'scale_io': None,
 'secret': None,
 'storageos': None,
 'vsphere_volume': None}
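The claim_name above is a serialized PipelineParam. A small helper reproducing that placeholder format (the function name is ours for illustration, not a kfp API):

```python
def pipeline_param_placeholder(op_name: str, param_name: str) -> str:
    """Build the '{{pipelineparam:op=...;name=...}}' placeholder string
    that kfp embeds for a serialized PipelineParam."""
    return '{{pipelineparam:op=%s;name=%s}}' % (op_name, param_name)

print(pipeline_param_placeholder('create-pvc', 'name'))
# '{{pipelineparam:op=create-pvc;name=name}}'
```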

(2-1) View inputs and outputs
(2-2) View the volumes
(3) View the contents of the written file
Check the node the PVC is scheduled to:
/mnt is equivalent to /opt/local-path-provisioner/pvc-6f3c34ae-7de9-4f3d-bfe9-b7a80299f994

3.1.3 Deleting the PVC

The general deletion order is: first delete the pods, then the PVC, and finally the PV.
(1) Delete the pods
# kubectl delete pod volumeop-basic-csgvg-275778418 -n kubeflow
# kubectl delete pod volumeop-basic-csgvg-3408782246 -n kubeflow
(2) Delete the PVC
# kubectl delete pvc volumeop-basic-csgvg-my-pvc -n kubeflow
(3) Delete the PV
When the PVC is deleted, the PV is deleted automatically.

3.2 Example 2

### Useful attributes
1. The `VolumeOp` step has a `.volume` attribute which is a `PipelineVolume` referencing the
   created PVC.
   A `PipelineVolume` is essentially a `V1Volume` supplemented with an `.after()` method extending
   the carried dependencies.
   These dependencies can then be parsed properly by a `ContainerOp`, if used with the `pvolumes`
   attribute, to extend the `ContainerOp`'s dependencies.
2. A `ContainerOp` has a `pvolumes` argument in its constructor.
   This is a dictionary with mount paths as keys and volumes as values and functions similarly to
   `file_outputs` (which can then be used as `op.outputs["key"]` or `op.output`).
   For example:
   ```python
   vop = dsl.VolumeOp(
       name="volume_creation",
       resource_name="mypvc",
       size="1Gi"
   )
   step1 = dsl.ContainerOp(
       name="step1",
       ...
       pvolumes={"/mnt": vop.volume}  # Implies execution after vop
   )
   step2 = dsl.ContainerOp(
       name="step2",
       ...
       pvolumes={"/data": step1.pvolume,  # Implies execution after step1
                 "/mnt": dsl.PipelineVolume(pvc="existing-pvc")}
   )
   step3 = dsl.ContainerOp(
       name="step3",
       ...
       pvolumes={"/common": step2.pvolumes["/mnt"]}  # Implies execution after step2
   )
   ```
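The implied-ordering behaviour described above can be modelled with a toy sketch. These classes are stand-ins for illustration only, not the real kfp PipelineVolume/ContainerOp:

```python
# Toy model (NOT the real kfp classes) of how mounting another step's
# volume implies an execution dependency on that step.
class Volume:
    def __init__(self, producer=None):
        self.producer = producer  # Step that created it, or None for an existing PVC

class Step:
    def __init__(self, name, pvolumes=None):
        self.name = name
        self.pvolumes = dict(pvolumes or {})
        # Dependencies implied by the mounted volumes, like kfp's ordering.
        self.deps = {v.producer for v in self.pvolumes.values()
                     if v.producer is not None}

existing = Volume()                                 # like dsl.PipelineVolume(pvc="existing-pvc")
s1 = Step("step1", {"/mnt": existing})              # no implied dependency
s2 = Step("step2", {"/data": Volume(producer=s1)})  # runs after step1
print(sorted(d.name for d in s2.deps))  # ['step1']
```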

Origin: blog.csdn.net/qq_20466211/article/details/114285900