Taking gpushare-device-plugin as an example, exploring how a Resource yaml is configured

1. Taking gpushare-device-plugin as an example, exploring how a Resource yaml is configured

As I understand it, the most central resource in k8s is the Pod. To create a Pod, you first write a yaml file and then run kubectl apply -f example-pod.yaml to create it.
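
For instance, a minimal example-pod.yaml could look like this (my own sketch for illustration; the name and image are placeholders, not from the gpushare project):

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example
    image: nginx:1.8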

The internal flow k8s uses to handle a Pod is presumably quite complicated, and in just two weeks I certainly don't have the strength to master it. The yaml for creating a Pod, on the other hand, is relatively easy to configure. Learning to use it first and only then gradually working out the implementation details, going from the outside in, is a perfectly sound methodology.

Therefore, this article first analyzes type Pod struct, setting aside the obscure fields for now and focusing only on the core ones; it then analyzes type DaemonSet struct; finally, it uses gpushare-device-plugin's device-plugin-ds.yaml to validate what we have learned.

1.1. Pod structure analysis

See Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go

type Pod struct {
    /* Note: metav1.TypeMeta is a type, but where is the field name? That seems odd.
       After a long search I found this is called a promoted field, twin sibling of the anonymous (embedded) field.
       See: https://stackoverflow.com/questions/28014591/nameless-fields-in-go-structs

       metav1.TypeMeta lives in kubernetes/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/types.go:
       type TypeMeta struct {
           Kind string `json:"kind,omitempty" protobuf:"bytes,1,opt,name=kind"`
           APIVersion string `json:"apiVersion,omitempty" protobuf:"bytes,2,opt,name=apiVersion"`
       }

       Note: `json:",inline"` is what is known as a go struct field tag:
       See: https://medium.com/rungo/structures-in-go-76377cc106a2

       json is the metadata for json pack/unpack handling
       protobuf is the metadata for protobuf handling
       inline is special (https://github.com/isayme/blog/issues/15): it drops the current nesting level when unpacking json, embedding the fields directly

       Note: a Pod instance can use metav1.TypeMeta's members directly (pod.Kind, pod.APIVersion), but the yaml configuration reads:

       apiVersion: extensions/v1beta1
       kind: DaemonSet

       This behavior comes from inline together with the per-field tags such as json:"kind,omitempty"; keep it in mind

    */
    metav1.TypeMeta `json:",inline"`

    /*
        Note: for metav1.ObjectMeta, code accesses Pod.Name and Pod.Namespace, but the yaml file reads:

        metadata:
          name: gpushare-device-plugin-ds
          namespace: kube-system

        This is dictated by the json:"metadata" tag; note that metav1.ObjectMeta is not inline, so its yaml shape differs from metav1.TypeMeta's. Be sure to keep this in mind

    */
    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

    Spec PodSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`

    Status PodStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
}

1.1.1. type TypeMeta struct

See Code: kubernetes / vendor / k8s.io / apimachinery / pkg / apis / meta / v1 / types.go

It has only two fields:

  • apiVersion
  • kind
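
In yaml these map to exactly the two top-level keys we have already seen:

apiVersion: v1
kind: Pod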

1.1.2. type ObjectMeta struct

See Code: kubernetes / vendor / k8s.io / apimachinery / pkg / apis / meta / v1 / types.go

The commonly used fields are:

  • name
  • labels
  • annotations
  • namespace

A complete analysis follows:

type ObjectMeta struct {
    // The Pod's name (commonly used)
    /*
        metadata:
          name: gpushare-device-plugin-ds
    */
    Name string `json:"name,omitempty" protobuf:"bytes,1,opt,name=name"`

    // Labels (commonly used) https://www.jianshu.com/p/cd6b4b4caaab
    /*
        metadata:
          labels:
            app: nginx
            release: stable
    */
    Labels map[string]string `json:"labels,omitempty" protobuf:"bytes,11,rep,name=labels"`

    // annotations (commonly used). k8s internal components care about these (system-facing), whereas labels are what users care about (user-facing); annotations usually need not be set by hand, k8s sets them automatically to achieve certain effects
    Annotations map[string]string `json:"annotations,omitempty" protobuf:"bytes,12,rep,name=annotations"`


    // e.g. kube-system; defaults to "default" when omitted (commonly used)
    /*
        metadata:
          name: gpushare-device-plugin-ds
          namespace: kube-system
    */
    Namespace string `json:"namespace,omitempty" protobuf:"bytes,3,opt,name=namespace"`

    // Rarely used
    GenerateName string `json:"generateName,omitempty" protobuf:"bytes,2,opt,name=generateName"`

    // Rarely used
    SelfLink string `json:"selfLink,omitempty" protobuf:"bytes,4,opt,name=selfLink"`

    // Rarely used
    UID types.UID `json:"uid,omitempty" protobuf:"bytes,5,opt,name=uid,casttype=k8s.io/kubernetes/pkg/types.UID"`

    // Rarely used; filled in by the system https://k8smeetup.github.io/docs/reference/api-concepts/
    ResourceVersion string `json:"resourceVersion,omitempty" protobuf:"bytes,6,opt,name=resourceVersion"`

    // Rarely used
    Generation int64 `json:"generation,omitempty" protobuf:"varint,7,opt,name=generation"`

    // Rarely used; filled in by the system
    CreationTimestamp Time `json:"creationTimestamp,omitempty" protobuf:"bytes,8,opt,name=creationTimestamp"`

    // Rarely used; filled in by the system
    DeletionTimestamp *Time `json:"deletionTimestamp,omitempty" protobuf:"bytes,9,opt,name=deletionTimestamp"`

    // Graceful deletion; rarely used
    DeletionGracePeriodSeconds *int64 `json:"deletionGracePeriodSeconds,omitempty" protobuf:"varint,10,opt,name=deletionGracePeriodSeconds"`

    // Rarely used; garbage-collection related: https://kubernetes.io/zh/docs/concepts/workloads/controllers/garbage-collection/
    OwnerReferences []OwnerReference `json:"ownerReferences,omitempty" patchStrategy:"merge" patchMergeKey:"uid" protobuf:"bytes,13,rep,name=ownerReferences"`

    // Rarely used; to be deprecated in 1.15
    Initializers *Initializers `json:"initializers,omitempty" protobuf:"bytes,16,opt,name=initializers"`

    // Rarely used; garbage-collection related: https://draveness.me/kubernetes-garbage-collector
    Finalizers []string `json:"finalizers,omitempty" patchStrategy:"merge" protobuf:"bytes,14,rep,name=finalizers"`

    // Rarely used; with multiple clusters you may need to specify which cluster
    ClusterName string `json:"clusterName,omitempty" protobuf:"bytes,15,opt,name=clusterName"`

    // Rarely used; not familiar with it
    ManagedFields []ManagedFieldsEntry `json:"managedFields,omitempty" protobuf:"bytes,17,rep,name=managedFields"`
}
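
Pulling the common fields together, a typical metadata stanza looks like this (values taken from the gpushare yaml analyzed in section 1.3):

metadata:
  name: gpushare-device-plugin-ds
  namespace: kube-system
  labels:
    app: gpushare
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""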

1.1.3. type PodSpec struct

Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go

Let us single out the more important, core, and easily confused fields first:

  • volumes
  • InitContainers
  • Containers

The others are analyzed in detail below. The Kubernetes guide linked throughout (feisky.gitbooks.io) is excellent, very comprehensive, and helped me a lot.

// PodSpec is a description of a pod.
type PodSpec struct {
    // Commonly used; volumes, analyzed in detail below
    Volumes []Volume `json:"volumes,omitempty" patchStrategy:"merge,retainKeys" patchMergeKey:"name" protobuf:"bytes,1,rep,name=volumes"`

    // Commonly used; initContainers, analyzed in detail below
    InitContainers []Container `json:"initContainers,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,20,rep,name=initContainers"`

    // Commonly used; containers, analyzed in detail below
    Containers []Container `json:"containers" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,2,rep,name=containers"`

    // Commonly used; restart policy: Always, OnFailure or Never
    RestartPolicy RestartPolicy `json:"restartPolicy,omitempty" protobuf:"bytes,3,opt,name=restartPolicy,casttype=RestartPolicy"`

    // Rarely used; how long to wait for the pod to exit gracefully
    TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty" protobuf:"varint,4,opt,name=terminationGracePeriodSeconds"`

    // Rarely used; retry related. Perhaps the maximum time a failed Pod will be retried; beyond it, no further retries
    ActiveDeadlineSeconds *int64 `json:"activeDeadlineSeconds,omitempty" protobuf:"varint,5,opt,name=activeDeadlineSeconds"`

    // Rarely used; DNS policy: ClusterFirstWithHostNet, ClusterFirst, Default, None
    DNSPolicy DNSPolicy `json:"dnsPolicy,omitempty" protobuf:"bytes,6,opt,name=dnsPolicy,casttype=DNSPolicy"`

    // Commonly used; a field that lets users bind a Pod to particular Nodes
    /*
        spec:
          nodeSelector:
            disktype: ssd
    */
    NodeSelector map[string]string `json:"nodeSelector,omitempty" protobuf:"bytes,7,rep,name=nodeSelector"`

    /*
        Commonly used. A pod can also call the apiserver, but with what permissions? That is exactly what ServiceAccountName specifies.

        First create a ServiceAccount:
        apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: gpushare-device-plugin
          namespace: kube-system

        Then it can be used directly:
        spec:
          serviceAccount: gpushare-device-plugin

    */
    ServiceAccountName string `json:"serviceAccountName,omitempty" protobuf:"bytes,8,opt,name=serviceAccountName"`

    // Rarely used; deprecated
    DeprecatedServiceAccount string `json:"serviceAccount,omitempty" protobuf:"bytes,9,opt,name=serviceAccount"`

    // Rarely used; whether the serviceAccount token is mounted automatically?
    AutomountServiceAccountToken *bool `json:"automountServiceAccountToken,omitempty" protobuf:"varint,21,opt,name=automountServiceAccountToken"`

    // Common, but should rarely be used! Pins the pod to a specific node, bypassing the scheduler!
    NodeName string `json:"nodeName,omitempty" protobuf:"bytes,10,opt,name=nodeName"`

    // Commonly used; isolation. Whether to use the host's network (namespace); generally false
    HostNetwork bool `json:"hostNetwork,omitempty" protobuf:"varint,11,opt,name=hostNetwork"`

    // Commonly used; isolation. Whether the host's processes are visible (pid namespace); generally false
    HostPID bool `json:"hostPID,omitempty" protobuf:"varint,12,opt,name=hostPID"`

    // Commonly used; isolation. Whether the host's IPC is visible (ipc namespace); generally false
    HostIPC bool `json:"hostIPC,omitempty" protobuf:"varint,13,opt,name=hostIPC"`

    // Commonly used; whether the containers of this pod share a single PID namespace
    // When set to true, containers can see each other's processes
    ShareProcessNamespace *bool `json:"shareProcessNamespace,omitempty" protobuf:"varint,27,opt,name=shareProcessNamespace"`

    // Commonly used; see https://feisky.gitbooks.io/kubernetes/concepts/security-context.html
    // In short: enable selinux? restrict ports? In general, limits what untrusted containers may do
    // Can be ignored for now
    SecurityContext *PodSecurityContext `json:"securityContext,omitempty" protobuf:"bytes,14,opt,name=securityContext"`
    // Commonly used; pulling images needs credentials too, right? By default the ImagePullSecrets of the default serviceAccount are used
    ImagePullSecrets []LocalObjectReference `json:"imagePullSecrets,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,15,rep,name=imagePullSecrets"`

    // Commonly used; hostname
    // If specified, the fully qualified Pod hostname will be "<hostname>.<subdomain>.<pod namespace>.svc.<cluster domain>".
    Hostname string `json:"hostname,omitempty" protobuf:"bytes,16,opt,name=hostname"`

    // Same as above
    Subdomain string `json:"subdomain,omitempty" protobuf:"bytes,17,opt,name=subdomain"`

    // Commonly used; scheduling related, three levels: NodeAffinity, PodAffinity, PodAntiAffinity
    // Details: https://feisky.gitbooks.io/kubernetes/components/scheduler.html
    Affinity *Affinity `json:"affinity,omitempty" protobuf:"bytes,18,opt,name=affinity"`

    // Rarely used; scheduling related, names the scheduler to use
    SchedulerName string `json:"schedulerName,omitempty" protobuf:"bytes,19,opt,name=schedulerName"`

    // Commonly used; https://feisky.gitbooks.io/kubernetes/components/scheduler.html#taints-%E5%92%8C-tolerations
    // Lets the pod tolerate node taints, i.e. controls which tainted machines it may still be scheduled onto
    Tolerations []Toleration `json:"tolerations,omitempty" protobuf:"bytes,22,opt,name=tolerations"`

    // Commonly used; adds entries to /etc/hosts, very important
    /*
        spec:
          hostAliases:
          - ip: "10.1.2.3"
            hostnames:
            - "foo.remote"
            - "bar.remote"


        cat /etc/hosts
        # Kubernetes-managed hosts file.
        127.0.0.1 localhost
        ...
        10.244.135.10 hostaliases-pod
        10.1.2.3 foo.remote
        10.1.2.3 bar.remote

    */
    HostAliases []HostAlias `json:"hostAliases,omitempty" patchStrategy:"merge" patchMergeKey:"ip" protobuf:"bytes,23,rep,name=hostAliases"`

    // Less important; scheduling related, sets the scheduling priority
    // PriorityClass is a resource you have to create yourself, then reference here
    // https://feisky.gitbooks.io/kubernetes/concepts/pod.html#%E4%BC%98%E5%85%88%E7%BA%A7
    /*
        apiVersion: scheduling.k8s.io/v1alpha1
        kind: PriorityClass
        metadata:
          name: high-priority
        value: 1000000
        globalDefault: false
        description: "This priority class should be used for XYZ service pods only."
    */
    PriorityClassName string `json:"priorityClassName,omitempty" protobuf:"bytes,24,opt,name=priorityClassName"`

    // Rarely used; scheduling. Ignored once the Priority Admission Controller is enabled (the value from PriorityClassName is read instead); otherwise it takes effect. Larger means higher priority
    Priority *int32 `json:"priority,omitempty" protobuf:"bytes,25,opt,name=priority"`

    // Commonly used; hard-codes resolv.conf. I think DNSConfig and DNSPolicy are similar: the former is written out by the user, the latter is filled in by the system according to the chosen policy
    DNSConfig *PodDNSConfig `json:"dnsConfig,omitempty" protobuf:"bytes,26,opt,name=dnsConfig"`

    // Rarely used; extra checks after the pod starts; only when they pass is the pod considered ready
    // Introduced in v1.11: https://godleon.github.io/blog/Kubernetes/k8s-Pod-Overview/
    ReadinessGates []PodReadinessGate `json:"readinessGates,omitempty" protobuf:"bytes,28,opt,name=readinessGates"`

    // Rarely used; supports multiple container runtimes, introduced in v1.12, e.g. mixing Kata Containers/gVisor + runc containers
    RuntimeClassName *string `json:"runtimeClassName,omitempty" protobuf:"bytes,29,opt,name=runtimeClassName"`

    // Rarely used; unclear
    EnableServiceLinks *bool `json:"enableServiceLinks,omitempty" protobuf:"varint,30,opt,name=enableServiceLinks"`

    // Rarely used; something new to do with resource preemption???
    PreemptionPolicy *PreemptionPolicy `json:"preemptionPolicy,omitempty" protobuf:"bytes,31,opt,name=preemptionPolicy"`
}
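
Combining the common fields above, the spec stanza of a pod might read as follows (a sketch assembled from the gpushare daemonset analyzed in section 1.3; restartPolicy is written out here for illustration although Always is the default):

spec:
  restartPolicy: Always
  serviceAccount: gpushare-device-plugin
  hostNetwork: true
  nodeSelector:
    gpushare: "true"
  containers:
  - name: gpushare
    image: registry.cn-hangzhou.aliyuncs.com/acs/k8s-gpushare-plugin:v2-1.12-lihao-test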

1.1.3.1. type Volume struct

See Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go

As I understand it, a Volume here declares a volume: it takes a volume name plus a volume source.

Most commonly we map a host path into the container as a volume, or map a host device into the container.

Beyond that, a volume's source can be ceph, cinder, nfs, iscsi and so on, which is what gives it persistence. That part is very complicated, so we only analyze it briefly.
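
For instance, an nfs-backed volume might be declared like this (a sketch; the server address and export path are hypothetical):

spec:
  volumes:
  - name: nfs-vol
    nfs:
      server: 10.0.0.1        # hypothetical NFS server
      path: /exports/data     # hypothetical export path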

Core parameters:

  • name
  • volumeSource

type Volume struct {
    // The volume's name
    Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
    // Where does the Volume's content come from?
    VolumeSource `json:",inline" protobuf:"bytes,2,opt,name=volumeSource"`
}

1.1.3.2. type VolumeSource struct

See Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go

As I understand it, VolumeSource describes which storage back-ends k8s supports; the most common case is mounting a host directory into the container as a volume.

type VolumeSource struct {
    // Maps a host directory into the container. Note: this only declares the resource; the actual mapping happens in the container's spec
    /*
    type HostPathVolumeSource struct {
        // Host path; can be a device, dir, socket, etc.
        Path string `json:"path" protobuf:"bytes,1,opt,name=path"`
        // Supports many types: File, Socket, Device... The safest choice is the default "", which is generally compatible with everything
        Type *HostPathType `json:"type,omitempty" protobuf:"bytes,2,opt,name=type"`
    }
    */

    /*
    For example:
    spec:
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins

    */
    HostPath *HostPathVolumeSource `json:"hostPath,omitempty" protobuf:"bytes,1,opt,name=hostPath"`

    // Creates a temporary directory on the host for the container's use; useful when several containers need to exchange data, or when the container itself is limited to 10GB and the logs would not fit
    /*
    Example:
        spec:
          volumes:
          - name: nginx-vol
            emptyDir: {}
    */
    EmptyDir *EmptyDirVolumeSource `json:"emptyDir,omitempty" protobuf:"bytes,2,opt,name=emptyDir"`

    // The really interesting one is ceph, but that is beyond my reach for now
    NFS *NFSVolumeSource `json:"nfs,omitempty" protobuf:"bytes,7,opt,name=nfs"`
    RBD *RBDVolumeSource `json:"rbd,omitempty" protobuf:"bytes,11,opt,name=rbd"`
    PersistentVolumeClaim *PersistentVolumeClaimVolumeSource `json:"persistentVolumeClaim,omitempty" protobuf:"bytes,10,opt,name=persistentVolumeClaim"`

    // Many other volume sources
    ...
}

1.1.4. type Container struct

Both initContainers and containers are of type []Container, so we only need to analyze Container. Container is the core of the core!

To summarize, the commonly used fields are:

  • name
  • image
  • command
  • args
  • env
  • ports
  • resources
  • volumeMounts
  • lifecycle
    • postStart
    • preStop
  • volumeDevices (devicePath)
  • imagePullPolicy
  • stdin
  • tty

Detailed as follows:

type Container struct {
    // Commonly used; a container needs a name, right?
    Name string `json:"name" protobuf:"bytes,1,opt,name=name"`

    // Commonly used; a container needs an image, right?
    /* Example
      spec:
        containers:
        - name: nginx
          image: nginx:1.8
    */
    Image string `json:"image,omitempty" protobuf:"bytes,2,opt,name=image"`

    // Commonly used; the command line that starts the container, overriding the image's own docker entrypoint
    /*
    spec:
        containers:
        - command:
            - gpushare-device-plugin-v2
            - -logtostderr
            - --v=5
            - --memory-unit=GiB
    */
    Command []string `json:"command,omitempty" protobuf:"bytes,3,rep,name=command"`
    // Commonly used; command-line arguments
    /*
    spec:
      containers:
      - name: command-demo-container
        image: debian
        command: ["printenv"]
        args: ["HOSTNAME", "KUBERNETES_PORT"]
    */
    Args []string `json:"args,omitempty" protobuf:"bytes,4,rep,name=args"`

    // Commonly used; working directory
    WorkingDir string `json:"workingDir,omitempty" protobuf:"bytes,5,opt,name=workingDir"`

    // Commonly used; the ports the pod exposes
    /*
    spec:
      containers:
      - ports:
        - containerPort: 80
    */
    Ports []ContainerPort `json:"ports,omitempty" patchStrategy:"merge" patchMergeKey:"containerPort" protobuf:"bytes,6,rep,name=ports"`

    // Rarely used; presumably for env vars stored in a configmap resource and referenced from here
    EnvFrom []EnvFromSource `json:"envFrom,omitempty" protobuf:"bytes,19,rep,name=envFrom"`

    // Environment variables passed into the container; more direct than EnvFrom
    Env []EnvVar `json:"env,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,7,rep,name=env"`

    // Commonly used; the resources required
    /*
    spec:
      containers:
      - resources:
          limits:
            memory: "300Mi"
            cpu: "1"
          requests:
            memory: "300Mi"
            cpu: "1"
    */
    Resources ResourceRequirements `json:"resources,omitempty" protobuf:"bytes,8,opt,name=resources"`

    // Commonly used; volume mappings. type VolumeMount struct deserves a closer look
    /*
    spec:
      containers:
      - volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
    */
    VolumeMounts []VolumeMount `json:"volumeMounts,omitempty" patchStrategy:"merge" patchMergeKey:"mountPath" protobuf:"bytes,9,rep,name=volumeMounts"`

    // Commonly used but needs further study; appears to be for block devices, and a PVC must be created first
    /*
        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: my-pvc
        spec:
          accessModes:
            - ReadWriteMany
          volumeMode: Block
          storageClassName: my-sc
          resources:
            requests:
              storage: 1Gi
    */

    /*
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod
    spec:
      containers:
        - name: my-container
          image: busybox
          command:
            - sleep
            - "3600"
          volumeDevices:
            - devicePath: /dev/block
              name: my-volume
          imagePullPolicy: IfNotPresent
      volumes:
        - name: my-volume
          persistentVolumeClaim:
            claimName: my-pvc
    */
    VolumeDevices []VolumeDevice `json:"volumeDevices,omitempty" patchStrategy:"merge" patchMergeKey:"devicePath" protobuf:"bytes,21,rep,name=volumeDevices"`

    // Important, but skipped for now; liveness (health) checks
    // https://feisky.gitbooks.io/kubernetes/introduction/201.html#%E5%81%A5%E5%BA%B7%E6%A3%80%E6%9F%A5
    LivenessProbe *Probe `json:"livenessProbe,omitempty" protobuf:"bytes,10,opt,name=livenessProbe"`

    // Important, but skipped for now; readiness checks
    // https://feisky.gitbooks.io/kubernetes/introduction/201.html#%E5%81%A5%E5%BA%B7%E6%A3%80%E6%9F%A5
    ReadinessProbe *Probe `json:"readinessProbe,omitempty" protobuf:"bytes,11,opt,name=readinessProbe"`

    // Hooks that run custom actions at postStart and preStop
    /*
    spec:
      containers:
      - name: lifecycle-demo-container
        image: nginx
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
          preStop:
            exec:
              command: ["/usr/sbin/nginx","-s","quit"]
    */
    Lifecycle *Lifecycle `json:"lifecycle,omitempty" protobuf:"bytes,12,opt,name=lifecycle"`

    // Details: https://k8smeetup.github.io/docs/tasks/debug-application-cluster/determine-reason-pod-failure/
    // Seems it can be left alone
    TerminationMessagePath string `json:"terminationMessagePath,omitempty" protobuf:"bytes,13,opt,name=terminationMessagePath"`

    // Same as above; take note
    TerminationMessagePolicy TerminationMessagePolicy `json:"terminationMessagePolicy,omitempty" protobuf:"bytes,20,opt,name=terminationMessagePolicy,casttype=TerminationMessagePolicy"`

    // Commonly used; one of Always, Never, IfNotPresent
    ImagePullPolicy PullPolicy `json:"imagePullPolicy,omitempty" protobuf:"bytes,14,opt,name=imagePullPolicy,casttype=PullPolicy"`

    // Same idea as the Pod's SecurityContext
    SecurityContext *SecurityContext `json:"securityContext,omitempty" protobuf:"bytes,15,opt,name=securityContext"`

    // Whether to enable stdin
    Stdin bool `json:"stdin,omitempty" protobuf:"varint,16,opt,name=stdin"`

    // Rarely used; presumably only one stdin connection is allowed at a time
    StdinOnce bool `json:"stdinOnce,omitempty" protobuf:"varint,17,opt,name=stdinOnce"`

    // Whether to allocate a tty
    TTY bool `json:"tty,omitempty" protobuf:"varint,18,opt,name=tty"`
}
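
Assembling the snippets above, a complete container entry could look like the following sketch (the values, including the MESSAGE variable, are purely illustrative):

spec:
  containers:
  - name: nginx
    image: nginx:1.8
    ports:
    - containerPort: 80
    env:
    - name: MESSAGE        # hypothetical env var, for illustration only
      value: "hello"
    resources:
      limits:
        memory: "300Mi"
        cpu: "1"
      requests:
        memory: "300Mi"
        cpu: "1"
    imagePullPolicy: IfNotPresent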

1.1.5. type PodStatus struct

PodStatus is filled in entirely by the system and has nothing to do with creation; we just need to be able to read it.

Broadly it contains:

  • phase (the Pod's state)
  • the states of all the containers under the Pod, together with reasons, plus a few odd-looking fields

type PodStatus struct {
    // Filled in by the system; only five states: Pending, Running, Succeeded, Failed, Unknown
    Phase PodPhase `json:"phase,omitempty" protobuf:"bytes,1,opt,name=phase,casttype=PodPhase"`

    // Filled in by the system; the Pod's conditions, covering the readiness of its containers, namely
    /*
        type PodCondition struct {
            // ContainersReady, Initialized, Ready, PodScheduled
            Type PodConditionType `json:"type" protobuf:"bytes,1,opt,name=type,casttype=PodConditionType"`
            // True, False, Unknown. Not sure what this means
            Status ConditionStatus `json:"status" protobuf:"bytes,2,opt,name=status,casttype=ConditionStatus"`
            // Time the condition was last probed?
            LastProbeTime metav1.Time `json:"lastProbeTime,omitempty" protobuf:"bytes,3,opt,name=lastProbeTime"`
            // Time the condition last changed?
            LastTransitionTime metav1.Time `json:"lastTransitionTime,omitempty" protobuf:"bytes,4,opt,name=lastTransitionTime"`
            // Why the condition changed
            Reason string `json:"reason,omitempty" protobuf:"bytes,5,opt,name=reason"`
            // Why the condition changed, phrased for users
            Message string `json:"message,omitempty" protobuf:"bytes,6,opt,name=message"`
        }
    */
    Conditions []PodCondition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type" protobuf:"bytes,2,rep,name=conditions"`

    // Human-readable reason why the pod is in this state
    Message string `json:"message,omitempty" protobuf:"bytes,3,opt,name=message"`

    // Engineer-readable reason why the pod is in this state
    Reason string `json:"reason,omitempty" protobuf:"bytes,4,opt,name=reason"`

    // Related to preemption; I don't understand it https://www.jianshu.com/p/bdcb9528a8b1
    NominatedNodeName string `json:"nominatedNodeName,omitempty" protobuf:"bytes,11,opt,name=nominatedNodeName"`

    // The IP of the node the pod was scheduled onto
    HostIP string `json:"hostIP,omitempty" protobuf:"bytes,5,opt,name=hostIP"`

    // The Pod's IP address?
    PodIP string `json:"podIP,omitempty" protobuf:"bytes,6,opt,name=podIP"`

    // Start time? To be worked out later
    StartTime *metav1.Time `json:"startTime,omitempty" protobuf:"bytes,7,opt,name=startTime"`

    // Status of the initContainers; they start first, and only once they succeed can the containers start
    InitContainerStatuses []ContainerStatus `json:"initContainerStatuses,omitempty" protobuf:"bytes,10,rep,name=initContainerStatuses"`

    // Container statuses
    ContainerStatuses []ContainerStatus `json:"containerStatuses,omitempty" protobuf:"bytes,8,rep,name=containerStatuses"`

    // QoS related; one of Guaranteed, Burstable, BestEffort
    QOSClass PodQOSClass `json:"qosClass,omitempty" protobuf:"bytes,9,rep,name=qosClass"`
}
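
As a reading aid, the status stanza of a running pod (as shown by kubectl get pod -o yaml) looks roughly like this; the addresses and timestamp are illustrative:

status:
  phase: Running
  conditions:
  - type: Ready
    status: "True"
  hostIP: 10.0.0.2         # node IP, illustrative
  podIP: 10.244.1.5        # pod IP, illustrative
  qosClass: Guaranteed
  startTime: "2019-10-17T08:00:00Z"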

1.2. DaemonSet structure

Code: kubernetes / vendor / k8s.io / api / apps / v1 / types.go

While analyzing it, be sure to contrast it with Pod.

type DaemonSet struct {
    // Same as Pod's first field
    metav1.TypeMeta `json:",inline"`
    // Same as Pod's second field
    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

    // Needs careful analysis; analogous to PodSpec
    Spec DaemonSetSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`

    // Needs careful analysis; analogous to PodStatus
    Status DaemonSetStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
}
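
The skeleton of a DaemonSet yaml therefore mirrors a Pod's, with the pod definition nested under spec.template (a sketch; the full example follows in section 1.3):

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: gpushare-device-plugin-ds
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        name: gpushare-device-plugin-ds
    spec:
      # ... a PodSpec, as analyzed in section 1.1.3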

1.2.1. type DaemonSetSpec struct

Code: kubernetes / vendor / k8s.io / api / apps / v1 / types.go

type DaemonSetSpec struct {
    // A label selector over the pods managed by this DaemonSet; it must match the labels of the pod template below (which nodes the DaemonSet lands on is controlled by the template's nodeSelector)
    Selector *metav1.LabelSelector `json:"selector" protobuf:"bytes,1,opt,name=selector"`

    // Needs careful analysis
    Template v1.PodTemplateSpec `json:"template" protobuf:"bytes,2,opt,name=template"`

    // Update strategy; the default is RollingUpdate
    /*
    type DaemonSetUpdateStrategy struct {
        // Two upgrade strategies: RollingUpdate and OnDelete
        Type DaemonSetUpdateStrategyType `json:"type,omitempty" protobuf:"bytes,1,opt,name=type"`

        // Only takes effect for RollingUpdate. A value of 5 means upgrading 5 nodes at a time; 5% means 5% at a time; the default is 1
        RollingUpdate *RollingUpdateDaemonSet `json:"rollingUpdate,omitempty" protobuf:"bytes,2,opt,name=rollingUpdate"`
    }

    */
    UpdateStrategy DaemonSetUpdateStrategy `json:"updateStrategy,omitempty" protobuf:"bytes,3,opt,name=updateStrategy"`

    // How many seconds a newly ready pod must stay up before the DaemonSet considers it available
    MinReadySeconds int32 `json:"minReadySeconds,omitempty" protobuf:"varint,4,opt,name=minReadySeconds"`

    // How many historical revisions (checkpoints) to keep; the default is 10
    RevisionHistoryLimit *int32 `json:"revisionHistoryLimit,omitempty" protobuf:"varint,6,opt,name=revisionHistoryLimit"`
}
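
For example, an explicit update strategy could be declared like this (a sketch with illustrative values; maxUnavailable is the field inside type RollingUpdateDaemonSet):

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  minReadySeconds: 30
  revisionHistoryLimit: 10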

1.2.2. type PodTemplateSpec struct

Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go

type PodTemplateSpec struct {
    // Same as the second field in Pod/DaemonSet; not sure why it is repeated here?
    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

    // Same as the PodSpec in Pod
    Spec PodSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`
}
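
This nested metadata is in fact where the template's labels live, and they are what the DaemonSet's selector has to match; in the gpushare yaml of section 1.3:

spec:
  template:
    metadata:
      labels:
        component: gpushare-device-plugin
        app: gpushare
        name: gpushare-device-plugin-ds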

1.2.3. type DaemonSetStatus struct

Code: kubernetes / vendor / k8s.io / api / apps / v1 / types.go

DaemonSetStatus is much simpler than PodStatus; after all, by definition a DaemonSet is just one Pod deployed on every qualifying host.

type DaemonSetStatus struct {
    // Number of nodes currently running this daemonset's pod
    CurrentNumberScheduled int32 `json:"currentNumberScheduled" protobuf:"varint,1,opt,name=currentNumberScheduled"`

    // Number of nodes running this daemonset's pod that should not be
    NumberMisscheduled int32 `json:"numberMisscheduled" protobuf:"varint,2,opt,name=numberMisscheduled"`

    // Number of nodes that should be running this daemonset's pod
    DesiredNumberScheduled int32 `json:"desiredNumberScheduled" protobuf:"varint,3,opt,name=desiredNumberScheduled"`

    // Number of nodes where this daemonset's pod is running and ready
    NumberReady int32 `json:"numberReady" protobuf:"varint,4,opt,name=numberReady"`

    // The most recent generation observed by the daemon set controller.
    // +optional
    ObservedGeneration int64 `json:"observedGeneration,omitempty" protobuf:"varint,5,opt,name=observedGeneration"`

    // Number of nodes running the latest version of the daemonset's pod
    UpdatedNumberScheduled int32 `json:"updatedNumberScheduled,omitempty" protobuf:"varint,6,opt,name=updatedNumberScheduled"`

    // So tedious... number of nodes where the pod is available
    NumberAvailable int32 `json:"numberAvailable,omitempty" protobuf:"varint,7,opt,name=numberAvailable"`

    NumberUnavailable int32 `json:"numberUnavailable,omitempty" protobuf:"varint,8,opt,name=numberUnavailable"`

    CollisionCount *int32 `json:"collisionCount,omitempty" protobuf:"varint,9,opt,name=collisionCount"`

    // Same as PodCondition
    Conditions []DaemonSetCondition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type" protobuf:"bytes,10,rep,name=conditions"`
}

Take a look at the actual state of a DaemonSet; it does match the DaemonSetStatus definition quite closely:

[root@k8s-master kubernetes]# kubectl describe daemonset gpushare-device-plugin-ds -n kube-system
Name:           gpushare-device-plugin-ds
Selector:       app=gpushare,component=gpushare-device-plugin,name=gpushare-device-plugin-ds
Node-Selector:  gpushare=true
Labels:         app=gpushare
                component=gpushare-device-plugin
                name=gpushare-device-plugin-ds
Annotations:    deprecated.daemonset.template.generation: 2
                kubectl.kubernetes.io/last-applied-configuration:
                  {"apiVersion":"extensions/v1beta1","kind":"DaemonSet","metadata":{"annotations":{},"name":"gpushare-device-plugin-ds","namespace":"kube-sy...
Desired Number of Nodes Scheduled: 1
Current Number of Nodes Scheduled: 1
Number of Nodes Scheduled with Up-to-date Pods: 1
Number of Nodes Scheduled with Available Pods: 1
Number of Nodes Misscheduled: 0
Pods Status:  1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app=gpushare
                    component=gpushare-device-plugin
                    name=gpushare-device-plugin-ds
  Annotations:      scheduler.alpha.kubernetes.io/critical-pod:
  Service Account:  gpushare-device-plugin
  Containers:
   gpushare:
    Image:      registry.cn-hangzhou.aliyuncs.com/acs/k8s-gpushare-plugin:v2-1.12-lihao-test
    Port:       <none>
    Host Port:  <none>
    Command:
      gpushare-device-plugin-v2
      -logtostderr
      --v=5
      --memory-unit=GiB
    Limits:
      cpu:     1
      memory:  300Mi
    Requests:
      cpu:     1
      memory:  300Mi
    Environment:
      KUBECONFIG:  /etc/kubernetes/kubelet.conf
      NODE_NAME:    (v1:spec.nodeName)
    Mounts:
      /var/lib/kubelet/device-plugins from device-plugin (rw)
  Volumes:
   device-plugin:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/device-plugins
    HostPathType:
Events:            <none>

1.3. Analyzing gpushare-device-plugin's daemonset configuration

According to its documentation, gpushare-device-plugin's daemonset configuration is device-plugin-ds.yaml:

[root@k8s-master kubernetes]# cat device-plugin-ds.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: gpushare-device-plugin-ds
  namespace: kube-system
spec:
  template:
    metadata:
      annotations:
        # Scheduling related
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        # Scheduling related
        component: gpushare-device-plugin
        app: gpushare
        name: gpushare-device-plugin-ds
    # This is the PodSpec
    spec:
      # This serviceAccount is created in gpushare-device-plugin's device-plugin-rbac.yaml
      serviceAccount: gpushare-device-plugin
      # Use the host's network
      hostNetwork: true
      # Personally I think a selector under the outermost spec would be better,
      # but nodeSelector works too: the device plugin only starts on nodes labeled gpushare: true
      nodeSelector:
        gpushare: "true"
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/acs/k8s-gpushare-plugin:v2-1.12-lihao-test
        name: gpushare
        # Putting -logtostderr into args might be nicer
        command:
          - gpushare-device-plugin-v2
          - -logtostderr
          - --v=5
          - --memory-unit=GiB
        # Doesn't need many resources
        resources:
          limits:
            memory: "300Mi"
            cpu: "1"
          requests:
            memory: "300Mi"
            cpu: "1"
        # A handful of environment variables
        env:
        - name: KUBECONFIG
          value: /etc/kubernetes/kubelet.conf
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        # Permission restrictions
        securityContext:
          # No privilege escalation needed
          allowPrivilegeEscalation: false
          capabilities:
            # No extra capabilities needed
            drop: ["ALL"]
        # Map the grpc unix socket (device plugin) into the container
        volumeMounts:
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
      # Use the host directory holding the grpc Unix socket as the volume
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
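
To close the loop, the daemonset can be applied and inspected with the usual commands (output will vary by cluster):

[root@k8s-master kubernetes]# kubectl apply -f device-plugin-ds.yaml
[root@k8s-master kubernetes]# kubectl get daemonset gpushare-device-plugin-ds -n kube-system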
