1. Taking gpushare-device-plugin as an example: reading resource YAML configuration
As I understand it, the most central resource in k8s is the Pod. To create a Pod, you write a YAML file and then run kubectl apply -f example-pod.yaml
to create it.
The internal handling of Pods inside k8s is surely quite involved, and two weeks is certainly not enough for me to cover it; but the YAML for creating a Pod is relatively easy to configure. Learn to use it first, then gradually dig into the implementation details. Working from the outside in is a sound methodology.
Therefore this post first analyzes type Pod struct,
ignoring the obscure fields for now and focusing only on the core ones; then analyzes type DaemonSet struct;
and finally validates what we have learned against gpushare-device-plugin's
device-plugin-ds.yaml.
1.1. Pod structure analysis
See Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go
type Pod struct {
    /* Note: metav1.TypeMeta is a type, but where is the member variable name? That seems odd.
    It took me a long time to find out that this is called a promoted field; its twin sibling is the anonymous field.
    See: https://stackoverflow.com/questions/28014591/nameless-fields-in-go-structs
    metav1.TypeMeta is in kubernetes/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/types.go:
    type TypeMeta struct {
        Kind string `json:"kind,omitempty" protobuf:"bytes,1,opt,name=kind"`
        APIVersion string `json:"apiVersion,omitempty" protobuf:"bytes,2,opt,name=apiVersion"`
    }
    Note: `json:",inline"` is what is known as a Go struct field tag:
    see: https://medium.com/rungo/structures-in-go-76377cc106a2
    json is the metadata for json pack/unpack handling
    protobuf is the metadata for protobuf handling
    inline is special (https://github.com/isayme/blog/issues/15): when reading from json,
    the current nesting level is removed and the fields are embedded directly.
    Note: a Pod instance can use metav1.TypeMeta's members directly (pod.Kind, pod.APIVersion),
    yet the yaml configuration is:
        apiVersion: extensions/v1beta1
        kind: DaemonSet
    This follows from inline and from protobuf:"bytes,1,opt,name=kind"; take note.
    */
    metav1.TypeMeta `json:",inline"`
    /*
    Note: for metav1.ObjectMeta, code accesses Pod.Name and Pod.Namespace, but the yaml file reads:
        metadata:
          name: gpushare-device-plugin-ds
          namespace: kube-system
    This is because of the tag: metav1.ObjectMeta carries the json name "metadata" and is not inline,
    so its yaml differs from metav1.TypeMeta's; be sure to note this.
    */
    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`
    Spec PodSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`
    Status PodStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
}
1.1.1. type TypeMeta struct
See Code: kubernetes / vendor / k8s.io / apimachinery / pkg / apis / meta / v1 / types.go
Only two parameters:
- apiVersion
- kind
1.1.2. type ObjectMeta struct
See Code: kubernetes / vendor / k8s.io / apimachinery / pkg / apis / meta / v1 / types.go
Common fields as follows:
- name
- labels
- annotations
- namespace
A full annotated analysis follows:
type ObjectMeta struct {
    // the Pod's name (commonly used)
    /*
    metadata:
      name: gpushare-device-plugin-ds
    */
    Name string `json:"name,omitempty" protobuf:"bytes,1,opt,name=name"`
    // labels (commonly used) https://www.jianshu.com/p/cd6b4b4caaab
    /*
    metadata:
      labels:
        app: nginx
        release: stable
    */
    Labels map[string]string `json:"labels,omitempty" protobuf:"bytes,11,rep,name=labels"`
    // annotations (commonly used): internal k8s components care about these (system-facing), whereas labels
    // are what users care about (user-facing); annotations usually need not be set by hand, k8s sets them
    // automatically to achieve some effect
    Annotations map[string]string `json:"annotations,omitempty" protobuf:"bytes,12,rep,name=annotations"`
    // e.g. kube-system; defaults to "default" if unset (commonly used)
    /*
    metadata:
      name: gpushare-device-plugin-ds
      namespace: kube-system
    */
    Namespace string `json:"namespace,omitempty" protobuf:"bytes,3,opt,name=namespace"`
    // rarely used
    GenerateName string `json:"generateName,omitempty" protobuf:"bytes,2,opt,name=generateName"`
    // rarely used
    SelfLink string `json:"selfLink,omitempty" protobuf:"bytes,4,opt,name=selfLink"`
    // rarely used
    UID types.UID `json:"uid,omitempty" protobuf:"bytes,5,opt,name=uid,casttype=k8s.io/kubernetes/pkg/types.UID"`
    // rarely used; filled in by the system https://k8smeetup.github.io/docs/reference/api-concepts/
    ResourceVersion string `json:"resourceVersion,omitempty" protobuf:"bytes,6,opt,name=resourceVersion"`
    // rarely used
    Generation int64 `json:"generation,omitempty" protobuf:"varint,7,opt,name=generation"`
    // rarely used; filled in by the system
    CreationTimestamp Time `json:"creationTimestamp,omitempty" protobuf:"bytes,8,opt,name=creationTimestamp"`
    // rarely used; filled in by the system
    DeletionTimestamp *Time `json:"deletionTimestamp,omitempty" protobuf:"bytes,9,opt,name=deletionTimestamp"`
    // graceful deletion; rarely used
    DeletionGracePeriodSeconds *int64 `json:"deletionGracePeriodSeconds,omitempty" protobuf:"varint,10,opt,name=deletionGracePeriodSeconds"`
    // rarely used; garbage-collection related: https://kubernetes.io/zh/docs/concepts/workloads/controllers/garbage-collection/
    OwnerReferences []OwnerReference `json:"ownerReferences,omitempty" patchStrategy:"merge" patchMergeKey:"uid" protobuf:"bytes,13,rep,name=ownerReferences"`
    // rarely used; to be deprecated in 1.15
    Initializers *Initializers `json:"initializers,omitempty" protobuf:"bytes,16,opt,name=initializers"`
    // rarely used; garbage-collection related: https://draveness.me/kubernetes-garbage-collector
    Finalizers []string `json:"finalizers,omitempty" patchStrategy:"merge" protobuf:"bytes,14,rep,name=finalizers"`
    // rarely used; with multiple clusters you may need to specify which cluster
    ClusterName string `json:"clusterName,omitempty" protobuf:"bytes,15,opt,name=clusterName"`
    // rarely used; I don't know what it does
    ManagedFields []ManagedFieldsEntry `json:"managedFields,omitempty" protobuf:"bytes,17,rep,name=managedFields"`
}
1.1.3. type PodSpec struct
Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go
We pull out a few of the more important, core, and easily confused fields:
- volumes
- InitContainers
- Containers
The remaining fields are analyzed in detail below; the Kubernetes guide linked throughout is very comprehensive and helped me a lot.
// PodSpec is a description of a pod.
type PodSpec struct {
    // commonly used; volumes, analyzed in detail below
    Volumes []Volume `json:"volumes,omitempty" patchStrategy:"merge,retainKeys" patchMergeKey:"name" protobuf:"bytes,1,rep,name=volumes"`
    // commonly used; initContainers, analyzed in detail below
    InitContainers []Container `json:"initContainers,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,20,rep,name=initContainers"`
    // commonly used; containers, analyzed in detail below
    Containers []Container `json:"containers" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,2,rep,name=containers"`
    // commonly used; restart policy: Always, OnFailure, Never
    RestartPolicy RestartPolicy `json:"restartPolicy,omitempty" protobuf:"bytes,3,opt,name=restartPolicy,casttype=RestartPolicy"`
    // rarely used; how long to wait for the pod to exit
    TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty" protobuf:"varint,4,opt,name=terminationGracePeriodSeconds"`
    // rarely used; retry-related, perhaps the maximum time a failed Pod will keep being retried; beyond it, no further retries
    ActiveDeadlineSeconds *int64 `json:"activeDeadlineSeconds,omitempty" protobuf:"varint,5,opt,name=activeDeadlineSeconds"`
    // rarely used; DNS configuration: ClusterFirstWithHostNet, ClusterFirst, Default, None
    DNSPolicy DNSPolicy `json:"dnsPolicy,omitempty" protobuf:"bytes,6,opt,name=dnsPolicy,casttype=DNSPolicy"`
    // commonly used; a field that lets the user bind a Pod to particular Nodes
    /*
    spec:
      nodeSelector:
        disktype: ssd
    */
    NodeSelector map[string]string `json:"nodeSelector,omitempty" protobuf:"bytes,7,rep,name=nodeSelector"`
    /*
    commonly used; a pod can also access the apiserver, but with what permissions? That is exactly
    what ServiceAccountName is for.
    First create a ServiceAccount:
        apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: gpushare-device-plugin
          namespace: kube-system
    Then it can be used directly:
        spec:
          serviceAccount: gpushare-device-plugin
    */
    ServiceAccountName string `json:"serviceAccountName,omitempty" protobuf:"bytes,8,opt,name=serviceAccountName"`
    // rarely used; deprecated
    DeprecatedServiceAccount string `json:"serviceAccount,omitempty" protobuf:"bytes,9,opt,name=serviceAccount"`
    // rarely used; whether the serviceAccount token is mounted automatically?
    AutomountServiceAccountToken *bool `json:"automountServiceAccountToken,omitempty" protobuf:"varint,21,opt,name=automountServiceAccountToken"`
    // commonly used, but probably should not be! Pins the pod to a specific node, bypassing the scheduler!
    NodeName string `json:"nodeName,omitempty" protobuf:"bytes,10,opt,name=nodeName"`
    // commonly used; isolation: whether to use the host's network (namespace); usually false
    HostNetwork bool `json:"hostNetwork,omitempty" protobuf:"varint,11,opt,name=hostNetwork"`
    // commonly used; isolation: whether the host's processes are visible (pid namespace); usually false
    HostPID bool `json:"hostPID,omitempty" protobuf:"varint,12,opt,name=hostPID"`
    // commonly used; isolation: whether the host's IPC is visible (ipc namespace); usually false
    HostIPC bool `json:"hostIPC,omitempty" protobuf:"varint,13,opt,name=hostIPC"`
    // commonly used; whether the containers in a pod share a single PID namespace
    // when set to true, the containers can see each other's processes
    ShareProcessNamespace *bool `json:"shareProcessNamespace,omitempty" protobuf:"varint,27,opt,name=shareProcessNamespace"`
    // commonly used; see https://feisky.gitbooks.io/kubernetes/concepts/security-context.html
    // in short: enable selinux? restrict ports; in general, restrict what untrusted containers can do
    // no need to pay attention to it for now
    SecurityContext *PodSecurityContext `json:"securityContext,omitempty" protobuf:"bytes,14,opt,name=securityContext"`
    // commonly used; pulling images needs credentials too, right? Defaults to the default serviceAccount's ImagePullSecrets
    ImagePullSecrets []LocalObjectReference `json:"imagePullSecrets,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,15,rep,name=imagePullSecrets"`
    // commonly used; hostname
    // If specified, the fully qualified Pod hostname will be "<hostname>.<subdomain>.<pod namespace>.svc.<cluster domain>".
    Hostname string `json:"hostname,omitempty" protobuf:"bytes,16,opt,name=hostname"`
    // same as above
    Subdomain string `json:"subdomain,omitempty" protobuf:"bytes,17,opt,name=subdomain"`
    // commonly used; scheduling-related, three kinds: NodeAffinity, PodAffinity, PodAntiAffinity
    // see: https://feisky.gitbooks.io/kubernetes/components/scheduler.html
    Affinity *Affinity `json:"affinity,omitempty" protobuf:"bytes,18,opt,name=affinity"`
    // rarely used; scheduling-related: the name of the scheduler to use
    SchedulerName string `json:"schedulerName,omitempty" protobuf:"bytes,19,opt,name=schedulerName"`
    // commonly used; https://feisky.gitbooks.io/kubernetes/components/scheduler.html#taints-%E5%92%8C-tolerations
    // together with taints, controls which machines a pod will not land on
    Tolerations []Toleration `json:"tolerations,omitempty" protobuf:"bytes,22,opt,name=tolerations"`
    // commonly used; sets entries in /etc/hosts, very important
    /*
    spec:
      hostAliases:
      - ip: "10.1.2.3"
        hostnames:
        - "foo.remote"
        - "bar.remote"
    cat /etc/hosts
    # Kubernetes-managed hosts file.
    127.0.0.1 localhost
    ...
    10.244.135.10 hostaliases-pod
    10.1.2.3 foo.remote
    10.1.2.3 bar.remote
    */
    HostAliases []HostAlias `json:"hostAliases,omitempty" patchStrategy:"merge" patchMergeKey:"ip" protobuf:"bytes,23,rep,name=hostAliases"`
    // not important; scheduling-related: the pod's scheduling priority
    // PriorityClass is a resource you have to create yourself and then reference here
    // https://feisky.gitbooks.io/kubernetes/concepts/pod.html#%E4%BC%98%E5%85%88%E7%BA%A7
    /*
    apiVersion: scheduling.k8s.io/v1alpha1
    kind: PriorityClass
    metadata:
      name: high-priority
    value: 1000000
    globalDefault: false
    description: "This priority class should be used for XYZ service pods only."
    */
    PriorityClassName string `json:"priorityClassName,omitempty" protobuf:"bytes,24,opt,name=priorityClassName"`
    // rarely used; scheduling. When the Priority Admission Controller is enabled this field is overwritten
    // (with the value from the PriorityClassName); otherwise it takes effect. Larger means higher priority.
    Priority *int32 `json:"priority,omitempty" protobuf:"bytes,25,opt,name=priority"`
    // commonly used; hard-codes resolv.conf. I think DNSConfig and DNSPolicy are similar: the former is
    // written out verbatim by the user, the latter is filled in by the system according to the chosen policy
    DNSConfig *PodDNSConfig `json:"dnsConfig,omitempty" protobuf:"bytes,26,opt,name=dnsConfig"`
    // rarely used; extra checks after the pod starts; only when they pass is the pod considered ready
    // introduced in v1.11: https://godleon.github.io/blog/Kubernetes/k8s-Pod-Overview/
    ReadinessGates []PodReadinessGate `json:"readinessGates,omitempty" protobuf:"bytes,28,opt,name=readinessGates"`
    // rarely used; multi-CRI support, introduced in v1.12, e.g. a Pod mixing Kata Containers/gVisor and runc containers
    RuntimeClassName *string `json:"runtimeClassName,omitempty" protobuf:"bytes,29,opt,name=runtimeClassName"`
    // rarely used; unclear to me
    EnableServiceLinks *bool `json:"enableServiceLinks,omitempty" protobuf:"varint,30,opt,name=enableServiceLinks"`
    // rarely used; something new for resource preemption???
    PreemptionPolicy *PreemptionPolicy `json:"preemptionPolicy,omitempty" protobuf:"bytes,31,opt,name=preemptionPolicy"`
}
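Putting the commonly used PodSpec fields together, a minimal manifest might look like the following. This is a hypothetical sketch: the names (example-pod, example-sa, the disktype label) are made up for illustration, and the ServiceAccount is assumed to already exist.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod          # made-up name
  namespace: default
spec:
  serviceAccountName: example-sa   # assumes this ServiceAccount was created beforehand
  hostNetwork: false
  nodeSelector:
    disktype: ssd
  restartPolicy: Always
  containers:
  - name: app
    image: nginx:1.8
```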
1.1.3.1. type Volume struct
See Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go
As I understand it, a Volume here is simply a declared volume: a volume name plus a volume source.
Most commonly we map a host path into the container as a volume, or map a host device into the container.
Beyond that, volume sources also include ceph, cinder, nfs, iscsi, and so on, which give volumes persistence. That part is quite involved; we only analyze it briefly.
Core parameters:
- name
- volumeSource
type Volume struct {
    // the volume's name
    Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
    // where does the Volume's content come from?
    VolumeSource `json:",inline" protobuf:"bytes,2,opt,name=volumeSource"`
}
1.1.3.2. type VolumeSource struct
See Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go
As I understand it, VolumeSource enumerates the storage backends k8s supports; the most common is mounting a host disk into the container as a volume.
type VolumeSource struct {
    // maps a host directory into the container; note that this only declares the source,
    // the actual mapping is done in the container's spec
    /*
    type HostPathVolumeSource struct {
        // host path; can be a device, dir, socket, etc.
        Path string `json:"path" protobuf:"bytes,1,opt,name=path"`
        // many types supported: File, Socket, Device, ...; the safest choice is the default "",
        // which is generally compatible with everything
        Type *HostPathType `json:"type,omitempty" protobuf:"bytes,2,opt,name=type"`
    }
    */
    /*
    For example:
    spec:
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins
    */
    HostPath *HostPathVolumeSource `json:"hostPath,omitempty" protobuf:"bytes,1,opt,name=hostPath"`
    // creates a temporary directory on the host for the container's use; useful when multiple containers
    // need to exchange data, or when the container's own 10GB size limit would be exceeded (e.g. by logs)
    /*
    Example:
    spec:
      volumes:
      - name: nginx-vol
        emptyDir: {}
    */
    EmptyDir *EmptyDirVolumeSource `json:"emptyDir,omitempty" protobuf:"bytes,2,opt,name=emptyDir"`
    // the interesting one is ceph, but I am not up to that yet
    NFS *NFSVolumeSource `json:"nfs,omitempty" protobuf:"bytes,7,opt,name=nfs"`
    RBD *RBDVolumeSource `json:"rbd,omitempty" protobuf:"bytes,11,opt,name=rbd"`
    PersistentVolumeClaim *PersistentVolumeClaimVolumeSource `json:"persistentVolumeClaim,omitempty" protobuf:"bytes,10,opt,name=persistentVolumeClaim"`
    // many other volume sources
    ...
}
1.1.4. type Container struct
Both initContainers and containers have type []Container, so it is enough to analyze Container. Container is the core of the core!
The commonly used parameters, summarized:
- name
- image
- command
- args
- env
- ports
- resources
- volumeMounts
- volumeDevices (devicePath)
- lifecycle
- livenessProbe / readinessProbe
- imagePullPolicy
- stdin
- tty
Detailed as follows:
type Container struct {
    // commonly used; a container needs a name, right?
    Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
    // commonly used; a container needs an image, right?
    /* Example:
    spec:
      containers:
      - name: nginx
        image: nginx:1.8
    */
    Image string `json:"image,omitempty" protobuf:"bytes,2,opt,name=image"`
    // commonly used; the command line that starts the container; overrides the image's own entrypoint
    /*
    spec:
      containers:
      - command:
        - gpushare-device-plugin-v2
        - -logtostderr
        - --v=5
        - --memory-unit=GiB
    */
    Command []string `json:"command,omitempty" protobuf:"bytes,3,rep,name=command"`
    // commonly used; command-line arguments
    /*
    spec:
      containers:
      - name: command-demo-container
        image: debian
        command: ["printenv"]
        args: ["HOSTNAME", "KUBERNETES_PORT"]
    */
    Args []string `json:"args,omitempty" protobuf:"bytes,4,rep,name=args"`
    // commonly used; working directory
    WorkingDir string `json:"workingDir,omitempty" protobuf:"bytes,5,opt,name=workingDir"`
    // commonly used; the ports the pod exposes
    /*
    spec:
      containers:
      - ports:
        - containerPort: 80
    */
    Ports []ContainerPort `json:"ports,omitempty" patchStrategy:"merge" patchMergeKey:"containerPort" protobuf:"bytes,6,rep,name=ports"`
    // rarely used; presumably env vars stored in a configmap resource and referenced from here
    EnvFrom []EnvFromSource `json:"envFrom,omitempty" protobuf:"bytes,19,rep,name=envFrom"`
    // environment variables passed into the container; more direct than EnvFrom
    Env []EnvVar `json:"env,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,7,rep,name=env"`
    // commonly used; the resources required
    /*
    spec:
      containers:
      - resources:
          limits:
            memory: "300Mi"
            cpu: "1"
          requests:
            memory: "300Mi"
            cpu: "1"
    */
    Resources ResourceRequirements `json:"resources,omitempty" protobuf:"bytes,8,opt,name=resources"`
    // commonly used; volume mappings; type VolumeMount struct deserves closer study
    /*
    spec:
      containers:
      - volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
    */
    VolumeMounts []VolumeMount `json:"volumeMounts,omitempty" patchStrategy:"merge" patchMergeKey:"mountPath" protobuf:"bytes,9,rep,name=volumeMounts"`
    // commonly used, but needs study; appears to be for block devices; a PVC must be created first
    /*
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-pvc
    spec:
      accessModes:
      - ReadWriteMany
      volumeMode: Block
      storageClassName: my-sc
      resources:
        requests:
          storage: 1Gi
    */
    /*
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod
    spec:
      containers:
      - name: my-container
        image: busybox
        command:
        - sleep
        - "3600"
        volumeDevices:
        - devicePath: /dev/block
          name: my-volume
        imagePullPolicy: IfNotPresent
      volumes:
      - name: my-volume
        persistentVolumeClaim:
          claimName: my-pvc
    */
    VolumeDevices []VolumeDevice `json:"volumeDevices,omitempty" patchStrategy:"merge" patchMergeKey:"devicePath" protobuf:"bytes,21,rep,name=volumeDevices"`
    // important, but skipping for now: health checks
    // https://feisky.gitbooks.io/kubernetes/introduction/201.html#%E5%81%A5%E5%BA%B7%E6%A3%80%E6%9F%A5
    LivenessProbe *Probe `json:"livenessProbe,omitempty" protobuf:"bytes,10,opt,name=livenessProbe"`
    // important, but skipping for now: readiness checks
    // https://feisky.gitbooks.io/kubernetes/introduction/201.html#%E5%81%A5%E5%BA%B7%E6%A3%80%E6%9F%A5
    ReadinessProbe *Probe `json:"readinessProbe,omitempty" protobuf:"bytes,11,opt,name=readinessProbe"`
    // hooks that run at postStart and preStop
    /*
    spec:
      containers:
      - name: lifecycle-demo-container
        image: nginx
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
          preStop:
            exec:
              command: ["/usr/sbin/nginx","-s","quit"]
    */
    Lifecycle *Lifecycle `json:"lifecycle,omitempty" protobuf:"bytes,12,opt,name=lifecycle"`
    // see: https://k8smeetup.github.io/docs/tasks/debug-application-cluster/determine-reason-pod-failure/
    // seems it can be left alone
    TerminationMessagePath string `json:"terminationMessagePath,omitempty" protobuf:"bytes,13,opt,name=terminationMessagePath"`
    // same as above; take note
    TerminationMessagePolicy TerminationMessagePolicy `json:"terminationMessagePolicy,omitempty" protobuf:"bytes,20,opt,name=terminationMessagePolicy,casttype=TerminationMessagePolicy"`
    // commonly used; one of Always, Never, IfNotPresent
    ImagePullPolicy PullPolicy `json:"imagePullPolicy,omitempty" protobuf:"bytes,14,opt,name=imagePullPolicy,casttype=PullPolicy"`
    // same as SecurityContext in the Pod spec
    SecurityContext *SecurityContext `json:"securityContext,omitempty" protobuf:"bytes,15,opt,name=securityContext"`
    // whether to enable stdin
    Stdin bool `json:"stdin,omitempty" protobuf:"varint,16,opt,name=stdin"`
    // rarely used; presumably only one stdin connection is allowed at a time
    StdinOnce bool `json:"stdinOnce,omitempty" protobuf:"varint,17,opt,name=stdinOnce"`
    // whether to allocate a tty
    TTY bool `json:"tty,omitempty" protobuf:"varint,18,opt,name=tty"`
}
1.1.5. type PodStatus struct
PodStatus is filled in entirely by the system; it has nothing to do with creation, so a general understanding is enough.
Broadly it contains:
- phase (the Pod's state)
- the states of all the containers under the Pod; and, for various reasons, a few odd-looking fields.
type PodStatus struct {
    // filled in by the system; exactly five states: Pending, Running, Succeeded, Failed, Unknown
    Phase PodPhase `json:"phase,omitempty" protobuf:"bytes,1,opt,name=phase,casttype=PodPhase"`
    // filled in by the system; the pod's conditions, of type:
    /*
    type PodCondition struct {
        // ContainersReady, Initialized, Ready, PodScheduled
        Type PodConditionType `json:"type" protobuf:"bytes,1,opt,name=type,casttype=PodConditionType"`
        // True, False, Unknown. Not sure what this means
        Status ConditionStatus `json:"status" protobuf:"bytes,2,opt,name=status,casttype=ConditionStatus"`
        // the time the condition was last probed?
        LastProbeTime metav1.Time `json:"lastProbeTime,omitempty" protobuf:"bytes,3,opt,name=lastProbeTime"`
        // the time the condition last changed?
        LastTransitionTime metav1.Time `json:"lastTransitionTime,omitempty" protobuf:"bytes,4,opt,name=lastTransitionTime"`
        // why the condition changed
        Reason string `json:"reason,omitempty" protobuf:"bytes,5,opt,name=reason"`
        // why the condition changed, phrased for users
        Message string `json:"message,omitempty" protobuf:"bytes,6,opt,name=message"`
    }
    */
    Conditions []PodCondition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type" protobuf:"bytes,2,rep,name=conditions"`
    // why the pod is in this state, phrased for ordinary humans
    Message string `json:"message,omitempty" protobuf:"bytes,3,opt,name=message"`
    // why the pod is in this state, phrased for engineers
    Reason string `json:"reason,omitempty" protobuf:"bytes,4,opt,name=reason"`
    // preemption-related; I don't understand it https://www.jianshu.com/p/bdcb9528a8b1
    NominatedNodeName string `json:"nominatedNodeName,omitempty" protobuf:"bytes,11,opt,name=nominatedNodeName"`
    // the IP of the node this pod is scheduled to
    HostIP string `json:"hostIP,omitempty" protobuf:"bytes,5,opt,name=hostIP"`
    // the Pod's IP address?
    PodIP string `json:"podIP,omitempty" protobuf:"bytes,6,opt,name=podIP"`
    // start time? to be revisited
    StartTime *metav1.Time `json:"startTime,omitempty" protobuf:"bytes,7,opt,name=startTime"`
    // the initContainers' statuses; they start first, and only after they succeed do the containers start
    InitContainerStatuses []ContainerStatus `json:"initContainerStatuses,omitempty" protobuf:"bytes,10,rep,name=initContainerStatuses"`
    // container statuses
    ContainerStatuses []ContainerStatus `json:"containerStatuses,omitempty" protobuf:"bytes,8,rep,name=containerStatuses"`
    // QoS-related; one of Guaranteed, Burstable, BestEffort
    QOSClass PodQOSClass `json:"qosClass,omitempty" protobuf:"bytes,9,rep,name=qosClass"`
}
1.2. DaemonSet structure
Code: kubernetes / vendor / k8s.io / api / apps / v1 / types.go
As we go through it, pay close attention to the contrast with Pod.
type DaemonSet struct {
    // same as Pod's first field
    metav1.TypeMeta `json:",inline"`
    // same as Pod's second field
    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`
    // needs careful analysis; analogous to PodSpec
    Spec DaemonSetSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`
    // needs careful analysis; analogous to PodStatus
    Status DaemonSetStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
}
1.2.1. type DaemonSetSpec struct
Code: kubernetes / vendor / k8s.io / api / apps / v1 / types.go
type DaemonSetSpec struct {
    // a label query over the pods managed by this DaemonSet; it must match the pod template's labels
    // (note: it selects pods, not nodes)
    Selector *metav1.LabelSelector `json:"selector" protobuf:"bytes,1,opt,name=selector"`
    // needs careful analysis
    Template v1.PodTemplateSpec `json:"template" protobuf:"bytes,2,opt,name=template"`
    // update strategy; the default is RollingUpdate
    /*
    type DaemonSetUpdateStrategy struct {
        // two update strategies: RollingUpdate and OnDelete
        Type DaemonSetUpdateStrategyType `json:"type,omitempty" protobuf:"bytes,1,opt,name=type"`
        // only takes effect for RollingUpdate; a number 5 means upgrade 5 at a time,
        // 5% means upgrade 5% at a time; the default is 1
        RollingUpdate *RollingUpdateDaemonSet `json:"rollingUpdate,omitempty" protobuf:"bytes,2,opt,name=rollingUpdate"`
    }
    */
    UpdateStrategy DaemonSetUpdateStrategy `json:"updateStrategy,omitempty" protobuf:"bytes,3,opt,name=updateStrategy"`
    // how many seconds after becoming ready before the DaemonSet pod is considered available
    MinReadySeconds int32 `json:"minReadySeconds,omitempty" protobuf:"varint,4,opt,name=minReadySeconds"`
    // how many historical revisions to keep; the default is 10
    RevisionHistoryLimit *int32 `json:"revisionHistoryLimit,omitempty" protobuf:"varint,6,opt,name=revisionHistoryLimit"`
}
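The update strategy described above would appear in a manifest roughly like this (a hypothetical fragment; maxUnavailable accepts either a count or a percentage):

```yaml
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # or e.g. "5%"
  minReadySeconds: 30
  revisionHistoryLimit: 10
```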
1.2.2. type PodTemplateSpec struct
Code: kubernetes / vendor / k8s.io / api / core / v1 / types.go
type PodTemplateSpec struct {
    // same as the second field of Pod/DaemonSet; not sure why it appears again here?
    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`
    // same as PodSpec in Pod
    Spec PodSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`
}
1.2.3. type DaemonSetStatus struct
Code: kubernetes / vendor / k8s.io / api / apps / v1 / types.go
DaemonSetStatus is much simpler than PodStatus; after all, by definition it just deploys one Pod on each qualifying host.
type DaemonSetStatus struct {
    // the number of nodes currently running this daemonset
    CurrentNumberScheduled int32 `json:"currentNumberScheduled" protobuf:"varint,1,opt,name=currentNumberScheduled"`
    // the number of nodes running this daemonset that should not be
    NumberMisscheduled int32 `json:"numberMisscheduled" protobuf:"varint,2,opt,name=numberMisscheduled"`
    // the number of nodes that should be running this daemonset
    DesiredNumberScheduled int32 `json:"desiredNumberScheduled" protobuf:"varint,3,opt,name=desiredNumberScheduled"`
    // the number of nodes on which this daemonset's pod is ready
    NumberReady int32 `json:"numberReady" protobuf:"varint,4,opt,name=numberReady"`
    // The most recent generation observed by the daemon set controller.
    // +optional
    ObservedGeneration int64 `json:"observedGeneration,omitempty" protobuf:"varint,5,opt,name=observedGeneration"`
    // the number of nodes running the latest version of this daemonset
    UpdatedNumberScheduled int32 `json:"updatedNumberScheduled,omitempty" protobuf:"varint,6,opt,name=updatedNumberScheduled"`
    // ugh, tedious; the names speak for themselves
    NumberAvailable int32 `json:"numberAvailable,omitempty" protobuf:"varint,7,opt,name=numberAvailable"`
    NumberUnavailable int32 `json:"numberUnavailable,omitempty" protobuf:"varint,8,opt,name=numberUnavailable"`
    CollisionCount *int32 `json:"collisionCount,omitempty" protobuf:"varint,9,opt,name=collisionCount"`
    // same idea as PodCondition
    Conditions []DaemonSetCondition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type" protobuf:"bytes,10,rep,name=conditions"`
}
Take a look at the actual state of a DaemonSet; it really is quite close to the definition of DaemonSetStatus:
[root@k8s-master kubernetes]# kubectl describe daemonset gpushare-device-plugin-ds -n kube-system
Name: gpushare-device-plugin-ds
Selector: app=gpushare,component=gpushare-device-plugin,name=gpushare-device-plugin-ds
Node-Selector: gpushare=true
Labels: app=gpushare
component=gpushare-device-plugin
name=gpushare-device-plugin-ds
Annotations: deprecated.daemonset.template.generation: 2
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"extensions/v1beta1","kind":"DaemonSet","metadata":{"annotations":{},"name":"gpushare-device-plugin-ds","namespace":"kube-sy...
Desired Number of Nodes Scheduled: 1
Current Number of Nodes Scheduled: 1
Number of Nodes Scheduled with Up-to-date Pods: 1
Number of Nodes Scheduled with Available Pods: 1
Number of Nodes Misscheduled: 0
Pods Status: 1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=gpushare
component=gpushare-device-plugin
name=gpushare-device-plugin-ds
Annotations: scheduler.alpha.kubernetes.io/critical-pod:
Service Account: gpushare-device-plugin
Containers:
gpushare:
Image: registry.cn-hangzhou.aliyuncs.com/acs/k8s-gpushare-plugin:v2-1.12-lihao-test
Port: <none>
Host Port: <none>
Command:
gpushare-device-plugin-v2
-logtostderr
--v=5
--memory-unit=GiB
Limits:
cpu: 1
memory: 300Mi
Requests:
cpu: 1
memory: 300Mi
Environment:
KUBECONFIG: /etc/kubernetes/kubelet.conf
NODE_NAME: (v1:spec.nodeName)
Mounts:
/var/lib/kubelet/device-plugins from device-plugin (rw)
Volumes:
device-plugin:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/device-plugins
HostPathType:
Events: <none>
1.3. Analyzing the gpushare-device-plugin DaemonSet configuration
According to its documentation, gpushare-device-plugin
is configured as a DaemonSet via device-plugin-ds.yaml:
[root@k8s-master kubernetes]# cat device-plugin-ds.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: gpushare-device-plugin-ds
  namespace: kube-system
spec:
  template:
    metadata:
      annotations:
        # scheduling-related
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        # scheduling-related
        component: gpushare-device-plugin
        app: gpushare
        name: gpushare-device-plugin-ds
    # this is the PodSpec
    spec:
      # this serviceAccount is created in gpushare-device-plugin's device-plugin-rbac.yaml
      serviceAccount: gpushare-device-plugin
      # use the host network
      hostNetwork: true
      # personally I think a selector under the outermost spec would be best,
      # but nodeSelector works too: only nodes labeled gpushare: true run this device plugin
      nodeSelector:
        gpushare: "true"
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/acs/k8s-gpushare-plugin:v2-1.12-lihao-test
        name: gpushare
        # arguably -logtostderr would be better placed in args
        command:
        - gpushare-device-plugin-v2
        - -logtostderr
        - --v=5
        - --memory-unit=GiB
        # it does not need many resources
        resources:
          limits:
            memory: "300Mi"
            cpu: "1"
          requests:
            memory: "300Mi"
            cpu: "1"
        # a handful of environment variables
        env:
        - name: KUBECONFIG
          value: /etc/kubernetes/kubelet.conf
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        # security restrictions
        securityContext:
          # no privilege escalation needed
          allowPrivilegeEscalation: false
          capabilities:
            # no extra capabilities needed
            drop: ["ALL"]
        # map the grpc unix socket (device plugin) directory into the container
        volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
      # use the local directory holding the grpc unix socket as the volume
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins