K8S原理剖析:存储原理剖析和实践

大纲

  • K8S存储整体框架及原理解析
  • K8S FlexVolume存储扩展机制
  • K8S CSI存储扩展机制

K8S存储整体框架及原理解析

Docker插件机制-架构&评价

优点:

  • 1)不重新编译docker的情况下可以提供针对鉴权、网络、卷等功能的扩展;
  • 2)基于Http JSON-PRC格式的接口与插件进行交互;
  • 3)支持容器化和非容器化多种部署形式;
  • 4)支持插件生命周期管理: 1.52+  docker plugin命令及其API; <1.52  docker volume/network;
  • 5)支持基于TLS的安全加固。

约束:

  • 某些插件(认证)的添加需要重启docker daemon

Docker Volume Plugin列表

名称 描述 地址
Azure File Storage
plugin
Lets you mount Microsoft Azure File Storage shares to Docker containers as volumes using the SMB 3.0 protocol. Learn
more.
https://github.com/Azure/azurefile
dockervolumedriver
BeeGFS Volume
Plugin
An open source volume plugin to create persistent volumes in a BeeGFS parallel file system. https://github.com/RedCoolBeans/
docker-volume-beegfs
Blockbridge plugin A volume plugin that provides access to an extensible set of container-based persistent storage options. It supports single
and multi-host Docker environments with features that include tenant isolation, automated provisioning, encryption,
secure deletion, snapshots and QoS.
https://github.com/blockbridge/blo
ckbridge-docker-volume
Contiv Volume
Plugin
An open source volume plugin that provides multi-tenant, persistent, distributed storage with intent based consumption. It
has support for Ceph and NFS.
https://github.com/rancher/convoy
DigitalOcean Block
Storage plugin
Integrates DigitalOcean’s block storage solution into the Docker ecosystem by automatically attaching a given block storage
volume to a DigitalOcean droplet and making the contents of the volume available to Docker containers running on that
droplet.
https://github.com/omallo/docker
volume-plugin-dostorage
DRBD plugin A volume plugin that provides highly available storage replicated by DRBD. Data written to the docker volume is replicated
in a cluster of DRBD nodes.
https://www.drbd.org/en/supporte
d-projects/docker
Flocker plugin A volume plugin that provides multi-host portable volumes for Docker, enabling you to run databases and other stateful
containers and move them around across a cluster of machines.
https://clusterhq.com/docker
plugin/
Fuxi Volume Plugin A volume plugin that is developed as part of the OpenStack Kuryr project and implements the Docker volume plugin API by
utilizing Cinder, the OpenStack block storage service.
https://github.com/openstack/fuxi
gce-docker plugin A volume plugin able to attach, format and mount Google Compute persistent-disks. https://github.com/mcuadros/gce
docker
GlusterFS plugin A volume plugin that provides multi-host volumes management for Docker using GlusterFS. https://github.com/calavera/docker
-volume-glusterfs
名称 描述 地址
Horcrux Volume
Plugin
A volume plugin that allows on-demand, version controlled access to your data. Horcrux is an open-source plugin,
written in Go, and supports SCP, Minio and Amazon S3.
https://github.com/muthu-r/horcrux
HPE 3Par Volume
Plugin
A volume plugin that supports HPE 3Par and StoreVirtual iSCSI storage arrays. https://github.com/hpe storage/python-hpedockerplugin/ -
Infinit volume
plugin
A volume plugin that makes it easy to mount and manage Infinit volumes using Docker. https://infinit.sh/documentation/dock er/volume-plugin
IPFS Volume
Plugin
An open source volume plugin that allows using an ipfs filesystem as a volume. http://github.com/vdemeester/docker -volume-ipfs
Keywhiz plugin A plugin that provides credentials and secret management using Keywhiz as a central repository. https://github.com/calavera/docker
volume-keywhiz
Local Persist
Plugin
A volume plugin that extends the default local driver’s functionality by allowing you specify a mountpoint anywhere
on the host, which enables the files to always persist, even if the volume is removed via docker volume rm.
https://github.com/CWSpear/local
persist
NetApp
Plugin(nDVP)
A volume plugin that provides direct integration with the Docker ecosystem for the NetApp storage portfolio. The
nDVP package supports the provisioning and management of storage resources from the storage platform to Docker
hosts, with a robust framework for adding additional platforms in the future.
https://github.com/NetApp/netappdv
p
Netshare plugin A volume plugin that provides volume management for NFS 3/4, AWS EFS and CIFS file systems. https://github.com/ContainX/docker
volume-netshare
Nimble Storage
Volume Plugin
A volume plug-in that integrates with Nimble Storage Unified Flash Fabric arrays. The plug-in abstracts array volume
capabilities to the Docker administrator to allow self-provisioning of secure multi-tenant volumes and clones.
https://connect.nimblestorage.com/co
mmunity/app-integration/docker
名称 描述 地址
OpenStorage Plugin A cluster-aware volume plugin that provides volume management for file and block storage solutions. It implements a
vendor neutral specification for implementing extensions such as CoS, encryption, and snapshots. It has example drivers
based on FUSE, NFS, NBD and EBS to name a few.
https://github.com/libopenstorage/
openstorage
Portworx Volume Plugin A volume plugin that turns any server into a scale-out converged compute/storage node, providing container granular
storage and highly available volumes across any node, using a shared-nothing storage backend that works with any docker
scheduler.
https://github.com/portworx/px
dev
Quobyte Volume Plugin A volume plugin that connects Docker to Quobyte’s data center file system, a general-purpose scalable and fault-tolerant
storage platform.
https://github.com/quobyte/docker
-volume
REX-Ray plugin A volume plugin which is written in Go and provides advanced storage functionality for many platforms including VirtualBox,
EC2, Google Compute Engine, OpenStack, and EMC.
https://github.com/emccode/rexray
Virtuozzo Storage and
Ploop plugin
A volume plugin with support for Virtuozzo Storage distributed cloud file system as well as ploop devices. https://github.com/virtuozzo/docke r-volume-ploop
VMware vSphere Storage
Plugin
Docker Volume Driver for vSphere enables customers to address persistent storage requirements for Docker containers in
vSphere environments.
https://github.com/vmware/docker
volume-vsphere

支持Kubernetes的平台和存储服务

K8S 存储能力-Volume概述

  • K8S中的普通Volume提供了在容器中挂卷的能力,它不是独立的K8S资源对象,不能通过k8s去管理(创建、删除等),只能在创建Pod时去引用。
  • Pod需要设置卷来源( spec.volume ) 和挂载点( spec.containers.volumeMounts ) 两个信息后才可以使用相应的Volume

K8S 存储能力: In-Tree Volume Plugins

K8S的VolumePlugin提供了插件化扩展存储的机制,分为内置插件(In-Tree Plugins)和外置插件(Out-of-Tree) 两种

名称 描述
awsElasticBlockStore mounts an Amazon Web Services (AWS) EBS Volume (Elastic Block Store)
azureDisk is used to mount a Microsoft Azure Data Disk into a Pod.
azureFile is used to mount a Microsoft Azure File Volume (SMB 2.1 and 3.0) into a Pod.
cephfs allows an existing CephFS volume to be mounted into your pod.
cinder is used to mount OpenStack Block Storage into a pod.
configMap The data stored in a ConfigMap object can be referenced in a volume of type configMap and then consumed by containerized applications running in a Pod.
downwardAPI is used to make downward API data available to applications. It mounts a directory and writes the requested data in plain text files
emptyDir is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node. When a Pod is removed from a node for any reason, the
data in the emptyDir is deleted forever.
fc (fibre channel) allows an existing fibre channel volume to be mounted in a pod
flocker allows a Flocker dataset to be mounted into a pod.
gcePersistentDisk mounts a Google Compute Engine (GCE) Persistent Disk into your pod.
gitRepo mounts an empty directory and clones a git repository into it for your pod to use.
glusterfs allows a Glusterfs (an open source networked filesystem) volume to be mounted into your pod
hostPath mounts a file or directory from the host node’s filesystem into your pod.
iscsi allows an existing iSCSI (SCSI over IP) volume to be mounted into your pod
local represents a mounted local storage device such as a disk, partition or directory. can only be used as a statically created PersistentVolume.
名称 描述
nfs allows an existing NFS (Network File System) share to be mounted into your pod
persistentVolumeClaim is used to mount a PersistentVolume into a pod.
projected maps several existing volume sources into the same directory.
portworxVolume can be dynamically created through Kubernetes or it can also be pre-provisioned and referenced inside a Kubernetes pod.
quobyte allows an existing Quobyte volume to be mounted into your pod.
rbd allows a Rados Block Device volume to be mounted into your pod.
scaleIO ScaleIO is a software-based storage platform that can use existing hardware to create clusters of scalable shared block networked storage. The ScaleIO
volume plugin allows deployed pods to access existing ScaleIO volumes
secret is used to pass sensitive information, such as passwords, to pods
storageos allows an existing StorageOS volume to be mounted into your pod. StorageOS provides block storage to containers, accessible via a file system.
vsphereVolume used to mount a vSphere VMDK Volume into your Pod.

K8S 存储能力-PersistentVolume

Kubernetes通过Persistent Volume子系统API对管理员和用户提供了存储资源创建和使用的抽象

  • FlexVolume: 此Volume Driver允许不同厂商去开发他们自己的驱动来挂载卷到计算节点
  • PersistentVolumeClaim: K8提供的资源抽象的Volume Driver,让用户不用关心具体的Volume的实现细节

K8S FlexVolume存储扩展机制

K8S 1.5引入1.8GA的Out-Of-Tree Volume Plugin:

  • driver以二进制命令行形式实现FlexVolume API,以供Controller-Manager和Kubelet调用,对外接口实现容易;
  • DaemonSet方式部署确保Master和Node上都会将driver安装到插件目录;
  • Docker镜像+yaml配置的交付形式

Flex Volume Driver部署脚本和配置

部署脚本假设所需要的驱动二进制文件,且已经被至于部署镜像的/$DRIVER目录下:启动脚本将二进制driver重名名为.driver,再拷贝到<plugindir>/<vendor~driver>/.driver目录下,接着使用mv将其重命名为driver(确保驱动安装的原子性),最后进入死循环确保容器活着。

Flex Volume CLI API

步骤 命令 描述
Init <driver executable> init 初始化驱动。在Kubelet和Controller-Manager初始化时被调用。若调用成功则需要返回一个展示对应驱动
所支持的FlexVolume能力的map,现在只包含一个必填字段attach,用于表明本驱动是否需要attach和
detach操作。为向后兼容该字段一般默认值设为true。
Attach <driver executable> attach <json options> <node
name>
将给定规格的卷添加到给定的主机上。若调用成功则返回存储设备添加到该主机的路径。 Kubelet和
Controller-Manager都需要调用该方法。
Detach <driver executable> detach <mount device> <node
name>
卸载给定主机上的指定卷。 Kubelet和Controller-Manager都需要调用该方法。
Wait for attach <driver executable> waitforattach <mount device>
<json options>
等待卷被添加到远程节点。若调用成功则将返回设备路径。 Kubelet和Controller-Manager都需要调用该方
法。
Volume is
Attached
<driver executable> isattached <json options> <node
name>
检查卷是否已被添加到节点上。 Kubelet和Controller-Manager都需要调用该方法。
Mount device <driver executable> mountdevice <mount dir> <mount
device> <json options>
将存储设备挂载到一个将被pod使用的全局路径上。 Kubelet需要调用该方法。
Unmount device <driver executable> unmountdevice <mount device> 将存储设备卸载。 This is called once all bind mounts have been unmounted. Kubelet需要调用该方法。
Mount <driver executable> mount <mount dir> <json options> 将卷挂载到指定目录。 Kubelet需要调用该方法。
Unmount <driver executable> unmount <mount dir> 将卷进行卸载。 Kubelet需要调用该方法。

K8S CSI存储扩展机制

术语 含义
CO 容器编排系统(Container Orchestrator),使用CSI gRPC服务来与插件通信
RPC 远程方法调用(Remote Procedure Call)
Plugin 插件实现,实现CSI服务的gRPC访问端点
SP 存储提供商(Storage Provider),负责提供CSI插件实现
Volume 卷, CO管理的容器可使用的存储单元
Block Volume 块设备卷
Mounted Volume 使用指定文件系统挂载到容器的卷,并显示为容器内的一个目录
Workload 工作负载,是CO任务调度的基本单元,可以是一个或一组容器
Node 用户运行工作负载的主机,从插件的角度通过节点 ID来进行唯一标识
In-Tree 内置的,存在于K8S核心代码仓库内的代码
Out-Of-Tree 外置的,存在于K8S核心代码仓库外的代码
CSI Volume Plugin 一个新的内置卷插件,作为一个适配器来使得外置的第三方CSI卷驱动可以被K8S所使用
CSI Volume Driver 一个外置的CSI兼容的卷插件驱动,可通过K8S卷插件被K8S所使用

CSI通用架构

CO通过gRPC与插件交互,每个SP必须实现以下两个plugin:
• Node Plugin: 需要运行在使用Volume的Node上,主要负责Volume Mount/Unmount等操作
• Controller Plugin:可以运行在任何节点上,主要负责Volume Creation/Deletion、 Attach/Detach等操作

CO与Plugin的交互: 01.卷的创建和Attach

• 卷的Mount系列操作是由Workload的启动所触发的
• K8S中volume的全局挂载路径(存储挂载点)格式样例:
/var/lib/kubelet/plugins/kubernetes.io/$volume_plugin/mounts/$volume_name
• K8S中volume的workload挂载路径(软链)格式样例:
/var/lib/kubelet/pods/$pod_id/volumes/$volume_plugin/$volume_name

卷的生命周期

RPC接口集合-Identity

CSI规范定义了3类RPC集合:
• Identity Service: Node Plugin和Controller Plugin都需要实现的RPC集合
• Controller Service: Controller Plugin需要实现的RPC集合
• Node Service: Node Plugin需要实现的RPC集合
Identity Service RPC:身份服务RPC允许CO查询插件的功能,健康状况和其他元数据。

RPC接口集合-Controller

Controller Service RPC: 控制服务RPC提供卷的创建、删除、 Attach、 Detach、查询等功能,以及卷快照的创建、删除、查询等功能

RPC接口集合-Node

K8S CSI架构

K8S 1.9实现了CSI plugin alpha版本, 1.11版本已升至Beta

为了部署一个容器化的第三方CSI volume
driver,存储提供商需要执行如下操作:
1. 创建一个实现CSI规范描述的插件功能,并通过Unix套接字来暴露gPRC访问接口的”CSI volume driver” 容器;
2. 结合使用K8S团队提供的帮助容器来部署CSI volume driver,具体需要创建如下两类K8S对象:

  • 1) StatefulSet:用于与K8S控制器进行交互,实例数1,包含3个容器( CSI volume driver、 external-attacher 、 externalprovisioner ),需要挂载一个挂载点为/var/lib/csi/sockets/pluginproxy/的emptyDir volume
  • 2) DaemonSet :包含2个容器( CSI volumedriver、 K8S CSI Helper),挂载3个hostpath volume

3. 集群管理员为存储系统在K8S集群中部署上述StatefulSet和DaemonSet

发布了389 篇原创文章 · 获赞 1041 · 访问量 163万+

猜你喜欢

转载自blog.csdn.net/fly910905/article/details/103837685