Use of rook-ceph

Preface

Environment: CentOS 7.9, Kubernetes 1.22.17

Why use Rook to deploy Ceph? From the official documentation (https://docs.ceph.com/en/latest/install/#recommended-methods) we can see that Ceph officially recommends Rook for deploying and managing Ceph clusters inside Kubernetes. This article therefore explains how to use Rook to deploy a Ceph cluster in a Kubernetes cluster.

What is rook

Rook official website: https://rook.io/
According to the official site, Rook is open source, cloud-native storage for Kubernetes.
Rook is a storage operator for Kubernetes. It turns distributed storage systems into self-managing, self-scaling, self-healing storage services, automating the storage administrator's tasks: deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring and resource management.
Rook uses the power of the Kubernetes platform to deliver Ceph services through a Kubernetes Operator.
Rook is a Ceph storage provider: it orchestrates the Ceph storage solution with a dedicated Kubernetes Operator that automates management, ensures Ceph runs well on Kubernetes and simplifies deployment and administration.

Rook is an open-source cloud-native storage orchestrator, providing the platform, framework and support for Ceph storage to integrate natively with cloud-native environments.
Rook can either create a Ceph cluster inside the Kubernetes cluster, or connect to an external Ceph cluster for Kubernetes to consume. This article uses Rook to create a Ceph cluster inside the Kubernetes cluster.

Prerequisites

1. Install dependencies
# Install lvm2 on all k8s nodes; it is required on the nodes where OSD pods run
sudo yum install -y lvm2
2. Confirm the kernel supports rbd
# Check whether the rbd module is available
lsmod | grep rbd		# any output means it is supported
rbd                    83640  0 
libceph               306625  1 rbd
# If there is no output, load the rbd module temporarily (lost after reboot)
modprobe  rbd			# load rbd into the kernel
# Write a config file so the module is loaded automatically at boot
[root@master ~]# vim  /etc/modules-load.d/rbd.conf		# any file name ending in .conf works
rbd										# the file content is simply "rbd"
[root@master ~]# 
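If you prefer a non-interactive equivalent of the vim step above, a minimal sketch (it writes the same one-line file and loads the module immediately):
echo "rbd" > /etc/modules-load.d/rbd.conf		# load rbd automatically on every boot
modprobe rbd && lsmod | grep rbd				# load it now and verify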
3. Kernel version requirement
If you will create RWX volumes from CephFS, the recommended minimum kernel version is 4.17. On kernels older than 4.17 the requested PVC size is not enforced; storage quotas are only enforced on newer kernels.
4. Kubernetes version requirement
Kubernetes v1.21 or higher is supported by Rook.
5. Server architecture requirement
Architectures supported are amd64 / x86_64 and arm64.
6. Storage requirement (at least one of the following must be available)
Raw devices (no partitions or formatted filesystems)
Raw partitions (no formatted filesystem)
LVM logical volumes (no formatted filesystem)
Persistent Volumes available from a storage class in block mode
Check with lsblk -f: if the FSTYPE field is not empty, there is a filesystem on the corresponding device and Rook cannot use it; devices whose FSTYPE is empty are available for Rook (a hypothetical example follows).
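A hypothetical example of what to look for (device names, sizes and UUIDs are illustrative only):
[root@master ~]# lsblk -f
NAME   FSTYPE LABEL UUID MOUNTPOINT
sda
├─sda1 xfs          ...  /boot
└─sda2 xfs          ...  /
sdb                           # FSTYPE is empty: no filesystem, Rook can use this disk as an OSD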

Start deploying rook

# Official docs: https://rook.io/docs/rook/v1.11/Getting-Started/quickstart/#tldr
# Clone the Rook source on the k8s master node
mkdir  ~/rook-ceph
cd ~/rook-ceph
yum install git -y
git clone --single-branch --branch v1.11.7 https://github.com/rook/rook.git
cd rook/deploy					# the deploy directory offers two installation methods
[root@master deploy]# ll							
drwxr-xr-x 5 root root   63 Jun 13 17:44 charts			# Helm charts for installing Rook/Ceph
drwxr-xr-x 4 root root 4096 Jun 13 17:44 examples		# yaml manifests for installing rook-ceph manually
drwxr-xr-x 3 root root   22 Jun 13 17:44 olm

# This article installs rook-ceph manually from the yaml manifests
[root@master deploy]# cd examples

[root@master examples]# ls crds.yaml common.yaml operator.yaml cluster.yaml
crds.yaml  common.yaml  operator.yaml  cluster.yaml
[root@master examples]# 
# As the names suggest, crds.yaml contains the custom resource definitions (CRDs); common.yaml defines the various RBAC-related roles, rolebindings and serviceaccounts
# operator.yaml creates a configmap and a deployment; that deployment is the Rook Ceph Operator
# cluster.yaml creates a custom resource of kind: CephCluster, which installs the Ceph daemon pods (mon, mgr, osd, mds, rgw)
# images.txt lists the required images, so they can be pulled in advance

# Start installing rook-ceph, in the order crds.yaml, common.yaml, operator.yaml, cluster.yaml
[root@master examples]# kubectl create -f crds.yaml -f common.yaml
[root@master examples]# vim operator.yaml			# worth reviewing first; this file creates a configmap and a deployment
 ROOK_CSI_KUBELET_DIR_PATH: "/var/lib/kubelet"	# this variable is the kubelet path; change it if your kubelet is installed in a different directory
 ROOK_ENABLE_DISCOVERY_DAEMON: "true"		# whether to start the discovery daemon that watches for raw storage devices on cluster nodes; default false, set to true here
 # If you only intend to create OSDs based on StorageClassDeviceSets and PVCs, this daemon does not need to run
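The discovery flag can also be flipped without opening an editor; a minimal sketch (it assumes the stock operator.yaml, where ROOK_ENABLE_DISCOVERY_DAEMON defaults to "false"):
sed -i 's/ROOK_ENABLE_DISCOVERY_DAEMON: "false"/ROOK_ENABLE_DISCOVERY_DAEMON: "true"/' operator.yaml
grep ROOK_ENABLE_DISCOVERY_DAEMON operator.yaml		# confirm the change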

Some of the default images referenced (commented out) in the configmap of operator.yaml may not be downloadable from registry.k8s.io in some networks. A third-party mirror repository can be used as a workaround:
docker pull registry.aliyuncs.com/it00021hot/csi-attacher:v4.1.0
docker tag registry.aliyuncs.com/it00021hot/csi-attacher:v4.1.0 registry.k8s.io/sig-storage/csi-attacher:v4.1.0
docker rmi registry.aliyuncs.com/it00021hot/csi-attacher:v4.1.0

docker pull registry.aliyuncs.com/it00021hot/csi-node-driver-registrar:v2.7.0
docker tag registry.aliyuncs.com/it00021hot/csi-node-driver-registrar:v2.7.0  registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.7.0
docker rmi registry.aliyuncs.com/it00021hot/csi-node-driver-registrar:v2.7.0

docker pull registry.aliyuncs.com/it00021hot/csi-resizer:v1.7.0
docker tag registry.aliyuncs.com/it00021hot/csi-resizer:v1.7.0 registry.k8s.io/sig-storage/csi-resizer:v1.7.0
docker rmi registry.aliyuncs.com/it00021hot/csi-resizer:v1.7.0 

docker pull  registry.aliyuncs.com/it00021hot/csi-provisioner:v3.4.0
docker tag registry.aliyuncs.com/it00021hot/csi-provisioner:v3.4.0 registry.k8s.io/sig-storage/csi-provisioner:v3.4.0
docker rmi registry.aliyuncs.com/it00021hot/csi-provisioner:v3.4.0

docker pull  registry.aliyuncs.com/it00021hot/csi-snapshotter:v6.2.1
docker tag registry.aliyuncs.com/it00021hot/csi-snapshotter:v6.2.1  registry.k8s.io/sig-storage/csi-snapshotter:v6.2.1
docker rmi registry.aliyuncs.com/it00021hot/csi-snapshotter:v6.2.1 
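The repetitive pull/tag/rmi steps above can be scripted; a minimal sketch (it assumes the same aliyuncs mirror repository and image tags listed above):
#!/bin/bash
MIRROR=registry.aliyuncs.com/it00021hot
UPSTREAM=registry.k8s.io/sig-storage
for img in csi-attacher:v4.1.0 csi-node-driver-registrar:v2.7.0 \
           csi-resizer:v1.7.0 csi-provisioner:v3.4.0 csi-snapshotter:v6.2.1; do
    docker pull ${MIRROR}/${img}					# pull from the mirror
    docker tag  ${MIRROR}/${img} ${UPSTREAM}/${img}	# retag to the name expected by operator.yaml
    docker rmi  ${MIRROR}/${img}					# drop the mirror tag
done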

[root@master examples]# kubectl create  -f operator.yaml	# install operator.yaml

Explanation of the cluster.yaml file

[root@master examples]# grep -Ev '^$|^\s*#' cluster.yaml 
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph # namespace:cluster
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v17.2.6
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook		# host path where configuration files are persisted; must be specified
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  waitTimeoutForHealthyOSDInMinutes: 10
  mon:
    count: 3							# number of mons, default 3; for production at least 3 mons are recommended, and the count should be odd
    allowMultiplePerNode: false			# whether multiple mons may run on the same node, default false (not allowed)
  mgr:
    count: 2							# number of mgrs, default 2; set to 2 for mgr high availability (one active, one standby)
    allowMultiplePerNode: false			# whether multiple mgrs may run on the same node, default false (not allowed)
    modules:
      - name: pg_autoscaler
        enabled: true
  dashboard:
    enabled: true						# the Ceph dashboard, enabled by default
    ssl: true
  monitoring:				
    enabled: false
    metricsDisabled: false
  network:
    connections:
      encryption:
        enabled: false
      compression:
        enabled: false
      requireMsgr2: false
  crashCollector:
    disable: false
  logCollector:
    enabled: true
    periodicity: daily # one of: hourly, daily, weekly, monthly
    maxLogSize: 500M # SUFFIX may be 'M' or 'G'. Must be at least 1M.
  cleanupPolicy:
    confirmation: ""
    sanitizeDisks:
      method: quick
      dataSource: zero
      iteration: 1
    allowUninstallWithVolumes: false
  annotations:
  labels:
  resources:
  removeOSDsIfOutAndSafeToRemove: false
  priorityClassNames:
    mon: system-node-critical
    osd: system-node-critical
    mgr: system-cluster-critical
  storage: 					# cluster-level storage configuration and selection, i.e. which disks become OSDs (a node/device-specific sketch follows this output)
    useAllNodes: true		# use all nodes by default; set to false if only specific nodes should be used
    useAllDevices: true		# use all devices without a filesystem by default; set to false if only specific disks should be used
    config:
    onlyApplyOSDPlacement: false
  disruptionManagement:
    managePodBudgets: true
    osdMaintenanceTimeout: 30
    pgHealthCheckTimeout: 0

  healthCheck:
    daemonHealth:
      mon:
        disabled: false
        interval: 45s
      osd:
        disabled: false
        interval: 60s
      status:
        disabled: false
        interval: 60s
    livenessProbe:
      mon:
        disabled: false
      mgr:
        disabled: false
      osd:
        disabled: false
    startupProbe:
      mon:
        disabled: false
      mgr:
        disabled: false
      osd:
        disabled: false

[root@master examples]# 
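If you do not want Rook to consume every eligible disk on every node, set useAllNodes and useAllDevices to false and list the nodes and devices explicitly in the storage section. A minimal sketch (the node names ceph-1/ceph-2 and the device sdb are hypothetical; adapt them to your hosts):
  storage:
    useAllNodes: false
    useAllDevices: false
    nodes:
      - name: "ceph-1"			# must match the Kubernetes node name
        devices:
          - name: "sdb"			# raw disk on that node to use as an OSD
      - name: "ceph-2"
        devices:
          - name: "sdb"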



[root@master examples]# vim cluster.yaml					# review cluster.yaml again; pay attention to the parameter below

# IMPORTANT: If you reinstall the cluster, make sure you delete this directory on every host, otherwise the mons on the new cluster will fail to start.
# In Minikube, the '/data' directory is configured to persist across reboots, so use "/data/rook" in Minikube environments.
dataDirHostPath: /var/lib/rook		# keep the default; no change needed


[root@master examples]# kubectl create -f cluster.yaml		# install cluster.yaml
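Creating the CephCluster takes several minutes. A couple of commands to watch progress (a sketch; rook-ceph-operator is the deployment created by operator.yaml):
kubectl -n rook-ceph get pod -w							# watch the mons, mgr, osd-prepare jobs and osds come up
kubectl -n rook-ceph logs deploy/rook-ceph-operator -f		# follow the operator log if something gets stuck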

View cephcluster

[root@ceph-1 examples]# kubectl -n rook-ceph get cephcluster
NAME        DATADIRHOSTPATH   MONCOUNT   AGE    PHASE   MESSAGE                        HEALTH      EXTERNAL   FSID
rook-ceph   /var/lib/rook     3          2d1h   Ready   Cluster created successfully   HEALTH_OK              25f12c96-29b0-4487-8e2d-ec24525e33f9
[root@ceph-1 examples]# 

View pod

[root@master examples]# kubectl  get all -n rook-ceph
NAME                                                   READY   STATUS      RESTARTS   AGE
pod/csi-cephfsplugin-c4dt7                             2/2     Running     0          15m
pod/csi-cephfsplugin-ddk4x                             2/2     Running     0          15m
pod/csi-cephfsplugin-nv2m8                             2/2     Running     0          3m42s
pod/csi-cephfsplugin-provisioner-58948fc785-jbmxs      5/5     Running     0          4m20s
pod/csi-cephfsplugin-provisioner-58948fc785-sr9r6      5/5     Running     0          15m
pod/csi-rbdplugin-ht49j                                2/2     Running     0          15m
pod/csi-rbdplugin-k5lwp                                2/2     Running     0          15m
pod/csi-rbdplugin-provisioner-84bfcb8bfc-8xb4f         5/5     Running     0          15m
pod/csi-rbdplugin-provisioner-84bfcb8bfc-m8kgp         5/5     Running     0          15m
pod/csi-rbdplugin-x9p4g                                2/2     Running     0          3m42s
pod/rook-ceph-crashcollector-ceph-1-6f67c84f4b-p2rxt   1/1     Running     0          15m
pod/rook-ceph-crashcollector-ceph-2-76d94b6769-x2q95   1/1     Running     0          13m
pod/rook-ceph-crashcollector-ceph-3-5ffd7fc4d-4tkfc    1/1     Running     0          13m
pod/rook-ceph-mgr-a-745c8fb9c-8lxkm                    3/3     Running     0          15m
pod/rook-ceph-mgr-b-77bdd5584b-wqb6w                   3/3     Running     0          15m
pod/rook-ceph-mon-a-5bdf5ccd9-ptzhl                    2/2     Running     0          17m
pod/rook-ceph-mon-b-577d5bfcd7-22vtt                   2/2     Running     0          15m
pod/rook-ceph-mon-c-84bccb66f5-k9lzj                   2/2     Running     0          15m
pod/rook-ceph-operator-58d8b7b5df-8blw4                1/1     Running     0          166m
pod/rook-ceph-osd-0-844ffbd8d7-9cclh                   2/2     Running     0          13m
pod/rook-ceph-osd-1-6ccd9cc645-jvqjh                   2/2     Running     0          13m
pod/rook-ceph-osd-2-54866f6f46-c7vpx                   2/2     Running     0          13m
pod/rook-ceph-osd-prepare-ceph-1-w5cnd                 0/1     Completed   0          13m
pod/rook-ceph-osd-prepare-ceph-2-rxml8                 0/1     Completed   0          13m
pod/rook-ceph-osd-prepare-ceph-3-dkq98                 0/1     Completed   0          12m

NAME                              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/rook-ceph-mgr             ClusterIP   10.233.42.104   <none>        9283/TCP            14m
service/rook-ceph-mgr-dashboard   ClusterIP   10.233.36.230   <none>        8443/TCP            14m
service/rook-ceph-mon-a           ClusterIP   10.233.3.156    <none>        6789/TCP,3300/TCP   17m
service/rook-ceph-mon-b           ClusterIP   10.233.62.240   <none>        6789/TCP,3300/TCP   15m
service/rook-ceph-mon-c           ClusterIP   10.233.34.110   <none>        6789/TCP,3300/TCP   15m

NAME                              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/csi-cephfsplugin   3         3         3       3            3           <none>          15m
daemonset.apps/csi-rbdplugin      3         3         3       3            3           <none>          15m

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/csi-cephfsplugin-provisioner      2/2     2            2           15m
deployment.apps/csi-rbdplugin-provisioner         2/2     2            2           15m
deployment.apps/rook-ceph-crashcollector-ceph-1   1/1     1            1           15m
deployment.apps/rook-ceph-crashcollector-ceph-2   1/1     1            1           15m
deployment.apps/rook-ceph-crashcollector-ceph-3   1/1     1            1           15m
deployment.apps/rook-ceph-mgr-a                   1/1     1            1           15m
deployment.apps/rook-ceph-mgr-b                   1/1     1            1           15m
deployment.apps/rook-ceph-mon-a                   1/1     1            1           17m
deployment.apps/rook-ceph-mon-b                   1/1     1            1           15m
deployment.apps/rook-ceph-mon-c                   1/1     1            1           15m
deployment.apps/rook-ceph-operator                1/1     1            1           3h54m
deployment.apps/rook-ceph-osd-0                   1/1     1            1           13m
deployment.apps/rook-ceph-osd-1                   1/1     1            1           13m
deployment.apps/rook-ceph-osd-2                   1/1     1            1           13m

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/csi-cephfsplugin-provisioner-58948fc785      2         2         2       15m
replicaset.apps/csi-rbdplugin-provisioner-84bfcb8bfc         2         2         2       15m
replicaset.apps/rook-ceph-crashcollector-ceph-1-6f67c84f4b   1         1         1       15m
replicaset.apps/rook-ceph-crashcollector-ceph-2-76d94b6769   1         1         1       13m
replicaset.apps/rook-ceph-crashcollector-ceph-2-fd87584cf    0         0         0       15m
replicaset.apps/rook-ceph-crashcollector-ceph-3-5ffd7fc4d    1         1         1       13m
replicaset.apps/rook-ceph-crashcollector-ceph-3-6955f6c55d   0         0         0       15m
replicaset.apps/rook-ceph-mgr-a-745c8fb9c                    1         1         1       15m
replicaset.apps/rook-ceph-mgr-b-77bdd5584b                   1         1         1       15m
replicaset.apps/rook-ceph-mon-a-5bdf5ccd9                    1         1         1       17m
replicaset.apps/rook-ceph-mon-b-577d5bfcd7                   1         1         1       15m
replicaset.apps/rook-ceph-mon-c-84bccb66f5                   1         1         1       15m
replicaset.apps/rook-ceph-operator-58d8b7b5df                1         1         1       3h54m
replicaset.apps/rook-ceph-osd-0-844ffbd8d7                   1         1         1       13m
replicaset.apps/rook-ceph-osd-1-6ccd9cc645                   1         1         1       13m
replicaset.apps/rook-ceph-osd-2-54866f6f46                   1         1         1       13m

NAME                                     COMPLETIONS   DURATION   AGE
job.batch/rook-ceph-osd-prepare-ceph-1   1/1           2s         13m
job.batch/rook-ceph-osd-prepare-ceph-2   1/1           4s         13m
job.batch/rook-ceph-osd-prepare-ceph-3   1/1           7s         12m
[root@master examples]# 

Install ceph-toolbox client tool

We have set up the Ceph cluster above, so how do we verify that it is healthy? Rook provides the official ceph-toolbox client, a container from which Ceph commands can be run against the cluster. Just follow the steps on the official website to create ceph-toolbox:

ceph-toolbox docs: https://rook.io/docs/rook/v1.11/Troubleshooting/ceph-toolbox/
The official docs describe two modes of running the toolbox:
Interactive: start a toolbox pod, connect to it and run Ceph commands from a shell;
One-time job: run a script of Ceph commands and collect the results from the job log.
Here we use the interactive mode, i.e. start a toolbox pod and run Ceph commands in a shell inside it.
The toolbox yaml files are in the source tree, so they can be applied directly.

[root@master examples]# ll toolbox.yaml toolbox-job.yaml 
-rw-r--r-- 1 root root 1868 Jul 11 17:31 toolbox-job.yaml	# one-time job
-rw-r--r-- 1 root root 4220 Jul 11 17:31 toolbox.yaml		# interactive
[root@master examples]# 
[root@master examples]# kubectl  apply  -f toolbox.yaml 
deployment.apps/rook-ceph-tools created
[root@master examples]# 
[root@master examples]# kubectl  get pod -n rook-ceph 
NAME                                               READY   STATUS      RESTARTS   AGE
rook-ceph-tools-657868c8cf-sz5b7                   1/1     Running     0          25s
[root@master examples]# kubectl  -n rook-ceph exec -it rook-ceph-tools-657868c8cf-sz5b7 -- bash
bash-4.4$ ceph status			# check ceph cluster status
  cluster:
    id:     25f12c96-29b0-4487-8e2d-ec24525e33f9
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 54m)
    mgr: a(active, since 51m), standbys: b
    osd: 3 osds: 3 up (since 51m), 3 in (since 52m)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 449 KiB
    usage:   62 MiB used, 147 GiB / 147 GiB avail
    pgs:     1 active+clean
 
bash-4.4$ ceph fsid						# check the ceph cluster ID
25f12c96-29b0-4487-8e2d-ec24525e33f9
bash-4.4$ 
bash-4.4$ cat /etc/ceph/ceph.conf   	# note that the container actually contains a ceph.conf file
[global]
mon_host = 10.233.34.110:6789,10.233.3.156:6789,10.233.62.240:6789		# the service IPs of the 3 mons

[client.admin]
keyring = /etc/ceph/keyring
bash-4.4$ cat /etc/ceph/keyring   		# the container also contains the keyring file
[client.admin]
key = AQBuZa1kTXIsLBAATbTu19OPcVATJX4rgnJFCQ==
bash-4.4$ ceph auth  get-key client.admin	# this command also shows the keyring
AQBuZa1kTXIsLBAATbTu19OPcVATJX4rgnJFCQ==
bash-4.4$
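A few other frequently used health-check commands inside the toolbox (a sketch; all are standard Ceph CLI commands):
bash-4.4$ ceph osd tree			# OSD topology and up/down status
bash-4.4$ ceph osd status		# per-OSD usage and state
bash-4.4$ ceph df				# cluster and per-pool capacity
bash-4.4$ ceph health detail	# details when the status is not HEALTH_OK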

Enable ceph-dashboard

Official website: https://rook.io/docs/rook/v1.11/Storage-Configuration/Monitoring/ceph-dashboard/#enable-the-ceph-dashboard

# When installing cluster.yaml we enabled the Ceph dashboard
[root@ceph-1 examples]# grep -B9  -i dashboard cluster.yaml 
# enable the ceph dashboard for viewing cluster status
  dashboard:
    enabled: true
    # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
    # urlPrefix: /ceph-dashboard
    # serve the dashboard at the given port.
    # port: 8443
    # serve the dashboard using SSL
[root@ceph-1 examples]# kubectl -n rook-ceph get service	#查看svc
NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
rook-ceph-mgr             ClusterIP   10.233.42.104   <none>        9283/TCP            43h
rook-ceph-mgr-dashboard   ClusterIP   10.233.36.230   <none>        8443/TCP            43h
..........................
Notes:
the rook-ceph-mgr service exposes Prometheus metrics;
the rook-ceph-mgr-dashboard service is the Ceph dashboard panel
[root@ceph-1 examples]# 
Rook creates a default dashboard user named admin; its password is stored in the rook-ceph-dashboard-password secret.
It can be decoded like this: kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{.data.password}" | base64 --decode && echo 

Visit dashboard

Since the rook-ceph-mgr-dashboard svc is of ClusterIP type by default, external browsers cannot reach it. Editing the svc and changing it to NodePort does not work either: even if the edit saves successfully, the operator automatically reverts it back to ClusterIP.
According to the official website there are several ways to expose the dashboard, NodePort being the simplest. The official repository provides the corresponding svc manifest in the source tree:

[root@ceph-1 examples]# ll dashboard-external-https.yaml
-rw-r--r-- 1 root root 432 Jul 11 17:31 dashboard-external-https.yaml
[root@ceph-1 examples]# kubectl  apply  -f dashboard-external-https.yaml
[root@ceph-1 examples]# kubectl  get svc -n rook-ceph 
NAME                                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
rook-ceph-mgr                            ClusterIP   10.233.42.104   <none>        9283/TCP            44h
rook-ceph-mgr-dashboard                  ClusterIP   10.233.36.230   <none>        8443/TCP            44h
rook-ceph-mgr-dashboard-external-https   NodePort    10.233.63.61    <none>        8443:17468/TCP      8s
[root@ceph-1 examples]# 
Access from a browser:
https://192.168.244.6:17468/	
Note that the URL is https, not http. The username is admin and the password is stored in the rook-ceph-dashboard-password secret.
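If you prefer not to expose a NodePort at all, kubectl port-forward against the existing ClusterIP service is a quick alternative (a sketch, using the service name and port 8443 shown above):
kubectl -n rook-ceph port-forward svc/rook-ceph-mgr-dashboard 8443:8443
# then browse to https://localhost:8443/ on the machine running kubectl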

Delete rook-ceph cluster

Official teardown documentation: https://rook.io/docs/rook/v1.11/Getting-Started/ceph-teardown/
# If the yaml files still exist, the resources can be deleted from them
kubectl delete -f cluster.yaml
kubectl delete -f common.yaml
kubectl delete -f operator.yaml
rm -rf /var/lib/rook		# delete the default Ceph config directory on every host
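The official teardown guide also describes wiping the disks that held OSD data so they can be reused by a new cluster; a hedged sketch based on that guide (replace /dev/sdb with your actual OSD device and run it on every OSD node):
DISK="/dev/sdb"						# hypothetical device name, adjust per node
sgdisk --zap-all $DISK				# wipe partition/GPT metadata
dd if=/dev/zero of=$DISK bs=1M count=100 oflag=direct,dsync	# zero the start of the disk
# If ceph-volume created LVM volumes on the disk, also remove the ceph device-mapper entries, e.g.:
# ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
# rm -rf /dev/ceph-*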


Origin blog.csdn.net/MssGuo/article/details/131187935