Microservice architecture series (1): building the virtual platform, distributed storage, and a high-availability k8s cluster environment

1. Construction of physical machine virtual platform

1. Diagram of the conversion from the physical machines to the target architecture (each physical machine needs two hard disks: one system disk and one for distributed storage. Mine are old machines from about 10 years ago, so the hardware requirements are modest):

 

2. For system installation, download the Proxmox VE 7.x ISO installer from the official website (pick the version you need), then write the image to a USB stick with a USB creation tool, selecting DD mode in the pop-up dialog; if your machine is newer, you can choose whichever USB creation or installation tool you prefer. (To restore the USB stick for everyday use afterwards, format it manually: cmd --> diskpart --> list disk --> select disk 1 (use the actual disk number) --> clean --> create partition primary --> active --> format fs=ntfs label="my usb" quick --> assign.)

3. Follow the installer step by step; there is nothing special to watch out for. Ideally each machine has two disks: one for the system and a larger one reserved for distributed storage. Just take care to select the correct target disk.

2. Set up the Ceph distributed storage environment (at least three machines are required; install it on each of them)

1. nano /etc/apt/sources.list — replace the contents of sources.list with the sources below, then save and exit (Ctrl+O to save --> Enter to keep the same file name --> Ctrl+X to exit)

#deb http://ftp.debian.org/debian bullseye main contrib
#deb http://ftp.debian.org/debian bullseye-updates main contrib
# security updates
#deb http://security.debian.org bullseye-security main contrib
# debian aliyun source
deb https://mirrors.aliyun.com/debian/ bullseye main non-free contrib
deb-src https://mirrors.aliyun.com/debian/ bullseye main non-free contrib
deb https://mirrors.aliyun.com/debian-security/ bullseye-security main
deb-src https://mirrors.aliyun.com/debian-security/ bullseye-security main
deb https://mirrors.aliyun.com/debian/ bullseye-updates main non-free contrib
deb-src https://mirrors.aliyun.com/debian/ bullseye-updates main non-free contrib
deb https://mirrors.aliyun.com/debian/ bullseye-backports main non-free contrib
deb-src https://mirrors.aliyun.com/debian/ bullseye-backports main non-free contrib
# proxmox source
# deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription
deb https://mirrors.ustc.edu.cn/proxmox/debian/pve bullseye pve-no-subscription

2. nano /etc/apt/sources.list.d/pve-enterprise.list — comment out the original enterprise repository line, save and exit

# deb https://enterprise.proxmox.com/debian/pve buster pve-enterprise

3. nano /etc/apt/sources.list.d/ceph.list — replace the Ceph source, save and exit

deb http://mirrors.ustc.edu.cn/proxmox/debian/ceph-pacific bullseye main

4. Update the package lists and upgrade

root@masterxxx:~# apt-get update 
root@masterxxx:~# apt-get upgrade 
root@masterxxx:~# apt-get dist-upgrade 

5. Check whether /etc/apt/sources.list.d/ceph.list was changed by the upgrade. If it changed, repeat step 3.

6. Configure the Proxmox 7.1 cluster. I created it while logged in to master001, and 002 and 003 then joined it directly (a CLI sketch follows below).
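If you prefer the command line to the web UI, the cluster can also be created with pvecm; a minimal sketch, assuming the cluster name pve-cluster and that master001's IP is 172.16.0.143 (both are my assumptions, substitute your own values):

# On master001: create the cluster (the name is arbitrary)
root@master001:~# pvecm create pve-cluster
# On master002 and master003: join by pointing at master001
root@master002:~# pvecm add 172.16.0.143
root@master003:~# pvecm add 172.16.0.143
# Check quorum and membership
root@master001:~# pvecm status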

7. Open the Ceph entry under the cluster menu; a configuration wizard pops up. Configure the public and cluster network IPs there and you are done (I use master001 as the management node). A CLI equivalent is sketched below.
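If the wizard is not convenient, roughly the same setup can be done on the shell with pveceph; the 172.16.0.0/24 network below is an assumption, use your own public/cluster networks:

# On every Ceph node: install the Ceph packages
root@masterxxx:~# pveceph install
# On the management node (master001): initialize Ceph with the chosen network
root@master001:~# pveceph init --network 172.16.0.0/24
# Create a monitor (and optionally a manager) on each node
root@masterxxx:~# pveceph mon create
root@masterxxx:~# pveceph mgr create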

8. Then go to each Ceph node (mine are also Proxmox nodes) to check and bind the spare hard disk. Mine is /dev/sda; check and bind according to the actual disk on each node.

root@masterxxx:~# fdisk -l
root@masterxxx:~# pveceph osd create /dev/sda

9. Create the Ceph pools pool-k8s and pool-vm for later use. If you want a different size, click Advanced and set the pg count to the appropriate value; I just created them with the defaults (a CLI sketch follows below). Finally, upload the system ISO image.
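For reference, the two pools can also be created from the CLI; a sketch with an assumed pg_num of 128 (size it to your OSD count, or simply keep the defaults as I did):

root@master001:~# pveceph pool create pool-k8s --pg_num 128
root@master001:~# pveceph pool create pool-vm --pg_num 128
# Verify
root@master001:~# ceph osd lspools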

10. Base system installation for k8s (select the storage and the ISO image uploaded in the previous step, follow the wizard step by step, wait for creation to finish, then click Start on the right to complete the base system installation). Finally remove the CD/DVD drive (otherwise migration will fail) and convert the VM to a template.

3. Building the k8s cluster with 3 masters and 3 workers

1. Environment overview

System: Rocky Linux 8.6 (4 cores, 8 GB) × 6 VMs ----> an idle base system uses only about 3.5 GB of memory, so all of them can run at the same time

High-availability middleware: Nginx + Keepalived (for control-plane high availability)

Network plugin: Calico

Time synchronization: chrony

Specific node details:

172.16.1.83  master083  control plane (Keepalived MASTER)
172.16.1.78  master078  control plane (Keepalived BACKUP)
172.16.1.79  master079  control plane (Keepalived BACKUP)
172.16.1.80  worker080  worker
172.16.1.81  worker081  worker
172.16.1.82  worker082  worker

2. K8s installation diagram. My physical machines are in the 172.16.0.x segment and the virtual machines in the 172.16.1.x segment, and the two can reach each other. Adjust the IP segments to your actual situation.

3. Full-clone the template (do not use a linked clone) into one virtual machine (for the k8s base environment)

3.1. Basic tool installation

[root@anonymous ~]# yum install -y yum-utils device-mapper-persistent-data lvm2 wget net-tools nfs-utils lrzsz gcc gcc-c++ make cmake libxml2-devel openssl-devel curl curl-devel unzip sudo libaio-devel vim ncurses-devel autoconf automake zlib-devel epel-release openssh-server socat ipvsadm conntrack telnet

3.2. Time synchronization configuration

[root@anonymous ~]# yum -y install chrony
[root@anonymous ~]# systemctl enable chronyd --now
[root@anonymous ~]# vim /etc/chrony.conf
Remove:
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst
and insert the domestic (China) NTP server addresses in their place:
server ntp1.aliyun.com iburst
server ntp2.aliyun.com iburst
server ntp1.tencent.com iburst
server ntp2.tencent.com iburst
[root@anonymous ~]# systemctl restart chronyd

3.3. Disable the firewall, SELinux, and the swap partition

[root@anonymous ~]# systemctl stop firewalld ; systemctl disable firewalld
# Permanently disable SELinux (requires a reboot to take effect)
[root@anonymous ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
# Temporarily disable SELinux
[root@anonymous ~]# setenforce 0
 
# Temporarily disable swap
[root@anonymous ~]# swapoff -a
# Permanently disable swap: comment out the swap mount line (/dev/mapper/centos-swap swap) in /etc/fstab; a sed one-liner is sketched after this block
[root@anonymous ~]# vim /etc/fstab
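Instead of editing /etc/fstab by hand, the swap line can be commented out with a sed one-liner; a small sketch (it comments every uncommented line containing "swap", so review the file afterwards):

[root@anonymous ~]# sed -ri 's/^[^#].*swap.*/#&/' /etc/fstab
[root@anonymous ~]# grep swap /etc/fstab   # the swap mount should now be commented out
[root@anonymous ~]# free -m                # swap should show 0 after swapoff -a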

3.4. Modify kernel parameters

[root@anonymous ~]# modprobe br_netfilter
[root@anonymous ~]# lsmod | grep br_netfilter
 
[root@anonymous ~]# cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
 
[root@anonymous ~]# sysctl -p /etc/sysctl.d/k8s.conf
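modprobe does not survive a reboot. A commonly used way to load br_netfilter (plus the ipvs modules that kube-proxy in ipvs mode relies on) automatically is a modules-load.d file; a sketch, with the module list being my assumption:

[root@anonymous ~]# cat > /etc/modules-load.d/k8s.conf <<EOF
br_netfilter
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
[root@anonymous ~]# systemctl restart systemd-modules-load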

3.5. Configure the installation source

# Docker repo:
[root@anonymous ~]# yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Kubernetes repo:
[root@anonymous ~]# tee /etc/yum.repos.d/kubernetes.repo <<-'EOF'
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF

3.6. Install k8s and the container runtime (I chose Docker)

# Install the container runtime (Docker)
[root@anonymous ~]# yum install docker-ce -y
[root@anonymous ~]# systemctl start docker && systemctl enable docker.service
 
[root@anonymous ~]# tee /etc/docker/daemon.json << 'EOF'
{
 "registry-mirrors":["https://vh3bm52y.mirror.aliyuncs.com","https://registry.docker-cn.com","https://docker.mirrors.ustc.edu.cn","https://dockerhub.azk8s.cn","http://hub-mirror.c.163.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
} 
EOF
 
[root@anonymous ~]# systemctl daemon-reload
[root@anonymous ~]# systemctl restart docker
[root@anonymous ~]# systemctl enable docker
[root@anonymous ~]# systemctl status docker
 
# Install the k8s components
[root@anonymous ~]# yum install -y kubelet-1.23.1 kubeadm-1.23.1 kubectl-1.23.1
[root@anonymous ~]# systemctl enable kubelet
 
# kubelet: runs on every node in the cluster and is responsible for starting Pods and containers
# kubeadm: the command-line tool used to initialize and bootstrap the cluster
# kubectl: the command line for talking to the cluster; with kubectl you can deploy and manage applications, inspect resources, and create, delete, and update components

3.7. Shut down the current virtual machine and convert it to a template, then right-click --> Clone --> Full Clone until a total of 6 virtual machines have been created.

3.8. Configure a static IP and hostname on each cloned machine.

# Set a static IP (my NIC is eth0; change to your actual NIC)
[root@192 network-scripts]# vim /etc/sysconfig/network-scripts/ifcfg-eth0
# Configuration reference; fill in X with the actual host IP (how to apply the change is sketched after this block)
BOOTPROTO=static
IPADDR=172.16.1.X
NETMASK=255.255.255.0
GATEWAY=172.16.1.1
DNS1=172.16.1.1

# Set the hostname
[root@anonymous ~]# hostnamectl set-hostname master083 && bash
[root@master083 ~]# vim /etc/hosts
# Added host resolution entries
172.16.1.83 master083
172.16.1.78 master078
172.16.1.79 master079
172.16.1.80 worker080
172.16.1.81 worker081
172.16.1.82 worker082
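The new static IP only takes effect after the connection is reloaded; on Rocky Linux 8 something like the following usually works (a reboot achieves the same; the connection name may differ from the device name, check it with nmcli connection show):

[root@anonymous ~]# nmcli connection reload
[root@anonymous ~]# nmcli connection up eth0
[root@anonymous ~]# ip addr show eth0   # confirm the static IP is applied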

4. Master node (master083, master078, master079) configuration

4.1. Pick one virtual machine as the primary control node (I chose 172.16.1.83, master083) and distribute its SSH key to all nodes

# Press Enter through every prompt; do not set a passphrase
[root@master083 ~]# ssh-keygen
# Install the local SSH public key into the matching account on each remote host; type yes and enter the remote host's password
[root@master083 ~]# ssh-copy-id master078
[root@master083 ~]# ssh-copy-id master079
[root@master083 ~]# ssh-copy-id master083
[root@master083 ~]# ssh-copy-id worker080
[root@master083 ~]# ssh-copy-id worker081
[root@master083 ~]# ssh-copy-id worker082

4.2. Nginx + Keepalived on the master nodes (must be installed on all 3 masters)

[root@master083 ~]# yum install nginx keepalived nginx-mod-stream -y

4.3. Modify or replace the nginx configuration (change it on all 3 master nodes; file path: /etc/nginx/nginx.conf)

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;
events {
    worker_connections 1024;
}
stream {
    log_format  main  '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log  /var/log/nginx/k8s-access.log  main;
    upstream k8s-apiserver {
       server 172.16.1.83:6443 weight=5 max_fails=3 fail_timeout=30s;
       server 172.16.1.78:6443 weight=5 max_fails=3 fail_timeout=30s;
       server 172.16.1.79:6443 weight=5 max_fails=3 fail_timeout=30s;
    }
 
    server {
       listen 16443; # Since nginx runs on the same host as the master node, this listen port cannot be 6443, otherwise it would conflict with the apiserver
       proxy_pass k8s-apiserver;
    }
}
 
http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
    access_log  /var/log/nginx/access.log  main;
    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;
    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;
    server {
        listen       80 default_server;
        server_name  _;
 
        location / {
        }
    }
}
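Before starting nginx it is worth validating the file; the stream block needs the ngx_stream module from the nginx-mod-stream package installed above:

[root@master083 ~]# nginx -t   # validate /etc/nginx/nginx.conf before starting the service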

4.4. Modify or configure Keepalived (change it on all three master nodes; file path: /etc/keepalived/keepalived.conf)

global_defs {
   notification_email {
     [email protected]
     [email protected]
     [email protected]
   }
   notification_email_from [email protected]
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}
 
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
 
vrrp_instance VI_1 {
    state MASTER           # Use BACKUP on the standby servers (several MASTERs at once will conflict and fail to reach each other); my 083 is MASTER
    interface ens18        # Change to the actual NIC name
    virtual_router_id 83   # VRRP router ID; each instance must be unique
    priority 100           # Priority; set the backups (078 and 079) to e.g. 90/80, anything lower than 100 and different from each other
    advert_int 1           # VRRP heartbeat advertisement interval, default 1 second
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        172.16.1.83/24     # A single VIP shared by the 3 control nodes; choose an address that is not in use on your own network
    }
    track_script {
        check_nginx
    }
}

4.5. Reference check_nginx.sh below; remember to add execute permission.

#!/bin/bash
# 1. Check whether nginx is alive
counter=`ps -C nginx --no-header | wc -l`
if [ $counter -eq 0 ]; then
    # 2. If it is not running, try to start nginx
    service nginx start
    sleep 2
    # 3. Wait 2 seconds and then check the nginx status again
    counter=`ps -C nginx --no-header | wc -l`
    # 4. If nginx is still not running, stop keepalived so the VIP can fail over
    if [ $counter -eq 0 ]; then
        service keepalived stop
    fi
fi

4.6. Start nginx and Keepalived on the 3 master nodes (a failover check is sketched after the commands)

[root@master083 ~]# chmod +x /etc/keepalived/check_nginx.sh
[root@master083 ~]# systemctl daemon-reload
[root@master083 ~]# systemctl enable nginx keepalived
[root@master083 ~]# systemctl start nginx
[root@master083 ~]# systemctl start keepalived
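To confirm that Keepalived is really holding the VIP and failing it over, a quick check (ens18 and the VIP are whatever you set in keepalived.conf):

# On the MASTER: the VIP should be listed on the interface
[root@master083 ~]# ip addr show ens18 | grep inet
# Simulate a failure and watch the VIP appear on a BACKUP node
[root@master083 ~]# systemctl stop keepalived
[root@master078 ~]# ip addr show ens18 | grep inet
# Restore the MASTER afterwards
[root@master083 ~]# systemctl start keepalived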

4.7. Create the k8s cluster configuration file kubeadm-config.yaml (I did this on the Keepalived master, master083)

# YAML configuration reference
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.23.1
# The virtual IP configured above
controlPlaneEndpoint: 172.16.1.46:16443
imageRepository: registry.aliyuncs.com/google_containers
apiServer:
  certSANs:
    - 172.16.1.86
    - 172.16.1.78
    - 172.16.1.79
    - 172.16.1.80
    - 172.16.1.81
    - 172.16.1.82
    - 172.16.1.46
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.10.0.0/16
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind:  KubeProxyConfiguration
mode: ipvs
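Before running kubeadm init in the next step, the control-plane images can be pre-pulled with the same configuration file; a small sketch:

[root@master083 ~]# kubeadm config images list --config kubeadm-config.yaml
[root@master083 ~]# kubeadm config images pull --config kubeadm-config.yaml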

4.8. Initialize the cluster with the configuration file (on the Keepalived master, master083)

# Initialize the cluster
[root@master083 ~]# kubeadm init --config kubeadm-config.yaml --ignore-preflight-errors=SystemVerification
# Copy the admin user's kubeconfig
[root@master083 ~]# mkdir -p $HOME/.kube
[root@master083 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master083 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
# View the nodes
[root@master083 ~]# kubectl get nodes

5. Join the remaining nodes to the cluster

5.1. Joining a control-plane node, taking master079 as an example (repeat this step for master078). An alternative using kubeadm's certificate upload is sketched after the commands.

# Create the certificate directories
[root@master079 ~]# cd /root && mkdir -p /etc/kubernetes/pki/etcd && mkdir -p ~/.kube/
# Copy the certificates
[root@master083 ~]# scp /etc/kubernetes/pki/ca.crt master079:/etc/kubernetes/pki/
[root@master083 ~]# scp /etc/kubernetes/pki/ca.key master079:/etc/kubernetes/pki/
[root@master083 ~]# scp /etc/kubernetes/pki/sa.key master079:/etc/kubernetes/pki/
[root@master083 ~]# scp /etc/kubernetes/pki/sa.pub master079:/etc/kubernetes/pki/
[root@master083 ~]# scp /etc/kubernetes/pki/front-proxy-ca.crt master079:/etc/kubernetes/pki/
[root@master083 ~]# scp /etc/kubernetes/pki/front-proxy-ca.key master079:/etc/kubernetes/pki/
[root@master083 ~]# scp /etc/kubernetes/pki/etcd/ca.crt master079:/etc/kubernetes/pki/etcd/
[root@master083 ~]# scp /etc/kubernetes/pki/etcd/ca.key master079:/etc/kubernetes/pki/etcd/
# Print the join command with its token and sha256 hash
[root@master083 ~]# kubeadm token create --print-join-command
# Replace xxx with the values printed by the previous command (a VM rollback can skew the system clock; you can refresh it manually with: chronyc -a makestep)
[root@master079 ~]# kubeadm join 172.16.1.83:16443  --token xxx --discovery-token-ca-cert-hash sha256:xxx --control-plane --ignore-preflight-errors=SystemVerification
# Copy the admin kubeconfig
[root@master079 ~]# mkdir -p $HOME/.kube
[root@master079 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master079 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
# View node information
[root@master079 ~]# kubectl get nodes
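As an alternative to copying the certificates by hand with scp, kubeadm can distribute them itself; a hedged sketch, where <cert-key> stands for the certificate key printed by the first command:

# On master083: upload the control-plane certificates and print a certificate key
[root@master083 ~]# kubeadm init phase upload-certs --upload-certs
# On the joining master: add --certificate-key so kubeadm downloads the certs automatically
[root@master079 ~]# kubeadm join 172.16.1.83:16443 --token xxx --discovery-token-ca-cert-hash sha256:xxx --control-plane --certificate-key <cert-key>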

5.2. Joining the worker nodes

[root@worker080 ~]# kubeadm join 192.168.0.12:16443 --token *** --discovery-token-ca-cert-hash sha256:****** --ignore-preflight-errors=SystemVerification
[root@worker081 ~]# kubeadm join 192.168.0.12:16443 --token *** --discovery-token-ca-cert-hash sha256:****** --ignore-preflight-errors=SystemVerification
[root@worker082 ~]# kubeadm join 192.168.0.12:16443 --token *** --discovery-token-ca-cert-hash sha256:****** --ignore-preflight-errors=SystemVerification
# Label the worker nodes
[root@master083 ~]# kubectl label node worker080 node-role.kubernetes.io/worker=worker
[root@master083 ~]# kubectl label node worker081 node-role.kubernetes.io/worker=worker
[root@master083 ~]# kubectl label node worker082 node-role.kubernetes.io/worker=worker
[root@master083 ~]# kubectl get nodes

6. Install and verify the network plugin; this completes the basic k8s deployment

6.1. Install Calico (calico.yaml download address: https://docs.projectcalico.org/manifests/calico.yaml). The stock manifest pulls images from Google-hosted sources, which can make the installation fail, and needs a few custom tweaks; this article uses a customized calico.yaml. If you are familiar with Calico you can also prepare your own.

# Install
[root@master083 ~]# kubectl apply -f calico.yaml
# After the automatic configuration finishes, check whether the network is up
[root@master083 ~]# kubectl get nodes -owide
NAME        STATUS   ROLES                  AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                           KERNEL-VERSION              CONTAINER-RUNTIME
master078   Ready    control-plane,master   37h   v1.23.1   172.16.1.78   <none>        Rocky Linux 8.6 (Green Obsidian)   4.18.0-372.9.1.el8.x86_64   docker://20.10.21
master079   Ready    control-plane,master   37h   v1.23.1   172.16.1.79   <none>        Rocky Linux 8.6 (Green Obsidian)   4.18.0-372.9.1.el8.x86_64   docker://20.10.21
master083   Ready    control-plane,master   37h   v1.23.1   172.16.1.83   <none>        Rocky Linux 8.6 (Green Obsidian)   4.18.0-372.9.1.el8.x86_64   docker://20.10.21
worker080   Ready    worker                 37h   v1.23.1   172.16.1.80   <none>        Rocky Linux 8.6 (Green Obsidian)   4.18.0-372.9.1.el8.x86_64   docker://20.10.21
worker081   Ready    worker                 37h   v1.23.1   172.16.1.81   <none>        Rocky Linux 8.6 (Green Obsidian)   4.18.0-372.9.1.el8.x86_64   docker://20.10.21
worker082   Ready    worker                 37h   v1.23.1   172.16.1.82   <none>        Rocky Linux 8.6 (Green Obsidian)   4.18.0-372.9.1.el8.x86_64   docker://20.10.21

6.2. Create a pod and use it to check that DNS and external network access work

[root@master083 ~]# docker pull busybox:1.28

# External network check
[root@master083 ~]# kubectl run busybox --image busybox:1.28 --restart=Never --rm -it busybox -- sh
/ # ping www.baidu.com

# DNS check
[root@master083 ~]# kubectl run busybox --image busybox:1.28 --restart=Never --rm -it busybox -- sh
/ # nslookup kubernetes.default.svc.cluster.local

7. Set up the k8s distributed storage environment (you can skip this step if you use NFS shared storage)

7.1. Create the Ceph user kube for k8s storage operations on the Ceph management node

root@master001:~# ceph auth get-or-create client.kube mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=pool-k8s' -o ceph.client.kube.keyring

# 2. View the keys of the kube and admin users
root@master001:~# ceph auth get-key client.admin
root@master001:~# ceph auth get-key client.kube

7.2. Associate ceph account information on the k8s master node

# 1. Create the dev namespace
[root@master083 ~]# kubectl create ns dev
# 2. Create the admin secret; replace the key with the client.admin value obtained in step 7.1
[root@master083 ~]# kubectl create secret generic ceph-secret --type="kubernetes.io/rbd" \
--from-literal=key='AQDSdZBjX15VFBAA+zJDZ8reSLCm2UAxtEW+Gw==' \
--namespace=kube-system
# 3. In the dev namespace, create the secret PVCs use to access Ceph; replace the key with the client.kube value obtained in step 7.1
[root@master083 ~]# kubectl create secret generic ceph-user-secret --type="kubernetes.io/rbd" \
--from-literal=key='AQCizZJjB19ADxAAmx0yYeL2QDJ5j3WsN/jyGA==' \
--namespace=dev 
# 4. View the secrets
[root@master083 ~]# kubectl get secret ceph-user-secret -o yaml -n dev
[root@master083 ~]# kubectl get secret ceph-secret -o yaml -n kube-system

7.3. Create a service account, mainly used to manage the permissions the Ceph provisioner needs to run in the k8s cluster (rbac-ceph.yaml)

apiVersion: v1
kind: ServiceAccount  # An account used mainly to manage the permissions of the ceph provisioner running in the k8s cluster
metadata:
  name: rbd-provisioner
  namespace: kube-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-provisioner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["kube-dns"]
    verbs: ["list", "get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-provisioner
subjects:
  - kind: ServiceAccount
    name: rbd-provisioner
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: rbd-provisioner
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: rbd-provisioner
  namespace: kube-system
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: rbd-provisioner
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rbd-provisioner
subjects:
  - kind: ServiceAccount
    name: rbd-provisioner
    namespace: kube-system

7.4. Create the Ceph provisioner (provisioner-ceph.yaml)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rbd-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      containers:
        - name: rbd-provisioner
          image: "quay.io/external_storage/rbd-provisioner:latest"
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: ceph-conf
              mountPath: /etc/ceph
          env:
            - name: PROVISIONER_NAME
              value: ceph.com/rbd
      serviceAccount: rbd-provisioner

      volumes:
        - name: ceph-conf
          hostPath:
            path: /etc/ceph

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: ceph.com/rbd
parameters:
  monitors: 172.16.0.143:6789,172.16.0.211:6789,172.16.0.212:6789
  adminId: admin                      # the Ceph admin user k8s uses
  adminSecretName: ceph-secret        # name of the admin secret
  adminSecretNamespace: kube-system   # namespace of the admin secret
  pool: pool-k8s       # the Ceph RBD pool
  userId: kube          # the Ceph user k8s uses
  userSecretName: ceph-user-secret    # name of the user secret; no namespace needed (it is resolved in the PVC's namespace)
  fsType: ext4
  imageFormat: "2"
  imageFeatures: "layering"
reclaimPolicy: Retain

7.5. Copy the files under /etc/ceph on the Ceph server (mine is the physical machine master001) to each worker node; the /etc/ceph contents on worker081 after copying serve as the reference. A copy sketch follows.
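A sketch of the copy, assuming the PVE host can reach the worker nodes by IP over SSH (the worker IPs are taken from the node list above):

# Run on the Ceph/PVE node master001; copies ceph.conf and the keyrings
root@master001:~# for node in 172.16.1.80 172.16.1.81 172.16.1.82; do ssh root@${node} mkdir -p /etc/ceph; scp /etc/ceph/* root@${node}:/etc/ceph/; done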

7.6. Apply the initial deployment on the k8s control node master083

[root@master083 ~]# kubectl apply -f rbac-ceph.yaml
[root@master083 ~]# kubectl apply -f provisioner-ceph.yaml
# Check which node the provisioner pod started on
[root@master083 ~]# kubectl get pod -n kube-system -owide
................                           ...     .......   ........        ...
kube-proxy-s5mfz                           1/1     Running   4 (22h ago)     37h     172.16.1.81      worker081   <none>           <none>
kube-proxy-tqksh                           1/1     Running   4               38h     172.16.1.79      master079   <none>           <none>
kube-proxy-w4h57                           1/1     Running   4 (22h ago)     37h     172.16.1.82      worker082   <none>           <none>
kube-scheduler-master078                   1/1     Running   30              38h     172.16.1.78      master078   <none>           <none>
kube-scheduler-master079                   1/1     Running   24              38h     172.16.1.79      master079   <none>           <none>
kube-scheduler-master083                   1/1     Running   18 (127m ago)   38h     172.16.1.83      master083   <none>           <none>
rbd-provisioner-579d59bb7b-ssd8b           1/1     Running   6               3h45m   10.244.129.207   worker081   <none>           <none>

7.7. Troubleshooting summary (you can skip this and go directly to 7.8)

7.7.1. If the provisioner does not run on the first attempt, go to the corresponding node and pull the image manually (in my case, the worker081 node)

[root@worker081 ~]# docker pull quay.io/external_storage/rbd-provisioner:latest

7.7.2. After that, it is best to upgrade the ceph client (ceph-common) inside the provisioner container, because the bundled version is too old.

# Enter the container
[root@master083 ~]# kubectl exec -it rbd-provisioner-579d59bb7b-ssd8b -c rbd-provisioner -n kube-system -- sh
# Check the version; it is too old to be compatible
sh-4.2# ceph -v
# Update the container's yum sources
sh-4.2# curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
# Configure the ceph repo
sh-4.2# cat >>/etc/yum.repos.d/ceph.repo<< eof
[ceph]
name=ceph
baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/x86_64/
gpgcheck=0
priority=1
enable=1
[ceph-noarch]
name=cephnoarch
baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch/
gpgcheck=0
priority=1
enable=1
[ceph-source]
name=Ceph source packages
baseurl=https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS/
gpgcheck=0
priority=1
enable=1
eof
# Update
sh-4.2# yum -y update
# List the installable versions
sh-4.2# yum list ceph-common --showduplicates | sort -r
ceph-common.x86_64                     2:14.2.9-0.el7                      ceph 
ceph-common.x86_64                     2:14.2.8-0.el7                      ceph 
ceph-common.x86_64                     2:14.2.7-0.el7                      ceph 
ceph-common.x86_64                     2:14.2.6-0.el7                      ceph 
ceph-common.x86_64                     2:14.2.5-0.el7                      ceph 
ceph-common.x86_64                     2:14.2.4-0.el7                      ceph 
ceph-common.x86_64                     2:14.2.3-0.el7                      ceph 
ceph-common.x86_64                     2:14.2.22-0.el7                     ceph 
ceph-common.x86_64                     2:14.2.22-0.el7                     @ceph
ceph-common.x86_64                     2:14.2.21-0.el7                     ceph 
ceph-common.x86_64                     2:14.2.20-0.el7                     ceph 
ceph-common.x86_64                     2:14.2.2-0.el7                      ceph 
..................                     ...............                     ....
# Install a newer version
sh-4.2# yum install -y ceph-common-14.2.21-0.el7
sh-4.2# ceph -v

# On the worker081 server, inspect the running Docker container and commit a new image (9fb54e49f9bf is the ID of the running container, not of the image; docker images and docker ps will help you find the image behind the running container)
[root@worker081 ~]# sudo docker commit -m "update ceph-common 14.2.22" -a "morik" 9fb54e49f9bf provisioner/ceph:14.2.22
[root@worker081 ~]# docker save -o ceph_provisioner_14.2.22.tar.gz provisioner/ceph:14.2.22
# On 083 and then 081, delete the Deployment and the old Docker image
[root@master083 ~]# kubectl delete deployment rbd-provisioner -n kube-system
[root@worker081 ~]# docker rmi -f 9fb54e49f9bf
# Copy the newly generated ceph_provisioner_14.2.22.tar.gz image to each worker node and load it
[root@worker080 ~]# docker load -i /home/ceph_provisioner_14.2.22.tar.gz
[root@worker082 ~]# docker load -i /home/ceph_provisioner_14.2.22.tar.gz

7.8. Modify the image name in provisioner-ceph.yaml (you can use the Docker image provided with this article, or generate one yourself following 7.7)

7.9. Test pvc (test-ceph.yaml)

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-sc-claim
  namespace: dev
spec:
  storageClassName: ceph-rbd
  accessModes:
    - ReadWriteOnce
    - ReadOnlyMany
  resources:
    requests:
      storage: 500Mi

8. Apply the test and check the result: usage goes from an initial 12K to 38K after the image is formatted (a quick mount test is sketched after the output below)

[root@master083 ~]# kubectl apply -f test-ceph.yaml
[root@master083 ~]# kubectl get pvc -n dev
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-sc-claim   Bound    pvc-406871be-069e-4ac1-84c9-ccc1589fd880   500Mi      RWO,ROX        ceph-rbd       4h35m
[root@master083 ~]# kubectl get pv -n dev
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
pvc-406871be-069e-4ac1-84c9-ccc1589fd880   500Mi      RWO,ROX        Retain           Bound    dev/ceph-sc-claim   ceph-rbd                4h35m
[root@master083 ~]# 
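To see the claim actually mounted and written to, a minimal test pod can use ceph-sc-claim; a hedged sketch (the pod name and mount path are made up for illustration):

[root@master083 ~]# cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ceph-pvc-test
  namespace: dev
spec:
  containers:
  - name: app
    image: busybox:1.28
    command: ["sh", "-c", "echo hello-ceph > /data/test.txt && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: ceph-sc-claim
EOF
[root@master083 ~]# kubectl exec -n dev ceph-pvc-test -- cat /data/test.txt
[root@master083 ~]# kubectl delete pod ceph-pvc-test -n dev   # clean up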
