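Preface:
One way to deploy a kubernetes cluster is with kubeadm. A cluster deployed this way used to resemble the result of a one-click shell install script: the components were tightly coupled, and components such as etcd and kube-apiserver could not be made highly available. For that reason, before 2018 the kubernetes project neither recommended nor supported kubeadm for production use; it was only a tool for quickly standing up test clusters.
After 2018, through continued work by the project and the community, kubeadm became able to decouple the components and so make the key components of a kubernetes cluster highly available. Since then, kubeadm and binary deployment have been the two mainstream deployment methods.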
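From today's perspective I still regard binary deployment as the first choice, because in a kubeadm deployment like the one described here the two secondary key components, kube-controller-manager and kube-scheduler, each run as a single instance on the one control-plane node and are therefore not highly available. That said, kubeadm remains a perfectly usable tool for small and medium-sized kubernetes clusters.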
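When you deploy a cluster with kubeadm and specify nothing else, kubeadm creates etcd as a static pod inside the cluster, so etcd runs as a single instance. Such a cluster is clearly unsuitable for real production work. This article therefore shows how to decouple etcd from kubeadm and deploy a kubernetes cluster directly against an external, highly available etcd cluster.
Note 1: the core components of a kubernetes cluster are, roughly, kube-controller-manager, kube-apiserver, kube-scheduler, etcd and kube-proxy. To keep a production environment stable and reliable, and to withstand heavy traffic, etcd and kube-apiserver should, and really must, run in a highly available cluster mode.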
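Note 2: a static pod is a pod that the kubelet starts directly from a manifest file in its staticPodPath (/etc/kubernetes/manifests in this setup) rather than through the API server. kubeadm writes these manifests for the control-plane components when the cluster is initialized, and the pods normally need no manual intervention. In the listing below, kube-apiserver-master, kube-controller-manager-master and kube-scheduler-master are static pods; coredns and kube-proxy are addons, managed by a Deployment and a DaemonSet respectively.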
[root@master bin]# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-7f6cbbb7b8-4vrj7 1/1 Running 0 42m
kube-system coredns-7f6cbbb7b8-9x2kg 1/1 Running 0 42m
kube-system kube-apiserver-master 1/1 Running 0 43m
kube-system kube-controller-manager-master 1/1 Running 0 43m
kube-system kube-proxy-kvq2z 1/1 Running 0 41m
kube-system kube-proxy-rwtbz 1/1 Running 0 42m
kube-system kube-proxy-wp2ft 1/1 Running 0 42m
kube-system kube-scheduler-master 1/1 Running 0 43m
OK, let's now look at how to use an external etcd cluster with kubeadm.
1. Environment
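| Server IP | Server spec | Deployed components | OS version | OS kernel | Base environment and cluster versions |
|---|---|---|---|---|---|
| 192.168.217.19 | CPU: 2c, 2 cores; memory: 4G; disk: 100G; VMware virtual machine | docker; etcd cluster deployed from binaries; kubelet and kubeadm installed via yum | CentOS Linux release 7.4.1708 (Core) | Linux master 5.16.9-1.el7.elrepo.x86_64 | NTP time server; passwordless SSH between servers; firewall off; SELinux off; local yum repository; kubernetes version: 1.22.2; docker version: ce 20.10.5; etcd version: etcd-v3.4.9-linux-amd64 |
| 192.168.217.20 | CPU: 2c, 2 cores; memory: 4G; disk: 100G; VMware virtual machine | docker; etcd cluster deployed from binaries; kubelet and kubeadm installed via yum | CentOS Linux release 7.4.1708 (Core) | Linux master 5.16.9-1.el7.elrepo.x86_64 | NTP time server; passwordless SSH between servers; firewall off; SELinux off; local yum repository; kubernetes version: 1.22.2; docker version: ce 20.10.5; etcd version: etcd-v3.4.9-linux-amd64 |
| 192.168.217.21 | CPU: 2c, 2 cores; memory: 4G; disk: 100G; VMware virtual machine | docker; etcd cluster deployed from binaries; kubelet and kubeadm installed via yum | CentOS Linux release 7.4.1708 (Core) | Linux master 5.16.9-1.el7.elrepo.x86_64 | NTP time server; passwordless SSH between servers; firewall off; SELinux off; local yum repository; kubernetes version: 1.22.2; docker version: ce 20.10.5; etcd version: etcd-v3.4.9-linux-amd64 |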
Hostname definitions:
[root@master ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.217.19 master k8s-master
192.168.217.20 node1 k8s-node1
192.168.217.21 node2 k8s-node2
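For installing the base cluster environment, see my blog post:
Cloud native | kubernetes | Deploying a cluster with kubeadm in under five minutes (fully offline, for the whole CentOS 7 series), 晚风_END's blog on CSDN
2. Binary deployment of the etcd cluster
CentOS 7: fast offline deployment of an etcd cluster with an ansible playbook, 晚风_END's blog on CSDN
To stress one point again: for ansible_offlie.tar.gz, change into the extracted directory and run rpm -ivh *; for everything else, follow the blog post above.
ansible-deployment-etcd-3.3.tar corresponds to etcd-v3.3.13-linux-amd64.tar
ansible-deployment-etcd-3.4.tar corresponds to etcd-v3.4.9-linux-amd64.tar. To keep things simple, just place the etcd package in the root user's home directory.
Final test once the etcd cluster has been deployed: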
[root@node2 ~]# ETCDCTL_API=3 /opt/etcd/bin/etcdctl --endpoints=https://192.168.217.19:2379,https://192.168.217.20:2379,https://192.168.217.21:2379 --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem member list -w table
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
| 97c1c1003e0d4bf | started | etcd-1 | https://192.168.217.19:2380 | https://192.168.217.19:2379 | false |
| ef2fee107aafca91 | started | etcd-2 | https://192.168.217.20:2380 | https://192.168.217.20:2379 | false |
| f5b8cb45a0dcf520 | started | etcd-3 | https://192.168.217.21:2380 | https://192.168.217.21:2379 | false |
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
[root@node2 ~]# etcd_search member list -w table
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
| 97c1c1003e0d4bf | started | etcd-1 | https://192.168.217.19:2380 | https://192.168.217.19:2379 | false |
| ef2fee107aafca91 | started | etcd-2 | https://192.168.217.20:2380 | https://192.168.217.20:2379 | false |
| f5b8cb45a0dcf520 | started | etcd-3 | https://192.168.217.21:2380 | https://192.168.217.21:2379 | false |
+------------------+---------+--------+-----------------------------+-----------------------------+------------+
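The etcd_search command in the second listing is not defined anywhere in this article; presumably it is a shell alias or function that wraps the long etcdctl invocation from the first listing. A minimal sketch of such a wrapper follows. The function body and the ETCDCTL_BIN override variable are assumptions made for illustration, not the author's actual definition.

```shell
# Hypothetical wrapper: the name etcd_search matches the command used above,
# but its real definition is not shown in the article.
# ETCDCTL_BIN is an assumed override so the function can be dry-run.
etcd_search() {
  local bin="${ETCDCTL_BIN:-/opt/etcd/bin/etcdctl}"
  local endpoints="https://192.168.217.19:2379,https://192.168.217.20:2379,https://192.168.217.21:2379"
  # Pass every argument through to etcdctl after the fixed TLS options.
  ETCDCTL_API=3 "$bin" \
    --endpoints="$endpoints" \
    --cacert=/opt/etcd/ssl/ca.pem \
    --cert=/opt/etcd/ssl/server.pem \
    --key=/opt/etcd/ssl/server-key.pem \
    "$@"
}
```

With this in place, etcd_search member list -w table expands to the full command shown in the first listing, and etcd_search endpoint health can be used the same way.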
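3. Deploying the kubernetes cluster:
Initialize the cluster from a config file (run on server 19):
Values you need to change:
advertiseAddress: 192.168.217.19 change this to the IP address of the node that will be your master
name: master change this to the hostname of the node that will be your master
hostnameOverride: "k8s-master" change this to the master node's hostname as defined in your hosts file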
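dns: {} means the default is used, which is CoreDNS. The older v1beta2 config API allowed this to be written out explicitly as:
dns:
  type: CoreDNS
The v1beta3 API used in this article no longer has the type field, so leave it as dns: {}.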
List as many endpoints as you have etcd nodes, all in the same format; the etcd certificate paths are dealt with later:
    - https://192.168.217.19:2379
    - https://192.168.217.20:2379
    - https://192.168.217.21:2379
    caFile: /etc/kubernetes/pki/etcd/ca.pem
    certFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client.pem
    keyFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client-key.pem
[root@master ~]# ls -al /opt/etcd/ssl/
total 16
drwxr-xr-x 2 root root 78 Oct 21 15:21 .
drwxr-xr-x 5 root root 39 Oct 21 15:21 ..
-rw-r--r-- 1 root root 1675 Oct 21 15:21 ca-key.pem
-rw-r--r-- 1 root root 1265 Oct 21 15:21 ca.pem
-rw-r--r-- 1 root root 1679 Oct 21 15:21 server-key.pem
-rw-r--r-- 1 root root 1338 Oct 21 15:21 server.pem
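Since the endpoints list needs exactly one entry per etcd member, it can be convenient to generate it rather than type it. The following is only a sketch: the function name etcd_endpoints_yaml and the four-space indent are illustrative choices, and port 2379 is the client port used throughout this article.

```shell
# Print one YAML list item per etcd member IP given as an argument.
# The function name and the indent width are illustrative assumptions.
etcd_endpoints_yaml() {
  local ip
  for ip in "$@"; do
    printf '    - https://%s:2379\n' "$ip"
  done
}

# Example: the three members used in this article.
etcd_endpoints_yaml 192.168.217.19 192.168.217.20 192.168.217.21
```

The output can be pasted under etcd.external.endpoints in the init manifest.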
The complete init manifest:
[root@master ~]# cat kubeadm-init.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: "0"
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.217.19
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  external:
    endpoints: # addresses of your own etcd cluster
    - https://192.168.217.19:2379
    - https://192.168.217.20:2379
    - https://192.168.217.21:2379
    caFile: /etc/kubernetes/pki/etcd/ca.pem
    certFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client.pem
    keyFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client-key.pem
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.22.2
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "10.96.0.0/12"
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
bindAddressHardFail: false
clientConnection:
  acceptContentTypes: ""
  burst: 0
  contentType: ""
  kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
  qps: 0
clusterCIDR: "10.244.0.0/16"
configSyncPeriod: 0s
conntrack:
  maxPerCore: null
  min: null
  tcpCloseWaitTimeout: null
  tcpEstablishedTimeout: null
detectLocalMode: ""
enableProfiling: false
healthzBindAddress: ""
hostnameOverride: "k8s-master"
iptables:
  masqueradeAll: false
  masqueradeBit: null
  minSyncPeriod: 0s
  syncPeriod: 0s
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: ""
  strictARP: false
  syncPeriod: 0s
  tcpFinTimeout: 0s
  tcpTimeout: 0s
  udpTimeout: 0s
kind: KubeProxyConfiguration
metricsBindAddress: ""
mode: ""
nodePortAddresses: null
oomScoreAdj: null
portRange: ""
showHiddenMetricsForVersion: ""
udpIdleTimeout: 0s
winkernel:
  enableDSR: false
  networkName: ""
  sourceVip: ""
---
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging: {}
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
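Handling the external etcd cluster's certificates:
(Create the directory on all three nodes; after copying the files on the master node, scp them to worker nodes 20 and 21.)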
mkdir -p /etc/kubernetes/pki/etcd/
cp /opt/etcd/ssl/ca.pem /etc/kubernetes/pki/etcd/
cp /opt/etcd/ssl/server.pem /etc/kubernetes/pki/etcd/apiserver-etcd-client.pem
cp /opt/etcd/ssl/server-key.pem /etc/kubernetes/pki/etcd/apiserver-etcd-client-key.pem
scp /etc/kubernetes/pki/etcd/* node1:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/* node2:/etc/kubernetes/pki/etcd/
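Before running kubeadm init it is worth confirming that the three files referenced by caFile, certFile and keyFile in the manifest really exist at the expected paths. A minimal sketch of such a check follows; the function name check_etcd_client_certs is made up for illustration, and the default directory is the one used in the steps above.

```shell
# Return 0 only if all three etcd client files exist and are non-empty.
# check_etcd_client_certs is a hypothetical helper, not part of kubeadm.
check_etcd_client_certs() {
  local dir="${1:-/etc/kubernetes/pki/etcd}"
  local f missing=0
  for f in ca.pem apiserver-etcd-client.pem apiserver-etcd-client-key.pem; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (on each node, once the files have been copied):
#   check_etcd_client_certs && echo "etcd client certs in place"
```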
Initialize using the config manifest:
kubeadm init --config=kubeadm-init.yaml
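The output of this command is as follows; key excerpts are explained first, and the complete log comes after them.
These lines show that the external etcd cluster is in use, in kubeadm's own words "External etcd mode". The etcd-related certificates are therefore not generated; their generation is skipped.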
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
These commands set up the kubeconfig file for kubectl so that kubectl commands can be used conveniently:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
This is the command for joining worker nodes to the cluster; copy it and run it on each worker node:
kubeadm join 192.168.217.19:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:8245100f694c48a3171fd473fc2b9a1c6696394c89ff4ac902d4fde95c4740f1
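The two network-related addons, CoreDNS and kube-proxy, have been deployed into the cluster (as regular workloads managed through the API server, not as static pods):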
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master] and IPs [10.96.0.1 192.168.217.19]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 16.008300 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.217.19:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:8245100f694c48a3171fd473fc2b9a1c6696394c89ff4ac902d4fde95c4740f1
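The join command printed at the end of the log embeds a SHA-256 hash of the cluster CA's public key. If that output is lost, the hash can be recomputed on the master from /etc/kubernetes/pki/ca.crt with the openssl pipeline given in the kubeadm documentation. The sketch below wraps that pipeline in a function; the name ca_cert_hash is an illustrative choice, and an RSA CA key (the kubeadm default) is assumed.

```shell
# Print the value for --discovery-token-ca-cert-hash for a given CA certificate.
# ca_cert_hash is a hypothetical helper name; an RSA CA key is assumed.
ca_cert_hash() {
  local crt="${1:-/etc/kubernetes/pki/ca.crt}"
  local digest
  digest="$(openssl x509 -pubkey -noout -in "$crt" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //')"
  # Fail instead of printing a bare prefix if the certificate could not be read.
  [ -n "$digest" ] || return 1
  printf 'sha256:%s\n' "$digest"
}

# Usage on the master node:
#   ca_cert_hash /etc/kubernetes/pki/ca.crt
```

Because the bootstrap token in the manifest has ttl "0" and so does not expire, the token from the original join command can still be combined with the recomputed hash. Alternatively, kubeadm token create --print-join-command prints a complete new join command.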
4. Joining the worker nodes (run on both servers 20 and 21):
[root@node1 ~]# kubeadm join 192.168.217.19:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:8245100f694c48a3171fd473fc2b9a1c6696394c89ff4ac902d4fde95c4740f1
[preflight] Running pre-flight checks
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
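Once the command has been run on both servers, you can use kubectl to view the nodes and pods. The nodes will show the status NotReady until a network plugin is installed; any one of flannel, calico, weave, canal and so on will do.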
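To see at a glance which nodes are still waiting for the network plugin, the output of kubectl get nodes can be filtered. This is a small sketch; the function name not_ready_nodes is made up for illustration, and it relies only on the NAME and STATUS columns being the first two columns of the output.

```shell
# Read `kubectl get nodes` output on stdin and print the names of all nodes
# whose STATUS column is not exactly "Ready" (so "Ready,SchedulingDisabled"
# is also reported). not_ready_nodes is a hypothetical helper name.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage:
#   kubectl get nodes | not_ready_nodes
```

An empty result after the network plugin has been installed means every node has reached Ready.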
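5. Functional test
Installing the network plugin is omitted here, since this article is about using an external etcd cluster. Assume flannel has been installed correctly, and check the overall state of the cluster:
The members of the external etcd cluster are visible from inside the cluster: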
[root@master bin]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Unhealthy Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager Healthy ok
etcd-1 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
etcd-0 Healthy {"health":"true"}
[root@master bin]# kubectl cluster-info
Kubernetes control plane is running at https://192.168.217.19:6443
CoreDNS is running at https://192.168.217.19:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
All component pods are running normally, and there is no etcd pod among the static pods:
[root@master bin]# kubectl get po,svc -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-7f6cbbb7b8-4vrj7 1/1 Running 0 171m
kube-system pod/coredns-7f6cbbb7b8-9x2kg 1/1 Running 0 171m
kube-system pod/kube-apiserver-master 1/1 Running 0 171m
kube-system pod/kube-controller-manager-master 1/1 Running 0 171m
kube-system pod/kube-flannel-ds-c8d2t 1/1 Running 0 168m
kube-system pod/kube-flannel-ds-cxvxs 1/1 Running 0 168m
kube-system pod/kube-flannel-ds-v5s85 1/1 Running 0 168m
kube-system pod/kube-proxy-kvq2z 1/1 Running 0 170m
kube-system pod/kube-proxy-rwtbz 1/1 Running 0 171m
kube-system pod/kube-proxy-wp2ft 1/1 Running 0 171m
kube-system pod/kube-scheduler-master 1/1 Running 0 171m
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 171m
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 171m
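The same fact can be checked on the master's filesystem: with a kubeadm-managed (stacked) etcd, kubeadm writes an etcd.yaml static pod manifest next to the control-plane manifests, while in external mode that file is absent and the kube-apiserver manifest points at the external endpoints through its --etcd-servers flag. The following sketch tests both conditions; the function name uses_external_etcd is an illustrative choice.

```shell
# Succeed if the manifest directory has no etcd static pod manifest and the
# kube-apiserver manifest carries an --etcd-servers flag.
# uses_external_etcd is a hypothetical helper name.
uses_external_etcd() {
  local dir="${1:-/etc/kubernetes/manifests}"
  [ ! -e "$dir/etcd.yaml" ] \
    && grep -q -- '--etcd-servers=' "$dir/kube-apiserver.yaml" 2>/dev/null
}

# Usage on the master node:
#   uses_external_etcd && grep -- '--etcd-servers=' /etc/kubernetes/manifests/kube-apiserver.yaml
```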
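The DNS test passes and the pod is created normally, which shows that etcd is working correctly and that the deployment against the external etcd cluster has succeeded: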
kubectl run -it --image busybox:1.28.3 dns-test --restart=Never --rm
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
/ # nslookup kubernetes.default
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
/ # nslookup baidu.com
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: baidu.com
Address 1: 39.156.66.10
Address 2: 110.242.68.66