Setting Up a Kubernetes (1.12.2) Cluster with Kubeadm

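Kubeadm is the official Kubernetes tool for quickly installing a Kubernetes cluster. It is updated in step with every Kubernetes release and is expected to reach GA in 2018, which means it is getting ever closer to being usable in production.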

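Setting up a Kubernetes cluster with Kubeadm should be simple, but for well-known reasons k8s.gcr.io is not reachable from mainland China, so the official tutorial cannot be followed as written. Chinese-language tutorials vary in quality and most of them do not run successfully. I hit a lot of pitfalls before my deployment worked, so I am sharing the process here.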

Preparation

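Several machines running Ubuntu 16.04+, CentOS 7, or HypriotOS v1.0.1+.
At least 2 GB of RAM and 2 CPUs per machine.
Working network connectivity between all machines in the cluster.
The required ports opened; see: Check required ports.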

Disable the firewall and SELinux.

# Stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld

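# Disable SELinux for the running system
setenforce 0

# Make it permanent: edit /etc/selinux/config and set SELINUX=disabled
vim /etc/selinux/config
SELINUX=disabled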
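
The manual edit above can also be scripted. The following is a minimal sketch, assuming a CentOS-style /etc/selinux/config; the helper name set_selinux_disabled and the sample file are made up for illustration, and the demo works on a temporary file rather than the real one:

```shell
# set_selinux_disabled FILE: print FILE with the SELINUX= line set to disabled.
# (Hypothetical helper name; SELINUXTYPE= and comment lines are left untouched.)
set_selinux_disabled() {
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Demo on a throwaway sample instead of the real /etc/selinux/config.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# This file controls the state of SELinux on the system.
SELINUX=enforcing
SELINUXTYPE=targeted
EOF

set_selinux_disabled "$sample"
rm -f "$sample"
```

On a real host with GNU sed, the in-place equivalent would be sudo sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config.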

Turn off swap. Since Kubernetes 1.8, swap must be disabled; if it is left on, kubelet will fail to start under the default configuration.

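# To turn off swap:
# edit `/etc/fstab` and comment out the lines that reference `swap` so it stays off after a reboot, then disable it for the running system with:
sudo swapoff -a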
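
Commenting out the swap entries can be done with sed instead of by hand. This is only a sketch: the helper name comment_out_swap and the sample fstab content are invented for illustration, and the demo prints the result for a temporary file instead of modifying /etc/fstab:

```shell
# comment_out_swap FILE: print FILE with every active swap entry commented out.
# (Hypothetical helper; lines that are already comments are left alone.)
comment_out_swap() {
    sed -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Made-up fstab content, used only for the demo.
sample=$(mktemp)
cat > "$sample" <<'EOF'
UUID=1111-2222 /        ext4 defaults 0 1
/swapfile      none     swap sw       0 0
EOF

comment_out_swap "$sample"
rm -f "$sample"
```

Reviewing the printed result before writing it back to /etc/fstab is a cheap safeguard, since a broken fstab can prevent the machine from booting.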

Verify that the MAC address and product_uuid are unique.

Kubernetes requires every machine in the cluster to have a distinct MAC address, product UUID, and hostname. You can check them with the following commands:
```
# UUID
cat /sys/class/dmi/id/product_uuid

# MAC address
ip link

# Hostname
cat /etc/hostname
```
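
With more than a couple of machines, comparing these values by eye gets error-prone (cloned VMs are a common source of duplicates). A small sketch of an automated check follows; the helper name find_duplicates and the UUID values are made up, and it assumes the values have already been collected as "HOST VALUE" lines:

```shell
# find_duplicates: read "HOST VALUE" lines on stdin and print each VALUE
# that is reported by more than one host (no output means all are unique).
find_duplicates() {
    awk '{ print $2 }' | sort | uniq -d
}

# Sample data standing in for product_uuid values gathered from two hosts,
# for example via: ssh HOST cat /sys/class/dmi/id/product_uuid
find_duplicates <<'EOF'
ubuntu1 4C4C4544-0000-1010-8000-AAAAAAAAAAAA
ubuntu2 4C4C4544-0000-1010-8000-BBBBBBBBBBBB
EOF
```

The same helper works for MAC addresses or hostnames, as long as the value is the second column.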

This example uses two Ubuntu 18.04 hosts:

cat /etc/hosts

192.168.0.8 ubuntu1
192.168.0.7 ubuntu2

Installing Docker

Since version 1.6, Kubernetes has used the CRI (Container Runtime Interface). The default container runtime is still Docker, which is supported through the dockershim CRI implementation built into kubelet.

For installing Docker, see my earlier post: Docker初体验.

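Note that Kubernetes 1.12 has been validated against Docker 1.11.1, 1.12.1, 1.13.1, 17.03, 17.06, 17.09, and 18.06. The minimum supported Docker version is 1.11.1 and the maximum is 18.06, while the latest Docker release is already 18.09, so we need to pin the version to 18.06.1-ce when installing: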
sudo apt install docker-ce=18.06.1~ce~3-0~ubuntu

Installing kubeadm, kubelet, and kubectl

Before deploying, we need to install the following three packages:

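kubeadm: the command-line tool that bootstraps the k8s cluster.
kubelet: the core component that runs on every node in the cluster and performs operations such as starting pods and containers.
kubectl: the command-line tool for operating the cluster.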

First, add the apt key:

sudo apt update && sudo apt install -y apt-transport-https curl
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -

Add the Kubernetes repository:

sudo vim /etc/apt/sources.list.d/kubernetes.list

deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main

Install:

sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

Creating a Single-Master Cluster with kubeadm

Initializing the Master Node

The k8s control plane components run on the master node, including etcd and the API server (kubectl talks to k8s through the API server).

Before running the initialization, there are three things to keep in mind:

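Choose a network plugin and check whether it requires any parameters when the master is initialized; for example, depending on the plugin we may need to set --pod-network-cidr. See: Installing a pod network add-on.
kubeadm uses the default network interface (the one associated with the default gateway, typically eth0 with a private IP) as the master node's advertise address. To use a different interface, set --apiserver-advertise-address=<ip-address>. With IPv6, an IPv6 address must be used, for example --apiserver-advertise-address=fd00::101.
Because of network conditions in China, it is advisable to run kubeadm config images pull beforehand to pull the images needed for initialization and to check connectivity to the gcr.io registries.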

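Obviously, gcr.io is not reachable from China. In the previous post, 使用kubeadm搭建Kubernetes(1.10.2)集群（国内环境）, I worked around this by re-tagging images; this time we will do it by modifying the configuration file.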

kubeadm v1.11+ adds a kubeadm config print-default command, which makes it easy to dump kubeadm's default configuration to a file:

kubeadm config print-default > kubeadm.conf

Next, change the image repository address in kubeadm.conf:

sed -i "s/imageRepository: .*/imageRepository: registry.aliyuncs.com\/google_containers/g" kubeadm.conf

Pin the version we want, so that initialization does not try to read it from https://dl.k8s.io/release/stable-1.12.txt. This can be set with the following command:

sed -i "s/kubernetesVersion: .*/kubernetesVersion: v1.12.2/g" kubeadm.conf

Now we can run kubeadm's images pull command, pointing it at kubeadm.conf with the --config flag:

kubeadm config images pull --config kubeadm.conf

W1103 06:10:18.782958   23149 common.go:105] WARNING: Detected resource kinds that may not apply: [InitConfiguration MasterConfiguration JoinConfiguration NodeConfiguration]
[config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1alpha3, Kind=JoinConfiguration
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.12.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.12.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.12.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.12.2
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.1
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.2.24
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:1.2.2

As you can see, the required images were pulled successfully.

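However, there is one more pitfall: the pull address of the pause base image has to be set separately. Otherwise it is still pulled from k8s.gcr.io, which makes init hang and eventually fail: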

[init] this might take a minute or longer if the control plane images have to be pulled

Unfortunately, an error has occurred:
timed out waiting for the condition

This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'

There are two ways to fix this:

The simplest is to tag the image as k8s.gcr.io/pause:3.1:

docker tag registry.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1

Alternatively, set the base image through the nodeRegistration:kubeletExtraArgs:pod-infra-container-image parameter of InitConfiguration in kubeadm.conf (around line 14). After the change it looks like this:

kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    pod-infra-container-image: registry.aliyuncs.com/google_containers/pause:3.1

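Normally we might also pass parameters such as --apiserver-advertise-address and --pod-network-cidr to the init command. Because we are initializing from the kubeadm.conf configuration file, however, other parameters can no longer be given on the command line and must be set in kubeadm.conf instead.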

Below we change the advertiseAddress field in kubeadm.conf, which corresponds to the --apiserver-advertise-address flag. My VM's IP is 192.168.0.8; set it according to your own environment:

sed -i "s/advertiseAddress: .*/advertiseAddress: 192.168.0.8/g" kubeadm.conf

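In this example I use the Canal network plugin, so --pod-network-cidr needs to be 10.244.0.0/16. In the configuration file this is the podSubnet field, changed as follows: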

sed -i "s/podSubnet: .*/podSubnet: \"10.244.0.0\/16\"/g" kubeadm.conf
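
Since a typo in any of these substitutions only surfaces once kubeadm init fails, it can help to apply them in one step and then review the changed fields. The sketch below does that; patch_kubeadm_conf is a hypothetical helper name, and the sample file is an abbreviated, hand-written stand-in for the real print-default output, not a copy of it:

```shell
# patch_kubeadm_conf FILE ADVERTISE_IP: apply the four substitutions from this
# section to FILE. Written without "sed -i" so it also works with non-GNU sed.
patch_kubeadm_conf() {
    sed -e 's#imageRepository: .*#imageRepository: registry.aliyuncs.com/google_containers#' \
        -e 's#kubernetesVersion: .*#kubernetesVersion: v1.12.2#' \
        -e "s#advertiseAddress: .*#advertiseAddress: $2#" \
        -e 's#podSubnet: .*#podSubnet: "10.244.0.0/16"#' \
        "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Abbreviated stand-in for "kubeadm config print-default" output (assumption).
conf=$(mktemp)
cat > "$conf" <<'EOF'
apiEndpoint:
  advertiseAddress: 1.2.3.4
imageRepository: k8s.gcr.io
kubernetesVersion: v1.12.0
networking:
  podSubnet: ""
EOF

patch_kubeadm_conf "$conf" 192.168.0.8

# Print only the fields we changed, to review them before running kubeadm init.
grep -E 'advertiseAddress|imageRepository|kubernetesVersion|podSubnet' "$conf"
rm -f "$conf"
```

Against a real kubeadm.conf, the final grep gives a four-line summary that is easy to compare with the values used in this post.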

Now we can run the initialization command:

sudo kubeadm init --config kubeadm.conf

The output is as follows:

W1109 17:01:47.071494   42929 common.go:105] WARNING: Detected resource kinds that may not apply: [InitConfiguration MasterConfiguration JoinConfiguration NodeConfiguration]
[config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1alpha3, Kind=JoinConfiguration
[init] using Kubernetes version: v1.12.2
[preflight] running pre-flight checks
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[preflight] Activating the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [ubuntu1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.8]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [ubuntu1 localhost] and IPs [127.0.0.1 ::1]
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [ubuntu1 localhost] and IPs [192.168.0.8 127.0.0.1 ::1]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certificates] Generated sa key and public key.
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests" 
[init] this might take a minute or longer if the control plane images have to be pulled
[apiclient] All control plane components are healthy after 57.002438 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.12" in namespace kube-system with the configuration for the kubelets in the cluster
[markmaster] Marking the node ubuntu1 as master by adding the label "node-role.kubernetes.io/master=''"
[markmaster] Marking the node ubuntu1 as master by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "ubuntu1" as an annotation
[bootstraptoken] using token: abcdef.0123456789abcdef
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.0.8:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:67ea537411822fe684d1ddb984802da62a4f22aa1c32fefe7c3404bb8f3f52e0

To operate kubectl as a non-root user, run the following commands, which are also part of the kubeadm init output:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Installing a Network Plugin

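For pods to communicate with each other, we must install a network plugin, and it must be installed before any application is deployed. CoreDNS also will not start until a network plugin is installed.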

For the full list of network plugins, see Networking and Network Policy.

Before installing, let's look at the current status of the pods:

kubectl get pods --all-namespaces

# Output
NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE
kube-system   coredns-5c545769d8-j9vzw          0/1     Pending   0          110s
kube-system   coredns-5c545769d8-wqrlm          0/1     Pending   0          111s
kube-system   etcd-ubuntu1                      1/1     Running   0          75s
kube-system   kube-apiserver-ubuntu1            1/1     Running   0          87s
kube-system   kube-controller-manager-ubuntu1   1/1     Running   0          96s
kube-system   kube-proxy-snhqr                  1/1     Running   0          111s
kube-system   kube-scheduler-ubuntu1            1/1     Running   0          98s

As shown above, CoreDNS is Pending, precisely because we have not installed a network plugin yet.

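I would normally recommend the Calico network plugin, but my VMs are on the 192.168.0.x subnet, which rules out the Calico network here, so I use the Canal plugin instead. Canal is a combination of Calico and Flannel. When running kubeadm init above we already set the pod network to 10.244.0.0/16 (the podSubnet field, equivalent to --pod-network-cidr=10.244.0.0/16), which is what Canal requires.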
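
The subnet conflict can be checked mechanically. The sketch below assumes that the Calico pod network in question is 192.168.0.0/16 (the range used in Calico's v3.3 kubeadm instructions, stated here as an assumption) and tests whether a host IP falls inside a candidate pod CIDR. The helper names ip_to_int and ip_in_cidr are made up for this example, and only IPv4 is handled:

```shell
# ip_to_int A.B.C.D: print the IPv4 address as a single integer.
ip_to_int() {
    a=${1%%.*}; rest=${1#*.}
    b=${rest%%.*}; rest=${rest#*.}
    c=${rest%%.*}; d=${rest#*.}
    echo $(( ($a << 24) + ($b << 16) + ($c << 8) + $d ))
}

# ip_in_cidr IP CIDR: succeed if IP lies inside CIDR (e.g. 10.244.0.0/16).
ip_in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # 4294967295 is 0xFFFFFFFF; keep only the top $bits bits as the netmask.
    mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
    [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
}

# Compare the master's IP from this post with both candidate pod networks.
for cidr in 192.168.0.0/16 10.244.0.0/16; do
    if ip_in_cidr 192.168.0.8 "$cidr"; then
        echo "192.168.0.8 is inside $cidr (conflict)"
    else
        echo "192.168.0.8 is outside $cidr (ok)"
    fi
done
```

A pod network that contains the nodes' own addresses makes routing ambiguous, which is why a non-overlapping range such as 10.244.0.0/16 is the safer choice on this subnet.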

Install the Canal plugin with the following commands:

# Source: https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/rbac.yaml
kubectl apply -f http://mirror.faasx.com/k8s/canal/v3.3/rbac.yaml

# Source: https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/canal.yaml
# The only change is that quay.io has been replaced with a mirror inside China
kubectl apply -f http://mirror.faasx.com/k8s/canal/v3.3/canal.yaml

For more about Canal, see Installing Calico for policy and flannel for networking.

Wait a moment, then run kubectl get pods --all-namespaces again to check on the network plugin installation:

NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE
kube-system   canal-frf6b                       3/3     Running   3          25m
kube-system   coredns-5c545769d8-j9vzw          1/1     Running   2          9h
kube-system   coredns-5c545769d8-wqrlm          1/1     Running   2          9h
kube-system   etcd-ubuntu1                      1/1     Running   1          9h
kube-system   kube-apiserver-ubuntu1            1/1     Running   1          9h
kube-system   kube-controller-manager-ubuntu1   1/1     Running   1          9h
kube-system   kube-proxy-snhqr                  1/1     Running   1          9h
kube-system   kube-scheduler-ubuntu1            1/1     Running   1          9h

As shown above, every STATUS is now Running, which means the installation succeeded. Next we can join other nodes and deploy applications.
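
Rather than scanning the table by eye, the STATUS column can be filtered. This is a small sketch with a made-up helper name (not_running); it assumes the default column layout of kubectl get pods --all-namespaces shown above, and it would also report pods in states such as Completed:

```shell
# not_running: read "kubectl get pods --all-namespaces" output on stdin and
# print every pod whose STATUS column is not Running.
not_running() {
    awk 'NR > 1 && $4 != "Running" { print $1 "/" $2 ": " $4 }'
}

# Demo with a shortened copy of the listing taken before the plugin install.
not_running <<'EOF'
NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE
kube-system   coredns-5c545769d8-j9vzw          0/1     Pending   0          110s
kube-system   etcd-ubuntu1                      1/1     Running   0          75s
EOF
```

On a live cluster the usage would be kubectl get pods --all-namespaces | not_running; empty output means everything is Running.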

Master Isolation

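By default, for security reasons, the cluster does not schedule pods on the master node. In a development environment, however, we may have only a single master node, in which case the following command removes this restriction: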

kubectl taint nodes --all node-role.kubernetes.io/master-

## Output
node/ubuntu1 untainted

Joining Worker Nodes

To add worker nodes to the cluster, do the following on each machine:

SSH into the machine.
Become root (e.g. sudo su -).
Run the command printed by kubeadm init above: kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash>

If we have forgotten the master node's join token, we can list it with:

kubeadm token list

# Output:
# TOKEN                     TTL       EXPIRES                USAGES                   DESCRIPTION   EXTRA GROUPS
# abcdef.0123456789abcdef   22h       2018-11-10T14:24:51Z   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token

By default a token is valid for 24 hours. If our token has expired, we can generate a new one with:

kubeadm token create

# Output:
# 9w6mbu.3k2z7pprl3eaozk9

If we do not have the --discovery-token-ca-cert-hash value either, we can compute it with:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

# Output:
# 9fcb02a0f4ab216866f87986106437b7305474850f0de81b9ac9c36a468f7c67
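
Note that the openssl pipeline prints the bare hex digest, while kubeadm join expects it with a sha256: prefix. A quick shape check before joining can catch that kind of copy-paste mistake. The helper names valid_token and valid_ca_hash below are hypothetical; the patterns only verify the format (six characters, a dot, sixteen characters for the token, and sha256: plus 64 hex digits for the hash), not that the values belong to this cluster:

```shell
# valid_token TOKEN: succeed if TOKEN looks like a bootstrap token.
valid_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# valid_ca_hash HASH: succeed if HASH is "sha256:" plus 64 lowercase hex digits.
valid_ca_hash() {
    printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

# Sample values copied from the outputs in this post, used only for their shape.
token=abcdef.0123456789abcdef
hash=sha256:9fcb02a0f4ab216866f87986106437b7305474850f0de81b9ac9c36a468f7c67

if valid_token "$token" && valid_ca_hash "$hash"; then
    echo "kubeadm join 192.168.0.8:6443 --token $token --discovery-token-ca-cert-hash $hash"
else
    echo "token or hash looks malformed" >&2
fi
```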

Now log in to the worker node and get ready to join the cluster.

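One crucial point remains: the pause base image has to be handled separately here as well, otherwise it is still pulled from k8s.gcr.io. We could modify a configuration file as we did for init, but since this is the only image with a pull problem, we can simply tag it: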

docker pull registry.aliyuncs.com/google_containers/pause:3.1
docker tag registry.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1

Then run the following command to join the cluster:

sudo kubeadm join 192.168.0.8:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:67ea537411822fe684d1ddb984802da62a4f22aa1c32fefe7c3404bb8f3f52e0

The output is as follows:

[preflight] running pre-flight checks
    [WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh] or no builtin kernel ipvs support: map[ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{} ip_vs:{} ip_vs_rr:{}]
you can solve this problem with following methods:
 1. Run 'modprobe -- ' to load missing kernel modules;
2. Provide the missing builtin kernel ipvs support

[discovery] Trying to connect to API Server "192.168.0.8:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.0.8:6443"
[discovery] Requesting info from "https://192.168.0.8:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.0.8:6443"
[discovery] Successfully established connection with API Server "192.168.0.8:6443"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "ubuntu2" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

After a short wait, run kubectl get nodes on the master node to check the node status:

kubectl get nodes

# Output:
# NAME      STATUS   ROLES    AGE     VERSION
# ubuntu1   Ready    master   9h      v1.12.2
# ubuntu2   Ready    <none>   2m24s   v1.12.2

Both nodes are Ready, so the setup is done. We can now run a few commands to test the cluster.
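
Waiting for the nodes can be scripted instead of re-running the command by hand. A minimal sketch, assuming the default kubectl get nodes column layout shown above; all_nodes_ready is a made-up helper name, and note that it also succeeds when the listing contains no nodes at all:

```shell
# all_nodes_ready: read "kubectl get nodes" output on stdin and succeed only
# if every listed node has STATUS exactly "Ready".
all_nodes_ready() {
    awk 'NR > 1 && $2 != "Ready" { bad = 1 } END { exit bad + 0 }'
}

# Demo with the listing from this post.
if all_nodes_ready <<'EOF'
NAME      STATUS   ROLES    AGE     VERSION
ubuntu1   Ready    master   9h      v1.12.2
ubuntu2   Ready    <none>   2m24s   v1.12.2
EOF
then
    echo "all nodes are Ready"
else
    echo "some nodes are not Ready yet"
fi

# On a live cluster this could drive a wait loop, for example:
#   until kubectl get nodes | all_nodes_ready; do sleep 5; done
```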

Testing

First, verify that kube-apiserver, kube-controller-manager, kube-scheduler, and the pod network are working:

# Deploy an Nginx Deployment with two pods
# https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
kubectl create deployment nginx --image=nginx:alpine
kubectl scale deployment nginx --replicas=2

# Verify that the Nginx pods are running and have been assigned pod IPs starting with 10.244.
kubectl get pods -l app=nginx -o wide

# Output:
# NAME                     READY   STATUS    RESTARTS   AGE   IP           NODE      NOMINATED NODE
# nginx-65d5c4f7cc-7pzgp   1/1     Running   0          88s   10.244.1.2   ubuntu2   <none>
# nginx-65d5c4f7cc-l2h26   1/1     Running   0          82s   10.244.1.3   ubuntu2   <none>

Next, verify that kube-proxy is working:

# Expose the service via NodePort https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/
kubectl expose deployment nginx --port=80 --type=NodePort

# Look up the port that is reachable from outside the cluster
kubectl get services nginx

# Output:
# NAME    TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
# nginx   NodePort   10.110.142.125   <none>        80:30092/TCP   7s

# The service can be reached from outside the cluster at any NodeIP:Port; the two nodes in this example are 192.168.0.8 and 192.168.0.7
curl http://192.168.0.8:30092
curl http://192.168.0.7:30092
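
The NodePort is assigned by the cluster, so 30092 will differ from one run to the next. The sketch below pulls it out of the PORT(S) value instead of hard-coding it; node_port is a hypothetical helper name, and it assumes a single port entry in the service:port/protocol form shown above:

```shell
# node_port PORTS: extract the NodePort from a PORT(S) value such as 80:30092/TCP.
node_port() {
    p=${1#*:}        # drop the service port and the colon -> 30092/TCP
    echo "${p%%/*}"  # drop the protocol suffix            -> 30092
}

port=$(node_port "80:30092/TCP")
echo "curl http://192.168.0.8:$port"

# kubectl can also print the value directly, without any text parsing:
#   kubectl get service nginx -o jsonpath='{.spec.ports[0].nodePort}'
```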

Finally, verify that DNS and the pod network are working:

# Run Busybox in interactive mode
kubectl run -it curl --image=radial/busyboxplus:curl

# Run `nslookup nginx` to check that the in-cluster IP resolves correctly, which verifies that DNS is working
[ root@curl-5cc7b478b6-tlf46:/ ]$ nslookup nginx

# Output:
# Server:    10.96.0.10
# Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
# 
# Name:      nginx
# Address 1: 10.110.142.125 nginx.default.svc.cluster.local

# Access the service by name to verify that kube-proxy is working
[ root@curl-5cc7b478b6-tlf46:/ ]$ curl http://nginx/

# Output:
# <!DOCTYPE html> --- (rest omitted)

# Access the internal IPs of the two pods directly to verify cross-node networking
[ root@curl-5cc7b478b6-tlf46:/ ]$ curl http://10.244.1.2/
[ root@curl-5cc7b478b6-tlf46:/ ]$ curl http://10.244.1.3/

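All checks pass and the cluster is up. From here we can follow the official documentation to deploy other services and start having fun with it.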

Tearing Down the Cluster

To undo what kubeadm has set up, first drain the node and make sure it is empty, then shut it down.

On the master node, run:

kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node name>

Then, on the node being removed, reset the kubeadm installation state:

sudo kubeadm reset

If you want to reconfigure the cluster, simply run kubeadm init or kubeadm join again with the new parameters.

References

Creating a single master cluster with kubeadm