Using kubeadm to build k8s v1.18.2 (with analysis of all the errors hit along the way)

Building k8s
Use kubeadm to build a single-master cluster; the version installed here is v1.18.2.

System environment

(The original post shows screenshots of the OS release and kernel version here; the yum repos below assume an el7 / CentOS 7 system.)

Set the hostname on each of the three machines with the following commands:

 hostnamectl set-hostname k8s-master
 hostnamectl set-hostname k8s-node1
 hostnamectl set-hostname k8s-node2

Add the following entries to /etc/hosts on all three machines (substitute your own machine IPs):
172.20.0.15 k8s-master
172.20.0.12 k8s-node1
172.20.0.43 k8s-node2
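
For example, the entries can be appended in one shot (a small convenience sketch, not in the original; adjust the IPs to your machines):

cat >> /etc/hosts <<EOF
172.20.0.15 k8s-master
172.20.0.12 k8s-node1
172.20.0.43 k8s-node2
EOF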

Installation details

The following all runs on the three nodes

  • Disable firewall
 systemctl stop firewalld
 systemctl disable firewalld
  • Disable SELinux

setenforce 0

Then make sure /etc/selinux/config contains:

SELINUX=disabled
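
setenforce 0 only lasts until the next reboot; one way to persist the change (a sketch, assuming the default SELINUX=enforcing line is present) is:

sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config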
  • Install docker
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum -y install docker-ce-18.09.9
  • Install Kubeadm
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
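
A quick sanity check (not in the original) that the new repo is now visible:

yum repolist enabled | grep -i kubernetes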
  • Install kubeadm, kubelet and kubectl;
    --disableexcludes=kubernetes disables any exclude rules for the kubernetes repo so these packages can be installed
yum install -y kubelet-1.18.2 kubeadm-1.18.2 kubectl-1.18.2 --disableexcludes=kubernetes
kubeadm version
  • Start docker
systemctl start docker
  • Change docker's cgroup driver to systemd
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
systemctl enable docker.service
systemctl daemon-reload
systemctl restart docker
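
To confirm the cgroup driver change took effect (a quick check, not in the original):

docker info | grep -i cgroup
# expected output: Cgroup Driver: systemd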
  • Change the firewall FORWARD chain policy. The line
    ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
    goes into /usr/lib/systemd/system/docker.service; add it by executing the following command:
sed -i '20i ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT' /usr/lib/systemd/system/docker.service
systemctl daemon-reload
systemctl restart docker

Check after the change:

iptables -L

(the FORWARD chain policy should now show ACCEPT)
I'm deliberately leaving some pits here by skipping a few necessary settings; if you'd rather avoid them, scroll straight down to the "Avoiding the pits" installation section.

Planting the mines

The following is executed on the master

  • Pull the images in advance; start docker first. You could also change the configuration file to a domestic mirror source (e.g. Aliyun's), but here I take the slow road: pull the images one by one and retag each.
    ps: the commands below use v1.16.2 because kubeadm and kubelet were at v1.16.2 during my first installation, so the images pulled are 1.16.2 as well;
  • List the images that need to be pulled
 kubeadm config images list

k8s.gcr.io/kube-apiserver:v1.16.2
k8s.gcr.io/kube-controller-manager:v1.16.2
k8s.gcr.io/kube-scheduler:v1.16.2
k8s.gcr.io/kube-proxy:v1.16.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.15-0
k8s.gcr.io/coredns:1.6.2
Search for each of them to find a download path on a domestic mirror source, then start pulling:

docker pull aiotceo/kube-apiserver:v1.16.2
docker pull aiotceo/kube-scheduler:v1.16.2
docker pull aiotceo/kube-controller-manager:v1.16.2
docker pull aiotceo/kube-proxy:v1.16.2
docker pull tangxu/etcd:3.3.15-0
docker pull aiotceo/coredns:1.6.2
docker pull aiotceo/pause:3.1

docker tag aiotceo/kube-apiserver:v1.16.2 k8s.gcr.io/kube-apiserver:v1.16.2
docker tag aiotceo/kube-scheduler:v1.16.2 k8s.gcr.io/kube-scheduler:v1.16.2
docker tag aiotceo/kube-controller-manager:v1.16.2 k8s.gcr.io/kube-controller-manager:v1.16.2
docker tag aiotceo/kube-proxy:v1.16.2 k8s.gcr.io/kube-proxy:v1.16.2
docker tag tangxu/etcd:3.3.15-0 k8s.gcr.io/etcd:3.3.15-0
docker tag aiotceo/coredns:1.6.2 k8s.gcr.io/coredns:1.6.2
docker tag aiotceo/pause:3.1 k8s.gcr.io/pause:3.1
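
Pulling and retagging one by one is tedious; here is a loop sketch (not from the original) that does the same via the Aliyun mirror mentioned later in this post, assuming that mirror hosts all the listed images under the same names:

for img in $(kubeadm config images list 2>/dev/null); do
  name=${img#k8s.gcr.io/}   # e.g. kube-apiserver:v1.16.2
  mirror="registry.cn-hangzhou.aliyuncs.com/google_containers/${name}"
  docker pull "$mirror" && docker tag "$mirror" "$img"
done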

Initialize the cluster, version v1.16.2; the version is chosen to match kubeadm and the other components, since some lower versions report errors as unsupported. Set the pod network segment and the master's IP. There are two ways to initialize the cluster; this is performed on the master.
The first: start with command-line parameters (you can also use a configuration file):

kubeadm init --kubernetes-version=v1.16.2 --pod-network-cidr=172.20.0.0/16 --apiserver-advertise-address=172.20.0.15

The second: generate and use a configuration file:

kubeadm config print init-defaults > xxxx.yaml
kubeadm init --config xxxx.yaml

The automatically generated configuration file looks like this:

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: hdp0kg.ab86i3ms07muvkax # recommended to change; lowercase, format: 6 lowercase alphanumeric characters + "." + 16 lowercase alphanumeric characters
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.20.0.15 # change to the API address, i.e. the master IP
  bindPort: 6443 # change the API port if needed
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master # picked up automatically
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io # the default foreign registry; if you don't want to retag images one by one and can't reach it, switch to the Aliyun mirror source: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.16.2 # the domestic mirrors listed above have no image for this version; if you switch to the Aliyun mirror source, v1.18.2 is the better choice
networking:
  dnsDomain: cluster.k8stest # optional to change
  serviceSubnet: 10.10.0.0/16
  podSubnet: 10.18.0.0/16 # pod subnet; set it to your own needs and avoid overlap with the subnet above
scheduler: {}

Then initialize:

kubeadm init --kubernetes-version=v1.16.2 --pod-network-cidr=172.10.0.0/16 --apiserver-advertise-address=172.10.0.15

Error Collection Analysis

  • If you see the following warning, docker's cgroup driver is not set (or not set correctly). docker and kubelet must use the same cgroup driver, and systemd is the recommended one;
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
  • The following error means the port is occupied (just stop the occupying service; mine was left over from an earlier failed start). A way to find the occupier is shown after the error;
[ERROR Port-10250]: Port 10250 is in use
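
To find which process holds the port (a quick check, not in the original; 10250 is the kubelet's port, so it is usually a leftover kubelet):

ss -lntp | grep 10250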
  • This error is caused by kubelet and docker using different cgroup drivers; see the surrounding steps for how to change it
 failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

Change docker's cgroup driver back to systemd by editing
"exec-opts": ["native.cgroupdriver=systemd"]
in /etc/docker/daemon.json, or change kubelet's cgroup driver in
/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
The following snippet is from the auto-install script I wrote (uploaded to the CSDN resources), which changes kubelet's cgroup driver; the variable KUBESTART holds the desired driver:

 counts=`grep -ic "KUBELET_CGROUP_ARGS=--cgroup-driver=" /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf`
if [[ "$counts" != "0" ]] ; then
 sed -i "s/--cgroup-driver=.*$/--cgroup-driver=$KUBESTART/g" /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
else
 sed -i "5i Environment=\"KUBELET_CGROUP_ARGS=--cgroup-driver=$KUBESTART\"" /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
fi

To change it manually, edit the file above and add the following line:
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
Then, as above, reload and restart:

systemctl enable kubelet
systemctl daemon-reload
systemctl restart docker
  • The following kind of error is a timeout pulling images from the foreign registry, which is why the earlier pre-pulling step is necessary. If you did that and still get the error, check the tag names and whether the images were actually pulled (a quick check follows the error message)
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-controller-manager:v1.16.2: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
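
For example, to verify that the retagged images are present locally (not in the original):

docker images | grep k8s.gcr.io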
  • The error below means the environment variable is not configured or has not taken effect
[root@k8s-master logs]# kubectl get nodes
error: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

The solution is as follows

echo "export KUBECONFIG=/etc/kubernetes/admin.conf" > /etc/profile.d/kubeconfig.sh
source /etc/profile.d/kubeconfig.sh


  • The error below means swap has not been disabled. For a permanent fix, edit /etc/fstab and remove the swap mount (be careful with this file: a mistake can break the partition setup and prevent booting)
W0819 14:00:48.618646    3736 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.2
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR Swap]: running with swap on is not supported. Please disable swap
	[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
swapoff -a    # temporary change: disables swap immediately
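
A permanent variant (a sketch; edit /etc/fstab with the care warned about above) that comments out the swap line so it is not mounted at boot:

sed -ri 's/^([^#].*[[:space:]]swap[[:space:]].*)$/#\1/' /etc/fstab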

OK, try again. The console output is below. At this point, check the kubelet startup log and the system log for the specific cause and fix what they point at. It is usually a wrong kubelet configuration file or domain-name setting, or an incompatible version together with the ipvs/firewall setup. Generally, once kubelet starts normally, the k8s cluster can be initialized normally. (The DirAvailable--var-lib-etcd error above means a previous run left data in /var/lib/etcd; kubeadm reset, used further below, clears it.)

[init] Using Kubernetes version: v1.16.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.20.0.15]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [172.20.0.15 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [172.20.0.15 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.

Try again

kubeadm init --kubernetes-version=v1.16.2 --pod-network-cidr=172.10.0.0/16 --apiserver-advertise-address=172.10.0.15 --control-plane-endpoint="172.10.0.15:6443"

The log is as follows; these are leftovers from the previous run, and you must reset before re-initializing:

[init] Using Kubernetes version: v1.16.2
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
	[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
	[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
	[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
kubeadm reset 
kubeadm init --kubernetes-version=v1.16.2 --pod-network-cidr=172.10.0.0/16 --apiserver-advertise-address=172.10.0.15 --control-plane-endpoint="172.10.0.15:6443"

If the kubelet fails to come up, check its logs:

journalctl -u kubelet
journalctl -xue kubelet

The system log is /var/log/messages. A typical kubelet warning looks like:

k8s-master kubelet: W0710 15:05:16.718248 17552 docker_service.go:563] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"

If the cgroup driver is the culprit, add
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
to /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf as described above, then:

systemctl enable kubelet

Avoiding the pits: installation

Looking at the logs at this point you will find plenty of errors: the pits left earlier now get filled in.

Run the following on all three nodes.

  • Enable bridged traffic to be forwarded and filtered by iptables. Kernel IPv4 forwarding needs the br_netfilter module, so load it. modprobe takes effect immediately but does not survive a reboot, while the configuration file below only takes effect after a reboot, so do both and no restart is needed. First check whether the module is already loaded (see the check below); if it is, the modprobe that follows can be skipped.
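The check itself (standing in for the original screenshot):

lsmod | grep br_netfilter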

modprobe br_netfilter

To make the kernel IPv4 settings permanent, it is recommended not to modify /etc/sysctl.conf directly; a separate file is easier to rework or remove later. Create /etc/sysctl.d/k8s.conf and add the following content:

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1

Apply it:

sysctl -p /etc/sysctl.d/k8s.conf

Next is the network/proxy mode of k8s; here it is ipvs (IP Virtual Server). The log errors just now mean this step had been skipped. At this point only nf_conntrack is loaded, so use a separate file, placed in the path configured below, so that the modules are loaded automatically after the machines restart:

cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF

Make it executable:

chmod 755 /etc/sysconfig/modules/ipvs.modules

Execute once and take effect immediately

bash /etc/sysconfig/modules/ipvs.modules

Check again if it is loaded

 lsmod | grep -e ip_vs -e nf_conntrack

(the output should now list the ip_vs and nf_conntrack modules)

Run the following on the master:

 kubeadm config print init-defaults > kubeadm.yaml
  • The initialization configuration file, kubeadm.yaml;
    for the meaning of each parameter, see the notes in the minefield installation above
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: hdp0kg.ab86i3ms07muvkax
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.20.0.15
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.18.2
networking:
  dnsDomain: cluster.k8stest
  serviceSubnet: 10.10.0.0/12
  podSubnet: 10.18.0.0/16
scheduler: {}
  • Initialize the cluster
 kubeadm init --config kubeadm.yaml
  • Configure the environment variable
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile
  • Test with kubectl get nodes;
    the master node is visible at this point, and the network (CNI) plugin now needs to be installed, here flannel:
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml

This process takes a while; you can watch the details as things come up with the following commands:

kubectl get pods -A -o wide
# find the pod name corresponding to flannel, then describe it:
kubectl describe pods xxxxxx -n kube-system

Common errors include image-pull timeouts; systemd service connection timeouts (check the system error log for the specific message; these are usually caused by an auto-starting service erroring out, in k8s mostly kubelet, which can be stopped manually for the moment before restarting the system); insufficient memory; insufficient disk space; and so on.
The other nodes now join the cluster by executing the following command (the token differs per cluster, so it cannot be copy-pasted verbatim):

kubeadm join 172.20.0.15:6443 --token hdp0kg.ab86i3ms07muvkax --discovery-token-ca-cert-hash sha256:c58c4a30124c7c74140ade1bbe1e460552c20ddee07e79e6a197f660e9617111

If you don't remember the token, execute the following command:

kubeadm token list
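
If the token has already expired (the default TTL is 24h), kubeadm can mint a fresh one and print the complete join command in one step:

kubeadm token create --print-join-command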

Origin blog.csdn.net/qq_38774492/article/details/107223578