Question une
# /opt/kube/bin/kubectl get nodes
E0221 11:11:16.575740 23693 memcache.go:255] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0221 11:11:16.582180 23693 memcache.go:106] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0221 11:11:16.584653 23693 memcache.go:106] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0221 11:11:16.587073 23693 memcache.go:106] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
NAME STATUS ROLES AGE VERSION
master01 Ready,SchedulingDisabled master 49m v1.26.1
master02 Ready,SchedulingDisabled master 49m v1.26.1
master03 Ready,SchedulingDisabled master 49m v1.26.1
worker01 Ready node 46m v1.26.1
worker02 Ready node 46m v1.26.1
worker03 Ready node 46m v1.26.1
résoudre:
修改/etc/kubeasz/clusters/k8s-01/yml/metrics-server.yaml 文件
118 spec:
119 selector:
120 matchLabels:
121 k8s-app: metrics-server
122 strategy:
123 rollingUpdate:
124 maxUnavailable: 0
125 template:
126 metadata:
127 labels:
128 k8s-app: metrics-server
129 spec:
130 hostNetwork: true #####添加本行内容
131 containers:
132 - args:
133 - --cert-dir=/tmp
134 - --secure-port=4443
/opt/kube/bin/kubectl apply -f /etc/kubeasz/clusters/k8s-01/yml/metrics-server.yaml
[root@master01 yml]# /opt/kube/bin/kubectl get nodes
NAME STATUS ROLES AGE VERSION
master01 Ready,SchedulingDisabled master 49m v1.26.1
master02 Ready,SchedulingDisabled master 49m v1.26.1
master03 Ready,SchedulingDisabled master 49m v1.26.1
worker01 Ready node 46m v1.26.1
worker02 Ready node 46m v1.26.1
worker03 Ready node 46m v1.26.1
question deux,
FAILED - RETRYING: 轮询等待calico-node 运行 (15 retries left). FAILED - RETR
helm pre-upgrade hooks failed: timed out waiting for the condition
Journal d'erreurs :
Kubelet Impossible d'attacher ou de monter des volumes - délai d'attente dépassé
Solution : k8s, docker, cilium et de nombreuses autres fonctions et fonctionnalités nécessitent une prise en charge du noyau Linux plus récente, il est donc nécessaire de mettre à niveau le noyau avant le déploiement du cluster
Solution : mise à niveau
Centos7事例:
# 载入公钥
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
# 安装ELRepo
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
# 载入elrepo-kernel元数据
yum --disablerepo=\* --enablerepo=elrepo-kernel repolist
# 查看可用的rpm包
yum --disablerepo=\* --enablerepo=elrepo-kernel list kernel*
# 安装长期支持版本的kernel
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt.x86_64
# 删除旧版本工具包
yum remove kernel-tools-libs.x86_64 kernel-tools.x86_64 -y
# 安装新版本工具包
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt-tools.x86_64
#查看默认启动顺序
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.4.183-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-327.10.1.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-c52097a1078c403da03b8eddeac5080b) 7 (Core)
#默认启动的顺序是从0开始,新内核是从头插入(目前位置在0,而4.4.4的是在1),所以需要选择0。
grub2-set-default 0
#重启并检查
reboot
Question 3. Il indique qu'il n'y a pas de commande kubectl (autres commandes associées)
添加命令的环境变量
# vim /etc/profile
PATH=$PATH:/opt/kube/bin
export PATH
# source /etc/profile
Question 4. Quelques problèmes d'installation du réseau calico
#安装calico网络
如果需要安装calico,请在clusters/xxxx/hosts文件中设置变量 CLUSTER_NETWORK="calico"
在clusters/xxxx/config.yml
# [calico]设置calico 是否使用route reflectors
# 如果集群规模超过50个节点,建议启用该特性
CALICO_RR_ENABLED: true
# CALICO_RR_NODES 配置route reflectors的节点,如果未设置默认使用集群master节点
# CALICO_RR_NODES: ["192.168.1.1", "192.168.1.2"]
CALICO_RR_NODES: ["10.2.1.190", "10.2.1.191", "10.2.1.192", "10.2.1.193", "10.2.1.194", "10.2.1.195"]
错误
TASK [calico : node label] *******************************************************************************************************************************
fatal: [10.2.1.190]: FAILED! => {"changed": true, "cmd": "for ip in 10.2.1.190 10.2.1.191 10.2.1.192 10.2.1.193 10.2.1.194 10.2.1.195 ;do /etc/kubeasz/bin/kubectl label node \"$ip\" route-reflector=true --overwrite; done", "delta": "0:00:00.479707", "end": "2023-02-21 03:47:09.604375", "msg": "non-zero return code", "rc": 1, "start": "2023-02-21 03:47:09.124668", "stderr": "Error from server (NotFound): nodes \"10.2.1.190\" not found\nError from server (NotFound): nodes \"10.2.1.191\" not found\nError from server (NotFound): nodes \"10.2.1.192\" not found\nError from server (NotFound): nodes \"10.2.1.193\" not found\nError from server (NotFound): nodes \"10.2.1.194\" not found\nError from server (NotFound): nodes \"10.2.1.195\" not found", "stderr_lines": ["Error from server (NotFound): nodes \"10.2.1.190\" not found", "Error from server (NotFound): nodes \"10.2.1.191\" not found", "Error from server (NotFound): nodes \"10.2.1.192\" not found", "Error from server (NotFound): nodes \"10.2.1.193\" not found", "Error from server (NotFound): nodes \"10.2.1.194\" not found", "Error from server (NotFound): nodes \"10.2.1.195\" not found"], "stdout": "", "stdout_lines": []}
解决:查看k8s中node的name如果为主机名,需要在RR的nodes中添加主机名
# CALICO_RR_NODES 配置route reflectors的节点,如果未设置默认使用集群master节点
# CALICO_RR_NODES: ["192.168.1.1", "192.168.1.2"]
CALICO_RR_NODES: ["master01", "master02", "master03", "worker01", "worker02", "worker03"]
#执行安装
#dk ezctl setup k8s-01 06
[root@master01 kubeasz]# kubectl get node
NAME STATUS ROLES AGE VERSION
master01 Ready,SchedulingDisabled master 3h48m v1.26.1
master02 Ready,SchedulingDisabled master 3h48m v1.26.1
master03 Ready,SchedulingDisabled master 3h48m v1.26.1
worker01 Ready node 3h45m v1.26.1
worker02 Ready node 3h45m v1.26.1
worker03 Ready node 3h45m v1.26.1
[root@master01 kubeasz]# kubectl get pod -n kube-system -o wide | grep calico
calico-kube-controllers-7bbb6b796b-6cx6m 1/1 Running 1 (3h23m ago) 3h43m 10.2.1.195 worker03 <none> <none>
calico-node-2h65h 1/1 Running 0 3h43m 10.2.1.193 worker01 <none> <none>
calico-node-bxgdz 1/1 Running 0 3h43m 10.2.1.190 master01 <none> <none>
calico-node-d5mgg 1/1 Running 0 3h43m 10.2.1.194 worker02 <none> <none>
calico-node-hh2s4 1/1 Running 0 3h43m 10.2.1.195 worker03 <none> <none>
calico-node-l55rc 1/1 Running 0 3h43m 10.2.1.192 master03 <none> <none>
calico-node-vw9cr 1/1 Running 0 3h43m 10.2.1.191 master02 <none> <none>
[root@master01 kubeasz]# calicoctl get node -o wide
NAME ASN IPV4 IPV6
master01 (64512) 10.2.1.190/23
master02 (64512) 10.2.1.191/23
master03 (64512) 10.2.1.192/23
worker01 (64512) 10.2.1.193/23
worker02 (64512) 10.2.1.194/23
worker03 (64512) 10.2.1.195/23
可以在集群中选择1个或多个节点作为 rr 节点,这里先选择节点:k8s401
calicoctl patch node master01 -p '{"spec": {"bgp": {"routeReflectorClusterID": "244.0.0.1"}}}'
#配置node label
calicoctl patch node master01 -p '{"metadata": {"labels": {"route-reflector": "true"}}}'
[root@master01 BGP]# vim bgp.yml
kind: BGPPeer
apiVersion: projectcalico.org/v3
metadata:
name: peer-with-route-reflectors
spec:
nodeSelector: all()
peerSelector: route-reflector == 'true'
[root@master01 BGP]# calicoctl create -f bgp.yml
Successfully created 1 'BGPPeer' resource(s)
验证增加 rr 之后的bgp 连接情况
dk ansible -i /etc/kubeasz/clusters/k8s-01/hosts all -m shell -a '/opt/kube/bin/calicoctl node status'
依次添加其它的master与worker节点添加到rr
Question 5. Erreur signalée lorsque kubectl get nodes obtient des nœuds
[root@master02 ~]# kubectl get nodes
E0221 14:51:52.209187 17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:51:52.210167 17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:51:52.212104 17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:51:52.213813 17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:51:52.215677 17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Résolution d'erreur :
1) le jeton expire
#kubeadm token generate #生成token
7r3l16.5yzfksso5ty2zzie #下面这条命令中会用到该结果
# kubeadm token create 7r3l16.5yzfksso5ty2zzie --print-join-command --ttl=0 #根据token输出添加命令
W0604 10:35:00.523781 14568 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0604 10:35:00.523827 14568 validation.go:28] Cannot validate kubelet config - no validator is available
kubeadm join 192.168.254.100:6443 --token 7r3l16.5yzfksso5ty2zzie --discovery-token-ca-cert-hash sha256:56281a8be264fa334bb98cac5206aa190527a03180c9f397c253ece41d997e8a
2) Le serveur api k8s est inaccessible
[root@master01 ~]#setenforce 0
[root@master01 ~]#sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
[root@master01 ~]#systemctl disable firewalld --now
3) L'heure du nœud nœud et du nœud maître ne sont pas synchronisées
ntpdate time.ntp.org
4) Les fichiers d'exécution nécessaires ne sont pas créés après l'initialisation de k8s
[root@worker01 ~]# kubectl get nodes
E0221 14:55:32.706287 6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:55:32.706869 6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:55:32.708772 6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:55:32.710750 6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:55:32.712341 6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?
[root@worker01 ~]# cd .kube/
[root@worker01 .kube]# ll -a
total 0
drwxr-xr-x 2 root root 6 Feb 21 10:15 .
dr-xr-x---. 6 root root 222 Feb 21 11:28 ..
####从正常节点(dev)复制kube这个文件到其它节点
[root@worker01 .kube]# ll -a
total 8
drwxr-xr-x 3 root root 33 Feb 21 14:55 .
dr-xr-x---. 6 root root 222 Feb 21 11:28 ..
drwxr-x--- 4 root root 35 Feb 21 14:55 cache
-r-------- 1 root root 6198 Feb 21 14:55 config
[root@worker01 .kube]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master01 Ready,SchedulingDisabled master 4h34m v1.26.1
master02 Ready,SchedulingDisabled master 4h34m v1.26.1
master03 Ready,SchedulingDisabled master 4h34m v1.26.1
worker01 Ready node 4h30m v1.26.1
worker02 Ready node 4h30m v1.26.1
worker03 Ready node 4h30m v1.26.1
Question 5. Après avoir installé le réseau calico, vérifiez qu'il n'y a qu'un seul nœud
[root@master-02 ~]# calicoctl get node -o wide
NAME ASN IPV4 IPV6
localhost.localdomain (64512) 10.2.1.195/23
[root@master-02 ~]# calicoctl node status
Calico process is running.
IPv4 BGP status
No IPv4 peers found.
IPv6 BGP status
No IPv6 peers found.
问题六、Une erreur non critique s'est produite lors de la récupération des ressources : les pods sont interdits
Non-critical error occurred during resource retrieval: namespaces is forbidden: User “system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard” cannot list resource “namespaces” in API group “” at the cluster scope
Résolu : (supprimé, refactorisé)
kubectl delete -f admin-user-sa-rbac.yaml
kubectl delete -f read-user-sa-rbac.yaml
kubectl delete -f kubernetes-dashboard.yaml
kubectl apply -f admin-user-sa-rbac.yaml
kubectl apply -f kubernetes-dashboard.yaml
kubectl apply -f read-user-sa-rbac.yaml
#获取token
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')