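Rancher Cluster Installation

1. What is Rancher

Rancher is a Kubernetes management tool for deploying and running clusters anywhere and on any provider.

Rancher can provision Kubernetes from a hosted provider, provision compute nodes and then install Kubernetes onto them, or import existing Kubernetes clusters running anywhere.

Rancher adds significant value on top of Kubernetes, first by centralizing authentication and role-based access control (RBAC) for all clusters, so that global administrators can control cluster access from one location.

It then enables detailed monitoring and alerting for clusters and their resources, ships logs to external providers, and integrates directly with Helm through the application catalog. If you have an external CI/CD system, you can plug it into Rancher; if you do not, Rancher includes Fleet to help you automatically deploy and upgrade workloads.

Rancher is a complete Kubernetes container management platform that gives you the tools to run Kubernetes successfully anywhere.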

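2. Why choose Rancher

There are Kubernetes management UIs developed in China, such as Kuboard, but they do not address the complexity that adopting Kubernetes brings, for example CI/CD that integrates with the cluster management platform, multi-cluster management, and account and permission management.

After a good deal of searching and comparing, only three candidates remained: Red Hat OpenShift Kubernetes Engine (OpenShift), VMware Tanzu, and SUSE Rancher. Of these, only Rancher can be used for free, so the choice made itself.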

3. Deploying the Rancher server

Rancher can be deployed in several ways:

  1. AWS (uses Terraform)
  2. AWS Marketplace (uses Amazon EKS)
  3. Azure (uses Terraform)
  4. DigitalOcean (uses Terraform)
  5. GCP (uses Terraform)
  6. Hetzner Cloud (uses Terraform)
  7. Vagrant
  8. Equinix Metal
  9. Outscale (uses Terraform)

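Here we deploy manually.

3.1. Requirements

Prepare two servers running an Ubuntu LTS release. The Rancher management nodes run on a k3s cluster here; the k3s resource requirements are as follows: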

Deployment size   Managed clusters   Managed nodes   k3s vCPUs   k3s RAM   k3s database
Small             up to 150          up to 1500      2           8 GB      2 cores, 4 GB + 1000 IOPS
Medium            up to 300          up to 3000      4           16 GB     2 cores, 4 GB + 1000 IOPS
Large             up to 500          up to 5000      8           32 GB     2 cores, 4 GB + 1000 IOPS
X-Large           up to 1000         up to 10,000    16          64 GB     2 cores, 4 GB + 1000 IOPS
XX-Large          up to 2000         up to 20,000    32          128 GB    2 cores, 4 GB + 1000 IOPS
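
As a rough illustration of how the sizing table can be read, the sketch below picks the smallest row whose limits cover a given number of clusters and nodes. The function name rancher_size is hypothetical and the figures are simply copied from the table; treat it as a reading aid, not an official sizing tool.

```shell
# Pick the smallest deployment size whose limits cover the given
# number of managed clusters and nodes.
# Usage: rancher_size <clusters> <nodes>
rancher_size() {
  local clusters="$1" nodes="$2"
  local name max_clusters max_nodes vcpus ram_gb
  # Each row: name:max clusters:max nodes:k3s vCPUs:k3s RAM in GB
  while IFS=: read -r name max_clusters max_nodes vcpus ram_gb; do
    if [ "$clusters" -le "$max_clusters" ] && [ "$nodes" -le "$max_nodes" ]; then
      echo "$name: $vcpus vCPUs, $ram_gb GB RAM"
      return 0
    fi
  done <<'EOF'
Small:150:1500:2:8
Medium:300:3000:4:16
Large:500:5000:8:32
X-Large:1000:10000:16:64
XX-Large:2000:20000:32:128
EOF
  echo "exceeds the largest size in the table" >&2
  return 1
}
```

For example, rancher_size 200 1000 selects the Medium row, because 200 clusters is above the Small limit even though 1000 nodes would fit.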

3.2. Preparation: disable the firewall

systemctl disable ufw
systemctl stop ufw
ufw reset
iptables -P INPUT ACCEPT
iptables -P OUTPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -F

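3.3. Installing k3s

Install k3s, which hosts the Rancher management cluster, on both servers. Run the command on the first node and wait until it succeeds and the k3s service is up, then run it on the second node. Because both nodes use the same token and the same PostgreSQL datastore, the second node joins automatically and the cluster is formed. For the version, pick the release just before the latest stable one: I tried the latest stable release and it could not be downloaded through the China mirror.

Also, although k3s 1.27 has already been released, the highest k3s version Rancher supports at the time of writing is still the 1.26 series, so a 1.26 release is used here.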
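
The version constraint above can be checked before running the installer. The following is a minimal sketch: the function name and the MAX_MINOR variable are hypothetical, the limit of 26 comes from the paragraph above, and the parsing assumes a version string of the form v1.<minor>.<patch>+k3s<n>.

```shell
# Highest supported k3s minor release (1.26 according to the text above).
MAX_MINOR="${MAX_MINOR:-26}"

# Succeed when the minor version of a string such as "v1.26.5+k3s1"
# is at or below MAX_MINOR. Assumes the major version is 1.
k3s_version_supported() {
  local version rest minor
  version="${1#v}"        # strip the leading "v"  -> 1.26.5+k3s1
  rest="${version#*.}"    # drop the major part    -> 26.5+k3s1
  minor="${rest%%.*}"     # keep the minor part    -> 26
  [ "$minor" -le "$MAX_MINOR" ]
}
```

A call such as k3s_version_supported "v1.27.3+k3s1" || echo "too new for Rancher" can guard the install command.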

3.3.1. Installing K3s with the script from the China mirror

curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn INSTALL_K3S_VERSION="v1.26.5+k3s1" K3S_DATASTORE_ENDPOINT='postgres://k3s:<password>@<postgres-host>:5432/k3s?sslmode=disable' sh -s - server --token=k3stoken --tls-san 192.168.6.247 --tls-san 192.168.6.248 --tls-san rancher.myexample.com

The command prints:

[sudo] password for ubuntu: 
[INFO]  Using v1.26.5+k3s1 as release
[INFO]  Downloading hash rancher-mirror.rancher.cn/k3s/v1.26.5-k3s1/sha256sum-amd64.txt
[INFO]  Downloading binary rancher-mirror.rancher.cn/k3s/v1.26.5-k3s1/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping installation of SELinux RPM
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Creating /usr/local/bin/ctr symlink to k3s
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s
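
Judging from the download lines in the log above, the mirror path uses the version string with "+" replaced by "-". Under that assumption, the sketch below builds the URL of a file on the mirror so that its availability can be checked before installing. The function name k3s_mirror_url is hypothetical, and the https scheme is assumed from the installer URL.

```shell
# Build the mirror URL for a file of a given k3s release.
# The "+" in the version becomes "-" in the path, as seen in the log.
# Usage: k3s_mirror_url <version> <file>
k3s_mirror_url() {
  local version="$1" file="$2"
  local path_version
  path_version=$(printf '%s' "$version" | tr '+' '-')
  echo "https://rancher-mirror.rancher.cn/k3s/${path_version}/${file}"
}

# Example (not executed here): succeed only if the checksum file exists.
#   curl -fsIL "$(k3s_mirror_url 'v1.26.5+k3s1' sha256sum-amd64.txt)" >/dev/null
```

If the check fails for a version, the mirror has probably not synced it yet, and either an older release or the offline method is the fallback.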

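3.3.2. Installing K3s offline

Because the China mirror lags behind upstream and does not yet carry the newest patch releases, the offline installation method is described here as well.

3.3.2.1. Create the local directory and place the offline image package in it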

sudo mkdir -p /var/lib/rancher/k3s/agent/images/
sudo cp ./k3s-airgap-images-amd64.tar.gz /var/lib/rancher/k3s/agent/images/

3.3.2.2. Set up the k3s binary

Download the k3s binary for the matching version from https://github.com/k3s-io/k3s/releases, place it in /usr/local/bin, and make it executable:

chmod +x /usr/local/bin/k3s

3.3.2.3. Set up the install.sh script

Download the install script from https://get.k3s.io, save it as install.sh, and make it executable:

chmod +x install.sh

3.3.2.4. Run the installation with the script

INSTALL_K3S_SKIP_DOWNLOAD=true INSTALL_K3S_EXEC='server --token=k3stoken --tls-san 192.168.6.247 --tls-san 192.168.6.248 --tls-san rancher.myexample.com' \
K3S_DATASTORE_ENDPOINT='postgres://k3s:<password>@<postgres-host>:5432/k3s?sslmode=disable' \
./install.sh
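
Before running the script with INSTALL_K3S_SKIP_DOWNLOAD=true, it may help to confirm that the three pieces from the previous steps are in place. This is a minimal sketch: the function name and the K3S_BIN, K3S_IMAGES_DIR and INSTALL_SCRIPT variables are hypothetical, and their defaults are the paths used above.

```shell
# Check that the files the offline install expects are present.
# Returns 0 when everything is in place, 1 otherwise.
k3s_offline_preflight() {
  local bin="${K3S_BIN:-/usr/local/bin/k3s}"
  local images_dir="${K3S_IMAGES_DIR:-/var/lib/rancher/k3s/agent/images}"
  local script="${INSTALL_SCRIPT:-./install.sh}"
  local status=0

  [ -x "$bin" ] || { echo "missing or not executable: $bin" >&2; status=1; }
  [ -x "$script" ] || { echo "missing or not executable: $script" >&2; status=1; }
  # At least one airgap image archive must be in the images directory.
  ls "$images_dir"/k3s-airgap-images-*.tar* >/dev/null 2>&1 \
    || { echo "no airgap image archive in $images_dir" >&2; status=1; }
  return "$status"
}
```

Running k3s_offline_preflight && ./install.sh style chaining is one way to use it, with the environment variables from the command above still set for the install step.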

3.4. Configure access to the k3s cluster on one of the nodes

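# Copy the k3s kubeconfig into the ubuntu user's home directory.
sudo mkdir -p /home/ubuntu/.kube
sudo cp /etc/rancher/k3s/k3s.yaml /home/ubuntu/.kube/config
# Hand the whole directory to the ubuntu user, not only the config file.
sudo chown -R ubuntu:ubuntu /home/ubuntu/.kube
chmod 600 /home/ubuntu/.kube/config
# Use the kubeconfig in this shell and in future logins.
export KUBECONFIG=/home/ubuntu/.kube/config
echo "export KUBECONFIG=~/.kube/config" >> /home/ubuntu/.bashrc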

3.5. Installing helm

See https://github.com/helm/helm/releases
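
The releases page links to prebuilt tarballs hosted on get.helm.sh. As a sketch of the manual install, the helper below builds the tarball URL for Linux. The function name helm_tarball_url is hypothetical, and v3.12.1 in the comments is only an example version, not one prescribed by this article.

```shell
# Build the download URL of a Helm release tarball for Linux.
# Usage: helm_tarball_url <version> [arch]   (arch defaults to amd64)
helm_tarball_url() {
  local version="$1" arch="${2:-amd64}"
  echo "https://get.helm.sh/helm-${version}-linux-${arch}.tar.gz"
}

# Example (not executed here):
#   curl -fsSLO "$(helm_tarball_url v3.12.1)"
#   tar -xzf helm-v3.12.1-linux-amd64.tar.gz
#   sudo mv linux-amd64/helm /usr/local/bin/helm
#   helm version
```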

3.6. Add the Rancher helm repository

helm repo add rancher-latest https://releases.rancher.com/server-charts/latest

3.7. Create the namespace

kubectl create namespace cattle-system

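3.8. Install the cert-manager CustomResourceDefinitions

If the server has trouble reaching GitHub, download the file elsewhere, upload it to the server, and replace the https URL with the file's relative path. The version must match the cert-manager version installed with helm below.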

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.2/cert-manager.crds.yaml

Output of a successful run:

customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
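
Since the CRD manifest and the chart installed in section 3.10 must be the same version, one option is to derive both from a single value. A minimal sketch, with the hypothetical helper name cert_manager_crds_url and the v1.12.2 version taken from the command above:

```shell
# Build the CRD manifest URL for a given cert-manager version.
# Usage: cert_manager_crds_url <version>
cert_manager_crds_url() {
  local version="$1"
  echo "https://github.com/cert-manager/cert-manager/releases/download/${version}/cert-manager.crds.yaml"
}

# Example (not executed here): one variable drives both steps.
#   CERT_MANAGER_VERSION=v1.12.2
#   kubectl apply -f "$(cert_manager_crds_url "$CERT_MANAGER_VERSION")"
#   helm install cert-manager jetstack/cert-manager \
#     --namespace cert-manager --create-namespace \
#     --version "$CERT_MANAGER_VERSION"
```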

3.9. Add the cert-manager helm repository

helm repo add jetstack https://charts.jetstack.io
helm repo update

3.10. Install cert-manager

Pick the latest stable version according to the official documentation or the open-source repository.

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.12.2

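Output of a successful install (the sample below reports v1.11.4, apparently because it was captured from a run with an older chart version; with --version v1.12.2 that line shows v1.12.2):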

NAME: cert-manager
LAST DEPLOYED: Tue Jul 11 06:32:41 2023
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.11.4 has been deployed successfully!

In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).

More information on the different types of issuers and how to configure them
can be found in our documentation:

https://cert-manager.io/docs/configuration/

For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the `ingress-shim`
documentation:

https://cert-manager.io/docs/usage/ingress/

3.11. Install rancher

helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.myexample.com \
  --set replicas=1 \
  --set bootstrapPassword=admin

Output of a successful install:

NAME: rancher
LAST DEPLOYED: Mon Apr 10 03:01:09 2023
NAMESPACE: cattle-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Rancher Server has been installed.

NOTE: Rancher may take several minutes to fully initialize. Please standby while Certificates are being issued, Containers are started and the Ingress rule comes up.

Check out our docs at https://rancher.com/docs/

If you provided your own bootstrap password during installation, browse to https://rancher.myexample.com to get started.

If this is the first time you installed Rancher, get started by running this command and clicking the URL it generates:

echo https://rancher.myexample.com/dashboard/?setup=$(kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}')

To get just the bootstrap password on its own, run:

kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}{{ "\n" }}'

Happy Containering!

After a successful install, wait roughly 2 to 3 minutes, then open the Rancher service at https://rancher.myexample.com/dashboard/?setup=admin
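
Instead of waiting a fixed 2 to 3 minutes, the readiness check can be polled. The sketch below is a generic retry helper; the name wait_until and the attempt and delay values in the example are hypothetical choices, not part of Rancher.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    # Sleep only when another attempt is still to come.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example (not executed here): poll the Rancher URL for up to about 5 minutes.
# -k skips certificate verification, since the certificate is not publicly trusted.
#   wait_until 30 10 curl -ksf -o /dev/null https://rancher.myexample.com/
```

The same helper works with any command that exits non-zero until the service is ready.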

4. Common service check commands

4.1. Check that the cert-manager services are running

ubuntu@24:~$ kubectl get po -n cert-manager
NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-cainjector-56bbdd5c47-2gtq5   1/1     Running   0          96s
cert-manager-64f9f45d6f-8qxn6              1/1     Running   0          96s
cert-manager-webhook-d4f4545d7-cxnhf       1/1     Running   0          96s

4.2. View a service's logs

ubuntu@24:~$ kubectl -n cert-manager logs cert-manager-webhook-d4f4545d7-cxnhf
I0711 07:13:13.364269       1 feature_gate.go:249] feature gates: &{map[]}
W0711 07:13:13.364365       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0711 07:13:13.370407       1 webhook.go:129] cert-manager "msg"="using dynamic certificate generating using CA stored in Secret resource" "secret_name"="cert-manager-webhook-ca" "secret_namespace"="cert-manager"
I0711 07:13:13.370665       1 server.go:133] cert-manager/webhook "msg"="listening for insecure healthz connections" "address"=":6080"
I0711 07:13:13.370731       1 server.go:197] cert-manager/webhook "msg"="listening for secure connections" "address"=":10250"
I0711 07:13:14.376507       1 dynamic_source.go:266] cert-manager/webhook "msg"="Updated cert-manager webhook TLS certificate" "DNSNames"=["cert-manager-webhook","cert-manager-webhook.cert-manager","cert-manager-webhook.cert-manager.svc"]

4.3. View the pods in the cattle-system namespace

ubuntu@24:~$ kubectl get pods --namespace cattle-system
NAME                               READY   STATUS              RESTARTS   AGE
helm-operation-ggq22               2/2     Running             0          53s
helm-operation-mxqjg               0/2     Completed           0          2m52s
helm-operation-sftw8               0/2     Completed           0          74s
helm-operation-wbd4s               0/2     Completed           0          110s
rancher-7c5dbf46fc-8fb5v           1/1     Running             0          4m51s
rancher-7c5dbf46fc-l92kc           1/1     Running             0          4m50s
rancher-7c5dbf46fc-wmx8h           1/1     Running             0          4m50s
rancher-webhook-577b778f8f-9wzr5   0/1     ContainerCreating   0          9s

View the pods in all namespaces:

ubuntu@247:~$ kubectl get pods --all-namespaces
NAMESPACE                   NAME                                       READY   STATUS      RESTARTS   AGE
kube-system                 local-path-provisioner-76d776f6f9-lql4s    1/1     Running     0          110m
kube-system                 svclb-traefik-23008fd2-l67lp               2/2     Running     0          109m
kube-system                 helm-install-traefik-crd-lfjs8             0/1     Completed   0          110m
kube-system                 helm-install-traefik-brj48                 0/1     Completed   1          110m
kube-system                 coredns-59b4f5bbd5-ddvjp                   1/1     Running     0          110m
kube-system                 traefik-57c84cf78d-fhxlh                   1/1     Running     0          109m
kube-system                 metrics-server-68cf49699b-zmrqr            1/1     Running     0          110m
cert-manager                cert-manager-cainjector-7f47598f9b-rvlwj   1/1     Running     0          23m
cert-manager                cert-manager-55b858df44-52ls9              1/1     Running     0          23m
cert-manager                cert-manager-webhook-7d694cd764-n5vhc      1/1     Running     0          23m
cattle-system               rancher-7769775dfb-gcghz                   1/1     Running     0          22m
cattle-fleet-system         gitjob-85b85d5df8-n74sp                    1/1     Running     0          19m
cattle-fleet-system         fleet-controller-775cd6657c-zxfq2          1/1     Running     0          19m
cattle-system               helm-operation-mrzg8                       0/2     Completed   0          19m
cattle-system               helm-operation-7nqmg                       0/2     Completed   0          18m
cattle-system               rancher-webhook-788c48b988-82j77           1/1     Running     0          18m
cattle-system               helm-operation-h82mk                       0/2     Completed   0          18m
cattle-system               helm-operation-sfxnn                       0/2     Completed   0          17m
kube-system                 svclb-traefik-23008fd2-59psn               2/2     Running     0          16m
cattle-fleet-local-system   fleet-agent-7f8d499f-4m4fc                 1/1     Running     0          9m49s

4.4. View the Rancher deployment

ubuntu@24:~$ kubectl -n cattle-system get deploy rancher
NAME      READY   UP-TO-DATE   AVAILABLE   AGE
rancher   1/1     1            1           5m42s
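
When a namespace has many pods, the listing from section 4.3 can be filtered down to the ones that still need attention. A minimal sketch, assuming the single-namespace column layout shown there (NAME, READY, STATUS, ...); the function name pods_not_ready is hypothetical, and it looks only at the STATUS column, so a Running pod whose READY count is incomplete is not reported.

```shell
# Print the names of pods whose STATUS is neither Running nor Completed.
# Reads single-namespace `kubectl get pods` output on stdin.
pods_not_ready() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example (not executed here):
#   kubectl get pods --namespace cattle-system | pods_not_ready
```

With the sample output from section 4.3, only the rancher-webhook pod in ContainerCreating would be printed.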

5. Uninstalling k3s

5.1. Run the uninstall scripts

sudo -s
/usr/local/bin/k3s-uninstall.sh
/usr/local/bin/k3s-agent-uninstall.sh

5.2. Delete the related files and directories

rm -rf /etc/ceph /etc/cni /etc/kubernetes /etc/rancher \
  /opt/cni /opt/rke \
  /run/secrets/kubernetes.io /run/calico /run/flannel \
  /var/lib/calico /var/lib/etcd /var/lib/cni /var/lib/kubelet \
  /var/lib/rancher /var/lib/longhorn \
  /var/log/containers /var/log/kube-audit /var/log/pods \
  /var/run/calico
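
Because rm -rf is irreversible, it can be worth listing which of these paths actually exist on the node first. A small dry-run sketch; the function name existing_paths is hypothetical and it deletes nothing.

```shell
# Print the paths from the argument list that exist, one per line.
# Usage: existing_paths <path> [path...]
existing_paths() {
  local p
  for p in "$@"; do
    [ -e "$p" ] && echo "$p"
  done
  return 0
}

# Example (not executed here):
#   existing_paths /etc/cni /etc/rancher /var/lib/rancher /var/lib/kubelet
```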

5.3. Remove the corresponding data from the database

DROP TABLE public.kine;

Reposted from blog.csdn.net/xieshaohu/article/details/131661880