Preface
In production environments, high availability (HA) is an unavoidable concern. K3s has gone through many iterations, and its HA approach has been refined along the way into the relatively stable solution available today.
Currently there are two official HA solutions:
- High availability with an embedded DB (experimental)
- High availability with an external database
The embedded DB option is still experimental, so this article does not cover it in detail. For more information, see:
https://rancher.com/docs/k3s/latest/en/installation/ha-embedded/
Achieving high availability with an external database requires that the external database itself be highly available. K3s currently supports SQLite, etcd, MySQL, PostgreSQL, and DQLite as datastores, each suited to different scenarios.
Alibaba Cloud is probably the most widely used public cloud in China. We can build K3s HA on Alibaba Cloud virtual machines and connect them to Alibaba Cloud RDS, which saves the trouble of maintaining a database ourselves. This article uses the well-known MySQL for the walkthrough. The procedure for PostgreSQL is similar, so it is not repeated here.
Architecture diagram
As shown in the architecture diagram, end users access the SLB, which forwards traffic to the two K3s master nodes on the backend. Both K3s master nodes connect to the same external database hosted on RDS.
Create Alibaba Cloud instance
K3s needs at least two server instances to form HA, so create at least two instances on Alibaba Cloud for this demonstration. The examples below use two instances named k3s-master-1 and k3s-master-2.
Configure Alibaba Cloud RDS
1. Create an RDS instance and select MySQL 5.7 as the instance type. This version is officially supported by K3s. You can set the other parameters according to your own needs.
2. Set the whitelist. Its content can be the intranet IPs of your K3s instances. Once this is set, RDS provides an internal network address for database connections: rm-2ze64ke7q33bkq3yt.mysql.rds.aliyuncs.com
3. Create an account of type 普通账号 (standard account), here named ksd.
4. Create a database: set the database name (k3s) and the authorized account (ksd).
Previously, when using a MySQL instance started with Docker, there was no need to create the database in advance, because K3s created it automatically at startup. On Alibaba Cloud RDS, however, the database required by K3s must first be created in the console UI.
5. Modify database parameters
We need to set the innodb_large_prefix parameter to ON, otherwise K3s reports the following error at startup:
Jul 29 20:08:06 iZ2zed0v8rqape974mz8suZ systemd[1]: k3s.service: Service hold-off time over, scheduling restart.
Jul 29 20:08:06 iZ2zed0v8rqape974mz8suZ systemd[1]: k3s.service: Scheduled restart job, restart counter is at 11.
Jul 29 20:08:06 iZ2zed0v8rqape974mz8suZ systemd[1]: Stopped Lightweight Kubernetes.
Jul 29 20:08:06 iZ2zed0v8rqape974mz8suZ systemd[1]: Starting Lightweight Kubernetes...
Jul 29 20:08:07 iZ2zed0v8rqape974mz8suZ k3s[24934]: time="2020-07-29T20:08:07.145963348+08:00" level=info msg="Starting k3s v1.18.6+k3s1 (6f56fa1d)"
Jul 29 20:08:07 iZ2zed0v8rqape974mz8suZ k3s[24934]: time="2020-07-29T20:08:07.159363656+08:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: Error 1071: Specified key was too long; max key length is 767 bytes"
Jul 29 20:08:07 iZ2zed0v8rqape974mz8suZ systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Jul 29 20:08:07 iZ2zed0v8rqape974mz8suZ systemd[1]: k3s.service: Failed with result 'exit-code'.
Jul 29 20:08:07 iZ2zed0v8rqape974mz8suZ systemd[1]: Failed to start Lightweight Kubernetes.
After changing innodb_large_prefix to ON, click [Submit Parameters] in the upper right corner to apply the change.
Once the steps above succeed, the external database required by K3s is ready. Let's start K3s HA.
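Before starting K3s, it can be worth confirming that the parameter change actually took effect. The "max key length is 767 bytes" message in the log above is the InnoDB index prefix limit that applies while innodb_large_prefix is off. The following is a minimal sketch, not part of K3s or RDS tooling: the function name large_prefix_is_on is hypothetical, and it assumes the mysql command-line client is installed on a machine that is in the RDS whitelist.

```shell
# Hypothetical helper (an assumption for illustration): reads the output of
#   SHOW VARIABLES LIKE 'innodb_large_prefix'
# from stdin, in the "name<TAB>value" form printed by `mysql -N -B`,
# and succeeds only if the value is ON (or 1).
large_prefix_is_on() {
    awk -F '\t' '
        $1 == "innodb_large_prefix" { found = 1; value = toupper($2) }
        END { exit !(found && (value == "ON" || value == "1")) }
    '
}

# Example usage (assumes the mysql client is available; prompts for the password):
#   mysql -h rm-2ze64ke7q33bkq3yt.mysql.rds.aliyuncs.com -u ksd -p -N -B \
#       -e "SHOW VARIABLES LIKE 'innodb_large_prefix'" \
#     | large_prefix_is_on && echo "innodb_large_prefix is ON"
```

Parsing is kept separate from the mysql call so that the check can be tried against saved output without touching the database.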
Set up K3s HA
On both k3s-master-1 and k3s-master-2, execute the same command:
curl -sfL https://docs.rancher.cn/k3s/k3s-install.sh | \
INSTALL_K3S_MIRROR=cn \
K3S_DATASTORE_ENDPOINT='mysql://ksd:your_password@tcp(rm-2ze64ke7q33bkq3yt.mysql.rds.aliyuncs.com:3306)/k3s' \
sh -s - server
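The datastore endpoint in the command above follows the pattern mysql://user:password@tcp(host:port)/database. Because a typo in this string is easy to make, here is a small sketch that assembles it from its parts. The function name k3s_mysql_endpoint is hypothetical and is not provided by K3s; the defaults (port 3306, database k3s) simply mirror the values used in this article.

```shell
# Hypothetical helper (an assumption for illustration): prints a K3s MySQL
# datastore endpoint of the form mysql://user:password@tcp(host:port)/database.
# Arguments: user, password, host, then optional port and database name.
k3s_mysql_endpoint() {
    user=$1 password=$2 host=$3 port=${4:-3306} database=${5:-k3s}
    printf 'mysql://%s:%s@tcp(%s:%s)/%s\n' \
        "$user" "$password" "$host" "$port" "$database"
}

# Example usage: export the value so the install script inherits it.
#   export K3S_DATASTORE_ENDPOINT="$(k3s_mysql_endpoint ksd your_password \
#       rm-2ze64ke7q33bkq3yt.mysql.rds.aliyuncs.com)"
```

Note that the sketch does no escaping, so it is only suitable for passwords without characters that are special in this endpoint format.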
If pulling the K3s images on Alibaba Cloud is slow, you can configure a registry mirror, or download the offline package for the corresponding version from http://mirror.cnrancher.com and then import the images by following this link: https://rancher.com/docs/k3s/latest/en/installation/airgap/#prepare-the-images-directory-and-k3s-binary
After a while, the K3s HA environment is up and running:
root@k3s-master-2:~# kubectl get pods -A -o wide
NAMESPACE     NAME                                     READY   STATUS      RESTARTS   AGE   IP          NODE           NOMINATED NODE   READINESS GATES
kube-system   local-path-provisioner-6d59f47c7-tshfx   1/1     Running     0          16m   10.42.0.5   k3s-master-1   <none>           <none>
kube-system   metrics-server-7566d596c8-mrc94          1/1     Running     0          16m   10.42.0.2   k3s-master-1   <none>           <none>
kube-system   coredns-8655855d6-sxn7v                  1/1     Running     0          16m   10.42.0.4   k3s-master-1   <none>           <none>
kube-system   helm-install-traefik-cmmsr               0/1     Completed   2          16m   10.42.0.3   k3s-master-1   <none>           <none>
kube-system   svclb-traefik-z6vlb                      2/2     Running     0          11m   10.42.0.6   k3s-master-1   <none>           <none>
kube-system   svclb-traefik-f89x6                      2/2     Running     0          11m   10.42.1.2   k3s-master-2   <none>           <none>
kube-system   traefik-758cd5fc85-chnbc                 1/1     Running     0          11m   10.42.1.3   k3s-master-2   <none>           <none>
root@k3s-master-2:~#
root@k3s-master-2:~# kubectl get node -o wide
NAME           STATUS   ROLES    AGE   VERSION        INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
k3s-master-1   Ready    master   16m   v1.18.6+k3s1   172.17.207.15   <none>        Ubuntu 18.04.4 LTS   4.15.0-106-generic   containerd://1.3.3-k3s2
k3s-master-2   Ready    master   16m   v1.18.6+k3s1   172.17.207.16   <none>        Ubuntu 18.04.4 LTS   4.15.0-106-generic   containerd://1.3.3-k3s2
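The node listing above can also be checked in a script rather than by eye. The following is a minimal sketch under stated assumptions: the function name all_nodes_ready is hypothetical, it expects the output of kubectl get nodes --no-headers on stdin, and it treats any status other than plain Ready (for example Ready,SchedulingDisabled) as not ready.

```shell
# Hypothetical helper (an assumption for illustration): reads
# `kubectl get nodes --no-headers` output from stdin and succeeds only if
# at least `expected` nodes are listed as Ready and none has another status.
all_nodes_ready() {
    expected=${1:-2}
    awk -v expected="$expected" '
        NF == 0 { next }                    # skip blank lines
        $2 == "Ready" { ready++; next }     # second column is STATUS
        { not_ready++ }
        END { exit !(ready >= expected + 0 && not_ready == 0) }
    '
}

# Example usage on one of the masters:
#   kubectl get nodes --no-headers | all_nodes_ready 2 && echo "both masters are Ready"
```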
Provide unified access through Alibaba Cloud SLB
We now have highly available MySQL and K3s, but we still lack a unified access entry point for the multiple K3s servers. This can be achieved in any of the following ways:
- L4 load balancer
- Round-robin DNS
- VIP or elastic IP
Here we use Alibaba Cloud's SLB as the L4 load balancer and forward port 6443 to the two K3s master nodes on the backend.
Next, copy /etc/rancher/k3s/k3s.yaml from a K3s master node to ~/.kube/config on the local machine, and then change the server address to server: https://39.106.185.201:6443 (the public IP of the SLB).
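The server address can be edited by hand, or rewritten with a short filter. The sketch below is one possible way to do it: the function name set_kubeconfig_server is hypothetical, and it assumes the kubeconfig contains a single server: https://HOST:6443 line, as the file generated by K3s does.

```shell
# Hypothetical helper (an assumption for illustration): reads a kubeconfig on
# stdin and prints it with the host in the `server: https://HOST:6443` line
# replaced by the given address. Everything else is passed through unchanged.
set_kubeconfig_server() {
    new_host=$1
    sed -E 's#^([[:space:]]*server:[[:space:]]*https://)[^:/]+(:6443)#\1'"$new_host"'\2#'
}

# Example usage (assumes SSH access to a master; <master-ip> is a placeholder,
# and this overwrites any existing ~/.kube/config):
#   ssh root@<master-ip> cat /etc/rancher/k3s/k3s.yaml \
#     | set_kubeconfig_server 39.106.185.201 > ~/.kube/config
```

The filter writes to stdout instead of editing in place, which avoids the differences between GNU and BSD sed -i on Linux and macOS.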
Then run kubectl get nodes to test whether traffic is forwarded to the K3s masters through the SLB:
ksd@Hailong-MacBook-Pro ~ kubectl get nodes
Unable to connect to the server: x509: certificate is valid for 10.43.0.1, 127.0.0.1, 172.17.207.15, 172.17.207.16, not 39.106.185.201
This error occurs because the certificate that is automatically created when the K3s master starts is not valid for 39.106.185.201, the public IP of the SLB. To solve this, update the K3s masters by adding the parameter --tls-san 39.106.185.201:
curl -sfL https://docs.rancher.cn/k3s/k3s-install.sh | \
INSTALL_K3S_MIRROR=cn \
K3S_DATASTORE_ENDPOINT='mysql://ksd:your_password@tcp(rm-2ze64ke7q33bkq3yt.mysql.rds.aliyuncs.com:3306)/k3s' \
sh -s - server \
--tls-san 39.106.185.201
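To confirm that the new certificate includes the SLB address, its subject alternative names can be inspected. The following is a sketch under stated assumptions: the function name cert_has_ip_san is hypothetical, it expects the text form printed by openssl x509 -noout -text on stdin, and it assumes the openssl command-line tool is installed locally.

```shell
# Hypothetical helper (an assumption for illustration): reads
# `openssl x509 -noout -text` output on stdin and succeeds if the certificate
# lists the given IP as a subject alternative name ("IP Address:<ip>").
cert_has_ip_san() {
    ip=$1
    # Split the comma-separated SAN list into one entry per line, trim
    # surrounding whitespace, then require an exact match on one entry.
    tr ',' '\n' \
        | sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//' \
        | grep -Fxq "IP Address:${ip}"
}

# Example usage from the local machine:
#   openssl s_client -connect 39.106.185.201:6443 </dev/null 2>/dev/null \
#     | openssl x509 -noout -text \
#     | cert_has_ip_san 39.106.185.201 && echo "SLB IP is in the certificate"
```

Matching whole entries rather than substrings keeps an address such as 172.17.207.1 from being mistaken for 172.17.207.15.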
Finally, go back to the local machine and run kubectl get nodes again. If all goes well, you should see the node information:
ksd@Hailong-MacBook-Pro ~ kubectl get nodes
NAME           STATUS   ROLES    AGE   VERSION
k3s-master-2   Ready    master   65m   v1.18.6+k3s1
k3s-master-1   Ready    master   65m   v1.18.6+k3s1
Postscript
This article only covers how to use Alibaba Cloud's SLB and RDS to achieve K3s HA. The steps on other public clouds are basically the same; they have not been tested in detail here, but they should work in theory. In a non-public-cloud environment, you can choose a suitable datastore and the corresponding HA method according to your own needs.