K8s installation process, part six: etcd cluster installation

1 Introduction to ETCD Basic Information

1.1 Official definition of ETCD

etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. It gracefully handles leader elections during network partitions and can tolerate machine failure, even in the leader node.

1.2 ETCD cluster deployment form

  • Static discovery. When starting the etcd service, list every etcd node up front through the --initial-cluster parameter.
  • etcd dynamic discovery. Use an existing etcd cluster to bootstrap membership of the cluster currently being deployed.
  • DNS dynamic discovery. Discover peers through DNS SRV records.

An etcd cluster of 3 or 5 nodes satisfies most business scenarios, and at that scale static discovery is simpler to implement than the other methods, so this guide deploys the cluster with static discovery.
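To make static discovery concrete, the sketch below composes the --initial-cluster value for a 3-node cluster from the node names and IPs used later in this guide; the variable names are illustrative, not etcd requirements.

```shell
# Illustrative only: build the --initial-cluster value for a 3-node
# static-discovery cluster (names and IPs from section 2.1).
NODES="etcd1=https://192.168.0.233:2380 etcd2=https://192.168.0.200:2380 etcd3=https://192.168.0.145:2380"

# Join the space-separated entries with commas, as etcd expects.
INITIAL_CLUSTER=$(echo "$NODES" | tr ' ' ',')
echo "$INITIAL_CLUSTER"

# Each node would then be started with (sketch, not executed here):
#   etcd --name etcd1 --initial-cluster "$INITIAL_CLUSTER" ...
```

The same comma-separated string appears again later as ETCD_INITIAL_CLUSTER in the configuration file.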

2. Installation preparation

2.1 Machine information

node name | Node IP       | OS version | etcd version
node1     | 192.168.0.233 | CentOS 7.9 | v3.5.4
node2     | 192.168.0.200 | CentOS 7.9 | v3.5.4
node3     | 192.168.0.145 | CentOS 7.9 | v3.5.4

The etcd cluster runs with one leader and multiple followers. Any node in the cluster may become the leader, and leader election uses the Raft consensus algorithm, which requires a candidate to win votes from more than half of the nodes. With an even number of nodes, two candidates can split the vote and the election must be retried, and an even-sized cluster tolerates no more failures than the next smaller odd size. It is therefore recommended to run an odd number of etcd nodes.
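The arithmetic behind the odd-node recommendation can be checked directly; `quorum` and `tolerance` below are illustrative helper names, not etcd tools.

```shell
# A cluster of n members needs floor(n/2)+1 votes (the quorum), so it
# tolerates n - quorum member failures.
quorum()    { echo $(( $1 / 2 + 1 )); }
tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum $n) tolerated-failures=$(tolerance $n)"
done
# A 4-node cluster tolerates only 1 failure, the same as 3 nodes, which is
# why even sizes add cost without adding fault tolerance.
```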

2.2 Environment preparation

Perform the following two steps on all three nodes: create the service account, and install the basic development tools.

2.2.1 Add user group and user

  • Use the root account to perform the following operations
groupadd kube
useradd -d /home/kube -m kube -s /bin/bash -g kube

2.2.2 Install basic development tools

yum groupinstall -y "Development Tools"

Note: the preparation steps must be performed on all three nodes. Because the root user carries too much privilege, deploying the etcd cluster as root is not recommended; the kube user created above will be used to deploy etcd.

3. ETCD cluster deployment

There are two main ways to deploy an etcd cluster: compiling the open-source code, or installing a precompiled release package. Building from source requires setting up a Go development environment and may involve downloading dependency libraries from foreign mirrors, so it is not recommended when network conditions are poor. Section 3.1.1 covers the source build and Section 3.1.2 covers the precompiled package; the method in Section 3.1.2 is recommended.

3.1 Node deployment

Please perform the following steps on the etcd server node, but note that the IP address in the configuration information must correspond to the actual IP address of the node.

3.1.1 etcd source code deployment mode (not recommended)

3.1.1.1 Basic tool preparation

yum install -y git
yum install -y golang
  • Install the git tool
  • Install the golang SDK

3.1.1.2 Get source code and compile

su - kube
cd /opt
git clone -b v3.5.4 https://github.com/etcd-io/etcd.git etcd-v3.5.4
cd etcd-v3.5.4
./build.sh
  • Switch to the kube user (su - also reloads the kube user's environment variables)
  • Change to the /opt directory (note that su - lands in the kube home directory first, so cd must come after it; /opt must be writable by kube)
  • Use git to download the specified version of the etcd source code
  • Enter the source directory
  • Run the build script. The build downloads dependency libraries, so on a slow network connection it can take a while.

3.1.2 etcd precompiled package installation (recommended)

su - kube
cd /opt
wget https://github.com/etcd-io/etcd/releases/download/v3.5.4/etcd-v3.5.4-linux-amd64.tar.gz
tar -xvf etcd-v3.5.4-linux-amd64.tar.gz
mv etcd-v3.5.4-linux-amd64 etcd-v3.5.4
rm -f etcd-v3.5.4-linux-amd64.tar.gz
  • Switch to the kube user (su - also reloads the kube user's environment variables)
  • Change to the /opt directory (su - lands in the kube home directory first, so cd must come after it; /opt must be writable by kube)
  • Download the etcd release package
  • Unpack the package
  • Rename the unpacked directory to etcd-v3.5.4
  • Delete the etcd tarball
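Before unpacking, the tarball can optionally be integrity-checked. The checksum asset name below (SHA256SUMS) is an assumption based on the etcd GitHub release page layout; adjust it if the v3.5.4 release ships checksums under a different name.

```shell
# On the real node (network required; asset name is an assumption):
#   wget https://github.com/etcd-io/etcd/releases/download/v3.5.4/SHA256SUMS
#   grep etcd-v3.5.4-linux-amd64.tar.gz SHA256SUMS | sha256sum -c -
#
# The check mechanism itself, demonstrated on a local file:
printf 'example payload' > /tmp/payload.bin
sha256sum /tmp/payload.bin > /tmp/payload.sums
sha256sum -c /tmp/payload.sums && echo "checksum OK"
```

`sha256sum -c` exits non-zero if the file was corrupted in transit, so a failed check should abort the installation.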

3.2 Environment variable setting

Modify the kube user's environment variable configuration file at /home/kube/.bash_profile, appending the following lines:

ETCD_HOME=/opt/etcd-v3.5.4
PATH=$PATH:$ETCD_HOME
export PATH

3.3 Modify etcd configuration information

  • Create etcd data storage directory
mkdir /opt/etcd-v3.5.4/data
  • Create etcd configuration file
cat >/opt/etcd-v3.5.4/etcd.conf <<EOF
#[Member]
ETCD_NAME="etcd1"
ETCD_DATA_DIR="/opt/etcd-v3.5.4/data"
ETCD_LISTEN_PEER_URLS="https://192.168.0.233:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.0.233:2379,https://127.0.0.1:2379"
ETCD_CERT_FILE=/etc/kubernetes/ssl/etcd.pem
ETCD_KEY_FILE=/etc/kubernetes/ssl/etcd-key.pem
ETCD_TRUSTED_CA_FILE=/etc/kubernetes/ssl/ca.pem
ETCD_CLIENT_CERT_AUTH=true
ETCD_PEER_CERT_FILE=/etc/kubernetes/ssl/etcd.pem
ETCD_PEER_KEY_FILE=/etc/kubernetes/ssl/etcd-key.pem
ETCD_PEER_TRUSTED_CA_FILE=/etc/kubernetes/ssl/ca.pem

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.0.233:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.0.233:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.0.233:2380,etcd2=https://192.168.0.200:2380,etcd3=https://192.168.0.145:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF

In the file above, ETCD_NAME is the node's name: give each node a distinct name, and make sure each name matches its entry in ETCD_INITIAL_CLUSTER, which lists the peer addresses of all nodes. Also adjust the IP addresses to the actual node; for example, replace 192.168.0.233 in the configuration above with the IP address of the node being configured.
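Because the three nodes' files differ only in ETCD_NAME and the IP address, a small helper can stamp them out consistently. `gen_etcd_conf` is not an etcd tool; it is a hypothetical illustration of the "same file, different name and IP" rule (the cluster-wide and TLS settings are omitted for brevity).

```shell
# Hypothetical helper: generate the node-specific portion of etcd.conf
# from the node's name and IP.
gen_etcd_conf() {
  name=$1; ip=$2; out=$3
  cat > "$out" <<EOF
ETCD_NAME="$name"
ETCD_LISTEN_PEER_URLS="https://$ip:2380"
ETCD_LISTEN_CLIENT_URLS="https://$ip:2379,https://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://$ip:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://$ip:2379"
EOF
}

# Example: the fragment for node2 from the table in section 2.1.
gen_etcd_conf etcd2 192.168.0.200 /tmp/etcd2.conf
cat /tmp/etcd2.conf
```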

  • Create the unit file /usr/lib/systemd/system/etcd.service (this step requires root)
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/opt/etcd-v3.5.4/data
EnvironmentFile=/opt/etcd-v3.5.4/etcd.conf
ExecStart=/opt/etcd-v3.5.4/etcd \\
  --auto-compaction-mode=periodic \\
  --auto-compaction-retention=1 \\
  --max-request-bytes=33554432 \\
  --quota-backend-bytes=6442450944 \\
  --heartbeat-interval=250 \\
  --election-timeout=2000
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

3.4 etcd service startup

  • Reload systemd so it picks up the new unit file, then enable the etcd service at boot
systemctl daemon-reload
systemctl enable etcd
  • Start the etcd service
systemctl start etcd

When the first etcd node starts, the command line appears to hang; this is normal. The first node of a new cluster blocks waiting for at least one other member to join. Once a second node joins, the blocking ends and the service start completes.

  • View service status
systemctl status etcd

4. ETCD Verification

After all nodes have finished installing etcd, you can check the health status of etcd nodes in the cluster by the following method.

  • Switch to kube user
su - kube
  • Check the status of each etcd endpoint in the cluster
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem --endpoints="https://192.168.0.200:2379,https://192.168.0.145:2379,https://192.168.0.233:2379" endpoint status --write-out='table'

The command prints a table listing each endpoint together with its member ID, version, DB size, leader status, and Raft term/index.

5. Problem summary

  • The cluster cannot start: one node's data is corrupted, that node fails to start, and both the etcdctl and kubectl client tools time out on every request.
Solution:
1. Delete the failed node's data directory
$ rm -rf /opt/etcd-v3.5.4/data

2. Set ETCD_INITIAL_CLUSTER_STATE="existing" in that node's etcd.conf

3. Restart the node
$ systemctl restart etcd
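The recovery steps above can be sketched as one script. The sed edit is shown against a copy of the config so it can be dry-run safely; on a real node you would operate on /opt/etcd-v3.5.4/etcd.conf and run the systemctl commands as root.

```shell
# Dry-run copy of the config with the state the failed node still has.
conf=/tmp/etcd.conf.recovery
printf 'ETCD_INITIAL_CLUSTER_STATE="new"\n' > "$conf"

# 1. Stop the failed member and clear its data directory:
#    systemctl stop etcd
#    rm -rf /opt/etcd-v3.5.4/data

# 2. Rejoin the existing cluster instead of bootstrapping a new one.
sed -i 's/ETCD_INITIAL_CLUSTER_STATE="new"/ETCD_INITIAL_CLUSTER_STATE="existing"/' "$conf"
cat "$conf"

# 3. Restart the member:
#    systemctl restart etcd
```

The state flip matters: with "new" the recovered node would try to bootstrap a fresh cluster instead of syncing data from the surviving members.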

Origin blog.csdn.net/hzwy23/article/details/128084797