Kubernetes cluster deployment—deploy etcd cluster (2)


etcd is a distributed key-value store. Kubernetes stores its data in etcd, so the first step is to prepare an etcd database. To avoid a single point of failure, etcd should be deployed as a cluster. Here, 3 machines form the cluster, which tolerates the failure of 1 machine. You can also use 5 machines, which tolerates the failure of 2 machines.
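The 3-machine and 5-machine figures follow from the majority rule: a cluster of n members needs floor(n/2) + 1 of them to agree, and the remainder may fail. A minimal sketch of that arithmetic (the function names quorum and fault_tolerance are only illustrative):

```shell
#!/usr/bin/env bash
# Quorum and fault tolerance for a cluster of n voting members.
# quorum = floor(n/2) + 1; tolerated failures = n - quorum.

quorum() {
  echo $(( $1 / 2 + 1 ))
}

fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

A 4-member cluster tolerates only 1 failure, the same as 3 members, which is why odd cluster sizes are normally chosen.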

Node name    IP
etcd-1       10.20.17.20
etcd-2       10.20.17.21
etcd-3       10.20.17.22

Note: To save machines, the etcd nodes here share machines with the K8s nodes. etcd can also be deployed on machines separate from the K8s cluster, as long as the apiserver can connect to it.

1 Prepare the cfssl certificate generation tool

cfssl is an open source certificate management tool that generates certificates from JSON files, which is more convenient than openssl.

This can be done on any server. The Master node is used here.

# mkdir -p /opt/tools/cfssl
# cd /opt/tools/cfssl/
# wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
# wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
# wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
# chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
# mv cfssl_linux-amd64 /usr/local/bin/cfssl
# mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
# mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo

2 Generate etcd certificate

2.1 Self-signed Certificate Authority (CA)

Create working directory:

# mkdir -p /root/TLS/{etcd,k8s}
# cd /root/TLS/etcd

Self-signed CA:

cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "www": {
         "expiry": "87600h",
         "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ]
      }
    }
  }
}
EOF

cat > ca-csr.json << EOF
{
    "CN": "etcd CA",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing"
        }
    ]
}
EOF

Generate certificate:

# cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
# ls *pem
ca-key.pem  ca.pem

2.2 Use the self-signed CA to issue the etcd HTTPS certificate

Create the certificate signing request (CSR) file:

cat > server-csr.json << EOF
{
    "CN": "etcd",
    "hosts": [
    "10.20.17.20",
    "10.20.17.21",
    "10.20.17.22"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "BeiJing",
            "ST": "BeiJing"
        }
    ]
}
EOF

Note: The hosts field in the file above must contain the cluster-internal communication IP of every etcd node, with none missing. To make later expansion easier, you can also list a few reserved IPs.

Generate certificate:

# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server
# ls server*pem
server-key.pem  server.pem
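
Since a missing IP in the hosts field only shows up later as a TLS error, it may be worth checking the issued certificate right away. The sketch below is one possible check, not part of the original procedure: the helper name missing_sans is made up, and it assumes openssl is installed and prints IP SANs as "IP Address:x.x.x.x" in its text dump.

```shell
#!/usr/bin/env bash
# missing_sans IP...
# Reads an `openssl x509 -noout -text` dump on stdin and prints every given IP
# that is NOT listed as a subject alternative name. Prints nothing if all are present.
missing_sans() {
  dump=$(cat)
  for ip in "$@"; do
    # Put each SAN entry on its own line, trim it, then require an exact match
    # so that 10.20.17.2 does not match 10.20.17.20.
    printf '%s\n' "$dump" \
      | tr ',' '\n' \
      | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
      | grep -Fxq "IP Address:$ip" || echo "$ip"
  done
}

# Demo on a hand-written sample line (assumed output format):
printf '%s\n' '    IP Address:10.20.17.20, IP Address:10.20.17.21' \
  | missing_sans 10.20.17.20 10.20.17.21 10.20.17.22

# Real use, run in /root/TLS/etcd:
#   openssl x509 -in server.pem -noout -text | missing_sans 10.20.17.20 10.20.17.21 10.20.17.22
```

The demo prints 10.20.17.22, the one address absent from the sample; against a correctly issued server.pem the real command should print nothing.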

3 Download the etcd binary from GitHub

Download address: https://github.com/etcd-io/etcd/releases/download/v3.4.9/etcd-v3.4.9-linux-amd64.tar.gz

Save the archive in /opt/tools, which is where the extraction step in section 4.1 expects it.

4 Deploy etcd cluster

The following operations are performed on etcd node 1. To simplify the operation, all files generated by node 1 will be copied to node 2 and node 3 later.

4.1 Create a working directory and extract the binary package

# mkdir -p /opt/etcd/{bin,cfg,ssl}
# tar zxvf /opt/tools/etcd-v3.4.9-linux-amd64.tar.gz -C /opt/tools
# mv /opt/tools/etcd-v3.4.9-linux-amd64/{etcd,etcdctl} /opt/etcd/bin/

4.2 Create etcd configuration file

cat > /opt/etcd/cfg/etcd.conf << EOF
#[Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.20.17.20:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.20.17.20:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.20.17.20:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.20.17.20:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://10.20.17.20:2380,etcd-2=https://10.20.17.21:2380,etcd-3=https://10.20.17.22:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF

  • ETCD_NAME: node name, unique within the cluster
  • ETCD_DATA_DIR: data directory
  • ETCD_LISTEN_PEER_URLS: address this node listens on for peer (cluster) traffic
  • ETCD_LISTEN_CLIENT_URLS: address this node listens on for client traffic
  • ETCD_INITIAL_ADVERTISE_PEER_URLS: peer address advertised to the other cluster members
  • ETCD_ADVERTISE_CLIENT_URLS: client address advertised to clients
  • ETCD_INITIAL_CLUSTER: names and peer addresses of all cluster members
  • ETCD_INITIAL_CLUSTER_TOKEN: cluster token
  • ETCD_INITIAL_CLUSTER_STATE: state used when joining the cluster; new bootstraps a new cluster, existing joins an existing cluster

4.3 Manage etcd with systemd

cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
--cert-file=/opt/etcd/ssl/server.pem \
--key-file=/opt/etcd/ssl/server-key.pem \
--peer-cert-file=/opt/etcd/ssl/server.pem \
--peer-key-file=/opt/etcd/ssl/server-key.pem \
--trusted-ca-file=/opt/etcd/ssl/ca.pem \
--peer-trusted-ca-file=/opt/etcd/ssl/ca.pem \
--logger=zap
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF

4.4 Copy the certificates just generated

# cp /root/TLS/etcd/ca*pem /root/TLS/etcd/server*pem /opt/etcd/ssl/

4.5 Copy all the files generated on node 1 to node 2 and node 3

scp -r /opt/etcd/ k8s-node1:/opt/
scp /usr/lib/systemd/system/etcd.service k8s-node1:/usr/lib/systemd/system/

scp -r /opt/etcd/ k8s-node2:/opt/
scp /usr/lib/systemd/system/etcd.service k8s-node2:/usr/lib/systemd/system/

Then modify the node name and current server IP in the etcd.conf configuration file on node 2 and node 3 respectively:

[root@k8s-node1 bin]# vim /opt/etcd/cfg/etcd.conf

#[Member]
# Change this: etcd-2 on node 2, etcd-3 on node 3
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
# Change the IP in the next two lines to the current server's IP
ETCD_LISTEN_PEER_URLS="https://10.20.17.20:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.20.17.20:2379"
#[Clustering]
# Change the IP in the next two lines to the current server's IP
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.20.17.20:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.20.17.20:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://10.20.17.20:2380,etcd-2=https://10.20.17.21:2380,etcd-3=https://10.20.17.22:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
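
Only the node name and the node's own IP differ between the three files, so instead of editing by hand the file could also be generated. This is just a sketch of that idea: gen_etcd_conf and INITIAL_CLUSTER are names invented here, and the values are the ones used in this article.

```shell
#!/usr/bin/env bash
# gen_etcd_conf NAME IP
# Prints an etcd.conf for one member to stdout. Only NAME and IP vary per node;
# the member list, token and state are identical on all three nodes.
INITIAL_CLUSTER="etcd-1=https://10.20.17.20:2380,etcd-2=https://10.20.17.21:2380,etcd-3=https://10.20.17.22:2380"

gen_etcd_conf() {
  name=$1
  ip=$2
  cat << EOF
#[Member]
ETCD_NAME="${name}"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://${ip}:2380"
ETCD_LISTEN_CLIENT_URLS="https://${ip}:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://${ip}:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://${ip}:2379"
ETCD_INITIAL_CLUSTER="${INITIAL_CLUSTER}"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
}

# Print the file for node 2 for inspection.
gen_etcd_conf etcd-2 10.20.17.21

# On node 2 itself the output would be written into place:
#   gen_etcd_conf etcd-2 10.20.17.21 > /opt/etcd/cfg/etcd.conf
```

Generating the file keeps the name and the four per-node URLs in step, which is where hand edits tend to go wrong.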

4.6 Start the etcd service and set it to start at boot

systemctl daemon-reload    # reload systemd so the new unit file takes effect
systemctl start etcd       # start etcd
systemctl status etcd      # check the startup status
systemctl enable etcd      # start etcd at boot

Note: Do not start etcd on the master node alone and wait for it. A new cluster only becomes ready once a majority of its members (2 of 3 here) are running, so the first node started keeps waiting for its peers, and with Type=notify the systemctl start etcd command on that node blocks until then. Start etcd on the other nodes first, or on all nodes at about the same time.

In this deployment, etcd was started on the master node first and the service failed to start. Use journalctl -xe or view the system log with cat /var/log/messages to investigate. The following etcd error appeared:

"msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"427a09770fe3b784","rtt":"0s","error":"dial tcp 10.20.17.21:2380: connect: connection refused"

Cause: ETCD_INITIAL_CLUSTER_STATE in etcd-1's configuration file /opt/etcd/cfg/etcd.conf is new, and ETCD_INITIAL_CLUSTER lists the IP:PORT of etcd-2 and etcd-3, so etcd-1 tries to connect to etcd-2 and etcd-3 at startup. Their etcd services had not been started yet, so the connections were refused. Starting etcd on etcd-2 and etcd-3 first and then on etcd-1 resolved the problem.
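
One way to sidestep the ordering problem is to issue the start command on all three nodes at about the same time from a single shell. The sketch below assumes passwordless root SSH from the current machine to each node IP, which this article does not set up. The function name start_etcd_all and the SSH variable are made up for illustration.

```shell
#!/usr/bin/env bash
# start_etcd_all HOST...
# Runs "systemctl start etcd" on every host in parallel and waits for all of them.
# Returns non-zero if any host reported a failure.
# SSH is the remote runner; setting SSH=echo turns the call into a dry run.
SSH=${SSH:-ssh}

start_etcd_all() {
  pids=""
  for host in "$@"; do
    $SSH "$host" "systemctl daemon-reload && systemctl start etcd" &
    pids="$pids $!"
  done
  rc=0
  for pid in $pids; do
    wait "$pid" || rc=1
  done
  return $rc
}

# Dry run: only prints what would be executed on each node.
SSH=echo start_etcd_all 10.20.17.20 10.20.17.21 10.20.17.22

# Real run (assumes root SSH access to the three nodes):
#   start_etcd_all 10.20.17.20 10.20.17.21 10.20.17.22
```

Because the three starts run in the background together, no node has to wait long for its peers. The lines of the dry run may appear in any order.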

4.7 View cluster status

# ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem --endpoints="https://10.20.17.20:2379,https://10.20.17.21:2379,https://10.20.17.22:2379" endpoint health

https://10.20.17.20:2379 is healthy: successfully committed proposal: took = 11.989312ms
https://10.20.17.21:2379 is healthy: successfully committed proposal: took = 12.942844ms
https://10.20.17.22:2379 is healthy: successfully committed proposal: took = 29.3212ms

If the output above appears, the cluster has been deployed successfully. If there is a problem, first check the logs: /var/log/messages or journalctl -u etcd
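
The long flag list is easy to mistype, so for repeated checks the same settings can be supplied through etcdctl's ETCDCTL_* environment variables. This is a sketch using the paths and IPs from this article. The helper join_endpoints is a made-up name, and the commented etcdctl calls are suggestions rather than part of the original procedure.

```shell
#!/usr/bin/env bash
# join_endpoints IP...
# Builds the comma-separated endpoint list for the client port 2379.
join_endpoints() {
  list=""
  for ip in "$@"; do
    list="${list:+$list,}https://${ip}:2379"
  done
  echo "$list"
}

# etcdctl reads each of these as the default for the matching flag.
export ETCDCTL_API=3
export ETCDCTL_CACERT=/opt/etcd/ssl/ca.pem
export ETCDCTL_CERT=/opt/etcd/ssl/server.pem
export ETCDCTL_KEY=/opt/etcd/ssl/server-key.pem
export ETCDCTL_ENDPOINTS=$(join_endpoints 10.20.17.20 10.20.17.21 10.20.17.22)

echo "$ETCDCTL_ENDPOINTS"

# With the variables exported, the checks become:
#   /opt/etcd/bin/etcdctl endpoint health
#   /opt/etcd/bin/etcdctl endpoint status --write-out=table
#   /opt/etcd/bin/etcdctl member list --write-out=table
```

endpoint status also shows which member is currently the leader, which endpoint health does not report.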

Origin blog.csdn.net/cljdsc/article/details/134745846