Building Kubernetes Master HA

  1. Initialize the master1 node as an ordinary node first, joining it with the kubeadm join command
  2. Copy the configuration on master over to master1

    [root@master1 ~]# scp root@192.168.0.250:/etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf
    [root@master1 ~]# scp -r root@192.168.0.250:/etc/kubernetes/pki /etc/kubernetes
    [root@master1 ~]# scp root@192.168.0.250:/etc/kubernetes/manifests/etcd.yaml ./

    Make some modifications to etcd.yaml:
     - etcd
        - --name=etcd-master1
        - --initial-advertise-peer-urls=http://192.168.0.249:2380
        - --listen-peer-urls=http://192.168.0.249:2380
        - --listen-client-urls=http://192.168.0.249:2379,http://127.0.0.1:2379
        - --advertise-client-urls=http://192.168.0.249:2379
        - --initial-cluster=etcd-master1=http://192.168.0.249:2380
        - --initial-cluster-state=new
        - --data-dir=/var/lib/etcd

    [root@master1 ~]# cp etcd.yaml /etc/kubernetes/manifests/etcd.yaml
    [root@master1 ~]# systemctl daemon-reload
    [root@master1 ~]# systemctl restart kubelet
    kubectl exec -it etcd-master1.k8s sh -n kube-system
    export ETCDCTL_API=3
    etcdctl member list

    If the cluster does not form, delete the old data with rm -rf /var/lib/etcd/member/

    systemctl restart kubelet
    [root@master ~]# kubectl get pods --namespace=kube-system

    NAME                                    READY     STATUS    RESTARTS   AGE
    etcd-master.k8s                         1/1       Running   2          4d
    etcd-master1.k8s                        1/1       Running   0          13m
    kube-apiserver-master.k8s               1/1       Running   2          4d
    kube-controller-manager-master.k8s      1/1       Running   2          4d
    kube-dns-3913472980-tsq3r               3/3       Running   13         4d
    kube-flannel-ds-fm014                   2/2       Running   12         4d
    kube-flannel-ds-lcqrl                   2/2       Running   6          4d
    kube-flannel-ds-lxf1b                   2/2       Running   0          20m
    kube-proxy-8fppg                        1/1       Running   2          4d
    kube-proxy-bpn98                        1/1       Running   6          4d
    kube-proxy-gssrj                        1/1       Running   0          20m
    kube-scheduler-master.k8s               1/1       Running   2          4d
    kubernetes-dashboard-2039414953-r0pc3   1/1       Running   0          1d
    kubernetes-dashboard-2066150588-7z6vf   1/1       Running   0          1d

  3. Synchronize the etcd data on master to the etcd on master1
    The etcd on master cannot be reached from master1, so etcd.yaml on master needs to be modified (see the sketch below)

    [root@master ~]# vim /etc/kubernetes/manifests/etcd.yaml 
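    The source does not show the exact diff here; a likely minimal change, assuming the master's IP is 192.168.0.250, is to expose the etcd client URLs on the node address (not only 127.0.0.1) so that master1 can reach it:

        - --listen-client-urls=http://192.168.0.250:2379,http://127.0.0.1:2379
        - --advertise-client-urls=http://192.168.0.250:2379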

    [root@master ~]# systemctl daemon-reload
    [root@master ~]# systemctl restart kubelet

    Wait a moment for kubelet to restart

    [root@master rpm]# kubectl exec -it etcd-master1.k8s -n kube-system sh

    cd /usr/local/bin

    /usr/local/bin # export ETCDCTL_API=3
    /usr/local/bin/# etcdctl endpoint status --endpoints=master1.k8s:2379

    192.168.0.250:2379, 8e9e05c52164694d, 3.0.17, 3.5 MB, true, 3, 14911

    /usr/local/bin # etcdctl endpoint status --endpoints=127.0.0.1:2379

    127.0.0.1:2379, 5e31d25f1f5fbb7f, 3.0.17, 25 kB, true, 2, 1434

    /usr/local/bin # etcdctl make-mirror 127.0.0.1:2379  --endpoints=master1.k8s:2379

    Error:  etcdserver: duplicate key given in txn request

    Ignore this error…
    /usr/local/bin # etcdctl get --from-key /api/v2/registry/clusterrolebindings/cluster-admin --endpoints=master.k8s:2379

    ……
    compact_rev_key
    6104

    /usr/local/bin # etcdctl get --from-key /api/v2/registry/clusterrolebindings/cluster-admin  --endpoints=127.0.0.1:2379

    ……
    compact_rev_key
    6104

    Both values are 6104, which indicates that all the data has been synchronized.

  4. Connect the api-server on master to the etcd server on master1

    [root@master ~]# vim /etc/kubernetes/manifests/kube-apiserver.yaml 
        - --etcd-servers=http://127.0.0.1:2379

    Change it to
        - --etcd-servers=http://master1.k8s:2379

    [root@master ~]# systemctl restart kubelet

    The following error appears because kube-apiserver-master.k8s is restarting; wait a moment and it will recover

    [root@master ~]# kubectl get pods --namespace=kube-system
    The connection to the server 192.168.0.250:6443 was refused - did you specify the right host or port?

  5. Rebuild the etcd on master

    [root@master ~]# mv /etc/kubernetes/manifests/etcd.yaml ./
    [root@master ~]# rm -fr /var/lib/etcd

    [root@master ~]# kubectl exec -it etcd-master1.k8s sh -n kube-system
    cd /usr/local/bin/
    /usr/local/bin # export ETCDCTL_API=3
    /usr/local/bin # etcdctl member add etcd-master --peer-urls=http://master.k8s:2380

    [root@master ~]# vim etcd.yaml

        - etcd
        - --name=etcd-master
        - --initial-advertise-peer-urls=http://192.168.0.250:2380
        - --listen-peer-urls=http://192.168.0.250:2380
        - --listen-client-urls=http://192.168.0.250:2379,http://127.0.0.1:2379
        - --advertise-client-urls=http://192.168.0.250:2379
        - --initial-cluster=etcd-master=http://192.168.0.250:2380,etcd-master1=http://192.168.0.249:2380,etcd-master2=http://192.168.0.251:2380
        - --initial-cluster-state=existing
        - --data-dir=/var/lib/etcd

    [root@master ~]# cp etcd.yaml /etc/kubernetes/manifests/etcd.yaml
    [root@master ~]# systemctl daemon-reload
    [root@master ~]# systemctl restart kubelet

    Wait a while and you will see the etcd-master.k8s pod come up

    [root@master ~]# kubectl exec -it etcd-master.k8s sh -n kube-system
    / # cd /usr/local/bin/
    /usr/local/bin # export ETCDCTL_API=3
    /usr/local/bin # ./etcdctl endpoint status --endpoints=192.168.0.249:2379,192.168.0.250:2379

    192.168.0.249:2379, 4cfbf6559386ae97, 3.0.17, 2.0 MB, true, 237, 30759
    192.168.0.250:2379, 3d56d08a94c87332, 3.0.17, 2.0 MB, false, 237, 30759
    The endpoint that reports true is the etcd leader of the cluster

    /usr/local/bin # ./etcdctl endpoint health --endpoints=192.168.0.249:2379,192.168.0.250:2379

    192.168.0.249:2379 is healthy: successfully committed proposal: took = 27.179426ms
    192.168.0.250:2379 is healthy: successfully committed proposal: took = 94.162395ms
    Both nodes are healthy. If one shows healthy and the other unhealthy, check the etcd log for "the clock difference against peer ebd7965c7ef3629a is too high"; if it appears, synchronize the time with an ntpdate time server as described earlier.
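    For reference, a one-off synchronization with ntpdate looks roughly like this (the server name is a placeholder; use the time server configured earlier):

    [root@master1 ~]# ntpdate -u <your-ntp-server>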

  6. Start the api-server on master1
    Copy kube-apiserver.yaml from master up to master1, and modify advertise-address and etcd-servers;
    the kubelet on each node connects to the api-server on its own node
  7. Create the apiserver on master1 / master2

    [root@master1 ~]# scp root@192.168.0.250:/etc/kubernetes/manifests/kube-apiserver.yaml ./
    [root@master1 ~]# vim kube-apiserver.yaml 
        - --advertise-address=192.168.0.250
    Change it to
        - --advertise-address=192.168.0.249

    [root@master1 ~]# systemctl daemon-reload
    [root@master1 ~]# systemctl restart kubelet

  8. Connect the apiserver on master to the etcd servers on all nodes

    [root@master ~]# vim /etc/kubernetes/manifests/kube-apiserver.yaml
        - --etcd-servers=http://192.168.0.249:2379

    Change it to

    - --etcd-servers=http://192.168.0.249:2379,http://192.168.0.250:2379,http://192.168.0.251:2379

    kube-apiserver will restart automatically

    [root@master ~]# systemctl daemon-reload
    [root@master ~]# systemctl restart kubelet

  9. Connect the kubelet on master1 / master2 to the apiserver on its own node

    [root@master1 ~]# vim /etc/kubernetes/kubelet.conf
    (only the IP can be used here)

     server: https://192.168.0.250:6443
    change to
     server: https://192.168.0.249:6443

    [root@master1 ~]# systemctl status kubelet -l

    …… 
    Jun 23 14:51:42 master1.k8s kubelet[25786]: E0623 14:51:42.080539   25786 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:390: Failed to list *v1.Node: Get https://192.168.0.249:6443/api/v1/nodes?fieldSelector=metadata.name%3Dmaster1.k8s&resourceVersion=0: x509: certificate is valid for 10.96.0.1, 192.168.0.250, not 192.168.0.249
    …… 

    [root@master1 ~]# openssl genrsa -out apiserver-master1.key 2048

    Generating RSA private key, 2048 bit long modulus
    .....................................................................................................+++
    ......................+++
    e is 65537 (0x10001)

    [root@master1 ~]# openssl req -new -key apiserver-master1.key -subj "/CN=kube-apiserver," -out apiserver-master1.csr

    [root@master1 ~]# vim apiserver-master1.ext

    The content is as follows:
    subjectAltName = DNS:master1.k8s,DNS:kubernetes,DNS:kubernetes.default,DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, IP:10.96.0.1, IP:192.168.0.249

    [root@master1 ~]# openssl x509 -req -in apiserver-master1.csr -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key -CAcreateserial -out apiserver-master1.crt -days 365 -extfile apiserver-master1.ext

    Signature ok
    subject=/CN=kube-apiserver,
    Getting CA Private Key

    [root@master1 ~]# openssl x509 -noout -text -in apiserver-master1.crt

    Certificate:
        Data:
            Version: 3 (0x2)
            Serial Number: 14573869911020153756 (0xca40c977e91c2b9c)
        Signature Algorithm: sha1WithRSAEncryption
            Issuer: CN=kubernetes
            Validity
                Not Before: Jun 23 07:16:06 2017 GMT
                Not After : Jun 23 07:16:06 2018 GMT
            Subject: CN=kube-apiserver,
            Subject Public Key Info:
                Public Key Algorithm: rsaEncryption
                    Public-Key: (2048 bit)
                    Modulus:
                        00:e6:60:eb:30:08:5f:75:e6:92:7c:54:9d:78:83:
                        ae:9d:b4:7b:85:1a:78:ee:9c:cf:19:f3:3e:1c:60:
                        3f:a7:f0:9a:83:a9:a1:35:9e:3e:86:10:25:61:7b:
                        2b:81:bb:13:32:b4:67:36:e1:95:10:b5:06:a5:c4:
                        8a:a2:f5:04:30:e1:56:be:e0:db:2e:30:f3:ed:78:
                        74:0b:17:6b:c3:61:c2:25:4b:1a:bd:b3:03:48:d5:
                        af:8b:f1:0e:64:11:ab:7a:7f:d0:3c:01:a0:f0:d3:
                        d5:2f:3e:7c:71:be:9a:a6:4d:44:a2:2e:4a:3a:ab:
                        1a:89:ad:6b:96:66:9f:94:dd:53:2c:f7:14:3e:2f:
                        05:8b:ef:e8:98:43:89:89:30:89:56:8e:e7:b0:a8:
                        3c:4c:d4:fa:57:29:3f:43:1d:e9:81:30:35:19:94:
                        57:bb:46:7d:32:79:ff:45:d4:3b:77:a1:54:14:87:
                        35:48:a3:e8:aa:6c:db:20:87:f5:b4:6c:bd:b1:ed:
                        2b:36:29:16:80:d1:d6:a7:a9:12:9f:73:6d:ab:fc:
                        8d:64:11:67:b3:a0:fb:63:d8:d0:64:f1:36:8f:1d:
                        7e:29:5b:c1:1b:67:17:75:b6:1f:b1:a3:0b:5b:e2:
                        2e:5a:a3:e8:50:ef:26:c5:0c:c2:69:d1:1a:b8:19:
                        be:73
                    Exponent: 65537 (0x10001)
            X509v3 extensions:
                X509v3 Subject Alternative Name: 
                    DNS:master1.k8s, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, IP Address:10.96.0.1, IP Address:192.168.0.249
        Signature Algorithm: sha1WithRSAEncryption
             71:ef:2e:06:01:77:c5:90:8c:89:90:4d:ce:89:bf:9e:5c:e7:
             cc:2b:74:01:89:44:92:a0:4d:c9:b4:90:a2:67:af:b7:02:63:
             f1:b5:c6:6b:b2:ad:f0:84:79:50:bf:a3:70:5d:32:ac:98:3b:
             ca:c6:1f:fe:2e:9d:10:63:19:84:b9:b7:e6:43:00:90:a6:95:
             e8:c4:7c:86:1a:08:db:d0:be:99:d7:13:6c:8b:74:ea:1e:4b:
             7f:ba:65:50:c0:1b:0a:6b:8f:2a:34:5a:2c:d0:71:98:7b:67:
             af:e4:63:33:8b:af:15:5b:f0:04:50:83:f2:d1:21:71:b1:b4:
             35:f8:68:55:dd:f7:c8:fc:aa:90:05:b8:2c:14:c2:eb:1d:d7:
             09:1a:bc:0e:d5:03:31:0f:98:c1:4f:97:bd:f4:c2:58:21:77:
             d4:40:14:5c:28:21:e4:ee:cb:76:09:9d:15:bb:7e:63:84:11:
             6e:db:5c:49:d2:82:0f:7b:d4:8b:fa:f4:51:d2:8a:84:7f:34:
             04:d5:9f:f6:f5:39:fa:97:bc:b6:0c:9a:67:b0:1c:c1:17:3b:
             1a:8e:cd:b0:91:e9:11:3a:fb:75:01:97:97:fe:d3:33:e0:a0:
             4e:87:0e:66:59:d4:b2:02:5f:a8:b8:8d:b6:da:56:4e:c7:1e:
             91:d6:07:de

    [root@master1 ~]# cp apiserver-master1.key apiserver-master1.crt /etc/kubernetes/pki/

    [root@master1 ~]# vim /etc/kubernetes/manifests/kube-apiserver.yaml

        - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
        - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    Change them to
        - --tls-cert-file=/etc/kubernetes/pki/apiserver-master1.crt
        - --tls-private-key-file=/etc/kubernetes/pki/apiserver-master1.key

    If an x509 error appears, restart the master1 machine

    Use kubectl on master1 to verify that the apiserver is available

    [root@master1 ~]# scp root@192.168.0.250:/etc/kubernetes/admin.conf ./
    [root@master1 ~]# vim admin.conf 

        server: https://192.168.0.250:6443
    change to
        server: https://192.168.0.249:6443

    [root@master1 ~]# sudo cp /etc/kubernetes/admin.conf $HOME/
    [root@master1 ~]# sudo chown $(id -u):$(id -g) $HOME/admin.conf
    [root@master1 ~]# export KUBECONFIG=$HOME/admin.conf
    [root@master1 ~]# kubectl get nodes

    NAME          STATUS    AGE       VERSION
    master.k8s    Ready     20h       v1.6.4
    master1.k8s   Ready     20h       v1.6.4

  10. Start kube-controller-manager and kube-scheduler on master1

    [root@master1 ~]# scp root@192.168.0.250:/etc/kubernetes/manifests/kube-controller-manager.yaml /etc/kubernetes/manifests/
      Do not modify controller-manager.conf here; although it contains server: https://192.168.0.250:6443, it just means master and master1 point to the same apiserver and take part in the same leader election

    [root@master1 ~]# scp root@192.168.0.250:/etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/manifests/
    [root@master1 ~]# scp root@192.168.0.250:/etc/kubernetes/scheduler.conf /etc/kubernetes/

        Do not modify scheduler.conf here either; although it contains server: https://192.168.0.250:6443, it just means master and master1 point to the same apiserver and take part in the same leader election

    Afterwards, change the server entry in both the kube-controller-manager and kube-scheduler config files to the node's own IP:6443,
    and then restart the machine
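    A quick way to do that on master1, assuming the copied conf files still contain the master address 192.168.0.250:6443 and master1's own address is 192.168.0.249, is something like:

    [root@master1 ~]# sed -i 's/192.168.0.250:6443/192.168.0.249:6443/' \
        /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf
    [root@master1 ~]# reboot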

    [root@master1 ~]# systemctl daemon-reload
    [root@master1 ~]# systemctl restart kubelet

    [root@master ~]# kubectl get pod -n kube-system

    NAME                                    READY     STATUS    RESTARTS   AGE
    etcd-master.k8s                         1/1       Running   2          21h
    etcd-master1.k8s                        1/1       Running   5          1h
    kube-apiserver-master.k8s               1/1       Running   0          2h
    kube-apiserver-master1.k8s              1/1       Running   12         1h
    kube-controller-manager-master.k8s      1/1       Running   7          21h
    kube-controller-manager-master1.k8s     1/1       Running   8          39m
    kube-dns-3913472980-qhbjn               3/3       Running   0          21h
    kube-flannel-ds-b3mvc                   2/2       Running   0          21h
    kube-flannel-ds-kdzpv                   2/2       Running   2          21h
    kube-proxy-6zj1c                        1/1       Running   0          21h
    kube-proxy-lrxbn                        1/1       Running   1          21h
    kube-scheduler-master.k8s               1/1       Running   7          21h
    kube-scheduler-master1.k8s              1/1       Running   1          53s
    kubernetes-dashboard-2066150588-rwcbv   1/1       Running   0          2h

    Check whether kube-controller-manager-master.k8s and kube-scheduler-master.k8s have been elected as leader

    [root@master ~]# kubectl logs kube-controller-manager-master.k8s -n kube-system | grep leader

    ……
    "kube-controller-manager": the object has been modified; please apply your changes to the latest version and try again
    I0624 09:19:06.113689       1 leaderelection.go:189] successfully acquired lease kube-system/kube-controller-manager
    I0624 09:19:06.113843       1 event.go:217] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"kube-controller-manager", UID:"d33b5fa3-58ba-11e7-90ea-f48e387ca8b9", APIVersion:"v1", ResourceVersion:"219012", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' master.k8s became leader
    ……

    [root@master ~]# kubectl logs kube-scheduler-master.k8s  -n kube-system | grep leader

    ……
    I0624 09:19:03.975391       1 leaderelection.go:189] successfully acquired lease kube-system/kube-scheduler
    I0624 09:19:03.975982       1 event.go:217] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"kube-scheduler", UID:"d348bfa4-58ba-11e7-90ea-f48e387ca8b9", APIVersion:"v1", ResourceVersion:"218995", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' master.k8s became leader
    ……

    If no successful-election log appears, try restarting master1

  11. Build a Load Balance for the APISERVERs
    Install an nginx on a machine of your own; an nginx Deployment inside the cluster is not used here (a config sketch follows)
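    A minimal sketch of such an nginx config, assuming the load balancer listens on 8443 (the port used in step 15) and forwards to the apiservers on the three masters; adjust the IPs and ports to your environment, and note that nginx must be built with the stream module:

    stream {
        upstream kube_apiserver {
            server 192.168.0.250:6443;
            server 192.168.0.249:6443;
            server 192.168.0.251:6443;
        }
        server {
            listen 8443;
            proxy_pass kube_apiserver;
        }
    }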
  12. On the 3 Nodes other than the master nodes, modify the apiserver address
    vim /etc/kubernetes/kubelet.conf and change the apiserver address to kubernetes.default.svc. The Load Balance address cannot be filled in directly, because the connection to the apiserver would fail authentication; the certificates generated for each master earlier include the kubernetes.default.svc address, which is why this name works.
    Since these nodes have no apiserver of their own, they have to point to the Load Balance, so also modify /etc/hosts so that this name resolves to the Load Balance address:
    192.168.0.120 kubernetes.default.svc
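    The corresponding kubelet.conf change on each of those nodes then looks roughly like this (the 8443 port is the Load Balance port from step 15, and https://192.168.0.250:6443 is assumed to be the old value written by kubeadm join):

        server: https://192.168.0.250:6443
    change to
        server: https://kubernetes.default.svc:8443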
    
  13. Change the kube-proxy configuration (skipping this step will leave pods on the NODEs unable to resolve DNS); only one place needs to be changed, because all nodes share the same configmap

    1: Check the existing configmap
    root@shaolin:~# kubectl get configmap -n kube-system
    NAME         DATA      AGE
    kube-proxy   1         5d

    2: Save the kube-proxy configmap to a local yaml file
    kubectl get configmap/kube-proxy -n kube-system -o yaml > kube-proxy-configmap.yaml

    3: Modify and save the configuration
    vim kube-proxy-configmap.yaml

    apiVersion: v1
    data:
      kubeconfig.conf: |
        apiVersion: v1
        kind: Config
        clusters:
        - cluster:
            certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            server: https://kubernetes.default.svc:6443
          name: default
    

    root@shaolin:~# kubectl apply -f kube-proxy-configmap.yaml
    Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
    configmap "kube-proxy" configured

    4: View the updated configuration
    root@shaolin:~# kubectl get configmap/kube-proxy -n kube-system -o yaml
    apiVersion: v1
    data:
      kubeconfig.conf: |
        apiVersion: v1
        kind: Config
        clusters:
        - cluster:
            certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            server: https://kubernetes.default.svc:6443
          name: default
    

    5: Add kubernetes.default.svc to the hosts file
    6: Delete the kube-proxy pods so they are rebuilt
    kubectl delete pod <kube-proxy-pod-name> -n kube-system
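    Instead of deleting the kube-proxy pods one by one, they can usually be removed in one go by label (assuming the kubeadm default label k8s-app=kube-proxy); the DaemonSet recreates them with the new configuration:

    kubectl delete pod -n kube-system -l k8s-app=kube-proxy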

  14. HA for kube-dns
    First scale DNS manually to validate it:
    kubectl --namespace=kube-system scale deployment kube-dns --replicas=3
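    To confirm that the extra replicas come up and are spread across nodes, something like the following can be used (assuming the standard k8s-app=kube-dns label):

    kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide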

    If not, consider the following article
    https://kubernetes.io/docs/tasks/administer-cluster/dns-horizontal-autoscaling/

  15. On non-master nodes, change /etc/kubernetes/kubelet.conf to the Load Balance IP and port
    server: https://kubernetes.default.svc:8443
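    As in the earlier steps, restart the kubelet on each node after editing kubelet.conf so the change takes effect:

    systemctl daemon-reload
    systemctl restart kubelet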

  16. On the copied master nodes, set the environment variable
    export KUBECONFIG=/etc/kubernetes/admin.conf
    and also add export KUBECONFIG=/etc/kubernetes/admin.conf
    so that it is set at startup (vim ~/.bashrc), as shown below
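    For example, appending it to ~/.bashrc so it is applied at every login:

    echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> ~/.bashrc
    source ~/.bashrc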

  17. View the cluster status

    View the cluster information

    kubectl cluster-info

    Check the status of each component

    kubectl get componentstatuses

    HA test

     On each master, delete a pod and check on the corresponding node whether the container is removed
    On each master, change the replica count of a pod and check on the corresponding node whether the number of containers follows
    Shut down each master in turn and check the leader situation of the three components etcd, scheduler and controller-manager (a command sketch follows at the end of this list)

    Power off a node and check whether its pods are recreated on another node
        Observation: kubectl get node quickly shows the node turn NotReady, but the pod status stays Running; after roughly 5 minutes the pods on the powered-off node turn Unknown and are recreated on other nodes.
        After the node is powered back on, the pods are not moved back to the original machine
    Shut down the masters one by one, then bring them back up one by one, and check whether the cluster keeps working.
            Observation: after shutting down master, the status of the pods on master also turns Unknown; deleting and creating pods from master1 still works. However, the surviving etcd members show an inconsistent raft status, and it is unclear whether that is normal.

        The health checks are all normal

        The controller-manager and scheduler on node1 become the leaders of the cluster

        After restarting master, it recovers its functions
        But after shutting down both master and master1, the cluster runs into problems
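    A rough way to check the leader situation during these tests, reusing the commands from the earlier steps (pod names and endpoints are the ones used above):

    # etcd: the endpoint that reports true is the current leader
    kubectl exec -it etcd-master1.k8s -n kube-system -- sh -c \
      "ETCDCTL_API=3 etcdctl endpoint status --endpoints=192.168.0.249:2379,192.168.0.250:2379"

    # controller-manager / scheduler: look for "became leader" in the logs
    kubectl logs kube-controller-manager-master1.k8s -n kube-system | grep leader
    kubectl logs kube-scheduler-master1.k8s -n kube-system | grep leader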
    

Reference

https://kubernetes.io/docs/admin/high-availability/
https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/
http://tonybai.com/2017/05/15/setup-a-ha-kubernetes-cluster-based-on-kubeadm-part1/
http://tonybai.com/2017/05/15/setup-a-ha-kubernetes-cluster-based-on-kubeadm-part2/
