<Summary> Deploying Kubernetes + Heapster + InfluxDB + Grafana in Detail


1. Deploying Kubernetes

I. Installation steps

Preparation

Disable the firewall

To avoid conflicts with the iptables rules managed by Docker, the firewall on each node needs to be disabled:

$ systemctl stop firewalld
$ systemctl disable firewalld

Install NTP

To keep the clocks of all the servers in sync, install NTP on every server as well:

$ yum -y install ntp
$ systemctl start ntpd
$ systemctl enable ntpd

Deploying the Master

Install etcd and Kubernetes

$ yum -y install etcd kubernetes

Configure etcd

Edit the etcd configuration file /etc/etcd/etcd.conf:

ETCD_NAME=default
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"

Configure the network in etcd

Define the network configuration in etcd; the flannel service on each node will pull this configuration:

$ etcdctl mk /coreos.com/network/config '{"Network":"172.17.0.0/16"}'
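
To confirm the key was written, you can read it back with the same etcdctl v2 syntax (a quick optional check):

$ etcdctl get /coreos.com/network/config
{"Network":"172.17.0.0/16"}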

 

Configure the Kubernetes API server

Edit the API server configuration file /etc/kubernetes/apiserver:
KUBE_API_ADDRESS="--address=0.0.0.0"
KUBE_API_PORT="--port=8080"
KUBELET_PORT="--kubelet_port=10250"
KUBE_ETCD_SERVERS="--etcd_servers=http://10.0.222.2:2379"
KUBE_SERVICE_ADDRESSES="--portal_net=10.254.0.0/16"
KUBE_ADMISSION_CONTROL="--admission_control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ResourceQuota"
KUBE_API_ARGS=""

Note that ServiceAccount, which KUBE_ADMISSION_CONTROL includes by default, has to be removed here; otherwise the API server will report an error on startup.

Start the services

Next, start the following services on the Master:

$ for SERVICES in etcd kube-apiserver kube-controller-manager kube-scheduler; do
    systemctl restart $SERVICES
    systemctl enable $SERVICES
    systemctl status $SERVICES
done
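
Once the loop finishes, a quick sanity check against the API server (optional; this assumes it is listening on 10.0.222.2:8080 as configured above):

$ curl http://10.0.222.2:8080/healthz
ok
$ kubectl -s "http://10.0.222.2:8080" get componentstatuses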

Deploying the Nodes

Install Kubernetes and Flannel

$ yum -y install flannel kubernetes

Configure Flannel

Edit the Flannel configuration file /etc/sysconfig/flanneld:

FLANNEL_ETCD="http://10.0.222.2:2379"
FLANNEL_ETCD_KEY="/coreos.com/network"
FLANNEL_OPTIONS="--iface=ens3"

Note that the iface value in FLANNEL_OPTIONS must be the network interface of your own server; it will differ from mine on other servers and configurations.
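
If you are unsure of the interface name, list the addresses and pick the interface that carries the node's IP (the interface names and addresses below are only an example):

$ ip -o -4 addr show
1: lo      inet 127.0.0.1/8 ...
2: ens3    inet 10.0.222.3/24 ...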

Start Flannel

$ systemctl restart flanneld
$ systemctl enable flanneld
$ systemctl status flanneld
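
Once flanneld is running it writes its subnet lease to /run/flannel/subnet.env (the path used by the stock flannel unit on CentOS 7); checking that file is a quick way to confirm the overlay network came up:

$ cat /run/flannel/subnet.env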

Upload the network configuration

Create a config.json in the current directory with the following content:

{
  "Network": "172.17.0.0/16",
  "SubnetLen": 24,
  "Backend": {
    "Type": "vxlan",
    "VNI": 7890
  }
}

Then upload the configuration to the etcd server:

$ curl -L http://10.0.222.2:2379/v2/keys/coreos.com/network/config -XPUT --data-urlencode [email protected]
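
The same v2 key API can be used to read the value back and confirm the upload:

$ curl -L http://10.0.222.2:2379/v2/keys/coreos.com/network/config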

Modify the Kubernetes configuration

Edit the default Kubernetes configuration file /etc/kubernetes/config:

KUBE_MASTER="--master=http://10.0.222.2:8080"

Modify the kubelet configuration

Edit the kubelet service configuration file /etc/kubernetes/kubelet:

KUBELET_ADDRESS="--address=0.0.0.0"
KUBELET_PORT="--port=10250"
# change the hostname_override to this node's hostname
KUBELET_HOSTNAME="--hostname_override=node1"
KUBELET_API_SERVER="--api_servers=http://10.0.222.2:8080"
KUBELET_ARGS=""

On each node, only KUBELET_HOSTNAME needs to be changed to that node's hostname.

Start the node services

$ for SERVICES in kube-proxy kubelet docker; do
    systemctl restart $SERVICES
    systemctl enable $SERVICES
    systemctl status $SERVICES
done

Create a snapshot and install the remaining nodes from it (only the hostname and KUBELET_HOSTNAME need to be changed, as in the sketch below).
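
On each cloned node the per-host changes are just the hostname and the kubelet override, for example (node2 is only an illustrative name):

$ hostnamectl set-hostname node2
$ sed -i 's/--hostname_override=node1/--hostname_override=node2/' /etc/kubernetes/kubelet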

View the cluster nodes

After the deployment is complete, the kubectl command can be used to check the state of the whole cluster:

$ kubectl -s "http://10.0.222.2:8080" get nodes
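
A healthy two-node cluster looks roughly like this (the node names and ages are only examples):

NAME      STATUS    AGE
node1     Ready     1h
node2     Ready     1h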

2. Deploying InfluxDB

Install InfluxDB

Configuration file

Edit the configuration file

Check the ports InfluxDB listens on (8091, 8083, 8086, 8088)

Enter the influx command line
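
The original post documented these steps with screenshots; as a minimal sketch, the corresponding commands (assuming the InfluxDB RPM defaults, with the configuration file at /etc/influxdb/influxdb.conf) would be:

$ vi /etc/influxdb/influxdb.conf           # edit the configuration file
$ systemctl start influxdb
$ systemctl enable influxdb
$ netstat -tnlp | grep influxd             # expect 8091, 8083 (admin UI), 8086 (HTTP API), 8088
$ influx                                   # enter the InfluxDB command line
> SHOW DATABASES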


3. Deploying Grafana

View the configuration file:

Start the Grafana service

[root@influxdb src]# systemctl enable grafana-server
[root@influxdb src]# systemctl start grafana-server
[root@influxdb src]# systemctl status grafana-server
● grafana-server.service - Starts and stops a single grafana instance on this system
   Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-03-16 04:29:11 EDT; 6s ago
     Docs: http://docs.grafana.org
 Main PID: 2519 (grafana-server)
   CGroup: /system.slice/grafana-server.service
           └─2519 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile= cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/graf...

Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Migrator: exec migration id: drop table dashboard_snapshot_v4 #1
Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Migrator: exec migration id: create dashboard_snapshot table v5 #2
Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Migrator: exec migration id: create index UQE_dashboard_snapshot_key - v5
Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Migrator: exec migration id: create index UQE_dashboard_snapshot_delete_key - v5
Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Migrator: exec migration id: create index IDX_dashboard_snapshot_user_id - v5
Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Migrator: exec migration id: alter dashboard_snapshot to mediumtext v2
Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Migrator: exec migration id: create quota table v1
Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Migrator: exec migration id: create index UQE_quota_org_id_user_id_target - v1
Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Created default admin user: admin
Mar 16 04:29:12 influxdb grafana-server[2519]: 2016/03/16 04:29:12 [I] Listen: http://0.0.0.0:3000
[root@influxdb src]#

Check the listening port (3000)

[root@influxdb src]# netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      911/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      1476/master
tcp        0      0 127.0.0.1:8091          0.0.0.0:*               LISTEN      2424/influxd
tcp6       0      0 :::8083                 :::*                    LISTEN      2424/influxd
tcp6       0      0 :::8086                 :::*                    LISTEN      2424/influxd
tcp6       0      0 :::22                   :::*                    LISTEN      911/sshd
tcp6       0      0 :::3000                 :::*                    LISTEN      2519/grafana-server
tcp6       0      0 :::8088                 :::*                    LISTEN      2424/influxd
tcp6       0      0 ::1:25                  :::*                    LISTEN      1476/master
[root@influxdb src]#

Open http://192.168.12.172:3000 in a browser.

The default credentials are admin/admin.

4. Downloading Heapster

docker pull index.tenxcloud.com/google_containers/heapster:v1.1.0
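
The image has to be available to every node that might run the Heapster pod, so pull it on each node (or push it to a registry the nodes can reach); its presence can be confirmed with:

docker images | grep heapster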

5. Deployment and Usage

Deploy Heapster, connecting it to InfluxDB with the database name k8s:
[root@k8s_master k8s]# more heapster-controller.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: heapster
  labels:
    name: heapster
spec:
  replicas: 1
  selector:
    name: heapster
  template:
    metadata:
      labels:
        name: heapster
    spec:
      containers:
      - name: heapster
        image: index.tenxcloud.com/google_containers/heapster:v1.1.0
        command:
          - /heapster
          - --source=kubernetes:http://192.168.12.174:8080?inClusterConfig=false&kubeletHttps=true&kubeletPort=10250&useServiceAccount=true&auth=
          - --sink=influxdb:http://192.168.12.172:8086
[root@k8s_master k8s]# kubectl create -f heapster-controller.yaml
replicationcontroller "heapster" created
[root@k8s_master k8s]# kubectl get pods
NAME             READY     STATUS    RESTARTS   AGE
frontend-9yoz1   1/1       Running   0          22m
heapster-sgflr   1/1       Running   0          36m
[root@k8s_master k8s]# kubectl logs heapster-sgflr
I0418 03:05:33.857588       1 heapster.go:61] /heapster --source=kubernetes:http://192.168.12.174:8080?inClusterConfig=false&kubeletHttps=true&kubeletPort=10250&useServiceAccount=true&auth= --sink=influxdb:http://192.168.12.172:8086
I0418 03:05:33.857856       1 heapster.go:62] Heapster version 1.1.0
I0418 03:05:33.858113       1 kube_factory.go:172] Using Kubernetes client with master "http://192.168.12.174:8080" and version "v1"
I0418 03:05:33.858145       1 kube_factory.go:173] Using kubelet port 10250
I0418 03:05:33.861330       1 driver.go:316] created influxdb sink with options: {root root 192.168.12.172:8086 k8s false}
I0418 03:05:33.862708       1 driver.go:245] Created database "k8s" on influxDB server at "192.168.12.172:8086"
I0418 03:05:33.872579       1 heapster.go:72] Starting heapster on port 8082
[root@k8s_master k8s]#


For the --sink parameter, see https://github.com/kubernetes/heapster/blob/master/docs/sink-configuration.md
For the --source parameter, see https://github.com/kubernetes/heapster/blob/master/docs/source-configuration.md

Log in to Grafana and change the connected database to k8s.

(Screenshots: heapster-01, heapster-02)
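
If you prefer the HTTP API to the UI, a data source can also be created with a curl call along these lines (a sketch only; the fields follow Grafana's data source API, and the root/root credentials match the InfluxDB defaults Heapster used above):

$ curl -u admin:admin -H "Content-Type: application/json" \
    -X POST http://192.168.12.172:3000/api/datasources \
    -d '{"name":"influxdb-k8s","type":"influxdb","access":"proxy","url":"http://192.168.12.172:8086","database":"k8s","user":"root","password":"root","isDefault":true}'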

6. Notes on the Data Collected by Heapster 1.1.0


When the Heapster container starts on its own, it connects to InfluxDB and creates the k8s database.

The metrics Heapster collects fall into two categories (keep this in mind when searching in Grafana):

    1) cumulative: accumulated values, such as CPU usage time and network traffic in/out.

    2) gauge: instantaneous values, such as memory usage.

 

Metric                          Description                                              Type
cpu/limit                       CPU limit; can be set in the YAML file                   gauge (instantaneous)
cpu/node_reservation            CPU reserved on the kube node, similar to cpu/limit      gauge (instantaneous)
cpu/node_utilization            CPU utilization                                          gauge (instantaneous)
cpu/request                     CPU requested; can be set in the YAML file               gauge (instantaneous)
cpu/usage                       CPU usage                                                cumulative
cpu/usage_rate                  CPU usage rate                                           gauge (instantaneous)
filesystem/limit                Filesystem limit                                         gauge (instantaneous)
filesystem/usage                Filesystem usage                                         gauge (instantaneous)
memory/limit                    Memory limit; can be set in the YAML file                gauge (instantaneous)
memory/major_page_faults        Major page faults                                        cumulative
memory/major_page_faults_rate   Major page fault rate                                    gauge (instantaneous)
memory/node_reservation         Memory reserved on the node                              gauge (instantaneous)
memory/node_utilization         Node memory utilization                                  gauge (instantaneous)
memory/page_faults              Page faults                                              gauge (instantaneous)
memory/page_faults_rate         Page fault rate                                          gauge (instantaneous)
memory/request                  Memory requested; can be set in the YAML file            gauge (instantaneous)
memory/usage                    Memory usage                                             gauge (instantaneous)
memory/working_set              Memory working set                                       gauge (instantaneous)
network/rx                      Total network bytes received                             cumulative
network/rx_errors               Network receive errors                                   uncertain
network/rx_errors_rate          Network receive error rate                               gauge (instantaneous)
network/rx_rate                 Network receive rate                                     gauge (instantaneous)
network/tx                      Total network bytes sent                                 cumulative
network/tx_errors               Network send errors                                      uncertain
network/tx_errors_rate          Network send error rate                                  gauge (instantaneous)
network/tx_rate                 Network send rate                                        gauge (instantaneous)
uptime                          Time since the container started, in milliseconds        gauge (instantaneous)
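
To see what Heapster actually wrote, the k8s database can be queried from the influx CLI (a sketch; the per-metric measurement names and the type tag below assume Heapster v1.1.0's default InfluxDB schema):

$ influx
> USE k8s
> SHOW MEASUREMENTS
> SELECT value FROM "cpu/usage_rate" WHERE "type" = 'pod' ORDER BY time DESC LIMIT 5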





