I. Overview
- Use metrics-server to collect cluster resource metrics for k8s consumers such as kubectl top, the HPA, and the scheduler
- Use prometheus-operator to deploy prometheus and store the monitoring data
- Use kube-state-metrics to collect data about the cluster's resource objects
- Use node_exporter to collect data from each node in the cluster
- Use prometheus to collect data from the apiserver, scheduler, controller-manager, and kubelet components
- Use alertmanager to implement monitoring alerts
- Use grafana for data visualization
1. Deploy metrics-server
I usually deploy this service on the master node, which requires modifying metrics-server-deployment.yaml:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      tolerations:
      - effect: NoSchedule
        key: node.kubernetes.io/unschedulable
        operator: Exists
      # tolerate the master taint so the pod can schedule there
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
      nodeSelector:
        metrics: "yes"
Add a label to the master node so that the nodeSelector matches:
kubectl label node <master-node-name> metrics=yes
Deploy:
kubectl create -f metrics-server-deployment.yaml
Verify:
kubectl top nodes
If the command returns CPU and memory usage for each node, metrics-server is working.
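With metrics-server running, the HPA mentioned in the overview can consume these metrics. A minimal sketch, assuming a hypothetical Deployment named myapp:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp                # hypothetical deployment to scale
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 80   # scale out when average CPU exceeds 80%
```

Without metrics-server, the HPA controller has no CPU data and the HPA stays at `<unknown>` utilization.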
Note: metrics-server resolves nodes by hostname by default, but CoreDNS cannot resolve the hostnames of the physical machines. One fix is to add this parameter to the deployment:
- --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
The other is to set up a dnsmasq service as an upstream DNS; see https://www.cnblogs.com/cuishuai/p/9856843.html.
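The upstream-DNS approach amounts to pointing CoreDNS at the dnsmasq server. A sketch of the relevant fragment of the coredns ConfigMap in kube-system, where 192.168.1.10 is a placeholder for a dnsmasq server that can resolve the node hostnames:

```yaml
Corefile: |
  .:53 {
      errors
      health
      kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
      }
      prometheus :9153
      forward . 192.168.1.10   # hypothetical dnsmasq upstream that resolves node hostnames
      cache 30
      loop
      reload
      loadbalance
  }
```

After editing the ConfigMap, the coredns pods must be restarted for the change to take effect.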
2. Deploy Prometheus
Download the related manifests:
cd k8s-monitor
1. Build a dynamic NFS service to provide persistent storage

1.1 Install NFS on the master:
sudo yum install -y nfs-utils
In nfs-deployment.yaml, change the image pull path to lizhenliang/nfs-client-provisioner:latest.
Export the shared directory:
sudo vi /etc/exports
/data/opv *(rw,sync,no_root_squash,no_subtree_check)
Note: replace * with your own IP range; on a pure intranet you can also keep * to allow any host.
Restart and enable the services:
sudo systemctl restart rpcbind
sudo systemctl restart nfs
sudo systemctl enable rpcbind nfs

1.2 Install on the client (Node) nodes:
sudo yum install -y nfs-utils
mount -t nfs k8s-masterIP:/data/opv /data/opv -o proto=tcp -o nolock
For convenience, put the mount command above into .bashrc.
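Instead of re-running the mount from .bashrc, the mount can be made persistent with an /etc/fstab entry. A sketch, where k8s-masterIP stands in for your master's address as above:

```
# /etc/fstab entry (k8s-masterIP is a placeholder for the NFS server address)
k8s-masterIP:/data/opv  /data/opv  nfs  proto=tcp,nolock  0  0
```

With this in place, `sudo mount -a` (or a reboot) mounts the share on every node.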
2. On the master:

2.1 Create the namespace:
kubectl create -f nfs/monitoring-namespace.yaml
2.2 Create RBAC for NFS:
kubectl create -f nfs/rbac.yaml
2.3 Create the deployment (replace the NFS address with your own):
kubectl create -f nfs/nfs-deployment.yaml
2.4 Create the StorageClass:
kubectl create -f nfs/storageClass.yaml
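For reference, a typical StorageClass for the nfs-client-provisioner looks like the sketch below; the name managed-nfs-storage and the provisioner value fuseim.pri/ifs are assumptions that must match what nfs-deployment.yaml configures:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage    # hypothetical name; PVCs reference it via storageClassName
provisioner: fuseim.pri/ifs    # must match the PROVISIONER_NAME env in nfs-deployment.yaml
parameters:
  archiveOnDelete: "false"     # delete (rather than archive) data when the PVC is removed
```

Any PVC that names this StorageClass will get an NFS-backed PV provisioned automatically.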
2. Install Prometheus
cd k8s-monitor/Promutheus/prometheus
1. Create the RBAC permissions:
kubectl create -f rbac.yaml
2. Create node-exporter:
kubectl create -f prometheus-node-exporter-daemonset.yaml
kubectl create -f prometheus-node-exporter-service.yaml
3. Create kube-state-metrics:
kubectl create -f kube-state-metrics-deployment.yaml
kubectl create -f kube-state-metrics-service.yaml
4. Create node-directory-size-metrics:
kubectl create -f node-directory-size-metrics-daemonset.yaml
5. Create prometheus:
kubectl create -f prometheus-pvc.yaml
kubectl create -f prometheus-core-configmap.yaml
kubectl create -f prometheus-core-deployment.yaml
kubectl create -f prometheus-core-service.yaml
kubectl create -f prometheus-rules-configmap.yaml
6. Modify the etcd address in the core configmap:
# In prometheus-core-configmap.yaml, modify the section starting at line 143 as follows; note the indentation of the two lines under tls_config:
vim prometheus-core-configmap.yaml +143
- job_name: 'etcd'
  scheme: https
  tls_config:
    insecure_skip_verify: true
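A complete version of this scrape job also needs the etcd target address. A sketch, where 192.168.1.10:2379 is a placeholder for your own etcd endpoint:

```yaml
- job_name: 'etcd'
  scheme: https
  tls_config:
    insecure_skip_verify: true
  static_configs:
  - targets:
    - 192.168.1.10:2379   # hypothetical etcd address; replace with your own
```

After changing the configmap, re-create it and restart the prometheus pod so the new config is loaded.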
7. If the prometheus Pod has problems, first kubectl delete the yaml files in order, starting with:
kubectl delete -f prometheus-core-service.yaml
After deleting all the yaml files, re-create them with kubectl create.
3. Install Grafana
cd k8s-monitor/Promutheus/grafana
1. Create the grafana service:
kubectl create -f grafana-svc.yaml
2. Create the configmap:
kubectl create -f grafana-configmap.yaml
3. Create the PVC:
kubectl create -f grafana-pvc.yaml
4. Create the grafana deployment:
kubectl create -f grafana-deployment.yaml
5. Create the dashboard configmap:
kubectl create configmap "grafana-import-dashboards" --from-file=dashboards/ --namespace=monitoring
6. Create the job that imports the dashboards and other data:
kubectl create -f grafana-job.yaml
Check the deployment:
prometheus and grafana expose their services via NodePort, so you can access them directly from outside the cluster.
grafana's default username and password: admin/admin
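For reference, a NodePort exposure like the one grafana-svc.yaml performs can be sketched as below; the nodePort value 30006 and the app: grafana selector are assumptions, not the file's actual contents:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - port: 3000           # grafana's default listening port
    targetPort: 3000
    nodePort: 30006      # hypothetical; must fall in the NodePort range (default 30000-32767)
  selector:
    app: grafana         # hypothetical; must match the grafana pod labels
```

The UI is then reachable at http://<any-node-IP>:30006.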
Q&A:
1. The cluster was deployed with kubeadm; controller-manager and scheduler listen on 127.0.0.1, so prometheus cannot collect their data?
You can modify their listening address before initialization:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    address: 0.0.0.0
scheduler:
  extraArgs:
    address: 0.0.0.0
If the cluster has already been built:
sed -e "s/- --address=127.0.0.1/- --address=0.0.0.0/" -i /etc/kubernetes/manifests/kube-controller-manager.yaml
sed -e "s/- --address=127.0.0.1/- --address=0.0.0.0/" -i /etc/kubernetes/manifests/kube-scheduler.yaml
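Once the components listen on 0.0.0.0, prometheus can scrape them on their metrics ports. A sketch of the two scrape jobs, assuming the insecure metrics ports of this kubernetes era (10251 for the scheduler, 10252 for the controller-manager) and a placeholder master IP:

```yaml
- job_name: 'kube-scheduler'
  static_configs:
  - targets:
    - 192.168.1.10:10251   # hypothetical master IP; scheduler metrics port
- job_name: 'kube-controller-manager'
  static_configs:
  - targets:
    - 192.168.1.10:10252   # hypothetical master IP; controller-manager metrics port
```

Both jobs go into the scrape_configs section of prometheus-core-configmap.yaml alongside the etcd job.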
2. metrics-server cannot be used and reports an error that it cannot resolve the node's hostname?
Modify the deployment file to add:
- --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
3. metrics-server reports an x509 untrusted-certificate error?
Add --kubelet-insecure-tls to the command:

command:
- /metrics-server
- --kubelet-insecure-tls
4. Complete configuration:

containers:
- name: metrics-server
  image: k8s.gcr.io/metrics-server-amd64:v0.3.1
  command:
  - /metrics-server
  - --metric-resolution=30s
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP