Kubernetes Cloud-Native Monitoring: Node Performance Monitoring
Overview
Prometheus was originally designed as an open-source monitoring and alerting tool for cloud-native applications. Before analyzing the Kubernetes service discovery protocol, let's sort out how Prometheus connects to a cloud-native Kubernetes cluster for monitoring.
Monitoring a cloud-native Kubernetes cluster mainly involves three types of metrics: physical node metrics, pod & container resource metrics, and Kubernetes cluster resource metrics. There are relatively mature solutions for all three, as shown in the figure below:
Environment information
The Kubernetes cluster I built is shown in the figure below; the demonstrations that follow are all based on this cluster:
node-exporter deployment
Physical node performance metrics are generally collected with node_exporter, the official Prometheus exporter for gathering all kinds of server node operating metrics; node_exporter currently covers almost all common monitoring points.
On a cloud-native Kubernetes cluster we can deploy the service through a DaemonSet controller, so that every node in the cluster automatically runs one such Pod; if nodes are removed from or added to the cluster, the DaemonSet scales along with them automatically.
1. Create the DaemonSet orchestration file node-exporter-daemonset.yaml:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    name: node-exporter
spec:
  selector:
    matchLabels:
      name: node-exporter
  template:
    metadata:
      labels:
        name: node-exporter
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      containers:
      - name: node-exporter
        image: prom/node-exporter
        ports:
        - containerPort: 9100
        resources:
          requests:
            cpu: 0.15
        securityContext:
          privileged: true
        args:
        - --path.procfs
        - /host/proc
        - --path.sysfs
        - /host/sys
        - --collector.filesystem.ignored-mount-points
        - '^/(sys|proc|dev|host|etc)($|/)'
        volumeMounts:
        - name: dev
          mountPath: /host/dev
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: rootfs
          mountPath: /rootfs
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: dev
        hostPath:
          path: /dev
      - name: sys
        hostPath:
          path: /sys
      - name: rootfs
        hostPath:
          path: /
2. Create the Pods through the DaemonSet controller:
kubectl create -f node-exporter-daemonset.yaml
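A quick way to confirm the DaemonSet itself is healthy is to check that it reports one desired and ready Pod per node (a sketch; the name and namespace match the manifest above):

# Verify that the DaemonSet reports one ready Pod per node
kubectl get daemonset node-exporter -n kube-system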
3. Check whether the Pods are running normally:
[root@master k8s-demo]# kubectl get pod -n kube-system -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-6c89d944d5-hg47n 1/1 Running 0 15d 10.100.219.68 master <none> <none>
calico-node-247w2 1/1 Running 0 15d 192.168.52.151 master <none> <none>
calico-node-pt848 1/1 Running 0 15d 192.168.52.152 node1 <none> <none>
calico-node-z65m2 1/1 Running 0 15d 192.168.52.153 node2 <none> <none>
coredns-59c898cd69-f9858 1/1 Running 0 15d 10.100.219.65 master <none> <none>
coredns-59c898cd69-ghbdg 1/1 Running 0 15d 10.100.219.66 master <none> <none>
etcd-master 1/1 Running 0 15d 192.168.52.151 master <none> <none>
kube-apiserver-master 1/1 Running 1 15d 192.168.52.151 master <none> <none>
kube-controller-manager-master 1/1 Running 10 15d 192.168.52.151 master <none> <none>
kube-proxy-5thg7 1/1 Running 0 15d 192.168.52.152 node1 <none> <none>
kube-proxy-659zl 1/1 Running 0 15d 192.168.52.153 node2 <none> <none>
kube-proxy-p2vvz 1/1 Running 0 15d 192.168.52.151 master <none> <none>
kube-scheduler-master 1/1 Running 9 15d 192.168.52.151 master <none> <none>
kube-state-metrics-5f84848c58-v7v9z 1/1 Running 0 15d 10.100.166.135 node1 <none> <none>
kuboard-74c645f5df-zzwnm 1/1 Running 0 15d 10.100.104.2 node2 <none> <none>
metrics-server-7dbf6c4558-qhjw4 1/1 Running 0 15d 192.168.52.152 node1 <none> <none>
node-exporter-57djg 1/1 Running 0 3m13s 192.168.52.152 node1 <none> <none>
node-exporter-5kcnx 1/1 Running 0 3m13s 192.168.52.151 master <none> <none>
node-exporter-cz45t 1/1 Running 0 3m13s 192.168.52.153 node2 <none> <none>
As shown above, three Pods named node-exporter-xxx were created, one on each of the cluster's three nodes, and all of them are in the Running state. We can fetch the performance metrics of the three nodes through the following URLs:
curl http://192.168.52.151:9100/metrics
curl http://192.168.52.152:9100/metrics
curl http://192.168.52.153:9100/metrics
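Each endpoint returns thousands of samples in the Prometheus text format, so filtering for a single well-known metric makes an easy smoke test (a sketch, using a standard node_exporter memory metric):

# Spot-check one metric instead of scrolling through the full output
curl -s http://192.168.52.151:9100/metrics | grep node_memory_MemTotal_bytes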
Token creation
After node-exporter is deployed on the cloud-native Kubernetes cluster, Prometheus will rely on the Kubernetes service discovery mechanism to reach it, and the interaction with the Kubernetes API requires a token for authentication.
1. Define the ServiceAccount; the p8s_sa.yaml file is as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1  # deprecated; use rbac.authorization.k8s.io/v1 on v1.22+ clusters
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1beta1  # deprecated; use rbac.authorization.k8s.io/v1 on v1.22+ clusters
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-system
2. Create the ServiceAccount:
[root@master k8s-demo]# kubectl apply -f p8s_sa.yaml
serviceaccount/prometheus created
Warning: rbac.authorization.k8s.io/v1beta1 ClusterRole is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRole
clusterrole.rbac.authorization.k8s.io/prometheus created
Warning: rbac.authorization.k8s.io/v1beta1 ClusterRoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 ClusterRoleBinding
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
3. View the ServiceAccount information:
[root@master k8s-demo]# kubectl get sa prometheus -n kube-system -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"prometheus","namespace":"kube-system"}}
  creationTimestamp: "2021-07-21T04:47:10Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:secrets:
        .: {}
        k:{"name":"prometheus-token-6hln9"}:
          .: {}
          f:name: {}
    manager: kube-controller-manager
    operation: Update
    time: "2021-07-21T04:47:10Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
    manager: kubectl-client-side-apply
    operation: Update
    time: "2021-07-21T04:47:10Z"
  name: prometheus
  namespace: kube-system
  resourceVersion: "113843"
  selfLink: /api/v1/namespaces/kube-system/serviceaccounts/prometheus
  uid: cbfe8330-de8f-40fd-a9b3-5aa312bb9104
secrets:
- name: prometheus-token-6hln9
4. Obtain the token from the secret identified by secrets.name above:
[root@master k8s-demo]# kubectl describe secret prometheus-token-6hln9 -n kube-system
Name: prometheus-token-6hln9
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name: prometheus
kubernetes.io/service-account.uid: cbfe8330-de8f-40fd-a9b3-5aa312bb9104
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1066 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6Ikx6VHBOSXRwSmFCNmc2aXppS2tFeXFSTjlNVzJMNHhGX05fT3dLcXppSDQifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJwcm9tZXRoZXVzLXRva2VuLTZobG45Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6InByb21ldGhldXMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJjYmZlODMzMC1kZThmLTQwZmQtYTliMy01YWEzMTJiYjkxMDQiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06cHJvbWV0aGV1cyJ9.my_sEOjhx4hxApeGRhZmpFwK7snRKuYDyjlToYzXZSytdefPugMiHP1lA0bkDvxPiS0Pces2_hSJlB0pRDacqAgipE2_hqIx2GUO6t35mfbTthB7k4wbf9rQT4lag9XUzjdInOEV3SF4nfCG1DcbSM8a9COSXJUXkshXfollPYj1AGvAmTVYSSmK_b898z64WsDk9JNMjyM7VrI-kj20fKVgc0Ngi4kV3XKqRkCuKIZXKudmuUaqthbeVhaOKWhXzfBW2wDaVsNzsHMLqzwp8vVRIfZbudQ9gVGVZoskgRYiyNoNJcLjbphdxRN1hhWoBTITKHHFyQhwZGzTBo_f6g
5. Save the token to the token.k8s file:
eyJhbGciOiJSUzI1NiIsImtpZCI6Ikx6VHBOSXRwSmFCNmc2aXppS2tFeXFSTjlNVzJMNHhGX05fT3dLcXppSDQifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJwcm9tZXRoZXVzLXRva2VuLTZobG45Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6InByb21ldGhldXMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJjYmZlODMzMC1kZThmLTQwZmQtYTliMy01YWEzMTJiYjkxMDQiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06cHJvbWV0aGV1cyJ9.my_sEOjhx4hxApeGRhZmpFwK7snRKuYDyjlToYzXZSytdefPugMiHP1lA0bkDvxPiS0Pces2_hSJlB0pRDacqAgipE2_hqIx2GUO6t35mfbTthB7k4wbf9rQT4lag9XUzjdInOEV3SF4nfCG1DcbSM8a9COSXJUXkshXfollPYj1AGvAmTVYSSmK_b898z64WsDk9JNMjyM7VrI-kj20fKVgc0Ngi4kV3XKqRkCuKIZXKudmuUaqthbeVhaOKWhXzfBW2wDaVsNzsHMLqzwp8vVRIfZbudQ9gVGVZoskgRYiyNoNJcLjbphdxRN1hhWoBTITKHHFyQhwZGzTBo_f6g
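Instead of copying the token by hand, it can be extracted and saved in one step (a sketch; the secret name comes from the previous output, and jsonpath returns the base64-encoded value):

# Decode the service-account token and write it straight to the file
kubectl -n kube-system get secret prometheus-token-6hln9 \
  -o jsonpath='{.data.token}' | base64 -d > token.k8s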
Prometheus access
Next, we connect Prometheus to the node-exporter deployed above through the Kubernetes service discovery mechanism and let it scrape the performance metrics.
1. Add a scrape job to the prometheus.yml configuration:
- job_name: kubernetes-nodes
  kubernetes_sd_configs:
  - role: node
    api_server: https://apiserver.simon:6443
    bearer_token_file: /tools/token.k8s
    tls_config:
      insecure_skip_verify: true
  bearer_token_file: /tools/token.k8s
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
  - source_labels: [__address__]
    regex: '(.*):10250'
    replacement: '${1}:9100'
    target_label: __address__
    action: replace
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
Notes:
a. bearer_token_file points to the token file saved in the previous step; it appears twice because the token is used both for node discovery against the API server and for the scrape requests themselves;
b. The api_server address can be found in the /root/.kube/config file;
c. The hostname must resolve, so add an entry to the /etc/hosts file:
192.168.52.151 apiserver.simon
The first relabel rule rewrites each discovered node address from the kubelet port 10250 to the node-exporter port 9100, and the labelmap rule copies the Kubernetes node labels onto the target.
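Before restarting Prometheus, it is worth verifying that the token actually authenticates against the API server (a sketch; adjust the path to wherever token.k8s was saved):

# List cluster nodes through the API server using the service-account token
curl -k -H "Authorization: Bearer $(cat /tools/token.k8s)" \
  https://apiserver.simon:6443/api/v1/nodes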
2. Check whether the access is successful:
On the Prometheus UI we can see that the three nodes of the cloud-native cluster show up as targets, and querying a node memory metric confirms that the performance metrics of all three nodes have indeed been scraped by Prometheus successfully:
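The same check can be made from the command line through the Prometheus HTTP API (a sketch; replace the placeholder host with wherever this Prometheus instance is running):

# Query a node memory metric; one result per node means all three are being scraped
curl 'http://<prometheus-host>:9090/api/v1/query?query=node_memory_MemTotal_bytes'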
Grafana deployment
Now that node-exporter is deployed on the cloud-native cluster and its node performance metrics are successfully scraped by Prometheus, the final step is to deploy Grafana and display the data by querying Prometheus with PromQL.
1. Before deploying Grafana, we need to install nfs and create a PV and PVC:
Execute on the master:
# Install the nfs service on the master
[root@master ~]# yum install nfs-utils -y
# Prepare a shared directory
[root@master ~]# mkdir /data/k8s -pv
# Expose the shared directory read-write to all hosts in the network segment
[root@master ~]# vi /etc/exports
[root@master ~]# more /etc/exports
/data/k8s *(rw,no_root_squash,no_all_squash,sync)
# Start the nfs service
[root@master ~]# systemctl start nfs
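To confirm the export is active before wiring it into Kubernetes, the standard nfs-utils tooling can be used (a sketch):

# List the directories exported by the nfs server on the master
showmount -e 192.168.52.151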
Next, install nfs on each node of the cloud-native cluster so that the nodes can mount the nfs device:
# Install nfs on each node; note that the service does not need to be started
[root@master ~]# yum install nfs-utils -y
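Optionally, a manual mount from a node proves the client side works before the PV depends on it (a sketch; /mnt is just a scratch mount point):

# Temporarily mount the share on a node, then release it
mount -t nfs 192.168.52.151:/data/k8s /mnt
umount /mnt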
2. Create the PV and PVC orchestration file grafana-volume.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana
spec:
  capacity:
    storage: 512Mi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    server: 192.168.52.151
    path: /data/k8s
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana
  namespace: kube-system
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 512Mi
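The article jumps straight to viewing the objects, so the apply step is implied; it would look like this (a sketch):

kubectl apply -f grafana-volume.yaml
kubectl get pv grafana
kubectl get pvc grafana -n kube-system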
3. View the PV and PVC:
4. Create the Grafana Pod through a Deployment controller:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: kube-system
  labels:
    app: grafana
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
          name: grafana
        env:
        - name: GF_SECURITY_ADMIN_USER
          value: admin
        - name: GF_SECURITY_ADMIN_PASSWORD
          value: admin
        readinessProbe:
          failureThreshold: 10
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 30
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /api/health
            port: 3000
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 100m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 256Mi
        volumeMounts:
        - mountPath: /var/lib/grafana
          subPath: grafana
          name: storage
      securityContext:
        fsGroup: 0
        runAsUser: 0
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: grafana
The two important environment variables GF_SECURITY_ADMIN_USER and GF_SECURITY_ADMIN_PASSWORD configure the Grafana administrator user and password. Since Grafana stores dashboard and plugin data under the /var/lib/grafana directory, persisting that data requires mounting a volume at that path, which is what the volumeMounts declaration above does; the rest of the Deployment is unremarkable.
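Assuming the manifest above is saved as grafana-deployment.yaml (a hypothetical filename, since the article does not name it), it is created and checked like the earlier objects:

kubectl apply -f grafana-deployment.yaml
kubectl rollout status deployment/grafana -n kube-system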
5. Create the Service:
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: kube-system
  labels:
    app: grafana
spec:
  type: NodePort
  ports:
  - port: 3000
  selector:
    app: grafana
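As before, the apply step is implied; with a hypothetical filename of grafana-svc.yaml it would be:

kubectl apply -f grafana-svc.yaml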
6. Check the Service:
[root@master k8s-demo]# kubectl get svc -n kube-system -owide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
grafana NodePort 10.96.80.215 <none> 3000:30441/TCP 2m32s app=grafana
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 16d k8s-app=kube-dns
kube-state-metrics ClusterIP None <none> 8080/TCP,8081/TCP 15d app.kubernetes.io/name=kube-state-metrics
kuboard NodePort 10.96.52.49 <none> 80:32567/TCP 15d k8s.kuboard.cn/layer=monitor,k8s.kuboard.cn/name=kuboard
metrics-server ClusterIP 10.96.162.56 <none> 443/TCP 15d k8s-app=metrics-server
Dashboard configuration
1. After the deployment is complete, open <any node IP>:30441 (30441 is the NodePort shown in the Service output above) in a browser to reach the Grafana UI, and log in with admin/admin:
2. Create a Prometheus data source:
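The only required field is the HTTP URL, which must be the Prometheus address reachable from inside the Grafana Pod (the host below is a placeholder for this cluster's actual Prometheus endpoint):

http://<prometheus-host>:9090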
3. Import dashboard 8919, and the Kubernetes cloud-native cluster node performance metrics are displayed on the template, as shown in the figure below:
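For reference, panels on node dashboards such as 8919 are built from node_exporter metrics; a typical CPU-usage panel query looks like this (a representative PromQL expression, not necessarily the exact one the dashboard ships with):

100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)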