[Prometheus] Kubernetes GPU Cluster Performance Monitoring

Manual installation - Alibaba Cloud solution (deployment fails):

https://www.jianshu.com/p/1c7ddf18e8b2

 

Manual installation (deployment succeeds, but only CPU, memory, and other host metrics are collected; there is no GPU monitoring information):

https://github.com/camilb/prometheus-kubernetes/tree/master

Helm installation - gpu-monitoring-tools solution (deployment succeeds):

Reference: http://fly-luck.github.io/2018/12/10/gpu-monitoring-tools%20Prometheus/

Corresponding GitHub repository: https://github.com/NVIDIA/gpu-monitoring-tools/tree/helm-charts

 

  1. gpu-monitoring-tools (hereinafter GMT) provides several metrics-collection options:
    1. NVML Go Bindings (C API).
    2. DCGM exporter (DCGM metrics exposed in Prometheus format).
  2. GMT provides several monitoring-framework options:
    1. The DCGM exporter DaemonSet used directly with Prometheus, which covers collection and monitoring only.
    2. Prometheus Operator + kube-prometheus (as modified by NVIDIA), which includes a complete set of collection, monitoring, alerting, and graphing components.

We use the second monitoring framework; it also still works for CPU monitoring on machines without GPUs.
In practice, this solution can simultaneously monitor host hardware (CPU, GPU, memory, disk, etc.), Kubernetes core components (apiserver, controller-manager, scheduler, etc.), and the business services running on Kubernetes.

What is an Operator

  1. For stateless applications, native Kubernetes resources (e.g. Deployment) already provide good support for automatic scaling, restarts, and upgrades.
  2. For stateful applications such as databases, caches, and monitoring systems, the required operational actions differ from application to application.
  3. An Operator packages these application-specific operational actions into software and extends the Kubernetes API with third-party resources, letting users create, configure, and manage the application; it typically includes a collection of Kubernetes CRDs (see the sketch after this list).
  4. Similar to the relationship between a Resource and its Controller in Kubernetes, the Operator drives the actual number of instances and their state toward what the user requested, while hiding many of the operational details.
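
For example, once the Operator is running, a Prometheus server can be declared as a custom resource and the Operator keeps the underlying workloads in the desired state. A minimal sketch of such a resource (the name and label values are only illustrative, not taken from the charts used later):

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
  namespace: monitoring
spec:
  # the Operator creates and maintains the Prometheus instances for us
  replicas: 1
  # scrape configs are generated from every ServiceMonitor carrying this label
  serviceMonitorSelector:
    matchLabels:
      team: gpu-cluster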

Preparation

Images

Import the following images onto every node of the cluster (a way to distribute them is sketched after the list):

# If you build Prometheus on plain Kubernetes (without the Operator), only these two images are needed,
# but you only get metrics; there is no alerting integration
# docker pull nvidia/dcgm-exporter:1.4.6
# docker pull quay.io/prometheus/node-exporter:v0.16.0

# Operator base images
docker pull quay.io/coreos/prometheus-operator:v0.17.0
docker pull quay.io/coreos/hyperkube:v1.7.6_coreos.0

# Exporters
docker pull nvidia/dcgm-exporter:1.4.3
docker pull quay.io/prometheus/node-exporter:v0.15.2

# Prometheus components
docker pull quay.io/coreos/configmap-reload:v0.0.1
docker pull quay.io/coreos/prometheus-config-reloader:v0.0.3
docker pull gcr.io/google_containers/addon-resizer:1.7
docker pull gcr.io/google_containers/kube-state-metrics:v1.2.0
docker pull quay.io/coreos/grafana-watcher:v0.0.8
docker pull grafana/grafana:5.0.0
docker pull quay.io/prometheus/prometheus:v2.2.1
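
On a cluster without a shared image registry, one way to get an image onto every node is docker save / docker load over scp. A minimal sketch, where <nodename> is a placeholder and prometheus:v2.2.1 stands for any of the images above:

# export the image to a tarball, copy it to a node, and load it there
docker save quay.io/prometheus/prometheus:v2.2.1 -o prometheus_v2.2.1.tar
scp prometheus_v2.2.1.tar <nodename>:/tmp/
ssh <nodename> "docker load -i /tmp/prometheus_v2.2.1.tar"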

 

Helm charts

Download and unpack the following Helm charts:

wget https://nvidia.github.io/gpu-monitoring-tools/helm-charts/kube-prometheus-0.0.43.tgz
tar zxvf kube-prometheus-0.0.43.tgz
wget https://nvidia.github.io/gpu-monitoring-tools/helm-charts/prometheus-operator-0.0.15.tgz
tar zxvf prometheus-operator-0.0.15.tgz

 

Installation steps

1. Configuration
Node labels

Label the GPU nodes that need to be monitored (a verification check is sketched after the command):

kubectl label no <nodename> hardware-type=NVIDIAGPU
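
To confirm the label took effect, a quick check (sketch):

# should list exactly the GPU nodes labelled above
kubectl get nodes -l hardware-type=NVIDIAGPU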

 

External etcd

For external etcd, i.e. etcd that is not started as containers during Kubernetes cluster initialization but is brought up outside the cluster beforehand, the addresses of the etcd cluster must be specified explicitly (a quick connectivity check is sketched after the configuration below).
Assume the external etcd members are reachable at IPs etcd0, etcd1, and etcd2, the externally exposed port is 2379, and they are accessed directly over HTTP.

vim kube-prometheus/charts/exporter-kube-etcd/values.yaml

 

#etcdPort:  4001
etcdPort: 2379

#endpoints: []
endpoints: [etcd0,etcd1,etcd2]
scheme: http
...
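
Before wiring these endpoints in, it is worth confirming that each member answers over plain HTTP. A quick check, where etcd0 stands for the actual host or IP (sketch):

# a healthy member reports health: true (exact output depends on the etcd version)
curl http://etcd0:2379/health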

At the same time, a panel needs to be inserted into the Grafana dashboard data. Note:

  1. Add a "," on line 465.
  2. Line 465 is the end of the panel whose "title" is "Crashlooping Control Plane Pods".
  3. Insert the following content after line 465, keeping the indentation consistent.

vim kube-prometheus/charts/grafana/dashboards/kubernetes-cluster-status-dashboard.json
{
  "cacheTimeout": null,
  "colorBackground": false,
  "colorValue": false,
  "colors": [
    "rgba(245, 54, 54, 0.9)",
    "rgba(237, 129, 40, 0.89)",
    "rgba(50, 172, 45, 0.97)"
  ],
  "datasource": "prometheus",
  "editable": true,
  "format": "percent",
  "gauge": {
    "maxValue": 100,
    "minValue": 0,
    "show": true,
    "thresholdLabels": false,
    "thresholdMarkers": true
  },
  "gridPos": {
    "h": 5,
    "w": 6,
    "x": 0,
    "y": 11
  },
  "id": 14,
  "interval": null,
  "links": [],
  "mappingType": 1,
  "mappingTypes": [
    {
      "name": "value to text",
      "value": 1
    },
    {
      "name": "range to text",
      "value": 2
    }
  ],
  "maxDataPoints": 100,
  "nullPointMode": "connected",
  "nullText": null,
  "postfix": "",
  "postfixFontSize": "50%",
  "prefix": "",
  "prefixFontSize": "50%",
  "rangeMaps": [
    {
      "from": "null",
      "text": "N/A",
      "to": "null"
    }
  ],
  "sparkline": {
    "fillColor": "rgba(31, 118, 189, 0.18)",
    "full": false,
    "lineColor": "rgb(31, 120, 193)",
    "show": false
  },
  "tableColumn": "",
  "targets": [
    {
      "expr": "(sum(up{job=\"kube-etcd\"} == 1) / count(up{job=\"kube-etcd\"})) * 100",
      "format": "time_series",
      "intervalFactor": 2,
      "refId": "A",
      "step": 600
    }
  ],
  "thresholds": "50, 80",
  "title": "External etcd Servers UP",
  "type": "singlestat",
  "valueFontSize": "80%",
  "valueMaps": [
    {
      "op": "=",
      "text": "N/A",
      "value": "null"
    }
  ],
  "valueName": "current"
}

Expose ports

Expose the access ports of Prometheus, Alertmanager, and Grafana for troubleshooting. These ports need to be directly reachable from the development VPC.

vim kube-prometheus/values.yaml

 

alertmanager:
  ...
  service:
    ...
    nodePort: 30779
    type: NodePort
prometheus:
  ...
  service:
    ...
    nodePort: 30778
    type: NodePort
vim kube-prometheus/charts/grafana/values.yaml
service:
  nodePort: 30780
  type: NodePort

Alert receiver

Configure the alert receiver. We usually choose a ControlCenter Service inside the same cluster to receive the alerts, convert them into the required format, and forward them to IMS (a way to test the receiver is sketched after the configuration below).

vim kube-prometheus/values.yaml

 

alertmanager:
  config:
    route:
      receiver: 'webhook_test'
      routes:
      - match:
          alertname: DeadMansSwitch
        receiver: 'webhook_test'
      - match:
          severity: critical
        receiver: 'webhook_test'
      - match:
          severity: warning
        receiver: 'webhook_test'
    receivers:
    - name: 'webhook_test'
      webhook_configs:
      - send_resolved: true
        # short for controlcenter.default.svc or controlcenter.default.svc.cluster.local
        url: "http://controlcenter.default:7777/api/alerts"

Alert rules

Platform monitoring covers node hardware (CPU, memory, disk, network, GPU), Kubernetes components (Kube-Controller-Manager, Kube-Scheduler, Kubelet, API Server), and Kubernetes applications (Deployment, StatefulSet, Pod).
Because they are lengthy, the monitoring and alerting rules are placed in the appendix; a single illustrative rule is sketched below.
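
As an illustration only (not one of the appendix rules), a rule in standard Prometheus rule-file syntax that fires when any scrape target stays down could look like this:

groups:
- name: example-rules
  rules:
  - alert: TargetDown
    # "up" is Prometheus's built-in per-target health metric; 0 means the scrape failed
    expr: up == 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Target {{ $labels.instance }} of job {{ $labels.job }} is down"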

2. Start
cd prometheus-operator
helm install . --name prometheus-operator --namespace monitoring
cd kube-prometheus
helm install . --name kube-prometheus --namespace monitoring
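
After both releases are installed, the stack can be checked with plain kubectl; every pod in the monitoring namespace should eventually be Running (sketch):

kubectl -n monitoring get pods
# the Services should show the NodePorts configured above (30778/30779/30780)
kubectl -n monitoring get svc
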
3. Cleanup
helm delete --purge kube-prometheus
helm delete --purge prometheus-operator

Common problems

Kubelet metrics cannot be exposed

This problem does not occur in Kubernetes v1.13.0.

  1. For versions earlier than 1.13.0, change the way kubelet metrics are scraped from https to http; otherwise Prometheus's kubelet targets will go down. [github issue 926]
    vim kube-prometheus/charts/exporter-kubelets/templates/servicemonitor.yaml
spec:
  endpoints:
  # - port: https-metrics
  #   scheme: https
  - port: http-metrics
    scheme: http
    ...
  # - port: https-metrics
  #   scheme: https
  - port: http-metrics
    scheme: http
    path: /metrics/cadvisor
    ...
  2. Verification
    The kubelet target can now be seen on the Prometheus page.

controller-manager and scheduler metrics cannot be exposed

Method 1

For Kubernetes v1.13.0.

  1. Add the following to kubeadm.conf and run kubeadm init --config kubeadm.conf when initializing the cluster.

    apiVersion: kubeadm.k8s.io/v1alpha3
    kind: ClusterConfiguration
    kubernetesVersion: 1.13.0
    networking:
      podSubnet: 10.244.0.0/16
    controllerManagerExtraArgs:
      address: 0.0.0.0
    schedulerExtraArgs:
      address: 0.0.0.0
    ...
  2. Label the pods.

    kubectl get po -n kube-system
    kubectl -n kube-system label po kube-controller-manager-<nodename> k8s-app=kube-controller-manager
    kubectl -n kube-system label po kube-scheduler-<nodename> k8s-app=kube-scheduler
    kubectl get po -n kube-system --show-labels
  3. Verification
    The kube-controller-manager and kube-scheduler targets can be seen on the Prometheus page.
    The controller-manager and scheduler status panels can be seen on the Grafana page.

Method 2

For Kubernetes versions earlier than 1.13.0.

  1. Modify the core kubeadm configuration.
    kubeadm config view

Save the above output as newConfig.yaml and add the following lines:

controllerManagerExtraArgs:
  address: 0.0.0.0
schedulerExtraArgs:
  address: 0.0.0.0

 

Apply the new configuration:

kubeadm config upload from-file --config newConfig.yaml

 

  2. Label the pods.

    kubectl get po -n kube-system
    kubectl -n kube-system label po kube-controller-manager-<nodename> k8s-app=kube-controller-manager
    kubectl -n kube-system label po kube-scheduler-<nodename> k8s-app=kube-scheduler
    kubectl get po -n kube-system --show-labels
  3. Recreate the exporters.

    kubectl -n kube-system get svc

You should see the following two Services without a CLUSTER-IP:

kube-prometheus-exporter-kube-controller-manager
kube-prometheus-exporter-kube-scheduler

 

kubectl -n kube-system get svc kube-prometheus-exporter-kube-controller-manager -o yaml
kubectl -n kube-system get svc kube-prometheus-exporter-kube-scheduler -o yaml

Save the above output as newKubeControllerManagerSvc.yaml and newKubeSchedulerSvc.yaml respectively, remove unnecessary fields (such as uid, selfLink, resourceVersion, creationTimestamp), and then recreate the Services.

kubectl delete -n kube-system svc kube-prometheus-exporter-kube-controller-manager kube-prometheus-exporter-kube-scheduler
kubectl apply -f newKubeControllerManagerSvc.yaml
kubectl apply -f newKubeSchedulerSvc.yaml

 

  4. Make sure the Prometheus pods can reach kube-controller-manager and kube-scheduler on ports 10251/10252.

  5. Verification is the same as in Method 1.

coredns metrics cannot be exposed

In Kubernetes v1.13.0 the cluster DNS component defaults to coredns, so the kube-prometheus configuration must be modified before the status of the DNS service can be monitored.

Method 1
  1. Change the selectorLabel value in the configuration so that it matches the coredns pod label.
    kubectl -n kube-system get po --show-labels | grep coredns
    # output
    coredns k8s-app=kube-dns
vim kube-prometheus/charts/exporter-coredns/values.yaml
#selectorLabel: coredns
selectorLabel: kube-dns
  2. Restart kube-prometheus.

    helm delete --purge kube-prometheus
    helm install --name kube-prometheus --namespace monitoring kube-prometheus
  3. Verification
    The kube-dns target can be seen in Prometheus.

Method 2
  1. Change the pod labels so that they match the configuration.

    kubectl -n kube-system label po
  2. Verification is the same as in Method 1.

 

After a successful deployment, port-forward can also be used to reach the Grafana dashboards and view the monitoring results; a sketch follows the link below:

https://blog.csdn.net/aixiaoyang168/article/details/81661459
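
A minimal port-forward sketch; the service name and port are assumptions, so check kubectl -n monitoring get svc for the actual values:

# forward local port 3000 to the Grafana service inside the cluster
kubectl -n monitoring port-forward svc/kube-prometheus-grafana 3000:80
# then open http://localhost:3000 in a browser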
