All Prometheus-related services in this article are deployed in the monitoring namespace.
Deploying the Prometheus service
- Create a namespace

$ cd /opt/k8s/prometheus
$ cat>1-namespace.yml<<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
EOF
- Create the Prometheus configuration file, using a Kubernetes ConfigMap

$ cd /opt/k8s/prometheus
$ cat>2-prom-cnfig.yml<<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: prom-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']
EOF
- Create the PV and PVC that Prometheus uses to store data (local storage is used here; it can be replaced by NFS, GlusterFS, or another distributed file system)

$ cd /opt/k8s/prometheus
$ cat>3-prom-pv.yml<<EOF
kind: PersistentVolume
apiVersion: v1
metadata:
  namespace: monitoring
  name: prometheus
  labels:
    type: local
    app: prometheus
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /opt/k8s/prometheus/data
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  namespace: monitoring
  name: prometheus-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF
- Create the Prometheus startup file. Prometheus is deployed as a Deployment and exposed externally through a Service of type NodePort

$ cd /opt/k8s/prometheus
$ cat>4-prometheus.yml<<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus:v2.16.0
        args:
        - '--config.file=/etc/prometheus/prometheus.yml'
        - '--storage.tsdb.path=/prometheus'
        - '--storage.tsdb.retention=7d'
        - '--web.enable-lifecycle'
        ports:
        - containerPort: 9090
        volumeMounts:
        - mountPath: "/prometheus"
          subPath: prometheus
          name: data
        - mountPath: "/etc/prometheus"
          name: config
        resources:
          requests:
            cpu: 500m
            memory: 2Gi
          limits:
            cpu: 500m
            memory: 2Gi
      volumes:
      - name: config
        configMap:
          name: prom-config
      - name: data
        persistentVolumeClaim:
          claimName: prometheus-claim
---
apiVersion: v1
kind: Service
metadata:
  namespace: monitoring
  name: prometheus
spec:
  type: NodePort
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 9090
  selector:
    app: prometheus
EOF
- The startup arguments storage.tsdb.path and storage.tsdb.retention tell Prometheus where to store its data and how long to retain it
- With web.enable-lifecycle enabled, a changed configuration can be reloaded through the /-/reload endpoint, without restarting the service
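For example, once the service is running, a reload can be triggered with a plain HTTP POST against the NodePort configured above (the node IP placeholder stands for any cluster node):

$ curl -X POST http://<node-ip>:9090/-/reload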
- Start the Prometheus services

$ cd /opt/k8s/prometheus
$ mkdir data && chmod -R 777 data
$ kubectl create -f 1-namespace.yml -f 2-prom-cnfig.yml -f 3-prom-pv.yml -f 4-prometheus.yml
- Check the component status to make sure all services started properly

$ kubectl get all -n monitoring
NAME                              READY   STATUS    RESTARTS   AGE
pod/prometheus-57cf64764d-xqnvl   1/1     Running   0          51s

NAME                 TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
service/prometheus   NodePort   10.254.209.164   <none>        9090:9090/TCP   51s

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus   1/1     1            1           51s

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-57cf64764d   1         1         1       51s
- Access the web interface
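The Prometheus UI is then reachable on any node at the NodePort configured above; for example, with the node IP 192.168.0.107 used elsewhere in this article:

http://192.168.0.107:9090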
Kubernetes monitoring and related technologies
- cAdvisor: a container monitoring tool open-sourced by Google that collects resource usage and performance information about running containers and exposes it through a RESTful API. Kubernetes integrates cAdvisor into the kubelet, so there is no need to deploy it separately; its data is available directly from the kubelet, which additionally aggregates the container data into per-pod Summary information for external callers.
- metrics-server: a core component of the Kubernetes monitoring pipeline and the replacement for Heapster. It fetches metrics, mainly CPU and memory, from the kubelet API and exposes them through the API server, primarily for consumption by kubectl top, the HPA, and other Kubernetes components. It keeps only the most recently collected metrics in memory and is not responsible for storing data.
- kube-state-metrics: generates state metrics about resource objects such as Deployments and ReplicaSets by listening to the API server. It likewise keeps only the most recently collected metrics in memory and is not responsible for storing data.
- node-exporter: the official Prometheus exporter for collecting metrics about *NIX systems themselves, as well as the underlying hardware.
- kube-prometheus: a one-stop Kubernetes monitoring solution that bundles node-exporter, prometheus, kube-state-metrics, Grafana, metrics-server, and other components, providing scripts that let users quickly build a complete monitoring platform.
What to monitor in a Kubernetes cluster
- The state of the cluster nodes themselves, such as each node's CPU, memory, IO, and network usage
- The Kubernetes components themselves, such as kube-scheduler, kube-controller-manager, kube-proxy, kubelet, etc.
- The containers running in the cluster: CPU and memory usage at the container and Pod level
- The orchestration objects in the cluster, such as Deployments, DaemonSets, etc.
This article deploys each component by hand, to show step by step how a monitoring platform is built. If you need to stand one up quickly, refer to kube-prometheus instead.
Deploying node-exporter
Because every node must be monitored, node-exporter is deployed with a DaemonSet controller so that one Pod runs on each node.
- Startup file

$ cd /opt/k8s/prometheus
$ cat>5-node-exporter.yml<<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: node-exporter
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
      - name: node-exporter
        image: 192.168.0.107/prometheus/node-exporter:v0.18.1
        args:
        - --web.listen-address=:9100
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        - --no-collector.hwmon
        - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
        - --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
        resources:
          limits:
            cpu: 250m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi
        ports:
        - containerPort: 9100
        volumeMounts:
        - mountPath: /host/proc
          name: proc
          readOnly: false
        - mountPath: /host/sys
          name: sys
          readOnly: false
        - mountPath: /host/root
          mountPropagation: HostToContainer
          name: root
          readOnly: true
      hostNetwork: true
      hostPID: true
      nodeSelector:
        kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      tolerations:
      - operator: Exists
      volumes:
      - hostPath:
          path: /proc
        name: proc
      - hostPath:
          path: /sys
        name: sys
      - hostPath:
          path: /
        name: root
EOF
- Start node-exporter

$ cd /opt/k8s/prometheus
$ kubectl create -f 5-node-exporter.yml
$ kubectl -n monitoring get pod | grep node
node-exporter-854vr   1/1   Running   6   50m
node-exporter-lv9pv   1/1   Running   0   50m
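Because the DaemonSet uses hostNetwork, each exporter can be spot-checked directly on the node's port 9100; a quick check, using the node IP from earlier:

$ curl -s http://192.168.0.107:9100/metrics | head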
- Collect node-exporter metrics with Prometheus

Because nodes can be added to and removed from the cluster dynamically, a static configuration is inconvenient. Prometheus provides Kubernetes service discovery, which makes dynamic monitoring of Kubernetes possible; nodes are discovered with the node role. Append the following to the Prometheus configuration (the same change also needs to be made to 2-prom-cnfig.yml, otherwise the information will be lost if the ConfigMap is rebuilt):
$ kubectl -n monitoring edit configmaps prom-config
- job_name: "kubernetes-nodes" kubernetes_sd_configs: - role: node relabel_configs: - source_labels: [__address__] regex: '(.*):10250' replacement: '${1}:9100' target_label: __address__ action: replace - action: labelmap regex: __meta_kubernetes_node_label_(.+)
After the addition is complete, run the following command to reload the configuration. Why the configuration is written this way is explained in detail in the configuration-principles section below.
$ curl -XPOST http://192.168.0.107:9090/-/reload
At this point Prometheus will try to fetch information about the cluster nodes. Looking at the Prometheus logs, you will find the following error message:
level=error ts=2020-03-22T10:37:13.856Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:333: Failed to list *v1.Node: nodes is forbidden: User \"system:serviceaccount:monitoring:default\" cannot list resource \"nodes\" in API group \"\" at the cluster scope"
This means that the default serviceaccount is not allowed to list *v1.Node, so we need to create a dedicated serviceaccount for Prometheus and grant it the appropriate permissions.
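The missing permission can also be confirmed from outside the pod with kubectl auth can-i, impersonating the serviceaccount named in the error:

$ kubectl auth can-i list nodes --as=system:serviceaccount:monitoring:default
no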
- Create the serviceaccount for Prometheus and grant it the appropriate permissions

$ cd /opt/k8s/prometheus
$ cat>6-prometheus-serivceaccount-role.yaml<<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-k8s
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups: [""]
  resources:
  - nodes/proxy
  - nodes
  - namespaces
  - endpoints
  - pods
  - services
  verbs: ["get","list","watch"]
- apiGroups: [""]
  resources:
  - nodes/metrics
  verbs: ["get"]
- nonResourceURLs:
  - /metrics
  verbs: ["get"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-k8s
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring
EOF
$ cd /opt/k8s/prometheus
$ kubectl create -f 6-prometheus-serivceaccount-role.yaml
Modify the Prometheus startup yaml to add the serviceAccountName setting, then restart Prometheus:
...
spec:
  serviceAccountName: prometheus-k8s
  containers:
  - name: prometheus
...
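Re-applying the edited file rolls the Deployment so that the new pod starts under the new serviceaccount; a minimal sketch:

$ cd /opt/k8s/prometheus
$ kubectl apply -f 4-prometheus.yml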
Then view the list of monitoring targets in the Prometheus UI.
How the kubernetes_sd_config configuration works
- Discovering target addresses

When kubernetes_sd_config is configured with the node role, Prometheus calls the Kubernetes LIST Node API at startup to obtain node information, and takes an IP and port from each node object to build a monitoring address.
The IP is looked up from the node addresses in the following order: InternalIP, ExternalIP, LegacyHostIP, HostName.
The port defaults to the kubelet's HTTP port.
The IP and port information returned by the LIST Node API can be viewed with the following command:
$ kubectl get node -o=jsonpath='{range .items[*]}{.status.addresses}{"\t"}{.status.daemonEndpoints}{"\n"}{end}'
[map[address:192.168.0.107 type:InternalIP] map[address:master type:Hostname]]   map[kubeletEndpoint:map[Port:10250]]
[map[address:192.168.0.114 type:InternalIP] map[address:slave type:Hostname]]    map[kubeletEndpoint:map[Port:10250]]
The result shows the cluster's two nodes, which yield the following target addresses:

192.168.0.107:10250
192.168.0.114:10250
- relabel_configs

Relabeling allows Prometheus to dynamically rewrite a target's labels before scraping it. Prometheus sets a number of default labels; the ones we deal with below are:

__address__: initialized to the target's address, <host>:<port>
instance: after the relabel stage, the instance label is set to the value of __address__
__scheme__: defaults to http
__metrics_path__: defaults to /metrics

The address Prometheus pulls metrics from is assembled by joining these labels:

__scheme__ + "://" + __address__ + __metrics_path__
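For example, for the first node discovered above, the defaults assemble to:

__scheme__       = http
__address__      = 192.168.0.107:10250
__metrics_path__ = /metrics
=> http://192.168.0.107:10250/metrics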
- The node-exporter we started exposes each node's metrics at :9100/metrics, so the following segment was added to the Prometheus configuration:

- job_name: "kubernetes-nodes"
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - source_labels: [__address__]
    regex: '(.*):10250'
    replacement: '${1}:9100'
    target_label: __address__
    action: replace
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
The first relabel_configs entry is:

- source_labels: [__address__]
  regex: '(.*):10250'
  replacement: '${1}:9100'
  target_label: __address__
  action: replace
- regex: captures the IP portion of __address__
- replacement: sets the new value to ${IP}:9100
- target_label: writes the replacement back into __address__, i.e. ${IP}:9100

After these steps, the assembled metrics addresses become [http://192.168.0.107:9100/metrics, http://192.168.0.114:9100/metrics], which match the addresses where node-exporter exposes its metrics, so Prometheus can pull the node metrics.

In addition, because Prometheus turns each Kubernetes node label into __meta_kubernetes_node_label_<labelname>, a labelmap action is added to restore the original label names.
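Before reloading, the edited configuration can be validated with promtool, which ships inside the prom/prometheus image; a minimal sketch, run in the running pod (the pod name is a placeholder):

$ kubectl -n monitoring exec <prometheus-pod> -- promtool check config /etc/prometheus/prometheus.yml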
- For a complete configuration example, refer to prometheus-kubernetes
Collecting the metrics exposed by the kubelet
Like the api server, etcd, and other services, the kubelet collects certain metrics about itself; they can be viewed with the following command:
$ kubectl get --raw https://192.168.0.107:10250/metrics
- Here 10250 is the kubelet's default listening port
Then append the following configuration to Prometheus so that it pulls these metrics:
- job_name: "kubernetes-kubelet"
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: node
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
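After reloading, the new job can be spot-checked through the Prometheus HTTP API; a sketch using the NodePort from earlier — kubelet_running_pod_count is assumed to be the metric name exposed by kubelets of this version:

$ curl -s http://192.168.0.107:9090/api/v1/query --data-urlencode 'query=kubelet_running_pod_count'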
Collecting cAdvisor metrics to monitor the cluster's containers
The kubelet integrates cAdvisor by default to collect container information for the cluster. From Kubernetes 1.7.3 onward, the metrics collected by cAdvisor (those beginning with container_) were removed from the kubelet's own /metrics endpoint, so an extra job is needed to collect them. The cAdvisor metrics can be fetched with the following command:
$ kubectl get --raw https://192.168.0.107:6443/api/v1/nodes/master/proxy/metrics/cadvisor
- Here 6443 is the API server's default listening port; the request is proxied through the API server to the node's cAdvisor endpoint
Then append the following configuration to Prometheus so that it pulls these metrics:
- job_name: "kubernetes-cadvisor"
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: node
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
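Once this job is loaded, container-level series should appear; a quick sanity query against the standard cAdvisor metric container_cpu_usage_seconds_total:

$ curl -s http://192.168.0.107:9090/api/v1/query --data-urlencode 'query=count(container_cpu_usage_seconds_total)'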
After the addition is complete, reload the configuration file again and view the target list in Prometheus.
The full configuration file is as follows:
$ cd /opt/k8s/prometheus
$ cat 2-prom-cnfig.yml
apiVersion: v1
kind: ConfigMap
metadata:
name: prom-config
namespace: monitoring
data:
prometheus.yml: |
global:
scrape_interval: 15s
scrape_timeout: 15s
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: "kubernetes-nodes"
kubernetes_sd_configs:
- role: node
relabel_configs:
- source_labels: [__address__]
regex: '(.*):10250'
replacement: '${1}:9100'
target_label: __address__
action: replace
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- job_name: "kubernetes-kubelet"
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: node
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- job_name: "kubernetes-cadvisor"
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
kubernetes_sd_configs:
- role: node
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
Deploying Grafana
- Create the PV and PVC that Grafana uses to store data (local storage is used here; it can be replaced by NFS, GlusterFS, or another distributed file system)

$ cd /opt/k8s/prometheus
$ cat>7-grafana-pv.yml<<EOF
kind: PersistentVolume
apiVersion: v1
metadata:
  namespace: monitoring
  name: grafana
  labels:
    type: local
    app: grafana
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /opt/k8s/prometheus/grafana-pvc
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  namespace: monitoring
  name: grafana-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF
- Grafana deployment file

$ cd /opt/k8s/prometheus
$ cat>8-grafana.yml<<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - image: grafana/grafana:6.6.2
        name: grafana
        ports:
        - containerPort: 3000
          name: http
        readinessProbe:
          httpGet:
            path: /api/health
            port: http
        resources:
          limits:
            cpu: 200m
            memory: 400Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - mountPath: /var/lib/grafana
          name: grafana-pvc
          readOnly: false
          subPath: data
        - mountPath: /etc/grafana/provisioning/datasources
          name: grafana-pvc
          readOnly: false
          subPath: datasources
        - mountPath: /etc/grafana/provisioning/dashboards
          name: grafana-pvc
          readOnly: false
          subPath: dashboards-pro
        - mountPath: /grafana-dashboard-definitions/0
          name: grafana-pvc
          readOnly: false
          subPath: dashboards
      nodeSelector:
        beta.kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      volumes:
      - name: grafana-pvc
        persistentVolumeClaim:
          claimName: grafana-claim
---
apiVersion: v1
kind: Service
metadata:
  namespace: monitoring
  name: grafana
spec:
  type: NodePort
  ports:
  - port: 3000
    targetPort: 3000
    nodePort: 3000
  selector:
    app: grafana
EOF
- Start Grafana
- Create the directories that Grafana mounts

$ cd /opt/k8s/prometheus
$ mkdir -p grafana-pvc/data
$ mkdir -p grafana-pvc/datasources
$ mkdir -p grafana-pvc/dashboards-pro
$ mkdir -p grafana-pvc/dashboards
$ chmod -R 777 grafana-pvc
- data: stores Grafana's own data
- datasources: stores the predefined data source definitions
- dashboards-pro: stores the dashboard provider configuration file, whose path entry points to /grafana-dashboard-definitions/0, the location where grafana-pvc/dashboards is mounted inside the container
- dashboards: stores the actual dashboard definition files (json); the resulting layout is sketched below
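The layout on the host then looks like this (file names taken from the following steps):

grafana-pvc/
├── data/                       # Grafana internal database
├── datasources/
│   └── datasource.yaml         # predefined Prometheus data source
├── dashboards-pro/
│   └── dashboards.yaml         # dashboard provider configuration
└── dashboards/
    └── node-exporter-k8s.json  # dashboard definition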
- Create the default data source file

$ cd /opt/k8s/prometheus/grafana-pvc/datasources
$ cat > datasource.yaml<<EOF
apiVersion: 1
datasources:
- name: Prometheus
  type: prometheus
  access: proxy
  url: http://prometheus.monitoring.svc:9090
EOF
- Create the default dashboard provider file

$ cd /opt/k8s/prometheus/grafana-pvc/dashboards-pro
$ cat >dashboards.yaml<<EOF
apiVersion: 1
providers:
- name: '0'
  orgId: 1
  folder: ''
  type: file
  editable: true
  updateIntervalSeconds: 10
  allowUiUpdates: false
  options:
    path: /grafana-dashboard-definitions/0
EOF
- Create a default dashboard definition file

Browse the collection of shared dashboards on grafana.com, find a dashboard template that fits your needs, download the corresponding json file, and store it in /opt/k8s/prometheus/grafana-pvc/dashboards. As an example, we download "1 Node Exporter for Prometheus Dashboard CN v20191102", whose ID is 8919.

$ cd /opt/k8s/prometheus/grafana-pvc/dashboards
$ wget https://grafana.com/api/dashboards/8919/revisions/11/download -O node-exporter-k8s.json
Because the template uses ${DS_PROMETHEUS_111} as its data source (a variable that is normally substituted through a configuration prompt when the dashboard is imported via the UI), and we downloaded the json file directly, we edit the file so that the data source becomes the Prometheus data source defined in /opt/k8s/prometheus/grafana-pvc/datasources:

$ cd /opt/k8s/prometheus/grafana-pvc/dashboards
$ sed -i "s/\${DS_PROMETHEUS_111}/Prometheus/g" node-exporter-k8s.json
Modify the title as well:

...
"timezone": "browser",
"title": "k8s-node-monitoring",
...
- Start up

$ cd /opt/k8s/prometheus/
$ kubectl create -f 7-grafana-pv.yml -f 8-grafana.yml
- View it in the browser. Because the data source, dashboards, and other settings were configured in advance, the corresponding dashboard can be viewed directly.
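The provisioning can also be verified against Grafana's HTTP API; a sketch assuming Grafana's default admin/admin credentials and the NodePort 3000 configured above:

$ curl -s http://admin:admin@192.168.0.107:3000/api/datasources
$ curl -s "http://admin:admin@192.168.0.107:3000/api/search?query=k8s-node-monitoring"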
When deploying Grafana we configured the default data source, dashboards, and other settings in advance so that these default monitoring metrics can be observed as soon as the system is deployed, without any on-the-spot configuration.
Other monitoring, such as monitoring of the cluster's Deployments, StatefulSets, containers, and pods based on kube-state-metrics and cAdvisor metrics, can be implemented in the same way. For example, the dashboard "1. Kubernetes Deployment Statefulset Daemonset metrics" can be used as a template and slightly modified to meet your monitoring needs; the specific steps are not shown here, and readers can try it themselves.