Building a Microservice Monitoring Platform with Actuator + Prometheus + Grafana


Foreword

About Actuator:

Anyone who has looked into Spring Boot's monitoring capabilities will know Spring Boot Actuator, the component that gives applications powerful monitoring features. Starting with Spring Boot 2.x, Actuator's underlying implementation switched to Micrometer, which provides stronger and more flexible monitoring. Micrometer is a monitoring facade; it can be thought of as the SLF4J of the monitoring world. With Micrometer, an application can integrate with a variety of monitoring systems, such as the one this article introduces: Prometheus.

About Prometheus:

Prometheus is an open-source system originally developed at SoundCloud that combines monitoring, alerting, and a time series database (TSDB). Most Prometheus components are written in Go, and it is often described as an open-source counterpart to Google's BorgMon monitoring system. It is currently hosted by the CNCF and has successfully graduated. Prometheus is very active in the open-source community, and its performance is sufficient to support clusters of tens of thousands of machines.

Prometheus features:

  • A multidimensional data model, with time series identified by metric name and key/value pairs
  • A flexible query language: PromQL
  • No reliance on distributed storage; single server nodes are autonomous
  • Time series are collected by pulling over HTTP
  • Pushing time series is supported through an intermediary gateway
  • Targets are discovered through service discovery or static configuration
  • Support for multiple modes of graphing and dashboards, such as Grafana

Further reading: official documentation, GitHub repository

About Grafana:

Grafana is an open-source, cross-platform application written in Go for metrics analysis, visualization, and alerting. It queries the collected data, then visualizes it and sends notifications. Grafana supports many data sources and display styles; in short, it is a powerful tool for visualizing monitoring metrics.

Further reading: official documentation, GitHub repository


Create a project

The main goal of this article is to monitor microservices. With a basic understanding of the tools in place, let's get hands-on. First, create a simple Spring Boot project with the following main dependencies:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
  • Tip: To integrate with a different monitoring system, you only need to change the dependency's artifact name. For example, to integrate with Influx, change the dependency to micrometer-registry-influx.

Edit the project configuration:

server:
  port: 9562
spring:
  application:
    # Application name
    name: prometheus-demo
management:
  endpoints:
    web:
      exposure:
        # Expose Actuator's /actuator/prometheus endpoint
        include: 'prometheus'
  metrics:
    tags:
      # Add a tag to every metric, set here to the application name; tags are a Prometheus capability that allows more flexible filtering
      application: ${spring.application.name}

After completing the steps above, run a quick test to check that the monitoring endpoint returns data correctly. Start the project and visit the /actuator/prometheus endpoint. Under normal circumstances it returns content like the following:

# HELP process_start_time_seconds Start time of the process since unix epoch.
# TYPE process_start_time_seconds gauge
process_start_time_seconds{application="prometheus-demo",} 1.577697308142E9
# HELP jvm_buffer_memory_used_bytes An estimate of the memory that the Java virtual machine is using for this buffer pool
# TYPE jvm_buffer_memory_used_bytes gauge
jvm_buffer_memory_used_bytes{application="prometheus-demo",id="mapped",} 0.0
jvm_buffer_memory_used_bytes{application="prometheus-demo",id="direct",} 16384.0
# HELP tomcat_sessions_expired_sessions_total  
# TYPE tomcat_sessions_expired_sessions_total counter
tomcat_sessions_expired_sessions_total{application="prometheus-demo",} 0.0
# HELP jvm_gc_pause_seconds Time spent in GC pause
# TYPE jvm_gc_pause_seconds summary
jvm_gc_pause_seconds_count{action="end of minor GC",application="prometheus-demo",cause="Metadata GC Threshold",} 1.0
jvm_gc_pause_seconds_sum{action="end of minor GC",application="prometheus-demo",cause="Metadata GC Threshold",} 0.006
jvm_gc_pause_seconds_count{action="end of major GC",application="prometheus-demo",cause="Metadata GC Threshold",} 1.0
jvm_gc_pause_seconds_sum{action="end of major GC",application="prometheus-demo",cause="Metadata GC Threshold",} 0.032
jvm_gc_pause_seconds_count{action="end of minor GC",application="prometheus-demo",cause="Allocation Failure",} 1.0
jvm_gc_pause_seconds_sum{action="end of minor GC",application="prometheus-demo",cause="Allocation Failure",} 0.008
# HELP jvm_gc_pause_seconds_max Time spent in GC pause
# TYPE jvm_gc_pause_seconds_max gauge
jvm_gc_pause_seconds_max{action="end of minor GC",application="prometheus-demo",cause="Metadata GC Threshold",} 0.006
jvm_gc_pause_seconds_max{action="end of major GC",application="prometheus-demo",cause="Metadata GC Threshold",} 0.032
jvm_gc_pause_seconds_max{action="end of minor GC",application="prometheus-demo",cause="Allocation Failure",} 0.008
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{application="prometheus-demo",area="heap",id="PS Survivor Space",} 0.0
jvm_memory_used_bytes{application="prometheus-demo",area="heap",id="PS Old Gen",} 1.3801776E7
jvm_memory_used_bytes{application="prometheus-demo",area="nonheap",id="Metaspace",} 3.522832E7
jvm_memory_used_bytes{application="prometheus-demo",area="nonheap",id="Code Cache",} 6860800.0
jvm_memory_used_bytes{application="prometheus-demo",area="heap",id="PS Eden Space",} 1.9782928E7
jvm_memory_used_bytes{application="prometheus-demo",area="nonheap",id="Compressed Class Space",} 4825568.0
# HELP logback_events_total Number of error level events that made it to the logs
# TYPE logback_events_total counter
logback_events_total{application="prometheus-demo",level="info",} 7.0
logback_events_total{application="prometheus-demo",level="trace",} 0.0
logback_events_total{application="prometheus-demo",level="warn",} 0.0
logback_events_total{application="prometheus-demo",level="debug",} 0.0
logback_events_total{application="prometheus-demo",level="error",} 0.0
# HELP process_uptime_seconds The uptime of the Java virtual machine
# TYPE process_uptime_seconds gauge
process_uptime_seconds{application="prometheus-demo",} 30.499
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool
# TYPE jvm_buffer_count_buffers gauge
jvm_buffer_count_buffers{application="prometheus-demo",id="mapped",} 0.0
jvm_buffer_count_buffers{application="prometheus-demo",id="direct",} 2.0
# HELP system_cpu_count The number of processors available to the Java virtual machine
# TYPE system_cpu_count gauge
system_cpu_count{application="prometheus-demo",} 6.0
# HELP jvm_threads_peak_threads The peak live thread count since the Java virtual machine started or peak was reset
# TYPE jvm_threads_peak_threads gauge
jvm_threads_peak_threads{application="prometheus-demo",} 22.0
# HELP tomcat_sessions_alive_max_seconds  
# TYPE tomcat_sessions_alive_max_seconds gauge
tomcat_sessions_alive_max_seconds{application="prometheus-demo",} 0.0
# HELP jvm_memory_committed_bytes The amount of memory in bytes that is committed for the Java virtual machine to use
# TYPE jvm_memory_committed_bytes gauge
jvm_memory_committed_bytes{application="prometheus-demo",area="heap",id="PS Survivor Space",} 1.5204352E7
jvm_memory_committed_bytes{application="prometheus-demo",area="heap",id="PS Old Gen",} 1.31596288E8
jvm_memory_committed_bytes{application="prometheus-demo",area="nonheap",id="Metaspace",} 3.7879808E7
jvm_memory_committed_bytes{application="prometheus-demo",area="nonheap",id="Code Cache",} 6881280.0
jvm_memory_committed_bytes{application="prometheus-demo",area="heap",id="PS Eden Space",} 1.76685056E8
jvm_memory_committed_bytes{application="prometheus-demo",area="nonheap",id="Compressed Class Space",} 5373952.0
# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
# TYPE jvm_buffer_total_capacity_bytes gauge
jvm_buffer_total_capacity_bytes{application="prometheus-demo",id="mapped",} 0.0
jvm_buffer_total_capacity_bytes{application="prometheus-demo",id="direct",} 16384.0
# HELP jvm_gc_live_data_size_bytes Size of old generation memory pool after a full GC
# TYPE jvm_gc_live_data_size_bytes gauge
jvm_gc_live_data_size_bytes{application="prometheus-demo",} 1.3801776E7
# HELP jvm_memory_max_bytes The maximum amount of memory in bytes that can be used for memory management
# TYPE jvm_memory_max_bytes gauge
jvm_memory_max_bytes{application="prometheus-demo",area="heap",id="PS Survivor Space",} 1.5204352E7
jvm_memory_max_bytes{application="prometheus-demo",area="heap",id="PS Old Gen",} 2.841116672E9
jvm_memory_max_bytes{application="prometheus-demo",area="nonheap",id="Metaspace",} -1.0
jvm_memory_max_bytes{application="prometheus-demo",area="nonheap",id="Code Cache",} 2.5165824E8
jvm_memory_max_bytes{application="prometheus-demo",area="heap",id="PS Eden Space",} 1.390411776E9
jvm_memory_max_bytes{application="prometheus-demo",area="nonheap",id="Compressed Class Space",} 1.073741824E9
# HELP jvm_threads_daemon_threads The current number of live daemon threads
# TYPE jvm_threads_daemon_threads gauge
jvm_threads_daemon_threads{application="prometheus-demo",} 18.0
# HELP jvm_threads_states_threads The current number of threads having NEW state
# TYPE jvm_threads_states_threads gauge
jvm_threads_states_threads{application="prometheus-demo",state="runnable",} 8.0
jvm_threads_states_threads{application="prometheus-demo",state="new",} 0.0
jvm_threads_states_threads{application="prometheus-demo",state="timed-waiting",} 2.0
jvm_threads_states_threads{application="prometheus-demo",state="blocked",} 0.0
jvm_threads_states_threads{application="prometheus-demo",state="waiting",} 12.0
jvm_threads_states_threads{application="prometheus-demo",state="terminated",} 0.0
# HELP jvm_gc_memory_promoted_bytes_total Count of positive increases in the size of the old generation memory pool before GC to after GC
# TYPE jvm_gc_memory_promoted_bytes_total counter
jvm_gc_memory_promoted_bytes_total{application="prometheus-demo",} 8296848.0
# HELP tomcat_sessions_active_max_sessions  
# TYPE tomcat_sessions_active_max_sessions gauge
tomcat_sessions_active_max_sessions{application="prometheus-demo",} 0.0
# HELP tomcat_sessions_created_sessions_total  
# TYPE tomcat_sessions_created_sessions_total counter
tomcat_sessions_created_sessions_total{application="prometheus-demo",} 0.0
# HELP jvm_gc_memory_allocated_bytes_total Incremented for an increase in the size of the young generation memory pool after one GC to before the next
# TYPE jvm_gc_memory_allocated_bytes_total counter
jvm_gc_memory_allocated_bytes_total{application="prometheus-demo",} 1.36924824E8
# HELP process_cpu_usage The "recent cpu usage" for the Java Virtual Machine process
# TYPE process_cpu_usage gauge
process_cpu_usage{application="prometheus-demo",} 0.10024585094452443
# HELP system_cpu_usage The "recent cpu usage" for the whole system
# TYPE system_cpu_usage gauge
system_cpu_usage{application="prometheus-demo",} 0.38661791030714154
# HELP tomcat_sessions_active_current_sessions  
# TYPE tomcat_sessions_active_current_sessions gauge
tomcat_sessions_active_current_sessions{application="prometheus-demo",} 0.0
# HELP jvm_classes_loaded_classes The number of classes that are currently loaded in the Java virtual machine
# TYPE jvm_classes_loaded_classes gauge
jvm_classes_loaded_classes{application="prometheus-demo",} 7195.0
# HELP http_server_requests_seconds  
# TYPE http_server_requests_seconds summary
http_server_requests_seconds_count{application="prometheus-demo",exception="None",method="GET",outcome="CLIENT_ERROR",status="404",uri="/**",} 1.0
http_server_requests_seconds_sum{application="prometheus-demo",exception="None",method="GET",outcome="CLIENT_ERROR",status="404",uri="/**",} 0.012429856
# HELP http_server_requests_seconds_max  
# TYPE http_server_requests_seconds_max gauge
http_server_requests_seconds_max{application="prometheus-demo",exception="None",method="GET",outcome="CLIENT_ERROR",status="404",uri="/**",} 0.012429856
# HELP jvm_gc_max_data_size_bytes Max size of old generation memory pool
# TYPE jvm_gc_max_data_size_bytes gauge
jvm_gc_max_data_size_bytes{application="prometheus-demo",} 2.841116672E9
# HELP jvm_threads_live_threads The current number of live threads including both daemon and non-daemon threads
# TYPE jvm_threads_live_threads gauge
jvm_threads_live_threads{application="prometheus-demo",} 22.0
# HELP jvm_classes_unloaded_classes_total The total number of classes unloaded since the Java virtual machine has started execution
# TYPE jvm_classes_unloaded_classes_total counter
jvm_classes_unloaded_classes_total{application="prometheus-demo",} 1.0
# HELP tomcat_sessions_rejected_sessions_total  
# TYPE tomcat_sessions_rejected_sessions_total counter
tomcat_sessions_rejected_sessions_total{application="prometheus-demo",} 0.0

The endpoint returns data in the format that Prometheus uses. Each metric comes with comments explaining its meaning, so it should not be hard to understand. For example:

# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{application="prometheus-demo",area="heap",id="PS Survivor Space",} 0.0

This means: in the prometheus-demo application, the PS Survivor Space region of heap memory is using 0.0 bytes.
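To make the structure of a sample line concrete, here is a minimal Python sketch that splits one line into metric name, labels, and value. The function name parse_sample and both regular expressions are illustrative assumptions, not part of Actuator or Prometheus; the sketch ignores optional timestamps and does not unescape label values.

```python
import re

# Matches: metric_name{label="value",...} value   (the label block is optional)
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (metric name, labels dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError("not a sample line: " + line)
    name, label_block, value = match.groups()
    # findall skips the trailing comma that appears in the output above.
    labels = dict(LABEL_RE.findall(label_block or ""))
    return name, labels, float(value)


line = ('jvm_memory_used_bytes{application="prometheus-demo",'
        'area="heap",id="PS Survivor Space",} 0.0')
name, labels, value = parse_sample(line)
print(name, labels["area"], labels["id"], value)
```

The labels dictionary is what makes the data model multidimensional: the same metric name appears once per combination of application, area, and id.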


Installing the Prometheus service

The next step is to install the Prometheus service on a server, so that it can collect monitoring data from the endpoints exposed by the microservices. For simplicity, I install it with Docker here; for other installation methods, refer to the official installation documentation.

First, prepare a configuration file for Prometheus:

[root@localhost ~]# mkdir /etc/prometheus
[root@localhost ~]# vim /etc/prometheus/prometheus.yml
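scrape_configs:
# Any name works; English without special characters is recommended
- job_name: 'spring'
  # How often to scrape data
  scrape_interval: 15s
  # Timeout for each scrape
  scrape_timeout: 10s
  # Endpoint path to scrape
  metrics_path: '/actuator/prometheus'
  # Addresses of the services to scrape, i.e. the microservice's IP and port
  static_configs:
  - targets: ['192.168.1.252:9562']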

The purpose of this configuration file is to have the Prometheus service automatically request http://192.168.1.252:9562/actuator/prometheus every 15 seconds. For more configuration options, see the official Prometheus Configuration documentation.
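As a rough illustration of how the configuration fields combine into that URL, here is a small Python sketch. It is not how Prometheus is implemented; scrape_url and scrape_once are hypothetical helper names, and the defaults shown (the http scheme and the /metrics path) mirror what Prometheus uses when those fields are omitted.

```python
import urllib.request


def scrape_url(target, metrics_path="/metrics", scheme="http"):
    """Combine the config fields into the URL that gets requested."""
    return f"{scheme}://{target}{metrics_path}"


def scrape_once(url, timeout_seconds):
    """Fetch the metrics text once, giving up after the timeout."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.read().decode("utf-8")


# Values copied from the 'spring' job in the configuration file.
job = {
    "scrape_interval": 15,
    "scrape_timeout": 10,
    "metrics_path": "/actuator/prometheus",
    "targets": ["192.168.1.252:9562"],
}

urls = [scrape_url(target, job["metrics_path"]) for target in job["targets"]]
print(urls)
```

Calling scrape_once(urls[0], job["scrape_timeout"]) once per scrape_interval would approximate what a single job does for a single target; it is left uncalled here because it needs the demo application to be running.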

Finally, start the Prometheus service with Docker using the following command:

[root@localhost ~]# docker run -d -p 9090:9090 -v /etc/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus --config.file=/etc/prometheus/prometheus.yml

After a successful start, visiting http://{ip}:9090 should show the Prometheus home page:
[Screenshot: Prometheus home page]

Click Insert metric at cursor to choose a metric to monitor; click Graph to display the metric graphically; click the Execute button to see results similar to the following:
[Screenshot: metric graph in the Prometheus UI]

Function description:

  • Insert metric at cursor: select the metric to display
  • Graph: display the metric as a graph
  • Execute: run the query and draw the metric's chart
  • Add Graph: add charts for more metrics

Grafana visualization

In the previous section we successfully set up the Prometheus service and briefly introduced its built-in interface for visualizing monitoring data. However, the user experience is not great and the features are limited. Next, let's integrate Grafana to build a friendlier monitoring visualization platform that is closer to production use.

Grafana also needs to be installed on a server. For simplicity, I again use Docker. For other installation methods, refer to the official installation documentation.

With Docker, a single command starts Grafana:

[root@localhost ~]# docker run -d -p 3000:3000 grafana/grafana

Configuring the monitoring data source

After Grafana starts successfully, visit http://{ip}:3000/login to log in. The default username and password are both admin:
[Screenshot: Grafana login page]

After a successful login, the home page looks like this:
[Screenshot: Grafana home page]

First, add a monitoring data source. Click Add data source on the home page to see a screen similar to the following:
[Screenshot: data source type selection]

Click Prometheus here to see a screen similar to the following, where you configure the connection details of the Prometheus service:
[Screenshot: Prometheus data source settings]

After the data source is saved successfully, the following prompt appears:
[Screenshot: confirmation that the data source was saved]
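The same data source can in principle be created without the UI, through Grafana's HTTP API. The following Python sketch is an assumption-laden alternative to the clicks above, not the method used in this article: it assumes the POST /api/datasources endpoint with basic authentication, the default admin account, and the {ip} placeholders, which must be replaced with real addresses. The helper names are hypothetical.

```python
import base64
import json
import urllib.request


def prometheus_datasource(name, prometheus_url):
    """Build the JSON body describing a Prometheus data source."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        # "proxy" means the Grafana server, not the browser, calls Prometheus.
        "access": "proxy",
    }


def create_datasource(grafana_url, user, password, datasource):
    """POST the data source to Grafana using basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(datasource).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# {ip} is the same placeholder used in the text; substitute the real address.
payload = prometheus_datasource("Prometheus", "http://{ip}:9090")
print(json.dumps(payload, indent=2))
```

With both services running, create_datasource("http://{ip}:3000", "admin", "admin", payload) would send the request; it is not called here because it needs a live Grafana instance.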


Creating a monitoring Dashboard

Click the + button in the navigation bar and then click Dashboard to see a screen similar to the following:
[Screenshot: new Dashboard page]

Click Add Query to see a screen similar to the following:
[Screenshot: query editor]

Add a metric query at the location marked by the red box. Metric names can be taken from the Spring Boot application's /actuator/prometheus endpoint, for example jvm_memory_used_bytes, jvm_threads_states_threads, jvm_threads_live_threads, and so on.

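Grafana offers helpful suggestions as you type and supports fairly complex calculations, such as aggregation, sum, and average. To draw multiple lines, click the Add Query button. As shown in the figure above, I drew two lines on the chart, representing the daemon and peak thread counts respectively.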
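The aggregations mentioned here are PromQL expressions, and the same expressions can be sent directly to Prometheus's instant-query endpoint (/api/v1/query). The Python sketch below shows one way to build such a request and read the result. The helper names are hypothetical, http://localhost:9090 is an assumed address, and sample_response is built by hand from the sample output earlier in this article (the per-area sums of jvm_memory_used_bytes), not captured from a real server.

```python
import json
import urllib.parse
import urllib.request


def instant_query_url(base_url, promql):
    """Build the URL for Prometheus's instant-query endpoint."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def vector_values(response_body):
    """Extract (labels, value) pairs from an instant-vector response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError("query failed: " + str(payload.get("error")))
    return [(item["metric"], float(item["value"][1]))
            for item in payload["data"]["result"]]


def run_query(base_url, promql, timeout_seconds=10):
    """Send the query to a running Prometheus and parse the answer."""
    url = instant_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return vector_values(response.read().decode("utf-8"))


# Sum used memory per area (heap / nonheap) for the demo application.
promql = 'sum by (area) (jvm_memory_used_bytes{application="prometheus-demo"})'
print(instant_query_url("http://localhost:9090", promql))

# Hand-built response in the shape the endpoint returns (illustrative only).
sample_response = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"area": "heap"}, "value": [1577697338.641, "33584704"]},
            {"metric": {"area": "nonheap"}, "value": [1577697338.641, "46914688"]},
        ],
    },
})
for labels, value in vector_values(sample_response):
    print(labels, value)
```

The expression in promql can be pasted unchanged into a Grafana query field; replacing sum with avg, or changing the label in by (...), gives the other aggregations.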

Click the button shown below and fill in Title to set the chart title:
[Screenshot: chart title setting]

To add a new chart to the Dashboard, click the button in the upper-left corner of the figure above:
[Screenshot: add chart button]

Then follow the steps shown below:
[Screenshot: steps for adding a chart]

To save the Dashboard, click the save button in the upper-right corner:
[Screenshots: saving the Dashboard]


Dashboard marketplace

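At this point we have successfully integrated Grafana with Prometheus and achieved fairly rich chart displays: the monitoring metrics we care about can be placed on a Dashboard with a lot of flexibility. However, although this configuration is not difficult, it is still quite time-consuming.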

So are there ready-made Dashboards that are powerful, general-purpose, and usable out of the box? The answer is yes. Go to Grafana Labs - Dashboards and enter a keyword to search for a Dashboard:
[Screenshot: Dashboard search results]

As shown above, there are several Dashboards that use Prometheus as the data source and support Micrometer. Below is a brief demonstration of how to use the JVM (Micrometer) Dashboard. Click JVM (Micrometer) to open the Dashboard's detail page, shown below:
[Screenshot: JVM (Micrometer) Dashboard detail page]

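The page describes the Dashboard's features and configuration in detail. The management.metrics.tags.application setting it mentions was already configured earlier, in the project configuration. The number 4701, marked with a red box in the upper-right corner of the page, is important because it is the Dashboard's id.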

Back on the Grafana home page, import this Dashboard by following the steps shown below:
[Screenshot: steps for importing a Dashboard]

After entering the id, you will see a screen similar to the following. Select the data source and click Import:
[Screenshot: import settings with data source selection]

You will then see a screen similar to the following. The Dashboard already covers the metrics we commonly care about:
[Screenshot: imported JVM (Micrometer) Dashboard]

The option bar at the top lets you switch between different services/applications:
[Screenshot: service/application selector]

In addition, there are some other Dashboards that are relatively easy to use. You can explore them on your own; I will not go into them here.


Origin blog.51cto.com/zero01/2463452