Monitoring service health status with Eureka Server and Prometheus

Background

  The existing service-monitoring components tend to detect problems late. In an earlier incident, one business service exceeded its allotted DB resources, its health checks failed, and its instances dropped out of Eureka one after another. The service monitoring either was not routed to the affected node in time, or was routed there but did not hit the failing scenario, so no alarm threshold was triggered. The business looked normal for a short while as instances kept going offline. As the registry, Eureka Server can see registration status earlier. It covers two scenarios: an instance going down (fewer registered instances) and an instance whose status is not UP.

Monitoring scheme

  • Eureka Server periodically collects registration information: the instances of each application and the status of each instance
  • Prometheus periodically scrapes the data collected by Eureka Server
  • Grafana queries the data and raises alerts

Collecting Eureka registration data

Metric definitions

  • Instance status

    type: Gauge

eureka_instance_status{client="{client}",status="{status}"}

client: the Eureka client application name

status: the instance status, encoded as a number

Status          Value
UP              1
STARTING        2
OUT_OF_SERVICE  3
UNKNOWN         4
DOWN            5

If the average over the most recent period n is greater than 1, at least one sample was not UP; this is treated as abnormal and triggers an alert.
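The encoding and the alert rule above can be sketched as follows. This is a minimal illustration, not the author's code: the class name `StatusEncoding` and its methods are hypothetical, the fallback to 4 for unrecognized names is an assumption, and the average is computed over a plain array of samples rather than by Grafana.

```java
// Hypothetical helper illustrating the status encoding and the "average > 1" rule.
class StatusEncoding {

    // Returns the gauge value for a status name, following the status table.
    // Unrecognized names fall back to UNKNOWN (4); this fallback is an assumption.
    static int toValue(String status) {
        switch (status) {
            case "UP":
                return 1;
            case "STARTING":
                return 2;
            case "OUT_OF_SERVICE":
                return 3;
            case "DOWN":
                return 5;
            default:
                return 4;
        }
    }

    // True when the average of the sampled gauge values exceeds 1,
    // which can only happen if at least one sample was not UP.
    static boolean isAbnormal(int[] samples) {
        if (samples.length == 0) {
            return false; // no data: nothing to judge (assumption)
        }
        double sum = 0;
        for (int s : samples) {
            sum += s;
        }
        return sum / samples.length > 1.0;
    }

    public static void main(String[] args) {
        int[] healthy = {toValue("UP"), toValue("UP"), toValue("UP")};
        int[] degraded = {toValue("UP"), toValue("DOWN"), toValue("UP")};
        System.out.println(isAbnormal(healthy));  // false
        System.out.println(isAbnormal(degraded)); // true
    }
}
```

Because UP is the smallest value, any other status pulls the average above 1, so a single threshold covers every non-UP state.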

  • Instance count

    type: Gauge

eureka_instance_count{client="{client}",count="{count}"}

client: the Eureka client application name

count: the number of registered instances of that client
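For orientation, the sketch below renders one gauge sample in the Prometheus text exposition format, which is what a scrape of the Eureka Server returns. It is an illustration only: the class `GaugeSample` and its `render` method are hypothetical, and in the real service the Prometheus client library produces this output. The metric name carries the `mall_` prefix used in the Java code of this post, where `client` is the only label and the status or count is the sample value.

```java
// Hypothetical renderer for a single gauge sample in the Prometheus text format.
class GaugeSample {

    // Builds the HELP line, the TYPE line and one sample line for a gauge
    // that has a single "client" label.
    static String render(String name, String help, String client, double value) {
        // Label values escape backslash, double quote and line feed.
        String escaped = client
                .replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
        return "# HELP " + name + " " + help + "\n"
                + "# TYPE " + name + " gauge\n"
                + name + "{client=\"" + escaped + "\"} " + value + "\n";
    }

    public static void main(String[] args) {
        // The client name is taken from the Grafana query in this post;
        // the value 3 is made up for the example.
        System.out.print(render("mall_eureka_instance_count", "instance count", "MGMALL-CONFIG", 3));
    }
}
```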

Maven dependencies (pom.xml)

<!-- Compatible with Spring Boot 2.x -->
<!-- The client -->
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient</artifactId>
    <version>0.6.0</version>
</dependency>
<!-- Hotspot JVM metrics-->
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_hotspot</artifactId>
    <version>0.6.0</version>
</dependency>
<!-- Exposition HTTPServer-->
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_httpserver</artifactId>
    <version>0.6.0</version>
</dependency>
<!-- Pushgateway exposition-->
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_pushgateway</artifactId>
    <version>0.6.0</version>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.1.4</version>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-core</artifactId>
    <version>1.1.4</version>
</dependency>

Java code

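// InstanceStateCollector.java
import com.netflix.appinfo.InstanceInfo.InstanceStatus;
import com.netflix.discovery.shared.Applications;
import com.netflix.eureka.registry.PeerAwareInstanceRegistry;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class InstanceStateCollector {

    private static final Logger log = LoggerFactory.getLogger(InstanceStateCollector.class);

    @Autowired
    PeerAwareInstanceRegistry registry;

    @Autowired
    PrometheusMetricsService metricsService;

    // Runs every 5 seconds; scheduling must be enabled with @EnableScheduling.
    @Scheduled(cron = "*/5 * * * * ?")
    public void collect() {
        Applications applications = registry.getApplications();

        applications.getRegisteredApplications().forEach((registeredApplication) -> {
            int count = registeredApplication.size();
            String client = registeredApplication.getName();

            log.debug("client :{}, count :{}", client, count);
            metricsService.metricInstanceCount(client, count);

            registeredApplication.getInstances().forEach((instance) -> {
                String instanceId = instance.getInstanceId();
                InstanceStatus status = instance.getStatus();
                log.debug("client :{}, instance :{}, status :{}", client, instanceId, status);
                // The gauge is labelled by client only, so the last instance processed wins.
                metricsService.metricInstanceStatus(client, toValue(status));
            });
        });
    }

    // Maps a Eureka status to the numeric value listed in the status table.
    static int toValue(InstanceStatus status) {
        switch (status) {
            case UP: return 1;
            case STARTING: return 2;
            case OUT_OF_SERVICE: return 3;
            case DOWN: return 5;
            default: return 4; // UNKNOWN
        }
    }
}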

// PrometheusMetricsService.java
import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.Gauge;
import org.springframework.stereotype.Service;

@Service
public class PrometheusMetricsService {

    /**
     * Instance status gauge.
     * mall_eureka_instance_status{client="{client}"} with the status value as the sample
     */
    private static final String EUREKA_INSTANCE_STATUS = "mall_eureka_instance_status";

    /**
     * Instance count gauge.
     * mall_eureka_instance_count{client="{client}"} with the count as the sample
     */
    private static final String EUREKA_INSTANCE_COUNT = "mall_eureka_instance_count";

    private static final String LABEL_CLIENT = "client";

    private final Gauge instanceStatusGauge;
    private final Gauge instanceCountGauge;

    public PrometheusMetricsService(CollectorRegistry registry) {
        instanceStatusGauge = Gauge
                .build(EUREKA_INSTANCE_STATUS, "instance status")
                .labelNames(LABEL_CLIENT)
                .register(registry);

        instanceCountGauge = Gauge
                .build(EUREKA_INSTANCE_COUNT, "instance count")
                .labelNames(LABEL_CLIENT)
                .register(registry);
    }

    /**
     * Records the instance status.
     *
     * @param client      client name (application name)
     * @param statusValue numeric status value
     */
    void metricInstanceStatus(String client, Integer statusValue) {
        instanceStatusGauge.labels(client).set(statusValue);
    }

    /**
     * Records the instance count.
     *
     * @param client client name (application name)
     * @param count  number of registered instances
     */
    void metricInstanceCount(String client, Integer count) {
        instanceCountGauge.labels(client).set(count);
    }
}

Scraping Eureka Server data with Prometheus

prometheus.yml

  - job_name: 'mgmall-eureka'
    scrape_interval: 10s 
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['10.124.129.42:19110']

Grafana dashboards and alerts

Dashboard panel

mall_eureka_instance_count{client="MGMALL-CONFIG"}
.....

![image-20190531140258528](/Users/yugj/Library/Application Support/typora-user-images/image-20190531140258528.png)

Alert

WHEN avg() OF query(A, 10s, now) IS BELOW 1

![image-20190531140319350](/Users/yugj/Library/Application Support/typora-user-images/image-20190531140319350.png)
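The alert condition above averages the series over the last 10 seconds and fires when the result is below 1. A rough sketch of that evaluation follows, assuming timestamped samples held in memory. The class `WindowAlert`, the `Sample` type and the no-data behaviour are hypothetical; Grafana performs the real evaluation and handles missing data through its own settings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical evaluator for a condition of the form "avg() of last N seconds is below T".
class WindowAlert {

    // One scraped value with its timestamp in epoch seconds.
    static final class Sample {
        final long epochSeconds;
        final double value;

        Sample(long epochSeconds, double value) {
            this.epochSeconds = epochSeconds;
            this.value = value;
        }
    }

    // Averages the samples inside (now - window, now] and compares with the threshold.
    static boolean avgBelow(List<Sample> series, long nowSeconds, long windowSeconds, double threshold) {
        double sum = 0;
        int n = 0;
        for (Sample s : series) {
            if (s.epochSeconds > nowSeconds - windowSeconds && s.epochSeconds <= nowSeconds) {
                sum += s.value;
                n++;
            }
        }
        if (n == 0) {
            return false; // no samples in the window: do not fire (assumption)
        }
        return sum / n < threshold;
    }

    public static void main(String[] args) {
        List<Sample> counts = new ArrayList<>();
        counts.add(new Sample(100, 2)); // outside the window when now = 110
        counts.add(new Sample(105, 0));
        counts.add(new Sample(109, 1));
        System.out.println(avgBelow(counts, 110, 10, 1.0)); // true, the average is 0.5
    }
}
```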

Reproduced from: https://my.oschina.net/yugj/blog/3056695

Origin blog.csdn.net/weixin_33796177/article/details/91629932