Background
Service monitoring components usually have to catch a failure pattern that starts early: a business service uses more of a DB resource than its quota allows, its health checks fail, and its instances go offline from Eureka one after another. If monitoring traffic is not routed to the affected node in time, or is routed there but does not hit the failing scenario, the alarm threshold is never triggered, so the business looks normal for a short time and then the instances drop offline one by one. The Eureka server, as the service registry, can detect this earlier because it sees the registration state directly: instance nodes going down (fewer instances registered) and nodes whose status is not UP.
Monitoring scheme
- The Eureka server periodically collects registration information: the number of instance nodes and the status of each instance node
- Prometheus periodically scrapes the data collected by the Eureka server
- Grafana queries the data and raises alarms
Collecting Eureka registration data
Metric data structure definition
- Node status statistics
type: Gauge
mall_eureka_instance_status{client="{client}"} {status}
client: Eureka client application name
status: the gauge value, one of the enumeration values below
Status | Enumeration value |
---|---|
UP | 1 |
DOWN | 5 |
STARTING | 2 |
OUT_OF_SERVICE | 3 |
UNKNOWN | 4 |
If the average over the last n samples is greater than 1, at least one sample was not UP; this indicates an abnormality and an alarm is raised.
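The status mapping and the alarm rule above can be sketched in plain Java. This is only an illustration under stated assumptions: `StatusAlarmSketch`, `average` and `isAbnormal` are hypothetical names that do not exist in the original project, and the enum simply mirrors the values from the table.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of the original project. */
final class StatusAlarmSketch {

    /** Mirrors the Eureka status names and the enumeration values from the table. */
    enum Status {
        UP(1), STARTING(2), OUT_OF_SERVICE(3), UNKNOWN(4), DOWN(5);

        final int value;

        Status(int value) {
            this.value = value;
        }
    }

    /** Average gauge value over the given samples; 0 when there are no samples. */
    static double average(List<Status> samples) {
        return samples.stream().mapToInt(s -> s.value).average().orElse(0.0);
    }

    /** The rule from the text: an average above 1 means some sample was not UP. */
    static boolean isAbnormal(List<Status> samples) {
        return average(samples) > 1.0;
    }

    public static void main(String[] args) {
        System.out.println(isAbnormal(List.of(Status.UP, Status.UP, Status.UP)));
        System.out.println(isAbnormal(List.of(Status.UP, Status.DOWN, Status.UP)));
    }
}
```

Because UP is the smallest value, the average equals 1 only when every sample is UP, which is why a simple "greater than 1" threshold is enough.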
- Node count statistics
type: Gauge
mall_eureka_instance_count{client="{client}"} {count}
client: Eureka client application name
count: the gauge value, the number of instances registered for the client
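To make the metric definition concrete, the following sketch renders a gauge in the Prometheus text exposition format, which is roughly what a scrape of the Eureka server would return for this metric. It is written in plain Java without the Prometheus client library; `ExpositionSketch` and `renderGauge` are hypothetical names, and the sample value 2 is made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; the real output is produced by the Prometheus client. */
final class ExpositionSketch {

    /** Renders one gauge family: HELP and TYPE lines, then one sample line per client. */
    static String renderGauge(String name, String help, Map<String, Integer> valueByClient) {
        StringBuilder out = new StringBuilder();
        out.append("# HELP ").append(name).append(' ').append(help).append('\n');
        out.append("# TYPE ").append(name).append(" gauge\n");
        // Label values are written as-is; escaping of backslash, quote and newline is omitted here.
        valueByClient.forEach((client, value) ->
                out.append(name).append("{client=\"").append(client).append("\"} ")
                   .append(value.doubleValue()).append('\n'));
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("MGMALL-CONFIG", 2); // made-up sample value
        System.out.print(renderGauge("mall_eureka_instance_count", "instance count", counts));
    }
}
```

The count is carried as the sample value rather than as a label, so each client produces exactly one time series no matter how its instance count changes.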
Java pom dependencies
<!-- Compatible with Spring Boot 2.x -->
<!-- The client -->
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient</artifactId>
    <version>0.6.0</version>
</dependency>
<!-- Hotspot JVM metrics -->
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_hotspot</artifactId>
    <version>0.6.0</version>
</dependency>
<!-- Exposition HTTPServer -->
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_httpserver</artifactId>
    <version>0.6.0</version>
</dependency>
<!-- Pushgateway exposition -->
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_pushgateway</artifactId>
    <version>0.6.0</version>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <version>1.1.4</version>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-core</artifactId>
    <version>1.1.4</version>
</dependency>
Java code
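import com.netflix.appinfo.InstanceInfo;
import com.netflix.discovery.shared.Applications;
import com.netflix.eureka.registry.PeerAwareInstanceRegistry;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class InstanceStateCollector {
    private static final Logger log = LoggerFactory.getLogger(InstanceStateCollector.class);
    @Autowired PeerAwareInstanceRegistry registry;
    @Autowired PrometheusMetricsService metricsService;
    // Runs every 5 seconds; scheduling must be enabled with @EnableScheduling.
    @Scheduled(cron = "*/5 * * * * ?")
    public void collect() {
        Applications applications = registry.getApplications();
        applications.getRegisteredApplications().forEach(application -> {
            String client = application.getName();
            int count = application.size();
            log.debug("client :{}, count :{}", client, count);
            metricsService.metricInstanceCount(client, count);
            // The status gauge is labelled by client only, so report the worst (highest) instance status.
            int worst = 0;
            for (InstanceInfo instance : application.getInstances()) {
                log.debug("client :{}, instance :{}, status :{}", client, instance.getInstanceId(), instance.getStatus());
                worst = Math.max(worst, statusValue(instance.getStatus()));
            }
            metricsService.metricInstanceStatus(client, worst);
        });
    }
    private static int statusValue(InstanceInfo.InstanceStatus status) { // values from the status table
        switch (status) {
            case UP: return 1;
            case STARTING: return 2;
            case OUT_OF_SERVICE: return 3;
            case UNKNOWN: return 4;
            default: return 5; // DOWN
        }
    }
}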
import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.Gauge;
import org.springframework.stereotype.Service;

@Service
public class PrometheusMetricsService {
    /**
     * Instance status statistics.
     * mall_eureka_instance_status{client="{client}"}, value = status enumeration value
     */
    private static final String EUREKA_INSTANCE_STATUS = "mall_eureka_instance_status";
    /**
     * Instance count statistics.
     * mall_eureka_instance_count{client="{client}"}, value = number of instances
     */
    private static final String EUREKA_INSTANCE_COUNT = "mall_eureka_instance_count";
    private static final String LABEL_CLIENT = "client";
    private final Gauge instanceStatusGauge;
    private final Gauge instanceCountGauge;

    public PrometheusMetricsService(CollectorRegistry registry) {
        // Both gauges carry a single "client" label and are registered once at startup.
        instanceStatusGauge = Gauge
                .build(EUREKA_INSTANCE_STATUS, "instance status")
                .labelNames(LABEL_CLIENT)
                .register(registry);
        instanceCountGauge = Gauge
                .build(EUREKA_INSTANCE_COUNT, "instance count")
                .labelNames(LABEL_CLIENT)
                .register(registry);
    }

    /**
     * Records the instance status.
     *
     * @param client      client name (application name)
     * @param statusValue status enumeration value
     */
    void metricInstanceStatus(String client, Integer statusValue) {
        instanceStatusGauge.labels(client).set(statusValue);
    }

    /**
     * Records the instance count.
     *
     * @param client client name (application name)
     * @param count  number of registered instances
     */
    void metricInstanceCount(String client, Integer count) {
        instanceCountGauge.labels(client).set(count);
    }
}
Prometheus scraping of Eureka server data
prometheus.yml
  - job_name: 'mgmall-eureka'
    scrape_interval: 10s
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['10.124.129.42:19110']
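Before building dashboards, it can help to confirm that the scrape target really exposes the metric. The sketch below parses a response body from `/actuator/prometheus` that has already been fetched as a string (for example with a browser or an HTTP client) and extracts the per-client values. `ScrapeCheckSketch` and `valuesFor` are hypothetical names, and the parser is deliberately narrow: it assumes sample lines with a single `client` label, a plain numeric value and no timestamp.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not a full exposition-format parser. */
final class ScrapeCheckSketch {

    // Matches lines such as: mall_eureka_instance_count{client="MGMALL-CONFIG"} 2.0
    private static final Pattern SAMPLE =
            Pattern.compile("^(\\w+)\\{client=\"([^\"]*)\"\\}\\s+(\\S+)$");

    /** Returns client -> value for the named metric, skipping comments and other metrics. */
    static Map<String, Double> valuesFor(String metric, String scrapeBody) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (String line : scrapeBody.split("\n")) {
            Matcher m = SAMPLE.matcher(line.trim());
            if (m.matches() && m.group(1).equals(metric)) {
                result.put(m.group(2), Double.parseDouble(m.group(3)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String body = "# TYPE mall_eureka_instance_count gauge\n"
                + "mall_eureka_instance_count{client=\"MGMALL-CONFIG\"} 2.0\n"; // made-up sample
        System.out.println(valuesFor("mall_eureka_instance_count", body));
    }
}
```

An empty result for `mall_eureka_instance_count` would suggest that the collector has not run yet or that the gauges are registered with a different registry than the one the endpoint serves.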
Grafana dashboard configuration
Panel query
mall_eureka_instance_count{client="MGMALL-CONFIG"}
.....
![image-20190531140258528](/Users/yugj/Library/Application Support/typora-user-images/image-20190531140258528.png)
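Alert rule
WHEN avg() OF query(A, 10s, now) IS BELOW 1, that is, the alarm fires when the average of query A over the last 10 seconds is below 1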
![image-20190531140319350](/Users/yugj/Library/Application Support/typora-user-images/image-20190531140319350.png)
Reproduced from: https://my.oschina.net/yugj/blog/3056695