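I've been working overtime every day for the past few months, so it's been a while since I last updated this blog.
Over the last couple of days I've been tracking down a CPU spike that left a service unresponsive. The usual routine: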
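jps                  # find the PID of the Java process
top -H -p <pid>      # list its threads and note the ID of the thread burning CPU
jstack -l <pid>      # dump all stacks, then look up the hot thread by its nid (the thread ID in hex)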
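One detail in the routine above is easy to trip over: top -H shows thread IDs in decimal, while jstack prints each thread's native ID as a hexadecimal nid=0x... value. Here is a minimal sketch of the conversion. The class and method names are made up for illustration.

```java
class NidConverter {

    // Converts the decimal thread ID shown by `top -H` into the
    // hexadecimal form that jstack prints after "nid=".
    static String toNid(long threadId) {
        return String.format("0x%x", threadId);
    }

    public static void main(String[] args) {
        // Pass the decimal thread ID as the first argument, e.g. 12345
        long threadId = args.length > 0 ? Long.parseLong(args[0]) : 12345L;
        System.out.println("nid=" + toNid(threadId));
    }
}
```

For example, a thread that top lists as 12345 shows up in the jstack output as nid=0x3039.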
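I expect most of you know this routine well. With many nodes, though, you have to log in to each machine or container in turn to inspect the JVM stacks, which costs time and effort. A visual tool makes this much easier. Arthas has to be deployed as a separate process, which makes deployment more involved. With Spring Boot Actuator as the data-collection client and Spring Boot Admin as the web UI, the impact on the application is small, so I went with the latter. Enough talk, here's the code.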
1. Business service (monitoring client)
1) pom
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>de.codecentric</groupId>
    <artifactId>spring-boot-admin-starter-client</artifactId>
    <version>2.3.1</version>
</dependency>
2) Configuration
server:
  port: 8080
spring:
  application:
    name: test_server
    admin:
      enabled: true
  boot:
    admin:
      client:
        # If client and server are not on the same node, use the internal network address and port.
        # For container deployments, map the container port to the host.
        # Spring Boot Admin server address
        url: http://127.0.0.1:3001
        # Address of this Spring Boot Admin client
        instance:
          service-url: http://127.0.0.1:8080
# Spring Boot Actuator
management:
  endpoints:
    web:
      exposure:
        include: info,health,metrics,httptrace,env,scheduledtasks,threaddump,heapdump
  endpoint:
    health:
      show-details: always
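With threaddump exposed, the admin UI can show each thread and its state without anyone logging in to the node. To give a rough feel for the kind of data behind that view, here is a standard-library sketch that takes a thread dump inside the current JVM and counts threads by state. It illustrates the idea only and is not Actuator's own code; the class name is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

class ThreadStateSummary {

    // Takes a dump of all live threads in this JVM and counts them by state.
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        // false, false: skip lock details, only the thread states are needed here
        ThreadInfo[] infos = threadBean.dumpAllThreads(false, false);
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // defensive: skip entries for threads that have already exited
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByState().forEach((state, n) -> System.out.println(state + " = " + n));
    }
}
```

A sudden pile-up of BLOCKED or WAITING threads in such a summary is usually the first hint of where to look in the full dump.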
2. Monitoring and visualization service (monitoring server)
1) pom
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>de.codecentric</groupId>
            <artifactId>spring-boot-admin-dependencies</artifactId>
            <version>${spring-boot-admin.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>de.codecentric</groupId>
        <artifactId>spring-boot-admin-starter-server</artifactId>
    </dependency>
</dependencies>
2) Configuration
server:
  port: 3001
spring:
  application:
    name: springboot-admin-server
    admin:
      enabled: true
3) Main class
import de.codecentric.boot.admin.server.config.EnableAdminServer;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
@EnableAdminServer
public class SpringbootAdminApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringbootAdminApplication.class, args);
    }
}
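Note: besides its default metrics, Spring Boot Actuator also supports custom ones. For example, to monitor database connection pool usage, add the following to your code: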
import com.alibaba.druid.pool.DruidDataSource;
import com.baomidou.dynamic.datasource.DynamicRoutingDataSource;
import com.baomidou.dynamic.datasource.ds.ItemDataSource;
import com.zaxxer.hikari.HikariDataSource;
import com.zaxxer.hikari.HikariPoolMXBean;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.health.AbstractHealthIndicator;
import org.springframework.boot.actuate.health.Health;
import org.springframework.stereotype.Component;

import javax.sql.DataSource;
import java.util.Map;

@Component
public class ConnectionPoolHealthIndicator extends AbstractHealthIndicator {

    @Autowired
    private DataSource dataSource;

    @Override
    protected void doHealthCheck(Health.Builder builder) throws Exception {
        // MySQL connection pool monitoring
        if (dataSource instanceof HikariDataSource) {
            // Single HikariCP pool
            HikariPoolMXBean mysqlPool = ((HikariDataSource) dataSource).getHikariPoolMXBean();
            if (mysqlPool == null) {
                // HikariCP creates the pool lazily, so there is nothing to report before first use
                builder.unknown().withDetail("mysql", "connection pool not initialized yet");
                return;
            }
            builder.up()
                    .withDetail("mysql-total-connections", mysqlPool.getTotalConnections())
                    .withDetail("mysql-active-connections", mysqlPool.getActiveConnections())
                    .withDetail("mysql-idle-connections", mysqlPool.getIdleConnections())
                    .withDetail("mysql-threads-awaiting-connection", mysqlPool.getThreadsAwaitingConnection());
        } else if (dataSource instanceof DynamicRoutingDataSource) {
            // dynamic-datasource: one Druid pool per named data source
            Map<String, DataSource> dataSourceMap = ((DynamicRoutingDataSource) dataSource).getCurrentDataSources();
            builder.up();
            for (Map.Entry<String, DataSource> entry : dataSourceMap.entrySet()) {
                String key = entry.getKey();
                DataSource real = entry.getValue();
                if (real instanceof ItemDataSource) {
                    // Unwrap the dynamic-datasource wrapper to reach the actual pool
                    real = ((ItemDataSource) real).getRealDataSource();
                }
                if (!(real instanceof DruidDataSource)) {
                    continue; // only Druid pools are reported here
                }
                DruidDataSource druidDataSource = (DruidDataSource) real;
                builder.withDetail("mysql-" + key + "-ConnectCount", druidDataSource.getConnectCount())
                        .withDetail("mysql-" + key + "-CloseCount", druidDataSource.getCloseCount())
                        .withDetail("mysql-" + key + "-ConnectErrorCount", druidDataSource.getConnectErrorCount())
                        .withDetail("mysql-" + key + "-PoolingCount", druidDataSource.getPoolingCount())
                        .withDetail("mysql-" + key + "-ActiveCount", druidDataSource.getActiveCount())
                        .withDetail("mysql-" + key + "-NotEmptyWaitCount", druidDataSource.getNotEmptyWaitCount())
                        .withDetail("mysql-" + key + "-NotEmptyWaitThreadCount", druidDataSource.getNotEmptyWaitThreadCount());
            }
        }
    }
}
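The indicator above reports UP whenever it can read the pool statistics, so the numbers are informational only. If you also want the health status itself to react to pool exhaustion, one possible rule is sketched below in plain Java. The class, the enum, and the 0.9 threshold are all hypothetical; inside doHealthCheck the result would map to builder.up() or builder.outOfService().

```java
class PoolHealthRule {

    enum Status { UP, OUT_OF_SERVICE }

    // Share of connections currently in use; 0 when the pool has no connections yet.
    static double utilization(int active, int total) {
        return total <= 0 ? 0.0 : (double) active / total;
    }

    // Flags the pool only when it is nearly full AND threads are already queuing,
    // so a busy pool that still keeps up does not trigger an alert.
    static Status evaluate(int active, int total, int waiting, double maxUtilization) {
        boolean saturated = utilization(active, total) >= maxUtilization;
        return (saturated && waiting > 0) ? Status.OUT_OF_SERVICE : Status.UP;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(3, 10, 0, 0.9));   // prints UP
        System.out.println(evaluate(10, 10, 4, 0.9));  // prints OUT_OF_SERVICE
    }
}
```

Requiring both conditions is a deliberate choice here: utilization alone is noisy under normal load, while waiting threads alone can be a momentary blip.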