Alibaba's Open-Source Java Diagnostic Tool Arthas: Hands-On Practice

1. Start

# Download and start Arthas
wget https://arthas.aliyun.com/arthas-boot.jar;java -jar arthas-boot.jar --target-ip 0.0.0.0
# Set the file encoding to avoid garbled Chinese characters
java -Dfile.encoding=UTF-8 -jar arthas-boot.jar

2. Variables available in OGNL expressions

  • loader
  • clazz
  • method
  • target
  • params
  • returnObj
  • throwExp
  • isBefore
  • isThrow
  • isReturn

3. Monitoring parameters

# Watch all parameters
watch com.xxx.iot.web.DeviceController * '{params}' -x 2
# Watch all parameters
watch com.xxx.iot.web.DeviceController * params -x 2
# Watch a single parameter by index
watch com.xxx.iot.web.DeviceController * params[0] -x 2

-x sets the traversal depth of the printed result. Increase it to see nested parameter and result content. The default value is 1.
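To make the idea of traversal depth concrete, here is a minimal Java sketch. It is a toy renderer and not Arthas's actual implementation; the class name DepthDemo and its methods are made up for illustration. It expands nested collections until the depth is used up and prints only a size summary below that, similar to the @ArrayList[isEmpty=false;size=5] summaries in the watch output.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Toy model of depth-limited rendering (hypothetical, not Arthas code).
class DepthDemo {

    /** Renders a value, expanding nested collections while depth remains. */
    static String render(Object value, int depth) {
        if (value == null) {
            return "null";
        }
        String type = "@" + value.getClass().getSimpleName();
        if (value instanceof Collection) {
            Collection<?> items = (Collection<?>) value;
            if (depth <= 0) {
                // Depth exhausted: print a summary instead of the elements.
                return type + "[isEmpty=" + items.isEmpty() + ";size=" + items.size() + "]";
            }
            StringBuilder sb = new StringBuilder(type).append("[");
            for (Object item : items) {
                sb.append(render(item, depth - 1)).append(",");
            }
            return sb.append("]").toString();
        }
        return type + "[" + value + "]";
    }

    /** Builds a small nested list: [[a, b], c]. */
    static List<Object> sample() {
        List<Object> inner = new ArrayList<>();
        inner.add("a");
        inner.add("b");
        List<Object> outer = new ArrayList<>();
        outer.add(inner);
        outer.add("c");
        return outer;
    }

    public static void main(String[] args) {
        // prints @ArrayList[@ArrayList[isEmpty=false;size=2],@String[c],]
        System.out.println(render(sample(), 1));
        // prints @ArrayList[@ArrayList[@String[a],@String[b],],@String[c],]
        System.out.println(render(sample(), 2));
    }
}
```

With depth 1 the inner list is only summarized; with depth 2 its elements become visible, which is the same effect as raising -x in watch.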

Monitor Controller parameters and return values

Listen for complete parameters and return values

# Watch parameters only
watch com.xxx.iot.web.DeviceController * '{params}' -x 2
# Or
watch com.xxx.iot.web.DeviceController * params -x 2
# Watch all methods
watch com.xxx.iot.web.DeviceController * '{params, target, returnObj}' -x 2
# Watch a specific method
watch com.xxx.iot.web.DeviceController getOnlineByCode '{params, target, returnObj}' -x 2

Suppose the controller method has the following signature:

Result<PageDTO<DriverDTO>> fetchDriverByDeviceCode(@Validated @Parameter(description = "实体", name = "param", required = true) @RequestBody DevicePageQueryDTO param)

Arthas prints the following:

method=com.xxx.iot.web.DeviceController.getOnlineByCode location=AtExit
ts=2023-04-10 16:57:49; [cost=16.8166ms] result=@ArrayList[
@Object[][
@DeviceCodeBatchQueryDTO[DeviceCodeBatchQueryDTO(deviceCodes=[1584381959985602561, 1589581864182480897, 150077101, TJ0205103, TJ0205211])],
],
@DeviceController[
log=@Logger[Logger[com.xxx.iot.web.DeviceController]],
deviceService=@DeviceServiceImpl[com.xxx.iot.service.impl.DeviceServiceImpl@5236007a],
],
@Result[
serialVersionUID=@Long[1],
traceId=null,
code=@Integer[0],
msg=null,
data=@ArrayList[isEmpty=false;size=5],
uri=null,
],
]

Monitor Kafka consumption

# Method signature: public void process(JSONObject messageBody, PhysicalDataModelDTO tdl, DeviceType deviceType, ProductDTO product, String productKey, String deviceKey, String messageId, String deviceMessageId)
watch com.xxx.iot.receiver.listener.KafkaPropertyReceiver process '{params, target, returnObj}' -x 2

Listen for a single parameter

watch com.xxx.iot.receiver.listener.KafkaPropertyReceiver process '{params[0], target, returnObj}' -x 2

The effect is as follows:

method=com.xxx.xxx.receiver.listener.KafkaPropertyReceiver.process location=AtExit
ts=2023-04-10 17:14:30; [cost=24.4472ms] result=@ArrayList[
    @JSONObject[
        @String[deviceKey]:@String[yangchen1],
        @String[messageId]:@String[54691],
        @String[params]:@JSONObject[isEmpty=false;size=11],
        @String[productKey]:@String[cu9f6bf82fc4444cc18774f2bc7d370685],
        @String[ts]:@Long[1681118069925],
        @String[version]:@String[1.0],
    ],
    @KafkaPropertyReceiver[
        log=@Logger[Logger[com.xxx.iot.receiver.listener.KafkaPropertyReceiver]],
        pushService=@PushServiceImpl[com.xxx.iot.push.impl.PushServiceImpl@1703b898],
        mongoTemplate=@MongoTemplate[org.springframework.data.mongodb.core.MongoTemplate@13857408],
        PARAMS=@String[params],
        productService=@ProductServiceImpl[com.xxx.iot.service.impl.ProductServiceImpl@196a3471],
        log=@Logger[Logger[com.xxx.iot.receiver.listener.KafkaPropertyReceiver]],
        title=@String[属性上报],
    ],
    null,
]

Listen for exceptions

The watch command supports the -e option, which captures a call only when it throws an exception:

watch com.example.demo.arthas.user.UserController * "{params[0],throwExp}" -e

Filter by time

The watch command can filter by call duration. For example, the following prints only calls that take longer than 200 ms:

watch com.example.demo.arthas.user.UserController * '{params, returnObj}' '#cost>200'

Filter by parameter value

With the condition below, accessing user/1 produces no output from watch, while accessing user/101 prints the result.

watch com.example.demo.arthas.user.UserController * returnObj 'params[0] > 100'

4. Variables and methods

Query static member variable value

ognl @com.xxx.iot.ArthasTest@hashSet
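For comparison, the same kind of lookup can be done in plain Java with reflection. The sketch below is only an illustration of reading a private static field by name; the classes StaticFieldDemo and Target are hypothetical stand-ins, and this is not a description of how Arthas or OGNL resolves the expression internally.

```java
import java.lang.reflect.Field;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: read a private static field by name via reflection.
class StaticFieldDemo {

    // Stand-in for the inspected class (mirrors the hashSet field of ArthasTest).
    static class Target {
        private static final Set<String> hashSet = new HashSet<>();

        static {
            hashSet.add("count0");
        }
    }

    /** Reads a static field by name, ignoring its access modifier. */
    static Object readStatic(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            field.setAccessible(true); // allow access to the private member
            return field.get(null);    // null receiver because the field is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        // prints [count0]
        System.out.println(readStatic(Target.class, "hashSet"));
    }
}
```

The advantage of the ognl command is that it does this against a running JVM without adding any such code to the application.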

Query a specific property of a configuration bean

# Method 1
ognl '@com.xxx.common.core.util.ServiceHelper@getBean("mqConfig").eventConcurrency'

View all properties of objects in the Spring container through the class loader

# 1. Get the class loader
[arthas@17860]$ sc -d *MqConfig | grep class-loader
class-loader      +-sun.misc.Launcher$AppClassLoader@18b4aac2
# 2. Get the object's properties
[arthas@17860]$ vmtool --action getInstances -c 18b4aac2 --className com.xxx.iot.mq.config.MqConfig  --limit 10 -x 2
@MqConfig[][
    @MqConfig[
        propertyConcurrency=@Integer[3],
        eventConcurrency=@Integer[1],
    ],
    @MqConfig$$EnhancerBySpringCGLIB$$1[

  • -c: hash of the class loader
  • -x: expansion depth of the result

# With -x 2 the nested collections are not expanded
[arthas@7]$ vmtool --action getInstances -c 31221be2 --className com.xxx.iot.config.IotConfig --limit 10 -x 2
@IotConfig[][
    @IotConfig[
        timeout=@Integer[10],
        whiteList=@LinkedHashSet[isEmpty=false;size=3],
        aiProductKey=@String[cu623ef0a8e8564602b464b8021d30f9ab],
        aiHost=@String[172.17.0.1,1xx.x0.2x4.xx7],
        apps=@ArrayList[isEmpty=false;size=4],
    ],

# With -x 3 the second level of data is visible
[arthas@7]$ vmtool --action getInstances -c 31221be2 --className com.xxx.iot.config.IotConfig --limit 10 -x 3
@IotConfig[][
    @IotConfig[
        timeout=@Integer[10],
        whiteList=@LinkedHashSet[
            @String[1xx.x0.2x4.xx7],
            @String[127.0.0.1],
        ],
        aiProductKey=@String[cu623ef0a8e8564602b464b8021d30f9ab],
        aiHost=@String[172.17.0.1,1xx.x0.2x4.xx7],
        apps=@ArrayList[
            @AppDTO[AppDTO(title=null, appKey=12201, appSecret=12301, whiteList=null)],
            @AppDTO[AppDTO(title=null, appKey=33, appSecret=23342, whiteList=null)],
            @AppDTO[AppDTO(title=本地测试, appKey=158173395456, appSecret=O0x7M7TE6AjDsUqxIfZ8zg0Y, whiteList=[127.0.0.1, 192.168.0.44, 192.168.0.88 ])],
            @AppDTO[AppDTO(title=xx信息, appKey=637173395456, appSecret=O0x7M7T6AjwrerwsdX8xgXa, whiteList=[127.0.0.1, 192.168.0.44, 192.168.0.88])],
        ],
    ]

Execute a static method

# 1. Get the hash of the class loader
sc -d com.xxx.iot.util.IotCacheUtil
# 2. Call the method with an argument
ognl -c 18b4aac2 '@com.xxx.iot.util.IotCacheUtil@getTdl("cu623ef0a8e8564602b464b8021d30f9ab")' -x 4

5. Decompile

jad com.example.demo.arthas.user.UserController
# Decompile to a specified file
jad --source-only com.example.demo.arthas.user.UserController > /tmp/UserController.java

6. Modify the log level

View the class loader of a class

# A fuzzy match is used below; an exact match also works: sc -d com.xxx.iot.web.DeviceController | grep class-loader
[arthas@17860]$ sc -d *DeviceController | grep class-loader
 class-loader      +-sun.misc.Launcher$AppClassLoader@18b4aac2
 class-loader      +-sun.misc.Launcher$AppClassLoader@18b4aac2

Get logger with ognl

[arthas@17860]$ ognl --classLoaderClass sun.misc.Launcher$AppClassLoader '@com.xxx.iot.web.DeviceController@log'
@Logger[
    serialVersionUID=@Long[5454405123156820674],
    FQCN=@String[ch.qos.logback.classic.Logger],
    name=@String[com.xxx.iot.web.DeviceController],
    level=null,
    effectiveLevelInt=@Integer[10000],
    parent=@Logger[Logger[com.xxx.iot.web]],
    childrenList=null,
    aai=null,
    additive=@Boolean[true],
    loggerContext=@LoggerContext[ch.qos.logback.classic.LoggerContext[logback]],
    lastUpdateCheckTime=@Long[1681119603828],
]

This shows that DeviceController@log is actually a logback logger. Because level=null, its effective level is inherited from an ancestor, ultimately the root logger.
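The inheritance rule can be pictured with a small toy model in Java. This is a simplified sketch and not logback's real implementation; the class ToyLogger is invented for illustration, and the root level DEBUG is chosen only because effectiveLevelInt=10000 in the output above corresponds to DEBUG in logback.

```java
// Toy model of logger level inheritance (hypothetical, not logback code).
class LevelInheritanceDemo {

    static class ToyLogger {
        final String name;
        final ToyLogger parent;
        String level; // null means "not set explicitly, inherit from parent"

        ToyLogger(String name, ToyLogger parent, String level) {
            this.name = name;
            this.parent = parent;
            this.level = level;
        }

        /** Walks up the parent chain until a logger with an explicit level is found. */
        String effectiveLevel() {
            ToyLogger current = this;
            while (current.level == null && current.parent != null) {
                current = current.parent;
            }
            return current.level;
        }
    }

    public static void main(String[] args) {
        ToyLogger root = new ToyLogger("ROOT", null, "DEBUG");
        ToyLogger web = new ToyLogger("com.xxx.iot.web", root, null);
        ToyLogger controller = new ToyLogger("com.xxx.iot.web.DeviceController", web, null);

        // level is null, so the effective level comes from ROOT: prints DEBUG
        System.out.println(controller.effectiveLevel());

        // An explicit level on the logger itself takes precedence: prints INFO
        controller.level = "INFO";
        System.out.println(controller.effectiveLevel());
    }
}
```

This is why setting a level on one logger affects only that logger and its descendants, while changing the root logger affects every logger without an explicit level.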

set level

Set the logger level of DeviceController separately

ognl --classLoaderClass sun.misc.Launcher$AppClassLoader '@com.xxx.iot.web.DeviceController@log.setLevel(@ch.qos.logback.classic.Level@DEBUG)'

Fetch DeviceController@log again and you can see that its level is now DEBUG:

[arthas@17860]$ ognl --classLoaderClass sun.misc.Launcher$AppClassLoader '@com.xxx.iot.web.DeviceController@log'
@Logger[
    serialVersionUID=@Long[5454405123156820674],
    FQCN=@String[ch.qos.logback.classic.Logger],
    name=@String[com.xxx.iot.web.DeviceController],
    level=@Level[DEBUG],
    effectiveLevelInt=@Integer[10000],
    parent=@Logger[Logger[com.xxx.iot.web]],
    childrenList=null,
    aai=null,
    additive=@Boolean[true],
    loggerContext=@LoggerContext[ch.qos.logback.classic.LoggerContext[logback]],
    lastUpdateCheckTime=@Long[1681119814259],
]

Modify the global logger level of logback (not recommended)

By obtaining the root logger, you can modify the global logger level:

ognl --classLoaderClass sun.misc.Launcher$AppClassLoader '@org.slf4j.LoggerFactory@getLogger("root").setLevel(@ch.qos.logback.classic.Level@DEBUG)'

Use the logger command to change a logback level that corresponds to a logging.level entry in the yml configuration

logging:
  level:
    org.springframework.data.mongodb.core.MongoTemplate: DEBUG

Change it to info:

# View the current log level
logger --name org.springframework.data.mongodb.core.MongoTemplate
# Change the level
logger --name org.springframework.data.mongodb.core.MongoTemplate --level info

7. Exit

# Exit the current session
exit
# Shut down Arthas completely
stop

8. High CPU usage and thread deadlock in practice

reference

Use Arthas to accurately locate the problem of high CPU load in Java applications #1202

test code

import java.util.HashSet;

public class ArthasTest {

    private static HashSet<String> hashSet = new HashSet<>();

    public static void main(String[] args) {
        // Simulate high CPU usage
        cpuHigh();
        // Simulate a thread deadlock
        deadThread();
        // Keep adding data to the hashSet collection
        addHashSetThread();
    }

    /**
     * Keeps adding data to the hashSet collection.
     */
    public static void addHashSetThread() {
        // Start a thread that adds one entry per second
        new Thread(() -> {
            int count = 0;
            while (true) {
                try {
                    hashSet.add("count" + count);
                    Thread.sleep(1000);
                    count++;
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }).start();
    }

    /**
     * Busy loop that keeps one CPU core fully occupied.
     */
    public static void cpuHigh() {
        new Thread(() -> {
            while (true) {

            }
        }).start();
    }

    /**
     * Deadlock: two threads take the same two locks in opposite order.
     */
    private static void deadThread() {
        // Create the resources
        Object resourceA = new Object();
        Object resourceB = new Object();
        // Create the threads
        Thread threadA = new Thread(() -> {
            synchronized (resourceA) {
                System.out.println(Thread.currentThread() + " get ResourceA");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                System.out.println(Thread.currentThread() + "waiting get resourceB");
                synchronized (resourceB) {
                    System.out.println(Thread.currentThread() + " get resourceB");
                }
            }
        });

        Thread threadB = new Thread(() -> {
            synchronized (resourceB) {
                System.out.println(Thread.currentThread() + " get ResourceB");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                System.out.println(Thread.currentThread() + "waiting get resourceA");
                synchronized (resourceA) {
                    System.out.println(Thread.currentThread() + " get resourceA");
                }
            }
        });
        threadA.start();
        threadB.start();
    }
}

view thread

thread
Threads Total: 49, NEW: 0, RUNNABLE: 9, BLOCKED: 2, WAITING: 4, TIMED_WAITING: 4, TERMINATED: 0, Internal threads: 30
ID   NAME                          GROUP          PRIORITY  STATE    %CPU      DELTA_TIM TIME      INTERRUPT DAEMON
22   Thread-0                      main           5         RUNNABLE 65.65     0.140     0:41.265  false     false
2    Reference Handler             system         10        WAITING  0.0       0.000     0:0.000   false     true
3    Finalizer                     system         8         WAITING  0.0       0.000     0:0.000   false     true
4    Signal Dispatcher             system         9         RUNNABLE 0.0       0.000     0:0.000   false     true
5    Attach Listener               system         5         RUNNABLE 0.0       0.000     0:0.015   false     true
28   arthas-timer                  system         5         WAITING  0.0       0.000     0:0.000   false     true
30   Keep-Alive-Timer              system         8         TIMED_WA 0.0       0.000     0:0.000   false     true
31   arthas-NettyHttpTelnetBootstr system         5         RUNNABLE 0.0       0.000     0:0.000   false     true
32   arthas-NettyWebsocketTtyBoots system         5         RUNNABLE 0.0       0.000     0:0.000   false     true
33   arthas-NettyWebsocketTtyBoots system         5         RUNNABLE 0.0       0.000     0:0.000   false     true
34   arthas-shell-server           system         5         TIMED_WA 0.0       0.000     0:0.000   false     true
35   arthas-session-manager        system         5         TIMED_WA 0.0       0.000     0:0.000   false     true
36   arthas-UserStat               system         5         WAITING  0.0       0.000     0:0.000   false     true
38   arthas-NettyHttpTelnetBootstr system         5         RUNNABLE 0.0       0.000     0:0.000   false     true
39   arthas-command-execute        system         5         RUNNABLE 0.0       0.000     0:0.000   false     true
23   Thread-1                      main           5         BLOCKED  0.0       0.000     0:0.000   false     false
24   Thread-2                      main           5         BLOCKED  0.0       0.000     0:0.000   false     false
25   Thread-3                      main           5         TIMED_WA 0.0       0.000     0:0.000   false     false
26   DestroyJavaVM                 main           5         RUNNABLE 0.0       0.000     0:0.046   false     false
-1   Service Thread                -              -1        -        0.0       0.000     0:0.000   false     true
-1   C1 CompilerThread9            -              -1        -        0.0       0.000     0:0.031   false     true
-1   C1 CompilerThread8            -              -1        -        0.0       0.000     0:0.000   false     true
-1   C1 CompilerThread10           -              -1        -        0.0       0.000     0:0.015   false     true
-1   C1 CompilerThread11           -              -1        -        0.0       0.000     0:0.000   false     true
-1   GC task thread#11 (ParallelGC -              -1        -        0.0       0.000     0:0.000   false     true
-1   GC task thread#10 (ParallelGC -              -1        -        0.0       0.000     0:0.000   false     true
-1   C2 CompilerThread2

View the stack of the high-CPU thread

[arthas@1948]$ thread 22
"Thread-0" Id=22 RUNNABLE
    at com.xxx.iot.ArthasTest.lambda$cpuHigh$1(ArthasTest.java:42)
    at com.xxx.iot.ArthasTest$$Lambda$1/1711574013.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:748)
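The %CPU column comes from sampling thread CPU time over a short interval. A rough equivalent can be written with the JDK's ThreadMXBean, as in the sketch below. It is an illustration under assumptions, not Arthas's code: the class BusiestThreadDemo, the thread name cpu-high, and the 200 ms window are all chosen for the example, and it assumes the JVM supports thread CPU time measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: find the thread that burned the most CPU in a sampling window.
class BusiestThreadDemo {

    /** Samples per-thread CPU time twice and returns the name with the largest delta. */
    static String busiestThread(long windowMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id)); // nanoseconds, -1 if unavailable
        }
        try {
            Thread.sleep(windowMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        long bestDelta = -1;
        String bestName = null;
        for (long id : mx.getAllThreadIds()) {
            Long start = before.get(id);
            long now = mx.getThreadCpuTime(id);
            if (start == null || start < 0 || now < 0) {
                continue; // thread started or ended during the window
            }
            ThreadInfo info = mx.getThreadInfo(id);
            if (info != null && now - start > bestDelta) {
                bestDelta = now - start;
                bestName = info.getThreadName();
            }
        }
        return bestName;
    }

    /** Starts a daemon thread that spins until interrupted. */
    static Thread startSpinner(String name) {
        Thread spinner = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                // busy loop, same idea as cpuHigh() in the test code
            }
        }, name);
        spinner.setDaemon(true);
        spinner.start();
        return spinner;
    }

    public static void main(String[] args) {
        startSpinner("cpu-high");
        System.out.println("Busiest thread: " + busiestThread(200));
    }
}
```

Once the busiest thread is known, its stack trace points at the offending line, which is what thread 22 shows above.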

List blocked threads

thread --state BLOCKED

View thread deadlock

[arthas@1948]$ thread -b
"Thread-1" Id=23 BLOCKED on java.lang.Object@4db472ec owned by "Thread-2" Id=24
    at com.xxx.iot.ArthasTest.lambda$deadThread$2(ArthasTest.java:66)
    -  blocked on java.lang.Object@4db472ec
    -  locked java.lang.Object@5e0c9ecd <---- but blocks 1 other threads!
    at com.xxx.iot.ArthasTest$$Lambda$2/1674896058.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:748)
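The JDK can also report such a deadlock from inside the process through ThreadMXBean.findDeadlockedThreads(). The sketch below reproduces the two-lock pattern of the test code with daemon threads and then asks the JVM which threads are deadlocked. It is meant as an illustration only; the class DeadlockDetectDemo and the thread names are made up, and this is not a claim about how thread -b is implemented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: create a monitor deadlock and detect it with the JDK.
class DeadlockDetectDemo {

    /** Starts two daemon threads that take two locks in opposite order. */
    static void startDeadlock() {
        Object resourceA = new Object();
        Object resourceB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startLocker("locker-A", resourceA, resourceB, bothHoldFirstLock);
        startLocker("locker-B", resourceB, resourceA, bothHoldFirstLock);
    }

    static void startLocker(String name, Object first, Object second, CountDownLatch latch) {
        Thread locker = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread owns this monitor
                }
            }
        }, name);
        locker.setDaemon(true); // daemon so the JVM can still exit
        locker.start();
    }

    /** Polls the JVM until a deadlock is reported or the timeout expires. */
    static ThreadInfo[] waitForDeadlock(long timeoutMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findDeadlockedThreads(); // null when no deadlock exists
            if (ids != null) {
                return mx.getThreadInfo(ids);
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return new ThreadInfo[0];
    }

    public static void main(String[] args) {
        startDeadlock();
        for (ThreadInfo info : waitForDeadlock(5000)) {
            System.out.println(info.getThreadName() + " blocked on " + info.getLockName()
                    + " owned by " + info.getLockOwnerName());
        }
    }
}
```

The printed pairs carry the same information as the "BLOCKED on ... owned by ..." line in the thread -b output.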

decompile

[arthas@1948]$ jad com.xxx.iot.ArthasTest

View the value of the variable

ognl @com.xxx.iot.ArthasTest@hashSet

9. Tips for tracing HTTP requests

1. Obtain the response time of an interface

A friend needed to analyze the response time of every request interface with Arthas, so I am sharing the approach here.
You can redirect the output to a file and then run statistics on it according to the output format.

watch org.springframework.web.servlet.DispatcherServlet doService '{params[0].getRequestURI()+" "+params[0].getRemoteAddr()+" "+ #cost}'  -n 5  -x 3 '#cost>100'  -f
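Once the output has been saved to a file, the statistics step could look like the Java sketch below. The line format is an assumption: it supposes each captured call appears as a line containing @String[<uri> <remoteAddr> <cost>], which follows from the expression above but should be checked against your real output. The class WatchCostStats and the sample lines are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: average cost per URI from saved watch output.
class WatchCostStats {

    // Assumed line shape: @String[/user/1 127.0.0.1 152.3]
    static final Pattern LINE =
            Pattern.compile("@String\\[(\\S+) (\\S+) (\\d+(?:\\.\\d+)?)\\]");

    /** Returns the average cost in milliseconds for each request URI. */
    static Map<String, Double> averageCostByUri(List<String> lines) {
        Map<String, double[]> sums = new TreeMap<>(); // uri -> {total cost, call count}
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) {
                continue; // skip headers such as "ts=...; [cost=...] result=@ArrayList["
            }
            double[] acc = sums.computeIfAbsent(m.group(1), key -> new double[2]);
            acc[0] += Double.parseDouble(m.group(3));
            acc[1] += 1;
        }
        Map<String, Double> averages = new TreeMap<>();
        sums.forEach((uri, acc) -> averages.put(uri, acc[0] / acc[1]));
        return averages;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "    @String[/user/1 127.0.0.1 120.0],",
                "    @String[/user/1 127.0.0.1 180.0],",
                "    @String[/user/101 10.0.0.5 300.5],");
        // prints {/user/1=150.0, /user/101=300.5}
        System.out.println(averageCostByUri(sample));
    }
}
```

In practice the list of lines would come from reading the saved file, for example with Files.readAllLines.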

2. Obtain the value of a specific header

The same approach works for any request header, such as a trace ID. The commands below read User-Agent and x-forwarded-for:

 watch org.springframework.web.servlet.DispatcherServlet doService '{params[0].getRequestURI()+"  header="+params[0].getHeaders("User-Agent")}'  -n 10  -x 3 -f
 
 watch org.springframework.web.servlet.DispatcherServlet doService '{params[0].getRequestURI()+ " " +params[0].getRemoteAddr() +"  header="+params[0].getHeader("x-forwarded-for")}'  -n 10  -x 3 -f
 
 watch com.xxx.iot.interceptor.IotApiInterceptor preHandle '{params[0].getRequestURI()+ " " +params[0].getRemoteAddr() +"  header="+params[0].getHeaders("x-forwarded-for")}'  -n 10  -x 3 -f

10. Interface latency analysis

We usually use Arthas to analyze where an interface spends its time. This can be combined with SkyWalking or another distributed tracing framework to view the time spent.
The first step is to check the time spent in the specific interface. Tracing several classes at once requires the -E (regular expression) option.

trace com.wangji92.arthas.plugin.demo.service.impl.ArthasTestServiceImpl doTraceE  -n 5 --skipJDKMethod true 

When optimizing a project, you can also capture slow calls in bulk and analyze where the time goes by using a wildcard ("blind") match together with a cost filter:

trace com.xxxCompany* * '#cost > 2000'

11. Viewing SQL statements with Arthas

Either of the following two approaches is good enough for viewing the basic SQL structure. You can also add a condition expression to filter by parameters.

Method 1: watch Connection

Watch Connection directly to view the SQL. This meets basic needs, but it does not show the execution parameters.

watch java.sql.Connection prepareStatement '{params,throwExp}'  -n 5  -x 3 

Method 2: watch BoundSql

This approach can be used when the project uses MyBatis:

watch org.apache.ibatis.mapping.BoundSql getSql '{params,returnObj,throwExp}'  -n 5  -x 3 

Origin blog.csdn.net/Blueeyedboy521/article/details/130065091