Arthas practical use

1. Start

curl -O https://arthas.aliyun.com/arthas-boot.jar
java -jar arthas-boot.jar

arthas-boot lists the running Java processes. Type the number shown in front of the target process and press Enter to attach to it.
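
If the target process is already known, the interactive selection can be skipped, because arthas-boot also accepts a PID as an argument. A minimal sketch, assuming the JDK's jps tool is on the PATH; the helper name pid_for_main_class and the main class com.aurora.AuroraApplication are hypothetical and only serve as an illustration:

```shell
# Print the PID of the JVM whose main class (second column of `jps -l`)
# equals $1. Reads `jps -l` output on stdin; prints nothing if no match.
pid_for_main_class() {
  awk -v cls="$1" '$2 == cls { print $1 }'
}

# Usage (hypothetical main class; not executed here):
#   pid=$(jps -l | pid_for_main_class com.aurora.AuroraApplication)
#   java -jar arthas-boot.jar "$pid"
```

Matching on the exact main class name rather than a substring avoids attaching to the wrong JVM when several similar services run on the same host.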

2. Purpose

  1. Real-time monitoring: Arthas can monitor various indicators and status of Java applications in real time, such as method execution time, thread status, memory usage, etc. This enables developers to gain insight into the health of their applications and spot potential performance bottlenecks and anomalies in a timely manner.

  2. Diagnosing problems: Arthas provides a series of commands and tools that can help developers diagnose and analyze problems in their applications. For example, it can view method call stacks, find resource leaks, analyze threading issues, and more. This helps to quickly locate and resolve application bugs and performance issues.

  3. Dynamically modifying code: Arthas can replace loaded classes and methods at runtime, so a change can be tried without restarting or redeploying the application. This is very useful for quick debugging and troubleshooting.

  4. Command-line interaction and visual interface: Arthas provides an interactive command-line interface in which developers enter commands to operate on and query the application. Arthas also provides a browser-based interface (Arthas Web Console) for viewing and analyzing the application's runtime data.

3. Commonly used commands

3.1 watch

        The watch command monitors calls to a given method and displays the parameters, return value, and exceptions in real time as the method is called. OGNL expressions select which variables to show.

        For example, the following command observes the input parameters and the return value:

[arthas@1]$ watch com.aurora.controller.CommentController listTopSixComments "{params,returnObj}"
Press Q or Ctrl+C to abort.
Affect(class count: 2 , method count: 2) cost in 446 ms, listenerId: 1
method=com.aurora.controller.CommentController.listTopSixComments location=AtExit
ts=2023-06-17 08:23:36; [cost=73.790329ms] result=@ArrayList[
    @Object[][isEmpty=true;size=0],
    @ResultVO[ResultVO(flag=true, code=20000, message=操作成功, data=[CommentDTO(id=1032, userId=1024, nickname=木木夕, avatar=http://mumuxibucket.oss-cn-beijing.aliyuncs.com/aurora/avatar/e248edca7ebfac088b85d96fa3813ebc.png, webSite=null, commentContent=hello, createTime=2023-05-31T10:22:20, replyDTOs=null)])],
]

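A condition expression can be added as a filter. OGNL compares with ==, and the example below assumes a method whose first parameter is a boolean:

watch com.aurora.controller.CommentController listTopSixComments "{params,returnObj}" "params[0]==false"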

You can also observe the exception thrown by the method (-e), expanding the result to a depth of 2 (-x 2):

watch com.aurora.controller.CommentController listTopSixComments "{params,throwExp}" -e -x 2

Other parameters:

Parameter name       Description
class-pattern        Pattern matching the class name
method-pattern       Pattern matching the method name
express              Watch expression, default value: {params, target, returnObj}
condition-express    Condition expression
[b]                  Observe before the method is called
[e]                  Observe after the method throws an exception
[s]                  Observe after the method returns normally
[f]                  Observe after the method ends (normal return or exception)
[E]                  Enable regular expression matching; the default is wildcard matching
[x:]                 Traversal depth for the properties of the output, default 1, maximum 4
[m <arg>]            Maximum number of matching classes, default 50. The long form is [maxMatch <arg>].

3.2 trace

        The trace command shows how much time each node on a method's call path takes, which helps locate the method that slows down an interface.

[arthas@1]$ trace com.aurora.controller.CommentController listTopSixComments 
Press Q or Ctrl+C to abort.
Affect(class count: 2 , method count: 2) cost in 166 ms, listenerId: 5
`---ts=2023-06-17 08:33:48;thread_name=http-nio-8080-exec-5;id=26;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@7c214cc0
    `---[8.683509ms] com.aurora.controller.CommentController$$EnhancerBySpringCGLIB$$2f90009d:listTopSixComments()
        `---[98.89% 8.587255ms ] org.springframework.cglib.proxy.MethodInterceptor:intercept()
            `---[99.15% 8.514127ms ] com.aurora.controller.CommentController:listTopSixComments()
                +---[98.96% 8.425924ms ] com.aurora.service.CommentService:listTopSixComments() #52
                `---[0.14% 0.012228ms ] com.aurora.model.vo.ResultVO:ok() #52
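
When the trace output is long, it can help to save it to a file and list only the slow nodes. The following is a rough sketch that assumes the output format shown above (a cost in ms inside square brackets); the function name slow_trace_nodes and the file name trace.log are made up for this example:

```shell
# Read saved trace output on stdin and print every node whose cost is at
# least $1 milliseconds, as "<cost>ms <method>".
slow_trace_nodes() {
  awk -v min="$1" '
    match($0, /[0-9.]+ms *\]/) {
      seg  = substr($0, RSTART, RLENGTH)      # e.g. "8.425924ms ]"
      node = substr($0, RSTART + RLENGTH)     # text after the closing bracket
      cost = seg + 0                          # numeric prefix of seg
      sub(/ms.*/, "ms", seg)                  # keep the cost as printed
      sub(/^ +/, "", node)
      if (cost >= min) print seg, node
    }'
}

# Usage (trace.log is a hypothetical file holding copied trace output):
#   slow_trace_nodes 1 < trace.log
```

Lines without a bracketed cost, such as the ts=... header, are skipped because the pattern does not match them.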

3.3 jad

        The jad command decompiles a loaded class back into source code, which makes it possible to check whether newly released code has actually taken effect.

jad com.aurora.controller.CommentController

3.4 thread

        Our business relies heavily on thread pools and uses a large number of threads, so the thread command is occasionally needed to observe thread states.

  1. thread --all                           show all threads
  2. thread --all | grep 't-'               filter by string
  3. thread --all | grep 't-' | wc -l       filter by string, then count the matches
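  Note on wc: it counts the lines of a command's output (in an ordinary shell it can also count, for example, the files in a directory listing).
The wc command

Common options:

-c counts bytes.

-l counts lines.

-m counts characters. This flag cannot be used together with -c.

-w counts words. Note that a word here is a string delimited by spaces, newlines, and similar characters.

Example: thread --all | grep 't-' | wc -l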
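
The same filter-and-count idea also works outside the Arthas console on a saved thread dump. A small sketch, assuming the dump comes from jstack, where each thread header line starts with the quoted thread name; count_threads_with_prefix is a hypothetical helper name:

```shell
# Count the threads in a jstack dump (stdin) whose name starts with prefix $1.
# Header lines look like:  "t-1" #12 prio=5 ... nid=0x1a2b waiting on condition
count_threads_with_prefix() {
  awk -v p="\"$1" 'index($0, p) == 1 { n++ } END { print n + 0 }'
}

# Usage (PID is a placeholder for the Java process ID):
#   jstack "$PID" | count_threads_with_prefix t-
```

Anchoring the match at the start of the header line keeps stack frames that merely contain the prefix from being counted.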

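Troubleshooting ideas when threads max out the CPU:

The strategy, and the difficulty, differ depending on whether a single pod has its CPU maxed out or all pods do.

1. The first consideration is limiting the damage, so take the affected pod offline first.
In this business the data is processed asynchronously and timeliness requirements are low, so you can wait and see whether the pod recovers on its own. If the thread pool keeps filling up, consider taking the pod offline while preserving its state for analysis. Do not restart the pod.
2. Has the task volume changed?
3. If the task volume has not changed, are some tasks taking a long time and holding threads for long periods, for example a large tenant shipping orders in bulk?
4. Are any threads blocked and never released?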

You can first use ps, top and other commands to analyze the thread status.

(1) Use the top command to find the thread ID (TID) that consumes the most CPU:

        top -Hp PID

(2) Convert that thread ID to hexadecimal, because thread IDs in the thread snapshot are recorded in hexadecimal (the nid field):

        printf '%x\n' TID

(3) Take the problematic server offline, then use jstack to export the thread snapshot and check what the thread found in step (2) is doing, where TID_HEX is the hexadecimal value from step (2):

        jstack PID | grep TID_HEX -A 30
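
Steps (2) and (3) can be combined into two small helpers. This is only a sketch: the function names tid_to_nid and stack_for_tid are hypothetical, and it relies on jstack printing the native thread ID as nid=0x... on each thread header line:

```shell
# Convert a decimal thread ID (as shown by `top -Hp`) into the form used in
# jstack output, e.g. 255 -> nid=0xff.
tid_to_nid() {
  printf 'nid=0x%x\n' "$1"
}

# Read a jstack dump on stdin and print the header of the thread with decimal
# TID $1 plus the following $2 lines (default 30).
# -w avoids matching a longer ID that merely starts with the same digits.
stack_for_tid() {
  grep -w -A "${2:-30}" -- "$(tid_to_nid "$1")"
}

# Usage (PID and TID are placeholders taken from `top -Hp PID`):
#   jstack "$PID" | stack_for_tid "$TID"
```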

You can also print the information for all threads with jstack PID, then look at the threads in the WAITING state and where they are blocked.
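
As a starting point for that analysis, the thread states in a dump can be tallied first. A minimal sketch, assuming the standard jstack line format "java.lang.Thread.State: STATE"; thread_state_histogram is a hypothetical helper name:

```shell
# Summarize a jstack dump (stdin): number of threads per
# java.lang.Thread.State value, most frequent state first.
thread_state_histogram() {
  awk '/java.lang.Thread.State:/ { count[$2]++ }
       END { for (s in count) print count[s], s }' | sort -rn
}

# Usage (PID is a placeholder for the Java process ID):
#   jstack "$PID" | thread_state_histogram
```

An unusually large WAITING or BLOCKED count compared with a healthy dump is a hint to inspect those stacks first.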

3.5 profiler

        Flame graph analysis

The colors in the figure represent:
        green: Java code
        yellow: JVM, C++ code
        red: user-mode C code
        orange: kernel-mode C code

The y-axis represents the call stack, and each layer is a function. The deeper the call stack, the higher the flame, with the executing function at the top and its parent functions below.

The x-axis represents the number of samples. The wider a function is on the x-axis, the more often it was sampled, that is, the longer it ran. Note that the x-axis does not represent the passage of time: all call stacks are merged and arranged alphabetically.

The meaning of stack width (CPU time)
        The width can be understood as a share of the CPU samples: the wider the stack, the higher its proportion of all samples. A wide function means one or both of the following:
        the function takes a long time to run
        the function is called many times

The flat-top phenomenon (pay special attention)
        A flat top appears when a function at the top of a stack accounts for a large share of all samples. When this happens, examine that call stack closely: a very high share of samples means the method's CPU usage is too high.
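
The idea that width equals share of samples can be checked numerically. The sketch below assumes the profile is available in the collapsed-stack text format (one "frame;frame;...;leaf COUNT" line per distinct stack); how the profile was exported in that format is an assumption, and self_sample_share is a hypothetical helper name. It reports, for each function that was executing when sampled, its share of all samples; a flat top corresponds to a large value here:

```shell
# Read collapsed stacks on stdin and print each leaf (executing) frame with
# its share of all samples, largest share first.
self_sample_share() {
  awk '{
         count = $NF                       # last field: number of samples
         stack = $0
         sub(/ [0-9]+$/, "", stack)        # drop the trailing count
         n = split(stack, frames, ";")
         self[frames[n]] += count          # credit the leaf frame
         total += count
       }
       END {
         for (f in self) printf "%.1f%% %s\n", 100 * self[f] / total, f
       }' | sort -rn
}

# Usage (stacks.txt is a hypothetical file in collapsed-stack format):
#   self_sample_share < stacks.txt | head
```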

 Welcome to visit: http://mumuxi.chat/

Origin blog.csdn.net/zz18532164242/article/details/131256521