Arthas, the online troubleshooting tool: a summary

Introduction

The Chinese name of Arthas is 阿尔萨斯

Official address: https://alibaba.github.io/arthas/

Installation

Use arthas-boot (recommended)

Download arthas-boot.jar, then start it with java -jar:
curl -O https://alibaba.github.io/arthas/arthas-boot.jar
java -jar arthas-boot.jar

Print the help information:
java -jar arthas-boot.jar -h

If the download is slow, you can use the aliyun mirror:
java -jar arthas-boot.jar --repo-mirror aliyun --use-http

If downloading from GitHub fails, you can use the gitee mirror:
curl -O https://arthas.gitee.io/arthas-boot.jar

Start with the as.sh command

Install as.sh on a Mac and set up an alias:
curl -sk https://arthas.gitee.io/arthas-boot.jar -o ~/.arthas-boot.jar  && echo "alias as.sh='java -jar ~/.arthas-boot.jar --repo-mirror aliyun --use-http'" >> ~/.bashrc && source ~/.bashrc

If you use zsh, run the following command instead:
curl -sk https://arthas.gitee.io/arthas-boot.jar -o ~/.arthas-boot.jar  && echo "alias as.sh='java -jar ~/.arthas-boot.jar --repo-mirror aliyun --use-http'" >> ~/.zshrc && source ~/.zshrc

After that, typing as.sh on the command line is enough to start Arthas

Access Arthas directly through the Cloud Toolkit plugin

Uninstall

  • On Linux/Unix/Mac platform

    Delete the following files:

    rm -rf ~/.arthas/
    rm -rf ~/logs/arthas
    
  • On Windows, delete the .arthas and logs/arthas directories under the user home directory

Enter arthas

  1. First start the program that needs to be diagnosed, then start arthas
    java -jar arthas-boot.jar

  2. Select the corresponding process id

  3. Seeing the following page means Arthas has attached successfully

  4. You can then use Arthas commands to diagnose the program

Exit arthas

If you only want to leave the current connection, use the quit or exit command. The Arthas instance attached to the target process keeps running and its port stays open, so you can connect to it directly next time.

If you want to exit Arthas completely, run the stop command.

Quick start

Common commands

dashboard: Dashboard

Function: Display the real-time data panel of the current system, press q or ctrl+c to exit

jad: decompile a class, or a single method of a class

Function: Decompile the bytecode file into source code

Decompile class:

jad com.shinemo.todo.domain.TodoDO 

Decompile method:

Show the full decompiled details:
jad com.shinemo.wangge.core.service.todo.impl.TodoServiceImpl operateTodoThing

Show only the source code:
jad --source-only com.shinemo.wangge.core.service.todo.impl.TodoServiceImpl operateTodoThing

thread: thread-related commands

Function: View the thread stack information of the current JVM

thread -n: list the top N threads by CPU usage.

thread id: display the running stack of the specified thread

thread -b: find out which thread is currently blocking other threads

Deadlock case:

    @GetMapping("/test")
    @SmIgnore
    public ApiResult<String> test() {

        // Create the two resources used as locks
        Object resourceA = new Object();
        Object resourceB = new Object();
        // Create the threads; each takes the two locks in the opposite order
        Thread threadA = new Thread(() -> {
            synchronized (resourceA) {
                log.info(Thread.currentThread() + " get ResourceA");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                log.info(Thread.currentThread() + " waiting to get resourceB");
                synchronized (resourceB) {
                    log.info(Thread.currentThread() + " get resourceB");
                }
            }
        });

        Thread threadB = new Thread(() -> {
            synchronized (resourceB) {
                log.info(Thread.currentThread() + " get ResourceB");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                log.info(Thread.currentThread() + " waiting to get resourceA");
                synchronized (resourceA) {
                    log.info(Thread.currentThread() + " get resourceA");
                }
            }
        });
        threadA.start();
        threadB.start();

        return ApiResult.success("success");
    }
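
As a rough, standalone illustration of the kind of information thread -b reports, the sketch below reproduces the same lock-ordering deadlock and then asks the JVM which threads are stuck, using the standard java.lang.management API. The class name DeadlockProbe, the thread names, and the polling timeout are assumptions made for this example, and this is not a description of how Arthas implements thread -b.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockProbe {

    /** Starts two daemon threads that take the same two monitors in opposite order. */
    static void startDeadlock() {
        Object resourceA = new Object();
        Object resourceB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        Thread a = new Thread(() -> lockInOrder(resourceA, resourceB, bothHoldFirstLock), "demo-thread-A");
        Thread b = new Thread(() -> lockInOrder(resourceB, resourceA, bothHoldFirstLock), "demo-thread-B");
        a.setDaemon(true); // daemon threads, so the JVM can still exit
        b.setDaemon(true);
        a.start();
        b.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirstLock) {
        synchronized (first) {
            bothHoldFirstLock.countDown();
            try {
                bothHoldFirstLock.await(); // wait until the other thread holds its first lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // never reached once the deadlock has formed
            }
        }
    }

    /** Polls the JVM until it reports monitor-deadlocked threads or the timeout expires. */
    static long[] findDeadlockedThreadIds(long timeoutMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return new long[0];
    }

    public static void main(String[] args) {
        startDeadlock();
        long[] ids = findDeadlockedThreadIds(5_000);
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().getThreadInfo(ids)) {
            if (info == null) {
                continue; // thread ended between detection and lookup
            }
            System.out.println(info.getThreadName() + " is blocked on " + info.getLockName()
                    + ", held by " + info.getLockOwnerName());
        }
    }
}
```

A CountDownLatch replaces the Thread.sleep(1000) of the case above so that the deadlock forms deterministically rather than by timing.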

sc: View the loaded class information of the JVM

sc has subclass matching enabled by default, so it can also be used to see which loaded classes implement a given interface.

sc -d: display detailed information

sc -f: display all member variables, need to be used with -d

sc com.shinemo.wangge.core.handler.UrlRedirectHandler
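
To make "subclass matching" concrete, here is a minimal sketch of the idea in plain Java: given a set of classes, keep those that are the requested type or a subtype of it. The class SubclassMatchSketch and its matching method are hypothetical names for this illustration, and the candidate list is supplied by hand, whereas sc inspects the classes actually loaded in the target JVM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;

class SubclassMatchSketch {

    /** Returns the sorted names of the candidates that are the given type or a subtype of it. */
    static List<String> matching(Class<?> type, List<Class<?>> candidates) {
        return candidates.stream()
                .filter(type::isAssignableFrom) // true for the type itself and for its subtypes
                .map(Class::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hand-picked stand-in for "classes loaded in the JVM".
        List<Class<?>> loaded = Arrays.asList(ArrayList.class, LinkedList.class, String.class);
        System.out.println(matching(List.class, loaded));
    }
}
```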

sm: View method information of loaded classes

sm -d: display detailed information

sm com.shinemo.wangge.web.controller.todo.TodoController *
sm com.shinemo.wangge.web.controller.todo.TodoController getTypeList

watch: observe the data of a method's execution

When we hit a data bug in production, we usually reproduce the production data in the development environment, look for clues in the production logs, or fall back on remote debugging. All of these approaches are cumbersome. Arthas's watch command lets us observe code execution in real time instead. It can show a method's parameters, return value, and exception information. What is observed is specified with an OGNL expression, so you can write OGNL expressions to extract exactly the values you need.

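Syntax:

watch <package>.<ClassName> <methodName> <expression to observe>
Example:
watch com.shinemo.wangge.web.controller.todo.TodoController getTodoList '{params,returnObj,throwExp}' -n 5 -x 5 '1==1'

-x is the traversal depth; adjust it to print the actual contents of parameters and results. The default is 1.
-n is the number of times the command executes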
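
The effect of a traversal depth such as -x can be pictured with a toy renderer: below the depth limit a nested value is only summarized, and raising the limit expands it. The sketch below is an assumption-laden illustration of that idea only; DepthRenderSketch, its render method, and the output format are invented here and do not reproduce the format Arthas prints.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class DepthRenderSketch {

    /** Renders a value, expanding nested maps and collections only while depth remains. */
    static String render(Object value, int depth) {
        if (value instanceof Map) {
            Map<?, ?> map = (Map<?, ?>) value;
            if (depth <= 0) {
                return "Map[size=" + map.size() + "]"; // summarized, not expanded
            }
            return map.entrySet().stream()
                    .map(e -> e.getKey() + "=" + render(e.getValue(), depth - 1))
                    .collect(Collectors.joining(", ", "{", "}"));
        }
        if (value instanceof Collection) {
            Collection<?> items = (Collection<?>) value;
            if (depth <= 0) {
                return "Collection[size=" + items.size() + "]";
            }
            return items.stream()
                    .map(item -> render(item, depth - 1))
                    .collect(Collectors.joining(", ", "[", "]"));
        }
        return String.valueOf(value); // leaf value
    }

    public static void main(String[] args) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("user", Arrays.asList("a", "b"));
        for (int depth = 0; depth <= 2; depth++) {
            System.out.println("depth " + depth + ": " + render(result, depth));
        }
    }
}
```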

View function return value

watch com.shinemo.wangge.web.controller.common.IndexController getIndex returnObj

View the request parameters of the function

watch com.shinemo.wangge.web.controller.common.IndexController getIndex params

Save logs asynchronously

Sometimes, when troubleshooting a function, we cannot get the information we need right away. Arthas provides background asynchronous tasks to record the output for us. The usage is similar to background jobs in a Linux shell.

watch com.shinemo.wangge.web.controller.todo.TodoController getTodoList '{params,returnObj,throwExp}' -n 5 -x 5 '1==1' > /data/logs/test.log &

trace: output the call path inside a method and the time spent on each call

We often find that the response time (rt) of an API call is too long, and we need to find the one or few functions in the call chain that are worth optimizing. The usual approach is to pick several likely anchor points and print the rt between them, or to find the timestamps in the logs and calculate the differences. Either way is tedious. Arthas's trace command does this for us easily.

This command is very useful for optimizing code, because you can see the execution time of each method call. For calls that repeat, such as those inside a for loop, you can also see the maximum, minimum, and average time across the n iterations.

trace com.shinemo.wangge.web.controller.common.IndexController getIndex -n 5 '1==1'

The results are as follows:

[arthas@13090]$ trace com.shinemo.wangge.web.controller.common.IndexController getIndex -n 5 '1==1'
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 233 ms, listenerId: 3
`---ts=2020-07-08 14:13:09;thread_name=http-nio-20014-exec-4;id=64;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@11e1bd43
    `---[652.290636ms] com.shinemo.wangge.web.controller.common.IndexController:getIndex()
        +---[0.030585ms] com.shinemo.smartgrid.domain.SmartGridContext:getLongUid() #80
        +---[0.012105ms] com.shinemo.smartgrid.domain.SmartGridContext:getMobile() #81
        +---[88.923084ms] com.shinemo.wangge.core.service.stallup.StallUpService:getSimpleInfo() #82
        +---[0.012727ms] com.shinemo.common.tools.result.ApiResult:isSuccess() #83
        +---[0.009893ms] com.shinemo.common.tools.result.ApiResult:getData() #87
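
For comparison, the manual "anchor point" approach that trace replaces looks roughly like the sketch below: each suspect call is wrapped in timing code, and repeated calls are aggregated into count, min, max, and average. ManualTimingSketch, the labels, and the loadItem helper are hypothetical names for this example; nothing here is part of Arthas.

```java
import java.util.LinkedHashMap;
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.function.Supplier;

class ManualTimingSketch {

    private final Map<String, LongSummaryStatistics> statsByLabel = new LinkedHashMap<>();

    /** Runs the call, records its elapsed nanoseconds under the label, and returns its result. */
    <T> T time(String label, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            statsByLabel.computeIfAbsent(label, k -> new LongSummaryStatistics())
                    .accept(System.nanoTime() - start);
        }
    }

    LongSummaryStatistics stats(String label) {
        return statsByLabel.get(label);
    }

    /** Prints one line per label with count, min, max, and average in milliseconds. */
    void report() {
        statsByLabel.forEach((label, s) -> System.out.println(String.format(
                "%s count=%d min=%.3fms max=%.3fms avg=%.3fms",
                label, s.getCount(), s.getMin() / 1e6, s.getMax() / 1e6, s.getAverage() / 1e6)));
    }

    // Hypothetical slow step standing in for a call inside the traced method.
    private static String loadItem(int id) {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "item-" + id;
    }

    public static void main(String[] args) {
        ManualTimingSketch timer = new ManualTimingSketch();
        for (int i = 0; i < 3; i++) {
            final int id = i;
            timer.time("loadItem", () -> loadItem(id)); // repeated call, aggregated per label
        }
        timer.time("format", () -> String.valueOf(42));
        timer.report();
    }
}
```

The drawback the text describes is visible here: the timing code has to be written, deployed, and removed again, whereas trace attaches to the running process without code changes.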

tt: The official name is Time Tunnel

watch lets you inspect calls to a function, but it works best when you already know which call conditions to look for. If a function is called n times and only a few of those calls fail, picking out the abnormal calls with watch is inconvenient. The tt command makes it easier to review each call and the details of any exception.

Once tt is enabled for a method, it records every call (you need to set a maximum number of recorded calls), and you can then inspect any of them at any time, including the input parameters, return value, running time, and whether an exception was thrown.

tt -t com.shinemo.wangge.web.controller.todo.TodoController getTodoList -n 5

View the call information of a recorded call

tt -w '{method.name,params,returnObj,throwExp}' -x 3 -i 1000

Replay a recorded call

tt -p -i 1000

Replay it 5 times, once every 2 seconds

tt -p --replay-times 5 --replay-interval 2000 -i 1000

Get all call records

tt -l

Delete all call records

tt --delete-all
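
The record-then-replay idea behind tt can be sketched in ordinary Java with a dynamic proxy that stores each call's parameters, result, exception, and cost, and can invoke a stored call again later. This is only a conceptual sketch under stated assumptions: CallRecorderSketch, Record, Calculator, wrap, and replay are names invented for the example, and Arthas records calls through bytecode enhancement of the running JVM rather than through a proxy you write yourself.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class CallRecorderSketch {

    /** Hypothetical interface standing in for the method being recorded. */
    interface Calculator {
        int divide(int a, int b);
    }

    /** One recorded call: the method, its parameters, and how it ended. */
    static final class Record {
        final Method method;
        final Object[] params;
        final Object returnObj;
        final Throwable throwExp;
        final long costNanos;

        Record(Method method, Object[] params, Object returnObj, Throwable throwExp, long costNanos) {
            this.method = method;
            this.params = params;
            this.returnObj = returnObj;
            this.throwExp = throwExp;
            this.costNanos = costNanos;
        }
    }

    private final List<Record> records = new ArrayList<>();

    List<Record> records() {
        return records;
    }

    /** Wraps target in a proxy that records every call made through the interface. */
    @SuppressWarnings("unchecked")
    <T> T wrap(Class<T> iface, T target) {
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface},
                (proxy, method, args) -> {
                    long start = System.nanoTime();
                    Object result = null;
                    Throwable error = null;
                    try {
                        result = method.invoke(target, args);
                        return result;
                    } catch (InvocationTargetException e) {
                        error = e.getCause(); // the exception thrown by the real method
                        throw error;
                    } finally {
                        records.add(new Record(method, args, result, error, System.nanoTime() - start));
                    }
                });
    }

    /** Invokes the recorded call again on the given target with the recorded parameters. */
    Object replay(int index, Object target) {
        Record record = records.get(index);
        try {
            return record.method.invoke(target, record.params);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("replay failed", e);
        }
    }

    public static void main(String[] args) {
        Calculator real = (a, b) -> a / b;
        CallRecorderSketch recorder = new CallRecorderSketch();
        Calculator recorded = recorder.wrap(Calculator.class, real);
        recorded.divide(6, 3);
        try {
            recorded.divide(1, 0);
        } catch (ArithmeticException expected) {
            // the failing call is recorded as well
        }
        for (Record r : recorder.records()) {
            System.out.println(r.method.getName() + " returnObj=" + r.returnObj + " throwExp=" + r.throwExp);
        }
        System.out.println("replay of call 0: " + recorder.replay(0, real));
    }
}
```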

stack: output the call path by which a method is reached

Use the stack command to view the call information of the method.

monitor: method execution statistics

Use the monitor command to collect statistics on a method's execution, such as the total number of calls within a specified period, the number of successes, the number of failures, the average response time, and the failure rate.

-c: indicates the statistical period, the default value is 60 seconds

monitor com.shinemo.wangge.web.controller.todo.TodoController getTodoList -c 10
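
The figures monitor reports for each period are simple aggregates, and the arithmetic can be sketched as follows. MonitorWindowSketch and its method names are assumptions for illustration; the sketch only shows how total, success, fail, average rt, and fail rate relate to each other, not how Arthas collects them.

```java
class MonitorWindowSketch {

    private long total;
    private long success;
    private long fail;
    private long totalCostNanos;

    /** Records one finished call within the current statistical period. */
    synchronized void record(boolean succeeded, long costNanos) {
        total++;
        if (succeeded) {
            success++;
        } else {
            fail++;
        }
        totalCostNanos += costNanos;
    }

    /** Average response time in milliseconds over all calls in the period. */
    synchronized double averageRtMillis() {
        return total == 0 ? 0.0 : totalCostNanos / 1_000_000.0 / total;
    }

    /** Share of failed calls in the period, as a percentage. */
    synchronized double failRatePercent() {
        return total == 0 ? 0.0 : 100.0 * fail / total;
    }

    /** Formats the period's figures and starts a new period. */
    synchronized String summaryAndReset() {
        String line = String.format("total=%d success=%d fail=%d avg-rt=%.2fms fail-rate=%.2f%%",
                total, success, fail, averageRtMillis(), failRatePercent());
        total = 0;
        success = 0;
        fail = 0;
        totalCostNanos = 0;
        return line;
    }

    public static void main(String[] args) {
        MonitorWindowSketch window = new MonitorWindowSketch();
        window.record(true, 2_000_000);
        window.record(true, 4_000_000);
        window.record(false, 6_000_000);
        window.record(true, 8_000_000);
        System.out.println(window.summaryAndReset());
    }
}
```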

redefine: hot update

Common steps

  1. Decompile with the jad command, then modify the source code with vim
  2. Use the mc command to compile the modified code into a class file
  3. Load the new bytecode file with the redefine command

jad --source-only com.example.demo.arthas.user.UserController > /tmp/UserController.java

mc /tmp/UserController.java -d /tmp

redefine /tmp/com/example/demo/arthas/user/UserController.class
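
Step 2 above turns a .java file into a .class file. Outside Arthas, the same kind of step can be sketched with the JDK's standard javax.tools compiler API, as below. This is an analogy under stated assumptions and not Arthas's own mc implementation: CompileStepSketch, the Hello source, and the temporary directory are made up for the example, and the code needs a JDK (not a bare JRE) to find a system compiler.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class CompileStepSketch {

    /** Compiles one source file into outputDir and reports whether compilation succeeded. */
    static boolean compile(Path sourceFile, Path outputDir) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found; run on a JDK");
        }
        // Passing null for the streams uses System.in, System.out, and System.err.
        int exitCode = compiler.run(null, null, null,
                "-d", outputDir.toString(), sourceFile.toString());
        return exitCode == 0;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mc-sketch");
        Path source = dir.resolve("Hello.java");
        String code = "public class Hello { public String greet() { return \"hi\"; } }";
        Files.write(source, code.getBytes(StandardCharsets.UTF_8));

        System.out.println("compiled: " + compile(source, dir));
        System.out.println("class file exists: " + Files.exists(dir.resolve("Hello.class")));
    }
}
```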

Precautions

  1. redefine cannot modify, add, or delete the fields and methods of a class, including method parameters, method names, and return values.
Adding, modifying, or deleting a field throws: redefine error! java.lang.UnsupportedOperationException: class redefinition failed: attempted to change the schema (add/remove fields)

Adding a method throws: redefine error! java.lang.UnsupportedOperationException: class redefinition failed: attempted to add a method

Modifying or deleting a method throws: redefine error! java.lang.UnsupportedOperationException: class redefinition failed: attempted to delete a method
  2. A function that is currently running does not pick up the change until it exits. For example, of the two System.out.println calls added below, only the one inside run() takes effect.
public class MathGame {
    // Excerpt: the fields (random, illegalArgumentCount) and the helper
    // methods primeFactors() and print() are omitted here.

    public static void main(String[] args) throws InterruptedException {
        MathGame game = new MathGame();
        while (true) {
            game.run();
            TimeUnit.SECONDS.sleep(1);
            // Does not take effect: execution never leaves the while loop in main()
            System.out.println("in loop");
        }
    }

    public void run() throws InterruptedException {
        // Takes effect: run() finishes completely on every call
        System.out.println("call run()");
        try {
            int number = random.nextInt();
            List<Integer> primeFactors = primeFactors(number);
            print(number, primeFactors);
        } catch (Exception e) {
            System.out.println(String.format("illegalArgumentCount:%3d, ", illegalArgumentCount) + e.getMessage());
        }
    }
}

Asynchronously save request to file

1. Use & to run tasks in the background

watch com.shinemo.wangge.web.controller.common.IndexController getIndex '{params,returnObj,throwExp}' -n 5 -x 3 '1==1' &  

The command now runs in the background, and you can continue to execute other commands in the console.

Use jobs to list all background tasks
Use kill to terminate a task

2. Use > or >> to redirect task output

You can use > or >> to write a task's output to a specified file, and combine it with & to run the Arthas command as a background asynchronous task. For example:

# With a file specified
watch com.shinemo.wangge.web.controller.common.IndexController getIndex '{params,returnObj,throwExp}' -n 5 -x 3 '1==1' >> test.out &

# Without a file specified
watch com.shinemo.wangge.web.controller.common.IndexController getIndex '{params,returnObj,throwExp}' -n 5 -x 3 '1==1' >> &
By default, the output is saved to ~/logs/arthas-cache/${PID}/${JobId}

3. Save the log of the command execution result

By default, this feature is turned off. If you need to turn it on, execute the following command:

options save-result true

The execution results of commands are then saved asynchronously to {user.home}/logs/arthas-cache/result.log. Clean this file up regularly so that it does not take up disk space.

4. Exit arthas and continue to perform background tasks

If you want to leave the console without stopping Arthas, so that background tasks keep running, use quit to exit the Arthas console (stop shuts down the Arthas service).

Arthas supports Web Console

After Arthas has started and attached successfully, the Web Console is already running, and you can visit it directly at http://localhost:8563/
Operating it in the page works exactly the same as in the console.

Arthas supports pipeline commands

Arthas supports pipes for further processing the output of a command, such as sm org.apache.log4j.Logger | grep

  • grep: keep only the results that match a condition
  • plaintext: remove the color codes from the command output
  • wc: count the output by line

IntelliJ IDEA also has an Arthas plugin, which is convenient and easy to use.

Origin blog.csdn.net/kaihuishang666/article/details/107942092