Dive deep into Arthas commands

Arthas is a Java diagnostic tool for online diagnosis of Java applications, making it easier to monitor and analyze application performance and behavior. It provides many commands for diagnosing Java applications.

1. The difference between the jvm and dashboard commands:

  • jvm: This command is used to display the current JVM information, including class loading information, memory usage, GC information, etc. It provides a snapshot view showing the internal state of the JVM in detail. This helps identify resource constraints or configuration issues.

  • dashboard: This command is used to display a real-time data dashboard of the application, including JVM information, thread status, throughput, etc. The information it displays is more summarized and visual than that of the jvm command, focusing on the overall performance and health of the application.

Summary: While the jvm command focuses on detailed JVM information, the dashboard command provides an overview of the application's overall performance.

1.1 Does a machine have only one JVM, and can one JVM run multiple Java applications?

Multiple JVM instances can run on a machine, and each JVM instance can run one or more Java applications. Each JVM instance is an independent process with its own memory space, system resources, and garbage collection. Therefore, one machine can run multiple Java applications across multiple JVM instances, each of which is independent.

Simply put, there is a one-to-one correspondence between JVM instances and operating system processes. Every time a Java application is started, a new JVM instance (that is, a new operating system process) is created. Each JVM instance is completely independent, with its own heap memory, method area, stack memory, etc.; these memory areas are not shared between different JVM instances.

Note: Multiple JVM instances can run on one machine to run multiple Java applications, and each JVM instance is an independent process. However, there is usually only one Java application running inside a JVM instance. If you need to run multiple Java applications inside one JVM, you usually use an application server or servlet container (such as Tomcat, Jetty, or JBoss); for example, Tomcat can run multiple Java web applications inside a single JVM.
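The one-JVM-per-process relationship can be observed directly: the standard RuntimeMXBean exposes the identity of the current JVM process. A minimal sketch (class and variable names here are made up for the demo; on HotSpot the runtime name is conventionally "pid@hostname"):

```java
import java.lang.management.ManagementFactory;

public class JvmIdentity {
    public static void main(String[] args) {
        // Each `java ...` launch creates a new JVM instance, i.e. a new OS process.
        String name = ManagementFactory.getRuntimeMXBean().getName();
        System.out.println("This JVM instance runs as process: " + name);
    }
}
```

Running this twice in two terminals prints two different process identities, confirming that each launch is an independent JVM instance.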

2. The difference between sysprop and sysenv:

  • sysprop: This command is used to display or modify Java system properties. System properties are key-value pairs set by command-line arguments, configuration files, or code when the JVM starts. They can be retrieved at runtime using the System.getProperty method and are often used to configure application behavior.

  • sysenv: This command is used to display the environment variables of the operating system. Environment variables are key-value pairs provided by the operating system, usually set by the operating system or shell startup scripts. They can be retrieved at runtime using the System.getenv method and are often used to configure the application's environment.

Summary: the sysprop command focuses on Java system properties, while the sysenv command focuses on operating system environment variables.
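A minimal Java sketch of the two lookups these commands correspond to (the property key app.mode is a made-up example; system properties are writable from code, environment variables are read-only):

```java
public class PropsVsEnv {
    public static void main(String[] args) {
        // Java system property: set per JVM, e.g. via -Dkey=value or System.setProperty.
        System.setProperty("app.mode", "debug");
        System.out.println("system property app.mode = " + System.getProperty("app.mode"));

        // OS environment variable: inherited from the parent process; read-only in Java.
        String path = System.getenv("PATH");
        System.out.println("environment variable PATH is " + (path == null ? "unset" : "set"));
    }
}
```

This mirrors the division above: sysprop inspects what System.getProperty sees, sysenv inspects what System.getenv sees.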

3 What is the vmoption command of Arthas for?

The vmoption command in Arthas is used to view and modify JVM options. JVM options are parameters set when the JVM starts that configure its behavior, such as heap size, garbage collector type, and enabling or disabling certain features.

Using the vmoption command, you can dynamically view or modify certain JVM options without restarting the JVM. Note that not all JVM options support dynamic modification; some can only be set when the JVM starts.

Use the vmoption command to view the current JVM options:

vmoption

Use the vmoption command to modify a modifiable JVM option (for example, to enable detailed GC logging):

vmoption PrintGCDetails true

Note that caution should be exercised when using the vmoption command to modify JVM options, as changing certain options may have a significant impact on the behavior of the JVM. Before modifying, it is recommended to carefully read the relevant JVM documentation and option descriptions.
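For reference, this kind of runtime option access is exposed on HotSpot through its diagnostic MXBean; the sketch below is an illustration of that mechanism, not Arthas' actual implementation. HeapDumpOnOutOfMemoryError is used as the example because it is a "manageable" (runtime-writable) flag:

```java
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;

public class VmOptionDemo {
    public static void main(String[] args) {
        // HotSpot-specific bean for reading/writing VM options at runtime.
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);

        // Read a manageable flag (writable at runtime).
        VMOption opt = bean.getVMOption("HeapDumpOnOutOfMemoryError");
        System.out.println(opt.getName() + " = " + opt.getValue()
                + " (writable: " + opt.isWriteable() + ")");

        // Flip it at runtime; non-manageable flags (e.g. heap sizing) would throw here.
        bean.setVMOption("HeapDumpOnOutOfMemoryError", "true");
        System.out.println("now: " + bean.getVMOption("HeapDumpOnOutOfMemoryError").getValue());
    }
}
```

Trying setVMOption on a non-writable flag throws an exception, which matches the note above that some options can only be set at JVM startup.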

4 The difference between vmoption and sysprop

vmoption and sysprop are two different commands in Arthas, used respectively to view or modify JVM options and Java system properties. Although both involve configuring the Java runtime, their purpose and scope differ.

  1. vmoption:

    • Purpose: View or modify JVM startup options.
    • Scope: Affects the behavior and configuration of the entire JVM.
    • Setting timing: usually set through command-line parameters when the JVM starts. Some options can be changed dynamically at runtime using the vmoption command.
  2. sysprop:

    • Purpose: View or modify Java system properties.
    • Scope: Affects the behavior and configuration of Java applications.
    • Setting timing: it can be set through command-line parameters (-Dkey=value) when the JVM starts, or at runtime using the System.setProperty method.
    • Example: set the file encoding (file.encoding), the Java temporary directory (java.io.tmpdir), user-defined application configuration, etc.
    • Access method: it can be accessed in Java code using the System.getProperty method.

Summary: vmoption is mainly used to configure the behavior of the JVM, while sysprop is mainly used to configure the behavior of Java applications. They also differ in how they are set and accessed.

4.1 Can the vmoption command of Arthas set parameters such as -Xms and -Xmx?

The vmoption command in Arthas is used to view and modify JVM options. However, not all JVM startup parameters can be modified at runtime. Parameters such as -Xms and -Xmx, which set the initial and maximum heap size, are fixed when the JVM starts and cannot be modified while it is running. This is because these parameters have a significant impact on the JVM's memory management and allocation, and runtime modification could cause erratic behavior.

Therefore, although the vmoption command can be used to view JVM startup parameters, it cannot be used to modify parameters such as -Xms and -Xmx. If you need to change these parameters, set them when starting the JVM.

In summary, the vmoption command can be used to view JVM startup parameters, but not all of them can be modified at runtime; parameters like the heap sizes must be set at JVM startup.

The vmoption output lists the parameters that can be set at runtime (screenshot omitted).

4.2 Some parameters are -XX:+, some are -XX:-, and some have no plus or minus sign. What do they mean?

JVM startup parameters that start with -XX: are used to set internal JVM options and behaviors. These parameters take several forms:

  1. -XX:+OptionName: Indicates to turn on or enable an option.
  2. -XX:-OptionName: Indicates to turn off or disable an option.
  3. -XX:OptionName=value: Indicates setting the value of an option.
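A small illustrative helper (not part of Arthas or the JVM; the class and method names are made up) that classifies these three forms may make the convention concrete:

```java
public class XxOptionForm {
    // Classify a -XX option string into the three documented forms.
    static String classify(String opt) {
        if (!opt.startsWith("-XX:")) return "not a -XX option";
        String body = opt.substring(4);
        if (body.startsWith("+")) return "enable boolean option " + body.substring(1);
        if (body.startsWith("-")) return "disable boolean option " + body.substring(1);
        int eq = body.indexOf('=');
        if (eq > 0) return "set " + body.substring(0, eq) + " to " + body.substring(eq + 1);
        return "unknown form";
    }

    public static void main(String[] args) {
        System.out.println(classify("-XX:+UseG1GC"));             // enable boolean option UseG1GC
        System.out.println(classify("-XX:-UseBiasedLocking"));    // disable boolean option UseBiasedLocking
        System.out.println(classify("-XX:MaxGCPauseMillis=200")); // set MaxGCPauseMillis to 200
    }
}
```

So +/- only apply to boolean flags, while value-typed options always use the = form.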

5 Does sysprop only affect a single application, while vmoption parameters affect all Java applications in this JVM?

Yes, your understanding is correct.

  1. sysprop (system properties):

    • For a single application: system properties belong to a specific JVM instance, which typically runs a single Java application, so they are effectively valid only for that application. Setting or modifying system properties does not affect applications running in other JVM instances.
    • Access method: in Java code, you can use the System.getProperty method to read system properties and the System.setProperty method to set them.
  2. vmoption (JVM startup options):

    • Valid for all applications in the JVM: JVM startup options are set when the JVM starts, and are valid for all running Java applications in the JVM. Setting or modifying these options affects the entire JVM and all applications within it.
    • Access method: usually set through command-line parameters when starting the JVM, e.g. java -Xms512m -Xmx1024m MyApp. Some options can be changed dynamically at runtime using the Arthas vmoption command.

Summary: sysprop targets a single Java application, while vmoption affects the entire JVM and all applications in it.

7 What does the jad command of Arthas do, and why is it called that?

The jad command is used to decompile Java class files into Java source code. It helps you view the source code of a running Java application, even if you don't have access to the original source. The name "jad" comes from JAD (Java Decompiler), an early popular Java decompiler tool.

8 What about mc?

The mc (Memory Compiler) command compiles Java source files into .class files in memory. Combined with commands such as redefine, it allows you to modify and reload Java classes without restarting the Java application, which is handy for fixing bugs, debugging applications, or testing code changes.

9 The difference between Arthas and JProfiler

  • Arthas: It is a Java diagnostic tool open sourced by Alibaba, focusing on online diagnosis of Java applications. It provides many functions, including viewing JVM information, decompiling Java classes, dynamically loading Java classes, monitoring method calls, tracking method execution, etc. Arthas is a lightweight tool suitable for use in production environments to attach and diagnose without restarting the Java application.

  • JProfiler: a commercial Java performance profiler, mainly used for performance analysis and optimization. It provides many functions, including CPU profiling, memory analysis, thread analysis, database analysis, etc. JProfiler is often used in development environments to help developers find performance bottlenecks, memory leaks, and threading issues. It has a graphical user interface (GUI) for easy visualization and analysis of performance data.

Summary:

  • Arthas is more suitable for online diagnosis and troubleshooting, and can be used without restarting the application.
  • JProfiler is more suitable for development and testing environments, mainly for performance analysis and optimization.
  • Arthas is an open source tool, JProfiler is a commercial tool.

10 Why is the redefine command set to be unrecoverable after modification?

The behavior of the redefine command is based on the redefineClasses method of the Java Instrumentation interface. This method is a feature provided by the JVM that allows the bytecode of a class to be replaced at runtime; it is usually used for hot fixes, hot deployment, and similar scenarios.

When we use the redefine command, we actually replace the bytecode of the class at runtime. This is an irreversible operation because the JVM does not save the original bytecode. Once we redefine the class, the JVM uses the new bytecode and cannot restore the original state.

The irreversibility here is not a design decision of Arthas, but a limitation of the JVM: the redefineClasses method can only replace bytecode and does not provide any functionality for restoring the original bytecode.

If you need to restore the original bytecode after a redefine, you can manually save the class's original bytecode beforehand, and then use the redefine command again to reload the saved bytecode when you need to restore it.
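A minimal sketch of that "save the original bytecode first" idea, using plain file copies. All paths and names here are hypothetical, and in the demo a fake 4-byte file stands in for a real class file:

```java
import java.io.IOException;
import java.nio.file.*;

public class ClassBackup {
    // Before redefining, copy the original .class file so it can later be
    // reloaded with `redefine` to roll back the change.
    static Path backup(Path originalClassFile, Path backupDir) throws IOException {
        Files.createDirectories(backupDir);
        Path target = backupDir.resolve(originalClassFile.getFileName());
        return Files.copy(originalClassFile, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("redefine-demo");
        // Stand-in for real bytecode: just the class-file magic number.
        Path cls = Files.write(tmp.resolve("Demo.class"),
                new byte[]{(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE});
        Path saved = backup(cls, tmp.resolve("backup"));
        System.out.println("saved original bytecode to " + saved);
    }
}
```

With such a backup in place, rollback is simply `redefine /path/to/backup/Demo.class`.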

It should be noted that there are some restrictions when redefining classes. For example, fields and method signatures of a class cannot be added, removed, or modified. This is because such changes may cause incompatibilities between loaded classes and the new bytecode, and may cause runtime errors.

11 redefine

11.1 How to use the redefine command

For example, I now want to hot-deploy the new XXX.class file just uploaded to the /root/tmp/ directory of Linux. Just use this command:

redefine /root/tmp/XXX.class

11.1.1 Why does the redefine command need only one parameter? Shouldn't you also specify the fully qualified name of the class being replaced in memory?

In Java, a .class file contains the metadata of the class, including its fully qualified name. When Arthas reads a .class file with the redefine command, it parses the fully qualified name from the file's metadata, then uses that name to look up the corresponding class in runtime memory and replace its bytecode.

Specifically, the Java .class file format contains a constant pool (Constant Pool), which holds the fully qualified name of the class, method names, field names, and other information. Arthas can obtain the fully qualified name of the class from the constant pool.

11.2 A demo that uses redefine to implement online troubleshooting

Hot deployment: Arthas redefine command

12 Why does redefine conflict with reset

The redefine and reset commands conflict mainly because of their different working mechanisms and purposes.

  1. The redefine command:

    The redefine command loads external .class files and redefines them in the JVM. It is a hot-replacement technique that allows you to directly replace the bytecode of a loaded class without restarting the application. The redefine command uses the redefineClasses method of the JVM's Instrumentation interface to implement bytecode replacement.

  2. The reset command:

    The reset command resets all enhanced classes to their original state, without any Arthas enhancements (such as monitoring or tracing). Its purpose is to remove Arthas' modifications and restore the bytecode of a class to the state it had when the application started.

Reason for conflict:

After you use the redefine command to replace the bytecode of a class, Arthas' reset command does not recognize the new bytecode (because it is not the bytecode from application startup). Therefore, the reset command cannot correctly restore a redefined class to its original state.

Similarly, after you execute redefine, running commands such as jad, watch, trace, monitor, or tt may reset the redefined bytecode, because these commands may trigger re-transformation of the class.

To avoid such conflicts, it is recommended to execute the reset command first to ensure that all enhanced classes are restored, and then use the redefine command to replace the bytecode of the class. After replacing the bytecode, if you need to use other Arthas commands, be careful not to accidentally reset the redefined bytecode.

13 Is it because the bytecode is replaced, rather than modified in place?

Yes. When using the redefine command, you completely replace the bytecode of the target class rather than modifying the original bytecode. That is why the reset command does not recognize the redefined bytecode.

14 If reset is used after redefine, classes other than the redefined one are restored to their state at application startup. Will this affect the application?

Yes, the reset command restores the other Arthas-enhanced classes to their state at application startup; only classes replaced with the redefine command cannot be restored. This has no adverse effect on the application; it is just that the bytecode of that particular class cannot be restored.

15 If redefine is used and reset fails, is there any way to restore it to the state when the application started?

To restore to the state when the application was started:

  • Manual restore: manually save the original bytecode of the target class before using the redefine command. When recovery is required, use the redefine command again to reload the saved original bytecode into the JVM.
  • Restart the application : If manual recovery is not possible, restarting the application is the easiest and most reliable method. Restarting the application causes the JVM to reload all classes, thus restoring the state when the application was started.

Note that the redefine command is meant for hot fixes or hot deployment in emergencies. During development and testing, it is recommended to use the normal development, testing, and deployment process. When using redefine in a production environment, make sure to test it thoroughly and carry out a corresponding risk assessment and backup.

16 classloader command

16.1 classloader -t: display the class loader hierarchy and parent-child relationships

+-BootstrapClassLoader
+-sun.misc.Launcher$ExtClassLoader@6d6f6e28
  +-com.taobao.arthas.agent.ArthasClassloader@464cd0d8
  +-sun.misc.Launcher$AppClassLoader@18b4aac2
    +-java.net.FactoryURLClassLoader@170c01c5
Affect(row-cnt:5) cost in 1 ms.

This is the output of the classloader command in Arthas, showing the class loader hierarchy in the JVM.

In the JVM, class loaders are organized in a hierarchy, and each class loader (except the bootstrap class loader) has a parent loader. This organization facilitates the parental delegation model of class loaders.

In this example:

  • BootstrapClassLoader is the bootstrap class loader of the JVM. It is responsible for loading the JVM's core class libraries (e.g. the java.lang.* classes).
  • sun.misc.Launcher$ExtClassLoader@6d6f6e28 is the extension class loader. It is responsible for loading classes located in the <JAVA_HOME>/jre/lib/ext directory or in directories specified by the java.ext.dirs system property.
    • com.taobao.arthas.agent.ArthasClassloader@464cd0d8 is the class loader of Arthas. It is responsible for loading Arthas' own classes.
    • sun.misc.Launcher$AppClassLoader@18b4aac2 is the application class loader. It is responsible for loading application classes, usually located on the classpath.
      • java.net.FactoryURLClassLoader@170c01c5 is a custom class loader. It may be used to load additional resources or plugins.

The +- prefix and the indentation on each line indicate the class loader hierarchy: a loader indented under another is its child loader.

This output is very useful as it can help you understand the classloader structure and hierarchy to better understand how classes are loaded. This is useful for diagnosing classloading related issues such as class version conflicts, class not found, class conversion exceptions, etc.

16.2 In the parent delegation model, BootstrapClassLoader is the top-level loader and ExtClassLoader is its child, right? Why are they shown as siblings here?

In the parent delegation model, BootstrapClassLoader is the top-level loader, but it is a special loader, not a Java object. Therefore, in the class loader hierarchy, BootstrapClassLoader usually does not appear as the parent loader of other class loaders.

Specifically, BootstrapClassLoader is the part of the JVM that is written in C++ and is not exposed as a Java object. This is why you cannot obtain an instance of BootstrapClassLoader in a Java program.

In Java, all other class loaders are instances of the java.lang.ClassLoader class. ExtClassLoader and AppClassLoader are instances of URLClassLoader, which is a subclass of ClassLoader. In Java's class loader hierarchy, ExtClassLoader is the parent loader of AppClassLoader.

In the output of the Arthas classloader command, BootstrapClassLoader is displayed at the topmost level, alongside ExtClassLoader. This is because BootstrapClassLoader is not a Java object and therefore does not appear as the parent of other loaders in the tree. In fact, however, BootstrapClassLoader is the top-level loader, and ExtClassLoader is its child.
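This can be observed directly in Java: walking getParent() from the application class loader ends at null, which is how the bootstrap loader shows up. A minimal sketch (class name is made up for the demo):

```java
public class LoaderChain {
    public static void main(String[] args) {
        // Walk from the class loader of this class up toward the bootstrap loader.
        // getParent() returns null for the bootstrap loader, because it is
        // implemented natively and has no Java object representing it.
        ClassLoader cl = LoaderChain.class.getClassLoader();
        while (cl != null) {
            System.out.println(cl);
            cl = cl.getParent();
        }
        System.out.println("null  <- bootstrap class loader (no Java object)");
    }
}
```

Likewise, Object.class.getClassLoader() returns null, because java.lang.Object is loaded by the bootstrap loader.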

17 Differences between watch and monitor commands

In short: watch observes individual method invocations and prints their input parameters, return values, or exceptions (selected via OGNL expressions) each time the method is called, while monitor aggregates invocation statistics for a method over a time window, such as call count, success and failure counts, and average response time, and reports them periodically. Use watch to inspect concrete call data and monitor to get a statistical overview.

18 tt and watch

18.1 The difference between tt and watch

tt and watch are both powerful commands in Arthas. They are similar in some respects, but their main usage scenarios and functions differ. Here are the main differences between them:

tt (TimeTunnel) command:

  1. Record call data: the tt command records the call data of a method, including input parameters, return values, exceptions, etc., and stores these data in a "time tunnel".

  2. View history: you can use the tt command at any time to view the recorded call data, not just the most recent call.

  3. Replay method calls: a powerful feature of the tt command is the ability to replay method calls, which helps with reproducing and debugging issues.

  4. No real-time display: the tt command does not display the results of method calls in real time; you explicitly view the recorded data afterwards.

watch command:

  1. Real-time observation: the main purpose of the watch command is to observe the call data of a method in real time. Every time the target method is called, the watch command fires and displays the result immediately.

  2. Expression support: the watch command supports OGNL expressions, which makes it very flexible for displaying method call data such as parameters, return values, and exceptions.

  3. No historical data: unlike tt, the watch command does not record historical method call data. It only focuses on real-time data.

Summary:

  • If you need to record the invocation data of a method and possibly view or replay it later, then the tt command is more appropriate.

  • If you want to observe method invocation data in real time, or view it under certain conditions (for example, when a method throws an exception), then the watch command is more suitable.

Despite their differences, the tt and watch commands are often used together in debugging and diagnostic tasks to obtain more complete information.
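As a concrete illustration, a typical session might look like the following (commands are entered in the Arthas console; the target class and method are taken from this article's demo project, so adjust them to your own):

watch com.itheima.controller.UserController login '{params,returnObj}' -x 2

tt -t com.itheima.controller.UserController login
tt -l
tt -i 1000 -p

The watch line prints parameters and return value on every call; the tt lines record calls into the time tunnel, list the recorded entries, and replay the entry with index 1000.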

18.2 Why do we need the tt command when we have the watch command?

When we debug an application, sometimes we may not know where the source of the problem is. The tt command can help us record method calls, including input parameters, return values, and thrown exceptions. This way, we can review this information at any time without spending time reproducing the problem or guessing where it might have gone wrong.

Specifically, the tt command takes a "snapshot" of each invocation of the target method and saves it in a "time tunnel". Later, we can use the tt command to view these snapshots and analyze them to find the source of the problem. We can even re-execute method calls through the tt replay feature to gain deeper insight into the cause of the problem.

In short, the tt command is a very powerful tool that helps us locate and solve problems more easily. Of course, like any other tool, using it well requires some understanding of its capabilities and limitations.

19 profiler

19.1 Why do you need a flame graph:

Flame Graph is a visual tool used to display the CPU resources occupied by the code during the running of the program. The advantage of the flame graph is that it can visually display the call stack of each function in the program and their CPU usage. The visual form of the flame graph allows developers to more easily locate performance bottlenecks in the program and quickly diagnose performance problems. Flame graphs are very useful tools for code performance optimization and program debugging.

19.2 Does a flame graph show the busyness of a single CPU or of all CPUs?

A flame graph can represent how busy a single CPU or all CPUs are. It depends on how the data for the flame graph was collected.

  • Single CPU flame graph: When you only focus on the performance bottleneck of a single CPU, you can use the flame graph to represent the busyness of a single CPU. This is a common use case, especially when debugging single-threaded programs.

  • All CPU flame graphs: When you need to analyze the performance of the entire system, you can collect the data of all CPUs and generate a flame graph showing the busyness of all CPUs. This kind of flame graph can show the performance distribution of the whole system, which is helpful to understand the overall performance bottleneck.

When generating a flame graph, you can choose the range of data to be collected (single CPU or all CPUs), depending on your needs.
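For reference, the Arthas profiler command can generate such flame graphs; a typical workflow looks like this (flag names follow the Arthas profiler documentation, and the output path is just an example):

profiler start --event cpu
profiler status
profiler stop --file /tmp/flamegraph.html

profiler start begins CPU sampling, and profiler stop writes the flame graph to the given file.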

19.3 Why "look at which top-level function occupies the largest width; a flat top (plateau) means that function may have a performance problem"?

The y-axis represents the call stack, and each layer is a function. The deeper the call stack, the higher the flame, with the executing function at the top and its parent functions below.

The x-axis represents the number of samples. If a function occupies a wider span on the x-axis, it was sampled more often, i.e. its execution time is longer. Note that the x-axis does not represent time; all call stacks are merged and arranged in alphabetical order.

When reading a flame graph, look at which top-level function occupies the largest width. Wherever there is a "plateau" (flat top), that function may have a performance problem.

Flame Graph is a visualization technique used to show performance bottlenecks in the software execution stack. Flame graphs were invented by Brendan Gregg and are widely used in performance tuning and bottleneck analysis. They provide an intuitive and efficient way to display a program's call stacks and their cost.

The Y-axis of the flame graph represents the call stack; each layer is a function, and the deeper the call stack, the taller the flame. At the top is the currently executing function, and below it are its parent functions. The x-axis is not time: it is the merged set of all sampled call stacks, arranged alphabetically. The wider a function appears on the x-axis, the more samples it was present in, i.e. the longer its execution time.

The core idea of the flame graph is to visualize call stack data so that developers can find performance bottlenecks faster. By rendering stack data as a colorful graph, flame graphs make performance bottlenecks highly visible.

A "plateau" is a wider flat section of the flame graph, indicating that the function was sampled many times during execution, implying that it takes a long time to run. Therefore, a flat top usually indicates a performance bottleneck, and fixing performance issues in flat-top functions can often significantly improve overall program performance.

The advantage of the flame graph is that it can quickly locate the code location where the performance bottleneck is located. Because the flame graph visualizes the function call stack, developers can quickly find functions that cause performance problems and optimize them.

19.4 What about bottom-level functions? What if the bottom layer is a flat top?

If a bottom-level function is flat-topped, it is not necessarily a performance bottleneck. In a flame graph, the bottom layer is usually the root call, which may include the main function or other system calls. The width of a bottom-level function is likely due to it being called many times, but it is not necessarily a performance bottleneck in itself.

In performance analysis, we usually pay more attention to top-level functions, because their execution time may include the execution time of their sub-functions. The width of a top-level function represents the total execution time of that function and all its child functions. If the top-level function is flat-topped, it may be a performance bottleneck.

The width of the underlying function only represents the execution time of the function itself. Even if the underlying function is flat-topped, it may just be called frequently, but the execution time of each call may be short. Therefore, we generally don't consider low-level functions to be performance bottlenecks.

In short, the flat top in the flame graph is mainly used to find the performance bottleneck of the top-level function, and the flat top of the bottom function is usually not the focus of our attention. When optimizing performance, we should pay more attention to functions that are at the top level and have a larger width.

20 Diagnosing cases with Arthas

20.1 Use Arthas to determine which controller a time-consuming request comes from, then analyze and optimize the code

20.1.1 Why is the first step to "determine the controller handling the request"?

Analysis: First, we need to know which request we want to analyze, for example the user login request. Second, we need to understand how Spring MVC works: all requests go through the DispatcherServlet class, whose getHandler method returns the controller that will process the request.

20.1.2 Phase 1: Find a concrete solution

  1. Our focus is therefore to use the watch command to observe the input parameters and return value of the getHandler method:
watch org.springframework.web.servlet.DispatcherServlet getHandler 'returnObj'
  2. Then trigger such a request from the front end or Postman; for example, here I trigger a login request.

3. Check which controllers are obtained:
From the highlighted part of the output (screenshot omitted), we can see that this login request passed through the login method of UserController and the findAll method of StudentController.

20.1.3 Phase II: Analysis

  1. Input parameter and return value analysis: use the watch command to observe the input parameters and return value of the handler method in this specific controller:
watch com.itheima.controller.* * '{params,returnObj}' -x 2

The result obtained is as follows:

method=com.itheima.controller.UserController.login location=AtExit
ts=2023-08-21 20:55:52; [cost=5.2786ms] result=@ArrayList[
    @Object[][
        @User[User{id=null, name='newboy', password='123'}],
        @StandardSessionFacade[org.apache.catalina.session.StandardSessionFacade@2a57d000],
    ],
    @String[forward:/student/list],
]
method=com.itheima.controller.StudentController.findAll location=AtExit
ts=2023-08-21 20:55:52; [cost=6.0088ms] result=@ArrayList[
    @Object[][isEmpty=true;size=0],
    @ModelAndView[
        view=@String[list],
        model=@ModelMap[isEmpty=false;size=1],
        status=null,
        cleared=@Boolean[false],
    ],
]
  2. Call chain and per-node time analysis: use the trace command to get the call chain of the controller's handler method and the time spent on each node:

trace com.itheima.controller.* login

We found that the most time-consuming method called by this controller is com.itheima.service.UserService:login().

  3. So we can continue to analyze the login method of this business layer:

trace com.itheima.service.UserService login

The results are as follows: we found that the com.itheima.dao.UserDao:login() method, which accesses the database, is the most time-consuming. As is often the case, I/O takes up most of the time spent processing a request.

Affect(class count: 3 , method count: 2) cost in 78 ms, listenerId: 12
`---ts=2023-08-21 21:10:42;thread_name=http-nio-8080-exec-3;id=1a;is_daemon=true;priority=5;TCCL=org.apache.catalina.loader.ParallelWebappClassLoader@64e2f243
    `---[7.7167ms] com.sun.proxy.$Proxy25:login()
        `---[28.93% 2.2322ms ] com.itheima.service.impl.UserServiceImpl:login()
            `---[98.83% 2.206ms ] com.itheima.dao.UserDao:login() #17
  4. View the generated proxy method, and use the jad command to decompile it:

jad com.sun.proxy.$Proxy25 login

ClassLoader:

  +-java.net.URLClassLoader@1c4af82c
    +-sun.misc.Launcher$AppClassLoader@764c12b6
      +-sun.misc.Launcher$ExtClassLoader@3d82c5f3

Location:

public final User login(User user) {
    try {
        return (User)this.h.invoke(this, m3, new Object[]{user});
    }
    catch (Error | RuntimeException throwable) {
        throw throwable;
    }
    catch (Throwable throwable) {
        throw new UndeclaredThrowableException(throwable);
    }
}

20.1.4 Phase 3: Code Optimization

  1. SQL optimization:

Note: The most time-consuming method in this project is the proxy method com.sun.proxy.$Proxy25:login(). This method is generated by MyBatis, and we generally cannot change it. If accessing the database really is the bottleneck, the options are to tune the SQL statement and adjust database-related parameters.

  2. If the business-layer method handling the request is very time-consuming, code optimization generally means modifying the bytecode, so we use the dump command to save the bytecode of the class containing the business method to another directory. Suppose the concrete implementation found in the trace above, com.itheima.service.impl.UserServiceImpl:login(), is very time-consuming and we now need to improve it.

Save the bytecode of the UserServiceImpl class to another directory as a backup (dump writes .class files, not source):

dump -d /root/backup/ com.itheima.service.impl.UserServiceImpl

Decompile the UserServiceImpl implementation class to a source file in the specified directory:

jad --source-only com.itheima.service.impl.UserServiceImpl > /root/UserServiceImpl.java

Modify /root/UserServiceImpl.java locally in an IDE or in the vim editor, then use the mc command to compile the Java file into the /root/bbb/ directory:

mc -d /root/bbb/ /root/UserServiceImpl.java

Use the redefine command to load the new bytecode:

redefine /root/bbb/UserServiceImpl.class

20.2 Locate the corresponding commands following a JVM-tuning summary from Zhihu

Interviewer: How to perform JVM tuning (with real cases)

20.2.1 Basic Knowledge Questions

20.2.1.1 After the Arthas bootstrap is running, you must first select a Java process, then use Arthas' dashboard command; the dashboard shows the macro-level operation of that process.

Yes. The Arthas dashboard command provides a macro-level real-time view of the health and key metrics of the selected Java process. This view helps developers quickly understand the overall status and performance of the process.

The information displayed by the dashboard command includes, but is not limited to:

  1. System Load : Displays information such as the CPU usage, memory usage, and number of threads of the system.

  2. JVM information : Displays information such as heap memory usage, non-heap memory usage, and garbage collection of the JVM.

  3. Thread information : Displays the thread state distribution in the Java process, including running threads, blocked threads, waiting threads, etc.

  4. Tomcat Information : If the Java process is a Tomcat server, the dashboard also shows Tomcat's health and key metrics.

With this information, developers can quickly understand the running status of the Java process, discover performance bottlenecks or signs of failure, and take further diagnostic and optimization measures as needed.

20.2.1.2 The following is the dashboard's memory usage information for a Java process. Is the "total" here the memory currently allocated by the JVM for the application, or the JVM's entire memory?

Memory                    used    total    max     usage    GC
heap                      60M     243M     3593M   1.69%    gc.ps_scavenge.count          1
ps_eden_space             49M     63M      1326M   3.76%    gc.ps_scavenge.time(ms)       6
ps_survivor_space         10M     10M      10M     99.71%   gc.ps_marksweep.count         0
ps_old_gen                481K    173568K          0.02%    gc.ps_marksweep.time(ms)      0
nonheap                   28M     29M      -1      97.16%
code_cache                6M      6M       240M    2.74%
metaspace                 19M     20M      -1      96.72%
compressed_class_space    2M      2M       1024M   0.23%

The "total" here is the heap memory the JVM has currently allocated (committed) for this application. When the JVM starts, it allocates an initial amount of heap memory, configurable via startup parameters (-Xms). The "total" value grows as the program runs, up to the "max" value (-Xmx). A growing "total" means the application needs more memory to store objects.

In conclusion:

  • "used" is the heap memory already used by the application.
  • "total" is the heap memory currently allocated by the JVM for the application.
  • "max" is the maximum amount of heap memory that the JVM can allocate for the application.
  • "usage" is the usage rate, calculated as "used" / "max" (when "max" is undefined, shown as -1, it is "used" / "total").
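The same used/total/max figures can be read from inside the process via the standard MemoryMXBean; a minimal sketch ("total" maps to the bean's "committed" value, as described above; the class name here is illustrative):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapUsage {
    public static void main(String[] args) {
        // Heap usage as reported by the JVM: used/committed/max.
        // "committed" corresponds to the dashboard's "total", i.e. the
        // memory currently allocated to the heap by the JVM.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();

        long used = heap.getUsed();
        long committed = heap.getCommitted();
        long max = heap.getMax(); // -1 when no explicit maximum is set

        // usage = used / max, falling back to used / total when max is -1
        double usage = max > 0 ? (double) used / max : (double) used / committed;
        System.out.printf("used=%dM total=%dM max=%dM usage=%.2f%%%n",
                used >> 20, committed >> 20, max >> 20, usage * 100);
    }
}
```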

20.2.1.3 Are the following JVM parameters set only for a certain Java process, rather than for all Java processes on the machine?

-Xms128m
-Xmx1024m
-XX:ReservedCodeCacheSize=512m
-XX:+IgnoreUnrecognizedVMOptions
-XX:+UseG1GC
-XX:SoftRefLRUPolicyMSPerMB=50
-XX:CICompilerCount=2
-XX:+HeapDumpOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow 

Yes, these JVM parameters apply only to a specific Java process, not to every Java process on the machine. JVM parameters are specified when starting a Java application and are valid only for that process. Each Java process has its own JVM instance, and these parameters are independent for each instance.

A brief explanation of these parameters:

  • -Xms128m: Set the initial size of the JVM heap memory to 128MB.
  • -Xmx1024m: Set the maximum size of JVM heap memory to 1024MB.
  • -XX:ReservedCodeCacheSize=512m: Set the size of the code cache to 512MB.
  • -XX:+IgnoreUnrecognizedVMOptions: Allows the JVM to ignore unrecognized JVM options.
  • -XX:+UseG1GC: Use the G1 garbage collector.
  • -XX:SoftRefLRUPolicyMSPerMB=50: Keep soft-referenced objects alive for 50 ms per MB of free heap before they become eligible for collection.
  • -XX:CICompilerCount=2: Set the number of threads of the just-in-time compiler to 2.
  • -XX:+HeapDumpOnOutOfMemoryError: Generate a heap dump file when an out-of-memory error occurs.
  • -XX:-OmitStackTraceInFastThrow: Do not omit stack information when certain common exceptions are thrown.

These parameters are Java Virtual Machine (JVM) startup parameters used to configure and tune the runtime environment of Java applications. Different Java processes can have different startup parameters.
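To confirm that such flags belong to a single process, you can list the startup arguments of the current JVM; a small sketch using the standard RuntimeMXBean (the class name is illustrative):

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class ShowVmFlags {
    public static void main(String[] args) {
        // Input arguments of *this* JVM process only; another Java process
        // started with different -Xmx/-XX flags reports its own list.
        List<String> vmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String a : vmArgs) {
            System.out.println(a);
        }
        // Effective heap limit resulting from -Xmx (approximate):
        System.out.println("maxMemory=" + Runtime.getRuntime().maxMemory() / (1024 * 1024) + "M");
    }
}
```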

20.2.2 CPU metrics

View the process using the most CPU
View the threads using the most CPU
View thread stack snapshots
Analyze code execution hotspots
View which code takes the longest CPU execution time
View the proportion of CPU time each method takes

In Arthas, you can use the following commands to address the items above:

  1. View the process using the most CPU:

    • Use operating system commands such as top or ps. Arthas does not have a command for viewing the CPU usage of processes.
  2. View the threads using the most CPU; view deadlocked threads:

    • The thread command shows the threads in the Java process, including each thread's state, sorted by CPU usage; thread -b finds the thread that is blocking other threads.
  3. View thread stack snapshot information:

    • The thread <id> command shows the details of a specific thread in the Java process, including its stack trace.
  4. Analyze code execution hotspots:

    • The profiler command can analyze the execution hotspots of a Java application and generate a flame graph.
    • The trace command can trace calls to given methods of a class, showing the internal call chain along with the time spent at, and proportion taken by, each node.


  5. View which code takes the longest CPU execution time:

    • The profiler command analyzes the application's execution and generates a flame graph.
    • The watch command can observe method calls and returns, and can measure method execution time.
  6. View the proportion of CPU time each method takes:

    • The profiler command analyzes the application's execution and generates a flame graph.
    • The trace command reports the time spent at, and the proportion taken by, each node on the call chain.

Note that the flame graph produced by the profiler command is based on sampled statistics; it shows the proportion of execution time spent in each method. Flame graphs help you find performance bottlenecks and hotspots to optimize.

20.2.3 JVM memory metrics

Check whether the current JVM heap memory parameters are configured reasonably
View statistics of the objects in the heap
Take a heap dump snapshot and analyze memory usage
Check whether memory growth in each heap region is normal
Find which region triggered GC
Check whether memory is reclaimed normally after GC

In Arthas, the following commands can help address the items above:

  1. Check whether the current JVM heap memory parameters are configured reasonably:

    • The jvm command shows the JVM's parameter configuration and runtime information, including heap memory, non-heap memory, GC, and more.
    • The vmoption command can view, and update at runtime, the JVM's diagnostic options.
  2. View statistics of the objects in the heap:

    • The heapdump command generates a heap snapshot, which can then be analyzed with an external tool such as MAT to view object statistics.
    • The sc and sm commands show information about loaded classes and their methods, but they do not provide detailed per-object statistics.
  3. Take a heap dump snapshot and analyze memory usage:

    • The heapdump command generates a heap snapshot; the snapshot file can then be analyzed with an external tool such as MAT to see the memory footprint.
  4. Check whether memory growth in each heap region is normal:

    • The dashboard command provides a simple overview of heap memory, showing the usage of each heap region.
    • The jvm command can also show the usage of each heap memory region.
  5. Find which region triggered GC:

    • The dashboard command also displays GC statistics.
Memory                    used    total    max     usage    GC
heap                      60M     243M     3593M   1.69%    gc.ps_scavenge.count          1
ps_eden_space             49M     63M      1326M   3.76%    gc.ps_scavenge.time(ms)       6
ps_survivor_space         10M     10M      10M     99.71%   gc.ps_marksweep.count         0
ps_old_gen                481K    173568K          0.02%    gc.ps_marksweep.time(ms)      0
nonheap                   28M     29M      -1      97.16%
code_cache                6M      6M       240M    2.74%
metaspace                 19M     20M      -1      96.72%
compressed_class_space    2M      2M       1024M   0.23%

  6. Check whether memory is reclaimed normally after GC:
    • When the dashboard command displays memory usage, you can see whether memory is released after GC.
    • The jvm command can also show the usage of each heap region, as well as GC statistics.

It should be noted that what Arthas provides is a way to diagnose and analyze Java applications at runtime. It can help you view and collect some information, but deeper analysis and diagnosis may require the help of other tools, such as memory analysis tools (MAT), code analysis tools, performance analysis tools, etc.

For example, to generate a Java heap dump (.hprof) snapshot file at a specified location:

heapdump /path/to/dumpFile.hprof
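For comparison, the same kind of .hprof snapshot can be produced from inside the process with the HotSpot-specific diagnostic MXBean; a sketch (the output path is only an example, and the class name is illustrative):

```java
import java.io.File;
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class HeapDumper {
    public static void main(String[] args) throws Exception {
        HotSpotDiagnosticMXBean diag =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Example path; dumpHeap fails if the file already exists.
        String path = "/tmp/dumpFile.hprof";
        // live=true dumps only reachable objects (forces a GC first),
        // similar to Arthas' "heapdump --live".
        diag.dumpHeap(path, true);
        System.out.println("dumped " + new File(path).length() + " bytes");
    }
}
```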

20.2.4 JVM GC metrics

Check whether the GC time per minute is normal
Check whether the number of YGCs per minute is normal
Check whether the number of FGCs is normal
Check whether the time of a single FGC is normal
View the detailed timing of each phase of a single GC and find the most expensive phase
Check whether the dynamic promotion age of objects is normal

These metrics can be viewed through the dashboard command, for example in the following memory usage table:

Memory                    used    total    max     usage    GC
heap                      73M     243M     3593M   2.04%    gc.ps_scavenge.count          1
ps_eden_space             62M     63M      1326M   4.71%    gc.ps_scavenge.time(ms)       6
ps_survivor_space         10M     10M      10M     99.71%   gc.ps_marksweep.count         0
ps_old_gen                481K    173568K          0.02%    gc.ps_marksweep.time(ms)      0
nonheap                   29M     30M      -1      97.61%
code_cache                6M      7M       240M    2.91%
metaspace                 20M     20M      -1      97.65%
compressed_class_space    2M      2M       1024M   0.24%

From the output of the dashboard command above, we can find the following information:

  1. Check whether the GC time per minute is normal

    • gc.ps_scavenge.time(ms) reflects the total time spent on Minor GC (YGC).
    • gc.ps_marksweep.time(ms) reflects the total time spent on Full GC (FGC).
    • Note that these counters are cumulative since the JVM started; to get a per-minute rate, sample the values twice and take the difference.
  2. Check whether the number of YGCs per minute is normal

    • gc.ps_scavenge.count reflects the number of Minor GCs (YGC).
  3. Check whether the number of FGCs is normal

    • gc.ps_marksweep.count reflects the number of Full GCs (FGC).
  4. Check whether the time of a single FGC is normal

    • This information is not included in the output above. To view individual FGC times, you need to check the GC log files or use other tools, e.g. jstat.
  5. View the detailed timing of each phase of a single GC and find the most expensive phase

    • This information is not included in the output above. To view per-phase timing, you need other tools, such as GC logs, jstat, or visualization tools (such as JVisualVM).
  6. Check whether the dynamic promotion age of objects is normal

    • This information is not included in the output above. To see the dynamic promotion age of objects, you need other tools such as jstat or a visualization tool such as JVisualVM.

Note: In the output above we can see heap usage and per-region memory usage, but this information does not directly reflect detailed GC behavior. For a deeper analysis of GC behavior, you may need other tools or the GC logs.

Q: Why does ps_scavenge reflect YGC and ps_marksweep reflect FGC?

The names here are derived from the names of the two garbage collectors used by the Java HotSpot virtual machine.

  1. PS Scavenge (Parallel Scavenge): a parallel young-generation collector, used mainly to clean up objects in the young generation. Hence gc.ps_scavenge.time(ms) reflects the total time of Minor GC (YGC), i.e. young-generation collection time.

  2. PS MarkSweep: the old-generation collector paired with Parallel Scavenge; its mark and sweep phases are serial and run during a global stop-the-world pause. Hence gc.ps_marksweep.time(ms) reflects the total time of Full GC (FGC), i.e. collection of the entire heap (young and old generations).

These two collectors are used together and form the default garbage collection strategy of the HotSpot VM through JDK 8 (G1 became the default in JDK 9).
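The gc.ps_scavenge.* / gc.ps_marksweep.* columns map onto the standard GarbageCollectorMXBeans, which can be read from inside the process; a minimal sketch (collector names depend on the GC in use, e.g. "PS Scavenge"/"PS MarkSweep" for the parallel collectors, "G1 Young Generation"/"G1 Old Generation" for G1; the class name is illustrative):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount/-Time are cumulative since JVM start;
            // sample twice and subtract to get a per-minute rate.
            System.out.printf("%s: count=%d time(ms)=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```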

20.2.5 What is the difference between JNI reflection and bytecode reflection? (OOM issues related to the method area/metaspace)

Background:

Metaspace is mainly used to store class-related information. Memory fragmentation there suggests that a relatively large number of class loaders were probably created, while utilization stays low.

Therefore, when the metaspace becomes fragmented, we focus on checking whether a large number of class loaders have been created.

On the JVM, reflective method calls are initially implemented through JNI. When the JVM notices that a method is frequently accessed via reflection, it generates bytecode to perform the same operation; this is called the inflation mechanism. The bytecode path generates a DelegatingClassLoader for the method, so if many methods are frequently called reflectively, a large number of DelegatingClassLoaders are created.

How many reflective calls does it take to switch from JNI to bytecode?

The default is 15, controllable via the -Dsun.reflect.inflationThreshold parameter. Below that count the method is invoked via JNI; once the call count exceeds the threshold, a bytecode-based invoker is generated for the method.

Analysis conclusion: reflective calls created a large number of DelegatingClassLoaders, which occupied a large amount of metaspace and fragmented it. Metaspace utilization was therefore low, the threshold was reached quickly, and FGC was triggered.

1 Metaspace mainly stores class-related information; fragmentation or OOM there indicates that many class loaders were likely created while utilization is low. Why?

Metaspace is mainly used to store the metadata of classes, including class bytecode, method information, field information, etc. When a large number of class loaders are created and many classes are loaded by those loaders, metaspace utilization can become low. There are several possible reasons:

  1. Short-lived class loaders : If there are many short-lived class loaders, they load classes for a while and are recycled when they are done. This can lead to memory fragmentation in the metaspace.

  2. Dynamic code generation : Some frameworks and libraries dynamically generate and load classes, which can cause a large number of classes to be loaded into the metaspace, further causing memory fragmentation.

  3. Improper class loader design : If the application has an improper class loader design, such as an unreasonable class loader hierarchy or too many custom class loaders, this may cause memory fragmentation of the metaspace.

  4. Frequent class unloading : If an application loads and unloads classes frequently, it can cause memory fragmentation in the metaspace.

  5. Class loader leaks : If class loaders hold references to classes and are not properly reclaimed, this can lead to memory fragmentation in the metaspace.

In summary, memory fragmentation in metaspace is usually caused by unreasonable class loader design, dynamic code generation, short-lived class loaders, frequent class loading and unloading, and so on. Solutions include designing class loaders sensibly, avoiding unnecessary dynamic code generation, and managing class loaders carefully.

2 Why does the JVM switch to the bytecode mechanism when a method is reflectively called too often? Isn't generating a new class and class loader a space-consuming mechanism?

When a method is called via reflection, the JVM's initial approach is to use JNI (Java Native Interface) to perform the call. However, JNI calls carry some performance overhead, so when a method is repeatedly called reflectively, the JVM takes an optimization measure: it generates a new Java class and uses bytecode instead of JNI to make the reflective call. This optimization mechanism is called "inflation".

The newly generated Java class here is a proxy class designed for fast execution of reflective calls. In order to load this newly generated Java class, the JVM needs to create a new class loader ( DelegatingClassLoader). This practice is to ensure the isolation and security of class loading. Different reflection calls should be performed under their own independent class loaders to avoid conflicts and interference between class loaders.

It should be noted that this new class loader is not created for each reflective call. Instead, once the generated Java class has been created and loaded, it can be shared by multiple reflective calls. However, if many different methods are frequently called reflectively, the JVM will generate a new Java class for each of them, and create corresponding DelegatingClassLoaders to load these generated classes. This may cause a large number of DelegatingClassLoader objects to be created, which can put some pressure on memory.

In general, the purpose of this approach is to improve the performance of reflection calls, but it may bring additional memory overhead. In practical applications, if reflection calls are very frequent, it is recommended to consider using other technologies instead of reflection to reduce performance overhead and memory consumption.

3 Why does every reflection need to create a DelegatingClassLoader; can't the first one simply be reused? (In fact, after a DelegatingClassLoader has been created for a method, subsequent reflective calls to that method reuse it)

Actually, not every reflective call creates a new DelegatingClassLoader. In the JVM, a reflective method call initially goes through JNI (Java Native Interface). When a method is called frequently through reflection (usually more than 15 times, controlled by the -Dsun.reflect.inflationThreshold parameter), the JVM triggers the inflation mechanism, dynamically generates a new Java class, and replaces the JNI path with a bytecode-based reflective call.

This newly generated Java class needs to be loaded into the JVM, so a DelegatingClassLoader is created to load it. This isolates different reflective calls under their own independent class loaders, avoiding conflicts and interference between them. Once this class loader has been created and has loaded the generated class, it is shared by subsequent reflective calls to that method, without creating a new DelegatingClassLoader for each call.

But if a large number of methods are frequently called by reflection, the JVM generates a new Java class for each such method and creates a corresponding DelegatingClassLoader to load it, which may cause a large number of DelegatingClassLoaders to be created.
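The inflation behavior described above can be exercised with a small experiment; a sketch (the class and method names are made up for illustration, and the 15-call default applies to HotSpot's sun.reflect implementation):

```java
import java.lang.reflect.Method;

public class InflationDemo {
    public static String greet(String name) {
        return "hello " + name;
    }

    public static void main(String[] args) throws Exception {
        Method m = InflationDemo.class.getMethod("greet", String.class);
        // The first ~15 invocations go through the JNI-based accessor;
        // after -Dsun.reflect.inflationThreshold (default 15) calls,
        // HotSpot generates a bytecode accessor, loaded by a
        // DelegatingClassLoader, and swaps it in transparently.
        for (int i = 0; i < 20; i++) {
            String r = (String) m.invoke(null, "arthas");
            // The result is identical before and after inflation.
            if (!"hello arthas".equals(r)) throw new AssertionError(r);
        }
        System.out.println("20 reflective calls completed");
    }
}
```

Running with -Dsun.reflect.noInflation=true forces the bytecode path from the first call, which is one way to observe the extra DelegatingClassLoader instances in Arthas' classloader command.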

4 JNI calls may involve OS system calls, even disk I/O, and may interact with other languages. These are all time-consuming, so the JVM trades space for time, right?

Partially correct. JNI (Java Native Interface) is a framework that allows Java code and native code (such as C or C++ code) to call each other. Through JNI, Java code can call native methods and vice versa. Using JNI, developers can use native libraries to perform tasks that Java may not be able to perform efficiently, or to access system-level resources.

JNI calls do have some overhead, mostly because of the need to context switch between the Java Virtual Machine and native code. However, these overheads are generally not related to operating system calls or disk I/O. The main overhead of JNI comes from the data conversions and context switches that must be done. For example, when Java code calls a native method, it may be necessary to convert Java objects to native data structures, or to switch the execution environment between the virtual machine and the native code.

When the JVM detects that a method is often invoked through reflection, it uses the inflation mechanism to replace JNI calls by generating Java bytecode. The purpose of this is really to improve performance, since bytecode execution is usually faster than JNI calls. However, this approach increases memory overhead, because a new Java class needs to be generated for each method that is frequently invoked reflectively, and a new class loader is created to load this class.

In general, the JVM employs bloat to improve the performance of reflective calls, but it is not to avoid system calls or disk I/O. Rather, it is to avoid the data conversion and context switching overhead brought by JNI calls.

5 Why is a JDK proxy fast to create but relatively slow to execute, while CGLIB is just the opposite?

The performance difference between JDK dynamic proxies and CGLIB dynamic proxies mainly comes from their implementation methods and applicable scenarios.

  1. JDK dynamic proxy : proxying is based on interfaces, which means only a proxy class implementing the interfaces of the proxied object needs to be created; it forwards all method calls to an InvocationHandler. Creating the proxy class is therefore relatively fast. However, every method call must go through a reflective call via the InvocationHandler's invoke method, which is time-consuming.

  2. CGLIB dynamic proxy : proxying is based on inheritance: for each proxied class, a subclass inheriting from it is generated. The subclass overrides all non-final methods of the parent class and adds the enhancement code to the overridden methods. Creating a proxy object is therefore relatively slow, because new classes and bytecode must be generated. But at runtime the overridden methods are invoked directly, without reflection, so execution is faster.

That's why JDK proxies are fast to create but slow to execute, and CGLIB proxies are slow to create but fast to execute. However, with the optimization of the JVM, the performance difference between the two proxy methods gradually narrows. When choosing a proxy method, other factors should also be considered, such as whether there is an interface, whether a specific method needs to be proxied, and so on.
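A minimal sketch of a JDK dynamic proxy, illustrating why every call pays the cost of InvocationHandler.invoke (the interface, handler, and names here are illustrative, not from the article):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface UserService {
        String login(String name);
    }

    public static void main(String[] args) {
        UserService target = name -> "ok:" + name;

        // Every call on the proxy is dispatched reflectively through this
        // handler -- the per-call cost JDK proxies pay at runtime.
        InvocationHandler h = (p, method, params) -> {
            long t0 = System.nanoTime();
            Object result = method.invoke(target, params);
            System.out.println(method.getName() + " took " + (System.nanoTime() - t0) + "ns");
            return result;
        };

        // Creating the proxy class is cheap: it only implements the interface.
        UserService proxy = (UserService) Proxy.newProxyInstance(
                UserService.class.getClassLoader(),
                new Class<?>[]{UserService.class}, h);

        System.out.println(proxy.login("newboy"));
    }
}
```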


Origin blog.csdn.net/yxg520s/article/details/132381305