In-depth JVM - JIT tiered compilation and compilation logs explained

1. Background

The JVM can execute bytecode at runtime in two modes:

  • The first is interpreted execution, in which the interpreter executes the bytecode instruction by instruction. In theory it is slower, especially for large loops and computation-heavy tasks;
  • The other is compiled execution (JIT, just-in-time compilation), which can greatly improve performance, often by a factor of tens to hundreds.

This is what "mixed mode" in the output of the java -version command refers to.

Take a look at the example for JDK11:

% java -version
java version "11.0.6" 2020-01-14 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.6+8-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.6+8-LTS, mixed mode)

In the output, "mixed mode" means that the JVM mixes compiled execution with interpreted execution.

Look at the example of JDK8 again:

$ java -version

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

View the relevant startup parameters for interpreted execution and compiled execution:

java -X
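    ... (some output omitted) ...
    -Xint             interpreted mode execution only
    -Xmixed           mixed mode execution (default)
    -Xcomp            forces compilation of methods on first invocation

    -Xms<size>        set initial Java heap size
    -Xmx<size>        set maximum Java heap size
    -Xdiag            show additional diagnostic messages
    -Xinternalversion
                      displays more detailed JVM version information than the -version option
    -XshowSettings:all
                      show all settings and continue
    -XshowSettings:vm
                      show all vm related settings and continue
    -XshowSettings:system
                      (Linux Only) show host system or container configuration and continue
    -Xss<size>        set Java thread stack size
    -Xverify          sets the mode of the bytecode verifier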

In a typical CRUD program with moderate concurrency, there may be no obvious overall performance difference between the two modes, because most of the time is spent outside the CPU, for example waiting on network and I/O operations.
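
For CPU-bound code, by contrast, the difference is easy to observe. The following is a minimal sketch rather than a rigorous benchmark; the class name ModeDemo and the loop body are illustrative assumptions. It can be run with -Xint, -Xmixed, and -Xcomp to compare timings. It also prints the java.vm.info system property, which HotSpot sets to the same mode string that java -version shows.

```java
public class ModeDemo {

    // CPU-bound work: sum of the squares of 0..n-1
    public static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // On HotSpot this is, for example, "mixed mode" or "interpreted mode"
        System.out.println("VM mode: " + System.getProperty("java.vm.info"));

        long start = System.nanoTime();
        long result = 0;
        for (int round = 0; round < 20; round++) {
            result += sumOfSquares(1_000_000);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("result=" + result + ", elapsed=" + elapsedMs + " ms");
    }
}
```

Running it as java -Xint ModeDemo and then as java ModeDemo should show a clearly longer elapsed time in interpreted mode, although the exact ratio depends on the machine and the JVM version.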

The JIT compiler can greatly improve program performance. There are two kinds:

  • One is called the client compiler (Client Compiler). Its design goal is to make Java programs start quickly; its main usage scenario is graphical client applications such as AWT/Swing;
  • The other is called the server compiler (Server Compiler). Its design goal is better overall performance; as the name suggests, its main usage scenario is server systems that run continuously for a long time.

In older Java versions, we could explicitly specify through startup parameters which JIT compiler the HotSpot JVM should use.

To handle more complex usage scenarios and achieve better performance, tiered compilation was introduced starting with Java 7.

This article first introduces these two JIT compilers, then describes tiered compilation and its five compilation levels in detail, and finally analyzes the compilation log of a concrete example to gain an in-depth understanding of how JIT compilation works.

2. JIT compiler

The JIT compiler (just-in-time compiler) compiles frequently executed bytecode into native machine code.

Frequently executed code is called hot code (hotspots), which is where the name HotSpot JVM comes from.

With just-in-time compilation, the execution performance of Java programs improves greatly and can approach that of purely compiled languages.

Of course, in real software development, code quality also has a large impact on performance.

For complex systems, at the same "development cost", a system written in a high-level language usually achieves much better overall quality than one written in a low-level language.

The HotSpot JVM provides two JIT compilers:

  • Compiler for client version: C1
  • Compiler for server version: C2

2.1. Client version compiler: C1

The client compiler, known as C1 (compiler 1), is a JIT compiler built into the JVM. One of its design goals is to make Java applications start faster. It therefore optimizes the bytecode only as far as it can do so quickly, and compiles it to machine code in a short time.

Initially, C1 was mainly used for client applications with a short life cycle. For such applications, startup time is a very important non-functional requirement.

In versions before Java 8, you could select the C1 compiler with the -client startup parameter. In Java 8 and later, this parameter has no effect; it is kept only so that existing startup scripts continue to work without errors.

Verification command: java -client -version

2.2. Server version compiler: C2

The server compiler, known as C2 (compiler 2), is a JIT compiler built into the JVM that produces better-performing code. It is suited to applications with a longer life cycle, mainly server-side applications.
C2 monitors and analyzes the execution of compiled code, and uses this profiling data to generate more optimized machine code that replaces the earlier version.

In versions before Java 8, the -server startup parameter was needed to select the C2 compiler. In Java 8 and later, this parameter has no effect; it is kept only to avoid startup errors.

Verification command: java -server -version

The output content is different from the previous client mode
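
Besides the command line, the JIT compiler in use can also be queried from inside a running program. The following is a small sketch using the standard java.lang.management API; the class name CompilerInfo is an illustrative assumption, and the exact name string returned depends on the JVM (on a tiered HotSpot build it typically mentions "Tiered Compilers").

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class CompilerInfo {

    // Returns a one-line description of the JIT compiler reported by the JVM
    public static String describe() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            // The API returns null when the JVM has no compilation system
            return "no JIT compiler";
        }
        String text = jit.getName();
        if (jit.isCompilationTimeMonitoringSupported()) {
            text += ", total compilation time " + jit.getTotalCompilationTime() + " ms";
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```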

2.3. Graal JIT Compiler

Starting with Java 10, the JVM supports the Graal JIT compiler as an experimental alternative to C2.

Graal supports both just-in-time (JIT) compilation and ahead-of-time (AOT) compilation.

In ahead-of-time mode, Java bytecode is compiled into native code before the program starts.

3. Tiered Compilation

Compared with the C1 compiler, the C2 compiler needs to consume more CPU and memory resources when compiling the same method, but can generate highly optimized native code with excellent performance.

Starting with Java 7, the JVM introduced tiered compilation, whose goal is to combine C1 and C2 so as to balance fast startup against efficient long-term operation.

3.1. Merging the advantages of the two compilers

The overall process of tiered compilation is shown in the figure below:

[Figure: overall process of tiered compilation]

Stage 1:

After the application starts, the JVM first interprets all bytecode and collects information about method invocations.
The JIT compiler then analyzes the collected data to find hot code.

Stage 2:

C1 starts and quickly compiles frequently executed methods into native machine code.

Stage 3:

Once enough profiling information has been collected, C2 steps in.
C2 spends more CPU time on compilation and uses more aggressive techniques to recompile the code into highly optimized native machine code.

Overall, C1 quickly improves execution efficiency, and C2 then uses the profile of the hot code to improve the performance of the compiled native code further.

3.2. Accurate Profiling

Another benefit of tiered compilation is more accurate profiling of the code.

In Java versions without tiered compilation, the JVM could collect the profiling information it needs for optimization only during interpretation.

With tiered compilation, the JVM also collects information while C1-compiled code is running. Because compiled code performs better, the JVM can afford to collect more profiling samples.

3.3. Code Cache

The code cache is a memory area in the JVM that stores all native machine code generated by JIT compilation.
With tiered compilation, the amount of code cache memory required grows to about four times the original.

Java 9 and later divide the JVM's code cache into three segments:

  • Non-method segment (non-method): stores JVM-internal native code such as the bytecode interpreter; the default size is about 5 MB and can be set with the startup parameter -XX:NonNMethodCodeHeapSize.
  • Profiled-code segment (profiled-code): stores native code compiled by C1 with profiling; this code generally has a short lifetime. The default size is about 122 MB and can be set with the startup parameter -XX:ProfiledCodeHeapSize.
  • Non-profiled segment (non-profiled): stores native code compiled and optimized by C2; this code generally has a long lifetime. The default size is about 122 MB and can be set with the startup parameter -XX:NonProfiledCodeHeapSize.

Splitting the code cache into several segments improves overall performance, because related compiled code sits closer together (code locality) and memory fragmentation is reduced.
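
The code cache segments are exposed as non-heap memory pools and can be inspected at runtime. The following is a minimal sketch using the standard MemoryPoolMXBean API; the class name CodeCachePools is illustrative, and the filter on the substring "Code" is an assumption based on the pool names HotSpot usually reports (a single "Code Cache" pool on Java 8, several "CodeHeap ..." pools when the cache is segmented).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class CodeCachePools {

    // Collects one line per non-heap memory pool whose name mentions "Code"
    public static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP || !pool.getName().contains("Code")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined
            long maxKb = usage.getMax() < 0 ? -1 : usage.getMax() / 1024;
            lines.add(pool.getName() + ": used=" + usage.getUsed() / 1024
                    + " KB, max=" + maxKb + " KB");
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```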

3.4. Deoptimization

Although code compiled by C2 is highly optimized native code that is generally retained for a long time, deoptimization can still occur.
When it does, the affected code falls back to interpreted execution.

Deoptimization occurs when the compiler's optimistic assumptions are invalidated, for example when the collected profiling information no longer matches the method's actual behavior:

[Figure: deoptimization example]

In this scenario, once the hot path changes, the JVM deoptimizes the previously compiled and inlined code.
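
A hot path change of this kind can be sketched in a few lines. In the following example (the names DeoptSketch, Shape, Square, and Circle are illustrative assumptions), the call site s.area() sees only one receiver type for the first half of the loop and a different one afterwards. Whether the JVM actually deoptimizes here depends on the JVM version and flags; running with -XX:+PrintCompilation is one way to check.

```java
public class DeoptSketch {

    interface Shape {
        double area();
    }

    static final class Square implements Shape {
        private final double side;
        Square(double side) { this.side = side; }
        @Override public double area() { return side * side; }
    }

    static final class Circle implements Shape {
        private final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override public double area() { return Math.PI * radius * radius; }
    }

    // The receiver type at the call site s.area() changes once i reaches switchAt
    public static double total(int iterations, int switchAt) {
        double sum = 0;
        for (int i = 0; i < iterations; i++) {
            Shape s = (i < switchAt) ? new Square(2) : new Circle(1);
            sum += s.area();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(total(1_000_000, 500_000));
    }
}
```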

4. Compilation Levels

The JVM has a built-in interpreter and two JIT compilers, which together give five possible compilation levels.

C1 can operate at three of these levels; they differ in how much profiling is performed.

4.1. Level 0 - Interpreted code

After the JVM starts, it interprets all Java code. At this initial stage, performance is generally not comparable to that of compiled languages.

However, the JIT compiler kicks in after the warm-up phase and compiles hot code at runtime.

The JIT compiler bases its optimizations on the profiling information collected at Level 0.

4.2. Level 1 - Simple C1 compiled code

At Level 1, the JVM compiles the code with the C1 compiler but does not collect any profiling data. The JVM uses Level 1 for trivial methods.

Many methods, such as getters and setters, are so simple that recompiling them with C2 would not make them any faster.

The JVM therefore concludes that profiling such methods cannot lead to better code, so it does not insert any profiling logic at all.

4.3. Level 2 - Limited C1 compiled code

At Level 2, the JVM compiles the code with the C1 compiler and collects only lightweight profiling data.

The JVM uses this level when the C2 compile queue is full. The goal is to compile the code as soon as possible to improve performance.

Later, the JVM recompiles the code at Level 3 with full profiling.

Finally, once the C2 queue is less busy, the JVM recompiles the code at Level 4.

4.4. Level 3 - Full C1 compiled code

At Level 3, the JVM compiles the code with C1 and collects full profiling data.

Level 3 is part of the default compilation path.

Therefore, the JVM uses this level in all cases except for trivial methods or when the compiler queue is full.

The most common transition in JIT compilation is a jump directly from interpreted code (Level 0) to Level 3.

4.5. Level 4 - C2 compiled code

At Level 4, the JVM compiles the code with C2 to achieve the best long-term performance.

Level 4 is also part of the default compilation path. The JVM uses this level to compile all methods except trivial ones.

Level 4 code is considered fully optimized, so the JVM stops collecting profiling information for it.

However, the JVM may still deoptimize the code and fall back to Level 0.
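
The five levels can be summarized in a small lookup. The sketch below only restates the descriptions from this section; the class name CompLevels and the wording of the strings are illustrative, not taken from the JVM.

```java
public class CompLevels {

    // Maps a compilation level (0-4) to who executes the code and how much profiling is done
    public static String describe(int level) {
        switch (level) {
            case 0: return "Level 0: interpreter, collects profiling data";
            case 1: return "Level 1: C1, no profiling (trivial methods)";
            case 2: return "Level 2: C1, limited profiling (C2 queue is full)";
            case 3: return "Level 3: C1, full profiling (default path)";
            case 4: return "Level 4: C2, no profiling (default path)";
            default: throw new IllegalArgumentException("unknown level: " + level);
        }
    }

    public static void main(String[] args) {
        for (int level = 0; level <= 4; level++) {
            System.out.println(describe(level));
        }
    }
}
```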

5. Compiler parameter settings

Tiered compilation has been enabled by default since Java 8. Do not disable it unless you have a specific and valid reason to do so.

5.1. Disable tiered compilation

Tiered compilation can be disabled with -XX:-TieredCompilation.
When this flag is turned off with the minus sign, the JVM no longer switches between compilation levels.
You therefore also need to choose which JIT compiler to use: C1 or C2.

If none is specified explicitly, the JVM picks the default JIT compiler based on CPU characteristics.
For multi-core processors or 64-bit virtual machines, the JVM chooses C2.

If you want to disable C2 and use only C1, without the overhead of profiling, pass the startup parameter -XX:TieredStopAtLevel=1.

To disable the JIT compiler completely and run everything in the interpreter, specify the startup parameter -Xint. Of course, disabling the JIT compiler has a negative impact on performance.

In some cases, such as complex generic combinations used in program code, generic information may be erased due to JIT optimization. At this time, you can try to disable layered compilation or JIT compilation.

For simple CRUD programs with little concurrency, CPU computation accounts for only a small part of the overall processing time, so the difference between interpreted and compiled execution is not obvious.
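
To confirm which of these settings are actually in effect, the flag values can be read from inside the running JVM. The following sketch uses HotSpotDiagnosticMXBean from the com.sun.management package, which is HotSpot-specific; the class name VmFlags and the "n/a" fallback are illustrative assumptions.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class VmFlags {

    // Returns the current value of a HotSpot flag, or "n/a" if it is not available on this JVM
    public static String valueOf(String flag) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "n/a";
            }
            return bean.getVMOption(flag).getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the flag does not exist (or the bean is unsupported)
            return "n/a";
        }
    }

    public static void main(String[] args) {
        System.out.println("TieredCompilation = " + valueOf("TieredCompilation"));
        System.out.println("TieredStopAtLevel = " + valueOf("TieredStopAtLevel"));
    }
}
```

Starting the program with, for example, -XX:TieredStopAtLevel=1 and then without it makes the effect of the startup parameter directly visible.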

5.2. Set the trigger threshold for each level of compilation (Threshold)

The compile threshold refers to the number of method calls that need to be reached before the code is compiled.

With tiered compilation, thresholds can be set for compilation levels 2 to 4.

For example, we can lower the Tier 4 threshold to 10,000 with -XX:Tier4CompileThreshold=10000.

java -version is a very handy way to inspect JVM parameters.

For example, we can run java -version with the -XX:+PrintFlagsFinal flag to check the default thresholds of a particular Java version.

An example of the parameters for the Java 8 version is as follows:

java -XX:+PrintFlagsFinal -version | grep Threshold

 intx BackEdgeThreshold                         = 100000    {pd product}
 intx BiasedLockingBulkRebiasThreshold          = 20        {product}
 intx BiasedLockingBulkRevokeThreshold          = 40        {product}
uintx CMSPrecleanThreshold                      = 1000      {product}
uintx CMSScheduleRemarkEdenSizeThreshold        = 2097152   {product}
uintx CMSWorkQueueDrainThreshold                = 10        {product}
uintx CMS_SweepTimerThresholdMillis             = 10        {product}
 intx CompileThreshold                          = 10000     {pd product}
 intx G1ConcRefinementThresholdStep             = 0         {product}
uintx G1SATBBufferEnqueueingThresholdPercent    = 60        {product}
uintx IncreaseFirstTierCompileThresholdAt       = 50        {product}
uintx InitialTenuringThreshold                  = 7         {product}
uintx LargePageHeapSizeThreshold                = 134217728 {product}
uintx MaxTenuringThreshold                      = 15        {product}
 intx MinInliningThreshold                      = 250       {product}
uintx PretenureSizeThreshold                    = 0         {product}
uintx ShenandoahAllocationThreshold             = 0         {product rw}
uintx ShenandoahFreeThreshold                   = 10        {product rw}
uintx ShenandoahFullGCThreshold                 = 3         {product rw}
uintx ShenandoahGarbageThreshold                = 60        {product rw}
uintx StringDeduplicationAgeThreshold           = 3         {product}
uintx ThresholdTolerance                        = 10        {product}
 intx Tier2BackEdgeThreshold                    = 0         {product}
 intx Tier2CompileThreshold                     = 0         {product}
 intx Tier3BackEdgeThreshold                    = 60000     {product}
 intx Tier3CompileThreshold                     = 2000      {product}
 intx Tier3InvocationThreshold                  = 200       {product}
 intx Tier3MinInvocationThreshold               = 100       {product}
 intx Tier4BackEdgeThreshold                    = 40000     {product}
 intx Tier4CompileThreshold                     = 15000     {product}
 intx Tier4InvocationThreshold                  = 5000      {product}
 intx Tier4MinInvocationThreshold               = 600       {product}

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

An example of the parameters for the Java 11 version is as follows:

java -XX:+PrintFlagsFinal -version | grep Threshold

 intx BiasedLockingBulkRebiasThreshold         = 20              {product} {default}
 intx BiasedLockingBulkRevokeThreshold         = 40              {product} {default}
uintx CMSPrecleanThreshold                     = 1000            {product} {default}
size_t CMSScheduleRemarkEdenSizeThreshold      = 2097152         {product} {default}
uintx CMSWorkQueueDrainThreshold               = 10              {product} {default}
uintx CMS_SweepTimerThresholdMillis            = 10              {product} {default}
 intx CompileThreshold                         = 10000           {pd product} {default}
double CompileThresholdScaling                 = 1.000000        {product} {default}
size_t G1ConcRefinementThresholdStep           = 2               {product} {default}
uintx G1SATBBufferEnqueueingThresholdPercent   = 60              {product} {default}
uintx IncreaseFirstTierCompileThresholdAt      = 50              {product} {default}
uintx InitialTenuringThreshold                 = 7               {product} {default}
size_t LargePageHeapSizeThreshold              = 134217728       {product} {default}
uintx MaxTenuringThreshold                     = 15              {product} {default}
 intx MinInliningThreshold                     = 250             {product} {default}
size_t PretenureSizeThreshold                  = 0               {product} {default}
uintx StringDeduplicationAgeThreshold          = 3               {product} {default}
uintx ThresholdTolerance                       = 10              {product} {default}
 intx Tier2BackEdgeThreshold                   = 0               {product} {default}
 intx Tier2CompileThreshold                    = 0               {product} {default}
 intx Tier3AOTBackEdgeThreshold                = 120000          {product} {default}
 intx Tier3AOTCompileThreshold                 = 15000           {product} {default}
 intx Tier3AOTInvocationThreshold              = 10000           {product} {default}
 intx Tier3AOTMinInvocationThreshold           = 1000            {product} {default}
 intx Tier3BackEdgeThreshold                   = 60000           {product} {default}
 intx Tier3CompileThreshold                    = 2000            {product} {default}
 intx Tier3InvocationThreshold                 = 200             {product} {default}
 intx Tier3MinInvocationThreshold              = 100             {product} {default}
 intx Tier4BackEdgeThreshold                   = 40000           {product} {default}
 intx Tier4CompileThreshold                    = 15000           {product} {default}
 intx Tier4InvocationThreshold                 = 5000            {product} {default}
 intx Tier4MinInvocationThreshold              = 600             {product} {default}

java version "11.0.6" 2020-01-14 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.6+8-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.6+8-LTS, mixed mode)

Focus on the following CompileThreshold flags:

java -XX:+PrintFlagsFinal -version | grep CompileThreshold
intx CompileThreshold = 10000
intx Tier2CompileThreshold = 0
intx Tier3CompileThreshold = 2000
intx Tier4CompileThreshold = 15000

Note that when tiered compilation is enabled, the general compile threshold parameter CompileThreshold = 10000 no longer takes effect.
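
The TierX flags work together rather than individually. The following is a simplified sketch of the call-count predicate as described in the comments of HotSpot's tiered compilation policy source: a method qualifies when its invocation count reaches TierXInvocationThreshold, or when it reaches TierXMinInvocationThreshold and invocations plus loop back edges reach TierXCompileThreshold. The class and method names are illustrative assumptions, and the sketch ignores the load-based scaling factor and the separate TierXBackEdgeThreshold check used for loops.

```java
public class TierPredicate {

    // Default values taken from the PrintFlagsFinal output above
    static final int TIER3_INVOCATION = 200;
    static final int TIER3_MIN_INVOCATION = 100;
    static final int TIER3_COMPILE = 2000;
    static final int TIER4_INVOCATION = 5000;
    static final int TIER4_MIN_INVOCATION = 600;
    static final int TIER4_COMPILE = 15000;

    // i = invocation count, b = back-edge (loop iteration) count
    static boolean callPredicate(long i, long b, int invocation, int minInvocation, int compile) {
        return i >= invocation || (i >= minInvocation && i + b >= compile);
    }

    public static boolean reachesTier3(long invocations, long backEdges) {
        return callPredicate(invocations, backEdges,
                TIER3_INVOCATION, TIER3_MIN_INVOCATION, TIER3_COMPILE);
    }

    public static boolean reachesTier4(long invocations, long backEdges) {
        return callPredicate(invocations, backEdges,
                TIER4_INVOCATION, TIER4_MIN_INVOCATION, TIER4_COMPILE);
    }

    public static void main(String[] args) {
        System.out.println(reachesTier3(250, 0));    // many calls, no loops
        System.out.println(reachesTier3(150, 1900)); // fewer calls, but hot loops
        System.out.println(reachesTier3(50, 100000)); // too few calls for the call predicate
    }
}
```

Under this reading, Tier3CompileThreshold = 2000 is not a plain invocation count: a method called only 150 times can still qualify if its loops have run often enough.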

6. Method compilation

The life cycle of method compilation is shown in the figure below:

[Figure: life cycle of method compilation]

In general, a method is initially interpreted by the JVM until its invocation count reaches the threshold specified by Tier3CompileThreshold.
Once the threshold is reached, the JVM compiles the method with the C1 compiler while continuing to collect profiling information.
When the invocation count reaches Tier4CompileThreshold, the JVM compiles the method with the C2 compiler.

Of course, the JVM may later deoptimize the code compiled by C2, in which case the whole process may repeat.

6.1. Compile log format

By default, the JVM does not output the JIT compilation log. To enable it, set the startup parameter -XX:+PrintCompilation.

Each line of the compilation log consists of these parts:

  • Timestamp – the time of the compilation, in milliseconds since JVM startup. Many JVM logs use this relative time.
  • Compile ID – an auto-incrementing ID assigned to each compilation task.
  • Attributes – the state of the compilation task, with 5 possible flags:
    • % – on-stack replacement (OSR)
    • s – the method is synchronized
    • ! – the method contains an exception handler
    • b – the compilation ran in blocking mode
    • n – native method; what is actually compiled is the wrapper method
  • Compilation level – a value from 0 to 4
  • Method name
  • Bytecode size
  • Deoptimization indicator, with 2 possible values:
    • made not entrant – for example, a standard C1 deoptimization occurred, or an optimistic assumption of the compiler turned out to be wrong.
    • made zombie – the code is no longer used and can be cleaned up at any time; this is the cleanup mechanism used when the garbage collector frees code cache space.

A sample line of compilation log is as follows:

# wrapped across several lines here for layout
2258 1324 %     4
       com.cncounter.demo.compile.TieredCompilation::main @ 2
        (58 bytes)  made not entrant

A brief interpretation from left to right:

  • 2258 is the timestamp, in milliseconds since the JVM started;
  • 1324 is the compile ID of this compilation. When a method has several compilation records, this ID can be used to tell them apart;
  • % indicates on-stack replacement;
  • 4 indicates compilation level 4 (possible values 0-4);
  • com.cncounter.demo.compile.TieredCompilation::main is the method;
  • @ 2 is optional. It always appears together with the % on-stack replacement flag and gives the bytecode index at which the on-stack replacement entry is located; other numbers can be seen elsewhere in compilation logs;
  • (58 bytes) indicates that the bytecode of this method is 58 bytes long;
  • made not entrant, when present, is the deoptimization indicator.
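
The fields described above can be extracted with a regular expression. The following is a minimal sketch that handles only the simplified line shape shown in this article (it does not cover every variant that PrintCompilation can emit, such as native wrapper lines); the class name CompilationLogLine and its field names are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CompilationLogLine {

    // timestamp, compile id, optional attribute flags, level, method,
    // optional "@ bci", bytecode size, optional deoptimization indicator
    private static final Pattern LINE = Pattern.compile(
            "^\\s*(\\d+)\\s+(\\d+)\\s+([%s!bn ]*?)\\s*(\\d)\\s+(\\S+)"
            + "(?:\\s+@\\s+(\\d+))?\\s+\\((\\d+) bytes\\)"
            + "\\s*(made not entrant|made zombie)?\\s*$");

    public final long timestampMillis;
    public final int compileId;
    public final String attributes; // e.g. "%" or "s!", empty if none
    public final int level;
    public final String method;
    public final int osrBci;        // -1 when the line has no "@ n" part
    public final int bytecodeSize;
    public final String status;     // empty when there is no indicator

    private CompilationLogLine(Matcher m) {
        timestampMillis = Long.parseLong(m.group(1));
        compileId = Integer.parseInt(m.group(2));
        attributes = m.group(3).replace(" ", "");
        level = Integer.parseInt(m.group(4));
        method = m.group(5);
        osrBci = m.group(6) == null ? -1 : Integer.parseInt(m.group(6));
        bytecodeSize = Integer.parseInt(m.group(7));
        status = m.group(8) == null ? "" : m.group(8);
    }

    // Returns null for lines that do not match the simplified format
    public static CompilationLogLine parse(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? new CompilationLogLine(m) : null;
    }

    public boolean isOsr() {
        return attributes.contains("%");
    }

    public static void main(String[] args) {
        CompilationLogLine l = parse("2258 1324 %     4       "
                + "com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)   made not entrant");
        System.out.println("id=" + l.compileId + " level=" + l.level + " osr=" + l.isOsr()
                + " bci=" + l.osrBci + " size=" + l.bytecodeSize + " status=" + l.status);
    }
}
```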

A sample compilation log executed by JDK11 can refer to the file: compile-log-sample.txt

6.2. Demo code

The following is a specific example to show the life cycle of method compilation.

First, create a simple Formatter interface:

package com.cncounter.demo.compile;

public interface Formatter {
    // Converts the given object to its string representation
    <T> String format(T object) throws Exception;
}

Then create a simple implementation class for the JSON format, JsonFormatter:

package com.cncounter.demo.compile;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.json.JsonMapper;

public class JsonFormatter implements Formatter {

    // A single mapper instance shared by all formatter objects
    private static final JsonMapper mapper = new JsonMapper();

    @Override
    public <T> String format(T object) throws JsonProcessingException {
        return mapper.writeValueAsString(object);
    }
}

The dependencies used in the code (jackson-databind and jackson-dataformat-xml) can be found on the https://mvnrepository.com website.

Strictly speaking, there is a difference between formatting and serialization: formatting = converting an object to a string; serialization = converting an object to a sequence of bytes.

Create another implementation class for the XML format, XmlFormatter:

package com.cncounter.demo.compile;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;

public class XmlFormatter implements Formatter {

    // A single mapper instance shared by all formatter objects
    private static final XmlMapper mapper = new XmlMapper();

    @Override
    public <T> String format(T object) throws JsonProcessingException {
        return mapper.writeValueAsString(object);
    }

}

And a simple class, Article:

package com.cncounter.demo.compile;

public class Article {

    private String name;
    private String author;

    public Article(String name, String author) {
        this.name = name;
        this.author = author;
    }

    public String getName() {
        return name;
    }

    public String getAuthor() {
        return author;
    }

}

After preparing these classes, write a class containing a main method that calls the two formatters.

package com.cncounter.demo.compile;

public class TieredCompilation {

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 1_000_000; i++) {
            Formatter formatter;
            // The first half of the iterations uses JSON, the second half uses XML
            if (i < 500_000) {
                formatter = new JsonFormatter();
            } else {
                formatter = new XmlFormatter();
            }
            formatter.format(new Article("Tiered Compilation in JVM", "CNC"));
        }
    }

}

Inside the for loop, an if statement checks the loop counter: the JsonFormatter implementation is called first, and after a certain number of iterations the XmlFormatter implementation is called instead.

After writing the code, run the program with the JVM startup parameter -XX:+PrintCompilation. Note that the plus sign (+) turns the switch on, and the minus sign (-) turns it off.

After executing the program, you can see the corresponding compilation log.

6.3. Interpreting the compilation log

Tip: Running the same program several times may produce different compilation logs; each case needs to be analyzed on its own.

The compilation log output is long. For a sample log from one run with JDK 11, see the file compile-log-sample.txt

Use the pipe | grep cncounter to filter out the parts of interest:

cat compile-log-sample.txt | grep cncounter

1023  788       1       com.cncounter.demo.compile.Article::getName (5 bytes)
1025  789       1       com.cncounter.demo.compile.Article::getAuthor (5 bytes)

1032  800       3       com.cncounter.demo.compile.JsonFormatter::<init> (5 bytes)
1032  801       3       com.cncounter.demo.compile.Article::<init> (15 bytes)
1041  820       3       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)

1122  903       4       com.cncounter.demo.compile.JsonFormatter::<init> (5 bytes)
1123  800       3       com.cncounter.demo.compile.JsonFormatter::<init> (5 bytes)   made not entrant
1123  904       4       com.cncounter.demo.compile.Article::<init> (15 bytes)
1124  801       3       com.cncounter.demo.compile.Article::<init> (15 bytes)   made not entrant

1132  932 %     3       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)
1133  933       3       com.cncounter.demo.compile.TieredCompilation::main (58 bytes)

1146  905       4       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)
1281  820       3       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)   made not entrant
1281  934 %     4       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)
1285  932 %     3       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)   made not entrant

1346  934 %     4       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)   made not entrant
1350  933       3       com.cncounter.demo.compile.TieredCompilation::main (58 bytes)   made not entrant
1361  905       4       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)   made not entrant

1543 1228       2       com.cncounter.demo.compile.XmlFormatter::<init> (5 bytes)
1546 1235       2       com.cncounter.demo.compile.XmlFormatter::format (8 bytes)

1561 1298 %     3       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)
1577 1310       3       com.cncounter.demo.compile.TieredCompilation::main (58 bytes)
1935 1324 %     4       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)
1939 1298 %     3       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)   made not entrant

2258 1324 %     4       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)   made not entrant
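
When reading a filtered log like this, it can help to group the entries by method and look at the sequence of levels each method went through. The following is a small sketch that does this by splitting each line on whitespace; the class name LevelHistory, the "(OSR)" suffix, and the "L3"-style event labels are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LevelHistory {

    // Groups log lines by method and records the sequence of level events
    public static Map<String, List<String>> summarize(List<String> lines) {
        Map<String, List<String>> history = new LinkedHashMap<>();
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            int methodIdx = -1;
            for (int i = 0; i < tokens.length; i++) {
                if (tokens[i].contains("::")) {
                    methodIdx = i;
                    break;
                }
            }
            if (methodIdx < 3) {
                continue; // timestamp, compile id and level must precede the method
            }
            // An extra token between compile id and level holds the attribute flags
            boolean osr = methodIdx > 3 && tokens[2].contains("%");
            String key = tokens[methodIdx] + (osr ? " (OSR)" : "");
            String event = "L" + tokens[methodIdx - 1]
                    + (line.contains("made not entrant") ? " made not entrant" : "");
            history.computeIfAbsent(key, k -> new ArrayList<>()).add(event);
        }
        return history;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "1041  820       3       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)",
                "1146  905       4       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)",
                "1281  820       3       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)   made not entrant",
                "1361  905       4       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)   made not entrant");
        summarize(sample).forEach((method, events) -> System.out.println(method + " -> " + events));
    }
}
```

For JsonFormatter::format, the resulting sequence shows the level 3 compilation, the upgrade to level 4, and then both versions being invalidated.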

6.3.1. Timestamp

The first column is the timestamp, the number of milliseconds since JVM startup. As you can see, the compilation log is output in timestamp order.

The other parts are briefly explained below.

6.3.2. Level 1

The first two lines of the compilation log correspond to the getName and getAuthor methods of the Article class:

1023  788       1       com.cncounter.demo.compile.Article::getName (5 bytes)
1025  789       1       com.cncounter.demo.compile.Article::getAuthor (5 bytes)

These two getter methods are trivial and leave little room for optimization.

Recall from above:

Level 1 represents simple C1 compiled code; the JVM uses Level 1 for trivial methods.

6.3.3. Level 3

The next log entries are at level 3; levels 1 to 3 all correspond to the C1 compiler.

1032  800       3       com.cncounter.demo.compile.JsonFormatter::<init> (5 bytes)
1032  801       3       com.cncounter.demo.compile.Article::<init> (15 bytes)
1041  820       3       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)

The <init> method is generated by the Java source compiler by merging the constructor with the instance initializer blocks; it is invoked automatically when an object is created.

The format method of the JsonFormatter class also reaches level 3.

6.3.4. Level 4

The next compilation log is level 4, which corresponds to the C2 compiler.

1122  903       4       com.cncounter.demo.compile.JsonFormatter::<init> (5 bytes)
1123  800       3       com.cncounter.demo.compile.JsonFormatter::<init> (5 bytes)   made not entrant
1123  904       4       com.cncounter.demo.compile.Article::<init> (15 bytes)
1124  801       3       com.cncounter.demo.compile.Article::<init> (15 bytes)   made not entrant

The methods involved are the <init> methods, and this part of the log is a bit more interesting.

Look at it closely and you will find that each level 4 compilation entry is followed by an entry that marks the lower-level version as no longer enterable (made not entrant).

The reason is easy to understand: once a method has been promoted to a higher level, the old compiled code is obsolete.
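The pairing of a new compilation with a later "made not entrant" line can be followed mechanically through the compilation ID in the second column. The following is a rough sketch, assuming that the second column identifies a compilation and that a "made not entrant" line reuses the ID of the compilation it invalidates; the class LiveCompilations is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: which compilations are still enterable? */
final class LiveCompilations {
    /** Returns compile id -> log line for entries not yet made not entrant. */
    static Map<Integer, String> track(String... logLines) {
        Map<Integer, String> live = new TreeMap<>();
        for (String line : logLines) {
            String trimmed = line.trim();
            // The second whitespace-separated column is the compilation ID.
            int id = Integer.parseInt(trimmed.split("\\s+")[1]);
            if (trimmed.endsWith("made not entrant")) {
                live.remove(id);        // the old version is retired
            } else {
                live.put(id, trimmed);  // a new compiled version is installed
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Map<Integer, String> live = track(
            "1032  800       3       com.cncounter.demo.compile.JsonFormatter::<init> (5 bytes)",
            "1032  801       3       com.cncounter.demo.compile.Article::<init> (15 bytes)",
            "1041  820       3       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)",
            "1122  903       4       com.cncounter.demo.compile.JsonFormatter::<init> (5 bytes)",
            "1123  800       3       com.cncounter.demo.compile.JsonFormatter::<init> (5 bytes)   made not entrant",
            "1123  904       4       com.cncounter.demo.compile.Article::<init> (15 bytes)",
            "1124  801       3       com.cncounter.demo.compile.Article::<init> (15 bytes)   made not entrant");
        // Remaining entries: ids 820, 903 and 904.
        live.values().forEach(System.out::println);
    }
}
```

Fed with the level 3 and level 4 lines quoted above, only the level 3 format compilation and the two level 4 <init> compilations remain, which matches the reading that the level 3 <init> versions were replaced.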

6.3.5. On-stack replacement

The next compilation log entries are at level 3 again.

1132  932 %     3       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)
1133  933       3       com.cncounter.demo.compile.TieredCompilation::main (58 bytes)

The percent sign (%) here indicates an on-stack replacement (OSR) compilation.

The main method is being compiled while a thread is still executing it, so the running code has to be replaced on the stack.
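A method that is invoked once but spends a long time in a loop is the typical candidate for on-stack replacement. The sketch below is a hypothetical illustration (OsrLoopDemo and hotLoop are made-up names, not the article's code). Whether and when a % entry appears for it depends on the JVM version, flags, and compilation thresholds, so no log output is claimed here.

```java
/** Hypothetical demo: one call, one long-running loop. */
final class OsrLoopDemo {
    /** Sums 0..n-1 in a single loop; the loop's back-edge counter grows with n. */
    static long hotLoop(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // hotLoop is called only once, so waiting for the next invocation
        // would never let compiled code run; swapping the running frame
        // for compiled code mid-loop is what OSR is for.
        System.out.println(hotLoop(50_000_000));
    }
}
```

Running it with java -XX:+PrintCompilation OsrLoopDemo and searching the output for hotLoop is one way to check whether an entry with the % attribute is produced on a given JVM.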

6.3.6. Level 4 and on-stack replacement

The next compilation log entries are at level 4.

1146  905       4       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)
1281  820       3       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)   made not entrant
1281  934 %     4       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)
1285  932 %     3       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)   made not entrant

The JsonFormatter::format method is promoted to level 4, as described earlier.

The OSR version of TieredCompilation::main is also promoted to level 4, and its level 3 version is marked as made not entrant.

6.3.7. Deoptimization

Then something unexpected happened.

1346  934 %     4       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)   made not entrant
1350  933       3       com.cncounter.demo.compile.TieredCompilation::main (58 bytes)   made not entrant
1361  905       4       com.cncounter.demo.compile.JsonFormatter::format (8 bytes)   made not entrant

The TieredCompilation::main method is deoptimized by the JVM.

Looking back at the Java code of the main method, we see that after the loop has run 500,000 times, the result of the if condition changes.

C2 had probably applied aggressive optimizations, such as pruning the branch that had never been taken, based on optimistic assumptions. Once those assumptions no longer hold, the JVM discards the optimized code and falls back to interpreted execution.
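The kind of code that invites such speculation can be sketched as follows. This is not the article's sample program: DeoptSketch, Op, Inc, and Dec are hypothetical names, and whether a particular JVM actually deoptimizes here depends on its version and flags. The idea is that for the first part of the loop only one branch and one implementation are ever seen, and then the condition flips.

```java
/** Hypothetical sketch of a loop whose behavior changes halfway through. */
final class DeoptSketch {
    interface Op { int apply(int x); }

    static final class Inc implements Op {
        public int apply(int x) { return x + 1; }
    }

    static final class Dec implements Op {
        public int apply(int x) { return x - 1; }
    }

    /** Applies Inc for the first switchAt iterations and Dec afterwards. */
    static int run(int total, int switchAt) {
        Op inc = new Inc();
        Op dec = new Dec();
        int value = 0;
        for (int i = 0; i < total; i++) {
            // Up to switchAt the profile sees only the first branch and only
            // Inc at the call site, which a compiler may optimistically rely on.
            Op op = (i < switchAt) ? inc : dec;
            value = op.apply(value);
        }
        return value;
    }

    public static void main(String[] args) {
        // 500,000 increments followed by 500,000 decrements: prints 0
        System.out.println(run(1_000_000, 500_000));
    }
}
```

The result is the same whether or not a deoptimization happens; only the path taken through interpreter and compiled code differs, and that is what the "made not entrant" lines in the log reflect.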

6.3.8. Level 2

Next come the compilations of the XmlFormatter class methods.

1543 1228       2       com.cncounter.demo.compile.XmlFormatter::<init> (5 bytes)
1546 1235       2       com.cncounter.demo.compile.XmlFormatter::format (8 bytes)

To recap, level 2 is limited C1-compiled code.

At level 2, the JVM compiles the code with the C1 compiler and collects only light profiling information.

Perhaps the compilation queue was full, or perhaps the JVM was being cautious after the deoptimization that had just happened; either way, it used C1 to compile XmlFormatter::format quickly at level 2.

In this run, the methods of this class were not optimized any further before the program exited.
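As a compact reminder of how the level column maps to the two compilers, the levels can be written down as an enum. This is only a reading aid: the enum Tier and its constant names are hypothetical and are not a JVM API.

```java
/** Hypothetical enum summarizing the five tiers seen in the level column. */
enum Tier {
    INTERPRETER(0, "interpreter"),       // level 0: interpreted code
    C1_SIMPLE(1, "C1"),                  // level 1: simple C1-compiled code
    C1_LIMITED_PROFILE(2, "C1"),         // level 2: limited C1-compiled code
    C1_FULL_PROFILE(3, "C1"),            // level 3: full C1-compiled code
    C2_FULL_OPTIMIZATION(4, "C2");       // level 4: C2-compiled code

    final int level;
    final String compiler;

    Tier(int level, String compiler) {
        this.level = level;
        this.compiler = compiler;
    }

    /** Maps the numeric level column of a log line to a tier. */
    static Tier fromLogColumn(int level) {
        for (Tier t : values()) {
            if (t.level == level) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown level: " + level);
    }

    public static void main(String[] args) {
        // The levels in the order they show up in the sections above.
        for (int level : new int[] {1, 3, 4, 2}) {
            Tier t = fromLogColumn(level);
            System.out.println(level + " -> " + t + " (" + t.compiler + ")");
        }
    }
}
```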

6.3.9. Re-optimize

After running for a while longer, the main method is promoted again.

1561 1298 %     3       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)
1577 1310       3       com.cncounter.demo.compile.TieredCompilation::main (58 bytes)
1935 1324 %     4       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)
1939 1298 %     3       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)   made not entrant

Here, a level 3 OSR compilation and a regular level 3 compilation happen first.

Then another level 4 OSR compilation follows, and the lower-level OSR version is marked as made not entrant.

6.3.10. Method Exit

Finally, when the main method finishes executing, its stack frame no longer exists.

2258 1324 %     4       com.cncounter.demo.compile.TieredCompilation::main @ 2 (58 bytes)   made not entrant

So the level 4 OSR version is also marked as made not entrant.

7. Summary

This article gave a brief introduction to tiered compilation in the JVM.

It covered the two types of JIT compilers and how tiered compilation combines them for the best results.

It also described the five compilation levels, along with the related JVM tuning parameters.

Finally, it worked through a concrete case: by printing and analyzing the compilation log, it traced the entire life cycle of Java method compilation and optimization.

For related sample code, you can also refer to: https://github.com/eugenp/tutorials/tree/master/core-java-modules/core-java-lang-4

Origin blog.csdn.net/renfufei/article/details/132190834