Impact of JVM escape analysis on performance

The basic job of escape analysis is to determine the dynamic scope of an object. When an object is created inside a method, it may become reachable from other methods, for example by being returned or assigned to a shared field; this is called method escape. It may even become accessible to other threads, for example by being assigned to a class variable or to an instance variable reachable from other threads; this is called thread escape.

Several ways in which method escape can occur:

public class EscapeTest {
    public static Object obj;

    public void globalVariableEscape() { // assignment to a global variable: escape occurs
        obj = new Object();
    }

    public Object methodEscape() { // method return value: escape occurs
        return new Object();
    }

    public void instanceEscape() { // passing 'this' to another method: escape occurs
        test(this);
    }

    private void test(EscapeTest t) { // stub so the example compiles
    }
}
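For contrast, here is a minimal sketch (my own example, not from the original article) of a method where the object does not escape, making it a candidate for the optimizations described below:

```java
public class NoEscapeTest {
    // The array created here never leaves the method: it is not stored in a
    // field, not returned, and not passed to another method, so escape
    // analysis can prove it does not escape.
    public static int noEscape() {
        byte[] b = new byte[2]; // candidate for stack allocation / scalar replacement
        b[0] = 1;
        return b[0];
    }

    public static void main(String[] args) {
        System.out.println(noEscape()); // prints 1
    }
}
```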

Stack allocation
Stack allocation means allocating objects that are used only within a method on the stack instead of the heap. When the method finishes executing, the memory is reclaimed automatically along with the stack frame, with no garbage-collection work needed, which improves performance.

Synchronization (lock) elimination
Thread synchronization is itself expensive. If escape analysis determines that an object cannot escape its thread and therefore cannot be accessed by other threads, there can be no contention on reads and writes of that object, so any synchronization on it can be eliminated. In a single thread there is no lock contention: when the lock object and the contents of the synchronized block cannot escape the thread, the synchronization can be removed.
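A classic illustration of this (a sketch of my own, using the JDK's StringBuffer, whose append() methods are synchronized):

```java
public class LockElisionSketch {
    // sb is a method-local object that never escapes, so the JIT can prove
    // that no other thread can ever lock it, and elide the synchronization
    // performed inside every synchronized StringBuffer.append() call.
    public static String concat(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        sb.append(b);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat("foo", "bar")); // prints foobar
    }
}
```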

Scalar replacement
Data types that the Java virtual machine cannot decompose further, such as the primitive types (int, long, and the other numeric types) and reference types, are called scalars. Conversely, a piece of data that can be decomposed further is called an aggregate; the most typical aggregate in Java is an object. If escape analysis proves that an object will not be accessed outside a method and the object can be decomposed, then at execution time the JVM may not create the object at all, and may instead directly create the member variables of the object that the method uses. The disassembled variables can then be analyzed and optimized individually and be given space in the stack frame or in registers, so the original object never needs to be allocated as a whole.
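As an illustration (using a hypothetical Point class, not from the original article), scalar replacement conceptually rewrites the first method below into something like the second:

```java
public class ScalarReplacementSketch {
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // As written: allocates a Point that never escapes the method.
    public static int sumWithObject() {
        Point p = new Point(1, 2);
        return p.x + p.y;
    }

    // Roughly what the JIT executes after scalar replacement: the aggregate
    // is decomposed into its scalar fields, and no object is created at all.
    public static int sumAfterReplacement() {
        int x = 1; // was p.x
        int y = 2; // was p.y
        return x + y;
    }

    public static void main(String[] args) {
        System.out.println(sumWithObject());       // prints 3
        System.out.println(sumAfterReplacement()); // prints 3
    }
}
```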

In-depth analysis of stack allocation
public class OnStackTest {
    public static void alloc() {
        byte[] b = new byte[2];
        b[0] = 1;
    }

    public static void main(String[] args) {
        long b = System.currentTimeMillis();
        for (int i = 0; i < 100000000; i++) {
            alloc();
        }
        long e = System.currentTimeMillis();
        System.out.println(e - b); // elapsed time in milliseconds
    }
}

-XX:+DoEscapeAnalysis enables escape analysis (enabled by default in JDK 1.8; other versions not tested)
-XX:-DoEscapeAnalysis disables escape analysis

With escape analysis enabled, the execution time is 4 milliseconds.


With escape analysis disabled, the execution time is 618 milliseconds, and the run produces a large amount of GC log output.


**Comparing the runs with escape analysis on and off: when escape analysis is enabled, the object is not allocated on the heap and no GC occurs; the allocation effectively happens on the stack. When escape analysis is disabled, all objects are allocated on the heap; when the heap fills up, multiple GCs run, greatly extending the execution time. Heap allocation here is hundreds of times slower than stack allocation.**

Just-in-time compilation (JIT)
1. With the client compiler, a method is considered hot code after 1,500 invocations by default.
2. With the server compiler, a method is considered hot code after 10,000 invocations by default.

In the example above, even with escape analysis enabled, not all objects are allocated directly on the stack. First the JIT identifies the code as hot, then compiles it asynchronously into native machine code, and only then, guided by escape analysis, places the allocation on the stack. (With the server compiler: during the first 10,000 iterations, while compilation to native code is in progress, the object is allocated on the heap; once compilation finishes, the allocation moves to the stack.)

-XX:+EliminateAllocations enables scalar replacement (enabled by default in JDK 1.8; other versions not tested)
-XX:-EliminateAllocations disables scalar replacement.
Scalar replacement depends on escape analysis.

This time we keep escape analysis enabled but turn scalar replacement off. The objects are allocated on the heap again and multiple GCs run. This shows that Java does not implement true stack allocation; it achieves the effect of stack allocation through scalar replacement.

In-depth analysis of lock elimination
We slightly modify the OnStackTest code above and add a synchronized block. By default, an array longer than 64 elements is not eligible for stack allocation, so here the array is heap-allocated, which lets us test the effect of lock elimination in isolation.

public class OnStackTest {
    public static void alloc() {
        byte[] b = new byte[65];
        synchronized (b) { // synchronized block on a non-escaping local object
            b[0] = 1;
        }
    }

    public static void main(String[] args) {
        long b = System.currentTimeMillis();
        for (int i = 0; i < 100000000; i++) {
            alloc();
        }
        long e = System.currentTimeMillis();
        System.out.println(e - b); // elapsed time in milliseconds
    }
}

-XX:+EliminateLocks enables lock elimination (enabled by default in JDK 1.8; other versions not tested)
-XX:-EliminateLocks disables lock elimination.
Lock elimination depends on escape analysis, so enabling it requires escape analysis to be enabled.

With lock elimination enabled, the execution time is 1807 milliseconds.
With lock elimination disabled, the execution time is 3801 milliseconds.
Comparing the two runs, lock elimination roughly doubles performance here.


Origin: www.cnblogs.com/zhuyeshen/p/12735782.html