"In-depth Understanding of the Java Virtual Machine" reading notes (8): late (runtime) optimization, part 2

One, common sub-expression elimination

  If an expression E has already been computed, and the values of all variables in E have not changed since that computation, then this occurrence of E is a common sub-expression. There is no need to spend time computing it again; the compiler can simply replace E with the result of the earlier computation. If the optimization is limited to a single basic block, it is called local common sub-expression elimination; if its scope covers multiple basic blocks, it is called global common sub-expression elimination.
  For the following code:

int d = (c * b) * 12 + a + (a + b * c);

  When the just-in-time compiler sees this code, it may detect that "c * b" and "b * c" are the same expression and that the values of b and c do not change during the calculation. Writing E for that product, the expression may therefore be treated as:

int d = E * 12 + a + (a + E);

  At this point the compiler may also apply another optimization, algebraic simplification, which turns the expression into:

int d = E * 13 + a * 2;
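  The two rewrites above can be checked by hand. The following is only a sketch that applies them manually in source code; the class and method names are made up for illustration, and a real just-in-time compiler performs these steps on its internal representation rather than on Java source.

```java
// Hand-applied illustration; CseDemo and its method names are hypothetical.
class CseDemo {
    // Original form: the product of b and c is computed twice.
    static int original(int a, int b, int c) {
        return (c * b) * 12 + a + (a + b * c);
    }

    // After common sub-expression elimination: the product is computed once as e.
    static int afterCse(int a, int b, int c) {
        int e = c * b;
        return e * 12 + a + (a + e);
    }

    // After algebraic simplification: e * 12 + e becomes e * 13, a + a becomes a * 2.
    static int afterSimplification(int a, int b, int c) {
        int e = c * b;
        return e * 13 + a * 2;
    }

    public static void main(String[] args) {
        int a = 3, b = 5, c = 7;
        // All three lines print the same number.
        System.out.println(original(a, b, c));
        System.out.println(afterCse(a, b, c));
        System.out.println(afterSimplification(a, b, c));
    }
}
```

  Because Java int arithmetic wraps around on overflow, the three forms agree for every input, not only for small values.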

Two, array bounds check elimination

  When array elements are accessed in Java, the system automatically checks the index against the lower and upper bounds: for an array named array, an access to array[i] requires i >= 0 && i < array.length, otherwise a runtime exception, ArrayIndexOutOfBoundsException, is thrown. This is good for developers, but for the virtual machine's execution subsystem it means an implicit conditional test on every array read and write, which is a performance burden for code with a large number of array accesses.
  For safety, array bounds checks must be performed, but they do not necessarily have to be performed one by one at run time on every access. For example, when an array is accessed inside a loop using the loop variable as the index, and the compiler can determine through data-flow analysis that the loop variable always stays within [0, array.length), then both the upper and lower bounds checks can be eliminated for the entire loop, saving many conditional tests.
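  As a rough illustration of the loop case, the first method below writes out the check that the virtual machine performs implicitly on each access, and the second is the ordinary loop for which that check is provably redundant. The names are hypothetical, and whether a particular JIT actually removes the check for a given loop is an assumption, not something this sketch demonstrates.

```java
// Hypothetical names; the explicit check only models what the VM does implicitly.
class BoundsCheckDemo {
    // Each access is preceded by the range test i >= 0 && i < array.length.
    static int sumWithExplicitChecks(int[] array) {
        int sum = 0;
        for (int i = 0; i < array.length; i++) {
            if (i < 0 || i >= array.length) {
                throw new ArrayIndexOutOfBoundsException(i);
            }
            sum += array[i];
        }
        return sum;
    }

    // i starts at 0, only increases, and the loop condition keeps i < array.length,
    // so the range test can never fail and a compiler may drop it for the whole loop.
    static int sumPlainLoop(int[] array) {
        int sum = 0;
        for (int i = 0; i < array.length; i++) {
            sum += array[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sumWithExplicitChecks(data));
        System.out.println(sumPlainLoop(data));
    }
}
```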

Three, implicit exception handling

  Bounds checks are not the only safety checks: accessing a null reference yields a NullPointerException, dividing by zero yields an ArithmeticException, and so on. In C/C++ programs the same kinds of mistakes are not checked for you, and a little carelessness can make the program crash and exit. These safety checks are an implicit overhead, however: for the same program, Java has to do more work than C/C++, and if the overhead is not handled well it can become one of the reasons Java runs slower than C/C++.
  Moving some of the array bounds checks ahead to compile time, as described above, is one way to reduce this overhead. Another way is implicit exception handling; the null check in Java and the divide-by-zero check in arithmetic operations both use this idea.
  For example, if a program accesses a field of an object obj, the virtual machine's access process can be expressed in Java pseudocode as follows:

if (obj != null) {
    return obj.value;
} else {
    throw new NullPointerException();
}

  After using implicit exception optimization, the virtual machine turns the access process represented by the above pseudo code into:

try {
    return obj.value;
} catch (segment_fault) {
    uncommon_trap();
}

  The virtual machine registers an exception handler for the Segment Fault signal (uncommon_trap() in the pseudocode above). When obj is not null, the access to value costs nothing extra for a null check on obj. The price is that when obj really is null, control must transfer to the exception handler, which recovers and throws the NullPointerException. This requires the process to switch from user mode to kernel mode and back to user mode once handling finishes, which is much slower than a single null check. If obj is rarely null, the implicit exception optimization is worthwhile; if obj is often null, it is counterproductive. The virtual machine automatically chooses the better scheme based on the profile information collected at run time.
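  From the program's point of view the two schemes are indistinguishable, which is what makes the optimization legal. The sketch below, with hypothetical class and method names, shows the explicit check written in source next to a plain field access that leaves the check to the virtual machine; how the check is actually implemented underneath (explicit test or signal handler) is not visible from Java code.

```java
// Hypothetical names; this only shows that both forms behave the same to the caller.
class ImplicitNullCheckDemo {
    static class Holder {
        int value = 42;
    }

    // The null check is spelled out, as in the first pseudocode listing.
    static int readWithExplicitCheck(Holder obj) {
        if (obj != null) {
            return obj.value;
        }
        throw new NullPointerException();
    }

    // No check in source; the VM is responsible for raising NullPointerException.
    static int readPlain(Holder obj) {
        return obj.value;
    }

    // Helper that reports whether the plain read raised NullPointerException.
    static boolean plainReadThrowsNpe(Holder obj) {
        try {
            readPlain(obj);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithExplicitCheck(new Holder()));
        System.out.println(readPlain(new Holder()));
        System.out.println(plainReadThrowsNpe(null));
    }
}
```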

Four, method inlining

  Because Java methods are selected polymorphically, for a virtual method the compiler cannot determine which method version will be called, and therefore which version to inline. (For static and dynamic dispatch of methods, see the earlier note on the virtual machine bytecode execution engine, part 1.)
  Instance methods in Java are virtual by default, and Java indirectly encourages programmers to use virtual methods to express program logic. To solve the problem of inlining virtual methods, the JVM design team first introduced a technique called Class Hierarchy Analysis (CHA). It is used to determine, among the classes loaded so far, whether an interface has more than one implementation, whether a class has subclasses, whether a subclass is abstract, and similar information.
  When the compiler inlines, a non-virtual method is simply inlined directly. When it encounters a virtual method, it asks CHA whether the method has more than one target version in the program's current state. If there is only one version, the method can also be inlined. This kind of inlining is an aggressive optimization and needs an "escape hatch" in reserve; it is called guarded inlining. If the state never changes, the inlined code can be used indefinitely, but if a new class is loaded that changes the inheritance relationship, the compiled code must be discarded, and execution either falls back to the interpreter or the code is recompiled.
  If CHA reports multiple versions of the target method, the compiler makes one last effort and uses an inline cache to complete the inlining. Roughly, it works like this: the inline cache starts out empty. When the first call occurs, the cache records the version information of the method receiver, and on every later call the receiver's version is compared with the cached one. As long as the receiver version is the same on every call, the inlined code can keep being used. If the receiver versions differ, the program really is using the polymorphism of virtual methods; only then is the inlining cancelled and the call dispatched through the virtual method table.
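  The inline cache behavior described above can be modelled in plain Java. This is a toy model under stated assumptions, not HotSpot's implementation: CallSite, cachedReceiverType, inliningCancelled and fastPathHits are hypothetical names, the receiver's class stands in for the "version information", and the fast path only counts hits instead of running an inlined body.

```java
// Toy model of a single call site of shape.area(); all names are hypothetical.
class InlineCacheDemo {
    interface Shape {
        int area();
    }

    static final class Square implements Shape {
        private final int side;
        Square(int side) { this.side = side; }
        @Override public int area() { return side * side; }
    }

    static final class Rect implements Shape {
        private final int w, h;
        Rect(int w, int h) { this.w = w; this.h = h; }
        @Override public int area() { return w * h; }
    }

    static final class CallSite {
        Class<?> cachedReceiverType;  // null means the cache is still empty
        boolean inliningCancelled;    // set once a different receiver type shows up
        int fastPathHits;             // calls whose receiver matched the cached type

        int invokeArea(Shape receiver) {
            if (!inliningCancelled) {
                if (cachedReceiverType == null) {
                    // First call: record the receiver's type in the cache.
                    cachedReceiverType = receiver.getClass();
                } else if (cachedReceiverType == receiver.getClass()) {
                    // Same type as cached: compiled code would run the inlined body here.
                    fastPathHits++;
                } else {
                    // Different type: the site is really polymorphic, so inlining is cancelled.
                    inliningCancelled = true;
                }
            }
            // In this model every path ends in an ordinary virtual call, so results never change.
            return receiver.area();
        }
    }

    public static void main(String[] args) {
        CallSite site = new CallSite();
        System.out.println(site.invokeArea(new Square(3)));
        System.out.println(site.invokeArea(new Square(4)));
        System.out.println(site.invokeArea(new Rect(2, 5)));
        System.out.println("fast-path hits: " + site.fastPathHits
                + ", inlining cancelled: " + site.inliningCancelled);
    }
}
```

  The point of the design is that the common case pays only one type comparison, while correctness never depends on the cache: the slow path is always an ordinary virtual dispatch.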

Five, escape analysis

  The basic behavior of escape analysis is to analyze the dynamic scope of an object: after an object is defined in a method, it may be referenced by outside methods, for example by being passed as an argument to another method or returned to the caller, which is called method escape. It may even be accessed by other threads, for example by being assigned to a class variable or to an instance variable reachable from other threads, which is called thread escape.
  If it can be proven that an object does not escape its method or thread, meaning that no other method or thread can access the object in any way, then some efficient optimizations can be applied to it:

  • Stack allocation: If it is certain that an object will not escape the method, the object can be allocated directly in the stack frame, so that the memory it occupies is reclaimed when the stack frame is popped, reducing pressure on the GC.
  • Synchronization elimination: If a variable does not escape the thread and so cannot be accessed by other threads, there can be no contention on reads and writes of that variable, and the synchronization applied to it can be eliminated.
  • Scalar replacement: A scalar is a piece of data that cannot be decomposed into anything smaller. Primitive data types (numeric types such as int and long, as well as the reference type) cannot be decomposed further and are scalars. Data that can be decomposed further is called an aggregate; a Java object is the typical example. Breaking a Java object apart and, based on how the program accesses it, turning the member variables it uses back into primitive types is called scalar replacement. If escape analysis proves that an object cannot be accessed from outside and the object can be broken apart, then at run time the program may not create the object at all, and may instead directly create the member variables that the method uses. Once the object is split up, its member variables can be allocated, read, and written on the stack, and this also creates conditions for further optimizations.
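  The following sketch puts the terms above side by side. It is written by hand under the assumption that a JIT would treat these shapes the way the text describes; the class, field and method names are hypothetical, and nothing here forces or proves that a given virtual machine performs the optimization.

```java
// Hypothetical names; the "replaced" method is a manual rendering of scalar replacement.
class EscapeDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // p is never passed out, returned, or stored in a field: it does not escape.
    static int distanceSquared(int x, int y) {
        Point p = new Point(x, y);
        return p.x * p.x + p.y * p.y;
    }

    // Conceptual result of scalar replacement: no Point, only its two int members.
    static int distanceSquaredScalarReplaced(int x, int y) {
        int px = x;
        int py = y;
        return px * px + py * py;
    }

    // The lock object is visible to this thread only, so the synchronization
    // is a candidate for synchronization elimination.
    static int incrementUnderPrivateLock(int v) {
        Object lock = new Object();
        synchronized (lock) {
            return v + 1;
        }
    }

    static Point leaked;  // class variable, reachable from other threads

    // Counter-example: p escapes both its thread (static field) and its method (return).
    static Point escapes(int x, int y) {
        Point p = new Point(x, y);
        leaked = p;
        return p;
    }

    public static void main(String[] args) {
        System.out.println(distanceSquared(3, 4));
        System.out.println(distanceSquaredScalarReplaced(3, 4));
        System.out.println(incrementUnderPrivateLock(1));
        System.out.println(escapes(1, 2) == leaked);
    }
}
```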

  There is no guarantee that the performance benefit of escape analysis exceeds its cost. If the analysis finishes and few objects turn out not to escape, the time spent on it is wasted. Users can turn on escape analysis with -XX:+DoEscapeAnalysis and, once it is on, view the analysis results with -XX:+PrintEscapeAnalysis. With escape analysis enabled, -XX:+EliminateAllocations turns on scalar replacement, -XX:+EliminateLocks turns on synchronization elimination, and -XX:+PrintEliminateAllocations shows the scalar replacements performed.

Six, Java and C/C++ compiler comparison

  When Java first appeared, it was executed mainly by an interpreter, and its slow execution was real; with the development of just-in-time compilation, performance has improved greatly. Compared with the static optimizing compilers of C/C++, however, the just-in-time compiler of the Java virtual machine may still be at a disadvantage, for the following reasons:

  • The just-in-time compiler uses up the running time of the user program, so it is under heavy time pressure, and the optimizations it can afford are tightly constrained by compilation cost.
  • Java is a dynamically type-safe language. The virtual machine frequently performs dynamic checks, such as null checks, bounds checks on array accesses, and inheritance checks on type casts. Although the compiler tries to optimize them away, they still consume a good deal of running time overall.
  • Virtual methods are used far more often in Java than in C/C++, so polymorphic selection on the method receiver happens more often at run time, and optimization (such as the method inlining described above) is harder.
  • Java is a dynamically extensible language. Loading new classes at run time may change the inheritance relationships between program types, which makes many global optimizations hard to perform; many of them can only be done as aggressive optimizations.
  • Object allocation and garbage collection in Java are less efficient than stack allocation (which program code can control) and manual memory release in C/C++.

  But there are also optimizations that Java's just-in-time compiler can perform and a C/C++ static optimizing compiler cannot. For example, alias analysis is much harder in C/C++ than in Java, which benefits from its type safety. In addition, because a C/C++ compiler completes all of its optimizations at compile time, it cannot perform optimizations driven by run-time performance monitoring, such as call frequency prediction, branch frequency prediction, and untaken branch pruning.

Origin blog.csdn.net/huangzhilin2015/article/details/115258649