In-Depth Understanding of the Java Virtual Machine: Lock Optimization & Escape Analysis Techniques

Introduction

As HotSpot evolved from version 1.5 to 1.6, the virtual machine team implemented a large number of lock optimization techniques, and the corresponding JDK 6 also introduced a number of concurrent containers and APIs, so JDK 6 is a key release for efficient concurrency. This article describes the Java virtual machine's lock optimization and escape analysis techniques.

Lock optimization: adaptive spinning, lock elimination, lock coarsening, lightweight locks, and biased locks

Escape analysis: stack allocation, synchronization elimination, scalar replacement, etc.

Theoretical basis

Before introducing lock optimization and escape analysis, it is necessary to review some basic concepts.

· A synchronized method is locked implicitly through the ACC_SYNCHRONIZED flag, while a synchronized code block acquires and releases the lock through the monitorenter and monitorexit instructions. Before lock optimization techniques matured, synchronized was implemented as a "heavyweight lock" operation that called ObjectMonitor's enter and exit directly.

· The object header mainly contains the GC generational age, lock status flags, hash code, epoch, and other information. An object can be in one of five states: unlocked, biased, lightweight-locked, heavyweight-locked, and GC-marked.

· The HotSpot virtual machine uses the oop-klass model to represent objects.


Figure reproduced from https://blog.csdn.net/linxdcn/article/details/73287490

· A thread has five basic states; locking, thread switching, and thread state are closely related.

A thread has five states in total: new (New), ready (Runnable), running (Running), blocked (Blocked), and dead (Dead).


Figure reproduced from https://www.cnblogs.com/aspirant/p/8900276.html

· Java multithreading models

• Kernel threads, 1:1

• User threads, 1:n

• User threads plus lightweight processes, a hybrid m:n
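
The two forms of synchronized mentioned in the first concept above can be sketched as follows. The class name `SyncForms` and its methods are hypothetical, chosen only for illustration; the comments describe what javac is expected to emit for each form.

```java
// Two ways to use synchronized on the same monitor (the monitor of `this`).
class SyncForms {
    private int counter = 0;

    // Synchronized method: the method is flagged ACC_SYNCHRONIZED in the class
    // file, and the JVM acquires the monitor implicitly on entry.
    synchronized void incrementMethod() {
        counter++;
    }

    // Synchronized block: javac emits an explicit monitorenter before the body
    // and a monitorexit on every exit path.
    void incrementBlock() {
        synchronized (this) {
            counter++;
        }
    }

    synchronized int get() {
        return counter;
    }

    // Runs both forms from two threads and returns the final count.
    static int demo() {
        SyncForms s = new SyncForms();
        Runnable work = () -> {
            for (int i = 0; i < 1_000; i++) {
                s.incrementMethod();
                s.incrementBlock();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 4000
    }
}
```

After compiling, running `javap -c -v SyncForms` should show the ACC_SYNCHRONIZED flag on incrementMethod and the monitorenter/monitorexit pair inside incrementBlock.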


Lock optimization

Spinlocks

From Java's multithreading model and the way thread states switch under mutual exclusion, we know that blocking has the greatest impact on performance: suspending and resuming a thread requires a switch into OS kernel mode. The virtual machine team also observed that in many applications the locked state of shared data lasts only a very short time, and suspending and resuming threads for such a short period is not worth the cost. If the physical machine has more than one processor, so that threads can run in parallel, we can let the later thread that requests the lock "wait a moment" without giving up its processor time, to see whether the thread holding the lock releases it soon. This "wait a moment" is spinning.

Spin locks were introduced in JDK 1.4 but were turned off by default; JDK 6 enables them by default.

In JDK 1.4, spinning is turned on with the -XX:+UseSpinning parameter, and -XX:PreBlockSpin changes the number of spins; the default is 10.
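
The "wait a moment" idea can be illustrated with a toy lock written in plain Java. This is only a sketch under stated assumptions: the class `SpinThenParkLock` and the constant `MAX_SPINS` are hypothetical, and the code does not reproduce how HotSpot spins inside synchronized. It spins a bounded number of times on a CAS and only then gives up the processor.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Toy lock: spin a bounded number of times, then back off by parking briefly.
class SpinThenParkLock {
    // Hypothetical spin budget, chosen to mirror the default of 10 quoted above.
    private static final int MAX_SPINS = 10;
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        int spins = 0;
        // compareAndSet(false, true) succeeds only when no thread holds the lock.
        while (!held.compareAndSet(false, true)) {
            if (spins < MAX_SPINS) {
                spins++;
                Thread.onSpinWait();           // stay on the CPU (Java 9+)
            } else {
                LockSupport.parkNanos(1_000L); // give up the CPU for a moment
            }
        }
    }

    void unlock() {
        held.set(false);
    }

    // Two threads increment a shared counter under the lock.
    static int demo() {
        SpinThenParkLock lock = new SpinThenParkLock();
        int[] counter = {0};
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                lock.lock();
                try {
                    counter[0]++;
                } finally {
                    lock.unlock();
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 20000
    }
}
```

Spinning pays off only when the lock is held briefly; with long critical sections the spin budget is wasted processor time, which is why the budget is bounded.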

Adaptive spin locks

JDK 6 introduced adaptive spinning, so we no longer need to specify a fixed spin count; the virtual machine chooses the spin time with its own strategy. As the program runs and performance monitoring information accumulates, the virtual machine's predictions become more and more accurate, and it becomes "smarter".

The difference between spinning and blocking is whether the thread gives up its processor time. Spinning is better suited to scenarios with little contention and short lock hold times.


Lock elimination

Lock elimination is one of the JIT compiler's optimizations, and it also relies on the escape analysis discussed below.

As the name suggests, when the JIT compiler compiles a synchronized block, it uses escape analysis to determine whether the lock object, for example a local variable, can only ever be accessed by one thread. A typical case is code that appends to a StringBuffer declared inside a method.



In such code, the methods of StringBuffer are synchronized internally, but the StringBuffer is a local variable of the code block and therefore private to the thread. The same holds for a synchronized block whose lock object is a local variable. In these situations the lock is completely meaningless, so the JIT eliminates it for us.
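
A minimal sketch of the two situations described here, assuming hypothetical method names `concat` and `lockOnLocal`. The snippet only shows code that is a candidate for lock elimination; it does not prove that a given JIT actually removes the locks.

```java
class LockEliminationExample {
    // sb never escapes concat(): no other thread can ever see it, so the
    // locking inside StringBuffer.append() protects nothing.
    static String concat(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a); // append() is a synchronized method
        sb.append(b);
        return sb.toString(); // returns a new String, not sb itself
    }

    // Same situation with an explicit block: the lock object is a fresh
    // local object that only the current thread can reach.
    static int lockOnLocal(int x) {
        Object lock = new Object();
        synchronized (lock) {
            return x + 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(concat("lock ", "elimination")); // prints lock elimination
        System.out.println(lockOnLocal(1));                  // prints 2
    }
}
```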

Because synchronized is implemented with the monitor instructions, some readers may want to use javap to check whether the lock has really been eliminated. Note that this optimization happens in the JIT compilation phase, so javap does not show the result. Interested readers can still observe it, but it is a little complicated: first build your own fastdebug version of the JDK, then add the -XX:+PrintEliminateLocks parameter when running the .class file with the java command. The JDK must also run in server mode.


Lock coarsening

Lock refinement is a familiar topic: controlling lock granularity improves performance, so a lock should cover only the region where a real race condition occurs.

So why is there such a thing as lock coarsening? Quite simply, we have all met a similar situation.

When you write try-catch exception handling around an operation that runs in a loop, do you put the try-catch inside the loop or outside it?



This is the point of lock coarsening: when the JIT finds that a series of consecutive operations repeatedly locks and unlocks the same object, or that the locking even occurs inside a loop body, it extends (coarsens) the scope of the lock to cover the entire sequence of operations.
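
The effect of coarsening can be pictured with the following sketch. The method names are hypothetical, and `appendCoarsened` is a hand-written approximation of what the transformation amounts to, not output produced by any JIT.

```java
class LockCoarseningExample {
    // As written: every append() locks and unlocks sb, three times in a row.
    static String appendEach(StringBuffer sb) {
        sb.append("a");
        sb.append("b");
        sb.append("c");
        return sb.toString();
    }

    // Conceptually what coarsening achieves: one lock around the whole
    // sequence. The inner append() calls re-enter a monitor already held.
    static String appendCoarsened(StringBuffer sb) {
        synchronized (sb) {
            sb.append("a");
            sb.append("b");
            sb.append("c");
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(appendEach(new StringBuffer()));      // prints abc
        System.out.println(appendCoarsened(new StringBuffer())); // prints abc
    }
}
```

Both methods produce the same result; the difference is only how often the monitor is acquired and released.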


Lightweight lock

The lightweight lock is a locking mechanism added in JDK 1.6. "Lightweight" is relative to the traditional lock implemented with operating system mutexes, which is therefore called the "heavyweight" lock. It must be stressed first that the lightweight lock is not meant to replace the heavyweight lock; its purpose is to reduce the performance overhead of the traditional heavyweight lock's use of operating system mutexes when there is no multithreaded contention.

When code enters a synchronized block and the synchronization object is not locked (lock flag bits are "01"), the virtual machine first creates a space called a lock record (Lock Record) in the current thread's stack frame. The lock record stores a copy of the lock object's current Mark Word (officially this copy carries the prefix Displaced, i.e. Displaced Mark Word).



Then the virtual machine uses a CAS operation to try to update the object's Mark Word to a pointer to the Lock Record. If the update succeeds, the thread owns the object's lock, and the lock flag bits of the object's Mark Word (its last 2 bits) change to "00", which means the object is in the lightweight-locked state.



If the update fails, the virtual machine first checks whether the object's Mark Word points to the current thread's stack frame. If it does, the current thread already owns the object's lock and can enter the synchronized block directly; otherwise, the lock object has been taken by another thread. If two or more threads contend for the same lock, the lightweight lock is no longer valid and must be inflated to a heavyweight lock: the lock flag value becomes "10", the Mark Word stores a pointer to the heavyweight lock (mutex), and threads that subsequently wait for the lock enter the blocked state.

The above is the locking procedure of the lightweight lock. Unlocking is also done with a CAS operation: if the object's Mark Word still points to the thread's lock record, a CAS operation swaps the object's current Mark Word with the Displaced Mark Word copied in the thread. If the replacement succeeds, the whole synchronization process is complete. If it fails, another thread has tried to acquire the lock, so the suspended thread must be woken up while the lock is released.

Lightweight locks improve synchronization performance based on the empirical observation that "most locks are not contended during their entire synchronization cycle". Without contention, the lightweight lock uses CAS operations and avoids the overhead of a mutex. With contention, however, the CAS operations are incurred in addition to the cost of the mutex, so under contention the lightweight lock is slower than the traditional heavyweight lock.
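
The lock and unlock steps above can be modeled in plain Java. This is a toy model under loud assumptions: `ToyLightweightLock`, `UnlockedMark`, and `LockRecord` are hypothetical stand-ins, an AtomicReference plays the role of the Mark Word, and a null displaced value marks a nested record. It is not HotSpot code, and it reports contention instead of inflating to a heavyweight lock.

```java
import java.util.concurrent.atomic.AtomicReference;

class ToyLightweightLock {
    // Stand-in for an unlocked Mark Word (flag "01").
    static final class UnlockedMark {}

    // Stand-in for a Lock Record in the owning thread's stack frame.
    static final class LockRecord {
        final Thread owner;
        final Object displacedMarkWord; // null marks a nested record in this toy

        LockRecord(Thread owner, Object displacedMarkWord) {
            this.owner = owner;
            this.displacedMarkWord = displacedMarkWord;
        }
    }

    private final AtomicReference<Object> markWord =
            new AtomicReference<>(new UnlockedMark());

    // Returns a lock record on success, or null where a real JVM would inflate.
    LockRecord tryLock() {
        Thread me = Thread.currentThread();
        Object current = markWord.get();
        if (current instanceof UnlockedMark) {
            // Copy the Mark Word into the record, then CAS the record in (flag "00").
            LockRecord record = new LockRecord(me, current);
            if (markWord.compareAndSet(current, record)) {
                return record;
            }
            current = markWord.get();
        }
        // CAS failed or object already locked: does the Mark Word point to us?
        if (current instanceof LockRecord && ((LockRecord) current).owner == me) {
            return new LockRecord(me, null); // reentrant entry
        }
        return null; // held by another thread
    }

    // CAS the displaced Mark Word back; false means the Mark Word changed meanwhile.
    boolean unlock(LockRecord record) {
        if (record.displacedMarkWord == null) {
            return true; // leaving a nested entry, nothing to restore
        }
        return markWord.compareAndSet(record, record.displacedMarkWord);
    }

    boolean isUnlocked() {
        return markWord.get() instanceof UnlockedMark;
    }

    static boolean demo() {
        ToyLightweightLock lock = new ToyLightweightLock();
        LockRecord outer = lock.tryLock();
        LockRecord nested = lock.tryLock();
        LockRecord[] other = new LockRecord[1];
        Thread t = new Thread(() -> other[0] = lock.tryLock());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        boolean ok = outer != null && nested != null && other[0] == null;
        ok &= lock.unlock(nested) && lock.unlock(outer);
        return ok && lock.isUnlocked();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true
    }
}
```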


Biased locking

Biased locking is another lock optimization introduced in JDK 1.6. Its purpose is to eliminate synchronization primitives when data is not contended, further improving program performance. If the lightweight lock uses CAS operations to avoid the mutex when there is no contention, the biased lock eliminates the whole synchronization when there is no contention, without even performing the CAS operations.

The "bias" in biased locking means partiality or favoritism: the lock favors the first thread that acquires it. If the lock is not acquired by any other thread during subsequent execution, the thread that holds the biased lock never needs to synchronize again.

If you understood the earlier description of how the lightweight lock operates on the Mark Word in the object header, the principle of biased locking is easy to understand. Suppose biased locking is enabled in the virtual machine (the parameter -XX:+UseBiasedLocking, which is the default in JDK 1.6). When the lock object is acquired by a thread for the first time, the virtual machine sets the flag bits in the object header to "01", that is, biased mode. At the same time it uses a CAS operation to record the ID of the acquiring thread in the object's Mark Word. If the CAS succeeds, then every time the thread holding the biased lock enters a synchronized block on this lock, the virtual machine performs no further synchronization operations (such as locking, unlocking, or updating the Mark Word).

When another thread tries to acquire the lock, biased mode ends. Depending on whether the lock object is currently locked, revoking the bias (Revoke Bias) returns it to the unlocked state (flag "01") or the lightweight-locked state (flag "00"), and subsequent synchronization proceeds as described above for the lightweight lock.



Biased locking improves the performance of programs that synchronize without contention. It is also an optimization with a trade-off (Trade Off) nature; that is, it does not always benefit the program. If most locks in the program are always accessed by several different threads, biased mode is superfluous. After analyzing the specific problem, disabling biased locking with the parameter -XX:-UseBiasedLocking can sometimes improve performance.
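
The fast path of biased locking can be pictured with another toy model. Everything here is an assumption made for illustration: `ToyBiasedLock`, `biasOwner`, and `biasRevoked` are hypothetical names, and real bias revocation is far more involved than flipping a flag. The point is only that the first acquisition costs one CAS, and re-entry by the same thread costs a read and a comparison.

```java
import java.util.concurrent.atomic.AtomicReference;

class ToyBiasedLock {
    // null = not yet biased; otherwise the thread the lock is biased toward.
    private final AtomicReference<Thread> biasOwner = new AtomicReference<>();
    private volatile boolean biasRevoked = false;

    // Returns true if the caller entered through the biased path,
    // false if it would have to fall back to lightweight locking.
    boolean enterBiased() {
        if (biasRevoked) {
            return false;
        }
        Thread me = Thread.currentThread();
        if (biasOwner.get() == me) {
            return true; // re-entry by the favored thread: no CAS at all
        }
        if (biasOwner.compareAndSet(null, me)) {
            return true; // first acquisition: a single CAS records the owner
        }
        biasRevoked = true; // a second thread showed up: biased mode ends
        return false;
    }

    static boolean demo() {
        ToyBiasedLock lock = new ToyBiasedLock();
        boolean first = lock.enterBiased();
        boolean again = lock.enterBiased();
        boolean[] other = new boolean[1];
        Thread t = new Thread(() -> other[0] = lock.enterBiased());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        boolean afterRevoke = lock.enterBiased();
        return first && again && !other[0] && !afterRevoke;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true
    }
}
```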


Escape analysis

Stack allocation

As the name suggests, when the virtual machine determines that an object does not escape the method, it can allocate the object's memory on the stack, so the memory occupied by the object is destroyed when the stack frame is popped.

Let's look at the test results of a piece of code.
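
A test of this kind could look like the following sketch. The class `StackAllocationTest`, the inner `TestObject`, and the iteration count are assumptions made for illustration. The flags `-XX:+DoEscapeAnalysis -XX:-UseTLAB -Xmx1g -XX:+PrintGC` are one plausible way to express the settings discussed in this section on JDK 8; newer JDKs use `-Xlog:gc` for GC logging.

```java
// Possible invocation (assumed, JDK 8 style):
//   java -XX:+DoEscapeAnalysis -XX:-UseTLAB -Xmx1g -XX:+PrintGC StackAllocationTest
class StackAllocationTest {
    static final class TestObject {
        final int value;

        TestObject(int value) {
            this.value = value;
        }
    }

    // testObject is created, read, and dropped inside this method;
    // it never escapes to the caller or to another thread.
    static int allocate(int i) {
        TestObject testObject = new TestObject(i);
        return testObject.value;
    }

    // Calls allocate() repeatedly so the method becomes hot and gets compiled.
    static long run(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += allocate(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        long sum = run(100_000_000);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("sum=" + sum + ", elapsed ms=" + elapsed);
    }
}
```

Comparing the GC log of a run with `-XX:+DoEscapeAnalysis` against one with `-XX:-DoEscapeAnalysis` is the intended use; the numbers depend on the JDK version and machine.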



We run the code with the following JVM settings: escape analysis enabled, TLAB optimization disabled, 1 GB of heap allocated, and GC log printing enabled.


Then we adjust the JVM parameters (only disabling escape analysis) and run the code once more.



As you can see, once escape analysis determines that the local variable testObject does not escape the scope of the current method, the allocation is optimized onto the stack. However, because my JDK runs in mixed mode, some GC log entries are still printed.


What mixed mode means

HotSpot uses an architecture in which the interpreter and the compiler coexist; mixed mode means the two are used together. When the program has just started, code is executed by the interpreter (which records relevant data such as function call counts and loop execution counts), saving compilation time. During interpretation, the recorded runtime data reveals that some code is hot; the compiler then compiles this hot code and optimizes it (escape analysis is one of these optimizations).


Scalar replacement

A scalar (Scalar) is data that cannot be broken down into smaller data. The primitive data types in Java are scalars. In contrast, data that can be decomposed further is called an aggregate (Aggregate). A Java object is an aggregate, because it can be decomposed into other aggregates and scalars.

In the JIT stage, if escape analysis finds that an object cannot be accessed from outside, the JIT optimization breaks the object down into the member variables it contains and uses those instead. This process is scalar replacement.

Using the example above again, we adjust the JVM parameters (enable escape analysis, disable TLAB, disable scalar replacement).
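
The transformation can be pictured as follows. The class `Point` and both methods are hypothetical; `sumScalarReplaced` is a hand-written picture of the result, since the real replacement happens inside the JIT and is not visible in source code.

```java
class ScalarReplacementExample {
    // An aggregate made of two scalars.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // As written: allocates a Point that never leaves the method.
    static int sumWithObject(int a, int b) {
        Point p = new Point(a, b);
        return p.x + p.y;
    }

    // Conceptual result of scalar replacement: the object disappears and
    // its fields become plain local variables.
    static int sumScalarReplaced(int a, int b) {
        int x = a;
        int y = b;
        return x + y;
    }

    public static void main(String[] args) {
        System.out.println(sumWithObject(3, 4));     // prints 7
        System.out.println(sumScalarReplaced(3, 4)); // prints 7
    }
}
```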



The example above shows that stack allocation is achieved through scalar replacement.

Lock elimination

This is the same optimization as the lock elimination described in the lock optimization section.


Summary

The Java virtual machine hides the details of the specific operating system platform and also performs many optimizations on behalf of programmers. Besides the optimizations mentioned in this article, there are common subexpression elimination, array bounds check elimination, method inlining, memory and code placement transformation, TLAB, PLAB, and other optimization techniques that interested readers can study in depth.





Origin juejin.im/post/5dc8f93a6fb9a04a9f11c68a