Original | Please, stop saying that Java objects are allocated on the heap memory!

This is Hollis's 256th original post
Author | Hollis
Source | Hollis (ID: hollischuang)
Java is an object-oriented, cross-platform language, and topics such as its objects and memory are difficult ones, so even a Java beginner inevitably picks up some understanding of the JVM. It is fair to say that JVM knowledge is something nearly every Java developer must learn, and it is a topic that is almost guaranteed to come up in interviews.

In the JVM's memory structure, the two most commonly discussed areas are heap memory and stack memory (unless otherwise specified, "stack" in this article refers to the virtual machine stack). Many developers are hazy about the difference between the heap and the stack, and many books and online articles introduce it roughly like this:


1. The heap is a memory area shared among threads, while the stack is private to each thread.
2. The heap mainly holds object instances, while the stack mainly holds primitive values and object references.
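As an illustrative sketch of this conventional picture (the class and variable names here are invented for the example):

```java
public class HeapVsStackDemo {
    public static void main(String[] args) {
        // By the conventional account: the primitive local 'x' lives in the
        // current stack frame, the reference 'o' also lives in the stack
        // frame, and the Object instance itself is allocated on the heap.
        int x = 42;
        Object o = new Object();
        System.out.println(x);
    }
}
```

As we will see, the second half of this picture is not as absolute as it sounds.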

However, I can tell you with full responsibility that neither of these two conclusions is completely correct.

In my previous article "Java heap memory is shared by threads! Interviewer: Are you sure?", I explained why "the heap is thread-shared" is not entirely accurate. This article discusses the second claim.

Object memory allocation

In the "Java Virtual Machine Specification", there is such a description about the heap:

In the Java Virtual Machine, the heap is the run-time memory area that is shared among all threads and from which memory for all class instances and arrays is allocated.

In "Java heap memory is shared by threads! Interviewer: Are you sure?", we also saw that when a Java object is allocated on the heap, it is mainly allocated in the Eden space; if TLABs are enabled, allocation is attempted in the TLAB first; and in a few cases it may be allocated directly in the old generation. The allocation rules are not 100% fixed; they depend on which garbage collector is in use and on how the virtual machine's memory-related parameters are configured.

But in general, the following principles are followed:

  • Objects are allocated in the Eden area first. If Eden does not have enough space, a Minor GC is triggered.
  • Large objects go directly into the old generation. For Java objects that require a large amount of contiguous memory, when the memory required is greater than the value of the -XX:PretenureSizeThreshold parameter, the object is allocated directly in the old generation.
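The large-object rule can be sketched as follows. This is only an illustration: the flag values in the comment are example settings, and -XX:PretenureSizeThreshold is, to my knowledge, only honored by certain collectors (e.g. the Serial and ParNew young-generation collectors).

```java
public class PretenureDemo {
    private static final int _1MB = 1024 * 1024;

    public static void main(String[] args) {
        // Run with e.g.:
        //   -Xms20M -Xmx20M -Xmn10M -XX:+UseSerialGC -XX:PretenureSizeThreshold=3145728
        // This 4 MB array exceeds the 3 MB threshold, so it would be
        // allocated directly in the old generation instead of in Eden.
        byte[] big = new byte[4 * _1MB];
        System.out.println(big.length);
    }
}
```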
However, although the virtual machine specification describes allocation this way, each virtual machine vendor may apply its own optimizations to object memory allocation when implementing the VM. The most typical example is the maturation of JIT technology in the HotSpot virtual machine, which makes "objects allocate memory on the heap" no longer a certainty.
In fact, the author of "In-Depth Understanding of the Java Virtual Machine" makes a similar point: the maturity of JIT technology means that "objects allocate memory on the heap" is not so absolute. However, the book does not explain exactly what JIT is, nor what JIT optimization does. So let's take a closer look.

JIT technology

As we all know, Java source code is compiled by javac into Java bytecode. The JVM then translates the bytecode into machine instructions, reading and interpreting them one at a time; this is the role of the traditional JVM interpreter. Obviously, such interpreted execution is inevitably much slower than directly executing native machine code. To address this efficiency problem, JIT (Just-In-Time compilation) technology was introduced.
With JIT, Java programs are still executed through the interpreter at first. When the JVM finds that a method or code block runs particularly frequently, it regards it as "hot spot code" (Hot Spot Code). The JIT compiler then translates that hot code into machine code for the local machine, optimizes it, and caches the compiled machine code for subsequent use.
Hot spot detection

As we said above, triggering JIT compilation first requires identifying hot code. Today the main approach is hot spot detection, and the HotSpot virtual machine mainly uses counter-based hot spot detection.

Counter-Based Hot Spot Detection: a virtual machine using this approach sets up a counter for each method (or even each code block) to record how many times it executes. When a method's count exceeds a threshold, the method is considered hot and JIT compilation is triggered.
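The counter idea can be sketched with a toy model (an invented illustration, not HotSpot's actual implementation; real JVMs count both method invocations and loop back-edges, and the threshold is configurable via -XX:CompileThreshold):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of counter-based hot spot detection: count invocations per
// method and flag the method as "hot" once a threshold is crossed.
public class HotSpotCounter {
    private static final int THRESHOLD = 10_000; // invented threshold value
    private final Map<String, Integer> counters = new HashMap<>();

    // Returns true at the exact invocation where the method becomes hot
    // (the point where a real JVM would submit it for JIT compilation).
    boolean recordInvocation(String method) {
        return counters.merge(method, 1, Integer::sum) == THRESHOLD;
    }

    public static void main(String[] args) {
        HotSpotCounter profiler = new HotSpotCounter();
        for (int i = 1; i <= 20_000; i++) {
            if (profiler.recordInvocation("alloc")) {
                System.out.println("alloc became hot at invocation " + i);
            }
        }
    }
}
```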

Compilation optimization

After hot spot detection identifies the hot code, JIT does not merely cache the compiled code; it also optimizes it. Among these optimizations, the more important ones include escape analysis, lock elimination, lock coarsening, method inlining, null check elimination, type check elimination, and common subexpression elimination.
Of these, escape analysis is the one relevant to this article.

Escape analysis

Escape analysis is one of the most advanced optimization techniques in current Java virtual machines. It is an interprocedural global data-flow analysis algorithm that can effectively reduce synchronization overhead and heap allocation pressure in Java programs. Through escape analysis, the HotSpot compiler can analyze the scope in which the reference to a newly created object is used, and decide whether that object needs to be allocated on the heap.
The basic behavior of escape analysis is to analyze the dynamic scope of an object: an object defined in a method may be referenced by external methods, for example by being passed elsewhere as a call argument. This is called method escape.
For example:

public static String createStringBuffer(String s1, String s2) {
    StringBuffer sb = new StringBuffer();
    sb.append(s1);
    sb.append(s2);
    return sb.toString();
}

Here sb is a local variable of the method. The code above does not return sb itself, only the String produced by toString(), so this StringBuffer cannot be modified by other methods; its scope is confined to the method. We can say this variable does not escape the method.
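By contrast, a hypothetical variant makes the object escape: if the method returned the StringBuffer itself rather than the result of toString(), the reference would leave the method and escape analysis could no longer treat it as method-local.

```java
public class EscapeDemo {
    // Hypothetical escaping variant: the StringBuffer reference is returned,
    // so callers (and potentially other threads) can reach the object. It
    // escapes the method and must be allocated on the heap.
    public static StringBuffer createStringBuffer(String s1, String s2) {
        StringBuffer sb = new StringBuffer();
        sb.append(s1);
        sb.append(s2);
        return sb; // the reference itself escapes here
    }

    public static void main(String[] args) {
        System.out.println(createStringBuffer("Hello, ", "Hollis"));
    }
}
```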
With escape analysis, we can determine whether a variable in a method might be accessed or modified by other threads. Based on this, JIT can perform several optimizations:

  • Synchronization elision
  • Scalar replacement
  • Stack allocation

Regarding synchronization elision, you can refer to the introduction to lock elimination in my earlier article "In-depth understanding of multithreading (5): Java virtual machine lock optimization". This article mainly analyzes scalar replacement and stack allocation.
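As a minimal sketch of what synchronization elision targets (illustrative only): every append call below locks the StringBuffer's monitor, but since the object never escapes the method, no other thread can contend for it, so the JIT can safely remove the locking.

```java
public class SyncElisionDemo {
    public static String concat(String s1, String s2) {
        // StringBuffer's methods are synchronized, but sb is confined to
        // this method. After escape analysis proves sb never escapes, the
        // JIT can elide these locks, making the calls roughly as cheap as
        // the unsynchronized StringBuilder equivalents.
        StringBuffer sb = new StringBuffer();
        sb.append(s1);
        sb.append(s2);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat("locks ", "elided"));
    }
}
```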

Scalar replacement and stack allocation

As we said, if JIT escape analysis finds that an object does not escape the method body, the object may be optimized, and the biggest consequence of that optimization is that it can break the rule that "Java objects allocate memory on the heap".
There are actually several reasons why objects are allocated on the heap, but the one most relevant to this article is this: heap memory is shared among threads, so an object created by one thread can be accessed by other threads.
So imagine we create an object inside a method body and the object never escapes the method. Is it really necessary to allocate that object on the heap?
In fact, it is not, because the object will never be accessed by other threads and its life cycle is confined to a single method call. Not allocating it on the heap also reduces the amount of memory that must be reclaimed later.
Then, after escape analysis finds that an object does not escape the method, what optimization can reduce the chance of that object being allocated on the heap?
The answer is stack allocation. In HotSpot, stack allocation is not implemented directly; instead, it is achieved through scalar replacement.
So we will focus on what scalar replacement is and how stack allocation is realized through it.
Scalar replacement

A scalar is a piece of data that cannot be decomposed into smaller data; Java's primitive types are scalars. In contrast, data that can be decomposed is called an aggregate. Java objects are aggregates, because they can be decomposed into other aggregates and scalars.
During JIT compilation, if escape analysis shows that an object cannot be accessed from outside, the object may be disassembled into the member variables it contains, which are then used in its place. This process is scalar replacement.

public static void main(String[] args) {
    alloc();
}

private static void alloc() {
    Point point = new Point(1, 2);
    System.out.println("point.x=" + point.x + "; point.y=" + point.y);
}

class Point {
    private int x;
    private int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

In the code above, the point object never escapes the alloc method, so it can be broken down into scalars. JIT will then not create a Point object at all; instead, it uses two scalars, int x and int y, in its place:

private static void alloc() {
    int x = 1;
    int y = 2;
    System.out.println("point.x=" + x + "; point.y=" + y);
}

As you can see, after escape analysis determined that the aggregate Point does not escape, it was replaced with two scalars.
Through scalar replacement, the original object is replaced by its member variables. The heap memory the object would have occupied is no longer needed; the member variables can be allocated in the method's stack frame instead.

Experimental proof

Talk is cheap, show me the code. No data, no talk.
Next, let's run an experiment to see whether escape analysis actually takes effect, whether stack allocation really happens when it does, and what benefits stack allocation brings.
Let's look at the following code:

public static void main(String[] args) {
    long a1 = System.currentTimeMillis();
    for (int i = 0; i < 1000000; i++) {
        alloc();
    }
    // Print the elapsed time
    long a2 = System.currentTimeMillis();
    System.out.println("cost " + (a2 - a1) + " ms");
    // Sleep so there is time to inspect the object counts in the heap
    try {
        Thread.sleep(100000);
    } catch (InterruptedException e1) {
        e1.printStackTrace();
    }
}

private static void alloc() {
    User user = new User();
}

static class User {
}

The code is very simple: a for loop creates 1,000,000 User objects.
We define the User object inside the alloc method without referencing it outside, which means the object never escapes alloc. After JIT's escape analysis, its memory allocation can therefore be optimized.
We specify the following JVM parameters and run:

-Xmx4G -Xms4G -XX:-DoEscapeAnalysis -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError 

Here, -XX:-DoEscapeAnalysis turns escape analysis off.
After the program prints "cost XX ms" and before it exits, we use the jmap command to check how many User objects are currently in the heap:

➜  ~ jmap -histo 2809

 num     #instances         #bytes  class name
----------------------------------------------
   1:           524       87282184  [I
   2:       1000000       16000000  StackAllocTest$User
   3:          6806        2093136  [B
   4:          8006        1320872  [C
   5:          4188         100512  java.lang.String
   6:           581          66304  java.lang.Class

From the jmap output we can see that a total of 1,000,000 StackAllocTest$User objects were created in the heap. With escape analysis turned off (-XX:-DoEscapeAnalysis), even though the User object created in the alloc method never escapes the method, it is still allocated in heap memory. In other words, without JIT compiler optimization and escape analysis, this is the normal behavior: all objects are allocated in heap memory.
Next, we turn on the escape analysis, and then execute the above code.

-Xmx4G -Xms4G -XX:+DoEscapeAnalysis -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError 

After the program prints "cost XX ms" and before it exits, we again use jmap to check how many User objects are in the heap:

➜  ~ jmap -histo 2859

 num     #instances         #bytes  class name
----------------------------------------------
   1:           524      101944280  [I
   2:          6806        2093136  [B
   3:         83619        1337904  StackAllocTest$User
   4:          8006        1320872  [C
   5:          4188         100512  java.lang.String
   6:           581          66304  java.lang.Class

From this output we can see that with escape analysis enabled (-XX:+DoEscapeAnalysis), only about 80,000 StackAllocTest$User objects remain in the heap. In other words, after JIT optimization, the number of objects allocated in heap memory dropped from 1,000,000 to roughly 80,000.
Besides verifying object counts with jmap, readers can also try shrinking the heap and rerunning the code, then compare GC counts: with escape analysis enabled, the number of GCs during the run drops significantly. It is precisely because many heap allocations are optimized into stack allocations that GCs become far less frequent.

Escape analysis is not yet mature

In the example above, enabling escape analysis reduced the object count from 1,000,000 to about 80,000, but not to 0, which shows that JIT optimization does not cover every case.
The foundational paper on escape analysis was published in 1999, but the technique was not implemented until JDK 1.6, and even today it is not fully mature.
The fundamental reason is that there is no guarantee that the savings from escape analysis will outweigh its cost. Scalar replacement, stack allocation, and lock elimination can all be performed after escape analysis, but the analysis itself involves a series of complex computations and is a fairly time-consuming process.
An extreme example: if the analysis concludes that every object escapes, then the entire escape analysis pass was wasted effort.
Although the technology is not fully mature, it is nonetheless a very important optimization technique for just-in-time compilers.

Summary

Under normal circumstances, objects do allocate memory on the heap. But as compiler optimization technology has matured, although the virtual machine specification describes allocation this way, concrete implementations differ in the details.
For example, after the HotSpot virtual machine introduced JIT optimization, it performs escape analysis on objects; if an object is found not to escape its method, stack allocation may be realized through scalar replacement, avoiding memory allocation on the heap.
Therefore, the statement that objects must allocate memory on the heap is incorrect.
Finally, a question to think about: we discussed TLABs before, and today we introduced stack allocation. What similarities and differences do you see between these two optimizations?
About the author: Hollis, a person with a unique pursuit of coding, is currently a technical expert at Alibaba and a personal technology blogger whose articles have been read tens of millions of times across the web, and a co-author of "Three Classes for Programmers".

The "Road to Becoming a Java Engineer God" series of articles is being updated on GitHub. You are welcome to follow and star it.
Facing Java Issue 300: What is escape analysis?
In-depth concurrency Issue 013: Expand synchronized-lock optimization




Origin blog.51cto.com/13626762/2544186