JVM memory area and data changes in memory when the program is running

JVM memory area

image

JVM runtime data area

 

Definition: While executing a Java program, the JVM divides the memory it manages into several distinct runtime data areas.

Java prides itself on its automatic memory management. Compared with manual memory management and raw pointers in C++, Java programs are much easier to write. To understand the JVM in depth, we must therefore first understand how it divides memory into these logical areas.

In the JVM, memory is mainly divided into the heap, the stack, and the method area.

Seen from the perspective of threads, memory can also be divided into a thread-private area and a thread-shared area.

Thread-private area: each thread has its own area, and threads do not interfere with each other.

Thread-shared area: shared by all threads; only one copy exists.

The concept of direct memory also comes up here. This memory does not belong to the JVM's runtime data area, but it is used frequently. For example, if the machine has 8 GB of memory and 5 GB is assigned to the virtual machine, the remaining 3 GB is direct memory. The JVM can still use this direct memory through certain tools.

 

Java method execution and the virtual machine stack

Thread private

Virtual machine stack

The data structure of the stack: FILO data structure

The role of the virtual machine stack: while the JVM runs, it stores the methods being executed by the current thread, together with each method's data, instructions, and return address.

The virtual machine stack is per-thread: even a program that consists only of a main() method runs that method on a thread. During a thread's life cycle, the data involved in its computations is constantly pushed onto and popped off the stack. The virtual machine stack therefore has the same life cycle as its thread.

The size of the virtual machine stack: the JVM allocates a fixed amount of memory for each thread's virtual machine stack, typically 1 MB (adjustable with the -Xss parameter). The number of stack frames the stack can hold is therefore limited. If frames keep being pushed without being popped, the memory of the current thread's virtual machine stack is eventually exhausted and a StackOverflowError is thrown. A typical cause is a recursive call without a termination condition.
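The stack exhaustion described above can be reproduced with a small sketch. The class and method names here are illustrative, not from the original article, and the depth reached depends on the JVM, the platform, and the -Xss setting.

```java
// Sketch: unbounded recursion keeps pushing stack frames until the thread's
// virtual machine stack is exhausted and the JVM throws StackOverflowError.
class StackDepthDemo {
    private static int depth = 0;

    // No termination condition: every call pushes one more stack frame.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many frames were pushed before the stack ran out.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // All recursive frames have been popped while the error propagated here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames pushed before StackOverflowError: " + measureDepth());
    }
}
```

Running it with a smaller stack, for example java -Xss256k StackDepthDemo, should report a noticeably smaller depth.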


Stack frame and its four major areas

Each time a method is called, a stack frame is created and pushed onto the stack. When the method returns, its frame is popped and the memory is released. A stack frame stores the local variable table, the operand stack, the dynamic link, the method exit (return address), and other information. The first-in-last-out behavior of stack memory refers to the pushing and popping of stack frames.

Local variable table

As the name suggests, it stores local variables (the variables declared in a method). It is organized in slots that are each 32 bits wide and mainly holds values of Java's eight primitive types. Most types fit in a single 32-bit slot; the 64-bit types long and double are split into a high and a low half and occupy two consecutive slots. When a method creates an object, only the reference to that object is stored here.

How much local variable table space a method needs in its stack frame is already fully determined when the method is entered, and the size of the table does not change while the method executes. Note: size here means the number of slots.

Operand stack

It is also a first-in-last-out stack, and it holds the operands the JVM works on while executing bytecode. Its elements can be of any Java data type. When a method starts executing, its operand stack is empty.

I think of the operand stack as the work area of the JVM execution engine: it is what gets manipulated while a method executes, and it stays empty until the method's code runs. (Personal understanding: within each stack frame, the operand stack does the processing, like the logic of our program, while the local variable table stores the data, like a database.)

Dynamic link

This relates to polymorphism, a feature of the Java language. (It will be covered in detail later, together with static and dynamic dispatch, class loading, the execution engine, and method invocation; here it is only described briefly.)

Seeing the term dynamic linking, you may guess that there is also static linking, and there is. Put simply, "linking" here refers to method calls. Why then does the stack frame contain only a dynamic link and no static one? Because a statically linked call is resolved during the class loading stage and never changes at runtime, whereas a dynamically linked call cannot be resolved during class loading and is resolved at runtime. This involves the concepts of virtual and non-virtual methods and of static and dynamic dispatch, which will be explained in detail in later articles. Note: resolving here means converting a symbolic reference into a direct reference, after which runtime accesses need no further processing and can use the address directly.
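As a rough illustration of the difference between calls that are fixed early and calls that are resolved at runtime, the sketch below contrasts overload selection (decided from the static type) with override selection (decided from the runtime type). The classes Animal and Dog and the describe methods are hypothetical names chosen for this example.

```java
// Sketch: overloads are bound from the declared (static) type of the argument,
// while overridden methods are selected from the actual object at runtime.
class DispatchDemo {
    static class Animal {
        String sound() { return "generic sound"; }
    }

    static class Dog extends Animal {
        @Override
        String sound() { return "woof"; }
    }

    // Two overloads: the compiler picks one using the declared type of the argument.
    static String describe(Animal a) { return "describe(Animal)"; }
    static String describe(Dog d) { return "describe(Dog)"; }

    public static void main(String[] args) {
        Animal pet = new Dog();            // static type Animal, runtime type Dog
        System.out.println(describe(pet)); // prints "describe(Animal)"
        System.out.println(pet.sound());   // prints "woof"
    }
}
```

The call to sound() is the kind of call that needs the dynamic link in the stack frame, because the target method is only known once the receiver's actual class is known.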

Method exit (return address)

Once a method starts to execute, there are two ways it can exit:

1. Normal completion exit
2. Abnormal completion exit
Normal completion exit means that the method finishes and exits without throwing any exception (neither one raised by the Java virtual machine nor one thrown with a throw statement during execution). Depending on the return bytecode instruction the method ends with, a return value may or may not be passed to the caller (the method that invoked it). Whether there is a return value, and its data type, is determined by that return instruction.

Abnormal completion exit means that an exception occurs during the execution of the method and is not handled within the method body, causing the method to exit. Take the following code as an example:

image

image

When an exception occurs in the program, execution continues in the matching catch block, and the statements that follow the failing one are skipped.
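The original listing is not available as text, so here is a minimal sketch of the two situations under discussion. The class and method names are assumptions made for illustration: handledInside catches the exception itself and still completes normally, while notHandled has no handler and completes abnormally.

```java
// Sketch: an exception handled inside the method versus one that leaves the method.
class AbruptExitDemo {
    // The exception is caught inside the method, so the method completes normally.
    static String handledInside() {
        StringBuilder log = new StringBuilder();
        try {
            log.append("before;");
            int zero = 0;
            log.append("result=" + (10 / zero) + ";"); // the division throws ArithmeticException
            log.append("after;");                      // skipped
        } catch (ArithmeticException e) {
            log.append("catch;");
        }
        return log.toString();
    }

    // No handler in this method: it completes abnormally and the caller sees the exception.
    static int notHandled() {
        int zero = 0;
        return 10 / zero;
    }

    public static void main(String[] args) {
        System.out.println(handledInside()); // prints "before;catch;"
        try {
            notHandled();
        } catch (ArithmeticException e) {
            System.out.println("caller received: " + e);
        }
    }
}
```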

Whether the exception is raised by the Java virtual machine or by an athrow instruction in the code, the method exits as soon as no matching exception handler is found in the method's exception table.
No matter how a method exits, execution must afterwards return to the place where the method was called before the program can continue. When the method returns, some information may need to be kept in the stack frame to help restore the execution state of its caller (the method one level up in the call chain).

Exiting a method is in effect the same as popping the current stack frame. The operations that may be performed on exit are: restore the caller's local variable table and operand stack, push the return value (if any) onto the caller's operand stack, and adjust the PC counter to point to the instruction that follows the method call instruction.
Generally speaking, when a method exits normally, the caller's PC counter value can serve as the return address, and this value is likely to be stored in the stack frame. When a method exits abnormally, the return address is determined through the exception handler table, and this information is generally not stored in the stack frame.

image

Normal exit

(The caller's program counter value is used as the return address.)

Three steps:

1. Restore the caller's local variable table and operand stack.

2. Push the return value (if any) onto the operand stack of the caller's stack frame.

3. Adjust the program counter to point to the instruction after the method call instruction.

Abnormal exit

(Handled through the exception handler table, which is not stored in the stack frame.)

 

Stack frame change

The Java virtual machine stack stores the local variable table, operand stack, dynamic link, and method exit information of each frame. The dynamic link and method exit work together with the program counter to keep track of execution. In a multi-threaded environment the CPU switches between threads at high speed, so the program counter records the position a thread has reached when it is switched out; when the thread next gets the CPU, it resumes from the recorded position. The focus here is the cooperation between the local variable table and the operand stack.

image

Here the execution flow of the program is broken down from the perspective of the stack frame as the code runs. The diagram shows frames being pushed and popped.

image

 

Program counter

It occupies only a small amount of memory and records the position the thread is currently executing. Each thread has its own program counter, and they do not affect one another.

The program counter mainly records the address of the bytecode instruction its own thread is executing. Branches, loops, jumps, exception handling, and thread resumption all depend on it. Because Java supports multithreading, when there are more threads than CPU cores the threads compete for CPU time in time slices. If a thread's time slice runs out, or it is preempted early for some other reason, its program counter must record the next instruction to run. The program counter is the only area in the JVM that cannot cause an OOM (OutOfMemoryError).

 

Native method stack

Since the JVM is a Java virtual machine, it has a complete instruction execution process of its own, so it needs the program counter when running Java methods.

A native method (one marked with the native modifier), however, is not executed by the JVM's bytecode engine, so the JVM's program counter is not needed for it. The operating system has its own program counter, which tracks the execution address of the native code. So while a native method is executing, the value of the thread's program counter is undefined.

The impact of stack frame execution on memory

Decompilation process

Disassemble the class: javap -c XXXX.class

Bytecode view URL: https://cloud.tencent.com/developer/article/1333540 

Using the disassembly tool

1. Find the .class file produced by compiling the program.

2. Hold Shift and right-click in that folder to open a command prompt.

3. Run javap -c XXXX.class to disassemble the class file; the result is as follows.

image

 

image

Process:

0: the value 1 (for x) is pushed onto the operand stack

1: x=1 is stored from the operand stack into the local variable table

2: the value 2 (for y) is pushed onto the operand stack

3: y=2 is stored from the operand stack into the local variable table

4: the value 3 (for z=1+2) is pushed onto the operand stack (no instruction computes 1+2 at runtime, because the compiler has already folded the constant expression)

5: z=3 is stored from the operand stack into the local variable table

6: x=1 is loaded from the local variable table onto the operand stack

7: y=2 is loaded from the local variable table onto the operand stack

8: x+y is executed (the operand stack hands the values to the execution engine, and the result is pushed back onto the operand stack)

9: z=3 is loaded from the local variable table onto the operand stack

10: (x+y)*z is executed (the operand stack hands the values to the execution engine, and the result is pushed back onto the operand stack)

11: the result h is stored into the local variable table

13: h=9 is loaded from the local variable table onto the operand stack (the offset jumps from 11 to 13 because the preceding store instruction carries an explicit slot index as an operand and therefore occupies two bytes)

15: h=9 is returned from the operand stack to the caller's stack frame
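The listing that these steps refer to was an image. The method below is a reconstruction from the offsets described above, assuming an instance method (so that slot 0 holds this) and illustrative class and method names. The bytecode in the comments is what javac typically emits for such a method; the exact output can be checked with javap -c.

```java
// Sketch: a method whose bytecode matches the step-by-step walk-through.
class StackFrameDemo {
    int compute() {
        int x = 1;           //  0: iconst_1    1: istore_1
        int y = 2;           //  2: iconst_2    3: istore_2
        int z = 1 + 2;       //  4: iconst_3    5: istore_3   (1 + 2 folded at compile time)
        int h = (x + y) * z; //  6: iload_1     7: iload_2    8: iadd
                             //  9: iload_3    10: imul      11: istore 4
        return h;            // 13: iload 4    15: ireturn
    }

    public static void main(String[] args) {
        System.out.println(new StackFrameDemo().compute()); // prints 9
    }
}
```

Slots 1 to 3 have dedicated one-byte store and load instructions; slot 4 does not, which is why the instructions at offsets 11 and 13 each take two bytes.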

 

When the JVM's interpreted execution is described as a stack-based execution engine, the stack in question is the operand stack.

 

The meaning of the native method stack

The native method stack plays a role similar to that of the virtual machine stack. The virtual machine stack manages calls to Java methods, while the native method stack manages calls to native methods, which are not written in Java. One example is the Object.hashCode() method.

The native method stack specifically serves methods marked with the native modifier. The virtual machine specification imposes no mandatory requirements on it, so each virtual machine is free in how it implements it. The native method stack and the virtual machine stack can even be combined into one, which is what HotSpot does.
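Whether a method carries the native modifier can be checked with reflection. The sketch below assumes an OpenJDK/HotSpot runtime, where Object.hashCode() is declared native; the helper name isNative is made up for this example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch: use reflection to see which public no-argument methods are declared native.
class NativeMethodCheck {
    // Returns true if the named public no-argument method has the native modifier.
    static boolean isNative(Class<?> type, String methodName) {
        try {
            Method m = type.getMethod(methodName);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Object.hashCode native: " + isNative(Object.class, "hashCode"));
        System.out.println("Object.toString native: " + isNative(Object.class, "toString"));
    }
}
```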

 

Thread sharing

Method area

Like the heap, the method area is a shared memory area, so it is shared by all threads. Suppose a class is being loaded into the JVM and two threads both want to access its information in the method area. Only one thread is allowed to load the class, and the others must wait (the JVM implements this locking itself, and the lazy initialization holder idiom for singletons, covered later, takes advantage of it). The permanent generation and metaspace that are often mentioned are implementations of the method area. The HotSpot implementers introduced the permanent generation so that the garbage collector could manage the method area in the same way as heap memory. However, the permanent generation has an upper size limit, which makes the method area more prone to OOM problems. After JDK 6 the developers put considerable effort into abandoning the permanent generation in favor of metaspace. The virtual machine specification places very loose requirements on how this area is implemented: it does not mandate an upper memory limit, and it does not even require garbage collection. Not collecting it at all avoids the memory leaks that incomplete reclamation of unused data in this area can cause (if it is not handled, it cannot be handled wrongly).
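The singleton idiom that relies on the JVM's own class initialization lock can be sketched as follows. The class name HolderSingleton is illustrative; the point is that the nested Holder class is initialized only on first use and by exactly one thread, so no explicit synchronization is written.

```java
// Sketch: lazy singleton that relies on the JVM locking class initialization.
class HolderSingleton {
    private HolderSingleton() {
    }

    // The JVM initializes Holder on first access and guarantees that only one
    // thread runs the initialization; other threads wait until it is finished.
    private static class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // prints true
    }
}
```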

Symbolic reference

Many blogs and books explain symbolic references and direct references in very abstract terms. Here is my own understanding.

First of all, the class file constant pool is not the runtime constant pool. The class file constant pool is a section of the class file; its job is to carry the class's information into the method area during class loading. During dynamic linking, code reaches a specific method in the method area through the direct reference held in the stack frame. But can a method in a stack frame access the class information that is still in the class file constant pool?

Certainly not, because at that point the data has not yet entered the JVM's runtime data area. The entries in the class file constant pool therefore identify methods and other items by symbolic references. When the constant pool data is loaded into the method area and resolved, each symbolic reference is converted into a direct reference, that is, an actual address in the runtime data area that can be accessed directly. (A symbolic reference can be any form of literal, and we do not need to know its exact naming rules, as long as it unambiguously identifies the class data to be loaded from the class file constant pool into the runtime data area.)

An analogy: the class file constant pool is like a human trafficker, and the different pieces of information are like the children he has taken. The trafficker cannot be bothered to remember each name, so he simply calls the children No. 1, No. 2, No. 3 (symbolic references). When a buyer comes, the trafficker says, "Hey, No. 2, come here." The buyer likes the child and takes him home. Now that the child is part of the family he can no longer be called No. 2, so he is given the name Zhang San. From then on the family simply calls him Zhang San (direct reference).

Direct reference

A direct reference is a reference that leads straight to the real address. (Zhang San, someone is calling you!)

Literal

A literal is a value written directly in the source code. A literal can only appear on the right-hand side of "="; what appears on the left-hand side is a variable or a constant.

 

int i = 0; // i is a variable, 0 is a literal

final int a = 10; // a is a constant, 10 is a literal

String str = "hello world"; // str is a variable, "hello world" is a literal

 

Constant pool and runtime constant pool

When a class is loaded into memory, the JVM loads the contents of the class file constant pool (which lives in the class file and has not yet reached the method area) into the runtime constant pool. In the resolution phase, the JVM resolves symbolic references into direct references (such as the reference to the actual object).

For example, a string constant of a class is stored in the class file constant pool while it is in the class file. After the JVM loads the class, it loads the constant from the class file constant pool into the runtime constant pool, and during resolution it determines the reference to the string object. The runtime constant pool lives in the method area and is therefore shared by all threads. Strictly, the JVM specification gives each class or interface its own runtime constant pool, but identical string literals from the constant pools of different class files are stored only once: they all resolve to the same String object in the string constant pool.
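The claim that identical string literals are stored only once can be observed by comparing references. This is a small sketch with made-up method names; it compares object identity with == on purpose, which is normally not how strings should be compared.

```java
// Sketch: identical literals share one String object; new String(...) creates a separate one.
class StringPoolDemo {
    // Two occurrences of the same literal resolve to the same pooled object.
    static boolean sameLiteral() {
        String a = "hello world";
        String b = "hello world";
        return a == b;
    }

    // new String(...) always creates a distinct object on the heap.
    static boolean newVersusLiteral() {
        String fresh = new String("hello world");
        return fresh == "hello world";
    }

    // intern() returns the pooled object for an equal string.
    static boolean internedVersusLiteral() {
        String fresh = new String("hello world");
        return fresh.intern() == "hello world";
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());           // prints true
        System.out.println(newVersusLiteral());      // prints false
        System.out.println(internedVersusLiteral()); // prints true
    }
}
```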

There are several kinds of constant pool, for example the class file constant pool, the string constant pool, and the runtime constant pool.

The virtual machine specification only defines them as belonging to the method area; it does not dictate how virtual machine vendors implement them.

Strictly speaking, constant pools in Java come in two forms: the static constant pool and the runtime constant pool.

1) The static constant pool is the constant pool in the *.class file. It contains not only string and numeric literals but also class and method information, and it takes up most of the space in the class file.

2) The runtime constant pool is what results after the JVM completes class loading: the constant pool of the class file is loaded into memory and kept in the method area. When people say "the constant pool", they usually mean the runtime constant pool in the method area.

Starting with JDK 1.7, HotSpot moved the string constant pool into heap memory. The move concerns physical placement only; logically the runtime constant pool still belongs to the method area. (The method area is a logical partition.)

Permanent Generation and Metaspace

Up to JDK 1.7, HotSpot implemented the method area with the permanent generation, and in JDK 1.7 the data of the string constant pool was moved to heap memory. The permanent generation easily caused performance problems and memory overflow. Its size can be specified, but because it occupies memory of the runtime data area, a small setting easily causes permanent generation overflow, while a large setting leaves less room for the old generation, which then overflows more easily and makes garbage collection more frequent, reducing efficiency. The permanent generation was therefore eliminated and replaced by metaspace. Metaspace uses native memory (direct memory), so in theory it is limited only by the amount of native memory available.

Background: after Oracle acquired BEA and with it the JRockit virtual machine, it set out to port JRockit's best features to HotSpot. JRockit has no concept of a permanent generation, which made the merger painful. After JDK 6 the HotSpot development team resolved to remove the permanent generation and gradually switch to metaspace.

Metaspace no longer occupies the memory of the runtime data area; it is placed in off-heap memory. As long as the machine has enough total memory, the method area will not overflow. Of course, unlimited use can still exhaust the operating system's memory.

Put simply, the permanent generation and metaspace are both implementations of the method area.

Heap

The heap is the largest memory area in the JVM, and almost all the objects we create are stored there. The garbage collection we often talk about operates on the heap.

Heap space is generally requested when the program starts, but not all of it is necessarily used. The heap is usually configured to be expandable. (Normally the heap is quite small at startup and grows as objects keep being created.)
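The difference between the heap currently reserved and the limit it may grow to can be read from java.lang.Runtime. This is a minimal sketch with illustrative method names. The -Xms and -Xmx flags mentioned in the comments are the standard options for the initial and maximum heap size; they are not discussed in the original text.

```java
// Sketch: compare the heap the JVM currently holds with the limit it may grow to.
class HeapSizeDemo {
    // Heap memory currently reserved by the JVM, in bytes (starts near -Xms).
    static long committedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Upper bound the heap is allowed to grow to, in bytes (set by -Xmx).
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("committed heap: " + committedBytes() / mb + " MB");
        System.out.println("maximum heap:   " + maxBytes() / mb + " MB");
    }
}
```

On a default configuration the committed value is usually well below the maximum at startup and rises as the program allocates more objects.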

As objects are created, more and more heap space is occupied, and objects that are no longer used must be reclaimed from time to time. In Java this is called GC (garbage collection).

So is data allocated on the heap or on the stack when it is created?

For primitive types: a local variable declared in a method body is allocated on the stack. In other cases (such as member variables), it is allocated on the heap.

For reference types: the object created with new is allocated on the heap, and the reference to it is stored in the local variable table of the virtual machine stack.
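The following sketch makes the distinction visible from within a program. Counter and bump are hypothetical names. A primitive local lives in each method's own frame, so the callee changes only its copy, whereas the object lives on the heap and both frames reach it through their references.

```java
// Sketch: primitive locals are per-frame copies, objects are shared through references.
class AllocationDemo {
    static class Counter {
        int count; // member variable: stored inside the object on the heap
    }

    // 'local' is a copy in this method's frame; 'c' refers to the caller's heap object.
    static int bump(Counter c, int local) {
        local++;   // changes only this frame's copy
        c.count++; // changes the shared object on the heap
        return local;
    }

    public static void main(String[] args) {
        Counter shared = new Counter(); // object on the heap, reference in main's frame
        int local = 1;                  // primitive local in main's frame
        int returned = bump(shared, local);
        System.out.println(local);        // prints 1
        System.out.println(returned);     // prints 2
        System.out.println(shared.count); // prints 1
    }
}
```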

Extension: must an object created with new always be on the heap?

The description in the "Java Virtual Machine Specification" is: "All object instances and arrays should be allocated on the heap." With advances in just-in-time compilation, in particular increasingly powerful escape analysis, optimizations such as stack allocation and scalar replacement have made this statement less absolute.

 

Off-heap memory (direct memory)

When the JVM is running, it requests a large amount of memory from the operating system for data storage, for example for the virtual machine stacks, the native method stacks, and the program counters (this part is called the stack area). The memory the operating system has left over is the off-heap memory.

It is not part of the virtual machine's runtime data area, but it is limited by the total memory of the machine, so it can also produce an OOM.

 

Summary:

1. Direct memory is mainly the memory requested through DirectByteBuffer. Its size can be limited with the -XX:MaxDirectMemorySize parameter (otherwise it can keep growing until the operating system stalls).

2. Off-heap memory can also be requested directly through Unsafe or other JNI mechanisms. Off-heap memory leaks are very serious: they are hard to troubleshoot, have a large impact, and can even bring down the host.
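Requesting direct memory through the standard NIO API looks roughly like this. It is a small sketch with an illustrative class name: ByteBuffer.allocateDirect returns a buffer backed by memory outside the Java heap, while ByteBuffer.allocate returns an ordinary heap buffer.

```java
import java.nio.ByteBuffer;

// Sketch: allocate one buffer in direct (off-heap) memory and one on the heap.
class DirectMemoryDemo {
    // Requests 'bytes' of direct memory; the total is capped by -XX:MaxDirectMemorySize.
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer direct = allocateOffHeap(1024);
        ByteBuffer onHeap = ByteBuffer.allocate(1024);
        System.out.println(direct.isDirect()); // prints true
        System.out.println(onHeap.isDirect()); // prints false
        System.out.println(direct.capacity()); // prints 1024
    }
}
```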


Origin blog.csdn.net/weixin_47184173/article/details/109550397