4. Virtual machine stack

4.1. Overview of virtual machine stack

4.1.1. Background of the emergence of virtual machine stack

Java's instruction set is stack-based because of its cross-platform goal. CPU architectures differ from platform to platform, so a register-based instruction set would tie the bytecode to particular hardware.

The advantages are portability, a small instruction set, and a compiler that is easy to implement. The disadvantages are lower performance and the need for more instructions to accomplish the same work.

4.1.2. Initial impressions

Many Java developers, when they talk about the Java memory structure, roughly divide the JVM's memory into just the Java heap (heap) and the Java stack (stack). Why?

4.1.3. Stack and Heap in Memory

The stack is the unit of runtime, and the heap is the unit of storage.

● The stack addresses how the program runs, that is, how the program executes and how it processes data.
● The heap addresses data storage, that is, how and where data is kept.


4.1.4. Basic contents of virtual machine stack

What is the Java virtual machine stack?

The Java Virtual Machine Stack was called the Java stack in the early days. Each thread gets its own virtual machine stack when it is created. The stack holds stack frames, one for each Java method call, and is private to the thread.

Life cycle

The life cycle of the stack is the same as that of its thread.

Role

The stack is responsible for running Java programs: it holds local variables and partial results of methods, and takes part in method invocation and return.

Features of the stack

The stack is a fast and efficient way to allocate storage; its access speed is second only to the program counter.

The JVM performs only two direct operations on the Java stack:

● Pushing a stack frame when a method is invoked
● Popping the stack frame when the method completes

The stack has no garbage collection, but it can overflow.


Interview question: What exceptions did you encounter during development?

Possible exceptions in the stack
The Java Virtual Machine specification allows Java stacks to be either of fixed size or dynamically expandable.
If Java virtual machine stacks are of fixed size, the size of each thread's stack may be chosen independently when that thread is created. If the computation in a thread requires a larger stack than is permitted, the Java virtual machine throws a StackOverflowError.
If Java virtual machine stacks can be dynamically expanded and an expansion cannot obtain enough memory, or if there is not enough memory to create the initial stack for a new thread, the Java virtual machine throws an OutOfMemoryError.

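public class StackErrorTest {
    public static void main(String[] args) {
        test();
    }

    public static void test() {
        // No exit condition: every call pushes another stack frame.
        test();
    }
}
// Throws: Exception in thread "main" java.lang.StackOverflowError
// The program keeps calling itself recursively with no exit condition,
// so stack frames are pushed until the stack is exhausted.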

Set stack memory size

The -Xss option sets the maximum stack size of a thread. The stack size directly determines the maximum depth that method calls can reach.

public class StackDeepTest {
    private static int count = 0;

    public static void recursion() {
        count++;
        recursion();
    }

    public static void main(String[] args) {
        try {
            recursion();
        } catch (Throwable e) {
            // Reports how many calls fit before the stack overflowed.
            System.out.println("deep of calling=" + count);
            e.printStackTrace();
        }
    }
}
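The relationship between stack size and call depth can also be probed from inside one program. The following is a minimal sketch, not a benchmark: it assumes the `Thread(ThreadGroup, Runnable, String, long stackSize)` constructor, whose `stackSize` argument is only a hint that some platforms round or ignore. The class and method names (`StackSizeProbe`, `maxDepth`) are made up for illustration.

```java
// Sketch: measure how deep unbounded recursion gets for a requested stack size.
class StackSizeProbe {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();
    }

    // Runs the recursion on a fresh thread whose stack size is requested through
    // the Thread constructor, and returns the depth reached before the overflow.
    static int maxDepth(long stackSizeBytes) {
        depth = 0;
        Thread probe = new Thread(null, () -> {
            try {
                recurse();
            } catch (StackOverflowError expected) {
                // The overflow is the signal this probe is waiting for.
            }
        }, "stack-probe", stackSizeBytes);
        probe.start();
        try {
            probe.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("256 KB requested: depth=" + maxDepth(256 * 1024));
        System.out.println("4 MB requested:   depth=" + maxDepth(4L * 1024 * 1024));
    }
}
```

On a typical desktop JVM the second number should be noticeably larger, but the exact values depend on the platform, the JVM, and how much of the code has been compiled.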

4.2. Storage unit of stack

4.2.1. What is stored in the stack?

Each thread has its own stack, and the data in the stack is organized as stack frames.

Each method being executed on the thread corresponds to one stack frame.

A stack frame is a block of memory, a data set that holds the various pieces of information needed while a method executes.

4.2.2. Stack operation principle

The JVM performs only two direct operations on the Java stack, pushing and popping stack frames, following the "first in, last out" (that is, "last in, first out") principle.

In an active thread, there will only be one active stack frame at a point in time. That is, only the stack frame of the currently executing method (top stack frame) is valid. This stack frame is called the current stack frame (Current Frame), and the method corresponding to the current stack frame is the current method (Current Method). The class that defines this method is the current class (Current Class).

All bytecode instructions run by the execution engine only operate on the current stack frame.

If this method calls another method, a new stack frame is created for the callee and placed on top of the stack, becoming the new current frame.

Stack frames that belong to different threads are not allowed to reference each other; a stack frame cannot reference a stack frame of another thread.

When a called method returns, its stack frame passes the method's result back to the previous stack frame. The virtual machine then discards that frame, and the previous frame becomes the current frame again.

A Java method can return in two ways: normally, through a return instruction, or abruptly, by throwing an exception that it does not handle. Either way, its stack frame is popped.

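public class CurrentFrameTest {
    public void methodA() {
        System.out.println("Method of the current stack frame -> methodA");
        methodB();
        System.out.println("Method of the current stack frame -> methodA");
    }

    public void methodB() {
        System.out.println("Method of the current stack frame -> methodB");
    }

    public static void main(String[] args) {
        new CurrentFrameTest().methodA();
    }
}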
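The push and pop behaviour can be observed at runtime. The sketch below is illustrative only: it uses `Thread.currentThread().getStackTrace().length` as a rough stand-in for the number of frames on the stack, and the names `FrameDepthSketch`, `callee` and `thrower` are invented for the example. It records the depth before a call, inside the callee, and after the callee has returned, once for a normal return and once for an exit through an exception.

```java
class FrameDepthSketch {
    // Number of frames reported for the calling thread. The fixed overhead of
    // depth() and getStackTrace() is identical at every call site, so only
    // differences between two readings are meaningful.
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    static int calleeDepth;

    static void callee() {
        calleeDepth = depth();
    }

    static void thrower() {
        calleeDepth = depth();
        throw new IllegalStateException("abrupt return");
    }

    // Returns {depth before the call, depth inside the callee, depth after it}.
    static int[] normalReturn() {
        int before = depth();
        callee();
        int after = depth();
        return new int[] {before, calleeDepth, after};
    }

    static int[] exceptionalReturn() {
        int before = depth();
        try {
            thrower();
        } catch (IllegalStateException e) {
            // thrower's frame has already been popped when control arrives here
        }
        int after = depth();
        return new int[] {before, calleeDepth, after};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(normalReturn()));
        System.out.println(java.util.Arrays.toString(exceptionalReturn()));
    }
}
```

In both cases the middle value is one larger than the other two: the callee's frame sits on top of the caller's while it runs and is gone afterwards, whether it returned or threw.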

4.2.3. Internal structure of stack frame

Each stack frame stores:

● Local Variables
● Operand Stack (or expression stack)
● Dynamic Linking (a reference to the method in the runtime constant pool)
● Return Address (defines how the method exits, normally or abnormally)
● Some additional information

Each thread's stack is private to that thread, and each stack holds many stack frames. The size of a stack frame is determined mainly by its local variable table and operand stack.

4.3. Local Variables

The local variable table is also called the local variable array.

● It is defined as an array of numeric slots and is mainly used to store method parameters and the local variables defined in the method body. These include the primitive data types, object references (reference), and the returnAddress type.
● Because the local variable table lives on the thread's stack and is private to the thread, there is no data safety issue.
● The required capacity of the local variable table is determined at compile time and stored in the max_locals item of the method's Code attribute. The size of the local variable table does not change during method execution.
● The number of nested method calls is limited by the size of the stack. Generally, the larger the stack, the more nested calls are possible. The more parameters and local variables a method has, the larger its local variable table and therefore its stack frame. Each call then takes more stack space, so fewer nested calls fit.
● Variables in the local variable table are valid only during the current method call. When a method executes, the virtual machine uses the local variable table to pass argument values to the parameter variables. When the method call ends, the stack frame is destroyed and the local variable table with it.

4.3.1. Understanding Slot

● The most basic storage unit of the local variable table is the Slot (variable slot).
● Parameter values are stored starting at index 0 of the local variable array, and the array ends at index length - 1.
● The local variable table stores variables of the 8 primitive data types, reference types (reference), and the returnAddress type, all of which are known at compile time.
● Types of 32 bits or fewer (including returnAddress) occupy one slot; the 64-bit types (long and double) occupy two slots.
  ○ byte, short, and char are converted to int before storage, and boolean is also converted to int: 0 means false, non-zero means true.
● The JVM assigns an access index to each slot; a local variable value is accessed through this index.
● When an instance method is called, its parameters and the local variables defined in its body are copied to the slots of the local variable table in order.
● To access a 64-bit local variable (long or double), use the index of the first of its two slots.
● If the current frame was created by a constructor or an instance method, the object reference this is stored in slot 0, and the remaining parameters follow in parameter-list order.
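The slot rules above can be made concrete by annotating a small class with the slot each variable would occupy. This is a sketch under the assumption that javac assigns slots to locals in declaration order, which is its usual behaviour; the class `SlotLayoutSketch` is invented for illustration, and the layout can be checked with `javap -v` on a class compiled with `-g`, which prints the LocalVariableTable.

```java
class SlotLayoutSketch {
    // Instance method: slot 0 holds `this`, parameters follow in order.
    //   slot 0    : this
    //   slot 1    : int a
    //   slots 2-3 : long b       (64-bit, addressed by index 2)
    //   slots 4-5 : double c     (64-bit, addressed by index 4)
    //   slot 6    : boolean flag (stored as an int)
    //   slots 7-8 : long result  (addressed by index 7)
    long instanceSum(int a, long b, double c) {
        boolean flag = a > 0;
        long result = a + b + (long) c;
        return flag ? result : -result;
    }

    // Static method: there is no `this`, so parameters start at slot 0.
    //   slot 0    : int a
    //   slots 1-2 : long b
    //   slots 3-4 : long result
    static long staticSum(int a, long b) {
        long result = a + b;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(new SlotLayoutSketch().instanceSum(1, 2L, 3.9));
        System.out.println(staticSum(1, 2L));
    }
}
```

Under these assumptions the Code attribute of `instanceSum` would need 9 local slots and that of `staticSum` 5.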

4.3.2. Slot reuse

Slots in a stack frame's local variable table can be reused. If a local variable goes out of scope, a new local variable declared after that scope is likely to reuse the expired variable's slot, which saves resources.

public class SlotTest {
    public void localVar1() {
        int a = 0;
        System.out.println(a);
        int b = 0;
    }

    public void localVar2() {
        {
            int a = 0;
            System.out.println(a);
        }
        // a is out of scope here, so b reuses the slot that a occupied
        int b = 0;
    }
}

4.3.3. Comparison between static variables and local variables

After slots for the parameters are allocated, slots for the variables defined in the method body are allocated according to their order and scope.
Class variables have two chances to be initialized. The first is the "preparation" phase, which performs system initialization and sets class variables to their zero values. The second is the "initialization" phase, which assigns the initial values that the programmer defined in code.
Unlike class variables, the local variable table has no system initialization. A local variable must therefore be assigned explicitly before it is used; otherwise it cannot be used.

public void test() {
    int i;
    System.out.println(i); // compile error: variable i might not have been initialized
}

This code does not compile: a local variable cannot be used before it has been assigned.
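For contrast, here is a small sketch that puts the two cases side by side. The class and field names are made up for the example. The static fields are readable without any explicit assignment because the preparation phase zeroes them, while the local variable has to be assigned before it is read.

```java
class DefaultInitSketch {
    static int classCounter;      // set to 0 in the preparation phase, never assigned in code
    static int configured = 42;   // 0 after preparation, 42 after the initialization phase

    static int localValue() {
        int i = 7;                // a local variable must be assigned before it is read
        return i;
    }

    public static void main(String[] args) {
        System.out.println(classCounter);
        System.out.println(configured);
        System.out.println(localValue());
    }
}
```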

4.3.4. Supplementary instructions

In the stack frame, the part most closely related to performance tuning is the local variable table described earlier. When a method executes, the virtual machine uses the local variable table to pass arguments to the method.

Variables in the local variable table are also important garbage collection roots. As long as an object is referenced directly or indirectly from the local variable table, it will not be collected.

4.4. Operand Stack

In addition to the local variable table, each stack frame contains a Last-In-First-Out operand stack, which can also be called the expression stack.

During method execution, data is written to or read from the operand stack according to the bytecode instructions, that is, pushed and popped.

● Some bytecode instructions push values onto the operand stack; others pop operands off the stack, use them, and push the result back.
● For example: copying, swapping, and adding values.

Code example

public void testAddOperation() {
    byte i = 15;
    int j = 8;
    int k = i + j;
}

public void testAddOperation();
    Code:
       0: bipush 15
       2: istore_1
       3: bipush 8
       5: istore_2
       6: iload_1
       7: iload_2
       8: iadd
       9: istore_3
      10: return

The operand stack is mainly used to hold intermediate results of a computation and serves as temporary storage for variables during the computation.

The operand stack is a work area of the JVM execution engine. When a method starts executing, a new stack frame is created, and the operand stack of that method is empty.

Each operand stack has a definite maximum depth for storing values. The required maximum depth is determined at compile time and stored in the method's Code attribute as max_stack.

Each element on the stack can be of any Java data type:

● A 32-bit type occupies one unit of stack depth
● A 64-bit type occupies two units of stack depth

The operand stack is not accessed by index; data can only be accessed through standard push and pop operations.

If the called method has a return value, that value is pushed onto the operand stack of the current stack frame, and the PC register is updated to the next bytecode instruction to execute.

The data types of the elements on the operand stack must strictly match the sequence of bytecode instructions. The compiler verifies this at compile time, and it is verified again during the data flow analysis of the class verification stage of class loading.

In addition, when we say that the interpreter of the Java virtual machine is a stack-based execution engine, the stack in question is the operand stack.

4.5. Code tracking


public void testAddOperation() {
    byte i = 15;
    int j = 8;
    int k = i + j;
}

Use the javap command to decompile the class file: javap -v classname.class

public void testAddOperation();
    Code:
       0: bipush 15
       2: istore_1
       3: bipush 8
       5: istore_2
       6: iload_1
       7: iload_2
       8: iadd
       9: istore_3
      10: return
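To follow the listing step by step, it can help to replay it by hand. The following is a toy simulation, not how the JVM is implemented: it models the local variable table and the operand stack as two int arrays and executes the same instructions in order. The class name `OperandStackSketch` and its helpers are invented; the sizes 4 and 2 correspond to the locals (this, i, j, k) and the maximum stack depth this method needs.

```java
class OperandStackSketch {
    private final int[] locals = new int[4]; // slot 0 stands in for `this`, then i, j, k
    private final int[] stack = new int[2];  // at most two values are on the stack at once
    private int top = 0;                     // number of values currently on the stack

    private void push(int value) { stack[top++] = value; }
    private int pop() { return stack[--top]; }

    // Replays the bytecode of testAddOperation() and returns the final locals.
    int[] run() {
        push(15);                    //  0: bipush 15  push the constant 15
        locals[1] = pop();           //  2: istore_1   pop into slot 1 (i)
        push(8);                     //  3: bipush 8   push the constant 8
        locals[2] = pop();           //  5: istore_2   pop into slot 2 (j)
        push(locals[1]);             //  6: iload_1    push i
        push(locals[2]);             //  7: iload_2    push j
        int right = pop();           //  8: iadd       pop two ints...
        int left = pop();
        push(left + right);          //                ...and push their sum
        locals[3] = pop();           //  9: istore_3   pop into slot 3 (k)
        return locals.clone();       // 10: return     the stack is empty again
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(new OperandStackSketch().run()));
    }
}
```

After the run, slot 3 (k) holds 23, and the operand stack is empty, exactly as it was when the method started.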

The difference between i++ and ++i, a common interview question, is covered in the bytecode chapter.

4.6. Top-of-Stack Caching Technology

As mentioned earlier, the zero-address instructions used by a stack-based virtual machine are more compact, but completing an operation takes more push and pop instructions. That means more instruction dispatches and more memory reads and writes.

Because operands are stored in memory, frequent memory reads and writes inevitably slow execution down. To address this, the designers of the HotSpot JVM proposed Top-of-Stack Caching (ToS): the top-of-stack elements are cached in physical CPU registers, which reduces the number of memory reads and writes and improves the efficiency of the execution engine.

4.7. Dynamic Linking

Some sources group dynamic linking, the method return address, and the additional information together as the frame data area.

Each stack frame contains a reference to the method it belongs to in the runtime constant pool. The purpose of this reference is to let the code of the current method use dynamic linking (Dynamic Linking), for example through the invokedynamic instruction.

When a Java source file is compiled into a bytecode file, all variable and method references are saved in the class file's constant pool as symbolic references (Symbolic Reference). For example, when a method calls another method, the call is represented by a symbolic reference to that method in the constant pool. The role of dynamic linking is to convert these symbolic references into direct references to the called method.

Why do you need a runtime constant pool?

The constant pool provides the symbols and constants that instructions refer to.

4.8. Method invocation: resolution and dispatch

In the JVM, converting a symbolic reference into a direct reference to the called method is tied to the method's binding mechanism.

4.8.1. Static linking

When a bytecode file is loaded into the JVM, if the called target method is known at compile time and does not change at runtime, the process of converting the symbolic reference to the called method into a direct reference is called static linking.

4.8.2. Dynamic linking

If the called method cannot be determined at compile time, the symbolic reference to the called method can only be converted into a direct reference while the program is running. Because this conversion is dynamic, it is called dynamic linking.

Static linking and dynamic linking are not nouns but verbs, which is the key to understanding them.

The corresponding binding mechanisms are early binding (Early Binding) and late binding (Late Binding). Binding is the process of replacing a symbolic reference to a field, method, or class with a direct reference. It happens only once.

4.8.3. Early binding

Early binding means that if the called target method is known at compile time and does not change at runtime, the method can be bound to the type it belongs to. Because it is clear which target method is being called, static linking can be used to convert the symbolic reference into a direct reference.

4.8.4. Late binding

If the called method cannot be determined at compile time, the method can only be bound according to the actual type while the program is running. This is called late binding.

With the rise of high-level languages, there are more and more object-oriented languages similar to Java. Although they differ in syntax and style, they share one thing: they all support object-oriented features such as encapsulation, inheritance, and polymorphism. Because these languages are polymorphic, they naturally have both kinds of binding, early and late.

Any ordinary method in Java behaves like a virtual function, equivalent to a virtual function in C++ (where the keyword virtual must be used to declare one explicitly). If you do not want a method in a Java program to behave like a virtual function, mark it with the keyword final.
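The difference between the two kinds of binding shows up in a few lines of Java. The classes below (`Animal`, `Bird`, `BindingSketch`) are invented for this sketch. The static method is selected from the type named in the source, the ordinary instance method from the actual type of the object, and the final method cannot be overridden at all.

```java
class Animal {
    static String kind() { return "animal"; }   // static: bound early, by the named type
    String sound() { return "..."; }            // ordinary method: bound late, by the actual type
    final String name() { return "animal"; }    // final: cannot be overridden
}

class Bird extends Animal {
    static String kind() { return "bird"; }     // hides Animal.kind(), does not override it

    @Override
    String sound() { return "tweet"; }
}

class BindingSketch {
    static String describe() {
        Animal a = new Bird();                  // declared type Animal, actual type Bird
        return Animal.kind() + "/" + a.sound() + "/" + a.name();
    }

    public static void main(String[] args) {
        System.out.println(describe());         // prints animal/tweet/animal
    }
}
```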

4.8.5. Virtual methods and non-virtual methods

If the version of a method to call is determined at compile time and cannot change at runtime, the method is called a non-virtual method.

Static methods, private methods, final methods, instance constructors, and parent class methods are all non-virtual methods. All other methods are virtual methods.

Calls to non-virtual methods can be resolved during the resolution phase of class loading. The following is an example of a non-virtual method:

class Father {
    public static void print(String str) {
        System.out.println("father " + str);
    }

    private void show(String str) {
        System.out.println("father" + str);
    }
}

class Son extends Father {
}

public class VirtualMethodTest {
    public static void main(String[] args) {
        Son.print("coder");
        // Father fa = new Father();
        // fa.show("atguigu.com");
    }
}

The virtual machine provides the following method invocation instructions:

Ordinary invocation instructions:

● invokestatic: calls a static method; the unique method version is determined during the resolution phase
● invokespecial: calls <init> methods, private methods, and parent class methods; the unique method version is determined during the resolution phase
● invokevirtual: calls all virtual methods
● invokeinterface: calls interface methods

Dynamic invocation instruction:

● invokedynamic: dynamically resolves the method that needs to be called, then executes it

The first four instructions are fixed inside the virtual machine, and the programmer cannot influence how the method is selected and executed, whereas invokedynamic lets the user determine the method version. Methods called through invokestatic and invokespecial are non-virtual methods; the rest (except those marked final) are virtual methods.
In summary, methods that cannot be overridden by subclasses are non-virtual methods, and methods that can be overridden by subclasses are virtual methods.
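As a rough guide to which instruction a given call turns into, the sketch below annotates each call site with the instruction javac would normally emit for it. The types (`Greeter`, `Parent`, `Child`, `InvokeKindsSketch`) are invented for the example, and the annotations assume a reasonably recent javac; they can be confirmed with `javap -c`. One caveat: since JDK 11, calls to private instance methods within the same nest are compiled to invokevirtual rather than invokespecial, although they are still not overridable.

```java
interface Greeter {
    String greet(String name);
}

class Parent {
    String who() { return "parent"; }
}

class Child extends Parent implements Greeter {
    static String label() { return "child"; }

    @Override
    String who() {
        return "child of " + super.who();        // super call: invokespecial
    }

    @Override
    public String greet(String name) { return "hello " + name; }
}

class InvokeKindsSketch {
    static String run() {
        String s = Child.label();                // invokestatic
        Child c = new Child();                   // new, then invokespecial <init>
        Parent p = c;
        String w = p.who();                      // invokevirtual, dispatched to Child.who()
        Greeter g = c;
        String h = g.greet("jvm");               // invokeinterface
        java.util.function.Supplier<String> sup = () -> "lambda"; // invokedynamic builds the Supplier
        String l = sup.get();                    // invokeinterface
        return s + "|" + w + "|" + h + "|" + l;
    }

    public static void main(String[] args) {
        System.out.println(run());               // prints child|child of parent|hello jvm|lambda
    }
}
```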

About the invokedynamic instruction

● The JVM bytecode instruction set has always been relatively stable. Only in Java 7 was the invokedynamic instruction added, as an improvement made to support dynamically typed languages on the JVM.
● However, Java 7 provided no way to generate invokedynamic directly from Java source; it had to be generated with a low-level bytecode tool such as ASM. Only with the arrival of lambda expressions in Java 8 did Java get a direct way to produce invokedynamic instructions.
● The dynamic language support added in Java 7 is essentially a change to the Java virtual machine specification, not to the rules of the Java language. This part is relatively complex. It adds a new kind of method invocation to the virtual machine, and the most direct beneficiaries are compilers for dynamic languages that run on the Java platform.

Dynamically typed languages and statically typed languages

The difference between dynamically typed and statically typed languages is whether type checking happens at compile time or at runtime. If it happens at compile time, the language is statically typed; otherwise it is dynamically typed.

Put more plainly, a statically typed language checks the type of the variable itself, while a dynamically typed language checks the type of the variable's value. In a dynamic language, variables have no type information; only values do. This is an important feature of dynamic languages.

4.8.6. The nature of method overriding

The essence of method overriding in the Java language:

  1. Find the actual type of the object referenced by the first element on top of the operand stack, denoted C.
  2. If type C contains a method whose descriptor and simple name match the constant, check access permission. If the check passes, return a direct reference to the method and end the search; if it fails, throw a java.lang.IllegalAccessError.
  3. Otherwise, repeat the search and check of step 2 on each parent class of C, from bottom to top along the inheritance chain.
  4. If no suitable method is found, throw a java.lang.AbstractMethodError.
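The search order in these steps can be imitated with reflection. The following is only a sketch of the lookup order, not the JVM's actual resolution code: it matches by name with no parameters, skips the access check of step 2, and the names `OverrideLookupSketch`, `Father`, `Son` and `resolve` are chosen for the example.

```java
class OverrideLookupSketch {
    static class Father {
        public String hello() { return "father"; }
        public String bye() { return "father bye"; }
    }

    static class Son extends Father {
        @Override
        public String bye() { return "son bye"; }
    }

    // Steps 1 to 4: start at the receiver's actual class C and walk up the
    // superclass chain until a class declares a matching method.
    static java.lang.reflect.Method resolve(Object receiver, String name) {
        for (Class<?> c = receiver.getClass(); c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredMethod(name);   // found in c: the search ends
            } catch (NoSuchMethodException notDeclaredHere) {
                // not declared in c; continue with its parent class
            }
        }
        throw new AbstractMethodError(name);        // step 4: nothing suitable found
    }

    public static void main(String[] args) {
        Object receiver = new Son();
        System.out.println(resolve(receiver, "bye").getDeclaringClass().getSimpleName());   // Son
        System.out.println(resolve(receiver, "hello").getDeclaringClass().getSimpleName()); // Father
    }
}
```

For a `Son` receiver, `bye` resolves in `Son` itself, while `hello` is only found one level up, in `Father`.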

About IllegalAccessError

The program attempts to access or modify a field, or to call a method, that it does not have permission to access. Normally this is caught by the compiler. If the error occurs at runtime, it indicates that a class has undergone an incompatible change.

4.8.7. Method calling: virtual method table

In object-oriented programming, dynamic dispatch is used very frequently. If every dynamic dispatch had to search the class's method metadata for the right target again, execution efficiency could suffer. To improve performance, the JVM builds a virtual method table in the class's method area (non-virtual methods do not appear in it) and uses an index into this table instead of a search.

Each class has a virtual method table, which stores the actual entry of each method.

When was the virtual method table created?

The virtual method table will be created and initialized during the linking phase of class loading. After the initial values ​​of the class variables are prepared, the JVM will also initialize the method table of the class.

Example:

interface Friendly {
    void sayHello();
    void sayGoodbye();
}

class Dog {
    public void sayHello() {
    }

    public String toString() {
        return "Dog";
    }
}

class Cat implements Friendly {
    public void eat() {
    }

    public void sayHello() {
    }

    public void sayGoodbye() {
    }

    protected void finalize() {
    }
}

class CockerSpaniel extends Dog implements Friendly {
    public void sayHello() {
        super.sayHello();
    }

    public void sayGoodbye() {
    }
}
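The idea of replacing a search with an index can be shown with a toy model. This sketch is not how HotSpot lays out its tables (real entries are method pointers, and interface calls use a separate mechanism); `VTableSketch`, `derive` and `dispatch` are invented names. A subclass table starts as a copy of the parent's table, overridden methods replace the entry at the same index, and newly declared methods are appended.

```java
class VTableSketch {
    final String[] names;    // names[i]: the method that owns index i
    final String[] entries;  // entries[i]: the "Owner.method" implementation reached through index i

    VTableSketch(String[] names, String[] entries) {
        this.names = names;
        this.entries = entries;
    }

    // Builds the table of a subclass named `owner` that declares the given methods.
    static VTableSketch derive(VTableSketch parent, String owner, String... declared) {
        String[] names = parent.names.clone();
        String[] entries = parent.entries.clone();
        for (String method : declared) {
            int index = java.util.Arrays.asList(names).indexOf(method);
            if (index < 0) {                       // new method: append a new index
                names = java.util.Arrays.copyOf(names, names.length + 1);
                entries = java.util.Arrays.copyOf(entries, entries.length + 1);
                index = names.length - 1;
                names[index] = method;
            }
            entries[index] = owner + "." + method; // override keeps the parent's index
        }
        return new VTableSketch(names, entries);
    }

    // A call site only needs the index; no search through the hierarchy.
    String dispatch(int index) {
        return entries[index];
    }

    public static void main(String[] args) {
        VTableSketch object = new VTableSketch(new String[] {"toString"}, new String[] {"Object.toString"});
        VTableSketch dog = derive(object, "Dog", "sayHello", "toString");
        VTableSketch spaniel = derive(dog, "CockerSpaniel", "sayHello", "sayGoodbye");
        System.out.println(java.util.Arrays.toString(dog.entries));
        System.out.println(java.util.Arrays.toString(spaniel.entries));
    }
}
```

In this model CockerSpaniel's entry for toString still points at Dog's implementation, while its sayHello entry sits at the same index as Dog's but points at its own override.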


4.9. Method return address (return address)

Stores the value of the pc register of the method that called this method.

There are two ways for a method to end:

  • Normal completion
  • An unhandled exception occurs and the method exits abnormally

Whichever way the method exits, execution returns to the place where the method was called. When the method exits normally, the value of the caller's pc counter is used as the return address, that is, the address of the instruction following the one that called the method. For an exit through an exception, the return address is determined through the exception table, and this information is generally not saved in the stack frame.

Once a method has started executing, there are only two ways it can exit:

  1. The execution engine encounters any method-return bytecode instruction, and the return value is passed to the caller. This is called normal completion.
    Which return instruction is used after a normal call depends on the actual data type of the method's return value.
    The return instructions are ireturn (used when the return value is boolean, byte, char, short, or int), lreturn (long), freturn (float), dreturn (double), and areturn (reference types). There is also return, used by methods declared void, instance initialization methods, and class and interface initialization methods.
  2. An exception (Exception) is encountered during execution and is not handled within the method, that is, no matching exception handler is found in the method's exception table. The method then exits; this is called abrupt (exceptional) completion.

The exception handlers of a method are stored in an exception table, so that the code that handles an exception can be found when the exception is thrown.

Exception table:
from    to    target    type
4       16    19        any
19      21    19        any

In essence, exiting a method means popping the current stack frame. At that point the caller's local variable table and operand stack are restored, the return value (if any) is pushed onto the operand stack of the caller's stack frame, the PC register is set, and so on, so that the caller can continue executing.

The difference between normal completion and abrupt completion is that a method that exits through an exception does not produce any return value for its caller.
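The two kinds of exit can be seen side by side in a short sketch. The class `ReturnKindsSketch` is invented for illustration; the comments name the return instruction javac would normally choose for each return type, which `javap -c` can confirm.

```java
class ReturnKindsSketch {
    static int asInt()         { return 1; }     // ireturn
    static boolean asBoolean() { return true; }  // ireturn (boolean is handled as int)
    static long asLong()       { return 2L; }    // lreturn
    static float asFloat()     { return 3f; }    // freturn
    static double asDouble()   { return 4d; }    // dreturn
    static String asRef()      { return "ref"; } // areturn
    static void nothing()      { }               // return

    // Abrupt completion: no handler in this method, so no value reaches the caller.
    static int failing() {
        throw new IllegalStateException("no handler in this method");
    }

    static String callBoth() {
        int ok = asInt();                        // normal completion: the value arrives here
        String outcome;
        try {
            int never = failing();               // this assignment never happens
            outcome = "returned " + never;
        } catch (IllegalStateException e) {
            outcome = "threw";
        }
        return ok + ":" + outcome;
    }

    public static void main(String[] args) {
        System.out.println(callBoth());          // prints 1:threw
    }
}
```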

4.10. Some additional information

The stack frame is also allowed to carry some additional information related to the Java virtual machine implementation. For example: information that provides support for program debugging.

4.11. Stack related interview questions

● Give an example of a stack overflow. (StackOverflowError)
○ The stack size is set through -Xss
● Can adjusting the stack size guarantee that overflow never occurs?
○ No, there is no such guarantee
● Is a larger stack allocation always better?
○ No. It lowers the chance of overflow for a while, but it takes space away from other threads, because the total space is limited.
● Does garbage collection involve the virtual machine stack?
○ No
● Are the local variables defined in a method thread-safe?
○ It depends on the case. If the object is created inside the method and dies inside it, without being returned to the outside, it is thread-safe; otherwise it is not.
○ A reference-type parameter is not thread-safe: the object passed in may have been handed to other threads at the call site.
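The last answer is easiest to see with a mutable object such as StringBuilder. The sketch below uses invented method names and only illustrates the escape rule; it does not actually start competing threads.

```java
class LocalVarSafetySketch {
    // Thread-safe: sb is created and dies inside the method and never escapes.
    static String internalOnly() {
        StringBuilder sb = new StringBuilder();
        sb.append("a");
        sb.append("b");
        return sb.toString();    // only an immutable String leaves the method
    }

    // Not thread-safe: the caller may share the same sb with other threads.
    static void fromParameter(StringBuilder sb) {
        sb.append("a");
        sb.append("b");
    }

    // Not thread-safe: the builder is returned, so it escapes to the caller,
    // who may pass it to other threads.
    static StringBuilder escapes() {
        StringBuilder sb = new StringBuilder();
        sb.append("a");
        sb.append("b");
        return sb;
    }

    public static void main(String[] args) {
        System.out.println(internalOnly());
        StringBuilder shared = new StringBuilder();
        fromParameter(shared);
        System.out.println(shared);
        System.out.println(escapes());
    }
}
```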

Runtime data area        Is there an Error?    Is there a GC?
Program counter          No                    No
Virtual machine stack    Yes (SOE)             No
Native method stack      Yes                   No
Method area              Yes (OOM)             Yes
Heap                     Yes                   Yes

Origin blog.csdn.net/picktheshy/article/details/132545398