[JVM] Part 5: The virtual machine stack

Hello, everyone, I am a pig who is dominated by cabbage.

A person who loves to study, sleepless and forgets to eat, is obsessed with the girl's chic, calm and indifferent coding handsome boy.

If you like my text, please follow the public account "Let go of this cabbage and let me come"

05-Virtual Machine Stack

1. Overview of the virtual machine stack

The background of the virtual machine stack

Because Java is designed to be cross-platform, its instructions are stack-based. Different platforms have different CPU architectures, so the instruction set cannot be designed around registers.

The advantages are that it is cross-platform, the instruction set is small, and the compiler is easy to implement; the disadvantage is lower performance, because more instructions are needed to achieve the same function.

Stack and heap in memory

The stack is the unit of runtime, and the heap is the unit of storage.

That is: the stack solves the running problem of the program, that is, how the program is executed, or how to process the data. The heap solves the problem of data storage, that is, how to put the data and where to put it.

Most data lives on the heap, while local variables live on the stack. For a primitive type the local variable holds the value itself; for an object it holds only a reference, and the object itself is on the heap.
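As a rough illustration of this split, here is a minimal sketch (the class and method names are made up for this example). It shows the conceptual model only: JIT optimizations such as escape analysis may change where data physically ends up.

```java
class StackVsHeapSketch {
    static int sumOfSquares(int n) {
        int total = 0;               // primitive local: the value itself sits in the stack frame
        int[] squares = new int[n];  // 'squares' is a reference in the frame; the array object is on the heap
        for (int i = 0; i < n; i++) {
            squares[i] = i * i;
            total += squares[i];
        }
        return total;                // the frame (total, squares, i) is discarded on return
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4));
    }
}
```

Once the method returns, nothing on the stack refers to the array any more, so it becomes eligible for garbage collection on the heap.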

The basic content of the virtual machine stack

What is the Java virtual machine stack?

The Java Virtual Machine Stack was called the Java stack in the early days. A virtual machine stack is created for each thread when the thread is created. It holds stack frames (Stack Frame), each of which corresponds to one Java method invocation.

The Java virtual machine stack is private to the thread.

Life cycle

Its life cycle is the same as that of the thread.

Role

It is in charge of running the Java program: it saves the local variables and partial results of methods, and takes part in method invocation and return.

Features of the stack (advantages)

  • The stack is a fast and effective way of allocating storage; its access speed is second only to the program counter.

  • The JVM performs only two direct operations on the Java stack:

    When a method is invoked, its frame is pushed onto the stack (push)

    When the method finishes, its frame is popped off the stack (pop)

  • There is no garbage collection problem for the stack

Possible exceptions in the stack

The Java virtual machine specification allows the size of the Java stack to be dynamic or fixed.

  • If the Java virtual machine stack has a fixed size, the stack capacity of each thread can be chosen independently when the thread is created. If a thread requests more stack capacity than the maximum the Java virtual machine stack allows, the Java virtual machine throws a StackOverflowError.
  • If the Java virtual machine stack can be expanded dynamically, but not enough memory can be obtained when it tries to expand, or there is not enough memory to create the virtual machine stack for a new thread, the Java virtual machine throws an OutOfMemoryError.
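The first case is easy to reproduce. The following is a minimal sketch (the names are hypothetical) that recurses without a base case until the stack is exhausted, then reports how many frames were pushed. The number depends on the JVM, the platform, and the stack size, so no particular value should be expected.

```java
class StackOverflowSketch {
    private static int depth = 0;

    private static void recurse() {
        depth++;
        recurse();   // no base case: every call pushes another frame
    }

    /** Recurses until the stack is exhausted and returns how many calls were made. */
    static int measureMaxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time we get here, all the recursive frames have been popped.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Overflowed after " + measureMaxDepth() + " nested calls");
    }
}
```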

Set the size of the stack memory

We can use the -Xss option to set the maximum stack space of a thread (for example, -Xss256k or -Xss1m). The size of the stack directly determines the maximum reachable depth of method calls.
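To see the effect of stack size without restarting the JVM, one option is the Thread constructor that takes a stackSize argument. The sketch below (hypothetical names) measures the recursion depth on threads created with different requested sizes. Note that, according to the Thread Javadoc, stackSize is only a hint and some platforms ignore it, so the two numbers may or may not differ.

```java
class StackSizeSketch {
    /** Counts nested calls until StackOverflowError on a thread created with the requested stack size. */
    static int depthWithStackSize(long stackSizeBytes) {
        int[] depth = {0};
        Runnable task = () -> {
            try {
                recurse(depth);
            } catch (StackOverflowError e) {
                // Expected: the probe thread ran out of stack.
            }
        };
        Thread worker = new Thread(null, task, "stack-size-probe", stackSizeBytes);
        worker.start();
        try {
            worker.join();   // join() also makes the worker's writes to depth[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the probe thread", e);
        }
        return depth[0];
    }

    private static void recurse(int[] depth) {
        depth[0]++;
        recurse(depth);
    }

    public static void main(String[] args) {
        System.out.println("256 KB requested: " + depthWithStackSize(256L * 1024));
        System.out.println("4 MB requested:   " + depthWithStackSize(4L * 1024 * 1024));
    }
}
```

Running the earlier kind of recursion test with different -Xss values on the command line gives the same kind of comparison for the default thread stack size.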

Local variables VS member variables (or attributes)

Basic data variables vs reference type variables (classes, arrays, interfaces)


A stack frame for a Java method

There is no GC in the virtual machine stack, but OOM can occur. The program counter has neither. The stack only supports push and pop operations, so there is nothing to garbage-collect.

Interview question: what exceptions have you encountered in development?

Memory overflow in the Java virtual machine stack: StackOverflowError, or OutOfMemoryError if the stack can expand dynamically.

-Xss adjusts the size of the stack memory.

2. The storage unit of the stack

What is stored in the stack

  • Each thread has its own stack, and the data in the stack exists in the format of a stack frame.
  • Each method being executed on this thread has its own corresponding stack frame.
  • The stack frame is a memory block and a data set that maintains various data information during the execution of the method.

How the stack works

  • JVM has only two direct operations on the Java stack, namely pushing and popping stack frames, following the principle of "first in, last out, last in, first out".

  • In an active thread, there will only be one active stack frame at a point in time. That is, only the stack frame (top stack frame) of the currently executing method is valid. This stack frame is called the current stack frame (Current Frame), and the method corresponding to the current stack frame is the current method (Current Method). The class that defines this method is the current class (Current Class).

  • All bytecode instructions run by the execution engine only operate on the current stack frame.

  • If other methods are called in this method, a corresponding new stack frame will be created and placed on the top of the stack, which is called the new current frame.

  • The stack frames contained in different threads are not allowed to reference each other, that is, it is impossible to reference the stack frame of another thread in one stack frame.

  • If the current method calls other methods, when the method returns, the current stack frame will return the execution result of this method to the previous stack frame. Then, the virtual machine discards the current stack frame so that the previous stack frame becomes the current stack frame again.

  • A Java method can return in two ways: a normal return, using a return instruction, or by throwing an exception. Either way, the stack frame is popped.
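The push order of frames can be observed from inside a program. This is a small sketch (hypothetical names) in which first() calls second(), which calls third(); third() then reads the current thread's frames from a Throwable, whose stack trace lists the top frame first.

```java
import java.util.ArrayList;
import java.util.List;

class FrameOrderSketch {
    static List<String> first()  { return second(); }
    static List<String> second() { return third(); }

    static List<String> third() {
        List<String> methods = new ArrayList<>();
        // A Throwable records the thread's frames at creation time, top frame first.
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            // Keep only frames that belong to this class.
            if (frame.getClassName().equals(FrameOrderSketch.class.getName())) {
                methods.add(frame.getMethodName());
            }
        }
        return methods;
    }

    public static void main(String[] args) {
        System.out.println(first());
    }
}
```

The list starts with third, then second, then first: the most recently pushed frame is the current frame, and it is the first to be popped.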

The internal structure of the stack frame

Stored in each stack frame:

  • Local Variables Table (Local Variables)
  • Operand stack (Operand Stack) (or expression stack)
  • Dynamic linking (Dynamic Linking) (a reference to the method in the runtime constant pool)
  • Method return address (Return Address) (where execution resumes after the method exits normally or abnormally)
  • Some additional information

Each method invocation corresponds to exactly one stack frame.

Review:

  • The basic concepts of OOP: class, object
  • Basic structure of a class: fields (also called attributes or member variables), methods

3. Local variable table

Local variable table

  • The local variable table is also called the local variable array
  • It is defined as a numeric array, mainly used to store method parameters and the local variables defined in the method body. The data types include the primitive types, object references (reference), and the returnAddress type.
  • Since the local variable table is built on the thread's stack and is the thread's private data, there is no data safety problem
  • The required capacity of the local variable table is determined at compile time and stored in the max_locals item of the method's Code attribute. The size of the local variable table does not change during the execution of the method.
  • The number of nested method calls is determined by the size of the stack. Generally speaking, the larger the stack, the more nested calls are possible. The more parameters and local variables a method has, the larger its local variable table and therefore its stack frame. Its calls then take up more stack space, which reduces the number of nested calls possible.
  • The variables in the local variable table are only valid during the current method invocation. When the method executes, the virtual machine uses the local variable table to pass the argument values to the parameter variables. When the method call ends, the local variable table is destroyed along with the method's stack frame.

Understanding of Slot

  • Parameter values are always stored starting at index 0 of the local variable array, and the array ends at index length - 1.

  • The most basic storage unit of the local variable table is the Slot (variable slot)

  • The local variable table stores the primitive types (8 kinds), reference types (reference), and returnAddress-type variables known at compile time.

  • In the local variable table, a type of 32 bits or fewer occupies one slot (including the returnAddress type), and a 64-bit type (long and double) occupies two slots.

    • byte, short, and char are converted to int before storage; boolean is also converted to int, with 0 meaning false and non-zero meaning true.
    • long and double occupy two slots.
  • The JVM assigns an access index to each Slot in the local variable table; through this index the local variable value stored in the table can be accessed

  • When an instance method is called, its method parameters and the local variables defined in the method body are copied to the Slots of the local variable table in order

  • To access a 64-bit local variable (long or double), use only the first of its two indexes.

  • If the current frame was created by a constructor or an instance method, the object reference this is stored in the slot with index 0, and the remaining parameters follow in the order of the parameter list. (Static methods have no this.)
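The slot rules above can be checked against a concrete class. The sketch below uses hypothetical names, and the slot numbers in the comments are what javac typically assigns (parameters first, then locals in declaration order); they can be confirmed with javap -v, where both methods should report locals=7.

```java
class SlotLayoutSketch {
    // Static method: there is no 'this', so the parameters start at slot 0.
    //   slot 0: count (int)
    //   slots 1-2: total (long)
    //   slots 3-4: ratio (double)
    //   slots 5-6: result (double)
    static double combine(int count, long total, double ratio) {
        double result = (count + total) * ratio;
        return result;
    }

    private final int base;

    SlotLayoutSketch(int base) {
        this.base = base;
    }

    // Instance method: slot 0 holds 'this'.
    //   slot 0: this
    //   slot 1: flag (boolean, stored as int)
    //   slots 2-3: offset (long)
    //   slot 4: c (char, stored as int)
    //   slots 5-6: sum (long)
    long shifted(boolean flag, long offset, char c) {
        long sum = base + offset + c;
        return flag ? sum : -sum;
    }

    public static void main(String[] args) {
        System.out.println(combine(2, 3L, 0.5));
        System.out.println(new SlotLayoutSketch(10).shifted(true, 5L, 'A'));
    }
}
```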

Slot reuse

The slots of the local variable table in a stack frame can be reused. If a local variable goes out of scope, a new local variable declared after that scope is likely to reuse the slot of the expired variable, which saves resources.
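A minimal sketch of slot reuse (hypothetical names). The slot numbers in the comments describe what javac typically does and can be checked with javap -v, where the method should report locals=2 even though it declares three variables.

```java
class SlotReuseSketch {
    static int reuse() {
        int a = 0;          // slot 0 (static method, so there is no 'this')
        {
            int b = 1;      // slot 1, valid only inside this block
            a = b + 1;
        }
        int c = a + 1;      // b is out of scope, so c can take over slot 1
        return c;
    }

    public static void main(String[] args) {
        System.out.println(reuse());
    }
}
```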

Example: Comparison of static variables and local variables

Classification of variables:

By data type: ① primitive data types ② reference data types

By where they are declared in the class:

① Member variables: before use, all go through default initialization

Class variables: in the prepare phase of linking, class variables are given default values; then in the initialization phase, they are explicitly assigned, that is, by static initializers and static code blocks

Instance variables: when the object is created, space for the instance variables is allocated in the heap and they are given default values

② Local variables: must be explicitly assigned before use, otherwise compilation fails.

  • After the parameters have been allocated slots, the local variables defined in the method body are allocated according to their order and scope.
  • Class variables have two opportunities to be initialized: once in the "preparation" phase, when the system sets a zero value for the class variable, and once in the "initialization" phase, when they are given the initial value the programmer defined in the code.
  • Unlike class variables, the local variable table has no system initialization, which means a local variable must be explicitly initialized after it is defined, otherwise it cannot be used.

Local variables cannot be used before they are assigned.
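The difference can be seen in a few lines. In this sketch (hypothetical names), the fields get default values without any assignment, while the commented-out line shows the use of an unassigned local variable that the compiler rejects.

```java
class VariableInitSketch {
    static int classDefault;            // prepare phase: 0, never assigned explicitly
    static int classExplicit = 10;      // prepare phase: 0, then initialization phase: 10
    static int fromStaticBlock;
    static {
        fromStaticBlock = 20;           // also assigned in the initialization phase
    }

    int instanceDefault;                // 0 when the object is created
    String instanceRef;                 // null when the object is created

    static int local() {
        int value;
        // return value;                // does not compile: variable value might not have been initialized
        value = 5;
        return value;
    }

    public static void main(String[] args) {
        VariableInitSketch obj = new VariableInitSketch();
        System.out.println(classDefault + " " + classExplicit + " " + fromStaticBlock);
        System.out.println(obj.instanceDefault + " " + obj.instanceRef + " " + local());
    }
}
```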

Supplementary note

  • In the stack frame, the part most closely related to performance tuning is the local variable table described above. When a method executes, the virtual machine uses the local variable table to pass the arguments.
  • The variables in the local variable table are also important garbage collection roots: any object directly or indirectly referenced from the local variable table will not be collected.

4. Operand stack

  • In addition to the local variable table, each stack frame also contains a Last-In-First-Out operand stack, which can also be called an expression stack.

  • During the execution of a method, data is written to or read from the operand stack according to the bytecode instructions, that is, push/pop.

    • Some bytecode instructions push a value onto the operand stack; others pop operands off the stack, use them, and push the result back.
    • For example: operations such as copying, swapping, and summing.
  • The operand stack is mainly used to hold the intermediate results of a computation, and serves as temporary storage for variables during the computation.

  • The operand stack is the work area of the JVM execution engine. When a method starts executing, a new stack frame is created, and its operand stack is empty.

  • Each operand stack has a definite stack depth for storing values. The maximum depth required is determined at compile time and stored in the method's Code attribute as max_stack.

  • An element of the stack can be of any Java data type.

    • A 32-bit type occupies one unit of stack depth
    • A 64-bit type occupies two units of stack depth
  • The operand stack is not accessed by index; data can only be accessed through the standard push and pop operations.

  • If the called method has a return value, the return value is pushed onto the operand stack of the current stack frame, and the PC register is updated to the next bytecode instruction to execute.
  • The data types of the elements in the operand stack must strictly match the bytecode instruction sequence. This is verified by the compiler at compile time, and verified again in the data flow analysis of the class verification stage during class loading.
  • In addition, when we say that the interpreter of the Java virtual machine is a stack-based execution engine, the stack in question is the operand stack.

5. Code tracing
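As a worked trace, the sketch below takes a small method and replays, by hand, the bytecode that javac is expected to emit for it. The class name, the method, and the simulation are illustrative assumptions; the real listing can be checked with javap -c. An int array stands in for the local variable table and a Deque for the operand stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BytecodeTraceSketch {
    // The method being traced.
    static int add() {
        byte i = 15;
        int j = 8;
        int k = i + j;
        return k;
    }

    // Hand simulation of the expected bytecode of add(), one instruction per step.
    static int simulateAdd() {
        int[] locals = new int[3];                     // local variable table (max_locals = 3)
        Deque<Integer> operands = new ArrayDeque<>();  // operand stack (max_stack = 2)

        operands.push(15);             // bipush 15
        locals[0] = operands.pop();    // istore_0   (i)
        operands.push(8);              // bipush 8
        locals[1] = operands.pop();    // istore_1   (j)
        operands.push(locals[0]);      // iload_0
        operands.push(locals[1]);      // iload_1
        int right = operands.pop();    // iadd pops two ints...
        int left = operands.pop();
        operands.push(left + right);   // ...and pushes their sum
        locals[2] = operands.pop();    // istore_2   (k)
        operands.push(locals[2]);      // iload_2
        return operands.pop();         // ireturn
    }

    public static void main(String[] args) {
        System.out.println(add() + " " + simulateAdd());
    }
}
```

Note how every value passes through the operand stack on its way to or from the local variable table, and how the stack never holds more than two entries.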

6. Top-of-stack caching

As mentioned earlier, the zero-address instructions used by stack-based virtual machines are more compact, but completing an operation requires more push and pop instructions, which means more instruction dispatches and more memory reads/writes.

Since the operands are stored in memory, frequent memory reads/writes inevitably affect execution speed. To solve this problem, the designers of the HotSpot JVM proposed top-of-stack caching (ToS, Top-of-Stack Caching), which caches the top-of-stack elements in the registers of the physical CPU, reducing the number of memory reads/writes and improving the efficiency of the execution engine.

7. Dynamic linking

Dynamic linking or method reference pointing to the runtime constant pool

  • Each stack frame contains a reference to the runtime constant pool of the type that the frame's method belongs to. This reference allows the code of the current method to support dynamic linking (Dynamic Linking). For example: the invokedynamic instruction

  • When a Java source file is compiled into a bytecode file, all variable and method references are stored as symbolic references (Symbolic Reference) in the constant pool of the class file. For example, a call from one method to another is represented by a symbolic reference to the target method in the constant pool. The role of dynamic linking is to convert these symbolic references into direct references to the methods called.

Frame data area: dynamic link, method return address, some attachment information

There is a special place in the bytecode file called the constant pool

Why do I need a constant pool?

The function of the constant pool is to provide some symbols and constants to facilitate the identification of instructions.

8. Method invocation: resolution and dispatch

Static link/dynamic link

In the JVM, converting a symbolic reference into a direct reference to the invoked method is tied to the method's binding mechanism.

  • Static linking

    When a bytecode file is loaded into the JVM, if the target method is known at compile time and does not change at run time, the process of converting the symbolic reference of the invoked method into a direct reference is called static linking.

  • Dynamic linking

    If the invoked method cannot be determined at compile time, the symbolic reference can only be converted into a direct reference while the program is running. Because this conversion is dynamic, it is called dynamic linking. (This is closely related to polymorphism.)

Early binding/late binding

The binding mechanisms corresponding to static and dynamic linking are early binding (Early Binding) and late binding (Late Binding). Binding is the process by which a symbolic reference to a field, method, or class is replaced by a direct reference, and it happens only once.

  • Early binding:

    Early binding means that the target method is known at compile time and does not change at run time, so the method can be bound to the type it belongs to. Because it is clear exactly which method is being called, static linking can be used to convert the symbolic reference into a direct reference.

  • Late binding:

    If the method to be called cannot be determined at compile time, the method can only be bound at run time according to the actual type. This is called late binding.

With the emergence of high-level languages, there are more and more object-oriented programming languages similar to Java. Although these languages differ in syntax, they share one thing in common: they all support object-oriented features such as encapsulation, inheritance, and polymorphism. Since these languages are polymorphic, they naturally have both binding methods, early binding and late binding.

In fact, any ordinary method in Java has the characteristics of a virtual function, equivalent to a virtual function in C++ (where the keyword virtual must be written explicitly). If you do not want a method in a Java program to behave like a virtual function, you can mark it with the keyword final.
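A small sketch of the two kinds of binding (the class names are made up for this example). The call to sound() is bound late, according to the runtime type of the argument, while kingdom() is final and therefore has only one possible target.

```java
class BindingSketch {
    static class Animal {
        String sound() { return "..."; }                // ordinary method: may be overridden
        final String kingdom() { return "Animalia"; }   // final: cannot be overridden
    }

    static class Cat extends Animal {
        @Override String sound() { return "meow"; }
    }

    static class Dog extends Animal {
        @Override String sound() { return "woof"; }
    }

    // Which sound() runs is decided at run time from the actual type of 'animal'.
    static String describe(Animal animal) {
        return animal.kingdom() + ":" + animal.sound();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Cat()));
        System.out.println(describe(new Dog()));
    }
}
```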

Virtual method/non-virtual method

Non-virtual method

  • If the specific version of a method to call is determined at compile time and cannot change at run time, the method is called a non-virtual method.
  • Static methods, private methods, final methods, instance constructors, and superclass methods (called through super) are all non-virtual methods.
  • All other methods are called virtual methods.

Non-virtual methods are the ones that can be resolved during the resolution stage of class loading.

The following method call instructions are provided in the virtual machine:

  • Ordinary call instructions

    1. invokestatic: invokes a static method; the unique method version is determined in the resolution phase
    2. invokespecial: invokes instance initialization methods (constructors), private methods, and superclass methods; the unique method version is determined in the resolution phase
    3. invokevirtual: invokes all virtual methods
    4. invokeinterface: invokes interface methods
  • Dynamic call instruction

    invokedynamic: dynamically resolves the method to be called, and then executes it

The first four instructions are fixed inside the virtual machine, and the user cannot intervene in how the method is invoked and executed, whereas the invokedynamic instruction lets the user determine the method version. Methods invoked by the invokestatic and invokespecial instructions are non-virtual methods; the rest (except methods marked final) are virtual methods.
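To connect the instructions to source code, here is a sketch with hypothetical classes. The comments name the instruction javac is expected to emit for each annotated call (for the lambda, with Java 8 or later); the exact output can be confirmed with javap -c.

```java
import java.util.function.Supplier;

class InvokeKindsSketch {
    interface Greeter {
        String greet();
    }

    static class Base {
        static String kind() { return "static"; }
        String name() { return "base"; }
    }

    static class Derived extends Base implements Greeter {
        Derived() {
            super();                                  // invokespecial (superclass constructor)
        }

        @Override String name() {
            String inherited = super.name();          // invokespecial (superclass method)
            return inherited.toUpperCase();
        }

        @Override public String greet() { return "hi"; }
    }

    static String run() {
        String a = Base.kind();                       // invokestatic
        Base base = new Derived();                    // invokespecial (constructor)
        String b = base.name();                       // invokevirtual
        Greeter greeter = new Derived();
        String c = greeter.greet();                   // invokeinterface
        Supplier<String> lambda = () -> "lambda";     // invokedynamic creates the Supplier
        String d = lambda.get();                      // invokeinterface
        return String.join(",", a, b, c, d);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```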

About the invokedynamic instruction

  • The JVM bytecode instruction set had been relatively stable; it was not until Java 7 that the invokedynamic instruction was added. This was an improvement made to support dynamically typed languages on the JVM.
  • In Java 7, however, there was no way to generate the invokedynamic instruction directly from Java source; a low-level bytecode tool such as ASM was needed. Only with lambda expressions in Java 8 did Java gain a direct way to generate invokedynamic instructions.
  • The dynamic language support added in Java 7 is in essence a change to the Java virtual machine specification, not to the rules of the Java language. This part is relatively complicated and adds a new kind of method invocation to the virtual machine. The most direct beneficiaries are the compilers of dynamic languages running on the Java platform.

Dynamically typed language, statically typed language

The difference between a dynamically typed language and a statically typed language is whether types are checked at compile time or at run time. A language that checks at compile time is statically typed; one that checks at run time is dynamically typed.

To put it more bluntly: a statically typed language checks the type information of the variable itself, while a dynamically typed language checks the type information of the variable's value. In a dynamically typed language the variable has no type information; only the value does. This is an important characteristic of dynamic languages.
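Java itself is statically typed, but the distinction between the type of a variable and the type of a value can still be shown in Java. In this sketch (hypothetical names), the commented-out assignment is rejected because of the variable's declared type, while a variable declared as Object holds values whose runtime types differ.

```java
class TypingSketch {
    static String demo() {
        String text = "hello";
        // text = 42;               // rejected at compile time: the variable 'text' has type String

        Object value = text;        // the variable only promises Object...
        String before = value.getClass().getSimpleName();
        value = 42;                 // ...the runtime type belongs to the value (an autoboxed Integer)
        String after = value.getClass().getSimpleName();
        return before + "->" + after;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```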

The essence of method overriding

  1. Find the actual type of the object referenced by the first element on top of the operand stack, denoted C.
  2. If a method is found in type C whose descriptor and simple name both match the constant, access permission is checked. If the check passes, a direct reference to this method is returned and the search ends; if it fails, a java.lang.IllegalAccessError is thrown.
  3. Otherwise, following the inheritance hierarchy from bottom to top, the search and check of step 2 is performed on each superclass of C.
  4. If no suitable method is found, a java.lang.AbstractMethodError is thrown.
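The search in steps 1 to 4 can be imitated with reflection. This is only a sketch of the idea under simplifying assumptions, not how a JVM implements invokevirtual: the names are hypothetical, the access check of step 2 is omitted, and matching is done on name and parameter types rather than on the full descriptor.

```java
import java.lang.reflect.Method;

class OverrideLookupSketch {
    static class Animal {
        String sound() { return "..."; }
        String name() { return "animal"; }
    }

    static class Cat extends Animal {
        @Override String sound() { return "meow"; }
    }

    /** Returns the class whose declaration of the method would be selected for this receiver. */
    static Class<?> findDeclaringClass(Object receiver, String name, Class<?>... parameterTypes) {
        // Step 1: start from the actual type of the receiver.
        for (Class<?> c = receiver.getClass(); c != null; c = c.getSuperclass()) {
            try {
                // Step 2: look for a matching method declared directly in c.
                Method found = c.getDeclaredMethod(name, parameterTypes);
                return found.getDeclaringClass();
            } catch (NoSuchMethodException e) {
                // Step 3: not declared here, so continue with the superclass.
            }
        }
        // Step 4: nothing suitable was found anywhere in the hierarchy.
        throw new AbstractMethodError(name);
    }

    public static void main(String[] args) {
        System.out.println(findDeclaringClass(new Cat(), "sound").getSimpleName());
        System.out.println(findDeclaringClass(new Cat(), "name").getSimpleName());
    }
}
```

For a Cat receiver, sound is found in Cat itself, while name is only found after moving up to Animal.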

About IllegalAccessError:

The program tried to access or modify a field, or call a method, that it does not have permission to access. Normally this is caught by the compiler. If the error occurs at run time, it means a class has undergone an incompatible change.

Virtual method table

  • Dynamic dispatch is used frequently in object-oriented programming. Searching the class's method metadata for a suitable target on every dynamic dispatch could hurt execution efficiency. To improve performance, the JVM therefore builds a virtual method table in the class's method area (non-virtual methods do not appear in the table) and uses an index into this table instead of a search.
  • Each class has a virtual method table, which stores the actual entry point of each method.
  • When is the virtual method table created? It is created and initialized during the linking phase of class loading. Once the initial values of the class variables have been prepared, the JVM also initializes the class's method table.


9. Method return address

  • Stores the value of the pc register of the caller of this method.
  • There are two ways for a method to end:
    • Normal completion
    • An unhandled exception occurs, and the method exits abnormally
  • Whichever way the method exits, control returns to the place where the method was called. When the method exits normally, the value of the caller's pc counter is used as the return address, that is, the address of the instruction following the one that invoked the method. When it exits through an exception, the return address is determined by the exception table, and this information is generally not saved in the stack frame. (This is similar to interrupt handling in computer organization: in the end the context must be restored.)

Once a method starts executing, there are only two ways to exit it:

1. The execution engine encounters one of the bytecode return instructions, and the return value is passed to the calling method. This is called normal completion.

  • Which return instruction a method uses on normal completion depends on the actual data type of the method's return value.

  • The bytecode return instructions include ireturn (used when the return value is boolean, byte, char, short, or int), lreturn, freturn, dreturn, and areturn. In addition, there is a return instruction used by methods declared void, instance initialization methods, and class and interface initialization methods.

2. An exception (Exception) occurs during the execution of the method and is not handled inside the method; that is, if no matching exception handler is found in the method's exception table, the method exits. This is called abnormal completion.

The handlers for exceptions thrown during the execution of a method are stored in an exception table, so that the handling code can be found easily when an exception occurs.

In essence, exiting a method is the process of popping the current stack frame. At this point the local variable table and operand stack of the calling method must be restored, the return value (if any) pushed onto the operand stack of the caller's stack frame, the pc register set, and so on, so that the caller can continue executing.

The difference between normal completion and abnormal completion is that a method that exits through abnormal completion does not produce any return value for its caller.
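The sketch below (hypothetical names) pairs each return type with the return instruction javac is expected to use, and contrasts a method that handles an exception through its exception table with one that completes abnormally. The instruction names in the comments can be checked with javap -c.

```java
class ReturnKindsSketch {
    static boolean asBoolean() { return true; }   // ireturn
    static char asChar() { return 'x'; }          // ireturn
    static int asInt() { return 1; }              // ireturn
    static long asLong() { return 2L; }           // lreturn
    static float asFloat() { return 3.0f; }       // freturn
    static double asDouble() { return 4.0; }      // dreturn
    static String asReference() { return "ref"; } // areturn
    static void nothing() { }                     // return

    // The catch block becomes an entry in this method's exception table,
    // so the method still completes normally when parsing fails.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // No handler here: a NumberFormatException makes the method complete
    // abnormally, and the caller receives no return value.
    static int parseOrThrow(String text) {
        return Integer.parseInt(text);
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("7", -1));
        System.out.println(parseOrDefault("seven", -1));
    }
}
```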

10. Some additional information

The stack frame is also allowed to carry some additional information related to the implementation of the Java virtual machine. For example, information that supports program debugging.

11. Stack-related interview questions

1. Give examples of stack overflow.

Fixed-size stack: StackOverflowError

Dynamically expandable stack: OOM (OutOfMemoryError)

The size of the stack can be set with -Xss

2. Can adjusting the stack size guarantee that there is no overflow?

No. A larger stack only makes the overflow happen later; it cannot guarantee that it never happens. If you are given five hundred yuan it may be spent quickly; given five thousand, it lasts a few more days, but it still runs out.

3. Is a larger allocated stack always better?

No. It is good for the thread itself but bad for everything else. The overflow only appears later, not never, and because resources are limited, a larger stack squeezes the memory available to other threads and memory areas.

4. Will garbage collection involve the virtual machine stack?

No.

Memory area             Error (if overflow occurs)   GC (garbage collection)
Program counter         ×                            ×
Virtual machine stack   √                            × (frames are just pushed and popped)
Native method stack     √                            ×
Heap                    √ (OOM)                      √
Method area             √                            √

5. Are the local variables defined in the method thread safe?

It depends on the specific case.

What is thread safety?

If only one thread can operate on the data, it is thread-safe.

If multiple threads operate on the data, it is shared data. Without a synchronization mechanism, there will be thread safety problems.

A local variable that is created inside the method and dies inside the method is thread-safe.
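A sketch of the three typical cases, using StringBuilder because it is not synchronized (the class and method names are made up for this example). Only the first method keeps its builder entirely inside the method.

```java
class LocalVariableSafetySketch {
    // Thread-safe: the builder is created and dies inside the method;
    // only an immutable String leaves it.
    static String confined() {
        StringBuilder sb = new StringBuilder();
        sb.append("a");
        sb.append("b");
        return sb.toString();
    }

    // Not necessarily thread-safe: the builder comes from the caller,
    // who may be sharing it with other threads.
    static void appendTo(StringBuilder shared) {
        shared.append("a");
        shared.append("b");
    }

    // Not necessarily thread-safe: the builder escapes through the return
    // value, so other threads may end up modifying it.
    static StringBuilder escaping() {
        StringBuilder sb = new StringBuilder();
        sb.append("a");
        sb.append("b");
        return sb;
    }

    public static void main(String[] args) {
        System.out.println(confined());
    }
}
```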

Origin blog.csdn.net/weixin_44226263/article/details/112197905