In-depth understanding of the Java Virtual Machine (third edition) - runtime stack frame structure

Runtime stack frame structure

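The Java virtual machine uses the method as its most basic unit of execution. The stack frame (Stack Frame) is the data structure that supports method invocation and method execution in the virtual machine, and it is also the element stored on the virtual machine stack (Virtual Machine Stack) in the runtime data areas.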

A stack frame stores the method's local variable table, operand stack, dynamic linking information, method return address, and similar data. Readers who studied Chapter 6 carefully should be able to find the static counterparts of most of these concepts in the method table of the Class file format. Each method invocation, from the start of the call to the end of execution, corresponds to one stack frame being pushed onto the virtual machine stack and later popped off it.

Each stack frame contains a local variable table, an operand stack, dynamic linking information, a method return address, and some additional information. By the time the Java source code has been compiled, the size of the local variable table and the depth of the operand stack that a frame needs have already been computed and written into the Code attribute of the method table. In other words, the amount of memory a stack frame needs is not affected by variable data at run time; it depends only on the program's source code and on the stack memory layout of the specific virtual machine implementation.

The method call chain within a thread can be very long. From the point of view of a Java program, at a given moment and on a given thread, every method on the call stack is in the state of being executed. For the execution engine, however, only the method at the top of the stack of an active thread is running, and only the frame at the top of the stack is in effect. That frame is called the "current stack frame" (Current Stack Frame), and the method associated with it is called the "current method" (Current Method). All bytecode instructions run by the execution engine operate only on the current stack frame. In the conceptual model, a typical stack frame has the structure shown in Figure 8-1.

Figure 8-1 shows the overall structure of the virtual machine stack and its stack frames. The following sections describe the role and data structure of each part of a stack frame: the local variable table, the operand stack, dynamic linking, and the method return address.

[Figure 8-1: Conceptual structure of the virtual machine stack and a stack frame]

Local variable table

The local variable table (Local Variables Table) is storage for a set of variable values. It holds the method's parameters and the local variables defined inside the method. When a Java program is compiled into a Class file, the maximum capacity of the local variable table that a method needs is recorded in the max_locals item of the method's Code attribute.

The capacity of the local variable table is measured in variable slots (Variable Slot). The Java Virtual Machine Specification does not state how much memory a slot should occupy. It only says, suggestively, that each slot should be able to hold a value of type boolean, byte, char, short, int, float, reference, or returnAddress. All eight of these types can be stored in 32 bits or less of physical memory, but this wording is fundamentally different from stating outright that each slot occupies 32 bits: it allows the slot size to vary with the processor, operating system, or virtual machine implementation. Even if a 64-bit virtual machine implements a slot with 64 bits of physical memory, it must still use alignment and padding so that the slot looks the same from the outside as it does on a 32-bit virtual machine.

Since the data types of the Java virtual machine have come up, a brief explanation is in order. A slot can hold one value of a type that is 32 bits wide or less, and Java has eight such types: boolean, byte, char, short, int, float, reference [1], and returnAddress. The first six need no further explanation; readers can understand them by analogy with the corresponding Java language types (as an analogy only, since the primitive types of the Java language and those of the Java virtual machine differ in essential ways). The seventh, reference, denotes a reference to an object instance. The Java Virtual Machine Specification neither states its length nor specifies what structure it should have. In general, though, an implementation should be able to do at least two things through a reference: first, directly or indirectly find the start address or index of the object's data in the Java heap; second, directly or indirectly find the type information, stored in the method area, for the object's data type. Otherwise it could not implement the language constraints defined in the Java Language Specification [2]. The eighth, returnAddress, is rarely seen today. It serves the bytecode instructions jsr, jsr_w, and ret, and points to the address of a bytecode instruction. Some very old Java virtual machines used these instructions to implement jumps during exception handling, but all of them now use exception tables instead.

For 64-bit data types, the Java virtual machine allocates two consecutive slots, aligned on the high-order bits. The only 64-bit data types the Java language defines explicitly are long and double. Splitting the storage of long and double in this way is somewhat similar to the "non-atomic treatment of long and double" rule, which allows a single read or write of a long or double to be split into two 32-bit operations; readers can compare the two when they reach the material on the Java memory model later in the book [3]. However, because the local variable table lives on the thread's stack and is thread-private data, it makes no difference whether reading or writing the two consecutive slots is atomic: it cannot cause data races or thread-safety problems.

The Java virtual machine accesses the local variable table by index. Index values start at 0 and are bounded by the table's maximum number of slots. For a variable of a 32-bit data type, index N refers to slot N. For a variable of a 64-bit data type, slots N and N+1 are used together. When two adjacent slots jointly hold one 64-bit value, the virtual machine does not allow either of them to be accessed on its own in any way. The Java Virtual Machine Specification explicitly requires that a bytecode sequence performing such an access cause an exception to be thrown during the verification stage of class loading.

When a method is invoked, the Java virtual machine uses the local variable table to pass argument values to the parameter list, that is, from actual parameters to formal parameters. For an instance method (one not declared static), the slot at index 0 is used by default to pass a reference to the object instance the method belongs to, and the method accesses this implicit parameter through the keyword "this". The remaining parameters are laid out in parameter-list order, occupying slots from index 1 onward. Once the parameters have been assigned, the remaining slots are allocated to the variables defined in the method body, according to their order and scope.
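The slot numbering described above can be sketched with a small calculator. This is an illustration only: the class `SlotLayout` and its methods are hypothetical names, and the input is assumed to be a list of JVM field descriptors such as `I` (int), `J` (long), `D` (double), or `Ljava/lang/String;`.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (hypothetical): computes local variable slot indices for parameters. */
final class SlotLayout {

    /**
     * Returns the starting slot index of each parameter.
     * Slot 0 is reserved for "this" when the method is an instance method.
     */
    static List<Integer> parameterSlots(boolean isStatic, String... paramDescriptors) {
        List<Integer> slots = new ArrayList<>();
        int next = isStatic ? 0 : 1;
        for (String descriptor : paramDescriptors) {
            slots.add(next);
            next += width(descriptor);
        }
        return slots;
    }

    /** long (J) and double (D) take two consecutive slots; every other type takes one. */
    static int width(String descriptor) {
        return (descriptor.equals("J") || descriptor.equals("D")) ? 2 : 1;
    }

    public static void main(String[] args) {
        // Instance method: void m(int a, long b, String c, double d)
        System.out.println(parameterSlots(false, "I", "J", "Ljava/lang/String;", "D")); // [1, 2, 4, 5]
        // Static method with the same parameter list
        System.out.println(parameterSlots(true, "I", "J", "Ljava/lang/String;", "D"));  // [0, 1, 3, 4]
    }
}
```

In the instance-method case the long parameter starts at slot 2 and also covers slot 3, which is why the String parameter lands in slot 4.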

To save as much stack frame memory as possible, the slots in the local variable table can be reused. The scope of a variable defined in a method body does not necessarily cover the whole body; once the current value of the bytecode PC counter is beyond a variable's scope, that variable's slot can be handed to another variable. Besides saving stack frame space, this design brings a few side effects. For example, in some cases slot reuse directly affects the behavior of garbage collection, as the three demonstrations in Listing 8-1, Listing 8-2, and Listing 8-3 show.

[Listings 8-1 to 8-3: Effect of local variable slot reuse on garbage collection]

In Listings 8-1 to 8-3, whether placeholder can be reclaimed comes down to one thing: whether a slot in the local variable table still holds a reference to the placeholder array object. After the first modification, the code has left the scope of placeholder, but nothing has read from or written to the local variable table since then, so the slot that placeholder occupied has not yet been reused by another variable, and the local variable table, as part of the GC Roots, still keeps its association with the array. In the vast majority of cases the effect of this association not being broken promptly is slight. But if a method first defines a variable that occupies a lot of memory and is in fact no longer used, and the code after it performs some very long-running operation, then manually setting the variable to null (in place of the int a = 0 statement, to clear the variable's slot) is not necessarily pointless. It can serve as a "trick" for very special circumstances: the object occupies a lot of memory, the method's stack frame cannot be reclaimed for a long time, and the method is not invoked often enough to meet the just-in-time compiler's compilation threshold. A well-known Java book, "Practical Java", lists "assign null to objects that are no longer used" as a recommended coding rule (the author does not agree with this rule) but does not explain the specific reason, and readers have long been puzzled by it.
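The listings themselves are not reproduced in this copy. The following is a hypothetical reconstruction of the three variants the text describes, not the book's code: the class name, method names, and the 64 MB size are assumptions. Whether the array is actually reclaimed depends on the virtual machine and on whether the code runs interpreted or compiled, so no output is claimed here; running with `-verbose:gc` is one way to observe what a particular JVM does.

```java
/** Hypothetical reconstruction of the three variants discussed in the text. */
final class SlotReuseDemo {

    /** Variant 1: placeholder is still in scope when gc() is requested. */
    static void inScope(int size) {
        byte[] placeholder = new byte[size];
        System.gc();
    }

    /** Variant 2: placeholder is out of scope, but nothing has overwritten its slot. */
    static void outOfScope(int size) {
        {
            byte[] placeholder = new byte[size];
        }
        System.gc();
    }

    /** Variant 3: a new local variable is declared before gc() is requested. */
    static void slotReused(int size) {
        {
            byte[] placeholder = new byte[size];
        }
        int a = 0; // in the conceptual model, a can take over the slot placeholder used
        System.gc();
    }

    public static void main(String[] args) {
        int size = 64 * 1024 * 1024; // 64 MB, an assumed size large enough to show up in GC logs
        inScope(size);
        outOfScope(size);
        slotReused(size);
    }
}
```

The null-assignment "trick" mentioned above would correspond to keeping placeholder in scope and writing `placeholder = null;` before the call to `System.gc()`.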

Although the code in Listings 8-1 to 8-3 shows that assigning null is indeed useful in certain extreme cases, the author's view is that one should not rely on it much, and there is no need to promote it as a universal coding rule. There are two reasons. From a coding point of view, controlling when a variable is reclaimed by giving it an appropriate scope is the most elegant solution; scenarios like the one in Listing 8-3 are almost useless outside of experiments. More importantly, from an execution point of view, using null assignment to optimize memory reclamation rests on an understanding of the conceptual model of the bytecode execution engine. At the end of Chapter 6's introduction to bytecode, the author wrote a section titled "public design, private implementation" (Section 6.5) to stress that the conceptual model and the actual execution can look equivalent from the outside while being completely different inside. When the virtual machine runs on the interpreter, execution is usually fairly close to the conceptual model, but after the just-in-time compiler applies its optimizations the two can differ greatly; the only guarantee is that the program's results match the conceptual model. In practice, just-in-time compilation is the main way the virtual machine executes code, and a null assignment will almost certainly be eliminated as a dead operation after JIT optimization, at which point setting the variable to null is meaningless. Once bytecode has been compiled to native code, the enumeration of GC Roots also differs markedly from interpreted execution. For the earlier example, after the code in Listing 8-2 (the first modification) has been JIT-compiled, System.gc() reclaims the memory correctly, and there is no need to write code like Listing 8-3.

One more point about the local variable table may affect everyday development: unlike the class variables described earlier, local variables have no "preparation stage". From Chapter 7 we know that a class's field variables are given initial values twice: once in the preparation stage, when they receive the system default, and once in the initialization stage, when they receive the value the programmer defined. So even if the programmer assigns nothing to a class variable during initialization, it does not matter; the class variable still has a definite initial value and there is no ambiguity. Local variables are different: a local variable that is declared but never given an initial value cannot be used at all. So do not assume that rules such as "integer variables default to 0" or "boolean variables default to false" hold everywhere in Java. The code in Listing 8-4 cannot actually run in Java (similar code can run in other languages such as C and C++). Fortunately, the compiler checks for this and reports it at compile time. Even if bytecode with the same effect is produced by bypassing the compiler or by hand, bytecode verification will detect it and class loading will fail.
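The difference between class variables and local variables can be seen in a short example. This is a sketch with made-up names (`DefaultValues`, `useLocal`), not the book's Listing 8-4; the line that would fail to compile is left as a comment so that the rest remains runnable.

```java
/** Illustrative class; the names are invented for this example. */
final class DefaultValues {
    static int count;      // class variable: receives 0 in the preparation stage
    static boolean ready;  // receives false
    static Object ref;     // receives null

    static int useLocal() {
        int a;             // a local variable has no default value
        // return a;       // rejected by javac: variable a might not have been initialized
        a = 42;            // a local must be definitely assigned before it is read
        return a;
    }

    public static void main(String[] args) {
        System.out.println(count + " " + ready + " " + ref); // prints "0 false null"
        System.out.println(useLocal());                      // prints "42"
    }
}
```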

[Listing 8-4: A local variable used without being assigned a value]

Operand stack

The operand stack (Operand Stack), often simply called the operation stack, is a last-in, first-out (Last In First Out, LIFO) stack. As with the local variable table, its maximum depth is written at compile time into the max_stack item of the Code attribute. Each element of the operand stack can be of any Java data type, including long and double. A 32-bit data type occupies one unit of stack capacity and a 64-bit data type occupies two. The data flow analysis performed by the Javac compiler ensures that at no point during the method's execution does the depth of the operand stack exceed the maximum set in the max_stack item.

When a method starts executing, its operand stack is empty. During execution, various bytecode instructions write values to the operand stack and read values from it, that is, they push and pop. For example, arithmetic is performed by pushing the operands onto the operand stack and then invoking the operation instruction, and arguments are passed through the operand stack when another method is called. Take the integer addition instruction iadd: at run time it requires that the two elements nearest the top of the operand stack already hold two int values. When the instruction executes, it pops the two int values, adds them, and pushes the sum back onto the stack.

The data types of the elements on the operand stack must strictly match the sequence of bytecode instructions. The compiler must guarantee this when it compiles the program, and it is checked again by data flow analysis during the class verification stage. Taking iadd again as the example: this instruction can only add int values, so when it executes, the two elements nearest the top of the stack must both be of type int. Using iadd to add a long and a float cannot happen.
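The push and pop behavior described above can be modeled with a toy frame. This is a sketch of the conceptual model only, not how any real virtual machine is implemented: `ToyFrame` and its methods are hypothetical, it handles int values only, and it checks the maximum depth at run time, whereas a real JVM relies on the compiler and the verifier for that guarantee.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a single stack frame restricted to int values (hypothetical). */
final class ToyFrame {
    final int[] locals;                                  // local variable table
    final Deque<Integer> operands = new ArrayDeque<>();  // operand stack
    final int maxStack;                                  // plays the role of max_stack

    ToyFrame(int maxLocals, int maxStack) {
        this.locals = new int[maxLocals];
        this.maxStack = maxStack;
    }

    /** Pushes the int in the given local variable slot onto the operand stack. */
    void iload(int index) { push(locals[index]); }

    /** Pops the top of the operand stack into the given local variable slot. */
    void istore(int index) { locals[index] = operands.pop(); }

    /** Pops two ints, adds them, and pushes the sum. */
    void iadd() {
        int right = operands.pop();
        int left = operands.pop();
        push(left + right);
    }

    private void push(int value) {
        if (operands.size() == maxStack) {
            throw new IllegalStateException("operand stack deeper than maxStack");
        }
        operands.push(value);
    }

    public static void main(String[] args) {
        ToyFrame frame = new ToyFrame(3, 2);
        frame.locals[0] = 100;
        frame.locals[1] = 200;
        // Roughly the instruction sequence for: locals[2] = locals[0] + locals[1]
        frame.iload(0);
        frame.iload(1);
        frame.iadd();
        frame.istore(2);
        System.out.println(frame.locals[2]); // prints 300
    }
}
```

A maximum depth of 2 is enough here because the stack never holds more than the two operands of iadd at once.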

In the conceptual model, two stack frames that belong to different methods are, as elements of the virtual machine stack, completely independent of each other. Most virtual machine implementations, however, apply an optimization that makes the two frames partly overlap: part of the operand stack of the lower frame overlaps part of the local variable table of the upper frame. This saves some space and, more importantly, lets the frames share part of their data directly during a method call, with no extra copying of arguments. The overlap is shown in Figure 8-2.

The Java virtual machine's interpreter is called a "stack-based execution engine", and the "stack" in that name is the operand stack. Later sections explain stack-based code execution in more detail and describe how it differs from the more common register-based execution engines.

[Figure 8-2: Data sharing between two overlapping stack frames]

Dynamic linking

Each stack frame contains a reference to the method it belongs to in the runtime constant pool. The reference is held to support dynamic linking (Dynamic Linking) during method invocation. From Chapter 6 we know that the constant pool of a Class file holds a large number of symbolic references, and that method invocation bytecode instructions take a symbolic reference to a method in the constant pool as their operand. Some of these symbolic references are converted to direct references during class loading or on first use; this conversion is called static resolution. The rest are converted to direct references at run time, each time they are executed; this is called dynamic linking. These conversions are covered in detail in Section 8.3.
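As a loose illustration of the distinction (the precise rules belong to Section 8.3), the sketch below contrasts a static method call, whose target does not depend on any object, with an ordinary instance method call, whose target depends on the runtime type of the receiver. The class names `Base`, `Derived`, and `LinkingDemo` are made up for this example.

```java
class Base {
    /** Static method: javac emits invokestatic, and the target is the same on every call. */
    static String kind() { return "static"; }

    /** Overridable instance method: javac emits invokevirtual for calls to it. */
    String name() { return "Base"; }
}

class Derived extends Base {
    @Override
    String name() { return "Derived"; }
}

final class LinkingDemo {
    /** The same call site a.name() runs different code depending on the receiver's runtime type. */
    static String describe(Base a) {
        return Base.kind() + ":" + a.name();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Base()));    // prints "static:Base"
        System.out.println(describe(new Derived())); // prints "static:Derived"
    }
}
```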

Method return address

Once a method has started executing, there are only two ways it can exit.

  • The first is that the execution engine encounters any one of the method return bytecode instructions. A return value may then be passed to the upper-level caller (the method that invokes the current method is called its caller, or calling method). Whether there is a return value, and what type it has, is determined by which return instruction is encountered. This way of exiting is called "normal method invocation completion" (Normal Method Invocation Completion).

  • The other is that an exception occurs during execution and is not handled properly within the method body. Whether the exception is raised inside the Java virtual machine or by an athrow bytecode instruction in the code, if no matching exception handler is found in the method's exception table, the method exits. This way of exiting is called "abrupt method invocation completion" (Abrupt Method Invocation Completion). A method that exits this way provides no return value to its caller.
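The two ways of exiting can be observed from ordinary Java code. In this sketch (the names `CompletionDemo`, `divide`, and `call` are invented), `divide` either completes normally and hands an int back to its caller, or completes abruptly because the virtual machine raises an ArithmeticException that `divide` itself does not handle.

```java
final class CompletionDemo {

    /** Completes normally with an int result, or abruptly when b is 0. */
    static int divide(int a, int b) {
        return a / b; // integer division by zero makes the JVM throw ArithmeticException
    }

    /** The caller receives a value only when divide completes normally. */
    static String call(int a, int b) {
        try {
            int result = divide(a, b);
            return "normal completion, value " + result;
        } catch (ArithmeticException e) {
            return "abrupt completion, no value";
        }
    }

    public static void main(String[] args) {
        System.out.println(call(6, 3)); // prints "normal completion, value 2"
        System.out.println(call(1, 0)); // prints "abrupt completion, no value"
    }
}
```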

Whichever way a method exits, execution must return to the point where the method was originally called before the program can continue. When the method returns, some information may need to have been saved in the stack frame to help restore the execution state of the calling method. Generally, when a method exits normally, the value of the caller's PC counter can serve as the return address, and the stack frame is likely to store this counter value. When a method exits abruptly, the return address is determined through the exception handler table, and the stack frame generally does not store this information.

Exiting a method is in effect the same as popping the current stack frame, so the operations that may be performed on exit are: restoring the caller's local variable table and operand stack, pushing the return value (if any) onto the operand stack of the caller's frame, and adjusting the PC counter to point to the instruction that follows the method invocation instruction. The author writes "may" because this discussion is based on the conceptual model; which operations are actually performed can only be confirmed for a specific Java virtual machine implementation.

Additional information

The Java Virtual Machine Specification allows implementations to add information that the specification does not describe to the stack frame, such as information related to debugging or performance collection. This information depends entirely on the specific virtual machine implementation and is not described further here. When discussing these concepts, dynamic linking, the method return address, and other additional information are usually grouped together and called stack frame information.

Origin blog.csdn.net/cold___play/article/details/105339510