The Java method area and the constant pools

Contents: Foreword; Main text; Global string pool (string pool, also called the string literal pool); Class file constant pool (class constant pool); Runtime constant pool; Association between the three constant pools; Summary; Reference links

Foreword

JVM memory is commonly divided into three areas: heap memory (heap), stack memory (stack), and the method area (method), which is also known as the static storage area.

While learning Java you often hear the term "constant pool", and the section on comparing data with == mentions a string constant pool. After some searching I found the claim that the constant pool belongs to neither heap memory nor stack memory, so I guessed it might be related to the method area. To check, I read the book "layman JVM", which helped me understand how the constant pool relates to the method area and how constant pools are classified.

All code behavior described in this article is based on JDK 1.8.

Main text

Before discussing the types of constant pool, we need to understand what a constant is.

  • A variable modified with final is a constant: once it has been given a value, that value cannot be changed.
  • final can modify three kinds of variables: static variables, instance variables, and local variables. These give three kinds of constants, respectively.
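As a minimal sketch of the three kinds of final variable listed above (the class and member names are made up for illustration):

```java
// Hypothetical class illustrating the three kinds of final variable.
class FinalKinds {
    // Static constant: one value shared by the whole class.
    static final int MAX = 10;

    // Instance constant: assigned exactly once per object, in the constructor.
    final String name;

    FinalKinds(String name) {
        this.name = name;
    }

    int scaled() {
        // Local constant: cannot be reassigned after initialization.
        final int factor = 2;
        // factor = 3;  // would not compile: factor is final
        return MAX * factor;
    }

    public static void main(String[] args) {
        FinalKinds k = new FinalKinds("demo");
        System.out.println(k.name + " " + k.scaled());
    }
}
```

Uncommenting the reassignment of factor makes javac reject the class, which is the "cannot be changed" rule in action.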

In Java memory allocation there are three kinds of constant pool:

Global string pool (string pool is also called a string literal pool)

Where the string constant pool lives in Java memory

  • In JDK 6 and earlier, the string constant pool was in the PermGen region (i.e. the method area), and the pool stored the string objects themselves.
  • In JDK 7, the string constant pool was moved to the heap, and the pool stores references. In JDK 8, the permanent generation (the method area implementation) was replaced by Metaspace.

What is the string constant pool?

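In HotSpot VM, the string pool is implemented by a class called StringTable. It is a hash table whose default size is 1009, and it stores references to interned strings (not the interned string instances themselves). In other words, once an ordinary string instance is referenced by the StringTable, it effectively gains the status of an "interned string". There is exactly one StringTable per HotSpot VM instance, shared by all classes.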

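StringTable is essentially a HashSet<String>. It is a purely runtime structure and is maintained lazily. Note that it stores only references to java.lang.String instances, not the contents of the String objects; the actual String object is reached through the reference.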

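In JDK 6, the StringTable size is fixed at 1009. If a very large number of Strings are put into the string pool, hash collisions occur and the bucket chains become long. A call to String#intern() then has to walk the chain entry by entry, which degrades performance significantly.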

In JDK 7, the StringTable size can be specified with a parameter:

-XX:StringTableSize=66666
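To see the effect of the table size on String#intern(), a small load generator can help. The following is only a sketch: the class name InternLoad, the "key-" prefix, and the default count are assumptions made for illustration, and the timing you observe depends heavily on the JDK version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical load generator for the string pool.
class InternLoad {
    // Interns n distinct strings built at run time and keeps the canonical
    // references alive so the entries stay registered in the string pool.
    static List<String> internMany(int n) {
        List<String> canonical = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            canonical.add(("key-" + i).intern());
        }
        return canonical;
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 200_000;
        long start = System.nanoTime();
        List<String> kept = internMany(n);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("interned " + kept.size() + " strings in " + elapsedMs + " ms");
    }
}
```

Running it once with a small -XX:StringTableSize value and once with a large one (for example the 66666 shown above) lets you compare the elapsed times on your own JVM.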

Class file constant pool (class constant pool)

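As we know, besides descriptive information such as the class version, fields, methods, and interfaces, a class file also contains a constant pool (constant pool table), which stores the literals (Literal) and symbolic references (Symbolic References) generated by the compiler. Literals are close to the Java-language notion of a constant, such as text strings and constant values declared final. Symbolic references are a compiler-theory concept and include the following three kinds of constants: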

  • Fully qualified names of classes and interfaces
  • Names and descriptors of fields
  • Names and descriptors of methods
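The sketch below ties these categories to source code. The class PoolDemo is hypothetical, and the comments name the constant pool entries one would expect javac to emit; the exact entry numbers vary, so check them with javap -verbose PoolDemo.

```java
// Hypothetical class; comments name the expected constant pool entries.
class PoolDemo {
    // Literal: a static final field with a constant initializer gets a
    // CONSTANT_Integer entry, referenced by the field's ConstantValue attribute.
    static final int LIMIT = 100;

    // Literal: a CONSTANT_String entry that points at a CONSTANT_Utf8 "hello".
    static String greeting() {
        return "hello";
    }

    // Symbolic references: a CONSTANT_Class for java/lang/String and a
    // CONSTANT_Methodref for String.length with descriptor ()I.
    static int length(String s) {
        return s.length();
    }

    // Symbolic reference to a field: a CONSTANT_Fieldref for
    // java/lang/System.out with descriptor Ljava/io/PrintStream;
    public static void main(String[] args) {
        System.out.println(greeting() + " " + length(greeting()) + " " + LIMIT);
    }
}
```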

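Each constant in the constant pool is itself a table. There are 11 distinct table structures in total, and each one begins with a one-byte tag (with values from 1 to 12) that indicates which constant type the entry belongs to. (Since JDK 7 the class file format defines three more types for dynamic language support: CONSTANT_MethodHandle, CONSTANT_MethodType, and CONSTANT_InvokeDynamic, with tags 15, 16, and 18.)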


Each constant type has its own structure. The concrete structures are not described here, because this article focuses on distinguishing the three constant pool concepts. Readers who want to learn more about the data structure of each constant type can read Chapter 6 of "in-depth understanding java virtual machine". In fact, I have not fully understood it myself yet and will fill in that gap later.

Runtime constant pool

The runtime constant pool is part of the method area.

When a Java file is compiled into a class file, the class constant pool described above is generated. So when is the runtime constant pool produced?

To execute a class, the JVM must go through loading, linking, and initialization, and linking itself consists of three stages: verification, preparation, and resolution (resolve). When a class is loaded into memory, the JVM stores the contents of its class file constant pool in the runtime constant pool, so every class has its own runtime constant pool. As noted above, the class constant pool holds literals and symbolic references; that is, it stores not object instances but symbolic references to objects. During resolution, symbolic references are replaced with direct references. For strings, resolution queries the global string pool (the StringTable mentioned above) so that the strings referenced by the runtime constant pool are the same ones referenced by the global string pool.

Association between the three constant pools

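The string constant pool also comes into play when the JVM executes code. A common description is:

"During class loading, the JVM creates in the heap the string object instances that correspond to the strings in the class file constant pool, and interns references to them in the string constant pool. This happens in the resolve stage. These constants are shared globally."

That description is rather general. It does happen in the resolve stage, but not as we might think: the objects are not created and interned in the string constant pool immediately. The JVM specification explicitly allows the resolve stage to be lazy.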

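Among the constant pool entry types that the JVM specification defines for class files, two matter here: CONSTANT_Utf8 and CONSTANT_String. The former is a UTF-8 encoded string. The latter is the type for String constants, but it does not hold the content of the String directly. It holds only an index, and the constant pool entry at that index must be a CONSTANT_Utf8 constant, which is where the string content actually lives.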
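The indirection from CONSTANT_String to CONSTANT_Utf8 can be observed by reading a class file directly. The following is a minimal sketch, not a full class file parser: the class name ClassFileStrings and its method names are hypothetical, the tag sizes follow the class file format as I understand it, and it assumes the target class can be found as a resource by the system class loader.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the string literals recorded in a class file.
class ClassFileStrings {

    static List<String> stringLiterals(String className) {
        String resource = className.replace('.', '/') + ".class";
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resource)) {
            if (raw == null) {
                throw new IOException("class file not found: " + resource);
            }
            return parse(new DataInputStream(raw));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static List<String> parse(DataInputStream in) throws IOException {
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("bad magic number");
        }
        in.readUnsignedShort();              // minor_version
        in.readUnsignedShort();              // major_version
        int count = in.readUnsignedShort();  // valid indexes are 1 .. count-1
        String[] utf8 = new String[count];   // content of CONSTANT_Utf8 entries
        int[] stringToUtf8 = new int[count]; // CONSTANT_String -> Utf8 index
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte(); // every entry starts with a 1-byte tag
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                  // Utf8
                case 8: stringToUtf8[i] = in.readUnsignedShort(); break; // String
                case 3: case 4: in.readInt(); break;                    // Integer, Float
                case 5: case 6: in.readLong(); i++; break;              // Long, Double (two slots)
                case 7: case 16: case 19: case 20:
                    in.readUnsignedShort(); break;                      // one index
                case 9: case 10: case 11: case 12: case 17: case 18:
                    in.readInt(); break;                                // two indexes
                case 15: in.readUnsignedByte(); in.readUnsignedShort(); break; // MethodHandle
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        List<String> literals = new ArrayList<>();
        for (int i = 1; i < count; i++) {
            if (stringToUtf8[i] != 0) {
                literals.add(utf8[stringToUtf8[i]]); // follow the index to the Utf8 entry
            }
        }
        return literals;
    }

    public static void main(String[] args) {
        String target = args.length > 0 ? args[0] : "java.lang.Integer";
        for (String literal : stringLiterals(target)) {
            System.out.println(literal);
        }
    }
}
```

Passing the name of one of your own classes on the command line prints the literals that javac recorded for it, which should match the String entries that javap -verbose reports.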

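In HotSpot VM, the entries map to the following in the runtime constant pool:

CONSTANT_Utf8 -> Symbol* (a pointer to a C++ object of type Symbol, whose content is a UTF-8 encoded string in the same format as in the class file)
CONSTANT_String -> java.lang.String (a reference to an actual Java object; the C++ type is oop)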

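All CONSTANT_Utf8 entries are created during class loading, whereas CONSTANT_String entries are resolved lazily: for example, an entry is resolved only when an ldc instruction that references it is executed for the first time. Before resolution, HotSpot VM gives the entry the type JVM_CONSTANT_UnresolvedString, and its content is just an index, as in the class file. After resolution, the entry's constant type becomes the final JVM_CONSTANT_String and its content becomes the actual oop.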

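At this point it should be clear: in HotSpot VM's implementation, when a class is loaded its string literals enter that class's runtime constant pool but not the global string constant pool (that is, the StringTable has no corresponding reference, and no corresponding object has been created in the heap). So, as mentioned above, it is at resolve time that the global string pool is queried and the symbolic reference is finally replaced with a direct reference. (In other words, although literals and symbolic references are stored in the runtime constant pool when the class is loaded, for lazily resolved literals the actual work happens only at resolution.)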

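To understand lazy resolution, we need to look at the ldc instruction

Simply put, it pushes a String constant value from the constant pool onto the top of the stack.

Take the following code as an example: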

    public class Test {
        public static void main(String[] args) {
            String s = "abc"; // the literal is loaded with an ldc instruction
        }
    }

Suppose the code is in Test.java. Open a command prompt in the file's directory, run javac Test.java to compile it, and then run javap -verbose Test to view the compiled class file.


The ldc instruction loads "abc" onto the operand stack, astore_1 then assigns it to the local variable s that we defined, and the method returns.

Putting this together with the above: in the resolve stage (constant pool resolution), the object for a string literal is created and a reference to it is interned in the string constant pool, but this resolution is lazy. In other words, until it happens there is no real object, and naturally nothing in the string constant pool either. So how can the ldc instruction push a value onto the stack and assign it? Or, from another angle: since resolution is lazy, it must actually run at some point. When?

Executing the ldc instruction is what triggers lazy resolution

The execution semantics of the ldc bytecode here are: look up the entry for the given index in the current class's runtime constant pool (in HotSpot VM, ConstantPool + ConstantPoolCache); if the entry has not been resolved yet, resolve it; then return the resolved content.
For a constant of type String, if resolution finds that the StringTable already holds a reference to a java.lang.String with matching content, that reference is returned directly. Otherwise, a String object with that content is created in the Java heap, its reference is recorded in the StringTable, and the reference is returned.
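The lookup-or-create rule described above can be modeled in a few lines. This is a toy model only: the real StringTable is a C++ structure inside HotSpot, and the class ToyStringTable with its resolve and created methods is invented here for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the resolve step for String constants; not HotSpot code.
class ToyStringTable {
    // Maps string content to the one canonical reference for that content.
    private final Map<String, String> table = new HashMap<>();
    private int created = 0;

    // Models resolving a String constant with the given content.
    String resolve(String content) {
        String existing = table.get(content);
        if (existing != null) {
            return existing;                 // already recorded: reuse the reference
        }
        String fresh = new String(content);  // stands in for the new heap object
        created++;
        table.put(fresh, fresh);             // record the reference
        return fresh;
    }

    // Number of objects the model has created so far.
    int created() {
        return created;
    }

    public static void main(String[] args) {
        ToyStringTable pool = new ToyStringTable();
        String a = pool.resolve("abc");
        String b = pool.resolve("abc");
        String c = pool.resolve("xxx");
        System.out.println((a == b) + " " + (a == c) + " " + pool.created());
    }
}
```

In this model, two resolutions of the same content yield the same reference and only one object is created, which mirrors the behavior the text attributes to ldc.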

So whether an ldc instruction needs to create a new String instance depends entirely on whether, at the moment that ldc instruction is first executed, the StringTable already records a reference to a String with the same content.
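One way to probe this from Java code is to intern a run-time-built string before the matching literal's ldc has ever executed. This is a sketch under assumptions: the class name LazyResolveDemo and the string content are made up, and the first comparison reflects HotSpot behavior on JDK 7 and later rather than anything the language guarantees. The second comparison follows from the contract of String#intern() and the rule that literals are interned.

```java
// Hypothetical probe of lazy literal resolution.
class LazyResolveDemo {
    // Returns { built == canonical, literal == canonical }.
    static boolean[] run() {
        // Built at run time, so the ldc for "lazy-resolve-demo" has not run yet.
        String built = new StringBuilder("lazy-").append("resolve-demo").toString();
        // intern() records a reference in the string pool if none exists yet.
        String canonical = built.intern();
        // The first execution of this ldc resolves the literal against the pool.
        String literal = "lazy-resolve-demo";
        return new boolean[] { built == canonical, literal == canonical };
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("built string became the pooled one: " + result[0]);
        System.out.println("literal is the pooled one: " + result[1]);
    }
}
```

If the first line reports true on the first call, the literal had no pooled object before intern() ran, which is what lazy resolution predicts.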

The following code illustrates this:

    public class Test {
        public static void main(String[] args) {
            String s1 = "abc";
            String s2 = "abc";
            String s3 = "xxx";
        }
    }

[Figure: the compiled class file as shown by javap]

[Figure: diagram of the resolution steps]

String s1 = "abc"; During resolution, no reference to "abc" is found in the string constant pool, so a new "abc" object is created in the heap, its reference is stored in the string constant pool, and that reference is returned to s1.

String s2 = "abc"; During resolution, the StringTable is found to already hold a reference to the "abc" object, so that reference is returned directly to s2 and no object is created.

String s3 = "xxx"; As with the first line, an object is created in the heap, its reference is stored in the StringTable, and finally the reference is returned to s3.
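The outcome of these three assignments can be checked with reference comparisons. A minimal sketch follows; the class name LiteralIdentity is made up, and the third comparison is added only for contrast.

```java
// Hypothetical check of which literals share one object.
class LiteralIdentity {
    // Returns { s1 == s2, s1 == s3, s1 == copy }.
    static boolean[] compare() {
        String s1 = "abc";
        String s2 = "abc";
        String s3 = "xxx";
        String copy = new String("abc"); // explicit new object, not the pooled one
        return new boolean[] { s1 == s2, s1 == s3, s1 == copy };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0]); // true: both literals resolve to the same reference
        System.out.println(r[1]); // false: different content, different object
        System.out.println(r[2]); // false: new String always creates a new object
    }
}
```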

The intern method and the constant pool

    public class Test {
        public static void main(String[] args) {
            String s1 = "ab"; // #1
            String s2 = new String(s1 + "d"); // #2
            s2.intern(); // #3
            String s4 = "xxx"; // #4
            String s3 = "abd"; // #5
            System.out.println(s2 == s3); // true
        }
    }

The compiled class file shows that "ab", "d", "xxx", and "abd" are all in the class file constant pool. Because resolution is lazy, at class loading no object instances are created for them and nothing is interned in the string constant pool.

Entering the main method, we go through the code line by line.

  • #1: ldc loads "ab" onto the top of the stack. In other words, an "ab" object is created in the heap and a reference to it is saved in the string constant pool.
  • #2: ldc loads "d" onto the top of the stack, then the concatenation happens. Internally a StringBuilder object is created, the parts are appended, and finally the StringBuilder's toString method is called to obtain a String object whose content is abd (note that toString creates a new String object). new String(...) then copies it into another String object, and a reference to that object is assigned to s2. Note that at this point no reference to an "abd" object has been put into the string constant pool.
  • #3: intern first checks whether the string constant pool already holds a reference to an "abd" object. It does not, so the reference to the "abd" object on the heap (the one s2 points to) is saved into the string constant pool and returned, although we did not use a variable to receive it.
  • #4: This line has no special meaning. It is there only to show that the "abd" literal in the class file is not resolved until #5.
  • #5: The string constant pool already holds a reference to the "abd" object, so that reference is returned directly to s3. Hence s2 == s3 is true.
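To see why the order of these steps matters, compare the order used above with the reversed order, where the literal's ldc runs before intern. This is a sketch: the class and method names are made up, the two methods deliberately use different string contents so that they do not share a pooled entry, and the result of the first method reflects HotSpot on JDK 7 and later and holds only the first time it runs.

```java
// Hypothetical comparison of the two execution orders.
class InternOrder {
    // Same order as the walkthrough: intern() runs before the literal's ldc.
    static boolean internBeforeLiteral() {
        String s1 = "ab";
        String s2 = new String(s1 + "d");
        s2.intern();          // registers s2 if "abd" is not pooled yet
        String s3 = "abd";    // ldc then finds the pooled reference
        return s2 == s3;
    }

    // Reversed order: the literal's ldc runs first and pools its own object.
    static boolean literalBeforeIntern() {
        String s3 = "xyd";
        String s1 = "xy";
        String s2 = new String(s1 + "d");
        s2.intern();          // finds the existing entry; s2 is not registered
        return s2 == s3;      // s2 is a distinct object, so this is false
    }

    public static void main(String[] args) {
        System.out.println("intern first: " + internBeforeLiteral());
        System.out.println("literal first: " + literalBeforeIntern());
    }
}
```

In the reversed order the comparison is always false, because the pooled object already exists before s2 is created and s2 is a new, separate object.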

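Summary

1. There is only one global string constant pool per VM, and it stores references to string constants.

2. Every class has a class constant pool at compile time; during compilation it stores the various literals and symbolic references.

3. The runtime constant pool is created once class loading completes: the symbolic references in each class constant pool are copied into the runtime constant pool. That is, every class has its own runtime constant pool. After resolution, symbolic references are replaced with direct references, consistent with the references in the global string pool.

4. String literals in the class file constant pool enter the runtime constant pool when the class is loaded; only at actual resolution (that is, when the ldc instruction executes) is a reference to the string stored in the string constant pool. In addition, compared with the class file constant pool, the runtime constant pool is dynamic: constants need not be produced at compile time, so it is not only content preset in the class file constant pool that can enter the runtime constant pool in the method area. At run time, the intern method can add string constants to the string constant pool and the runtime constant pool. (As to which pool they enter first, my own guess is the string constant pool; I would welcome correction from experts on the actual implementation.)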

Reference links

Origin juejin.im/post/5dc2ce826fb9a04ab12bc4e6