unsigned number, table
Once compilers for other languages are implemented, such as Jython and JRuby, you can write code in those languages, compile it with the respective compiler into bytecode files that conform to the JVM specification, and then execute those files on the JVM.
The location and role of the Class file in the Java architecture
In the previous post, I gave a rough overview of the architecture and execution principle of the Java virtual machine. This post explains the format of the class file, which is what the JVM recognizes, loads, and executes.
Learning the class file format is essential for understanding the JVM and for a deeper understanding of the Java language. The reason is simple: the JVM does not understand the Java source files we write. Source files must be compiled into class files before the JVM can recognize them. For the JVM, the class file is effectively an interface, and understanding this interface helps us better understand the JVM's behavior. On the other hand, a class file re-describes what we expressed in the source file in another form, and seeing how it does so is very helpful for a deep understanding of Java's language and syntax. In addition, any language that can be compiled into class files can be recognized and executed by the JVM, so the class file is the foundation not only of cross-platform execution but also of the JVM's cross-language support. Understanding the class file format will therefore also be a great help when learning other JVM-based languages.
In short, the class file sits in the middle of the whole Java technology architecture and connects the layers above and below it, which makes it central to understanding the whole system, as the figure shows:
Overview of the Class File Format
Type | Name | Count |
u4 | magic | 1 |
u2 | minor_version | 1 |
u2 | major_version | 1 |
u2 | constant_pool_count | 1 |
cp_info | constant_pool | constant_pool_count - 1 |
u2 | access_flags | 1 |
u2 | this_class | 1 |
u2 | super_class | 1 |
u2 | interfaces_count | 1 |
u2 | interfaces | interfaces_count |
u2 | fields_count | 1 |
field_info | fields | fields_count |
u2 | methods_count | 1 |
method_info | methods | methods_count |
u2 | attributes_count | 1 |
attribute_info | attributes | attributes_count |
Each item in the class file is explained in detail below.
Magic numbers and version numbers in class files
(1) magic
The first four bytes of the class file store its magic number, which marks the file as a class file. It is a fixed value: 0xCAFEBABE. In other words, it is the criterion for deciding whether a file is in class format. If the first four bytes are not 0xCAFEBABE, the file is not a class file and the JVM will not recognize it.
(2) minor_version and major_version
The four bytes following the magic number are the minor version number and the major version number of the class file, two bytes each. As Java evolves, the class file format changes with it, and the version number indicates which revision of the format, and therefore which added or changed features, a class file uses. Class files compiled by different versions of the javac compiler may carry different version numbers, and different versions of the JVM may recognize different version numbers. A higher-version JVM can recognize class files compiled by a lower-version javac compiler, but a lower-version JVM cannot recognize class files compiled by a higher-version javac compiler. If a lower-version JVM is used to execute a higher-version class file, it throws java.lang.UnsupportedClassVersionError. The specific version number changes are not discussed here; readers who need them can look them up.
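The check described above can be sketched in a few lines of Java. This is only an illustration, not how the JVM itself is implemented: the class name ClassHeader and the method parse are made up for this example, and the input is a hand-written eight-byte header rather than a real class file. ByteBuffer reads big-endian by default, which matches the byte order of the class file.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for this post: reads only magic, minor_version, major_version.
final class ClassHeader {
    final int minorVersion;
    final int majorVersion;

    private ClassHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    /** Parses the first eight bytes of a class file held in memory. */
    static ClassHeader parse(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes); // big-endian by default
        int magic = in.getInt();                     // u4 magic
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException(
                "Not a class file, magic = 0x" + Integer.toHexString(magic));
        }
        int minor = in.getShort() & 0xFFFF;          // u2 minor_version (unsigned)
        int major = in.getShort() & 0xFFFF;          // u2 major_version (unsigned)
        return new ClassHeader(minor, major);
    }

    public static void main(String[] args) {
        // Hand-written header: magic, minor_version = 0, major_version = 52.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        ClassHeader h = parse(header);
        System.out.println("major=" + h.majorVersion + " minor=" + h.minorVersion);
    }
}
```

To try it on a real file, the bytes could be loaded with java.nio.file.Files.readAllBytes and passed to parse; only the first eight bytes are consumed.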
Overview of constant pools in class files
In the class file, the data items related to the constant pool come right after the version numbers. The constant pool is a very important part of the class file. It stores string literals, constant values, the class name of the current class, field names, method names, the descriptors of each field and method, references to the fields and methods of the current class, references from the current class to other classes, and so on. The constant pool describes almost all the information in the class, and many other parts of the class file are references to constant pool items, such as this_class, super_class, field_info, and attribute_info, which are covered later. Bytecode instructions also refer to the constant pool, and such a reference is used as an operand of the instruction. In addition, items in the constant pool can refer to each other.
The constant_pool_count item appears once in the class file, which reflects the fact that each class has exactly one constant pool. Its value determines the number of items: as the table above shows, the pool holds constant_pool_count - 1 items. The items are stored one after another with no gaps between them. Each item is accessed by index, much like an array, except that the first item has index 1, not 0. If some other part of the class file refers to index 0, it means that no constant pool item is being referenced. Every data item in the class file has its own type, and likewise every item in the constant pool has its own type. The types of constant pool items are as follows:
Constant pool item type | Tag | Description |
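The indexing rule is easy to get wrong by one, so here is a small Java sketch of it. The class ConstantPoolSlots and its methods are hypothetical names invented for this post, and items are stored as plain Objects instead of real cp_info structures. It also simplifies one thing: in a real class file, CONSTANT_Long and CONSTANT_Double items each take up two index positions, which this sketch ignores.

```java
// Hypothetical model of constant pool indexing: valid indexes run from 1
// to constant_pool_count - 1, and index 0 means "no item".
final class ConstantPoolSlots {
    // Slot 0 stays unused so array indexes match constant pool indexes.
    private final Object[] slots;

    ConstantPoolSlots(int constantPoolCount) {
        if (constantPoolCount < 1) {
            throw new IllegalArgumentException("constant_pool_count must be at least 1");
        }
        this.slots = new Object[constantPoolCount];
    }

    /** Number of items in the pool: constant_pool_count - 1. */
    int entryCount() {
        return slots.length - 1;
    }

    /** An index of 0 elsewhere in the class file refers to no item at all. */
    static boolean refersToNothing(int index) {
        return index == 0;
    }

    void put(int index, Object entry) {
        checkIndex(index);
        slots[index] = entry;
    }

    Object get(int index) {
        checkIndex(index);
        return slots[index];
    }

    private void checkIndex(int index) {
        if (index < 1 || index >= slots.length) {
            throw new IndexOutOfBoundsException("No constant pool item at index " + index);
        }
    }

    public static void main(String[] args) {
        ConstantPoolSlots pool = new ConstantPoolSlots(4); // constant_pool_count = 4
        pool.put(1, "java/lang/Object");
        System.out.println(pool.entryCount()); // items live at indexes 1..3
        System.out.println(pool.get(1));
    }
}
```

Leaving slot 0 empty wastes one array cell, but it lets every index read from the class file be used directly without subtracting one.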
CONSTANT_Utf8 | 1 | UTF-8 encoded Unicode string |
CONSTANT_Integer | 3 | int type literal |
CONSTANT_Float | 4 | float type literal |
CONSTANT_Long | 5 | long type literal |
CONSTANT_Double | 6 | double type literal |
CONSTANT_Class | 7 | a symbolic reference to a class or interface |
CONSTANT_String | 8 | String type literal |
CONSTANT_Fieldref | 9 | symbolic reference to a field |
CONSTANT_Methodref | 10 | a symbolic reference to a method declared in a class |
CONSTANT_InterfaceMethodref | 11 | a symbolic reference to a method declared in an interface |
CONSTANT_NameAndType | 12 | partial symbolic reference to a field or method |
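Each data item is called an XXX_info item; for example, an item of type CONSTANT_Utf8 in the constant pool is a CONSTANT_Utf8_info. In addition, each info item has a tag value that indicates the type of that item in the constant pool. As the table above shows, the tag of a CONSTANT_Utf8_info is 1, and the tag of a CONSTANT_Fieldref_info is 9.
Java programs are dynamically linked, and the constant pool plays a pivotal role in the implementation of dynamic linking. Besides literals, the constant pool also stores the following kinds of symbolic references:
(1) Fully qualified names of classes and interfaces
(2) Names and descriptors of fields
(3) Names and descriptors of methods
Before explaining each data item in the constant pool in detail, we first need to look at the special strings used in class files, because they appear in large numbers in the constant pool. These special strings are the fully qualified names and descriptors mentioned above. To understand the data items in the constant pool, you must first understand these special strings.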
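The tag table maps naturally onto a Java enum. The sketch below is only an illustration: the enum ConstantTag and its fromTag method are names invented for this post, and it covers only the tags listed in the table above, so any other tag value is rejected.

```java
// Hypothetical enum mirroring the tag table in this post.
enum ConstantTag {
    UTF8(1),
    INTEGER(3),
    FLOAT(4),
    LONG(5),
    DOUBLE(6),
    CLASS(7),
    STRING(8),
    FIELDREF(9),
    METHODREF(10),
    INTERFACE_METHODREF(11),
    NAME_AND_TYPE(12);

    final int tag;

    ConstantTag(int tag) {
        this.tag = tag;
    }

    /** Looks up the item type for a tag value read from the constant pool. */
    static ConstantTag fromTag(int tag) {
        for (ConstantTag t : values()) {
            if (t.tag == tag) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown constant pool tag: " + tag);
    }

    public static void main(String[] args) {
        System.out.println(fromTag(1)); // tag of CONSTANT_Utf8_info
        System.out.println(fromTag(9)); // tag of CONSTANT_Fieldref_info
    }
}
```

A parser would read the tag first and then use the resulting type to decide how to read the rest of the item.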