JVM - Class file structure

Have you ever wondered about the .class file that is generated when you compile a .java file?

We all know that a class file is what the Java compiler produces when it compiles Java source. I suspect that quite a few C programmers who go on to learn Java assume that the .out file produced by compiling a C program is roughly the same as a .class file in every respect. I was confused about this myself at first, but with further study we can work out what a .class file really is.

If you open a .class file in an IDE such as IDEA, it looks no different from the source code, because the IDE decompiles the .class file for you.

Differences between .class files and .out files

To understand the differences between the two files, we must first define each of them.

.class file

When the Java compiler compiles a Java class, it translates the original text file (.java) into binary bytecode and stores that bytecode in the .class file.

That is, the fields, methods, and constants of the Java class are all stored in the .class file.

The key point to take from this passage: a .class file is binary bytecode. It is recognized, parsed, and executed by the JVM.

Being able to read bytecode directly is also a basic skill for anyone working on Java code analysis tools or investigating semantic issues.

.out file

C source code (.c files) is compiled by the compiler into machine instructions, descriptive information is added, and the result is stored in a .out file (an executable file). The operating system can load and run the executable, and the computer executes the machine instructions in the file.

The key point to take from this passage: a .out file is binary machine instructions. It is loaded and run by the operating system.

At this point the difference between the two files is evident. First, although both are binary files, what they store is completely different: one holds bytecode, the other machine instructions. Second, they run on different platforms: one on the operating system, the other on a virtual machine.

The significance of the .class file

After the section above we know how the two files differ in nature, yet in everyday use they still feel the same: both can be executed. So what difference do bytecode and machine instructions really make?

From the very first computing class, teachers kept telling us that "a computer only recognizes binary data; internally, everything it runs is just a string of 010101010101..." That string of 010101... is the machine instructions, which is why the operating system can load and run a .out file.

So what is bytecode for? Think about where Java's advantages lie. Remember the slogan that circulates in the Java community: "write once, run anywhere." Bytecode is the cornerstone of that platform independence.

A Java program compiles to the same bytecode everywhere, and that bytecode is loaded and run by JVMs on a variety of platforms. This uniform program storage format is what makes Java cross-platform.

By the way, another kind of neutrality in virtual machines is increasingly valued by developers: language independence. The Java virtual machine does not only execute Java programs; other languages such as JRuby and Groovy can run on it as well.

The overall structure of the .class file

To understand what is stored inside a .class file, we first need an overview of its overall storage structure. Before that, let us define the .class file more precisely.

Every .class file corresponds to the definition of exactly one class or interface. More precisely:

A Class file is a binary stream whose basic unit is the byte, with the data items packed compactly one after another. Data items that span several bytes are stored in big-endian order, most significant byte first; if big-endian is unfamiliar, look it up.

The Class file structure contains only two kinds of data: unsigned numbers and tables. It is not complicated.

Unsigned numbers are the basic data type: u1, u2, u4, and u8 denote 1, 2, 4, and 8 bytes respectively. Unsigned numbers can describe numbers, index references, quantity values, or string values encoded in UTF-8. If this sounds abstract, do not worry; come back to it at the end and you will find your questions answered.
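To make u1, u2, and u4 concrete, here is a minimal sketch of reading them from a byte array in big-endian order. The class name UnsignedReader and the sample bytes are assumptions made up for illustration; java.nio.ByteBuffer reads in big-endian order by default, which matches the Class file format.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration: reads the unsigned types described above.
class UnsignedReader {
    // u1: one byte, masked so the result is 0..255 rather than a signed byte
    static int u1(ByteBuffer buf) {
        return buf.get() & 0xFF;
    }

    // u2: two bytes, big-endian, masked so the result is 0..65535
    static int u2(ByteBuffer buf) {
        return buf.getShort() & 0xFFFF;
    }

    // u4: four bytes, big-endian, widened to long so the value stays unsigned
    static long u4(ByteBuffer buf) {
        return buf.getInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // Sample bytes chosen for illustration: a u4, then a u2, then a u1
        byte[] bytes = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x13, 0x07};
        ByteBuffer buf = ByteBuffer.wrap(bytes); // big-endian by default
        System.out.printf("u4 = 0x%X%n", u4(buf));
        System.out.println("u2 = " + u2(buf));
        System.out.println("u1 = " + u1(buf));
    }
}
```

The masks matter because Java has no unsigned byte or short types; without them a byte such as 0xCA would be read as a negative number.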

A table is a composite data type made up of multiple unsigned numbers or other tables as its data items; table names customarily end in "_info". The whole Class file is essentially one table.

We need to focus on the important parts of the Class file: the magic number, the Class file version, the constant pool, the class index, the superclass index, the interface index set, the field table collection, the method table collection, the attribute table collection, and so on. The attribute table collection will be explained in the second part.

Magic Number and Class file version

To learn how a Class file is structured we need an actual Class file to analyze. So let us look at a simple piece of Java code and compile it. This code and the Class file generated from it are used throughout the rest of the article, so do not forget them as you read on ~

public class TestClass {
    private int m;

    public int inc() {
        return m+1;
    }
}

We view the .class file produced by the compiler with GHex, a hexadecimal editor, on Ubuntu 16.04 (the screenshot of the hex dump appears in the original post).

 

The magic number is the first four bytes of every Class file. Its only purpose is to determine whether the file is a Class file that the virtual machine can accept. Its value is easy to remember and rather romantic: 0xCAFEBABE ("cafe babe"), which seems to have some connection with the Java logo ~

The four bytes immediately after the magic number store the Class file version: bytes 5 and 6 are the minor version number, and bytes 7 and 8 are the major version number. These four bytes let us tell which JDK version the file targets. A newer JDK is backward compatible with Class files from earlier versions, but an older JDK cannot run Class files from a later version.

In general we do not need to be too concerned about the minor version number. The major version number converts to the corresponding JDK version as follows:

For example, in my hex dump the minor version number is 0x0000 and the major version number is 0x0034, which is 52 in decimal. Major version 45 corresponds to JDK 1.1, so the JDK version is 52 - 45 + 1 = 8. My current JDK is JDK 1.8, so this is correct.
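The header check and the version arithmetic above can be sketched in a few lines of Java. The class name ClassFileHeader and its methods are hypothetical; the eight header bytes are written out by hand to match the values quoted in the text (magic 0xCAFEBABE, minor 0, major 0x34) rather than read from a real file.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration: checks the magic number and reads the version.
class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE;

    // Applies the rule from the text: JDK version = major - 45 + 1
    static int jdkVersion(int major) {
        return major - 45 + 1;
    }

    // Returns {minor, major}; throws if the first four bytes are not the magic number
    static int[] readVersion(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default
        if (buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a Class file");
        }
        int minor = buf.getShort() & 0xFFFF; // bytes 5 and 6
        int major = buf.getShort() & 0xFFFF; // bytes 7 and 8
        return new int[] {minor, major};
    }

    public static void main(String[] args) {
        // First eight bytes of the example Class file, as described in the text
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34};
        int[] version = readVersion(header);
        System.out.println("minor=" + version[0] + " major=" + version[1]
                + " JDK " + jdkVersion(version[1]));
    }
}
```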

Constant pool

If asked which parts of the Class file matter most, I would say the constant pool (the data item most closely tied to the other items, one of the data items that takes up the most space in the Class file, and a table-type data item) and the attribute table. We will cover the attribute table next time.

The runtime constant pool in the Java virtual machine's method area is the Class file constant pool after the .class file has been loaded into memory.

Constant pool capacity count

The entry to the constant pool is a u2 (unsigned, two bytes) value giving the constant pool capacity count. It is needed because the number of constants in a constant pool is not fixed. In my hex dump, the constant pool capacity of this Class file is 0x0013.

It is worth mentioning that this capacity count starts from 1 rather than from 0; in the Class file structure, only the constant pool capacity count starts at 1. The purpose of this design is to leave index 0 free, so that data items that index into the constant pool can express "does not refer to any constant pool item" when they need to.

So my Class file's constant pool capacity is 19 in decimal, which means there are only 18 constants, with index values ranging from 1 to 18.
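The off-by-one between the capacity count and the number of constants is easy to get wrong, so here is a small sketch of it. ConstantPoolCount and its methods are hypothetical names; the sketch also ignores the detail that long and double constants each occupy two index slots, which does not arise in this example.

```java
// Hypothetical helper for illustration: the count-starts-at-1 rule described above.
class ConstantPoolCount {
    // Index 0 is reserved, so the pool holds one entry fewer than the capacity count
    static int entryCount(int capacityCount) {
        return capacityCount - 1;
    }

    // Valid constant pool indexes run from 1 to capacityCount - 1
    static boolean isValidIndex(int index, int capacityCount) {
        return index >= 1 && index < capacityCount;
    }

    public static void main(String[] args) {
        int capacityCount = 0x0013; // value quoted in the text
        System.out.println("capacity count = " + capacityCount);
        System.out.println("constants      = " + entryCount(capacityCount));
        System.out.println("index 0 valid? " + isValidIndex(0, capacityCount));
        System.out.println("index 18 valid? " + isValidIndex(18, capacityCount));
    }
}
```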

Types of items stored in the constant pool

Having covered the capacity of the constant pool, we now turn to its contents. Before analyzing them, we need to introduce the types of items the constant pool stores.

The constant pool mainly stores two kinds of constants: literals and symbolic references.

Literals are close to the Java-language notion of a constant, such as text strings and the values of constants declared final.

Symbolic references cover three kinds of constants (fully qualified names and descriptors are explained later): the fully qualified names of classes and interfaces, the names and descriptors of fields, and the names and descriptors of methods.

The javap command

You might worry that reading Class file bytecode by hand at work would be exhausting. Rest assured: this is an age that values efficiency, not the era when programmers wrote programs by punching paper tape.

We can let the computer do the work: the javap command prints the contents of a Class file for us.

Let us look at its usage:

// TestClass is the Class file generated by compiling TestClass.java above
javap -verbose TestClass

Here is the output (information other than the constant pool is omitted):

Classfile /home/hg_yi/深入理解Java虚拟机/类文件结构/TestClass.class
  Last modified 2017-10-20; size 275 bytes
  MD5 checksum 4bb559d0c40918dfedd533c18bd75add
  Compiled from "TestClass.java"
public class TestClass
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #4.#15         // java/lang/Object."<init>":()V
   #2 = Fieldref           #3.#16         // TestClass.m:I
   #3 = Class              #17            // TestClass
   #4 = Class              #18            // java/lang/Object
   #5 = Utf8               m
   #6 = Utf8               I
   #7 = Utf8               <init>
   #8 = Utf8               ()V
   #9 = Utf8               Code
  #10 = Utf8               LineNumberTable
  #11 = Utf8               inc
  #12 = Utf8               ()I
  #13 = Utf8               SourceFile
  #14 = Utf8               TestClass.java
  #15 = NameAndType        #7:#8          // "<init>":()V
  #16 = NameAndType        #5:#6          // m:I
  #17 = Utf8               TestClass
  #18 = Utf8               java/lang/Object

As mentioned earlier, many data items in the Class file refer to constants in the constant pool, so the output above will be used again later; do not forget it ~

Much of the information in the output above agrees with what we found by analyzing the Class file earlier:

1. The minor and major version numbers: minor version 0 and major version 52.

2. The range of index values: the constants are numbered #1 to #18.

3. The constant pool item types: for example, the first constant (#1, a Methodref) points to constants #4 and #15.

As for java/lang/Object."<init>":()V, this involves the fully qualified name of a class and a descriptor, both mentioned earlier; we will get to them shortly.

Access flags

Immediately after the constant pool come two bytes holding the access flags, which identify class-level or interface-level access information, including: whether this Class is a class or an interface; whether it is defined as public; whether it is defined as abstract; and, if it is a class, whether it is declared final.

Each flag has its own value within these two bytes. The two that matter for our example are ACC_PUBLIC (0x0001) and ACC_SUPER (0x0020); the full table of flags and their meanings is in the original post.

The javap output above reports the flags of this class as ACC_PUBLIC and ACC_SUPER.

These flags describe our class correctly, so the value of access_flags should be 0x0001 | 0x0020 = 0x0021. If you go back to the raw bytes in the hex dump, you will find that the two bytes after the constant pool hold exactly this value.
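The flag arithmetic can be checked with a short sketch that decodes an access_flags value into flag names. The class AccessFlags is a hypothetical helper; the constants are the class-level flag values defined by the Java Virtual Machine Specification, and only a subset is included here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration: decodes a subset of class-level access flags.
class AccessFlags {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_SUPER = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    // Tests each bit in turn and collects the names of the flags that are set
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        if ((accessFlags & ACC_PUBLIC) != 0) names.add("ACC_PUBLIC");
        if ((accessFlags & ACC_FINAL) != 0) names.add("ACC_FINAL");
        if ((accessFlags & ACC_SUPER) != 0) names.add("ACC_SUPER");
        if ((accessFlags & ACC_INTERFACE) != 0) names.add("ACC_INTERFACE");
        if ((accessFlags & ACC_ABSTRACT) != 0) names.add("ACC_ABSTRACT");
        return names;
    }

    public static void main(String[] args) {
        int accessFlags = ACC_PUBLIC | ACC_SUPER; // 0x0021, as computed in the text
        System.out.printf("0x%04X -> %s%n", accessFlags, decode(accessFlags));
    }
}
```

Because every flag is a distinct bit, combining flags with | and testing them with & never interferes with the other flags.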

Class index, superclass index, and interface index set

After the access flags come the class index, the superclass index, and the interface index set. The class index and the superclass index are each a u2 value, and the interface index set is a group of u2 values. The Class file uses these three items to determine the inheritance relationships of the class.

Tip: Java has single inheritance, and every Java class except java.lang.Object itself has a superclass, so the superclass index of every Java class other than Object is nonzero.

We continue with the example from the beginning:

The bytes marked in my hex dump are 0x0003, 0x0004, and 0x0000: the class index is constant #3 in the constant pool, and the superclass index is constant #4. The interface index set is slightly different: its first item is a u2 interface counter giving the size of the index table. Here the interface counter is 0, so the interface index table that would follow occupies no bytes at all.

Combine this with the javap output from earlier:

  #3 = Class              #17            // TestClass
  #4 = Class              #18            // java/lang/Object

  #17 = Utf8               TestClass
  #18 = Utf8               java/lang/Object

Constants #3 and #4 point to constants #17 and #18 respectively, whose values are the UTF-8 strings TestClass and java/lang/Object.
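The two-step lookup (class index to Class entry, Class entry to Utf8 entry) can be sketched as follows. ClassIndexLookup is a hypothetical helper, and the constant pool is modeled as two plain maps filled in by hand from the javap output rather than parsed from the file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration: resolves a class index to a class name.
class ClassIndexLookup {
    // classEntries maps a Class entry's index to its name index;
    // utf8Entries maps a Utf8 entry's index to its string value.
    static String className(int classIndex,
                            Map<Integer, Integer> classEntries,
                            Map<Integer, String> utf8Entries) {
        if (classIndex == 0) {
            return null; // index 0 refers to no constant pool item
        }
        Integer nameIndex = classEntries.get(classIndex);
        return nameIndex == null ? null : utf8Entries.get(nameIndex);
    }

    public static void main(String[] args) {
        // Entries copied from the javap output above
        Map<Integer, Integer> classEntries = new HashMap<>();
        classEntries.put(3, 17);
        classEntries.put(4, 18);
        Map<Integer, String> utf8Entries = new HashMap<>();
        utf8Entries.put(17, "TestClass");
        utf8Entries.put(18, "java/lang/Object");

        System.out.println("this class:  " + className(0x0003, classEntries, utf8Entries));
        System.out.println("superclass:  " + className(0x0004, classEntries, utf8Entries));
    }
}
```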

This part of the analysis is complete.

Field table collection

Introduction to the field table

The field table describes the variables declared in a class or an interface.

Fields include class-level (static) variables and instance variables, but not local variables.

What information does a field contain?

The field's access scope (public, private, protected), whether it is static, final, volatile, or transient (excluded from serialization), the field's data type, and the field's name.

The modifiers are yes-or-no properties, so each one fits naturally in a flag bit. The field name and the field data type have no fixed length, so they can only be described by referring to constants in the constant pool.

Therefore, a field table entry mainly stores the following: access_flags (u2), name_index (u2), descriptor_index (u2), attributes_count (u2), and the attributes themselves (attribute_info).

The attribute_info entries at the end describe extra information about a field. For example, for final static int m = 123; the field table entry has a ConstantValue attribute that points to the constant 123.

Next we introduce what "simple name", "fully qualified name", and "descriptor" mean.

Simple name, fully qualified name, and descriptor

Let us look back at the other data items in the field table: the name index and the descriptor index. Both are references into the constant pool, giving the simple name of the field or method and its descriptor respectively.

A simple name is the name of a method or field without its type or parameter decoration; the simple names of the inc() method and the m field in this class are "inc" and "m".

Descriptors are more complex, and they are the question we left open above. A descriptor describes the data type of a field, or the parameter list (number, types, and order) and return value of a method.

The primitive types and void are each represented by a single uppercase character (for example, I for int and V for void, as seen in the javap output above), and an object type is represented by the character L followed by the fully qualified name of the object's class.

For array types, each dimension is described by a leading "[" character. For example, a two-dimensional array of type "java.lang.String[][]" is recorded as "[[Ljava/lang/String;". This brings us to how fully qualified names are written.

A fully qualified name is, as the name suggests, the full name, but it is written a little differently from what we usually type. For example, for a test class whose full name is org.fenixsoft.clazz.TestClass, the fully qualified name is written "org/fenixsoft/clazz/TestClass;": the "." in the full class name is replaced by "/", and a ";" is added at the end to mark the end of the fully qualified name.
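The "." to "/" rule and the array notation can be tried out directly in Java. InternalNames is a hypothetical helper; it relies on the fact that Class.getName() for an array class already returns the "[" and "L...;" notation, only with dots instead of slashes.

```java
// Hypothetical helper for illustration: converts names to the form used in Class files.
class InternalNames {
    // Replaces "." with "/" as described above
    static String internalName(String fullClassName) {
        return fullClassName.replace('.', '/');
    }

    // For an array class, getName() yields e.g. "[[Ljava.lang.String;",
    // so swapping the dots gives the descriptor stored in the Class file.
    static String arrayDescriptor(Class<?> arrayType) {
        if (!arrayType.isArray()) {
            throw new IllegalArgumentException("not an array type: " + arrayType);
        }
        return arrayType.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(internalName("java.lang.Object"));
        System.out.println(arrayDescriptor(String[][].class));
        System.out.println(arrayDescriptor(int[].class));
    }
}
```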

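Using descriptors

When a descriptor describes a method, the parameter list comes first, inside "()", followed by the return value.

Not clear what that means? Let us go straight to some examples:

void inc() has the descriptor ()V

java.lang.String toString() has the descriptor ()Ljava/lang/String;

int indexOf(char[] source, int sourceOffset, int sourceCount, char[] target, int targetOffset, int targetCount, int fromIndex) has the descriptor ([CII[CIII)I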
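One way to double-check method descriptors like these is to let the JDK build them. The sketch below uses java.lang.invoke.MethodType and its toMethodDescriptorString() method; DescriptorDemo and its descriptor helper are hypothetical names introduced for illustration.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper for illustration: builds a method descriptor from Java types.
class DescriptorDemo {
    // Return type first, then the parameter types in declaration order
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void inc()
        System.out.println(descriptor(void.class));
        // java.lang.String toString()
        System.out.println(descriptor(String.class));
        // int indexOf(char[], int, int, char[], int, int, int)
        System.out.println(descriptor(int.class,
                char[].class, int.class, int.class,
                char[].class, int.class, int.class, int.class));
    }
}
```

Note that the descriptor contains no method name and no parameter names, only types; the name is stored separately through the name index.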

Example analysis

After the class index, superclass index, and interface index set comes the field table collection. Its first item is a u2 capacity counter, fields_count. In my hex dump its value is 0x0001, meaning there is only one field. The next u2 is the access_flags value, and so on; continuing in this way completes the analysis of the fixed data items of the field table.
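Reading the fixed part of the field table can be sketched as below. FieldTableSketch is a hypothetical helper, and the sample bytes are an assumption: they are the values one would expect for private int m given the javap output above (ACC_PRIVATE is 0x0002, #5 is the name m, #6 is the descriptor I, and no attributes), not bytes copied from the hex dump.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration: reads fields_count and the first field's fixed items.
class FieldTableSketch {
    // Returns {fields_count, access_flags, name_index, descriptor_index, attributes_count}
    static int[] readFirstField(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes); // big-endian by default
        int fieldsCount = buf.getShort() & 0xFFFF;
        if (fieldsCount == 0) {
            return new int[] {0, 0, 0, 0, 0}; // no field entries follow
        }
        int accessFlags = buf.getShort() & 0xFFFF;
        int nameIndex = buf.getShort() & 0xFFFF;
        int descriptorIndex = buf.getShort() & 0xFFFF;
        int attributesCount = buf.getShort() & 0xFFFF;
        return new int[] {fieldsCount, accessFlags, nameIndex, descriptorIndex, attributesCount};
    }

    public static void main(String[] args) {
        // Assumed bytes for a single field "private int m" (see the note above)
        byte[] bytes = {0x00, 0x01, 0x00, 0x02, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00};
        int[] f = readFirstField(bytes);
        System.out.printf("fields_count=%d access_flags=0x%04X name_index=#%d descriptor_index=#%d attributes_count=%d%n",
                f[0], f[1], f[2], f[3], f[4]);
    }
}
```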

Note: the field table collection does not list fields inherited from a superclass, but it may list fields that do not exist in the original Java code. For example, to keep access to its enclosing class, an inner class automatically gets a field that points to the instance of the outer class.

At the bytecode level, two fields with the same name are legal as long as their descriptors differ. In the Java language this is clearly impossible.

Method table collection

The method table is very similar to the field table. The access flags of the method table are listed in the original post.

 Original link: https://blog.csdn.net/championhengyi/article/details/78300611

Origin www.cnblogs.com/blwy-zmh/p/11847859.html