Java Virtual Machine Specification (SE 8) - class File Format (Part 2)

4.4 Constant pool

  Java Virtual Machine instructions do not rely on the run-time layout of classes, interfaces, class instances, or arrays. Instead, instructions refer to symbolic information in the constant_pool table.

  All constant_pool table entries have the following general format:

cp_info {
    u1 tag;
    u1 info[];
}

  Each entry in the constant_pool table must begin with a 1-byte tag indicating the kind of cp_info entry. The contents of the info array vary with the value of tag. The valid tags and their values are listed in Table 4.4-A. Each tag byte must be followed by two or more bytes giving information about the specific constant. The format of this additional information varies with the tag value.

  Table 4.4-A. Constant pool tags

Constant Type                   Value
CONSTANT_Class                  7
CONSTANT_Fieldref               9
CONSTANT_Methodref              10
CONSTANT_InterfaceMethodref     11
CONSTANT_String                 8
CONSTANT_Integer                3
CONSTANT_Float                  4
CONSTANT_Long                   5
CONSTANT_Double                 6
CONSTANT_NameAndType            12
CONSTANT_Utf8                   1
CONSTANT_MethodHandle           15
CONSTANT_MethodType             16
CONSTANT_InvokeDynamic          18
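
  The tag values in Table 4.4-A lend themselves to a small lookup. The following is a minimal sketch; the class name CpTags and the method tagName are hypothetical helpers chosen for illustration and are not part of any JDK API.

```java
// Hypothetical helper: maps a cp_info tag byte to its name from Table 4.4-A.
class CpTags {
    static String tagName(int tag) {
        switch (tag) {
            case 1:  return "CONSTANT_Utf8";
            case 3:  return "CONSTANT_Integer";
            case 4:  return "CONSTANT_Float";
            case 5:  return "CONSTANT_Long";
            case 6:  return "CONSTANT_Double";
            case 7:  return "CONSTANT_Class";
            case 8:  return "CONSTANT_String";
            case 9:  return "CONSTANT_Fieldref";
            case 10: return "CONSTANT_Methodref";
            case 11: return "CONSTANT_InterfaceMethodref";
            case 12: return "CONSTANT_NameAndType";
            case 15: return "CONSTANT_MethodHandle";
            case 16: return "CONSTANT_MethodType";
            case 18: return "CONSTANT_InvokeDynamic";
            default:
                // Values such as 2, 13, 14 and 17 are not listed in Table 4.4-A.
                throw new IllegalArgumentException("unknown constant pool tag: " + tag);
        }
    }
}
```

  A parser would typically read the tag byte first and use it to decide how many further bytes belong to the entry.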

 

4.4.1 CONSTANT_Class_info structure

  The CONSTANT_Class_info structure is used to represent a class or an interface:
CONSTANT_Class_info {
    u1 tag;
    u2 name_index;
}

  The items of the CONSTANT_Class_info structure are as follows:

  tag

    The tag item has the value CONSTANT_Class (7).

  name_index

    The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure representing a valid binary class or interface name encoded in internal form.

  Because arrays are objects, the opcodes anewarray and multianewarray (but not the opcode new) can reference array "classes" through CONSTANT_Class_info structures in the constant_pool table. For such array classes, the name of the class is the descriptor of the array type (4.3.2).

  For example, the class name representing the two-dimensional array type int[][] is [[I, while the class name representing the type Thread[] is [Ljava/lang/Thread;.

  An array type descriptor is valid only if it represents 255 or fewer dimensions.
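
  The array naming rule and the 255-dimension limit can be checked with a few lines of Java. This is only a sketch: ArrayClassNames and its methods are hypothetical names, and the limit check looks at the leading '[' characters only, not at the element type that follows them.

```java
// Hypothetical helper for the array "class" names described above.
class ArrayClassNames {
    // Counts the leading '[' characters, i.e. the number of array dimensions.
    static int dimensions(String descriptor) {
        int n = 0;
        while (n < descriptor.length() && descriptor.charAt(n) == '[') {
            n++;
        }
        return n;
    }

    // Checks only the dimension limit (1 to 255), not the element type.
    static boolean withinDimensionLimit(String descriptor) {
        int d = dimensions(descriptor);
        return d >= 1 && d <= 255;
    }

    // Class.getName() separates packages with '.', while the internal
    // form stored in the class file uses '/'.
    static String internalArrayName(Class<?> arrayType) {
        return arrayType.getName().replace('.', '/');
    }
}
```

  For instance, internalArrayName(int[][].class) yields [[I and internalArrayName(Thread[].class) yields [Ljava/lang/Thread;, matching the two examples in the text.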

4.4.2 CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_InterfaceMethodref_info structures

  Fields, methods, and interface methods are represented by similar structures:

  

CONSTANT_Fieldref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}

CONSTANT_Methodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}

CONSTANT_InterfaceMethodref_info {
    u1 tag;
    u2 class_index;
    u2 name_and_type_index;
}

  The items of these structures are as follows:

  tag

    The tag item of a CONSTANT_Fieldref_info structure has the value CONSTANT_Fieldref (9).

    The tag item of a CONSTANT_Methodref_info structure has the value CONSTANT_Methodref (10).

    The tag item of a CONSTANT_InterfaceMethodref_info structure has the value CONSTANT_InterfaceMethodref (11).

  class_index

    The value of the class_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure representing a class or interface type that has the field or method as a member.

    The class_index item of a CONSTANT_Methodref_info structure must be a class type, not an interface type.

    The class_index item of a CONSTANT_InterfaceMethodref_info structure must be an interface type.

    The class_index item of a CONSTANT_Fieldref_info structure may be either a class type or an interface type.

  name_and_type_index

    The value of the name_and_type_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_NameAndType_info structure indicating the name and descriptor of the field or method.

  In a CONSTANT_Fieldref_info structure, the indicated descriptor must be a field descriptor. Otherwise, the indicated descriptor must be a method descriptor.

  If the name of the method in a CONSTANT_Methodref_info structure begins with '<' ('\u003c'), then the name must be the special name <init>, representing an instance initialization method. The return type of such a method must be void.
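
  Since the three structures share one layout, a single decoder can cover all of them. The sketch below is illustrative only: MemberRefInfo, parse, and isValidMethodrefName are hypothetical names, and it assumes the multi-byte items are stored high byte first, as class files do for all multi-byte data. The name check covers only the '<' rule stated above, not the full rules for valid method names.

```java
// Hypothetical view of the shared layout: u1 tag, u2 class_index, u2 name_and_type_index.
class MemberRefInfo {
    final int tag;
    final int classIndex;
    final int nameAndTypeIndex;

    MemberRefInfo(int tag, int classIndex, int nameAndTypeIndex) {
        this.tag = tag;
        this.classIndex = classIndex;
        this.nameAndTypeIndex = nameAndTypeIndex;
    }

    // Reads an unsigned big-endian u2 starting at the given offset.
    static int u2(byte[] b, int off) {
        return ((b[off] & 0xff) << 8) | (b[off + 1] & 0xff);
    }

    // Parses the 5-byte entry at off; accepts only tags 9, 10, and 11.
    static MemberRefInfo parse(byte[] b, int off) {
        int tag = b[off] & 0xff;
        if (tag < 9 || tag > 11) {
            throw new IllegalArgumentException("not a field or method reference tag: " + tag);
        }
        return new MemberRefInfo(tag, u2(b, off + 1), u2(b, off + 3));
    }

    // CONSTANT_Methodref_info rule: a name starting with '<' must be <init>,
    // and its descriptor must declare a void return type.
    static boolean isValidMethodrefName(String name, String descriptor) {
        if (!name.startsWith("<")) {
            return true;
        }
        return name.equals("<init>") && descriptor.endsWith(")V");
    }
}
```

  Resolving class_index and name_and_type_index to actual names requires looking up the referenced entries in the constant_pool table, which this sketch leaves out.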

4.4.3 CONSTANT_String_info structure

  The CONSTANT_String_info structure is used to represent constant objects of the type String:

CONSTANT_String_info {
    u1 tag;
    u2 string_index;
}

  The items of the CONSTANT_String_info structure are as follows:

  tag

    The tag item of the CONSTANT_String_info structure has the value CONSTANT_String (8).

  string_index

    The value of the string_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure representing the sequence of Unicode code points to which the String object is to be initialized.

4.4.4 CONSTANT_Integer_info and CONSTANT_Float_info structures

  The CONSTANT_Integer_info and CONSTANT_Float_info structures represent 4-byte numeric (int and float) constants:

CONSTANT_Integer_info {
    u1 tag;
    u4 bytes;
}

CONSTANT_Float_info {
    u1 tag;
    u4 bytes;
}

  The items of these structures are as follows:

  tag

    The tag item of the CONSTANT_Integer_info structure has the value CONSTANT_Integer (3).

    The tag item of the CONSTANT_Float_info structure has the value CONSTANT_Float (4).

  bytes

    The bytes item of the CONSTANT_Integer_info structure represents the value of the int constant. The bytes of the value are stored in big-endian (high byte first) order.

    The bytes item of the CONSTANT_Float_info structure represents the value of the float constant in IEEE 754 floating-point single format. The bytes of the single format representation are stored in big-endian (high byte first) order.

    The value represented by the CONSTANT_Float_info structure is determined as follows. The bytes of the value are first converted into an int constant, bits. Then:

    1. If bits is 0x7f800000, the float value is positive infinity.

    2. If bits is 0xff800000, the float value is negative infinity.

    3. If bits is in the range 0x7f800001 through 0x7fffffff or in the range 0xff800001 through 0xffffffff, the float value is NaN.

    4. In all other cases, let s, e, and m be three values computed from bits as follows:

    

int s = ((bits >> 31) == 0) ? 1 : -1;
int e = ((bits >> 23) & 0xff);
int m = (e == 0) ?
          (bits & 0x7fffff) << 1 :
          (bits & 0x7fffff) | 0x800000;

    Then the float value equals the result of the mathematical expression s · m · 2^(e-150).
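
  The four rules can be followed literally in code and compared against the JDK's own conversion. The sketch below is an illustration, not how a JVM implements it; FloatInfoDecoder, readU4, and decode are hypothetical names. It uses Math.scalb to apply the power of two exactly in double precision before narrowing to float.

```java
// Hypothetical helper that applies the CONSTANT_Float_info rules step by step.
class FloatInfoDecoder {
    // Assembles the big-endian u4 bytes item into an int.
    static int readU4(byte[] b, int off) {
        return ((b[off] & 0xff) << 24)
             | ((b[off + 1] & 0xff) << 16)
             | ((b[off + 2] & 0xff) << 8)
             |  (b[off + 3] & 0xff);
    }

    static float decode(int bits) {
        if (bits == 0x7f800000) return Float.POSITIVE_INFINITY;   // rule 1
        if (bits == 0xff800000) return Float.NEGATIVE_INFINITY;   // rule 2
        if ((bits >= 0x7f800001 && bits <= 0x7fffffff)
                || (bits >= 0xff800001 && bits <= 0xffffffff)) {
            return Float.NaN;                                     // rule 3
        }
        // Rule 4: sign, exponent, and significand as given in the text.
        int s = ((bits >> 31) == 0) ? 1 : -1;
        int e = ((bits >> 23) & 0xff);
        int m = (e == 0)
                ? (bits & 0x7fffff) << 1
                : (bits & 0x7fffff) | 0x800000;
        // s * m * 2^(e-150); scalb multiplies by a power of two without rounding here.
        return (float) Math.scalb(s * (double) m, e - 150);
    }
}
```

  For any bit pattern that is not a NaN, decode(bits) should agree with Float.intBitsToFloat(bits), which makes the library method a convenient cross-check.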

4.4.6 CONSTANT_NameAndType_info structure

  The CONSTANT_NameAndType_info structure is used to represent a field or method, without indicating which class or interface type it belongs to:

CONSTANT_NameAndType_info {
    u1 tag;
    u2 name_index;
    u2 descriptor_index;
}

  The items of the CONSTANT_NameAndType_info structure are as follows:

  tag

    The tag item of the CONSTANT_NameAndType_info structure has the value CONSTANT_NameAndType (12).

  name_index

    The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure representing either the special method name <init> or a valid unqualified name denoting a field or method.

  descriptor_index

    The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure representing a valid field descriptor or method descriptor.

4.4.7 CONSTANT_Utf8_info structure

  The CONSTANT_Utf8_info structure is used to represent constant string values:

  

CONSTANT_Utf8_info {
    u1 tag;
    u2 length;
    u1 bytes[length];
}

  The items of the CONSTANT_Utf8_info structure are as follows:

  tag

    The tag item of the CONSTANT_Utf8_info structure has the value CONSTANT_Utf8 (1).

  length

    The value of the length item gives the number of bytes in the bytes array (not the length of the resulting string).

  bytes[]

    The bytes array contains the bytes of the string.

    No byte may have the value (byte)0.

    No byte may lie in the range (byte)0xf0 to (byte)0xff.
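
  The length-prefixed layout happens to match what java.io.DataInputStream.readUTF expects: an unsigned 2-byte length followed by that many bytes of modified UTF-8. The sketch below relies on that; Utf8InfoReader and read are hypothetical names. Note that readUTF is a decoder, not a validator, so it is not guaranteed to reject every byte value that the rules above forbid.

```java
// Hypothetical reader for one CONSTANT_Utf8_info entry held in a byte array.
class Utf8InfoReader {
    static String read(byte[] entry) {
        try {
            java.io.DataInputStream in =
                    new java.io.DataInputStream(new java.io.ByteArrayInputStream(entry));
            int tag = in.readUnsignedByte();
            if (tag != 1) {
                throw new IllegalArgumentException("not a CONSTANT_Utf8 tag: " + tag);
            }
            // readUTF consumes the u2 length and then the modified UTF-8 bytes.
            return in.readUTF();
        } catch (java.io.IOException e) {
            throw new java.io.UncheckedIOException(e);
        }
    }
}
```

  The returned String may be shorter than the length item, because one character can occupy up to three bytes.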

  String content is encoded in modified UTF-8. Modified UTF-8 strings are encoded so that code point sequences containing only non-null ASCII characters can be represented using only one byte per code point, while all code points in the Unicode codespace can still be represented. Modified UTF-8 strings are not null-terminated. The encoding is as follows:

  Code points in the range '\u0001' to '\u007F' are represented by a single byte:

0 | bits 6-0

  The 7 bits of data in the byte give the value of the code point represented.

  The null code point ('\u0000') and code points in the range '\u0080' to '\u07FF' are represented by a pair of bytes x and y:

  x:

1 1 0 | bits 10-6

  y:

1 0 | bits 5-0

  The two bytes represent the code point with the value:

((x & 0x1f) << 6) + (y & 0x3f)

  Code points in the range '\u0800' to '\uFFFF' are represented by three bytes x, y, and z:

  x:

1 1 1 0 | bits 15-12

  y:

1 0 | bits 11-6

  z:

1 0 | bits 5-0

  The three bytes represent the code point with the value:
  
((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f)
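
  The one-, two-, and three-byte forms can be exercised with a short encoder and the two decoding expressions given above. This is a sketch under the stated ranges; ModifiedUtf8Bmp and its methods are hypothetical names.

```java
// Hypothetical encoder/decoder for code points up to U+FFFF in modified UTF-8.
class ModifiedUtf8Bmp {
    // Encodes one UTF-16 code unit into 1, 2, or 3 bytes.
    static byte[] encode(char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            return new byte[] { (byte) c };
        } else if (c <= 0x07FF) {
            // Covers U+0080..U+07FF and also the null code point U+0000.
            return new byte[] {
                (byte) (0xC0 | (c >> 6)),
                (byte) (0x80 | (c & 0x3F))
            };
        } else {
            return new byte[] {
                (byte) (0xE0 | (c >> 12)),
                (byte) (0x80 | ((c >> 6) & 0x3F)),
                (byte) (0x80 | (c & 0x3F))
            };
        }
    }

    // Two-byte form, as given in the text.
    static int decode2(int x, int y) {
        return ((x & 0x1f) << 6) + (y & 0x3f);
    }

    // Three-byte form, as given in the text.
    static int decode3(int x, int y, int z) {
        return ((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f);
    }
}
```

  For example, the null code point encodes to the two bytes 0xC0 0x80, and U+20AC encodes to 0xE2 0x82 0xAC, which decode3 maps back to 0x20AC.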

  Characters with code points above U+FFFF (so-called supplementary characters) are represented by separately encoding the two surrogate code units of their UTF-16 representation. Each surrogate code unit is represented by three bytes. This means a supplementary character is represented by six bytes u, v, w, x, y, and z:

  u:

1 1 1 0 1 1 0 1

  v:

1 0 1 0 | (bits 20-16)-1

  w:

1 0 | bits 15-10

  x:

1 1 1 0 1 1 0 1

  y:

1 0 1 1 | bits 9-6

  z:

1 0 | bits 5-0

  The six bytes represent the code point with the value:

0x10000 + ((v & 0x0f) << 16) + ((w & 0x3f) << 10) +
((y & 0x0f) << 6) + (z & 0x3f)
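
  The six-byte form can be reproduced by splitting a supplementary code point into its UTF-16 surrogates and applying the three-byte layout to each. The sketch below does that and decodes with the expression above; ModifiedUtf8Supplementary and its methods are hypothetical names.

```java
// Hypothetical encoder/decoder for supplementary characters in modified UTF-8.
class ModifiedUtf8Supplementary {
    // Encodes a code point in U+10000..U+10FFFF as six bytes u, v, w, x, y, z.
    static byte[] encode(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point: " + codePoint);
        }
        char hi = Character.highSurrogate(codePoint);
        char lo = Character.lowSurrogate(codePoint);
        return new byte[] {
            (byte) 0xED,                          // u
            (byte) (0x80 | ((hi >> 6) & 0x3F)),   // v
            (byte) (0x80 | (hi & 0x3F)),          // w
            (byte) 0xED,                          // x
            (byte) (0x80 | ((lo >> 6) & 0x3F)),   // y
            (byte) (0x80 | (lo & 0x3F))           // z
        };
    }

    // Applies the six-byte decoding expression from the text.
    static int decode(byte[] b) {
        int v = b[1], w = b[2], y = b[4], z = b[5];
        return 0x10000 + ((v & 0x0f) << 16) + ((w & 0x3f) << 10)
                + ((y & 0x0f) << 6) + (z & 0x3f);
    }
}
```

  As a worked case, U+1F600 has the surrogates 0xD83D and 0xDE00 and encodes to the bytes ED A0 BD ED B8 80.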

  The bytes of multibyte characters are stored in the class file in big-endian (high byte first) order.

  There are two differences between this format and the "standard" UTF-8 format. First, the null character (char)0 is encoded using the 2-byte format rather than the 1-byte format, so that modified UTF-8 strings never contain embedded nulls. Second, only the 1-byte, 2-byte, and 3-byte formats of standard UTF-8 are used. The Java Virtual Machine does not recognize the four-byte format of standard UTF-8; it uses its own two-times-three-byte format instead.
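
  Both differences can be observed with standard library classes. The sketch below assumes java.io.DataOutputStream.writeUTF as the source of modified UTF-8 bytes (it writes a 2-byte length followed by the encoded string) and compares the result with String.getBytes using the UTF-8 charset; Utf8Comparison and its methods are hypothetical names.

```java
// Hypothetical comparison of modified UTF-8 and standard UTF-8 byte sequences.
class Utf8Comparison {
    // Modified UTF-8 bytes of s, without the 2-byte length prefix that writeUTF emits.
    static byte[] modifiedUtf8(String s) {
        try {
            java.io.ByteArrayOutputStream buf = new java.io.ByteArrayOutputStream();
            new java.io.DataOutputStream(buf).writeUTF(s);
            byte[] all = buf.toByteArray();
            return java.util.Arrays.copyOfRange(all, 2, all.length);
        } catch (java.io.IOException e) {
            throw new java.io.UncheckedIOException(e);
        }
    }

    // Standard UTF-8 bytes of s.
    static byte[] standardUtf8(String s) {
        return s.getBytes(java.nio.charset.StandardCharsets.UTF_8);
    }
}
```

  With these helpers, a string holding only the null character takes two bytes in modified UTF-8 and one in standard UTF-8, while a supplementary character such as U+1F600 takes six bytes in modified UTF-8 and four in standard UTF-8. Plain non-null ASCII text produces identical bytes in both.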


Origin www.cnblogs.com/lilinwei340/p/11408611.html