How much information about the source code does the compiled class file keep in Java?

Yulin :

When IntelliJ decompiled a class file from another project of my own (using the Fernflower decompiler), I marveled at how close the decompiled code was to the original source. Even the names of method-local variables were the same.

I don't know anything about how the Java compilation process or the JVM works. My naive understanding is that the names of the public stuff may need to be kept after compilation, but the names of local variables are simply mnemonics to help human readers. They are useless outside of their scope, and I don't think the JVM needs this information.

So, is this information simply figured out by the decompiler through some magic, or does the compiled class retain a lot of information, and if so, what for?

Marco13 :

In the end, it depends on the actual compiler and the exact compilation settings.

As you noted, the JVM itself does not need any local variable names. (Strictly speaking, it doesn't really need meaningful method names either. At the bytecode level it is even possible to have two methods with the same name and arguments that differ only in the return type, because the method descriptor includes the return type, but I'd have to look up more details in the spec to say anything more definite.) However, the class file can contain additional debug information that goes beyond what the JVM requires.

The standard Java compiler is javac. And the documentation already contains some hints about the possible debug information:

-g

Generates all debugging information, including local variables. By default, only line number and source file information is generated.

-g:none

Does not generate any debugging information.

-g:[keyword list]

Generates only some kinds of debugging information, specified by a comma separated list of keywords. Valid keywords are:

  • source: Source file debugging information.
  • lines: Line number debugging information.
  • vars: Local variable debugging information.

One can try this out with an example:

public class ExampleClass {
    public static void main(String[] args) {

        ExampleClass exampleClass = new ExampleClass();
        exampleClass.exampleMethod();
    }

    public void exampleMethod() {
        String string = "This is an example";
        for (int counter = 0; counter < 10; counter++) {
            String localResult = string + counter;
            System.out.println(localResult);
        }
    }
}

Compiling this with

javac ExampleClass.java -g:none

will generate a class file. When information about this class file is printed with

javap -c -v -l ExampleClass.class

(where -c disassembles the code, -v makes the output verbose, and -l prints the line number and local variable tables), the output is as follows:

public class ExampleClass
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #13.#22        // java/lang/Object."<init>":()V
   #2 = Class              #23            // ExampleClass
   #3 = Methodref          #2.#22         // ExampleClass."<init>":()V
   #4 = Methodref          #2.#24         // ExampleClass.exampleMethod:()V
   #5 = String             #25            // This is an example
   #6 = Class              #26            // java/lang/StringBuilder
   #7 = Methodref          #6.#22         // java/lang/StringBuilder."<init>":()V
   #8 = Methodref          #6.#27         // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   #9 = Methodref          #6.#28         // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
  #10 = Methodref          #6.#29         // java/lang/StringBuilder.toString:()Ljava/lang/String;
  #11 = Fieldref           #30.#31        // java/lang/System.out:Ljava/io/PrintStream;
  #12 = Methodref          #32.#33        // java/io/PrintStream.println:(Ljava/lang/String;)V
  #13 = Class              #34            // java/lang/Object
  #14 = Utf8               <init>
  #15 = Utf8               ()V
  #16 = Utf8               Code
  #17 = Utf8               main
  #18 = Utf8               ([Ljava/lang/String;)V
  #19 = Utf8               exampleMethod
  #20 = Utf8               StackMapTable
  #21 = Class              #35            // java/lang/String
  #22 = NameAndType        #14:#15        // "<init>":()V
  #23 = Utf8               ExampleClass
  #24 = NameAndType        #19:#15        // exampleMethod:()V
  #25 = Utf8               This is an example
  #26 = Utf8               java/lang/StringBuilder
  #27 = NameAndType        #36:#37        // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
  #28 = NameAndType        #36:#38        // append:(I)Ljava/lang/StringBuilder;
  #29 = NameAndType        #39:#40        // toString:()Ljava/lang/String;
  #30 = Class              #41            // java/lang/System
  #31 = NameAndType        #42:#43        // out:Ljava/io/PrintStream;
  #32 = Class              #44            // java/io/PrintStream
  #33 = NameAndType        #45:#46        // println:(Ljava/lang/String;)V
  #34 = Utf8               java/lang/Object
  #35 = Utf8               java/lang/String
  #36 = Utf8               append
  #37 = Utf8               (Ljava/lang/String;)Ljava/lang/StringBuilder;
  #38 = Utf8               (I)Ljava/lang/StringBuilder;
  #39 = Utf8               toString
  #40 = Utf8               ()Ljava/lang/String;
  #41 = Utf8               java/lang/System
  #42 = Utf8               out
  #43 = Utf8               Ljava/io/PrintStream;
  #44 = Utf8               java/io/PrintStream
  #45 = Utf8               println
  #46 = Utf8               (Ljava/lang/String;)V
{
  public ExampleClass();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=2, args_size=1
         0: new           #2                  // class ExampleClass
         3: dup
         4: invokespecial #3                  // Method "<init>":()V
         7: astore_1
         8: aload_1
         9: invokevirtual #4                  // Method exampleMethod:()V
        12: return

  public void exampleMethod();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=4, args_size=1
         0: ldc           #5                  // String This is an example
         2: astore_1
         3: iconst_0
         4: istore_2
         5: iload_2
         6: bipush        10
         8: if_icmpge     43
        11: new           #6                  // class java/lang/StringBuilder
        14: dup
        15: invokespecial #7                  // Method java/lang/StringBuilder."<init>":()V
        18: aload_1
        19: invokevirtual #8                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        22: iload_2
        23: invokevirtual #9                  // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
        26: invokevirtual #10                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        29: astore_3
        30: getstatic     #11                 // Field java/lang/System.out:Ljava/io/PrintStream;
        33: aload_3
        34: invokevirtual #12                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        37: iinc          2, 1
        40: goto          5
        43: return
      StackMapTable: number_of_entries = 2
        frame_type = 253 /* append */
          offset_delta = 5
          locals = [ class java/lang/String, int ]
        frame_type = 250 /* chop */
          offset_delta = 37
}

That's quite a lot of information, but nothing beyond the actual structure of the class itself.

(You mentioned that the names of "public stuff" have to be kept. But the names of "private stuff" also have to be kept, at the very least for reflection. With methods like Class#getDeclaredFields, you can still access private fields, for example, so their names must be available somewhere.)
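The point about reflection can be checked with a small sketch. The class Account and its field secretBalance are hypothetical names made up for this illustration; only Class#getDeclaredFields, Field#getName and Field#isSynthetic are real API.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class PrivateNamesDemo {

    // Hypothetical class with a private field, used only for illustration.
    static class Account {
        private int secretBalance = 42;
    }

    // Returns the names of all fields declared by the given class,
    // including private ones. Compiler-generated (synthetic) fields are skipped.
    static List<String> declaredFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (!field.isSynthetic()) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The private field name is reported regardless of the -g setting,
        // because field names are part of the class structure itself.
        System.out.println(declaredFieldNames(Account.class));
    }
}
```

The name of the private field is available here because it is stored as part of the field declaration in the class file, not as debug information.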


Now, the opposite is to compile it with

javac ExampleClass.java -g

to retain all debugging information. Printing the result as described above yields

public class ExampleClass
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #13.#36        // java/lang/Object."<init>":()V
   #2 = Class              #37            // ExampleClass
   #3 = Methodref          #2.#36         // ExampleClass."<init>":()V
   #4 = Methodref          #2.#38         // ExampleClass.exampleMethod:()V
   #5 = String             #39            // This is an example
   #6 = Class              #40            // java/lang/StringBuilder
   #7 = Methodref          #6.#36         // java/lang/StringBuilder."<init>":()V
   #8 = Methodref          #6.#41         // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   #9 = Methodref          #6.#42         // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
  #10 = Methodref          #6.#43         // java/lang/StringBuilder.toString:()Ljava/lang/String;
  #11 = Fieldref           #44.#45        // java/lang/System.out:Ljava/io/PrintStream;
  #12 = Methodref          #46.#47        // java/io/PrintStream.println:(Ljava/lang/String;)V
  #13 = Class              #48            // java/lang/Object
  #14 = Utf8               <init>
  #15 = Utf8               ()V
  #16 = Utf8               Code
  #17 = Utf8               LineNumberTable
  #18 = Utf8               LocalVariableTable
  #19 = Utf8               this
  #20 = Utf8               LExampleClass;
  #21 = Utf8               main
  #22 = Utf8               ([Ljava/lang/String;)V
  #23 = Utf8               args
  #24 = Utf8               [Ljava/lang/String;
  #25 = Utf8               exampleClass
  #26 = Utf8               exampleMethod
  #27 = Utf8               localResult
  #28 = Utf8               Ljava/lang/String;
  #29 = Utf8               counter
  #30 = Utf8               I
  #31 = Utf8               string
  #32 = Utf8               StackMapTable
  #33 = Class              #49            // java/lang/String
  #34 = Utf8               SourceFile
  #35 = Utf8               ExampleClass.java
  #36 = NameAndType        #14:#15        // "<init>":()V
  #37 = Utf8               ExampleClass
  #38 = NameAndType        #26:#15        // exampleMethod:()V
  #39 = Utf8               This is an example
  #40 = Utf8               java/lang/StringBuilder
  #41 = NameAndType        #50:#51        // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
  #42 = NameAndType        #50:#52        // append:(I)Ljava/lang/StringBuilder;
  #43 = NameAndType        #53:#54        // toString:()Ljava/lang/String;
  #44 = Class              #55            // java/lang/System
  #45 = NameAndType        #56:#57        // out:Ljava/io/PrintStream;
  #46 = Class              #58            // java/io/PrintStream
  #47 = NameAndType        #59:#60        // println:(Ljava/lang/String;)V
  #48 = Utf8               java/lang/Object
  #49 = Utf8               java/lang/String
  #50 = Utf8               append
  #51 = Utf8               (Ljava/lang/String;)Ljava/lang/StringBuilder;
  #52 = Utf8               (I)Ljava/lang/StringBuilder;
  #53 = Utf8               toString
  #54 = Utf8               ()Ljava/lang/String;
  #55 = Utf8               java/lang/System
  #56 = Utf8               out
  #57 = Utf8               Ljava/io/PrintStream;
  #58 = Utf8               java/io/PrintStream
  #59 = Utf8               println
  #60 = Utf8               (Ljava/lang/String;)V
{
  public ExampleClass();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   LExampleClass;

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=2, args_size=1
         0: new           #2                  // class ExampleClass
         3: dup
         4: invokespecial #3                  // Method "<init>":()V
         7: astore_1
         8: aload_1
         9: invokevirtual #4                  // Method exampleMethod:()V
        12: return
      LineNumberTable:
        line 4: 0
        line 5: 8
        line 6: 12
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      13     0  args   [Ljava/lang/String;
            8       5     1 exampleClass   LExampleClass;

  public void exampleMethod();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=4, args_size=1
         0: ldc           #5                  // String This is an example
         2: astore_1
         3: iconst_0
         4: istore_2
         5: iload_2
         6: bipush        10
         8: if_icmpge     43
        11: new           #6                  // class java/lang/StringBuilder
        14: dup
        15: invokespecial #7                  // Method java/lang/StringBuilder."<init>":()V
        18: aload_1
        19: invokevirtual #8                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        22: iload_2
        23: invokevirtual #9                  // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
        26: invokevirtual #10                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        29: astore_3
        30: getstatic     #11                 // Field java/lang/System.out:Ljava/io/PrintStream;
        33: aload_3
        34: invokevirtual #12                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        37: iinc          2, 1
        40: goto          5
        43: return
      LineNumberTable:
        line 9: 0
        line 10: 3
        line 11: 11
        line 12: 30
        line 10: 37
        line 14: 43
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
           30       7     3 localResult   Ljava/lang/String;
            5      38     2 counter   I
            0      44     0  this   LExampleClass;
            3      41     1 string   Ljava/lang/String;
      StackMapTable: number_of_entries = 2
        frame_type = 253 /* append */
          offset_delta = 5
          locals = [ class java/lang/String, int ]
        frame_type = 250 /* chop */
          offset_delta = 37
}
SourceFile: "ExampleClass.java"

The main differences are

  • the class contains the source file name
  • the constant pool has many more entries
  • the methods contain a LineNumberTable and a LocalVariableTable.
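A rough way to see these differences without reading the disassembly is to compare class file sizes. The following is only a sketch under some assumptions: it needs to run on a full JDK (so that ToolProvider.getSystemJavaCompiler() does not return null), the class Sample and its source text are made up for the illustration, and the temporary directory is not cleaned up.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DebugFlagSizes {

    // Hypothetical source used for the comparison; any class with locals would do.
    static final String SOURCE =
        "public class Sample {\n" +
        "    public int sum(int limit) {\n" +
        "        int total = 0;\n" +
        "        for (int counter = 0; counter < limit; counter++) {\n" +
        "            total += counter;\n" +
        "        }\n" +
        "        return total;\n" +
        "    }\n" +
        "}\n";

    // Compiles SOURCE with the given -g option and returns the size of
    // the resulting Sample.class in bytes.
    static long compiledSize(String debugOption) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler (running on a JRE?)");
        }
        try {
            Path dir = Files.createTempDirectory("debugflags");
            Path source = dir.resolve("Sample.java");
            Files.write(source, SOURCE.getBytes(StandardCharsets.UTF_8));
            int exitCode = compiler.run(null, null, null,
                    debugOption, "-d", dir.toString(), source.toString());
            if (exitCode != 0) {
                throw new IllegalStateException("Compilation failed with " + debugOption);
            }
            return Files.size(dir.resolve("Sample.class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String option : new String[] {"-g:none", "-g:source,lines", "-g"}) {
            System.out.println(option + " -> " + compiledSize(option) + " bytes");
        }
    }
}
```

The exact byte counts depend on the compiler version, so none are given here. The class compiled with -g should be the largest, since it carries the SourceFile attribute and both tables in addition to the extra constant pool entries.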

For example, consider the exampleMethod():

LineNumberTable:
  line 9: 0
  line 10: 3
  line 11: 11
  line 12: 30
  line 10: 37
  line 14: 43
LocalVariableTable:
  Start  Length  Slot  Name   Signature
     30       7     3 localResult   Ljava/lang/String;
      5      38     2 counter   I
      0      44     0  this   LExampleClass;
      3      41     1 string   Ljava/lang/String;

The details about the structure of these attributes are given in the Java Virtual Machine Specification, in the descriptions of the LineNumberTable and LocalVariableTable attributes.

For the LineNumberTable, it says

It may be used by debuggers to determine which part of the code array corresponds to a given line number in the original source file.

For the LocalVariableTable, it says

It may be used by debuggers to determine the value of a given local variable during the execution of a method.

In the output of javap, the names of the local variables are already resolved. However, the actual information that is contained in the table itself is only an index into the constant pool (that's why it has more entries when debugging information is retained). For example, the entry for the localResult variable is shown as

     30       7     3 localResult   Ljava/lang/String;

although it actually only contains a reference to the entry

  #27 = Utf8               localResult

of the constant pool.
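To make the role of the constant pool more concrete, here is a minimal sketch that reads the Utf8 entries of a class file directly, without javap. The class name ConstantPoolNames and its helper methods are made up for this illustration. It only walks the constant pool and does not parse the attributes themselves, so finding the string "LocalVariableTable" in the pool is a heuristic: a string literal with the same text would also show up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ConstantPoolNames {

    // Returns all Utf8 constants in the constant pool of the given class file bytes.
    static List<String> utf8Entries(byte[] classFile) {
        List<String> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file");
            }
            in.readUnsignedShort(); // minor version
            in.readUnsignedShort(); // major version
            int count = in.readUnsignedShort(); // valid indices are 1 .. count-1
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: result.add(in.readUTF()); break;   // Utf8
                    case 3: case 4: skip(in, 4); break;        // Integer, Float
                    case 5: case 6: skip(in, 8); i++; break;   // Long, Double occupy two slots
                    case 7: case 8: case 16: case 19: case 20: skip(in, 2); break;
                    case 9: case 10: case 11: case 12: case 17: case 18: skip(in, 4); break;
                    case 15: skip(in, 3); break;               // MethodHandle
                    default: throw new IllegalArgumentException("Unknown constant pool tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]);
    }

    // Loads the class file bytes of a class that is visible to the system class loader.
    static byte[] classBytes(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("No class file found for " + type);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            for (int n; (n = in.read(buffer)) != -1; ) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass a path such as ExampleClass.class; without arguments, java.lang.Object is used.
        byte[] bytes = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : classBytes(Object.class);
        List<String> names = utf8Entries(bytes);
        System.out.println("LocalVariableTable present: " + names.contains("LocalVariableTable"));
        System.out.println("LineNumberTable present:    " + names.contains("LineNumberTable"));
        System.out.println("localResult present:        " + names.contains("localResult"));
    }
}
```

Going by the two javap listings above, running this against ExampleClass.class should report all three names as present for the -g build and as absent for the -g:none build.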


So, is this information simply figured out by the decompiler through some magic, or does the compiled class retain a lot of information, and if so, what for?

As shown above, the compiled class can retain a lot of information. After all, one of the main purposes of an IDE is to provide a nice, visual interface to a debugger. Therefore, most compilers that are triggered by an IDE in one way or another will, by default, try to retain as much debug information as possible.
