Java compilation encoding problem

A recent bug caused by an encoding problem made me wonder about a related encoding question.
We generally save Java source files as UTF-8, but the Tomcat console (the black window that opens when Tomcat is started directly from its bin directory) uses GBK. Why is the output in the Tomcat console not garbled?

At first I thought that because my Java source files are encoded as UTF-8, the text would only display correctly if it was also decoded as UTF-8. So why is nothing garbled when the Tomcat console decodes with GBK?

So I looked at how javac compiles a source file:
   javac decodes the source file using the charset given by its -encoding option. If that option is not given, it falls back to the platform default charset (the file.encoding system property), which on Chinese Windows has traditionally been GBK (JDK 18 and later default to UTF-8). When we write files in Eclipse, we generally set the file encoding to UTF-8, so Eclipse compiles them as UTF-8.
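
To see which default applies on a particular machine, a small sketch like the following can help. The class name DefaultEncodingCheck is only an illustration; it prints the charset the JVM falls back to when no charset is given explicitly, together with the raw file.encoding property.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of the original experiment.
class DefaultEncodingCheck {
    // Name of the charset the JVM uses when no charset is specified explicitly.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("default charset = " + defaultCharsetName());
        // Printed as-is; the value depends on the JDK version, the OS locale
        // and any -Dfile.encoding=... passed on the command line.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
    }
}
```

Running it once normally and once with java -Dfile.encoding=UTF-8 DefaultEncodingCheck shows how the setting changes the result.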

   Here is the key point: inside a compiled Java program, text is stored as Unicode, so compilation involves a transcoding step. The bytes of the source file are decoded from the source file's encoding into Unicode characters. I did an experiment:
  
    public class Test {
        public static void main(String[] args) {
            System.out.println((int) '我'); // print the character's Unicode value
        }
    }

    Save the source file as GBK, compile and run it, then save it as UTF-8 and do the same, each time compiling with the matching encoding. In both cases the value printed is 25105, the Unicode code point of '我' (U+6211). So Java converts the source text to Unicode no matter which encoding the file was saved in. Unicode acts as a bridge.
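
The same bridge can be reproduced at runtime without touching javac. The sketch below is my own illustration (the class and method names are made up): it takes the bytes that represent U+6211 in GBK and in UTF-8, decodes each with the matching charset, and prints the resulting value. It assumes the JDK in use ships the GBK charset, which full JDK installs normally do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: different byte encodings, same Unicode value.
class UnicodeBridgeDemo {
    // Decode raw bytes with the given charset and return the first char as an int.
    static int decodeFirstChar(byte[] bytes, Charset charset) {
        return new String(bytes, charset).charAt(0);
    }

    public static void main(String[] args) {
        byte[] gbkBytes = {(byte) 0xCE, (byte) 0xD2};               // U+6211 in GBK
        byte[] utf8Bytes = {(byte) 0xE6, (byte) 0x88, (byte) 0x91}; // U+6211 in UTF-8

        // Both lines print 25105.
        System.out.println(decodeFirstChar(gbkBytes, Charset.forName("GBK")));
        System.out.println(decodeFirstChar(utf8Bytes, StandardCharsets.UTF_8));
    }
}
```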

    The remaining step is printing the output to the console.
    What does the following code mean?
  
   
     String a = "Hello";
     byte[] bytes = a.getBytes("utf-8");

     This code converts the string's Unicode characters into UTF-8 bytes. Those bytes can then be handed to the console for display.
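
To make the conversion visible, here is a minimal sketch that prints the bytes produced for the same string under two charsets. The class GetBytesDemo and its hex helper are names I made up for illustration, and the GBK charset is assumed to be available in the JDK. Note that an ASCII string such as "Hello" yields identical bytes in UTF-8 and GBK, so the difference only shows up with a character like U+6211.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one string, two byte encodings.
class GetBytesDemo {
    // Format bytes as upper-case hex pairs separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String text = "\u6211"; // written as an escape so the source stays ASCII
        Charset gbk = Charset.forName("GBK");

        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8))); // E6 88 91
        System.out.println(hex(text.getBytes(gbk)));                    // CE D2
        // ASCII text encodes to the same bytes in both charsets.
        System.out.println(hex("Hello".getBytes(StandardCharsets.UTF_8))
                .equals(hex("Hello".getBytes(gbk))));                    // true
    }
}
```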
     
     Now back to the original question: why is the output in the Tomcat console not garbled?

Because:
1. When compiling, Java decodes the source file according to the file's encoding and keeps the text in memory as Unicode. Whether the file is GBK or UTF-8, the characters are looked up in the corresponding code table and end up as the same Unicode values.
2. When writing to the console, Java encodes the text according to the current file.encoding setting (-Dfile.encoding). On Windows that is GBK, so Java converts the Unicode text to GBK bytes, and the Tomcat console then decodes those bytes as GBK. Both sides use the same encoding, so nothing is garbled.
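
The two steps above can be simulated without a real console. This is only a sketch under my own assumptions: ConsoleSimulation and show are made-up names, a ByteArrayOutputStream stands in for the console's input, and the GBK charset is assumed to be available. A PrintStream encodes the text with one charset, and the "console" decodes the bytes with another.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical simulation of "Java writes bytes, console decodes them".
class ConsoleSimulation {
    // Write text through a PrintStream that encodes with outCharset, then
    // decode the captured bytes the way a console using consoleCharset would.
    static String show(String text, String outCharset, String consoleCharset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(buffer, true, outCharset);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + outCharset, e);
        }
        return new String(buffer.toByteArray(), Charset.forName(consoleCharset));
    }

    public static void main(String[] args) {
        String text = "\u6211";
        // Java encodes as GBK, console decodes as GBK: text survives.
        System.out.println(show(text, "GBK", "GBK").equals(text));   // true
        // Java encodes as UTF-8, console decodes as GBK: text is garbled.
        System.out.println(show(text, "UTF-8", "GBK").equals(text)); // false
    }
}
```

The second call is what would happen if the JVM were started with a UTF-8 output encoding while the console window still expected GBK.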

In other words: in Java we do not need to worry about the encoding of the source file here. We only need to make sure that the encoding used when the text is turned into bytes, for example in String.getBytes("encoding"), is the same encoding that is finally used to decode those bytes.
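
As a rough way to check this rule for a given pair of charsets, a helper along these lines could be used. RoundTripCheck and survives are hypothetical names, and GBK is again assumed to be available. The last case shows a second way to lose text: encoding into a charset that cannot represent the character at all.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: does text survive encode-then-decode?
class RoundTripCheck {
    // Encode with one charset, decode with another, and compare to the original.
    static boolean survives(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith).equals(text);
    }

    public static void main(String[] args) {
        String text = "\u6211";
        Charset gbk = Charset.forName("GBK");

        System.out.println(survives(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8)); // true
        System.out.println(survives(text, gbk, gbk));                                       // true
        System.out.println(survives(text, StandardCharsets.UTF_8, gbk));                    // false
        // ISO-8859-1 cannot represent U+6211, so the character is replaced on encoding.
        System.out.println(survives(text, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1)); // false
    }
}
```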
