(Advanced IO operations) Character encoding

A computer only recognizes 0 and 1. To represent characters, these binary values have to be combined according to some rule, and that rule is an encoding. To display encoded content correctly it must also be decoded, so encoding and decoding must use the same standard; when they do not, garbled text appears.
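The mismatch described above can be reproduced in a few lines. The following is a minimal sketch, not code from the original article: the class name `EncodingMismatchDemo` and the helper `transcode` are made up for illustration, and the sample text is written with Unicode escapes so that the encoding of the source file itself does not affect the result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encodes text with one charset and decodes it with another.
class EncodingMismatchDemo {

    // Turn the text into bytes with encodeWith, then read those bytes back with decodeWith.
    static String transcode(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // the two Chinese characters for "Chinese"

        // Same standard on both sides: the text survives.
        String matched = transcode(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        // Different standards: each of the 6 UTF-8 bytes is read as a separate Latin character.
        String mismatched = transcode(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(original.equals(matched));    // true
        System.out.println(original.equals(mismatched)); // false
        System.out.println(mismatched.length());         // 6
    }
}
```

The bytes are identical in both cases; only the decoding rule differs, which is why the garbled result has six characters instead of two.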

The encodings commonly used in development are as follows:

  • GBK/GB2312: Chinese national standard encodings, which can describe Chinese text. GB2312 covers only Simplified Chinese, while GBK covers both Simplified and Traditional;
  • ISO8859-1: An international single-byte encoding that can describe Latin letters; other text, such as Chinese characters, cannot be represented in it directly and has to be converted;
  • UNICODE: Assigns every character a number, conventionally written in hexadecimal, and can describe all text;
  • UTF: A variable-length encoding of UNICODE. Ordinary letters take a single byte, as they do in ISO8859-1, while characters such as Chinese take several bytes, which suits fast transmission and saves bandwidth. It has become the first choice in development, mainly in the form of "UTF-8".
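To make the differences between these encodings concrete, the sketch below counts how many bytes the same character occupies in each of them. It is an illustration under stated assumptions: the class `EncodedLengthDemo` and its helper `encodedLength` are hypothetical names, and GBK is only queried when the running JVM reports that it is supported, since it is not one of the charsets every Java platform is required to provide.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares the encoded size of a letter and a Chinese character.
class EncodedLengthDemo {

    // Number of bytes the text occupies when encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String letter = "A";
        String han = "\u4e2d"; // one Chinese character, written as an escape

        System.out.println(encodedLength(letter, StandardCharsets.ISO_8859_1)); // 1
        System.out.println(encodedLength(letter, StandardCharsets.UTF_8));      // 1
        System.out.println(encodedLength(han, StandardCharsets.UTF_8));         // 3
        System.out.println(encodedLength(han, StandardCharsets.UTF_16BE));      // 2

        // ISO8859-1 has no mapping for this character, so getBytes substitutes '?'.
        System.out.println((char) han.getBytes(StandardCharsets.ISO_8859_1)[0]); // ?

        // GBK is an optional charset, so check for it before using it.
        if (Charset.isSupported("GBK")) {
            System.out.println(encodedLength(han, Charset.forName("GBK")));     // 2
        }
    }
}
```

The letter costs one byte in both ISO8859-1 and UTF-8, which is the bandwidth saving mentioned in the list, while the Chinese character only survives in the encodings that can actually represent it.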

If you want to know which encoding the current system uses by default, you can list the machine's properties with the following code and look at the file.encoding entry.

System.getProperties().list(System.out);
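The full property list is long, so it can be more convenient to ask for the encoding information directly. The sketch below is one possible way to do that with the standard `java.nio.charset.Charset` API; the class name `CharsetInfoDemo` and the helper `isSupported` are hypothetical, and the values printed depend on the machine and JDK version.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: reports the default charset and the charsets this JVM supports.
class CharsetInfoDemo {

    // True if the running JVM can encode and decode with the named charset.
    static boolean isSupported(String charsetName) {
        return Charset.isSupported(charsetName);
    }

    public static void main(String[] args) {
        // The same value that appears in the property list under "file.encoding".
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        // The charset the JVM uses when no charset is given explicitly.
        System.out.println("default charset = " + Charset.defaultCharset());

        System.out.println("UTF-8 supported: " + isSupported("UTF-8"));
        System.out.println("GBK supported:   " + isSupported("GBK"));

        // Canonical names of every charset available in this JVM.
        for (String name : Charset.availableCharsets().keySet()) {
            System.out.println(name);
        }
    }
}
```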

Garbled text in a project appears when the encoding and decoding standards do not match, and the best way to avoid it is to use UTF-8 everywhere.
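In Java code, one way to follow this advice is to name UTF-8 explicitly whenever text is turned into bytes or back, instead of relying on the platform default. The following is a minimal sketch under that assumption; the class `Utf8FileDemo` and the method `roundTripThroughTempFile` are illustrative names, and a temporary file stands in for whatever file or stream a real project would use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: writes and reads a file with UTF-8 named on both sides.
class Utf8FileDemo {

    // Write the text to a temporary file as UTF-8, read it back as UTF-8, then clean up.
    static String roundTripThroughTempFile(String text) {
        try {
            Path file = Files.createTempFile("utf8-demo", ".txt");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "Hello, \u4e16\u754c"; // mixed Latin and Chinese text
        System.out.println(text.equals(roundTripThroughTempFile(text))); // true
    }
}
```

Because the charset is stated on both the write and the read, the result does not depend on the default encoding of the machine that runs the code.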

Origin blog.csdn.net/weixin_46245201/article/details/112857142