Common encodings (code tables)

ASCII: American Standard Code for Information Interchange

Uses 7 bits of a byte. It can represent only English letters, digits, and special symbols.
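
As a small illustration of the 7-bit limit, the following Python sketch checks that every character of an English sample string has a code below 128, and that a character with a diacritic cannot be encoded as ASCII. The sample strings are chosen only for illustration.

```python
# Every ASCII character has a code below 128, so it fits in 7 bits.
text = "Hello, 123!"
codes = [ord(ch) for ch in text]
ascii_bytes = text.encode("ascii")

print(codes)
print(max(codes) < 128)                # all codes fit in 7 bits
print(len(ascii_bytes) == len(text))   # one byte per character

# A character outside the ASCII table cannot be encoded.
try:
    "é".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)
```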

 

ISO8859-1 (Latin-1): the European (Latin) code table

Uses all 8 bits of a byte. Also known as Latin-1 or "Western European". ASCII defines only 128 characters, so it does not fill all 256 positions of a byte. ISO8859-1 keeps ASCII unchanged and adds 96 letters and symbols in the vacant range 0xA0-0xFF, for languages that write the Latin alphabet with diacritics, such as German and French. It is still a single-byte encoding, but it covers more than ASCII.
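
A minimal Python sketch of the single-byte property, assuming a German sample word: each character becomes exactly one byte, the accented letters land in the 0xA0-0xFF range, and a Chinese character cannot be encoded at all.

```python
word = "Größe"  # German sample containing ö and ß
latin1_bytes = word.encode("iso-8859-1")

print(latin1_bytes.hex(" "))
print(len(latin1_bytes) == len(word))  # still one byte per character

# Bytes are either plain ASCII or fall in the extended 0xA0-0xFF range.
in_expected_ranges = all(b < 0x80 or 0xA0 <= b <= 0xFF for b in latin1_bytes)
print(in_expected_ranges)

# Chinese characters are outside this code table.
try:
    "中".encode("iso-8859-1")
    chinese_failed = False
except UnicodeEncodeError:
    chinese_failed = True
print(chinese_failed)
```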

 

GB2312: China's code table for Chinese characters

(When the operating system language is Chinese, Notepad's default encoding is GB2312)

 

GBK: an upgrade of China's Chinese code table that adds more Chinese characters and symbols.
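
The relationship between the two tables can be sketched in Python. The example assumes that the character 镕 is one of those added by GBK and absent from GB2312, which is the commonly cited case. A character present in both tables gets the same bytes in both.

```python
common = "中"  # present in both GB2312 and GBK
rare = "镕"    # assumed here to exist only in GBK

gb2312_bytes = common.encode("gb2312")
gbk_bytes = common.encode("gbk")
print(gb2312_bytes == gbk_bytes)  # GBK keeps the GB2312 codes

# Try the rarer character with the older table.
try:
    rare.encode("gb2312")
    in_gb2312 = True
except UnicodeEncodeError:
    in_gb2312 = False
print(in_gb2312)

rare_gbk = rare.encode("gbk")
print(len(rare_gbk))  # number of bytes GBK uses for it
```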

 

Unicode: the international standard character set, which brings the world's writing systems together in one table.

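In the original 16-bit design, every character is represented by two bytes. Java uses Unicode internally: a char is one 16-bit UTF-16 code unit, and characters beyond the 16-bit range take two of them.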
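
As a rough illustration of the two-byte representation, the sketch below uses Python's utf-16-be codec as a stand-in for the 16-bit form Java uses in memory (an assumption; the text does not name a specific encoding form). It prints each character's Unicode code point and the number of bytes it occupies. The emoji is included to show a character outside the 16-bit range, which needs two 16-bit units.

```python
samples = ["A", "中", "😀"]
utf16_sizes = {}

for ch in samples:
    code_point = ord(ch)                 # the character's Unicode number
    encoded = ch.encode("utf-16-be")     # 16-bit units, big-endian, no BOM
    utf16_sizes[ch] = len(encoded)
    print(f"U+{code_point:04X} takes {len(encoded)} bytes")
```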

 

UTF-8: uses one to four bytes to represent a character.

 

(The encodings you will meet most often are ISO8859-1, GBK, and UTF-8)

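ISO8859-1: one byte per character

GBK: two bytes per Chinese character, including the extended Chinese characters; English (ASCII) characters take one byte

UTF-8: the form of Unicode used in practice. It is variable-length, from 1 to 4 bytes per character. English letters are stored in one byte and most Chinese characters in three bytes, in order to save space.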
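
The byte counts above can be checked with a short Python sketch. The helper name byte_length and the sample characters are illustrative choices, not part of any library.

```python
def byte_length(text, encoding):
    """Return how many bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))

english = "A"
chinese = "中"

sizes = {
    ("iso-8859-1", english): byte_length(english, "iso-8859-1"),
    ("gbk", english): byte_length(english, "gbk"),
    ("gbk", chinese): byte_length(chinese, "gbk"),
    ("utf-8", english): byte_length(english, "utf-8"),
    ("utf-8", chinese): byte_length(chinese, "utf-8"),
}

for (encoding, ch), size in sizes.items():
    print(f"{encoding:10} {ch!r}: {size} byte(s)")
```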

 

How to avoid garbled text?

Decode with the same code table that was used for encoding.
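
A minimal Python sketch of the rule: the same bytes are decoded once with the matching code table and once with a different one. The errors="replace" argument is used only so that the mismatched decode cannot raise an exception in this demonstration.

```python
original = "中文"
data = original.encode("utf-8")   # encode with UTF-8

right = data.decode("utf-8")                    # same table: text survives
wrong = data.decode("gbk", errors="replace")    # different table: garbled

print(right == original)
print(wrong == original)
print(wrong)
```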

 

In what form does a computer save data?

Binary

Saving a file: characters => binary (encoding)

Browser rendering a web page: binary => characters (decoding)
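
Both directions can be sketched with a temporary file in Python: writing in text mode encodes characters to bytes, and reading back decodes the bytes into characters. The file name note.txt and the sample text are illustrative only.

```python
import os
import tempfile

text = "编码 encoding"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "note.txt")

    # Saving: characters => binary, using the chosen code table.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # What is actually on disk is just bytes.
    with open(path, "rb") as f:
        raw = f.read()

    # Reading: binary => characters, using the same code table.
    with open(path, "r", encoding="utf-8") as f:
        decoded = f.read()

print(raw)
print(decoded == text)
```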

Sublime Text's default encoding is UTF-8

Origin www.cnblogs.com/crazier/p/11291594.html