UTF-8 character encoding

ASCII: uses the low 7 bits of a byte to represent 128 characters for English, with the high bit uniformly set to 0. Later, as more and more characters needed to be represented, the high bit was put to use as well, giving the extended 8-bit character sets.
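The 7-bit layout can be checked directly. The following is a minimal sketch using Python's built-in "ascii" codec; the sample string and variable names are only illustrative.

```python
# Every ASCII character fits in the low 7 bits, so the high bit of its byte is 0.
text = "Hello"
ascii_bytes = text.encode("ascii")

# Show each byte as 8 binary digits; the leftmost digit is the high bit.
bit_patterns = [format(b, "08b") for b in ascii_bytes]
high_bits = [b >> 7 for b in ascii_bytes]
print(bit_patterns)
print(high_bits)

# A character outside the 128-character range cannot be encoded as ASCII.
try:
    "\u4e25".encode("ascii")  # U+4E25 is a Chinese character
    ascii_can_encode_chinese = True
except UnicodeEncodeError:
    ascii_can_encode_chinese = False
print(ascii_can_encode_chinese)
```

The failed encode at the end is the limitation that motivated the later encodings.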

GB2312: once Chinese characters had to be represented, ASCII was no longer sufficient, so GB2312 was created.

Unicode: so that every country could recognize and use one unified encoding, a single character set was invented, known as Unicode.

--------------------------------------------

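Unicode specifies the mapping between characters and binary numbers (code points), but it does not specify how those numbers are stored in practice. The number of bits needed differs from character to character, so no single width fits well: a fixed width long enough for every character wastes space, while a shorter one cannot represent the characters that need more bits. There are also other reasons, such as differences between operating systems.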
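To make the storage problem concrete, the sketch below measures how many bits a few code points actually need and compares that with a fixed width. The 4-byte width and the sample characters are assumptions chosen for illustration, not something the original text prescribes.

```python
# Sample characters: A, e-acute, a Chinese character, and an emoji.
samples = ["A", "\u00e9", "\u4e25", "\U0001F600"]

FIXED_WIDTH = 4  # hypothetical fixed width in bytes, wide enough for any code point

results = []
for ch in samples:
    code_point = ord(ch)                 # the number Unicode assigns to the character
    bits = code_point.bit_length()       # bits needed to hold that number
    min_bytes = max(1, (bits + 7) // 8)  # whole bytes needed, rounded up
    wasted = FIXED_WIDTH - min_bytes     # bytes that would be padding at fixed width
    results.append((f"U+{code_point:04X}", bits, min_bytes, wasted))

for row in results:
    print(row)
```

With a fixed 4-byte width, plain English text would spend three of every four bytes on padding, which is the waste described above.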

UTF-8: as countries came online and the global village emerged, people from different countries could watch the same short film, and a single screen might display several languages at once. This called for an encoding that is unified, efficient, and suitable for storage and transmission. Unicode, being only a blueprint, clearly cannot meet these three requirements on its own, so UTF-8 appeared:

  There are also UTF-16, UTF-32, and the like. UTF-8 is not a fixed-length encoding but a variable-length one. It uses 1 to 4 bytes per symbol, and the byte length varies with the symbol. The design is rather clever: if the first bit of a byte is 0, the byte is a single-byte character; if the first bit of the leading byte is 1, the number of consecutive leading 1 bits tells how many bytes the current character occupies, and each of the following bytes begins with 10. Note that a character's Unicode code point and its stored UTF-8 representation are different. For example, the code point of "严" ("strict") is 4E25, while its UTF-8 encoding is E4B8A5. This shows that UTF-8 takes into account not only the encoding but also storage: E4B8A5 is 4E25 with marker bits added for storage. Chinese characters generally take 3 bytes in UTF-8, and the most common layout is 1110xxxx 10xxxxxx 10xxxxxx.
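The bit layout described above can be sketched as a small hand-written encoder. This is an illustrative sketch, not how any particular library implements it; the function names utf8_encode and sequence_length are hypothetical, and the result is compared with Python's built-in "utf-8" codec.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (illustrative helper)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x80:      # 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:     # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:   # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def sequence_length(first_byte: int) -> int:
    """Return how many bytes a character occupies, judged from its leading byte."""
    if first_byte & 0x80 == 0:
        return 1               # first bit is 0: single-byte character
    ones = 0
    mask = 0x80
    while first_byte & mask:   # count the consecutive leading 1 bits
        ones += 1
        mask >>= 1
    if ones < 2 or ones > 4:
        raise ValueError("not a valid UTF-8 leading byte")
    return ones


encoded = utf8_encode(0x4E25)
print(encoded.hex().upper())                        # hex form of the stored bytes
print([format(b, "08b") for b in encoded])          # marker bits are visible here
print(encoded == "\u4e25".encode("utf-8"))          # compare with the built-in codec
print(sequence_length(encoded[0]))
```

Reading the binary output against the 1110xxxx 10xxxxxx 10xxxxxx pattern shows where the 16 bits of 4E25 are placed among the marker bits.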

Reference article: https://blog.csdn.net/weixin_30402343/article/details/95836628

Origin www.cnblogs.com/YsirSun/p/12656451.html