[Repost] The difference between Unicode and ASCII

Original Address: https://blog.csdn.net/lx697/article/details/5914417

A recent project of mine involved internationalization. Since I had no prior experience with Unicode encoding, I collected some information about ASCII and Unicode during the project.

1. ASCII features

ASCII is an encoding standard for English characters. Each ASCII character occupies one byte, and one byte can hold 256 distinct values (00H-FFH). Standard ASCII defines only the first 128 of them (00H-7FH, highest bit 0), which is enough for English. The other 128 values (80H-FFH, highest bit 1) are referred to as "extended ASCII" and are generally used for box-drawing (table) characters, some accented and phonetic characters, and other symbols.
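As a small illustration of the byte ranges described above, the following Python sketch encodes an English string as ASCII and checks the highest bit of each byte. The helper name `is_plain_ascii` is made up for this example and is not part of any library.

```python
# Encode an English string as ASCII: one byte per character.
text = "Hello"
encoded = text.encode("ascii")

print(list(encoded))              # byte values of each character
print(len(text), len(encoded))    # same length: one byte per character


def is_plain_ascii(data: bytes) -> bool:
    """Return True if every byte is in 00H-7FH (highest bit is 0).

    Hypothetical helper, written only for this illustration.
    """
    return all(b < 0x80 for b in data)


print(is_plain_ascii(encoded))          # bytes of "Hello" are all below 80H
print(is_plain_ascii(bytes([0xC4])))    # C4H is in the extended range

# Characters outside 00H-7FH cannot be encoded as ASCII at all.
try:
    "é".encode("ascii")
except UnicodeEncodeError as err:
    print("not ASCII:", err.reason)
```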

But for more complex scripts such as Chinese, 256 values are clearly not enough. As a result, various countries developed their own text encoding standards. The Chinese character encoding standard is called "GB2312-80". It is compatible with standard ASCII, but it does not follow the extended ASCII conventions: it represents each Chinese character with two bytes taken from the extended range (highest bit 1), which distinguishes them from standard ASCII codes.
This approach has problems, the biggest being that the Chinese character codes overlap with extended ASCII. Many programs use the extended ASCII box-drawing characters to draw tables; when such software runs on a Chinese system, the table lines are mistaken for Chinese characters and show up as garbled text. In addition, each country and region has its own encoding rules, and these rules conflict with one another, which makes exchanging information between countries and regions very troublesome.
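The overlap can be reproduced with Python's built-in codecs. This is only a sketch: it assumes code page 437 (the original IBM PC character set) as the "extended ASCII" in question, which the article does not specify.

```python
# Two Chinese characters encoded with GB2312: two bytes each,
# and every byte has its highest bit set (80H or above).
gb_bytes = "中文".encode("gb2312")
print(gb_bytes.hex(" "))                     # the four raw bytes in hex
print(all(b >= 0x80 for b in gb_bytes))

# The same bytes read as code page 437 (assumed here as the
# extended ASCII table) become four unrelated symbols.
as_cp437 = gb_bytes.decode("cp437")
print([f"U+{ord(c):04X}" for c in as_cp437])

# All four fall in the Unicode "Box Drawing" block (U+2500-U+257F),
# i.e. they are exactly the table-drawing characters.
print(all(0x2500 <= ord(c) <= 0x257F for c in as_cp437))

# The ambiguity works in both directions: table-drawing bytes
# written by a CP437 program decode as Chinese text under GB2312.
print(as_cp437.encode("cp437").decode("gb2312") == "中文")
```

The bytes themselves never change; only the table used to interpret them does, which is why the same file looks correct on one system and garbled on another.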

2. The origin of Unicode

To really solve this problem, extending ASCII is not enough; a new encoding system was needed. Unicode came into being as that system: it considers the text of Chinese, French, German, and all other languages together and assigns each character its own separate code.

3. What is Unicode

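Like ASCII, Unicode is a character encoding. It was originally designed around two bytes per character (0000H-FFFFH), which gives 65,536 codes. The standard has since grown beyond that range (code points now run from 0 to 10FFFFH) so that it can cover all of the world's writing systems, and text is stored using encoding forms such as UTF-8, UTF-16, and UTF-32. In Unicode, every character is identified by its own unique code.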
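To make the idea of one unique code per character concrete, here is a minimal Python sketch that prints the Unicode code of a few characters together with the number of bytes each one takes in UTF-8, UTF-16, and UTF-32. The function name `describe` and the sample characters are chosen only for this illustration.

```python
def describe(ch: str):
    """Return (code point, UTF-8 bytes, UTF-16 bytes, UTF-32 bytes).

    Hypothetical helper for this illustration. The "-be" codec
    variants are used so that no byte order mark is counted.
    """
    code_point = f"U+{ord(ch):04X}"
    return (
        code_point,
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-be")),
        len(ch.encode("utf-32-be")),
    )


# An English letter, an accented letter, a Chinese character,
# and a character outside the original 0000H-FFFFH range.
for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]:
    print(describe(ch))
```

The code of a character is the same everywhere; only the number of bytes used to store it depends on the encoding form. Characters within 0000H-FFFFH take two bytes in UTF-16, while those above that range take four.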

4. The benefits of using Unicode

Using Unicode lets your project support multiple languages and makes it internationalized, i.e. text does not turn into garbled characters on systems set to different languages.
