Commonly used encoding formats on the front end

Character encoding in computers

ASCII

A computer only works with binary code. A byte is eight bits, so one byte can represent 256 distinct values.
ASCII maps the characters used in English to 128 codes (0 to 127), so every ASCII character fits in a single byte.
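
As a small illustration of the numbers above, the following Python sketch prints the count of values one byte can hold and the ASCII codes of a sample string. The sample text "Hello" and the variable names are only chosen for the example.

```python
# One byte has 8 bits, so it can take 2**8 = 256 distinct values.
values_per_byte = 2 ** 8

# ASCII only assigns codes 0..127, so every ASCII character fits in 7 bits.
text = "Hello"                              # sample text, chosen for illustration
codes = [ord(ch) for ch in text]            # numeric code of each character
encoded = text.encode("ascii")              # one byte per character

print(values_per_byte)                      # 256
print(codes)                                # [72, 101, 108, 108, 111]
print(all(code < 128 for code in codes))    # True
print(len(encoded))                         # 5
```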

Unicode

ASCII is not enough in a multilingual environment, so a comprehensive solution was found: Unicode. This character set defines a binary code for every symbol, but it does not specify how that code is stored.
For instance, the code for 严 (U+4E25) is 100111000100101, which is 15 bits long, while the code for A is 01000001.
Because the codes vary in length, a computer cannot tell where one character ends and the next begins. Padding every character to the same fixed byte length would solve that, but it would waste a great deal of memory.
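
The difference between a code point and its storage can be seen in a few lines of Python. This is only a sketch: `describe` is a helper name made up for the example, and UTF-32 is used here simply as one fixed-width encoding to show the wasted space.

```python
def describe(ch):
    """Return the binary form of a character's code point and its bit length."""
    code_point = ord(ch)
    return format(code_point, "b"), code_point.bit_length()


print(describe("A"))         # ('1000001', 7)
print(describe("\u4e25"))    # ('100111000100101', 15)   (U+4E25 is 严)

# A fixed-width encoding spends 4 bytes even on a 7-bit character.
fixed_width = "A".encode("utf-32-be")
print(len(fixed_width))      # 4
```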

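UTF-8

UTF-8 is one implementation of Unicode. Its defining characteristic is that it is variable-length.
UTF-8 rules:

- A character is represented with 1 to 4 bytes.
- If the first bit of the first byte is 0, the byte is a single-byte character. This covers 128 characters and is identical to ASCII.
- If a character needs n bytes (n > 1), the first n bits of the first byte are 1 and bit n+1 is 0. Each of bytes 2 through n begins with the two bits 10.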
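
The rules above can be sketched as a hand-written encoder. This is an illustrative version, not how any particular library does it; `utf8_encode` is a made-up name, and the result is checked against Python's built-in `str.encode`.

```python
def utf8_encode(code_point):
    """Encode one Unicode code point as UTF-8 bytes, following the rules above."""
    if code_point < 0x80:
        # Single byte: leading bit 0, identical to ASCII.
        return bytes([code_point])
    # Choose the byte count n from the size of the code point.
    if code_point < 0x800:
        n = 2
    elif code_point < 0x10000:
        n = 3
    elif code_point < 0x110000:
        n = 4
    else:
        raise ValueError("code point is outside the Unicode range")
    # Continuation bytes: each starts with 10 and carries 6 payload bits.
    tail = []
    for _ in range(n - 1):
        tail.append(0b10000000 | (code_point & 0b00111111))
        code_point >>= 6
    # First byte: n leading 1 bits, then a 0, then the remaining payload bits.
    prefix = (0xFF << (8 - n)) & 0xFF
    return bytes([prefix | code_point] + tail[::-1])


print(utf8_encode(0x41).hex())      # 41
print(utf8_encode(0x4E25).hex())    # e4b8a5
print(utf8_encode(0x4E25) == "\u4e25".encode("utf-8"))    # True
```

For simplicity the sketch does not reject the surrogate range D800 to DFFF, which a strict encoder would refuse.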

UTF-8 table

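| Byte count | UTF-8 pattern | Encoded range (binary) | Decimal | Unicode (hex) |
|---|---|---|---|---|
| 1 | 0xxxxxxx | 00000000 ~ 01111111 | 0 ~ 127 | 0 ~ 7F |
| 2 | 110xxxxx 10xxxxxx | 11000010 10000000 ~ 11011111 10111111 | 128 ~ 2047 | 80 ~ 7FF |
| 3 | 1110xxxx 10xxxxxx 10xxxxxx | 11100000 10100000 10000000 ~ 11101111 10111111 10111111 | 2048 ~ 65535 | 800 ~ FFFF |
| 4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 11110000 10010000 10000000 10000000 ~ 11110111 10111111 10111111 10111111 | 65536 ~ 2097151 | 10000 ~ 1FFFFF |

The 4-byte row shows what the bit pattern can hold. Unicode itself ends at 10FFFF, so valid UTF-8 never goes beyond 11110100 10001111 10111111 10111111.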
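
Reading the table in the other direction, the leading byte alone tells a decoder how many bytes belong to a character. The sketch below is a minimal illustration under that assumption; `utf8_sequence_length` and `utf8_decode_one` are made-up names, and the code does not check for truncated input or overlong forms.

```python
def utf8_sequence_length(first_byte):
    """Return how many bytes a UTF-8 sequence has, judged from its first byte."""
    if first_byte >> 7 == 0b0:
        return 1    # 0xxxxxxx
    if first_byte >> 5 == 0b110:
        return 2    # 110xxxxx
    if first_byte >> 4 == 0b1110:
        return 3    # 1110xxxx
    if first_byte >> 3 == 0b11110:
        return 4    # 11110xxx
    raise ValueError("not a valid UTF-8 leading byte")


def utf8_decode_one(data):
    """Decode the first character in data and return its code point."""
    n = utf8_sequence_length(data[0])
    if n == 1:
        return data[0]
    # Keep only the x bits of the leading byte.
    code_point = data[0] & (0xFF >> (n + 1))
    for byte in data[1:n]:
        if byte >> 6 != 0b10:
            raise ValueError("expected a continuation byte (10xxxxxx)")
        # Append the 6 payload bits of each continuation byte.
        code_point = (code_point << 6) | (byte & 0b00111111)
    return code_point


print(utf8_sequence_length(0xE4))                      # 3
print(hex(utf8_decode_one(bytes([0xE4, 0xB8, 0xA5]))))  # 0x4e25
```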

Text resources: UTF-8

- HTML
- CSS
- JavaScript

Image resources

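Base64
2 to the sixth power is 64, so Base64 uses an alphabet of 64 characters, each of which carries 6 bits.

Principle:
8 × 3 = 6 × 4

  1. Take a group of three bytes (3 × 8 = 24 bits) and regroup the bits as four 6-bit values (4 × 6 = 24 bits).
  2. Look up each 6-bit value in the Base64 conversion table to produce the output characters.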
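
The two steps can be sketched in Python for a single full group of three bytes. This is an illustrative version that skips padding (input whose length is not a multiple of three); `encode_group` is a made-up name, and the result is compared with the standard library's `base64.b64encode`.

```python
import base64
import string

# Standard Base64 conversion table: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes):
    """Encode exactly three bytes (24 bits) as four Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch handles full 3-byte groups only")
    bits = int.from_bytes(three_bytes, "big")    # 3 * 8 = 24 bits
    # Cut the 24 bits into four 6-bit indices, most significant first.
    indices = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Look each index up in the conversion table.
    return "".join(ALPHABET[i] for i in indices)


print(encode_group(b"Man"))                         # TWFu
print(base64.b64encode(b"Man").decode("ascii"))     # TWFu
```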

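Question: what are the advantages and disadvantages?
Disadvantages: the content grows by about one third, and an image embedded in the HTML cannot be cached on its own.
Advantages: the data becomes printable text that can be transmitted as-is, and image resources embedded in a web page do not need an additional HTTP request.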
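
Both sides of this trade-off can be seen in a short sketch: the encoded text is one third larger than the raw bytes, and it can be placed straight into a `data:` URI inside the page. The helper name `to_data_uri`, the `image/png` MIME type, and the placeholder bytes are assumptions made for the example, not part of the original notes.

```python
import base64


def to_data_uri(raw, mime_type="image/png"):
    """Build a data: URI that embeds raw bytes as Base64 text."""
    payload = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


# Placeholder bytes standing in for an image file (not a real PNG).
fake_image = bytes(range(256)) * 3
encoded = base64.b64encode(fake_image)

print(len(fake_image))          # 768
print(len(encoded))             # 1024
print(to_data_uri(b"Man"))      # data:image/png;base64,TWFu
```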

Video and audio resources

Reference links

Ruan Yifeng, notes on character encoding

Origin www.cnblogs.com/wangziqiang123/p/11618399.html