The difference between MySQL's utf8 and utf8mb4 encodings, and how to fix the import error Unknown collation: 'utf8mb4_0900_ai_ci'

1. Introduction

MySQL added the utf8mb4 encoding in version 5.5.3. The "mb4" stands for "most bytes 4" (at most four bytes per character), and the encoding exists specifically to support four-byte Unicode characters. utf8mb4 is a superset of utf8, so no conversion is required other than changing the encoding to utf8mb4. Of course, if saving space matters, utf8 is usually enough.

2. Content description

As mentioned above, utf8 can store most Chinese characters, so why use utf8mb4? The reason is that the utf8 encoding supported by MySQL allows at most 3 bytes per character. If a 4-byte character is encountered, the insert fails with an error. The largest Unicode code point that three-byte UTF-8 can encode is 0xFFFF, which is the end of the Basic Multilingual Plane (BMP) in Unicode. In other words, any Unicode character outside the Basic Multilingual Plane cannot be stored with MySQL's utf8 character set. This includes emoji (a special group of Unicode characters, commonly used on iOS and Android phones), many rarely used Chinese characters, and any newly added Unicode characters. This is the main disadvantage of utf8.
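The 3-byte limit can be checked before data ever reaches the database. The following is a minimal Python sketch, not part of the original post; the helper names `utf8_byte_length` and `fits_mysql_utf8` are made up for illustration. It assumes that "fits in MySQL's utf8" means every code point is at most 0xFFFF.

```python
def utf8_byte_length(ch: str) -> int:
    """Return how many bytes a single character needs in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_mysql_utf8(text: str) -> bool:
    """Return True if every character is inside the BMP (code point <= 0xFFFF),
    which means it needs at most 3 bytes in UTF-8."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# "A" (U+0041), a common Chinese character (U+4E2D) and an emoji (U+1F600).
for ch in ["A", "\u4e2d", "\U0001F600"]:
    print(f"U+{ord(ch):04X}: {utf8_byte_length(ch)} byte(s), "
          f"fits in 3-byte utf8: {fits_mysql_utf8(ch)}")
```

A check like this could be run on user input to decide whether a column really needs utf8mb4, or to reject 4-byte characters early when the table is still utf8.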

Usually, when a computer stores characters, it allocates storage space according to the type of character and the encoding method. For example:

① In ASCII encoding, an English letter (upper or lower case) occupies one byte. ASCII itself cannot represent Chinese characters; in ASCII-compatible legacy encodings such as GBK, a Chinese character occupies two bytes. One byte is the basic storage unit in a computer: an 8-bit binary number which, converted to decimal, has a minimum value of 0 and a maximum value of 255.

② In UTF-8 encoding, an English character occupies one byte of storage space, and a common Chinese character (simplified or traditional) occupies three bytes.

③ In what is loosely called "Unicode encoding" (the fixed-width two-byte form, UCS-2), an English letter takes up two bytes of storage space, and a Chinese character (simplified or traditional) also takes up two bytes.

④ In UTF-16 encoding, an English letter or a common Chinese character requires 2 bytes of storage space (some Chinese characters in the Unicode extension areas require 4 bytes).

⑤ In UTF-32 encoding, the storage of any character in the world needs to occupy 4 bytes of storage space.
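The sizes listed above can be verified with Python's built-in codecs. This is a small illustrative sketch; the function name `encoded_sizes` is hypothetical. The `-le` variants of UTF-16 and UTF-32 are used so that the byte order mark is not counted, and `gbk` stands in for the legacy two-byte Chinese encodings.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes one character needs in several encodings.

    The value is None when the encoding cannot represent the character.
    """
    sizes = {}
    for codec in ("ascii", "gbk", "utf-8", "utf-16-le", "utf-32-le"):
        try:
            sizes[codec] = len(ch.encode(codec))
        except UnicodeEncodeError:
            sizes[codec] = None
    return sizes


# An English letter, a common Chinese character (U+4E2D), and an emoji (U+1F600).
for ch in ["A", "\u4e2d", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The emoji takes 4 bytes in UTF-8, UTF-16 and UTF-32 alike, which is why a 3-byte limit cannot hold it.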

Since utf8 is compatible with most characters, why should utf8mb4 be extended?

With the development of the Internet, many new kinds of characters have appeared, such as emoji, the little yellow faces we usually send when chatting. These characters are not in the Basic Multilingual Plane, so they cannot be stored with utf8 in MySQL. MySQL therefore extended utf8 and added the utf8mb4 encoding.

Therefore, if you want to allow users to use special symbols when designing a database, it is best to use utf8mb4 encoding for storage, which makes the database more compatible, but this design will consume more storage space.
 

【Hands-on Practice】

MySQL reported an error when importing data: Unknown collation: 'utf8mb4_0900_ai_ci'
I recently exported data from an online database and wanted to set it up locally, but the import failed with this error:
[ERR] 1273 - Unknown collation: 'utf8mb4_0900_ai_ci'
What is the cause of this error?

The database the data was exported from was version 8.0, and the database it was imported into was version 5.6.
The utf8mb4_0900_ai_ci collation exists in 8.0 but not in 5.6, so importing from the higher version into the lower version causes the 1273 error.
That is, database a is version 8.0 and database b is version 5.6. We export the data of database a and try to import it directly into b, so this error is reported.
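Before importing, it can help to confirm that the dump really references such a collation. The sketch below is an illustration only (the function name `find_0900_collations` is made up) and rests on the assumption that any collation name starting with `utf8mb4_0900_` will be rejected by a 5.6 server in the same way.

```python
import re

# Assumption: collation names of the form utf8mb4_0900_* are unknown to MySQL 5.6.
PATTERN = re.compile(r"\butf8mb4_0900_\w+")


def find_0900_collations(sql: str) -> list:
    """Return the distinct utf8mb4_0900_* collation names found in a dump."""
    return sorted(set(PATTERN.findall(sql)))


dump = (
    "CREATE TABLE `t` (`name` varchar(20) DEFAULT NULL) "
    "ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;"
)
print(find_0900_collations(dump))
```

An empty result suggests the 1273 error has a different cause than the collation discussed here.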

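Solution 1: Open the sql file, replace every utf8mb4_0900_ai_ci in the file with utf8_general_ci, and then replace every utf8mb4 with utf8mb3 or utf8. Note that after this change the tables use the 3-byte utf8 encoding, so 4-byte characters such as emoji in the data can no longer be stored. Since MySQL 5.6 does support utf8mb4, replacing only the collation with utf8mb4_general_ci and keeping utf8mb4 is an alternative if that matters.

Save the sql file and run it again; the import should now succeed.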
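For large dumps, the replacement can be scripted instead of done in an editor. The following Python sketch mirrors the two replacements of Solution 1; the function names and the file names in the usage comment are hypothetical, and the dump is assumed to be UTF-8 encoded.

```python
def downgrade_dump(sql: str) -> str:
    """Apply the two text replacements from Solution 1 to the dump contents."""
    # Order matters: replace the collation first, otherwise the generic
    # utf8mb4 -> utf8 step would turn it into the invalid "utf8_0900_ai_ci".
    sql = sql.replace("utf8mb4_0900_ai_ci", "utf8_general_ci")
    sql = sql.replace("utf8mb4", "utf8")
    return sql


def downgrade_file(src_path: str, dst_path: str) -> None:
    """Read a dump, rewrite it, and save the result to a new file."""
    with open(src_path, encoding="utf-8") as src:
        converted = downgrade_dump(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(converted)


# Example usage (hypothetical file names):
# downgrade_file("dump_from_8.0.sql", "dump_for_5.6.sql")
```

This is a plain text replacement, so it also changes any row data that happens to contain these strings; writing to a new file keeps the original dump available for comparison.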

Solution 2:

Export only the table structure and save it as a sql file. In that file, replace every utf8mb4_0900_ai_ci with utf8_general_ci and every utf8mb4 with utf8mb3 or utf8, then save it. Create a new database, import the modified table structure into it, and use the Navicat tool to synchronize that table structure to the target MySQL 8.0 database. After the synchronization is complete, a sql file exported again from the 8.0 database will be compatible with the local 5.6 database!

Origin blog.csdn.net/happyzhlb/article/details/126505658