Summary of the difference between utf8_unicode_ci and utf8_general_ci in MySQL

  • What is the difference between utf8_general_ci and utf8_unicode_ci in MySQL? Programming languages usually rely on Unicode to handle Chinese text and avoid garbled characters, so why do people use utf8_general_ci in MySQL rather than utf8_unicode_ci? Both collations belong to the utf8 character set, so both store Unicode text; they differ only in how strings are compared and sorted.

  • ci stands for "case insensitive": a and A are treated as equal when strings are compared;
    bin stands for binary: a and A are treated as different.

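For example, if you run:
SELECT * FROM `table` WHERE txt = 'a';
then with utf8_bin the row with txt = 'A' is not returned, whereas with utf8_general_ci it is. (The table name is quoted with backticks because table is a reserved word in MySQL.)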
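
The same contrast can be reproduced on a throwaway table. This is only a minimal sketch: the table name collation_demo is made up for illustration, and it assumes a server that still accepts the utf8 character set and the utf8_* collation names (newer MySQL versions may report them as utf8mb3).

```sql
-- Hypothetical demo table; the column inherits utf8_general_ci from the table default.
CREATE TEMPORARY TABLE collation_demo (
    txt VARCHAR(10)
) CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO collation_demo (txt) VALUES ('a'), ('A');

-- Case-insensitive comparison: returns both rows.
SELECT txt FROM collation_demo WHERE txt = 'a';

-- Forcing utf8_bin on the column for this one comparison: returns only 'a'.
SELECT txt FROM collation_demo WHERE txt COLLATE utf8_bin = 'a';
```

The COLLATE clause is attached to the column rather than to the literal, because the literal takes the connection character set, which may not be utf8.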

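utf8_general_ci is case insensitive; this is what you want for fields such as usernames and email addresses at registration.
utf8_general_cs is case sensitive; using it for usernames and email addresses leads to undesirable results, since names that differ only in case would count as different. (Stock MySQL builds do not appear to ship a utf8_general_cs collation, so in practice utf8_bin fills this role.)
utf8_bin: each character of the string is compared by its binary value, so comparisons are case sensitive.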
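
To see why case insensitivity matters at registration, here is a small sketch with a unique index on a username column. The table demo_users and its index name are hypothetical, and the same assumption about utf8_* names being accepted applies.

```sql
-- Hypothetical registration table; the unique index follows the column's collation.
CREATE TEMPORARY TABLE demo_users (
    username VARCHAR(32) NOT NULL,
    UNIQUE KEY uq_username (username)
) CHARACTER SET utf8 COLLATE utf8_general_ci;

INSERT INTO demo_users (username) VALUES ('alice');

-- Under utf8_general_ci this is rejected with a duplicate-key error,
-- because 'Alice' and 'alice' compare as equal.
INSERT INTO demo_users (username) VALUES ('Alice');
```

With utf8_bin on the column instead, the second INSERT would succeed and the table would hold two accounts that differ only in case.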

Short summary

  1. For Chinese and English text there is no substantial difference between utf8_unicode_ci and utf8_general_ci.
  2. utf8_general_ci compares and sorts faster, but is slightly less accurate.
  3. utf8_unicode_ci is more accurate, but compares and sorts slightly slower.

If your application handles German, French, or Russian text, use utf8_unicode_ci. Otherwise utf8_general_ci is generally sufficient.

Detailed summary

  1. A language-specific utf8 collation is implemented only when sorting with utf8_unicode_ci does not work well for that language. For example, utf8_unicode_ci works well for German and French, so there is no need for special utf8 collations for these two languages.
  2. utf8_general_ci is also satisfactory for German and French, except that 'ß' is equal to 's' rather than to 'ss'. If your application can accept this, use utf8_general_ci because it is faster. Otherwise, use utf8_unicode_ci because it is more accurate.
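
The 'ß' behaviour described above can be checked directly with a comparison query. This is a sketch under assumptions: the column aliases are made up, the client is assumed to send the literals in an encoding the server can convert to utf8, and the server is assumed to accept the utf8_* collation names.

```sql
-- Both sides are converted to utf8 so the collation is valid for the operands.
SELECT
    CONVERT('ß' USING utf8) COLLATE utf8_general_ci = CONVERT('s'  USING utf8) AS general_ci_eq_s,   -- 1
    CONVERT('ß' USING utf8) COLLATE utf8_general_ci = CONVERT('ss' USING utf8) AS general_ci_eq_ss,  -- 0
    CONVERT('ß' USING utf8) COLLATE utf8_unicode_ci = CONVERT('ss' USING utf8) AS unicode_ci_eq_ss;  -- 1
```

The difference comes from utf8_general_ci comparing character by character, whereas utf8_unicode_ci supports expansions in which one character compares equal to a sequence of others.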

In short, utf8_unicode_ci is more accurate and utf8_general_ci is faster. Under normal circumstances the accuracy of utf8_general_ci is sufficient. Having read the source code of many programs, I found that most of them use utf8_general_ci, which is why it is the usual choice when creating a new database.
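
As an illustration of that choice, the following sketch creates a database with utf8_general_ci as its default and overrides a single column where more accurate ordering is wanted. The names example_app and customers, and the column layout, are hypothetical.

```sql
-- Hypothetical database: utf8_general_ci is the default for every new table.
CREATE DATABASE example_app CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Hypothetical table: email inherits utf8_general_ci,
-- while last_name opts into utf8_unicode_ci for more accurate sorting.
CREATE TABLE example_app.customers (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255) NOT NULL,
    last_name VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL
);
```

A collation can be set at the database, table, or column level, so the faster default does not rule out the more accurate collation where it matters.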

Origin blog.csdn.net/qq_46480020/article/details/112853437