Incorrect string value: '\xF0\x9F\x92\x95'

Table of Contents

Problem Description

Problem Analysis

Problem Solving

Problem Extension


Problem Description

While at work, I suddenly received SMS and email alerts for 5xx errors. Searching the Nginx logs and the business logs by the 5xx HTTP status codes turned up an "Incorrect string value: '\xF0\x9F\x92\x95'" error.

Problem Analysis

1 At first glance, the error says that updating the name field with the value "\xF0\x9F\x92\x95", shown as UTF-8 bytes, failed.

2 What is the UTF-8 byte sequence "\xF0\x9F\x92\x95"? It turns out to be the encoded form of an emoji: the Two Hearts symbol.

The emoji was identified using the Emoji Unicode Tables at the following address:

https://apps.timwhitlock.info/emoji/tables/unicode#block-6a-additional-emoticons

3 On a closer look, the user was trying to update the name field with a value containing the Two Hearts emoji.

4 Why does updating the name field with the Two Hearts emoji fail? The UTF-8 encoding of the emoji is the byte sequence "\xF0\x9F\x92\x95", so this single character takes 4 bytes.

5 Then why does a 4-byte emoji fail when the name is updated? The name is stored in MySQL, and checking the user table shows that its character set is utf8.

6 Then why can't a 4-byte emoji be written to a utf8 table? The following query shows the maximum number of bytes per character in the utf8 character set.

select * from information_schema.CHARACTER_SETS where CHARACTER_SET_NAME = 'utf8';

The MAXLEN column of the result shows that the utf8 character set supports at most 3 bytes per character.

7 In conclusion, because the utf8 character set of the user table supports at most 3 bytes per character, it cannot store a 4-byte emoji.
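The reasoning in the steps above can be checked outside the database. The following is a minimal Python sketch, not part of the original investigation. It assumes that MySQL's utf8 character set matches standard UTF-8 for characters of up to 3 bytes, and the helper name fits_in_three_bytes is made up for illustration.

```python
import unicodedata

# Bytes copied from the error message: '\xF0\x9F\x92\x95'
raw = b"\xF0\x9F\x92\x95"

# Decode the bytes as UTF-8 to recover the character the user typed.
char = raw.decode("utf-8")


def fits_in_three_bytes(text: str) -> bool:
    """Return True if every character of text needs at most 3 UTF-8 bytes."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


# Code point and official Unicode name of the decoded character.
print(f"U+{ord(char):04X}", unicodedata.name(char))  # U+1F495 TWO HEARTS

# One character, but four bytes once encoded.
print(len(char), "character,", len(raw), "bytes")  # 1 character, 4 bytes

# A 3-byte-per-character limit cannot hold it.
print(fits_in_three_bytes(char))  # False
```

A plain ASCII name such as "jack" passes the same check, which is consistent with the update failing only when the name contains the emoji.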

Problem Solving

Since the user table cannot store emoji that take 4 bytes, we only need to change the user table so that it can.

MySQL 5.5.3 and later support a new character set: utf8mb4. We can check how many bytes per character utf8mb4 supports.

select * from information_schema.CHARACTER_SETS where CHARACTER_SET_NAME = 'utf8mb4';

The MAXLEN column of the result shows that utf8mb4 supports up to 4 bytes per character, so utf8mb4 can store emoji.
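To see why a 4-byte limit is enough, here is a small Python sketch that measures the UTF-8 length at the code point boundaries. It assumes utf8mb4 corresponds to standard UTF-8, and utf8_len is a hypothetical helper name used only for this illustration.

```python
def utf8_len(code_point: int) -> int:
    """Return the number of bytes UTF-8 needs for one code point."""
    return len(chr(code_point).encode("utf-8"))


# Last code point of the Basic Multilingual Plane: still 3 bytes.
print(utf8_len(0xFFFF))    # 3

# First code point beyond it: 4 bytes are required.
print(utf8_len(0x10000))   # 4

# The Two Hearts emoji from the error message.
print(utf8_len(0x1F495))   # 4

# Highest code point Unicode defines: still only 4 bytes.
print(utf8_len(0x10FFFF))  # 4
```

Every code point above U+FFFF, which includes the emoji in question, needs exactly 4 bytes, so a 3-byte limit rejects it and a 4-byte limit accepts it.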

In this case we only need to convert the user table to utf8mb4. The command is as follows:

ALTER TABLE user CONVERT TO CHARACTER SET utf8mb4;

Problem Extension

utf8mb4 can consume more storage than utf8 for some characters, because a character may take up to 4 bytes instead of 3. If a table needs to store emoji, use utf8mb4; utf8 is not recommended for it.
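As a rough illustration of where the extra bytes come from, the sketch below counts the UTF-8 bytes of a few sample names. The sample values and the helper name utf8_size are invented for this example, and it assumes that Python's UTF-8 codec produces the same bytes that utf8mb4 stores for variable-length text.

```python
def utf8_size(text: str) -> int:
    """Return the number of bytes text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Hypothetical sample names; escapes are used so the file stays ASCII-only.
samples = [
    "jack",            # ASCII only
    "caf\u00e9",       # one accented Latin letter
    "\u4e2d\u6587",    # two CJK characters
    "jack\U0001F495",  # ASCII plus the Two Hearts emoji
]

for name in samples:
    # ascii() keeps the output printable on any terminal encoding.
    print(ascii(name), len(name), "characters,", utf8_size(name), "bytes")
```

Under these assumptions, only the name containing the emoji has a character that takes 4 bytes; the other names take the same number of bytes as they would with a 3-byte limit.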

 

Origin blog.csdn.net/jack1liu/article/details/110287319