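Cause of the problem
Emoji characters cannot be stored directly in a MySQL database that uses the utf8 character set. Most emoji take 4 bytes when encoded as UTF-8 (see the Emoji Unicode Tables), whereas MySQL's utf8 allows at most 3 bytes per character. An insert containing such a character fails with an error similar to: error code [1366]; Incorrect string value: '\xF0\x9F\x8D\xB0'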
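The byte sequence in that error message is simply the UTF-8 encoding of the rejected character. As a minimal sketch (the class and method names here are illustrative, not part of any library), the following prints the UTF-8 bytes of U+1F370 in the same `\xNN` form MySQL uses:

```java
import java.nio.charset.StandardCharsets;

final class EmojiBytes {
    // Renders the UTF-8 bytes of a string in the \xNN style seen in MySQL error 1366.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F370 lies outside the BMP, so Java stores it as a surrogate pair.
        String cake = new String(Character.toChars(0x1F370));
        System.out.println(utf8Hex(cake)); // \xF0\x9F\x8D\xB0
    }
}
```

Four bytes come out, one more than a utf8 column accepts per character.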
MySQL utf8
The utf8 character set in MySQL has these characteristics:
- No support for supplementary characters (BMP characters only).
- A maximum of three bytes per multibyte character.
—— The utf8 Character Set (3-Byte UTF-8 Unicode Encoding)
MySQL utf8mb4
The character set named utf8 uses a maximum of three bytes per character and contains only BMP characters. The utf8mb4 character set uses a maximum of four bytes per character and supports supplementary characters:
- For a BMP character, utf8 and utf8mb4 have identical storage characteristics: same code values, same encoding, same length.
- For a supplementary character, utf8 cannot store the character at all, whereas utf8mb4 requires four bytes to store it. Because utf8 cannot store the character at all, you have no supplementary characters in utf8 columns and need not worry about converting characters or losing data when upgrading utf8 data from older versions of MySQL.
—— The utf8mb4 Character Set (4-Byte UTF-8 Unicode Encoding)
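The BMP versus supplementary distinction quoted above can be checked from Java before a string ever reaches the database. This is a rough sketch with hypothetical helper names; it relies only on the standard `Character.isSupplementaryCodePoint` and `String.codePoints` methods:

```java
import java.nio.charset.StandardCharsets;

final class Utf8mb4Check {
    // True if the string contains any code point above U+FFFF,
    // i.e. a character that a MySQL utf8 (3-byte) column cannot store.
    static boolean hasSupplementary(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    // Number of bytes a single code point occupies when encoded as UTF-8.
    static int utf8Length(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length(0x4E2D));   // BMP character: 3 bytes
        System.out.println(utf8Length(0x1F370));  // supplementary character: 4 bytes
        System.out.println(hasSupplementary("plain text"));
    }
}
```

A check like `hasSupplementary` could be used to reject or transform input early when the schema cannot be changed.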
Solutions
Change the database encoding
Change the database character set to utf8mb4.
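As a hedged sketch of what that change might look like, the helper below only builds the MySQL statements as strings; running them (for example through a JDBC `Statement`) is left to the caller. The class name, the database and table names, and the choice of the `utf8mb4_unicode_ci` collation are assumptions for illustration. utf8mb4 itself requires MySQL 5.5.3 or later, and existing tables are not converted by changing the database default alone.

```java
final class Utf8mb4Migration {
    static final String CHARSET = "utf8mb4";
    // Assumed collation; pick whichever utf8mb4 collation suits the application.
    static final String COLLATION = "utf8mb4_unicode_ci";

    // Changes the default character set for tables created later in this database.
    static String alterDatabase(String database) {
        return "ALTER DATABASE `" + database + "` CHARACTER SET " + CHARSET
                + " COLLATE " + COLLATION;
    }

    // Converts an existing table and its character columns to utf8mb4.
    static String convertTable(String table) {
        return "ALTER TABLE `" + table + "` CONVERT TO CHARACTER SET " + CHARSET
                + " COLLATE " + COLLATION;
    }

    public static void main(String[] args) {
        // "app_db" and "comments" are placeholder names.
        System.out.println(alterDatabase("app_db"));
        System.out.println(convertTable("comments"));
    }
}
```

The names are concatenated without escaping, so this sketch is only suitable for trusted, hard-coded identifiers. The client connection also needs to use utf8mb4 for 4-byte characters to arrive intact.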
Convert on read and write
Java conversion utility: