How to handle emoji in Java

Recently I ran into a problem storing emoji in a project. After a lot of searching online, I settled on two ways to deal with it:
1. Filter the emoji out with a regular expression. The code is as follows:
//Emoji filter: removes emoji from a string before it is stored
import java.util.regex.Pattern;

    //Compiled once and reused. Covers U+1F000-U+1F3FF, U+1F400-U+1F7FF and U+2600-U+27FF.
    private static final Pattern EMOJI = Pattern.compile(
            "[\ud83c\udc00-\ud83c\udfff]|[\ud83d\udc00-\ud83d\udfff]|[\u2600-\u27ff]");

    public static String filterEmoji(String source) {
        if (source == null) {
            return null;
        }
        //replaceAll returns the input unchanged when nothing matches
        return EMOJI.matcher(source).replaceAll("");
    }
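The regular expression above only covers three fixed ranges, so newer emoji (for example those starting at U+1F900) pass through it. As a rough alternative sketch, assuming the goal is simply to drop everything that does not fit into MySQL's 3-byte utf8 character set, every code point above U+FFFF can be removed. The class and method names below are hypothetical, and unlike the filter above this version keeps symbols such as U+2600, because they fit in three bytes.

```java
class EmojiFilterSketch {

    // Hypothetical helper: removes every supplementary code point (above U+FFFF),
    // i.e. every character that needs 4 bytes in UTF-8.
    static String stripSupplementary(String source) {
        if (source == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(source.length());
        source.codePoints()
              .filter(cp -> !Character.isSupplementaryCodePoint(cp))
              .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 and U+1F914 are removed, U+2600 is kept
        String input = "Hi \ud83d\ude00 there \ud83e\udd14 \u2600";
        System.out.println(stripSupplementary(input));
    }
}
```

Working on code points rather than on `char` values avoids splitting a surrogate pair in half.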
2. Store the emoji by changing the database encoding. It is best to set this encoding when the database is created:
Many people suggest changing the MySQL character set to utf8mb4 (with the collation utf8mb4_unicode_ci). After I made that change, however, each emoji was still stored as four question marks.

Searching further, I found that emoji can also be handled by adding the emoji-java jar to the project. It provides two conversions:
//Convert the emoji in a string into their corresponding aliases
String result = EmojiParser.parseToAliases(name);
//Convert the aliases in a string back into the corresponding emoji
name = EmojiParser.parseToUnicode(name);
For details, see: https://github.com/vdurmont/emoji-java

This approach works, but every string has to be compared against the emoji library bundled in the jar, so it is relatively slow. I therefore went back to solving the problem through the database encoding. It turned out that the garbled characters appeared because my character_set_server was latin1. Changing it to utf8mb4 in the my.ini file fixed the problem, as the screenshots show:
[Screenshots not preserved: table fields, table engine, database, data]
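As a minimal sketch of the kind of my.ini change described above: the option names below are standard MySQL options, but the exact values and sections used in the original setup are an assumption.

```ini
[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
# Was latin1 in the failing setup
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
```

After restarting the server, `SHOW VARIABLES LIKE 'character_set%';` shows which values are in effect. Tables and columns created earlier keep their old character set until they are converted as well.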
Even if the stored emoji look garbled when viewed in the database, they are displayed correctly once the data is read out in an environment that supports emoji.
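To illustrate why utf8mb4 is needed at all, here is a small sketch that prints how many bytes different characters take in UTF-8. MySQL's older utf8 character set stores at most 3 bytes per character, while an emoji such as U+1F600 needs 4. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class Utf8WidthSketch {

    // Number of bytes the string occupies when encoded as UTF-8
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String letter = "A";            // U+0041
        String cjk = "\u4e2d";          // U+4E2D
        String emoji = "\ud83d\ude00";  // U+1F600

        System.out.println(utf8Length(letter)); // 1
        System.out.println(utf8Length(cjk));    // 3
        System.out.println(utf8Length(emoji));  // 4

        // In Java the emoji is one code point stored as two chars (a surrogate pair)
        System.out.println(emoji.length());                           // 2
        System.out.println(emoji.codePointCount(0, emoji.length()));  // 1
    }
}
```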

 

 
