Fixing string comparison failures caused by Unicode characters stored in the database

The database collation is utf8mb4_unicode_ci.
In our application, some strings stored in the database contain spaces. When such a stored value is compared with the string the user actually typed (which also contains a space), the two look identical on screen, yet the comparison in the program returns False.
Example:
The query result in the database GUI client looks like this:
[Image: query result in the database client]

The results of the program query:

>>> data
[('Free\xa0Pray',)]
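
To see exactly which character is hiding behind the `\xa0` escape, it can help to print the code point and Unicode name of every character in the stored value. The following is only a minimal sketch: `describe_chars` is a hypothetical helper name introduced for illustration and is not part of the original program.

```python
import unicodedata


def describe_chars(text):
    """Return (character, code point, Unicode name) for each character in text.

    Hypothetical helper for illustration only.
    """
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The value as it came back from the database in the example above.
db_name = "Free\xa0Pray"

for ch, code_point, name in describe_chars(db_name):
    print(repr(ch), code_point, name)
```

For the fifth character this reports U+00A0 NO-BREAK SPACE, whereas a space typed on the keyboard is U+0020 SPACE.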

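The stored value contains \xa0 (U+00A0, NO-BREAK SPACE) where the input has an ordinary space (U+0020), so the two strings cannot compare equal even though they look the same. The unicodedata module from the Python standard library can solve the problem.
Solution: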

new_str = unicodedata.normalize("NFKD", unicode_str)

Applied to the example above:

>>> input_name
'Free Pray'
>>> db_name
'Free\xa0Pray'
>>> input_name == db_name
False
>>> import unicodedata
>>> new_name = unicodedata.normalize("NFKD", db_name)
>>> new_name
'Free Pray'
>>> new_name == input_name
True
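
The session above normalizes only the database value. NFKD also decomposes accented characters (for example, "é" becomes "e" followed by a combining accent), so if names may contain such characters it is probably safer to normalize both sides before comparing. The sketch below assumes that is acceptable for your data; `normalize_text` and `same_text` are hypothetical helper names, not part of the original code.

```python
import unicodedata


def normalize_text(text, form="NFKD"):
    """Apply Unicode normalization; NFKD maps U+00A0 to an ordinary space."""
    return unicodedata.normalize(form, text)


def same_text(a, b, form="NFKD"):
    """Compare two strings after normalizing both with the same form.

    Hypothetical helper for illustration only.
    """
    return normalize_text(a, form) == normalize_text(b, form)


input_name = "Free Pray"     # typed by the user, ordinary space
db_name = "Free\xa0Pray"     # read from the database, no-break space

print(input_name == db_name)           # plain comparison
print(same_text(input_name, db_name))  # comparison after normalization
```

Passing form="NFKC" instead gives the same result for this example and keeps accented characters in their composed form, which may be preferable if the normalized string is stored or displayed.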

For details, see the official documentation: https://docs.python.org/2/library/unicodedata.html#unicodedata.normalize

Origin blog.csdn.net/Lin_Hv/article/details/108406007