Several ways to check whether a character is a Chinese character in Python

1. Use Python’s built-in ord()

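The ord() function returns a character's Unicode code point, which can then be checked against the range of Chinese characters. The basic CJK Unified Ideographs block runs from U+4E00 to U+9FFF. Python compares single-character strings by code point, so comparing the character directly gives the same result as comparing ord(char) with 0x4E00 and 0x9FFF.
Sample code: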

def is_chinese(char):
    # Single characters compare by code point, so this is the same test as
    # 0x4E00 <= ord(char) <= 0x9FFF
    return '\u4e00' <= char <= '\u9fff'
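
The same check can be written with ord() and explicit integer ranges, which makes it easy to add more blocks and to test whole strings. The following is a minimal sketch: the names CJK_RANGES, is_chinese_codepoint, contains_chinese and all_chinese are illustrative, and including Extension A is an assumption about what should count as Chinese, not an exhaustive list.

```python
# Code point ranges treated as Chinese here (an illustrative choice, not exhaustive)
CJK_RANGES = (
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs (the basic block)
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
)


def is_chinese_codepoint(char):
    """Return True if a single character falls inside one of CJK_RANGES."""
    code = ord(char)  # ord() accepts exactly one character
    return any(low <= code <= high for low, high in CJK_RANGES)


def contains_chinese(text):
    """Return True if at least one character of text is Chinese."""
    return any(is_chinese_codepoint(ch) for ch in text)


def all_chinese(text):
    """Return True if text is non-empty and every character is Chinese."""
    return bool(text) and all(is_chinese_codepoint(ch) for ch in text)
```

Chinese punctuation such as '。' (U+3002) lies outside both ranges, so it is not counted as a Chinese character here.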

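2. Use Python’s built-in unicodedata library

The standard-library unicodedata module returns the official Unicode name of a character. Chinese characters have names such as 'CJK UNIFIED IDEOGRAPH-4E2D', so checking the name for 'CJK' identifies them. Note that unicodedata.name() raises ValueError for characters that have no name (control characters such as '\n', for example) unless a default value is passed as the second argument.
Sample code: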

import unicodedata

def is_chinese(char):
    # Pass '' as the default so unnamed characters (e.g. '\n') return False
    # instead of raising ValueError
    return 'CJK' in unicodedata.name(char, '')
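
A substring test for 'CJK' also accepts characters that are not ideographs, such as U+2E80, whose name is 'CJK RADICAL REPEAT'. The sketch below shows a stricter variant that matches only the 'CJK UNIFIED IDEOGRAPH' prefix. The helper names unicode_name and is_cjk_ideograph are illustrative, and whether radicals or compatibility ideographs should count as Chinese depends on the use case.

```python
import unicodedata


def unicode_name(char):
    """Return the Unicode name of char, or '' if the character has no name."""
    return unicodedata.name(char, '')


def is_cjk_ideograph(char):
    """Return True only for characters named 'CJK UNIFIED IDEOGRAPH-XXXX'."""
    return unicode_name(char).startswith('CJK UNIFIED IDEOGRAPH')


if __name__ == '__main__':
    for ch in ('中', '\u2e80', 'A', '\n'):
        print(repr(ch), repr(unicode_name(ch)), is_cjk_ideograph(ch))
```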

3. Use regular expressions

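You can use regular expressions to determine whether a string consists of Chinese characters. For example, [^\u4e00-\u9fa5] matches any character outside the commonly used Chinese range U+4E00 to U+9FA5, so a string in which it finds nothing contains only Chinese characters. The pattern [^\x00-\xff] matches any character above U+00FF. These are often loosely called double-byte characters and include Chinese characters and full-width symbols, but also every other character outside Latin-1.
Sample code: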

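import re

# Matches any character outside the range U+4E00 to U+9FA5; compiled once
NON_CHINESE = re.compile(r'[^\u4e00-\u9fa5]')

# Check whether the string consists only of Chinese characters
def is_chinese(word):
    # An empty string contains no Chinese characters, so it returns False
    return bool(word) and NON_CHINESE.search(word) is None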
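
The same character class, without the negation, can pull the Chinese parts out of mixed text. The following is a minimal sketch; the names CHINESE_RUN, NON_LATIN1, extract_chinese and is_all_chinese are illustrative.

```python
import re

# One or more consecutive characters in the range U+4E00 to U+9FA5
CHINESE_RUN = re.compile(r'[\u4e00-\u9fa5]+')
# Any single character above U+00FF (Chinese characters, full-width symbols, and more)
NON_LATIN1 = re.compile(r'[^\x00-\xff]')


def extract_chinese(text):
    """Return every run of consecutive Chinese characters found in text."""
    return CHINESE_RUN.findall(text)


def is_all_chinese(text):
    """Return True if text is non-empty and made up only of Chinese characters."""
    return CHINESE_RUN.fullmatch(text) is not None


if __name__ == '__main__':
    print(extract_chinese('Python 很好用, version 3'))
    print(NON_LATIN1.findall('a。中'))
```

Using fullmatch() with a positive class avoids the double negative of searching for a non-Chinese character, and it rejects the empty string without a separate check.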

4. Use Chinese character set

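You can also test a character against a legacy Chinese character set. GB2312 encodes each Chinese character as two bytes: level-1 (common) characters occupy 0xB0A1 to 0xD7F9 and level-2 characters occupy 0xD8A1 to 0xF7FE. A character that GB2312 cannot represent, such as a traditional-only form, raises UnicodeEncodeError when encoded, so it must be treated as not matching. GBK covers more characters but uses different byte ranges.
Sample code: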

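# Check whether a single character is a GB2312 Chinese character
def is_chinese(word):
    try:
        encoded = word.encode('gb2312')
    except UnicodeEncodeError:
        # The character cannot be represented in GB2312 at all
        return False
    # Level-1 characters: B0A1-D7F9; level-2 characters: D8A1-F7FE
    return len(encoded) == 2 and b'\xb0\xa1' <= encoded <= b'\xf7\xfe'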
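
To see what the byte comparison works with, the sketch below encodes a few characters and reports whether they land in the GB2312 Chinese character zone. In GB2312, level-1 characters occupy 0xB0A1 to 0xD7F9 and level-2 characters occupy 0xD8A1 to 0xF7FE; the upper bound used here covers both. The helper names gb2312_bytes and in_gb2312_hanzi_zone are illustrative.

```python
def gb2312_bytes(char):
    """Return the GB2312 encoding of char, or None if it cannot be encoded."""
    try:
        return char.encode('gb2312')
    except UnicodeEncodeError:
        return None


def in_gb2312_hanzi_zone(char):
    """Return True if char encodes to two bytes between B0A1 and F7FE."""
    encoded = gb2312_bytes(char)
    return (
        encoded is not None
        and len(encoded) == 2
        and b'\xb0\xa1' <= encoded <= b'\xf7\xfe'
    )


if __name__ == '__main__':
    for ch in ('啊', '中', 'A', '龍'):
        print(ch, gb2312_bytes(ch), in_gb2312_hanzi_zone(ch))
```

The traditional character '龍' is not part of GB2312, so it is reported as not Chinese. That is the main limitation of this approach compared with a Unicode range check.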

5. Use third-party libraries

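You can also use a third-party library to determine whether a character is a Chinese character. For example, the xpinyin library converts Chinese characters to Pinyin and passes characters it cannot convert through unchanged, so a character whose converted form differs from the input is a Chinese character. Checking the result with isalpha() is not enough, because the Pinyin of a Chinese character and an unchanged Latin letter are both alphabetic.
Sample code: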

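from xpinyin import Pinyin

# Create the converter once instead of on every call
_pinyin = Pinyin()

# Check whether the character is a Chinese character
def is_chinese(word):
    # Chinese characters come back as Pinyin; other characters come back unchanged
    return _pinyin.get_pinyin(word, '') != word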

Origin blog.csdn.net/sinat_29891353/article/details/129353893