A little every day: notes on what I learn
Python fuzzywuzzy: fuzzy string matching and similarity scores
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
1: Simple matching (similarity of the two whole strings)
a = fuzz.ratio('this is a shot','this is a shat')
Out[37]: 93
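To see where the 93 comes from, here is a minimal sketch using the standard library's difflib.SequenceMatcher, which fuzzywuzzy falls back to when python-Levenshtein is not installed. The helper name simple_ratio is made up for illustration and is not part of fuzzywuzzy.

```python
from difflib import SequenceMatcher


def simple_ratio(s1, s2):
    """Similarity as 2*M/T scaled to 0-100 (M = matched chars, T = total chars)."""
    return int(round(100 * SequenceMatcher(None, s1, s2).ratio()))


# 13 of the 14 characters line up, so 2 * 13 / 28 is about 0.93
print(simple_ratio('this is a shot', 'this is a shat'))
```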
2: Partial matching (best match of the shorter string inside the longer one)
b = fuzz.partial_ratio('this is a shot','this is o shot')
Out[38]: 93
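Partial matching is easier to see when the two strings differ in length. The sketch below is a simplified stand-in, not fuzzywuzzy's actual implementation: it slides the shorter string over every same-length window of the longer one and keeps the best score. The name partial_ratio_sketch is hypothetical.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(s1, s2):
    """Best 0-100 score of the shorter string against each window of the longer one."""
    shorter, longer = sorted((s1, s2), key=len)
    if not shorter:
        return 0
    n = len(shorter)
    best = 0.0
    for start in range(len(longer) - n + 1):
        window = longer[start:start + n]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return int(round(100 * best))


# 'shot' occurs verbatim inside the longer string, so the best window is a perfect match
print(partial_ratio_sketch('shot', 'this is a shot'))
```

When both strings have the same length, as in the example above, there is only one window, so the result equals the simple ratio.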
3: Order-insensitive matching (tokens are sorted before comparing)
d = fuzz.token_sort_ratio('我是谁?我在哪?','我在哪?我是谁?')
Out[40]: 100
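The idea can be sketched as: normalize, split into tokens, sort the tokens, then compare the rejoined strings. The version below assumes, as I understand the library's default preprocessing, that non-alphanumeric characters are turned into spaces, which is why the question marks split the Chinese example into two tokens. token_sort_sketch is a hypothetical name, and this is not fuzzywuzzy's own code.

```python
import re
from difflib import SequenceMatcher


def token_sort_sketch(s1, s2):
    """Compare two strings after sorting their tokens alphabetically."""
    def normalize(s):
        # Assumption: punctuation acts as a token separator
        cleaned = re.sub(r'\W', ' ', s.lower())
        return ' '.join(sorted(cleaned.split()))

    a, b = normalize(s1), normalize(s2)
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


# Same two tokens in a different order
print(token_sort_sketch('我是谁?我在哪?', '我在哪?我是谁?'))
```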
4: Token-set matching (duplicate tokens are dropped, and a subset of tokens can still score 100)
e = fuzz.token_set_ratio('this is a shot','this is is a shot')
Out[42]: 100
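A rough sketch of the token-set idea, assuming whitespace tokenization only: build the set of tokens for each string, then compare the shared tokens against "shared plus the rest" for each side and keep the highest score. The names below are made up for illustration; the real function also preprocesses the strings.

```python
from difflib import SequenceMatcher


def _ratio(a, b):
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_set_sketch(s1, s2):
    """Score based on unique tokens, so repeated words do not matter."""
    t1, t2 = set(s1.lower().split()), set(s2.lower().split())
    common = ' '.join(sorted(t1 & t2))
    only1 = ' '.join(sorted(t1 - t2))
    only2 = ' '.join(sorted(t2 - t1))
    combined1 = (common + ' ' + only1).strip()
    combined2 = (common + ' ' + only2).strip()
    return max(
        _ratio(common, combined1),
        _ratio(common, combined2),
        _ratio(combined1, combined2),
    )


# The duplicated 'is' disappears once the tokens are put into a set
print(token_set_sketch('this is a shot', 'this is is a shot'))
```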
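5: Return the fuzzy-matched strings together with their similarity scores. By default process.extract
returns at most the 5 best matches; to get a different number, pass limit=n as the last argument.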
choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
f = process.extract("New York Jets",choices)
Out[44]:
[('New York Jets', 100),
 ('New York Giants', 79),
 ('Atlanta Falcons', 29),
 ('Dallas Cowboys', 22)]
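With fuzzywuzzy itself, limiting the output would look like process.extract("New York Jets", choices, limit=2). To show what the limit does, here is a small stand-in built on the standard library: it scores every choice and keeps the n highest. extract_sketch and score are hypothetical helpers, and the scoring is a plain ratio rather than fuzzywuzzy's default scorer, so the numbers may differ from the library's.

```python
import heapq
from difflib import SequenceMatcher


def score(a, b):
    """Case-insensitive 0-100 similarity."""
    return int(round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()))


def extract_sketch(query, choices, limit=5):
    """Return up to `limit` (choice, score) pairs, best first."""
    scored = [(choice, score(query, choice)) for choice in choices]
    return heapq.nlargest(limit, scored, key=lambda pair: pair[1])


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
top_two = extract_sketch("New York Jets", choices, limit=2)
print(top_two)
```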
6: Return only the single best-matching string and its similarity score
g = process.extractOne('Cow',choices)
Out[46]:
('Dallas Cowboys', 90)
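The score is 90 rather than 100 even though 'Cow' appears inside 'Dallas Cowboys'. My understanding is that the default scorer (fuzz.WRatio) scales partial-match scores down, by 0.9 when one string is noticeably longer than the other. The sketch below reproduces only that single rule with difflib; the constant, the helper names, and the simplified logic are assumptions for illustration, not the library's code.

```python
from difflib import SequenceMatcher

PARTIAL_SCALE = 0.9  # assumed penalty applied to partial matches


def best_window_ratio(shorter, longer):
    """Best 0.0-1.0 ratio of `shorter` against each same-length window of `longer`."""
    n = len(shorter)
    return max(
        SequenceMatcher(None, shorter, longer[i:i + n]).ratio()
        for i in range(len(longer) - n + 1)
    )


def scaled_partial_score(query, choice):
    """Partial match score after lowercasing, scaled down by PARTIAL_SCALE."""
    a, b = query.lower(), choice.lower()
    shorter, longer = sorted((a, b), key=len)
    return int(round(100 * best_window_ratio(shorter, longer) * PARTIAL_SCALE))


# 'cow' matches a window of 'dallas cowboys' exactly, then the penalty applies
print(scaled_partial_score('Cow', 'Dallas Cowboys'))
```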