Fuzzy string matching in Python with fuzzywuzzy

The fuzzywuzzy package for Python not only calculates the degree of similarity between two strings, but also provides an interface for finding the most similar sentences in a large set of candidates, ranked by similarity.

(1) Installation

pip install fuzzywuzzy

(2) Interface description

The package has two modules, fuzz and process: fuzz is mainly used to compare two strings, while process is mainly used to search and rank candidates.

fuzz.ratio(s1, s2) directly calculates the similarity between s1 and s2; the return value ranges from 0 to 100, where 100 means the strings are identical;

fuzz.partial_ratio(s1, s2) performs a partial match; if s1 is a substring of s2, it still returns 100;

fuzz.token_sort_ratio(s1, s2) only compares whether s1 and s2 contain the same words, without regard to the order of the words;

fuzz.token_set_ratio(s1, s2), compared with fuzz.token_sort_ratio, additionally ignores how many times a word appears;

process.extract(s1, listS, limit=n) finds the top n sentences most similar to s1 in the list listS;

process.extractOne(s1, listS) returns the single most similar sentence.
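
To make the ranking idea behind process.extract concrete, here is a minimal sketch that uses only the standard library. Note that top_matches is a hypothetical helper written for this illustration, not part of fuzzywuzzy, and it scores with difflib.SequenceMatcher instead of the library's default scorer, so the scores are illustrative only. Like process.extract, it returns (choice, score) pairs with the best match first.

```python
from difflib import SequenceMatcher


def top_matches(query, choices, limit=5):
    """Hypothetical helper: return up to `limit` (choice, score) pairs, best first."""
    scored = [
        # Scale the 0.0-1.0 difflib ratio to an integer from 0 to 100.
        (choice, int(round(100 * SequenceMatcher(None, query, choice).ratio())))
        for choice in choices
    ]
    # sorted() is stable, so candidates with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


candidates = ['a b c', ' a c b ', 'x y z']

# Rough analogue of process.extract(s1, listS, limit=2)
print(top_matches('a c', candidates, limit=2))     # [('a b c', 75), (' a c b ', 60)]

# Rough analogue of process.extractOne(s1, listS)
print(top_matches('a c', candidates, limit=1)[0])  # ('a b c', 75)
```

In real code you would call process.extract and process.extractOne directly; the sketch is only meant to show what "sort the candidates by score and keep the top n" looks like.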
(3) Usage example

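from fuzzywuzzy import fuzz

a = 'a b c'
b = ' a c b '
c = 'a c'

# Plain similarity over the raw strings
print(fuzz.ratio(a, c))  # 75
print(fuzz.ratio(b, c))  # 60

# Partial match: 'a c' occurs verbatim inside b, but not inside a
print(fuzz.partial_ratio(a, c))  # 67
print(fuzz.partial_ratio(b, c))  # 100

# Word order is ignored, so a and b score the same against c
print(fuzz.token_sort_ratio(a, c))  # 75
print(fuzz.token_sort_ratio(b, c))  # 75

# Every word of c also appears in a and in b
print(fuzz.token_set_ratio(a, c))  # 100
print(fuzz.token_set_ratio(b, c))  # 100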
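
To see where scores like 75 and 60 come from, the sketch below approximates two of the scorers with the standard library's difflib.SequenceMatcher. The names simple_ratio and simple_token_sort_ratio are made up for this illustration, and the word normalization is a simplifying assumption (lowercase, split on whitespace, sort); it is not fuzzywuzzy's actual implementation, and results may differ on other inputs.

```python
from difflib import SequenceMatcher


def simple_ratio(s1, s2):
    """Similarity of two raw strings as an integer from 0 to 100."""
    # ratio() is 2 * M / T: M matched characters, T total characters in both strings.
    return int(round(100 * SequenceMatcher(None, s1, s2).ratio()))


def simple_token_sort_ratio(s1, s2):
    """Compare the strings after lowercasing and sorting their words."""
    def normalize(s):
        # Assumed normalization: split on whitespace, sort the words, rejoin.
        return ' '.join(sorted(s.lower().split()))
    return simple_ratio(normalize(s1), normalize(s2))


a = 'a b c'
b = ' a c b '
c = 'a c'

print(simple_ratio(a, c))             # 75  (3 matched characters: 2 * 3 / 8)
print(simple_ratio(b, c))             # 60  (3 matched characters: 2 * 3 / 10)
print(simple_token_sort_ratio(b, c))  # 75  (b is normalized to 'a b c' first)
```

The second and third calls show why sorting the words helps: the extra spaces and the different word order in b lower the raw score, but after normalization b compares exactly like a.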

Origin www.cnblogs.com/ly570/p/10935454.html