Building prefix dict from the default dictionary ...
Loading model from cache
Default mode:我 在 学习 自然语言 处理
All mode:我 在 学习 自然 自然语言 语言 处理
Loading model cost 0.578 seconds.
Prefix dict has been built successfully.
import jieba
# Segment in search-engine mode and return the result as a list.
seg_list = jieba.lcut_for_search("如果放到旧字典中将会出错")
print(seg_list)
Building prefix dict from the default dictionary ...
Loading model from cache
['如果','放到','旧','字典','中将','会','出错']
Loading model cost 0.590 seconds.
Prefix dict has been built successfully.
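In this case the segmenter picks the wrong word boundary: it outputs 中将 as
a single word, whereas here 中 and 将 are two separate words. To correct
this, use the suggest_freq(segment, tune=False) method.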
Parameter:
    - segment : The segments that the word is expected to be cut into.
                If the word should be treated as a whole, use a str.
    - tune : If True, tune the word frequency.
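If tune is set to True, jieba adjusts the word frequency so that the text is
cut into the given segments. For the tuple ('中', '将'), that means lowering
the frequency of the joined word 中将 so that it is split in two.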
import jieba
# Ask jieba to retune its frequencies so that 中将 is split into 中 and 将.
jieba.suggest_freq(('中', '将'), True)
seg_list = jieba.lcut_for_search("如果放到旧字典中将会出错")
print(seg_list)
Building prefix dict from the default dictionary ...
Loading model from cache
Loading model cost 0.569 seconds.
Prefix dict has been built successfully.
['如果','放到','旧','字典','中','将','会','出错']
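
To see why changing a single frequency is enough to move a word boundary, here is a minimal sketch of a frequency-based segmenter. It is a toy model, not jieba's implementation: the counts in toy_freq are invented for illustration, and segment and suggest_split_count are hypothetical helpers that do not exist in jieba. The sketch assumes the segmenter picks the split whose word frequencies have the largest product.

```python
import math


def segment(text, freq):
    """Return the split of text whose word-frequency product is largest."""
    total = sum(freq.values())
    n = len(text)
    # best[i] = (log-probability of the best split of text[i:], end of its first word)
    best = {n: (0.0, n)}
    for i in range(n - 1, -1, -1):
        candidates = []
        for j in range(i + 1, n + 1):
            word = text[i:j]
            # Unknown single characters get a floor count of 1;
            # unknown longer strings are not considered words.
            count = freq.get(word, 1 if j == i + 1 else 0)
            if count > 0:
                candidates.append((math.log(count / total) + best[j][0], j))
        best[i] = max(candidates)
    # Walk the stored end indices to recover the words.
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(text[i:j])
        i = j
    return words


def suggest_split_count(parts, freq):
    """Hypothetical helper: a count for the joined word that is intended
    to be low enough for the split into parts to score at least as well."""
    total = sum(freq.values())
    joined = "".join(parts)
    product = 1.0
    for part in parts:
        product *= freq.get(part, 1) / total
    return min(int(product * total), freq.get(joined, 0))


# Invented counts, for illustration only (not taken from jieba's dictionary).
toy_freq = {"中": 200, "将": 100, "中将": 500, "会": 300, "出错": 50}
text = "中将会出错"

before = segment(text, toy_freq)

# Mimic tune=True: overwrite the joined word's count with the lower suggestion.
tuned = dict(toy_freq)
tuned["中将"] = suggest_split_count(("中", "将"), toy_freq)
after = segment(text, tuned)

print(before)  # ['中将', '会', '出错']
print(after)   # ['中', '将', '会', '出错']
```

In this toy model the joined word 中将 wins at first because its count (500) is far higher than the product of the counts for 中 and 将 would suggest. Once its count is lowered to 17, the two-word split scores higher and the boundary moves, which mirrors the change seen in the jieba output above.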