Extracting keywords from a file with the third-party library jieba

The text of Doupo Cangqiong (斗破苍穹) has already been crawled and saved as a TXT file.

Code

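import jieba.analyse

path = 'C:/Users/Administrator/Desktop/bishe/doupo.text'
stop_words_path = 'C:/Users/Administrator/Desktop/bishe/aa.txt'

# Read the whole novel; the with block closes the file automatically.
# encoding='utf-8' is an assumption; change it (e.g. to 'gbk') if the file was saved differently.
with open(path, 'r', encoding='utf-8') as fp:
    content = fp.read()

# Exclude the words listed in the stop word file from the results.
jieba.analyse.set_stop_words(stop_words_path)
# Extract the 15 highest-weighted keywords as (word, weight) pairs.
tags = jieba.analyse.extract_tags(content, topK=15, withWeight=True)
for word, weight in tags:
    # Scale the weight by 1000 and truncate it to an integer for display.
    print(word + '\t' + str(int(weight * 1000)))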
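
By default, `jieba.analyse.extract_tags` ranks words by TF-IDF weight. The sketch below illustrates that ranking idea in plain Python so the printed weights are easier to interpret. It is not jieba's implementation: the function `top_keywords`, its `default_idf` parameter, and the sample tokens and IDF values are all hypothetical and made up for illustration (jieba ships its own IDF table and does its own word segmentation).

```python
from collections import Counter


def top_keywords(tokens, idf, stop_words=frozenset(), top_k=15, default_idf=1.0):
    """Rank tokens by term frequency times inverse document frequency.

    Illustrative only; names and defaults here are assumptions, not jieba's API.
    """
    # Drop stop words and single-character tokens before counting.
    kept = [t for t in tokens if t not in stop_words and len(t) > 1]
    counts = Counter(kept)
    total = sum(counts.values())
    if total == 0:
        return []
    # Weight = (relative frequency in this text) * (IDF of the word).
    scores = {w: (c / total) * idf.get(w, default_idf) for w, c in counts.items()}
    # Highest weight first, keep only the top_k entries.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Hand-made tokens and IDF values, for illustration only.
tokens = ["萧炎", "斗气", "的", "萧炎", "大陆"]
idf = {"萧炎": 8.0, "斗气": 6.0, "大陆": 3.0}

for word, weight in top_keywords(tokens, idf, stop_words={"的"}, top_k=2):
    # Same display format as the script above: word, tab, weight * 1000.
    print(word + '\t' + str(int(weight * 1000)))
```

A word scores highly when it is frequent in this text but has a high IDF, meaning it is rare in general text, which is why character names tend to dominate the results for a novel.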

Result

Reposted from www.cnblogs.com/lanbofei/p/8980102.html