py-Word Frequency Count


title: py-Word Frequency Count
date: 2018-12-21 12:32:35
tags: [comprehension,lambda]
categories: Python

Word frequency count

import string

path = 'C:/Users/Desktop/Walden.txt'

with open(path, 'r', encoding='utf-8-sig') as text:
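    # Strip leading/trailing punctuation from each token and lowercase it
    words = [s.strip(string.punctuation).lower() for s in text.read().split()]
    # Remove duplicate words
    all_words = set(words)
    # Count how many times each distinct word occurs
    count_words = {index:words.count(index) for index in all_words}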

# Sort the dictionary's keys by count in descending order, using a lambda as the sort key
for word in sorted(count_words, key=lambda x: count_words[x], reverse=True):
    print('{}-{} times.'.format(word, count_words[word]))
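The same count can also be sketched with `collections.Counter` from the standard library, which makes a single pass over the words instead of calling `words.count()` once per distinct word. This is only an alternative sketch, not the code above: the helper name `word_frequencies` and the sample sentence are made up for illustration, and it additionally skips tokens that become empty once punctuation is stripped (for example a standalone `--`).

```python
import string
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercased word in text to its count."""
    # Strip leading/trailing punctuation and lowercase each whitespace-separated token.
    cleaned = (token.strip(string.punctuation).lower() for token in text.split())
    # Skip tokens that were pure punctuation and are now empty strings.
    return Counter(word for word in cleaned if word)


# Hypothetical sample text, used here instead of reading Walden.txt.
sample = "The pond, the woods; the PONd -- and I."

# most_common() returns (word, count) pairs sorted by count, highest first.
for word, times in word_frequencies(sample).most_common(2):
    print('{}-{} times.'.format(word, times))
```

To run it on a file, pass the result of `text.read()` to `word_frequencies` inside the same `with open(...)` block.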

Reposted from blog.csdn.net/henuyh/article/details/85160248