Text Analysis in Python

filename = 'alice.txt'

try:
    with open(filename, encoding='utf-8') as f_obj:
        contents = f_obj.read()
except FileNotFoundError:
    msg = "Sorry, the file " + filename + " does not exist."
    print(msg)
else:
    # Count the approximate number of words in the file.
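    # split() takes an optional separator as its first argument and a maximum
    # number of splits as its second; with no arguments it splits on whitespace.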
    words = contents.split()
    num_words = len(words)
    print("The file " + filename + " has about " + str(num_words) + " words.")

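The split() method, called with no arguments, splits a string on whitespace and
stores the resulting parts in a list. The result is a list of all the words in
the string, although some of them may still have punctuation attached. To count
the words in Alice in Wonderland, the code above calls split() on the contents
of the entire novel and then counts the items in the resulting list, which gives
an approximate word count for the whole story.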
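
To make the behavior of split() concrete, here is a minimal sketch on short strings rather than the whole novel. The helper names count_words and strip_punctuation, and the sample sentence, are made up for illustration and are not part of the original listing.

```python
import string


def count_words(text):
    """Return the approximate word count of text, splitting on whitespace."""
    return len(text.split())


def strip_punctuation(words):
    """Remove leading and trailing punctuation from each word (illustrative helper)."""
    return [word.strip(string.punctuation) for word in words]


title = "Alice in Wonderland"
print(title.split())        # ['Alice', 'in', 'Wonderland']
print(count_words(title))   # 3

# Runs of spaces and newlines count as a single separator.
print("  one   two\nthree ".split())   # ['one', 'two', 'three']

# Punctuation stays attached to the neighboring word.
line = "Oh dear! Oh dear! I shall be late!"
words = line.split()
print(words)                      # ['Oh', 'dear!', 'Oh', 'dear!', 'I', 'shall', 'be', 'late!']
print(strip_punctuation(words))   # ['Oh', 'dear', 'Oh', 'dear', 'I', 'shall', 'be', 'late']

# An explicit separator and a maximum number of splits.
print("a,b,c,d".split(",", 1))    # ['a', 'b,c,d']
```

The word count is the same whether or not the punctuation is stripped, which is why the plain split() approach is good enough for an approximate total.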

Reposted from blog.csdn.net/sinat_23971513/article/details/105054578