You Can Do This with Python?

Copyright notice: reposting to other platforms is prohibited; reposts to blogs must include a link to this article. https://blog.csdn.net/qq_41841569/article/details/89146215


Word-frequency statistics for My Fair Princess (《还珠格格》)


A word cloud generated from the word-frequency statistics for My Fair Princess (《还珠格格》)


Turned into a word cloud, China's 2016 Report on the Work of the Government (《2016年中国政府工作报告》) looks like this


Next is Tiny Times (《小时代》)


A word cloud using a photo of Xiaoyanzi as the background


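Word-frequency statistics for The Legend of the Condor Heroes (《射雕英雄传》), with a still of Guo Jing as the word cloud background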


Doesn't it give you a strong sense of déjà vu?

An interactive web front end for a movie database

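With it you can trace the whole history of Hong Kong cinema: from the early co-productions with Shanghai, to King Hu's wuxia films, to the Bruce Lee era, then Jackie Chan, and after him Stephen Chow.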

Word-frequency analysis of job requirements, to distill the essential skills
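
One way to sketch the skill-extraction idea above is to count how many job postings mention each keyword from a candidate skill list. The postings, the skill list, and the helper name `skill_counts` are all hypothetical; the original post does not show how its analysis was done.

```python
from collections import Counter

def skill_counts(postings, skills):
    """Count, for each skill keyword, how many postings mention it (case-insensitive)."""
    counts = Counter()
    for text in postings:
        lowered = text.lower()
        for skill in skills:
            # Plain substring test; each posting contributes at most 1 per skill.
            if skill.lower() in lowered:
                counts[skill] += 1
    return counts

# Made-up postings used only to exercise the function.
postings = [
    "Requires Python and SQL; Linux experience is a plus",
    "Proficient in Python, familiar with Linux",
    "Strong SQL skills and Python scripting",
]
counts = skill_counts(postings, ["Python", "SQL", "Linux", "Hadoop"])
print(counts.most_common())
```

Substring matching is deliberately simple and has limits: a keyword such as "Java" would also match inside "JavaScript", so real postings would likely need tokenization first.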

Using a crawler to download more than ten thousand photos of Zhihu "goddesses"

Finally, here is the Python code:

Code for the word-frequency statistics and the word cloud

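from wordcloud import WordCloud
import jieba
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

def wordcloudplot(txt):
    # A font with Chinese glyphs is needed, otherwise the words render as empty boxes.
    path = 'd:/jieba/msyh.ttf'
    # The mask image sets the shape of the cloud: pure-white pixels are left empty.
    alice_mask = np.array(Image.open('d:/jieba/she.jpg'))
    # When a mask is given, its size takes precedence over width and height.
    wordcloud = WordCloud(font_path=path, background_color="white", margin=5, width=1800, height=800, mask=alice_mask, max_words=2000, max_font_size=60, random_state=42)
    wordcloud = wordcloud.generate(txt)
    wordcloud.to_file('d:/jieba/she2.jpg')
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.show()

def main():
    # Assumes the text file is saved as UTF-8; change the encoding (e.g. 'gb18030') if it is not.
    with open(r'd:\jieba\book\she.txt', 'r', encoding='utf-8') as f:
        text = f.read()
    # Segment the text with jieba and keep only words longer than one character.
    a = [word for word in jieba.cut(text) if len(word) > 1]
    # WordCloud.generate tokenizes the text itself, so pass the words separated by spaces.
    txt = ' '.join(a)
    wordcloudplot(txt)

if __name__ == '__main__':
    main()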
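
The captions earlier in the post speak of word-frequency statistics, while the script hands the joined text straight to WordCloud. As a minimal sketch of the counting step on its own, assuming the text has already been segmented into tokens (for example by `jieba.cut`), the hypothetical helper `word_frequencies` below uses `collections.Counter` from the standard library. The helper name and the sample tokens are illustrative and not from the original post.

```python
from collections import Counter

def word_frequencies(tokens, min_length=2):
    """Count tokens, skipping those shorter than min_length characters."""
    # Same idea as the len(word) > 1 filter in main(): drop single-character tokens.
    return Counter(t for t in tokens if len(t) >= min_length)

# Illustrative tokens standing in for the output of jieba.cut(text).
sample_tokens = ["小燕子", "的", "紫薇", "小燕子", "皇上", "了", "紫薇", "小燕子"]
freqs = word_frequencies(sample_tokens)
top_two = freqs.most_common(2)
print(top_two)
```

A mapping like `freqs` can also be passed to `WordCloud.generate_from_frequencies` instead of calling `generate` on the joined text.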

Code for crawling the Zhihu photos

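import requests
import urllib.request
import re
import random
from time import sleep

def main():
    # url and headers were elided in the original post; fill in your own before running.
    url = 'xxx'
    headers = {xxx}
    i = 925  # running index used for the output file names
    for x in range(1020, 2000, 20):
        data = {'start': '1000',
                'offset': str(x),
                '_xsrf': 'a128464ef225a69348cef94c38f4e428'}
        content = requests.post(url, headers=headers, data=data, timeout=10).text
        # The response holds backslash-escaped HTML, so the quote after src= is preceded by a backslash.
        imgs = re.findall(r'<img src=\\"(.*?)_m\.jpg', content)
        for img in imgs:
            try:
                # Strip the escaping backslashes and rebuild the URL without the "_m" suffix.
                img = img.replace('\\', '')
                pic = img + '.jpg'
                path = 'd:\\bs4\\zhihu\\jpg4\\' + str(i) + '.jpg'
                urllib.request.urlretrieve(pic, path)
                print('Downloaded image ' + str(i))
                i += 1
                # Pause briefly between downloads.
                sleep(random.uniform(0.5, 1))
            except Exception:
                print('Missed 1 image')
        sleep(random.uniform(0.5, 1))

if __name__ == '__main__':
    main()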
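
The regular expression is the least obvious line of the crawler. The sketch below runs the same pattern and backslash cleanup on a made-up response fragment. The sample string, the `example.com` URLs, and the helper name `extract_image_urls` are assumptions for illustration, not real Zhihu output.

```python
import re

def extract_image_urls(content):
    """Return image URLs (without the "_m" suffix) found in backslash-escaped HTML."""
    # Matches <img src=\"..._m.jpg and captures the part before "_m.jpg".
    stems = re.findall(r'<img src=\\"(.*?)_m\.jpg', content)
    # Remove the escaping backslashes ("\/" becomes "/") and re-append the extension.
    return [stem.replace('\\', '') + '.jpg' for stem in stems]

# Hypothetical fragment: quotes and slashes are backslash-escaped, as inside a JSON string.
sample = (r'<img src=\"https:\/\/pic.example.com\/abc_m.jpg\" class=\"avatar\">'
          r'<img src=\"https:\/\/pic.example.com\/def_m.jpg\">')
urls = extract_image_urls(sample)
print(urls)
```

Keeping the extraction in a small function like this makes it possible to check the pattern against saved responses before starting any downloads.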


Reposted from blog.csdn.net/qq_41841569/article/details/89146215