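I like some of Xu Wei's (许巍) songs: honest and free-spirited. So I played around with a small crawler for a while. I used requests to fetch the data, pyecharts to draw the bar chart, and wordcloud to build the word cloud.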
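Open NetEase Cloud Music, find Xu Wei's song 漫步, press F12, then refresh the page. In the Network panel, the Preview tab shows the hot comments and their like counts, and the Headers tab shows the request method, the URL, and related details. Pull out these key pieces of information and pass them to requests as arguments.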
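To make the structure seen in the Preview tab concrete, here is a minimal offline sketch of how the hot comments can be pulled out of the JSON. The sample payload is hypothetical and made up for illustration; it only contains the fields this post uses (`hotComments`, `user.userId`, `likedCount`, `content`), and the helper name `extract_hot_comments` is my own.

```python
import json

# Hypothetical sample shaped like the JSON seen in the Preview tab.
# Only the fields used in this post are included; all values are made up.
SAMPLE_RESPONSE = """
{
  "hotComments": [
    {"user": {"userId": 1001}, "likedCount": 320, "content": "first sample comment"},
    {"user": {"userId": 1002}, "likedCount": 875, "content": "second sample comment"}
  ]
}
"""


def extract_hot_comments(raw_json):
    """Return a list of dicts holding userId, likedCount and content."""
    data = json.loads(raw_json)
    items = []
    # .get() with a default keeps the loop safe if the key is missing
    for comment in data.get("hotComments", []):
        items.append({
            "userId": comment["user"]["userId"],
            "likedCount": comment["likedCount"],
            "content": comment["content"],
        })
    return items


hot = extract_hot_comments(SAMPLE_RESPONSE)
for item in hot:
    print(item["userId"], item["likedCount"], item["content"])
```

The same function can be fed `response.text` from a real request, assuming the live response keeps this shape.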
import json

import matplotlib.pyplot as plt
import requests
from imageio import imread  # assumption: the original import of imread is not shown
from pyecharts import Bar  # pyecharts 0.x API
from wordcloud import WordCloud

url = 'http://music.163.com/weapi/v1/resource/comments/R_SO_4_168097?csrf_token='
headers = {
    'Host': 'music.163.com',
    'Origin': 'http://music.163.com',
    'Referer': 'http://music.163.com/song?id=168097',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'
}
user_data = {
    'params': 'FJN4+rq5e3sLz/pzSqht9plb0EzzIWY36gWXi/vzzVSGZ8DSvyMtLZa2lCRCgUVCTQKt6PLjvOyTtjl9y1/QXHQajyj7oZzl1iFRLzgsD1haZ/u1kl1l46pfX2zqS67VWKcHpMwkpAOsAWVMLhg1qfZbZT/2auyHxxI4fTjYD5DdwLWQ4424NNCQrHAaLyOj',
    'encSecKey': '8fb5829f126f68b601d75ca3523bb51f7a0644b4dbdbf4675c50790c59b2bca9c17e5d108d1c47ac552b743e961fb928f2535dd27948a1094d3a324d2e2a9a447de2778c0fd07f8dcb029135712d8c805b9fbbbce42244918146414a50e0b408061ab22b2e697366c273ac9e3be25f102cd94f8c01299cca119ec20de86bf0b1'
}

# Request the comments and parse the JSON response
response = requests.post(url, headers=headers, data=user_data)
data = json.loads(response.text)

# Collect the fields of interest from each hot comment
hotcomments = []
for hot_comment in data['hotComments']:
    item = {
        'userId': hot_comment['user']['userId'],  # user ID
        'likedCount': hot_comment['likedCount'],  # number of likes
        'content': hot_comment['content'],  # comment text
    }
    hotcomments.append(item)

userId = [content['userId'] for content in hotcomments]
liked_count = [content['likedCount'] for content in hotcomments]
content_list = [content['content'] for content in hotcomments]

# Bar chart of likes per hot comment
bar = Bar("Likes bar chart")
bar.add("Likes", userId, liked_count, is_stack=True,
        mark_line=["min", "max"], mark_point=["average"])
bar.render()

# Word cloud built from all hot comments
back_color = imread('background.png')
content_text = " ".join(content_list)
wc = WordCloud(font_path=r'C:\Windows\Fonts\ygyxsziti2.0.ttf',
               mask=back_color,  # canvas shape
               # background_color='grey',  # canvas background color
               width=2000,  # canvas width, used only when mask is not in effect
               height=900,  # canvas height, used only when mask is not in effect
               margin=2,
               min_font_size=4,  # smallest font size
               max_font_size=95,  # largest font size
               max_words=100)  # maximum number of words
wc.generate(content_text)

plt.figure()
plt.imshow(wc, interpolation='bilinear')
plt.axis('off')
# Save the image (dpi is optional). savefig must be called before show,
# otherwise the saved image comes out blank.
plt.savefig('wc_savefig.jpg', dpi=200)
plt.show()
wc.to_file('wc_to_file.jpg')  # save the image
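One thing worth knowing about `wc.generate(content_text)`: as far as I can tell, wordcloud splits the text with a regular expression rather than a Chinese word segmenter, so a run of Chinese characters between spaces or punctuation counts as one "word". Below is a minimal sketch using only the standard `re` module. The pattern is an assumption modeled on wordcloud's default, and `split_like_wordcloud` is a hypothetical helper, not part of the library.

```python
import re

# Assumption: a \w-based pattern similar to the one wordcloud uses by default.
WORD_PATTERN = r"\w[\w']+"


def split_like_wordcloud(text):
    """Split text into tokens the way a \\w-based regex would."""
    return re.findall(WORD_PATTERN, text)


# Chinese characters match \w, while spaces and the full-width comma do not,
# so each phrase between separators comes back as a single token.
sample = "真实而洒脱 许巍的歌,真好听"
tokens = split_like_wordcloud(sample)
print(tokens)
```

So the cloud shows whole phrases from the comments rather than single words, unless the text is segmented beforehand.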
Hot-comment chart result:
Word cloud result:
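When saving the word cloud image, I compared the two outputs side by side: WordCloud's built-in to_file gives somewhat better image quality than savefig from matplotlib.pyplot.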
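A likely explanation (my guess, not something I measured) is resolution: savefig writes the whole figure at figure size times dpi, with the cloud resampled to fit inside the axes, while to_file writes the rendered canvas at its own pixel size. The arithmetic can be sketched as follows. Both helper functions are hypothetical, the figure size assumes matplotlib's default of 6.4 x 4.8 inches, and the canvas size assumes the mask is not in effect (with a mask, the mask image's size applies instead).

```python
def savefig_pixels(figsize_inches, dpi):
    """Pixel size of a figure written by savefig: inches times dpi."""
    width_in, height_in = figsize_inches
    return round(width_in * dpi), round(height_in * dpi)


def to_file_pixels(canvas_size, scale=1):
    """Pixel size written by to_file: canvas size times scale."""
    width, height = canvas_size
    return int(width * scale), int(height * scale)


# Assumption: matplotlib's default figure size of 6.4 x 4.8 inches.
fig_px = savefig_pixels((6.4, 4.8), dpi=200)
# Assumption: no mask in effect, so the 2000 x 900 canvas from the script applies.
wc_px = to_file_pixels((2000, 900))
print(fig_px, wc_px)
```

Under these assumptions the savefig output is 1280 x 960 pixels including the figure margins, while to_file keeps the full 2000 x 900 canvas.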