Python crawler tutorial for beginners (6): making word cloud images

Preface

The text and images in this article come from the Internet and are for learning and exchange only, not for commercial use. If you have any concerns, please contact us and we will deal with them.


Previous articles in this series

Python crawler tutorial for beginners (1): crawling Douban movie rankings

Python crawler tutorial for beginners (2): crawling novels

Python crawler tutorial for beginners (3): crawling Lianjia second-hand housing data

Python crawler tutorial for beginners (4): crawling 51job.com job postings

Python crawler tutorial for beginners (5): crawling the danmaku (bullet comments) of a Bilibili video

Basic development environment

  • Python 3.6
  • PyCharm

Modules used

  • jieba
  • wordcloud

Install Python and add it to the PATH environment variable, then install the required modules with pip.
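One way to confirm the setup before running the scripts is to check that each module can be found. The sketch below is only an illustration: `missing_modules` is a helper name made up for this article, and it assumes the packages are installed under their usual names (jieba and wordcloud, plus imageio for the shaped word cloud later on).

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == '__main__':
    missing = missing_modules(['jieba', 'wordcloud', 'imageio'])
    if missing:
        # The usual fix is to run: pip install <name> for each missing module
        print('Missing modules:', ', '.join(missing))
    else:
        print('All required modules are installed.')
```

If anything is reported missing, `pip install jieba wordcloud imageio` should cover all three.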

The previous article crawled the danmaku (bullet comments) of a Bilibili video. We can turn that data into a word cloud, so that the crawled data is a little less dull to look at.

The code is short, and the comments explain each step.

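import jieba
import wordcloud

# Read the danmaku text saved by the previous article's crawler
with open('弹幕.txt', encoding='utf-8') as f:
    txt = f.read()

# Segment the text into words with jieba
txt_list = jieba.lcut(txt)
# Join the words with spaces so that WordCloud can tell them apart
string = ' '.join(txt_list)

# Word cloud settings
wc = wordcloud.WordCloud(
    width=1000,                 # image width
    height=700,                 # image height
    background_color='white',   # background color
    font_path='msyh.ttc',       # font with Chinese glyphs (Microsoft YaHei)
    scale=15,                   # output is rendered at scale times the size above
)
# Feed the text to the word cloud
wc.generate(string)
# Save the word cloud image
wc.to_file('out.png')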
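The biggest words in the picture are simply the most frequent ones, so it can help to look at the counts directly. Here is a minimal sketch using only the standard library; `top_words` is a hypothetical helper, and the sample list is made up to stand in for the output of `jieba.lcut(txt)`.

```python
from collections import Counter


def top_words(words, n=10, min_length=2):
    """Count words and return the n most common as (word, count) pairs.

    Whitespace-only tokens and tokens shorter than min_length are skipped,
    since single characters and punctuation rarely make useful keywords.
    """
    counts = Counter(
        word.strip() for word in words
        if len(word.strip()) >= min_length
    )
    return counts.most_common(n)


if __name__ == '__main__':
    # Made-up sample standing in for jieba.lcut(txt)
    sample = ['到位', '哈哈', '到位', '，', '好看', '到位', '哈哈', ' ']
    for word, count in top_words(sample, n=3):
        print(word, count)
```

If you already have counts like these, `WordCloud.generate_from_frequencies` accepts a dict mapping each word to its frequency and can be used in place of `generate`.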

 


As the figure above shows, the word 到位 ("in place") appears many times. A word like this carries no real meaning, so we can filter it out by adding stop words to the word cloud settings.

stopwords={'到位'}
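The `stopwords` argument takes a set of words to leave out. The same filtering can also be done on the word list before it is joined into a string, which may be handier when the list of stop words grows. The following is a sketch under that assumption; `remove_stopwords` and the sample words are illustrative and are not part of jieba or wordcloud.

```python
def remove_stopwords(words, stopwords):
    """Return words with stop words and whitespace-only tokens removed."""
    stopwords = set(stopwords)
    return [
        word for word in words
        if word.strip() and word.strip() not in stopwords
    ]


if __name__ == '__main__':
    # Made-up sample standing in for jieba.lcut(txt)
    words = ['老师', '讲得', '到位', ' ', '到位', '精彩']
    cleaned = remove_stopwords(words, {'到位'})
    # The cleaned list can be joined and passed to wc.generate() as before
    print(' '.join(cleaned))
```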

If you want the word cloud to take a particular shape instead of the default rectangle, first find a picture of that shape with a transparent or plain white background and read it with the imageio module. wordcloud keeps words out of the pure white areas of the mask.

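import jieba
import wordcloud
import imageio

# Read a local image with imageio's imread function; it becomes the shape of the word cloud
py = imageio.imread('.\\0.jpg')  # only needed if you want to change the shape of the word cloud
# Read the danmaku text
with open('B站弹幕.txt', encoding='utf-8') as f:
    txt = f.read()

# Segment the text into words with jieba
txt_list = jieba.lcut(txt)
# Join the words with spaces so that WordCloud can tell them apart
string = ' '.join(txt_list)

# Word cloud settings
wc = wordcloud.WordCloud(
    width=1000,                 # image width (ignored when a mask is given)
    height=700,                 # image height (ignored when a mask is given)
    background_color='white',   # background color
    font_path='msyh.ttc',       # font with Chinese glyphs (Microsoft YaHei)
    mask=py,                    # image that gives the word cloud its shape
    scale=15,                   # output is rendered at scale times the mask size
    stopwords={'到位'},          # stop words
    # contour_width=5,          # outline width
    # contour_color='red'       # outline color
)
# Feed the text to the word cloud
wc.generate(string)
# Save the word cloud image
wc.to_file('out.png')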
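If no suitable picture is at hand, a mask can also be built in code. wordcloud treats pure white (255) mask values as off limits and draws words everywhere else, so a dark circle on a white background should give a round word cloud. The sketch below builds such a mask as plain nested lists; `circle_mask` is a hypothetical helper, and converting the result with numpy before passing it as `mask` is an assumption about how you would wire it into the script above.

```python
def circle_mask(size):
    """Build a size x size mask: 0 inside a centered circle, 255 outside.

    255 (white) marks the pixels where no words should be drawn.
    """
    center = (size - 1) / 2
    radius = size / 2
    return [
        [
            0 if (x - center) ** 2 + (y - center) ** 2 <= radius ** 2 else 255
            for x in range(size)
        ]
        for y in range(size)
    ]


if __name__ == '__main__':
    mask = circle_mask(9)
    # Print a small preview: '#' is drawable, '.' is masked out
    for row in mask:
        print(''.join('#' if value == 0 else '.' for value in row))
    # Assumed usage with the script above (requires numpy):
    #   import numpy
    #   py = numpy.array(circle_mask(800), dtype='uint8')
    #   then pass mask=py to wordcloud.WordCloud as before
```

A mask built this way replaces the `imageio.imread` line; the rest of the script stays the same.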

 


Origin blog.csdn.net/m0_48405781/article/details/113247830