[Python] Making a word cloud image

Making a word cloud in Python

I have been practicing with word clouds for a while, so here is the detailed process I use to make one.

Result

[Image: the generated word cloud]

Tool preparation

  • 1. Python 3
  • 2. Install the third-party library wordcloud
  • 3. Install the numpy and pillow libraries
  • 4. Install the jieba library
  • 5. Install the matplotlib library

import jieba
from wordcloud import WordCloud
import numpy as np
from PIL import Image
from matplotlib import colors
import collections
# These are all the libraries the script uses

Installation method: I mostly install the libraries directly through the settings in PyCharm. Installation sometimes fails, in which case you will need to look up the fix yourself.
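If you are not sure whether everything installed correctly, a quick check like the one below may help. It is only a sketch: the helper name `missing_modules` is made up for this post, and it just asks Python whether each import name can be found. Note that pillow is imported as `PIL`. From a command line, `pip install wordcloud numpy pillow jieba matplotlib` is the usual alternative to the PyCharm settings dialog.

```python
from importlib.util import find_spec

# Import names of the libraries listed above (pillow is imported as "PIL").
REQUIRED_MODULES = ["wordcloud", "numpy", "PIL", "jieba", "matplotlib"]


def missing_modules(module_names):
    """Return the names in module_names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```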

Code

# -*- coding: utf-8 -*-
import jieba
from wordcloud import WordCloud
import numpy as np
from PIL import Image
from matplotlib import colors
import collections


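def chinese_jieba():
    # Read the target text ('文本.txt' means "text.txt")
    with open(r'文本.txt', encoding='utf-8') as fp:
        txt = fp.read()  # the with block closes the file automatically
    wordlist_jieba = jieba.lcut(txt)  # segment the text; returns a list of words
    txt_jieba = " ".join(wordlist_jieba)  # join the list into a space-separated string
    return txt_jieba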
    
def stopwords_read():
    # Read the stop words; you can also add your own to this list as needed
    stopwords_ = ['里', '拍']
    with open('chinesestopwords.txt', 'r', encoding='utf-8') as f:
        for line in f:
            word = line.strip()
            if word:  # skip blank lines
                stopwords_.append(word)
    return stopwords_
    

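def wordcloud_generate():
    stopwords_ = stopwords_read()  # load the stop words
    txt = chinese_jieba()  # load and segment the text
    background_image = np.array(Image.open('椭圆背景.jpg'))  # custom mask that sets the outline ("oval background.jpg")
    colormaps = colors.ListedColormap(['#871A84', '#BC0F6A', '#BC0F60', '#CC5F6A', '#AC1F4A'])  # custom font colors (a purple-red palette)
    wordcloud = WordCloud(font_path='simhei.ttf',  # font (must support Chinese characters)
                          prefer_horizontal=0.99,  # lay out almost all words horizontally
                          background_color='white',  # background color
                          max_words=100,  # maximum number of words shown
                          max_font_size=400,  # maximum font size
                          stopwords=stopwords_,  # filter out noise words
                          mask=background_image,  # outline mask
                          colormap=colormaps,  # use the custom colors
                          collocations=False  # do not count two-word phrases
                          ).generate(txt)
    image = wordcloud.to_image()
    image.show()  # display the image
    wordcloud.to_file('词云图.jpg')  # save the image ("word cloud.jpg")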

if __name__ == '__main__':
    wordcloud_generate()

The code above can be run as is.
You can find the generated image in the folder where this code is stored.
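The script imports `collections` but never uses it. One possible use, shown here only as a sketch, is to count word frequencies yourself and pass them to `WordCloud.generate_from_frequencies` instead of calling `generate`. The helper name `word_frequencies` is made up for this example. Keep in mind that `generate_from_frequencies` does not apply the `stopwords` setting, so stop words would have to be filtered out before counting.

```python
import collections


def word_frequencies(segmented_text, top_n=100):
    """Count the words in a space-separated string and keep the top_n most common."""
    words = segmented_text.split()
    return dict(collections.Counter(words).most_common(top_n))


if __name__ == "__main__":
    sample = "smart care smart home care smart"
    print(word_frequencies(sample))  # {'smart': 3, 'care': 2, 'home': 1}
```

The resulting dictionary could then be used as `WordCloud(...).generate_from_frequencies(freqs)`.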

Code analysis

  1. Basic requirements for running the code:
    First, all the libraries must be installed.
    Second, store the code, the target text, the stop word file, the font, and the background image in the same folder. If they are not in the same folder, you need to change the resource paths in the code to absolute paths. The folder looks like this:
    [Image: the folder containing the code and resource files]
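If you would rather not depend on which folder the script is started from, one option is to build each path from the script's own location. This is a sketch only: `resource_path` is a hypothetical helper, not part of the original code, and it falls back to the current working directory when `__file__` is not available (for example in an interactive session).

```python
from pathlib import Path


def resource_path(name, base_dir=None):
    """Return an absolute path to a resource file stored in base_dir.

    When base_dir is not given, use the folder that contains this script,
    or the current working directory if __file__ is not defined.
    """
    if base_dir is None:
        try:
            base_dir = Path(__file__).resolve().parent
        except NameError:
            base_dir = Path.cwd()
    return Path(base_dir).resolve() / name


# Example: open the target text no matter where the script is launched from
# with open(resource_path('文本.txt'), encoding='utf-8') as fp:
#     txt = fp.read()
```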

  2. Text
    I store the text in a txt file. The content was crawled from Weibo using the keyword "smart elderly care", and the resulting word cloud does reflect this theme well.

  3. Stop words
    What are stop words?
    After the text is segmented into words, you may get some scattered, meaningless words such as "this" and "that", which do not help present the theme of the text. You can add these words to the stop word list, and they will not be displayed in the image.
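To make the idea concrete, here is a minimal sketch of what stop word filtering does to a list of segmented words. The function `remove_stopwords` is made up for illustration; in the script above, WordCloud does this filtering itself through its `stopwords` parameter.

```python
def remove_stopwords(tokens, stopwords):
    """Return tokens with stop words and whitespace-only entries removed."""
    stop = set(stopwords)
    return [t for t in tokens if t.strip() and t not in stop]


tokens = ["this", "smart", "device", "that", " ", "helps"]
print(remove_stopwords(tokens, ["this", "that"]))  # ['smart', 'device', 'helps']
```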

  4. Word cloud text color
    There are many ways to set the text color. Here I use a custom palette. Below are some color sets I recommend. I picked them out of thousands of colors, and I think they look quite good.

['#43045F', '#4E0362', '#C63264', '#FF9799', '#FFBAAB']  # purple
['#7e9680', '#79616f', '#AE6378', '#D87F81', '#EAB595']  # mixed colors
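As one example of another way to set the colors, WordCloud also accepts a `color_func` callable, which takes precedence over `colormap`. The sketch below picks a random color from a palette for each word. The name `make_palette_color_func` is made up for this example, and I am assuming WordCloud calls the function with the keyword arguments shown in the signature.

```python
import random

PALETTE = ['#43045F', '#4E0362', '#C63264', '#FF9799', '#FFBAAB']


def make_palette_color_func(palette):
    """Build a color function that picks a random palette color for each word."""
    def color_func(word, font_size=None, position=None, orientation=None,
                   random_state=None, **kwargs):
        # Use the random state passed by the caller when there is one,
        # otherwise fall back to the random module.
        rng = random_state if random_state is not None else random
        return rng.choice(palette)
    return color_func


# Usage (requires wordcloud):
# WordCloud(color_func=make_palette_color_func(PALETTE), ...)
```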
  5. Other parameters
    I have commented many of the parameters in the program. You can modify them according to your own understanding and needs.
  6. Resolution
    If the word cloud is generated from a background image, the resolution of the generated word cloud is the same as that of the background image. If the background image is 100 by 100 pixels, the generated word cloud is also 100 by 100 pixels. I recommend choosing a high-resolution background image.
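WordCloud also has a `scale` parameter (default 1) that enlarges the rendered image relative to the mask. As far as I understand the library, the output size is the mask size multiplied by `scale`. The small sketch below only does that arithmetic; `output_size` is a made-up helper, not a wordcloud function. A larger `scale` may fit the words to the outline more coarsely than a genuinely high-resolution mask would.

```python
def output_size(mask_height, mask_width, scale=1):
    """Return (width, height) in pixels of the rendered image for a mask and scale."""
    return int(mask_width * scale), int(mask_height * scale)


print(output_size(100, 100))           # (100, 100)
print(output_size(100, 100, scale=4))  # (400, 400)

# Usage (requires wordcloud):
# WordCloud(mask=background_image, scale=4, ...)
```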

Below is my GitHub address. Everyone is welcome to download the code for free, share ideas, and learn together:

https://github.com/HYHJessica/


Origin blog.csdn.net/CBCY_csdn/article/details/125676309