Drawing a word cloud with Python



Result and preparation:

Result:
[Image: the finished word cloud]
Preparation

  • Install the third-party libraries numpy, jieba, and wordcloud in PyCharm
  • Prepare the text for the word cloud (.txt)
  • Prepare the background image (I used PS, i.e. Photoshop)
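
Before running the tutorial code, it can help to confirm that the libraries are importable from the interpreter PyCharm is using. The following is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper name, and the list also includes PIL (the import name of Pillow) and matplotlib because the code later in this post imports them.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names used in this post; Pillow is imported as "PIL".
    required = ["numpy", "jieba", "wordcloud", "PIL", "matplotlib"]
    missing = missing_modules(required)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

If anything is reported as missing, install it in PyCharm or with pip and run the check again.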

Installing third-party libraries in PyCharm:
(Personally, I find this more convenient than installing with pip in the terminal)

Background image preparation

  • Find any image on Baidu
  • Use PS to make the background pure white and the figure any other solid color
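
The background needs to be pure white because wordcloud only treats pixels with the value 255 as "masked out"; slightly off-white pixels still count as drawable area. If Photoshop is not at hand, the cleanup can be approximated in code. This is a rough sketch, not the method used in this post: `clean_mask` is a hypothetical helper, and the threshold of 240 is an assumption that may need tuning for a given image.

```python
import numpy as np


def clean_mask(pixels, threshold=240):
    """Force near-white pixels to pure white (255) and all others to black (0).

    `pixels` is a 2-D grayscale array with values 0..255, for example
    np.array(Image.open(path).convert("L")).
    """
    pixels = np.asarray(pixels)
    return np.where(pixels >= threshold, 255, 0).astype(np.uint8)
```

The returned array could then be passed as the `mask` argument in place of the raw image array.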

Writing the code

First, put the text file and the background image in the same folder as the Python file.

Steps for adding the text and picture:
Copy the path of the Python file, paste it into the file manager to open that folder, then copy the text file and the picture into it.


Ideas:

  • Open the text
  • Use a regular expression to remove special symbols from it
  • Use jieba to tokenize the text
  • Count the number of occurrences of each word and sort them
  • Remove unwanted words from the dictionary
  • Finally, use the wordcloud library to make the word cloud
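
The counting steps above can be sketched compactly with `collections.Counter` from the standard library. This is an illustrative variant rather than the code used later in this post: `word_frequencies` is a hypothetical helper, and the tokenizer is passed in as an argument so that the sketch runs without jieba installed.

```python
import re
from collections import Counter


def word_frequencies(text, tokenize, stop_words=()):
    """Clean `text`, split it with `tokenize`, and count the remaining words."""
    # Keep only characters in the range U+4E00..U+9FA5 (Chinese characters).
    cleaned = re.sub(r"[^\u4e00-\u9fa5]", "", text)
    counts = Counter(tokenize(cleaned))
    # Drop unwanted words; pop() with a default ignores words that never occurred.
    for word in stop_words:
        counts.pop(word, None)
    # most_common() orders the pairs by count, highest first.
    return dict(counts.most_common())
```

With jieba installed, a call might look like `word_frequencies(raw_text, jieba.cut, ["的", "是", "和"])`, and the result can be handed to `generate_from_frequencies`.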

Libraries used

import re

import jieba
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from wordcloud import WordCloud

File handling and word counting

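mask = np.array(Image.open('1.png'))  # load the background image as the mask

# Open the text file (UTF-8 encoded) and read the raw text
with open(r'新时代中国特色社会主义.txt', mode='r', encoding='utf-8') as txt:
    txt1 = txt.read()
# Use a regular expression to delete everything except Chinese characters
txt2 = re.sub(r'[^\u4e00-\u9fa5]', '', txt1)
txt3 = jieba.cut(txt2)  # word segmentation; returns an iterable (generator)
# Count how many times each word occurs
txt4 = {}
for i in txt3:
    if i not in txt4:
        txt4[i] = 1
    else:
        txt4[i] += 1
# Sort by count (x[1]), most frequent first
txt5 = sorted(txt4.items(), key=lambda x: x[1], reverse=True)
txt6 = {}
for word, count in txt5:
    txt6[word] = count
# Filter: words to leave out of the word cloud
stop_words = ["的", "是", "和"]
for i in stop_words:
    txt6.pop(i, None)  # unlike del, this does not fail if the word is absent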
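
To see what the regular expression in this step actually removes, here is a small standalone demonstration. The character class `[^\u4e00-\u9fa5]` matches everything outside the Chinese character range, so punctuation, digits, Latin letters, spaces, and line breaks are all deleted. The second pattern is a hypothetical variant, not part of the original code, for texts where ASCII letters and digits should be kept.

```python
import re

CHINESE_ONLY = re.compile(r"[^\u4e00-\u9fa5]")
# Hypothetical variant: also keep ASCII letters and digits.
CHINESE_AND_ASCII = re.compile(r"[^\u4e00-\u9fa5A-Za-z0-9]")

sample = "Python 3 很好用,\n对吧?"
chinese_only = CHINESE_ONLY.sub("", sample)
with_ascii = CHINESE_AND_ASCII.sub("", sample)
print(chinese_only)  # 很好用对吧
print(with_ascii)    # Python3很好用对吧
```

Note that removing the separators joins neighbouring fragments into one string, which is why tokenizing with jieba happens after cleaning.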

Generating the word cloud

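wordcloud = WordCloud(
    background_color='white',  # background color
    font_path='simsun.ttc',  # font; it must contain Chinese glyphs
    max_words=400,  # maximum number of words
    mask=mask,  # shape image; pure white areas are left empty
    max_font_size=90,  # maximum font size
    width=400,  # width (ignored when a mask is given)
    height=400,  # height (ignored when a mask is given)
).generate_from_frequencies(txt6)
plt.figure(figsize=(8, 8))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')  # hide the axes
plt.show()
wordcloud.to_file('词云.png')  # save the image to a file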
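
The `font_path` value `'simsun.ttc'` refers to the SimSun font that ships with Windows. On other systems a different font containing Chinese glyphs has to be supplied, otherwise the words are drawn as empty boxes. The sketch below picks the first font file that exists; `first_existing_font` is a hypothetical helper, and the candidate paths are examples only, since font locations differ between machines.

```python
import os


def first_existing_font(candidates):
    """Return the first path in `candidates` that is an existing file, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Example locations only; adjust them to the fonts actually installed.
CANDIDATE_FONTS = [
    r"C:\Windows\Fonts\simsun.ttc",
    "/System/Library/Fonts/PingFang.ttc",
    "/usr/share/fonts/opentype/noto/NotoSansCJK-Regular.ttc",
]

font_path = first_existing_font(CANDIDATE_FONTS)
```

If `font_path` is not None, it can be passed to `WordCloud(font_path=font_path, ...)`.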

After that, the word cloud is displayed and saved as 词云.png.

Origin blog.csdn.net/weixin_72138633/article/details/131025351