A simple analysis of WeChat friends using Python

I. Features

This article uses the WeChat web client to collect data about your own WeChat friends and runs some simple analysis on it. The features are:

1. Crawl the friend list, display each friend's nickname, gender, region, and signature, and save the data as an xlsx file.

2. Compute the geographic distribution of friends, then present it as a word cloud and on a map.

II. Dependencies

1. Pyecharts: a library for generating ECharts charts. ECharts is an open-source data visualization library from Baidu whose charts look great; pyecharts lets you generate ECharts charts from data in Python.

2. Itchat: an open-source interface to personal WeChat accounts that makes calling WeChat from Python very easy.

3. Jieba: a simple word segmentation library.

4. NumPy: an open-source numerical computing extension for Python, used to store and process large matrices.

5. Pandas: a tool built on NumPy, created for data analysis tasks.

6. Pillow: image processing.

7. wxpy: built on itchat, it improves the module's ease of use by optimizing a large number of interfaces and extends it with many features. (It provides WeChat access on its own.)

Note: installing pyecharts version 0.5.* is recommended.

The third-party libraries above can be installed from the command line (cmd) with: pip install ***
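
Before running the rest of the article, it can help to confirm which libraries are actually importable. The snippet below is only a sketch: the helper name `missing_modules` is made up for illustration, and the list mixes the libraries named above with the ones imported by the later code (openpyxl, wordcloud, matplotlib). Note that Pillow is imported under the name `PIL`.

```python
import importlib.util

# Import names, which sometimes differ from the pip package name
# (Pillow is installed with "pip install Pillow" but imported as PIL).
REQUIRED = [
    "pyecharts", "itchat", "jieba", "numpy", "pandas", "PIL", "wxpy",
    "openpyxl", "wordcloud", "matplotlib",
]


def missing_modules(names):
    """Return the names from `names` that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Anything printed here still needs "pip install ...".
print("Missing:", missing_modules(REQUIRED))
```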

III. Operation

 

from wxpy import *            # import the module
bot = Bot(cache_path=True)    # initialize the bot and log in by scanning a QR code
friend_all = bot.friends()    # get the WeChat friends' information

A QR code appears first; scan it to log in.

 

After a successful login, the following is displayed:

 

You can then work with the friend list and personal information:

print(len(friend_all))      # number of friends
print(friend_all[0].raw)    # print personal information

 

The result is as follows:

 

IV. Writing all friends' information to an xlsx file

Get the information for all friends:

lis = []    # collects one row per friend
for a_friend in friend_all:
    # nickname
    NickName = a_friend.raw.get('NickName', None)
    # gender (improved: map WeChat's numeric code to a label)
    # Sex = a_friend.raw.get('Sex', None)
    Sex = {1: "male", 2: "female", 0: "other"}.get(a_friend.raw.get('Sex', None), None)
    # city
    City = a_friend.raw.get('City', None)
    # province
    Province = a_friend.raw.get('Province', None)
    # personal signature
    Signature = a_friend.raw.get('Signature', None)
    # avatar URL
    HeadImgUrl = a_friend.raw.get('HeadImgUrl', None)
    # avatar flag
    HeadImgFlag = a_friend.raw.get('HeadImgFlag', None)
    # store the fields as a one-dimensional list
    list_0 = [NickName, Sex, City, Province, Signature, HeadImgUrl, HeadImgFlag]
    # append the row to the accumulated data
    lis.append(list_0)
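
The per-friend extraction can also be tried out without logging in, by working on a plain dictionary shaped like `a_friend.raw`. The sketch below assumes WeChat's convention of 1 for male and 2 for female; the helper name `friend_to_row` and the sample dictionary are invented for illustration and are not real account data.

```python
# Assumed mapping of WeChat's numeric gender code to a label.
SEX_LABELS = {1: "male", 2: "female", 0: "other"}

# Keys read from the raw dictionary, in the order of the output row.
FIELDS = ["NickName", "Sex", "City", "Province",
          "Signature", "HeadImgUrl", "HeadImgFlag"]


def friend_to_row(raw):
    """Build one output row from a raw friend dictionary.

    Missing keys become None, as with raw.get(key, None) in the loop above.
    """
    row = [raw.get(field) for field in FIELDS]
    row[1] = SEX_LABELS.get(row[1])   # replace the numeric code with a label
    return row


# Hypothetical sample shaped like a_friend.raw, with some keys missing.
sample_raw = {"NickName": "Alice", "Sex": 2,
              "City": "Shenzhen", "Province": "Guangdong"}
print(friend_to_row(sample_raw))
```

With a helper like this, the loop body would reduce to `lis.append(friend_to_row(a_friend.raw))`.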

 

Save as an xlsx file:

def list_excel(filename, lis):
    '''
    Write a list to Excel, where each element of the list is itself a list.
    filename: name of the file to save (including the path, without extension)
    lis: a list of lists, for example:
    lis = [["Title", "Price", "Publisher", "Language"],
    ["暗时间", "32.4", "人民邮电出版社", "Chinese"],
    ["拆掉思维里的墙", "26.7", "机械工业出版社", "Chinese"]]
    '''
    import openpyxl
    wb = openpyxl.Workbook()   # create a workbook
    sheet = wb.active          # take the active worksheet
    sheet.title = 'sheet1'
    file_name = filename + '.xlsx'
    for i in range(0, len(lis)):
        for j in range(0, len(lis[i])):
            # store the value in row i+1, column j+1 (cells are 1-indexed)
            sheet.cell(row=i+1, column=j+1, value=str(lis[i][j]))
    wb.save(file_name)
    print("Data written successfully!")

list_excel('wechat', lis)
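
If a header row is wanted from the start (the later analysis reads columns by name), pandas can write the same rows together with column titles. This is only a sketch, not the approach used in this article: the function names `rows_to_frame` and `save_with_header` are made up for illustration, and the column names follow the example header given in the word cloud section.

```python
import pandas as pd

# Column titles matching the example header row used for the analysis.
COLUMNS = ["nickname", "sex", "city", "province",
           "signature", "headImgUrl", "headImgFlag"]


def rows_to_frame(rows):
    """Turn the list of per-friend rows into a DataFrame with named columns."""
    return pd.DataFrame(rows, columns=COLUMNS)


def save_with_header(rows, path="wechat.xlsx"):
    """Write the rows to an xlsx file whose first row holds the column titles.

    pandas needs openpyxl installed to write .xlsx files.
    """
    rows_to_frame(rows).to_excel(path, index=False)
```

Calling `save_with_header(lis)` would then replace both `list_excel('wechat', lis)` and the manual step of adding a header row.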

 

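The result looks like this:

As you can see, most of these friends are in Guangdong Province, and their signatures are very flamboyant (shamate style).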

 

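V. Building a word cloud

We can also read the data back from the Excel file stored locally, analyze it, and inspect its shape. Before running the code below, a column header row must first be added to the Excel file.

For example: nickname sex city province signature headImgUrl headImgFlag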

# import modules
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import pandas as pd

# read the saved workbook (after the header row has been added)
df = pd.read_excel('wechat.xlsx')

# convert the DataFrame column to a list, replacing NaN with "0"
word_list = df['city'].fillna('0').astype(str).tolist()
new_text = ' '.join(word_list)
# set the font and the background color of the image
wordcloud = WordCloud(font_path='simhei.ttf', background_color="black").generate(new_text)
plt.imshow(wordcloud)
plt.axis("off")
plt.show()
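
Filling missing cities with "0" means that "0" itself can end up as a word in the cloud. One possible alternative, sketched below on the assumption that the `city` column is a pandas Series, is to drop missing and empty values, count the rest, and hand the counts to `WordCloud.generate_from_frequencies`. The helper names `city_frequencies` and `draw_cloud` are hypothetical.

```python
import pandas as pd


def city_frequencies(series):
    """Count non-empty city names, ignoring NaN and blank strings."""
    cleaned = series.dropna().astype(str).str.strip()
    cleaned = cleaned[cleaned != ""]
    return cleaned.value_counts().to_dict()


def draw_cloud(frequencies, font_path="simhei.ttf"):
    """Draw a word cloud sized by frequency (needs wordcloud and matplotlib).

    `frequencies` must not be empty, otherwise wordcloud raises an error.
    """
    from wordcloud import WordCloud
    import matplotlib.pyplot as plt

    cloud = WordCloud(font_path=font_path, background_color="black")
    cloud.generate_from_frequencies(frequencies)
    plt.imshow(cloud)
    plt.axis("off")
    plt.show()
```

Usage would be `draw_cloud(city_frequencies(df['city']))`, with `df` being the DataFrame read from the Excel file.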

 

The word cloud can also be saved as HTML:

# build the word cloud with pyecharts (0.5.x API)
import pandas as pd
from pyecharts.charts.wordcloud import WordCloud

# df is the DataFrame read from the saved Excel file
# count = df.city.value_counts()   # frequency count directly on the DataFrame; excludes NaN
# convert the DataFrame column to a list, replacing NaN with "NAN"
city_list = df['city'].fillna('NAN').tolist()
# count how often each city appears in the list
count_city = pd.Series(city_list).value_counts()
name = count_city.index.tolist()
value = count_city.tolist()
wordcloud = WordCloud(width=1300, height=620)
wordcloud.add("", name, value, word_size_range=[20, 100])
wordcloud.show_config()
wordcloud.render(r'D:\python\wechatcloud.html')
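
The code above targets pyecharts 0.5.x. The API changed incompatibly in pyecharts 1.0, so on a newer release a rough equivalent might look like the sketch below. The helper names `to_data_pairs` and `render_cloud_v1` and the default output file name are made up for illustration; the chart calls follow the pyecharts 1.x style, in which `add` takes a list of (name, value) pairs.

```python
def to_data_pairs(counts):
    """Turn a {name: count} mapping into the (name, value) pairs used by pyecharts 1.x."""
    return [(str(name), int(value)) for name, value in counts.items()]


def render_cloud_v1(counts, path="wechatcloud.html"):
    """Render a word cloud to HTML, assuming pyecharts 1.x is installed."""
    from pyecharts import options as opts
    from pyecharts.charts import WordCloud

    chart = WordCloud(init_opts=opts.InitOpts(width="1300px", height="620px"))
    chart.add("", to_data_pairs(counts), word_size_range=[20, 100])
    return chart.render(path)   # returns the path of the generated file
```

Here `counts` could be built from the city column, for example with `df['city'].value_counts().to_dict()`.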

 

Here is the result:

 

VI. Converting to a map

Note: install the map data packages first: pip install echarts-china-provinces-pypkg and pip install echarts-countries-pypkg

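import pandas as pd
from pyecharts import Map    # pyecharts 0.5.x API

# df is the DataFrame read from the saved Excel file
# convert the DataFrame column to a list, replacing NaN with "NAN"
province_list = df['province'].fillna('NAN').tolist()
# count how often each province appears in the list
count_province = pd.Series(province_list).value_counts()

value = count_province.tolist()
attr = count_province.index.tolist()
# named province_map so that the built-in map() is not shadowed
province_map = Map("WeChat friends by province", width=1300, height=700)
# is_label_show=True displays the province names on the map
province_map.add("", attr, value, maptype='china', is_visualmap=True, visual_text_color='#000', is_label_show=True)
province_map.show_config()
province_map.render(r'D:\python\wechatProMap.html')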

 

Result:

That wraps up this introduction to WeChat friend analysis.

 


Origin www.cnblogs.com/liyanyinng/p/10963105.html