[Python] My first small WeChat project

First, a WeChat friend data analysis application

 

1. Crawl the friend list, showing each friend's nickname, gender, region, and signature, and save it as an xlsx file.
2. Compute the geographic distribution of friends, then display it as a word cloud and on a map.
3. Fetch all friends' avatars and merge them into one large image.

Second, the required libraries

1. pyecharts: a library for generating ECharts charts. ECharts is a data visualization library open-sourced by Baidu whose charts look great; pyecharts lets you generate ECharts charts from Python.

2. itchat: an open-source interface to personal WeChat accounts. Calling WeChat from Python has never been easier.

3. jieba: a simple Chinese word segmentation library.

4. NumPy: an open-source numerical computing extension for Python that can store and process large matrices.

5. pandas: a tool built on NumPy, created for data analysis tasks.

6. Pillow: image processing.

7. wxpy: built on itchat, wxpy optimizes a large number of interfaces to make the module easier to use and adds a rich set of extensions. (It provides the WeChat interface itself.)

Note: it is better to install a 0.5.x version of pyecharts.

The third-party libraries above can be installed from the command line (cmd) with: pip install ***

In addition, displaying data on a map requires installing the map data packages:

pip install echarts-china-provinces-pypkg

pip install echarts-countries-pypkg
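
After installing, a quick way to confirm that everything is importable is to ask Python whether it can locate each module. The sketch below is a convenience and not part of the original project: the helper name missing_modules is made up for illustration, and note that Pillow is imported under the name PIL.

```python
import importlib.util

# Import names of the libraries listed above (Pillow is imported as "PIL").
REQUIRED = ["pyecharts", "itchat", "jieba", "numpy", "pandas", "PIL", "wxpy"]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Still to install:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

Anything reported as missing can then be installed with pip as described above.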

Third, the operating environment:

Use the Spyder editor that ships with Anaconda.

Fourth, step by step

1. Have the program log in to WeChat and fetch my friends' information.

from wxpy import *  # import the module
bot = Bot(cache_path=True)  # initialize the bot; log in by scanning the QR code
friend_all = bot.friends()  # fetch WeChat friend information

Running the login code automatically opens a QR code page, as shown in Figure 1. After you scan the code with your phone and confirm, the program is logged in to WeChat
and can access the information of your WeChat friends.

2. Get my own information

(1) Interactively

In [2]: print(friend_all[0].raw)  # friend_all[0] is me (my own WeChat account); .raw returns all of my information
{'UserName': '@8c0c266b8a6e26de8ac633c1b8e9da89bc28c2b8f2ea97084f66518d2b5280ba', 'City': '', 'DisplayName': '', 'PYQuanPin': '', 'RemarkPYInitial': '', 'Province': '', 'KeyWord': '', 'RemarkName': '', 'PYInitial': '',
'EncryChatRoomId': '', 'Alias': '', 'Signature': '千帆过尽还是你',
'NickName': '杨宇平', 'RemarkPYQuanPin': '',
'HeadImgUrl': '/cgi-bin/mmwebwx-bin/webwxgeticon?seq=1697661224&username=@8c0c266b8a6e26de8ac633c1b8e9da89bc28c2b8f2ea97084f66518d2b5280ba&skey=@crypt_35dd26c8_6e6d15d86316931db6d7bbb2bfe9b2e8',
'UniFriend': 0, 'Sex': 2, 'AppAccountFlag': 0, 'VerifyFlag': 0,
'ChatRoomId': 0, 'HideInputBarFlag': 0, 'AttrStatus': 0, 'SnsFlag': 1, 'MemberCount': 0, 'OwnerUin': 0, 'ContactFlag': 0, 'Uin': 548324490, 'StarFriend': 0, 'Statues': 0, 'MemberList': [], 'WebWxPluginSwitch': 0, 'HeadImgFlag': 1}
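
The raw value is an ordinary dictionary, so individual fields can be read with dict.get. As a minimal sketch, using a shortened copy of the output above rather than a live login (the helper name summarize is hypothetical):

```python
# Shortened copy of the dictionary printed above;
# a live session would use friend_all[0].raw instead.
raw = {
    'NickName': '杨宇平',
    'Sex': 2,
    'City': '',
    'Province': '',
    'Signature': '千帆过尽还是你',
    'HeadImgFlag': 1,
}


def summarize(raw):
    """Pick out the fields this project uses; missing keys become None."""
    keys = ['NickName', 'Sex', 'City', 'Province', 'Signature']
    return {key: raw.get(key, None) for key in keys}


info = summarize(raw)
print(info['NickName'], info['Sex'])
```

Using get with a default keeps the code from raising KeyError when a field is absent.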

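(2) From a script file

(Just add the last line to the original three lines of code.)

from wxpy import *
bot = Bot(cache_path=True)
friend_all = bot.friends()
print(friend_all[0].raw)  # friend_all[0] is me (my own WeChat account); .raw returns all of my information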

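Output:

3. Get the number of my friends

(Just add this line to the earlier code; it works both interactively and in a script file.)

print(len(friend_all))  # number of friends

Result: (the author has 177 friends)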

 

4. Convert all friend information into an xlsx file

Fetch the friend information

 

lis = []  # one row per friend
for a_friend in friend_all:
    NickName = a_friend.raw.get('NickName', None)  # nickname
    # Sex = a_friend.raw.get('Sex', None)
    # gender (improved): map WeChat's numeric code (1 = male, 2 = female) to a label
    Sex = {1: "male", 2: "female", 0: "other"}.get(a_friend.raw.get('Sex', None), None)
    City = a_friend.raw.get('City', None)  # city
    Province = a_friend.raw.get('Province', None)  # province
    Signature = a_friend.raw.get('Signature', None)  # personal signature
    HeadImgUrl = a_friend.raw.get('HeadImgUrl', None)  # avatar URL
    HeadImgFlag = a_friend.raw.get('HeadImgFlag', None)  # avatar flag
    list_0 = [NickName, Sex, City, Province, Signature, HeadImgUrl, HeadImgFlag]  # store as a one-dimensional list
    lis.append(list_0)  # accumulate the rows

Then save it as an xlsx file

def lis2e17(filename, lis):  # save the data to a spreadsheet
    import openpyxl
    wb = openpyxl.Workbook()
    sheet = wb.active
    sheet.title = 'list2excel17'
    file_name = filename + '.xlsx'
    # column names, for reference only; this function does not write them to the sheet
    title = ['NickName', 'Sex', 'City', 'Province', 'Signature', 'HeadImgUrl', 'HeadImgFlag']

    for i in range(0, len(lis)):
        for j in range(0, len(lis[i])):
            sheet.cell(row=i+1, column=j+1, value=str(lis[i][j]))

    wb.save(file_name)
    print("Data written successfully!")

lis2e17('yyp', lis)
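
The title list in the function above is defined but never written to the sheet, which is why a header row has to be added by hand in the next step. As a hedged alternative (not the author's code), the sketch below puts the header in front of the data before saving. The names rows_with_header and save_rows are hypothetical, the sample row is made up, and openpyxl is imported only when a file is actually written. The lowercase column names follow the ones suggested in step 5.

```python
TITLE = ['nickname', 'sex', 'city', 'province', 'signature', 'headImgUrl', 'headImgFlag']


def rows_with_header(title, lis):
    """Return a new list of rows with the header first and every cell as text."""
    return [list(title)] + [[str(cell) for cell in row] for row in lis]


def save_rows(filename, rows):
    """Write the rows to <filename>.xlsx; needs openpyxl installed."""
    import openpyxl  # imported here so the helper above works without it
    wb = openpyxl.Workbook()
    sheet = wb.active
    sheet.title = 'list2excel17'
    for row in rows:
        sheet.append(row)  # append() adds one row at the bottom of the sheet
    wb.save(filename + '.xlsx')


# Made-up sample row, only to show the shape of the data.
sample = [['Alice', 'female', 'Shenzhen', 'Guangdong', 'hello', '/icon?x=1', 1]]
rows = rows_with_header(TITLE, sample)
print(rows[0])
print(rows[1])
```

Calling save_rows('yyp_1', rows_with_header(TITLE, lis)) would then produce a file that already carries the header row.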

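Result:

 

5. Summarize friends' regions with a word cloud

(1) Add a header row to the original yyp.xlsx, for example: nickname sex city province signature headImgUrl headImgFlag (the word cloud code below looks up the city column by name, so that column needs a header), and save the file as yyp_1.xlsx.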

 

# Initial exploration of the data
# Method 1
# Get rough statistics about friends
Friends = bot.friends()
data = Friends.stats_text(total=True, sex=True, top_provinces=30, top_cities=500)
print(data)
from pandas import read_excel
# read yyp_1.xlsx, which is yyp.xlsx saved again with a header row added
df = read_excel('yyp_1.xlsx', sheet_name='list2excel17')
df.tail(5)
df.city.count()
df.city.describe()

 

# Count friends' regions and draw a word cloud
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import pandas as pd
from pandas import DataFrame

word_list = df['city'].fillna('0').tolist()
# convert the DataFrame column to a list, replacing NaN with "0"
new_text = ' '.join(word_list)
wordcloud = WordCloud(font_path='simhei.ttf', background_color="black").generate(new_text)
# set the background color and the font
plt.imshow(wordcloud)
plt.axis("off")
plt.show()

 

 

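(2) Render the word cloud as HTML

This requires version 0.5 of the pyecharts library. The version that came with Anaconda3 was 1.0, so it has to be uninstalled and version 0.5 installed instead.

For the specific steps, see the blog post https://www.jianshu.com/p/eaad92f6d9ee

The code is as follows: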

# Build the word cloud with pyecharts
import pandas as pd
# count = df.city.value_counts()  # frequency count on the DataFrame column; excludes NaN
city_list = df['city'].fillna('NAN').tolist()  # convert the column to a list, replacing NaN with "NAN"
count_city = pd.Series(city_list).value_counts()  # frequency count over the whole list
from pyecharts.charts.wordcloud import WordCloud  # chart class
name = count_city.index.tolist()
value = count_city.tolist()
wordcloud = WordCloud(width=1300, height=620)
wordcloud.add("", name, value, word_size_range=[20, 100])
wordcloud.show_config()
wordcloud.render(r'D:\Python\wechatcloud.html')
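
The frequency count is the core of this step: the word cloud only needs a list of names and a matching list of counts. The same idea can be sketched with the standard library alone, which may help when checking the numbers. The sample city list is made up, and 'NAN' stands in for missing values as in the code above.

```python
from collections import Counter

# Made-up sample; in the project this would be df['city'].fillna('NAN').tolist().
city_list = ['深圳', '广州', '深圳', 'NAN', '北京', '深圳', '广州']

# most_common() returns (item, count) pairs sorted by descending count.
count_city = Counter(city_list).most_common()
name = [city for city, _ in count_city]
value = [n for _, n in count_city]

print(name)
print(value)
```

The two lists line up index by index, which is the shape the add call above expects.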

The result looks like this:

6. Display friends' regions on a map

Import the module with import pandas as pd, then add the code below:

province_list = df['province'].fillna('NAN').tolist()
# convert the DataFrame column to a list, replacing NaN with "NAN"
count_province = pd.Series(province_list).value_counts()
# frequency count over the whole list

from pyecharts import Map
value = count_province.tolist()
attr = count_province.index.tolist()
# named province_map to avoid shadowing the built-in map()
province_map = Map("WeChat friends by province", width=1300, height=700)
province_map.add("", attr, value, maptype='china', is_visualmap=True, visual_text_color='#000', is_label_show=True)
# is_label_show=True displays the province names on the map
province_map.show_config()
province_map.render(r'D:\Python\wechatProMap.html')
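
Entries such as the 'NAN' placeholder, or regions outside China, are unlikely to match any region on a china map, so it can be useful to drop them before calling add. This is a sketch under assumptions: the set of known provinces is deliberately partial and only illustrative, the sample counts are made up, and filter_provinces is a hypothetical helper, not part of pyecharts.

```python
def filter_provinces(attr, value, known):
    """Keep only the (name, count) pairs whose name is in the known set."""
    pairs = [(a, v) for a, v in zip(attr, value) if a in known]
    return [a for a, _ in pairs], [v for _, v in pairs]


# Illustrative subset only; a real list would cover every province on the map.
KNOWN = {'广东', '北京', '上海', '湖南'}

# Made-up sample counts.
attr = ['广东', 'NAN', '湖南', 'California']
value = [120, 30, 15, 2]

attr, value = filter_provinces(attr, value, KNOWN)
print(attr, value)
```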

The result looks like this:

 

7. Full code:

# -*- coding: utf-8 -*-
"""
Created on Sun Jun  2 23:38:29 2019

@author: yyp
"""
from wxpy import *
import pandas as pd  # needed for the map section
bot = Bot(cache_path=True)
friend_all = bot.friends()
print(friend_all[0].raw)  # friend_all[0] is me (my own WeChat account); .raw returns all of my information
a = len(friend_all)  # number of friends
print(a)
lis = []

for a_friend in friend_all:
    NickName = a_friend.raw.get('NickName', None)
    # Sex = a_friend.raw.get('Sex', None)
    # map WeChat's numeric code (1 = male, 2 = female) to a label
    Sex = {1: "male", 2: "female", 0: "other"}.get(a_friend.raw.get('Sex', None), None)
    City = a_friend.raw.get('City', None)
    Province = a_friend.raw.get('Province', None)
    Signature = a_friend.raw.get('Signature', None)
    HeadImgUrl = a_friend.raw.get('HeadImgUrl', None)
    HeadImgFlag = a_friend.raw.get('HeadImgFlag', None)
    list_0 = [NickName, Sex, City, Province, Signature, HeadImgUrl, HeadImgFlag]
    lis.append(list_0)


def lis2e17(filename, lis):  # save the data to a spreadsheet
    import openpyxl
    wb = openpyxl.Workbook()
    sheet = wb.active
    sheet.title = 'list2excel17'
    file_name = filename + '.xlsx'
    # column names, for reference only; this function does not write them to the sheet
    title = ['NickName', 'Sex', 'City', 'Province', 'Signature', 'HeadImgUrl', 'HeadImgFlag']
    for i in range(0, len(lis)):
        for j in range(0, len(lis[i])):
            sheet.cell(row=i+1, column=j+1, value=str(lis[i][j]))
    wb.save(file_name)
    print("Data written successfully!")

lis2e17('yyp', lis)


# Initial exploration of the data
# Method 1
# Get rough statistics about friends
# (yyp.xlsx must first be saved again as yyp_1.xlsx with a header row)
Friends = bot.friends()
data = Friends.stats_text(total=True, sex=True, top_provinces=30, top_cities=500)
print(data)
from pandas import read_excel
# read yyp_1.xlsx, which is yyp.xlsx saved again with a header row added
df = read_excel('yyp_1.xlsx', sheet_name='list2excel17')
df.tail(5)
df.city.count()
df.city.describe()

'''# Count friends' regions and draw a word cloud (regular version)
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import pandas as pd
from pandas import DataFrame
word_list = df['city'].fillna('0').tolist()
# convert the DataFrame column to a list, replacing NaN with "0"
new_text = ' '.join(word_list)
wordcloud = WordCloud(font_path='simhei.ttf', background_color="black").generate(new_text)
# set the background color and the font
plt.imshow(wordcloud)
plt.axis("off")
plt.show()'''

'''# Build the word cloud with pyecharts (rendered as HTML)
import pandas as pd
# count = df.city.value_counts()  # frequency count on the DataFrame column; excludes NaN
city_list = df['city'].fillna('NAN').tolist()  # convert the column to a list, replacing NaN with "NAN"
count_city = pd.Series(city_list).value_counts()  # frequency count over the whole list
from pyecharts.charts.wordcloud import WordCloud  # chart class
name = count_city.index.tolist()
value = count_city.tolist()
wordcloud = WordCloud(width=1300, height=620)
wordcloud.add("", name, value, word_size_range=[20, 100])
wordcloud.show_config()
wordcloud.render(r'D:\Python\wechatcloud.html')'''

# Display friends' regions on a map
province_list = df['province'].fillna('NAN').tolist()  # convert the column to a list, replacing NaN with "NAN"
count_province = pd.Series(province_list).value_counts()  # frequency count over the whole list
from pyecharts import Map
value = count_province.tolist()
attr = count_province.index.tolist()
# named province_map to avoid shadowing the built-in map()
province_map = Map("WeChat friends by province", width=1300, height=700)
# is_label_show=True displays the province names on the map
province_map.add("", attr, value, maptype='china', is_visualmap=True, visual_text_color='#000', is_label_show=True)
province_map.show_config()
province_map.render(r'D:\Python\wechatProMap.html')
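
The third goal from the introduction, merging all avatars into one large image, is not implemented in the code above. A minimal sketch with Pillow follows, assuming the avatars have already been saved to disk as image files. The function names, the 64-pixel tile size, and the file paths are all illustrative assumptions, and Pillow is imported only inside merge_avatars.

```python
import math


def grid_layout(count, tile):
    """Return (canvas_size, positions) for a near-square grid of `count` tiles."""
    if count <= 0:
        return (0, 0), []
    cols = math.ceil(math.sqrt(count))
    rows = math.ceil(count / cols)
    positions = [((i % cols) * tile, (i // cols) * tile) for i in range(count)]
    return (cols * tile, rows * tile), positions


def merge_avatars(paths, out_path, tile=64):
    """Paste every avatar, resized to tile x tile, onto one canvas and save it."""
    from PIL import Image  # needs Pillow installed
    if not paths:
        raise ValueError("no avatars to merge")
    size, positions = grid_layout(len(paths), tile)
    canvas = Image.new('RGB', size, 'white')
    for path, pos in zip(paths, positions):
        avatar = Image.open(path).convert('RGB').resize((tile, tile))
        canvas.paste(avatar, pos)
    canvas.save(out_path)


# Layout for five avatars, computed without touching any image files.
size, positions = grid_layout(5, 64)
print(size, positions[-1])
```

With avatar files on disk, a call such as merge_avatars(['a.jpg', 'b.jpg'], 'all_friends.jpg') would write the combined image; both file names here are placeholders.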

 


Origin www.cnblogs.com/yyp-20190107/p/10971319.html