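Following https://blog.csdn.net/qinyuanpei/article/details/79360703, this post uses Python 3.6 to analyze WeChat friends along four dimensions: gender, location, personal signature, and profile picture.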

My GitHub project: https://github.com/sanciyuan/wechat_analysis_itchat

1. Preparation

1.1 Environment

Windows 10
Python 3.6
PyCharm (IDE)

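1.2 Third-party libraries

itchat: an open-source interface to personal WeChat accounts.
jieba: the Python version of Jieba (结巴) word segmentation; used here to segment text and extract keywords.
matplotlib: Python's plotting module; used here to draw bar charts and pie charts.
snownlp: a Chinese text-processing module for Python; used here for sentiment analysis of text.
PIL: Python's image-processing module; used here to process images.
numpy: Python's numerical computing module; used here together with wordcloud.
wordcloud: Python's word-cloud module; used here to draw word-cloud images.
xlwt: a module for writing .xls files; used here to save the per-province friend counts.
TencentYoutuyun: the Python SDK provided by Tencent Youtu; used here to detect faces and extract image tags.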

Installation problem ①:

All of the libraries above can be installed with pip install, except TencentYoutuyun.

Installing it the usual way with pip fails with an error:

pip install TencentYoutuyun

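Solution ①:

First download the official SDK from GitHub as a zip archive into some directory, e.g. E:\Project\test\Python_sdk-master.zip

https://github.com/Tencent-YouTu/Python_sdk

Open a command prompt, change to the download directory, and install the archive with pip: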

e:
cd Project\test
pip install Python_sdk-master.zip

Installation problem ②:

PIL only supports Python 2.x; it does not support Python 3.x.

Solution ②:

Pillow is a fork of PIL that has since grown into a more actively developed image-processing library than PIL itself.

Pillow on GitHub: https://github.com/python-pillow/Pillow
Pillow documentation (for v3.0.0): https://pillow.readthedocs.org/en/latest/handbook/index.html
Chinese translation of the Pillow documentation (for v2.4.0): http://pillow-cn.readthedocs.org/en/latest/

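Installing Pillow on Python 3.x

Installing Pillow is simple: a single line with pip or easy_install is enough.

Install with pip from the command line:

pip install Pillow

Or install with easy_install from the command line (easy_install is deprecated, so prefer pip):

easy_install Pillow

Once installed, import the library with from PIL import Image. For example: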

from PIL import Image
im = Image.open("bride.jpg")
im.rotate(45).show()

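2. Friends' gender

Read each friend's gender, count how many are male, female, and unknown, compute the proportions, and visualize them as a pie chart.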

from collections import Counter

import matplotlib.pyplot as plt


def analyseSex(friends):
    # friends[0] is the logged-in account itself, so skip it and
    # count how many friends fall under each 'Sex' code.
    counter = Counter(friend['Sex'] for friend in friends[1:])
    # The codes sort as 0, 1, 2, matching Unknown, Male, Female.
    names = ['Unknown', 'Male', 'Female']
    palette = ['red', 'yellowgreen', 'lightskyblue']
    # Use only the codes that actually occur, so counts, labels and colors
    # stay aligned even when one category is missing from the data.
    codes = sorted(counter)
    counts = [counter[code] for code in codes]
    labels = [names[code] for code in codes]
    colors = [palette[code] for code in codes]

    plt.figure(figsize=(8, 5), dpi=80)
    plt.axes(aspect=1)
    plt.pie(counts,
            labels=labels,
            colors=colors,
            labeldistance=1.1,
            autopct='%3.1f%%',
            shadow=False,
            startangle=90,
            pctdistance=0.6
    )
    plt.legend(loc='upper right')
    plt.title(u"Gender breakdown of %s's WeChat friends" % friends[0]['NickName'])
    plt.savefig("analyseSex.jpg")
    plt.show()
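
The counting step can be checked on its own, without logging in to WeChat or drawing a chart. The following is a minimal sketch, assuming 'Sex' is encoded as 0 (unknown), 1 (male), 2 (female), as the label order in the function above implies. The sample friends list and the helper name count_sexes are made up for illustration.

```python
from collections import Counter

# Hypothetical sample data shaped like the itchat friends list: the first
# entry is the account owner, and 'Sex' is assumed to be 0, 1 or 2.
friends = [
    {'NickName': 'me', 'Sex': 1},
    {'NickName': 'a', 'Sex': 1},
    {'NickName': 'b', 'Sex': 1},
    {'NickName': 'c', 'Sex': 0},
]

SEX_LABELS = {0: 'Unknown', 1: 'Male', 2: 'Female'}


def count_sexes(friends):
    """Return counts ordered as Unknown, Male, Female, keeping zeros."""
    counter = Counter(friend['Sex'] for friend in friends[1:])
    return [counter.get(code, 0) for code in sorted(SEX_LABELS)]


counts = count_sexes(friends)
for code, count in zip(sorted(SEX_LABELS), counts):
    print(SEX_LABELS[code], count)
```

Looking each code up explicitly keeps every count attached to its own label, even when nobody in the list falls into a category.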

3. Friends' location

Read each friend's location, save it locally to 'location.csv', then count the friends in each province and save the counts to 'location_analysis.xls'.

import csv

import xlwt


def analyseLocation(friends):
    freqs = {}
    headers = ['NickName', 'Province', 'City']
    with open('location.csv', 'w', encoding='utf-8', newline='') as csvFile:
        # DictWriter writes each row from a dict keyed by the header names.
        writer = csv.DictWriter(csvFile, headers)
        # writeheader() writes the column names as the first row.
        writer.writeheader()
        for friend in friends[1:]:
            row = {key: friend[key] for key in headers}
            # Count friends per province.
            province = friend['Province']
            if province is not None:
                freqs[province] = freqs.get(province, 0) + 1
            writer.writerow(row)
    print(freqs)
    print(len(freqs))

    book = xlwt.Workbook(encoding='utf-8')
    sheet = book.add_sheet('sheet1')
    sheet.write(0, 0, 'Province')  # arguments are (row, column, value)
    sheet.write(0, 1, 'Num')
    for i, (province, num) in enumerate(freqs.items(), start=1):
        sheet.write(i, 0, province)
        sheet.write(i, 1, num)

    book.save('location_analysis.xls')
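
The per-province tally can also be tried in isolation with the standard library only. This is a minimal sketch under stated assumptions: the sample friends list is invented, the helper names count_provinces and to_csv are hypothetical, and skipping empty province strings is a choice made for this sketch rather than something the function above does.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; the first entry stands for the account owner.
friends = [
    {'NickName': 'me', 'Province': 'Guangdong', 'City': 'Shenzhen'},
    {'NickName': 'a', 'Province': 'Guangdong', 'City': 'Guangzhou'},
    {'NickName': 'b', 'Province': 'Beijing', 'City': ''},
    {'NickName': 'c', 'Province': '', 'City': ''},
    {'NickName': 'd', 'Province': 'Guangdong', 'City': 'Shenzhen'},
]


def count_provinces(friends):
    """Count friends per province, skipping the owner and empty values."""
    return Counter(f['Province'] for f in friends[1:] if f['Province'])


def to_csv(freqs):
    """Render the counts as CSV text with a Province,Num header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerow(['Province', 'Num'])
    writer.writerows(freqs.most_common())
    return buffer.getvalue()


freqs = count_provinces(friends)
print(to_csv(freqs))
```

A CSV with the same two columns can be uploaded to a reporting tool just like the .xls file, which may be convenient if xlwt is not installed.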

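Use the BDP reporting tool at https://me.bdp.cn/home.html to analyze the location data.

Click "Data Source" (数据源) at the top, add an "excel" data source, and upload 'location_analysis.xls'. Then create a new chart (the options offered are normal chart, dashboard example, and WeChat operations analysis), set the dimension to Province and the value to Num, and on the right choose Map as the chart type. The result is a map of friend counts by province.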

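4. Friends' profile pictures

Call Tencent Youtu's face detection interface DetectFace and its image tagging interface imagetag, then draw a pie chart of how many friends use a face as their profile picture and a word cloud of the picture tags.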

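- Interface
`DetectFace(self, image_path, mode = 0, data_type = 0)`
- Parameters
    - `image_path`: path of the image to check
    - `mode`: whether to use big-face mode; off by default
    - `data_type`: whether image_path is an image or a URL; 0 means image, 1 means URL

- Interface
`imagetag(self, image_path, data_type = 0, seq = '')`
- Parameters
    - `image_path`: identifies the image
    - `data_type`: whether image_path is an image or a URL; 0 means image, 1 means URL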
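
The analysis code in this section calls detectFace and extractTags on a FaceAPI object that is not shown in this post. The following is only a sketch of what such a wrapper might look like, not the project's actual class. The constructor argument, the 'face' and 'tags' response keys, and the StubClient stand-in are all assumptions made for illustration; only the DetectFace and imagetag signatures and the 'tag_name' key come from the text.

```python
class FaceAPI:
    """Hypothetical thin wrapper exposing the two calls the analysis needs.

    `client` is any object with DetectFace(image_path, mode, data_type) and
    imagetag(image_path, data_type) methods, such as an SDK client.
    """

    def __init__(self, client):
        self.client = client

    def detectFace(self, image_path):
        # Assumption: the response is a dict holding a 'face' list.
        response = self.client.DetectFace(image_path, mode=0, data_type=0)
        return len(response.get('face', [])) > 0

    def extractTags(self, image_path):
        # Assumption: the response is a dict holding a 'tags' list of
        # dicts, each with a 'tag_name' key.
        response = self.client.imagetag(image_path, data_type=0)
        return response.get('tags', [])


class StubClient:
    """Stand-in for the real SDK client so the sketch runs offline."""

    def DetectFace(self, image_path, mode=0, data_type=0):
        return {'face': [{'face_id': 'demo'}]}

    def imagetag(self, image_path, data_type=0, seq=''):
        return {'tags': [{'tag_name': 'girl'}, {'tag_name': 'portrait'}]}


api = FaceAPI(StubClient())
has_face = api.detectFace('Image1.jpg')
tag_names = [tag['tag_name'] for tag in api.extractTags('Image1.jpg')]
print(has_face, tag_names)
```

Passing the client in makes it possible to exercise the counting logic with a stub before spending real API calls.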

import os
import time

import itchat
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from wordcloud import WordCloud


def analyseHeadImage(friends):
    # Create the folder that holds the downloaded profile pictures.
    baseFolder = os.path.join(os.path.abspath('.'), 'HeadImages')
    if not os.path.exists(baseFolder):
        os.makedirs(baseFolder)

    # Analyse images. FaceAPI is not defined in this post; it is assumed to
    # be the project's own wrapper around the TencentYoutuyun SDK.
    faceApi = FaceAPI()
    use_face = 0
    not_use_face = 0
    tag_names = []
    for index in range(1, len(friends)):
        friend = friends[index]
        # Download the profile picture only if it is not saved yet.
        imgFile = os.path.join(baseFolder, 'Image%s.jpg' % index)
        if not os.path.exists(imgFile):
            imgData = itchat.get_head_img(userName=friend['UserName'])
            with open(imgFile, 'wb') as file:
                file.write(imgData)

        # Detect faces, pausing one second between API calls.
        time.sleep(1)
        result = faceApi.detectFace(imgFile)
        if result == True:
            use_face += 1
        else:
            not_use_face += 1

        # Extract tags into a list so that tags from different friends
        # do not run together when joined.
        result = faceApi.extractTags(imgFile)
        tag_names.extend(tag['tag_name'] for tag in result)
    image_tags = ','.join(tag_names)

    labels = [u'Face as profile picture', u'No face']
    counts = [use_face, not_use_face]
    colors = ['red', 'yellowgreen']
    plt.figure(figsize=(8, 5), dpi=80)
    plt.axes(aspect=1)
    plt.pie(counts,  # face / no-face counts
            labels=labels,  # slice labels
            colors=colors,  # slice colors
            labeldistance=1.1,  # distance of the labels from the center
            autopct='%3.1f%%',  # format of the percentage text
            shadow=False,  # whether to draw a shadow
            startangle=90,  # starting angle of the first slice
            pctdistance=0.6  # distance of the percentage text from the center
    )
    plt.legend(loc='upper right')
    plt.title(u"Use of faces in the profile pictures of %s's WeChat friends" % friends[0]['NickName'])
    plt.savefig("analyseHeadImage.jpg")
    plt.show()

    # The tag names come back as UTF-8 bytes decoded as ISO-8859-1; undo that.
    image_tags = image_tags.encode('iso8859-1').decode('utf-8')
    back_coloring = np.array(Image.open('face.jpg'))
    wordcloud = WordCloud(
        font_path='simfang.ttf',
        background_color="white",
        max_words=1200,
        mask=back_coloring,
        max_font_size=85,
        random_state=75,
        width=800,
        height=480,
        margin=15
    )

    wordcloud.generate(image_tags)
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.savefig("wordcloudHeadImage.jpg")
    plt.show()
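
The encode('iso8859-1').decode('utf-8') step in the function above reverses a specific kind of garbling: UTF-8 bytes that were decoded as ISO-8859-1. A small sketch of that round trip, using an invented tag string and a hypothetical helper name repair_mojibake:

```python
def repair_mojibake(text):
    """Recover text whose UTF-8 bytes were wrongly decoded as ISO-8859-1."""
    return text.encode('iso8859-1').decode('utf-8')


original = '女孩,自拍'  # made-up tag names for the demonstration
# Simulate the garbling: correct UTF-8 bytes read with the wrong codec.
garbled = original.encode('utf-8').decode('iso8859-1')
repaired = repair_mojibake(garbled)

print(garbled != original)
print(repaired == original)
```

The trick works because ISO-8859-1 maps every byte to exactly one character, so encoding the garbled string back restores the original bytes without loss.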

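5. Friends' personal signatures

First read each friend's signature, then preprocess the text, and finally run sentiment analysis on it with SnowNLP.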
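
The preprocessing step can be sketched in isolation. This example assumes that emoji arrive in the signature as markup of the form <span class="emoji emoji1f60a"></span>, which is what the 'span', 'class', 'emoji' and '1f' patterns in the code below suggest; that format, the sample signature, and the helper name clean_signature are assumptions rather than something this post confirms. It removes the whole tag in one pass, so text after an emoji is kept.

```python
import re

# Assumption: emoji are embedded as <span class="emoji emoji1f60a"></span>.
EMOJI_SPAN = re.compile(r'<span class="emoji emoji[0-9a-fA-F]+"></span>')


def clean_signature(signature):
    """Strip emoji markup and surrounding whitespace from a signature."""
    if signature is None:
        return ''
    return EMOJI_SPAN.sub('', signature).strip()


raw = 'Stay hungry<span class="emoji emoji1f60a"></span> stay foolish '
cleaned = clean_signature(raw)
print(cleaned)
```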

import re

import jieba.analyse
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from snownlp import SnowNLP
from wordcloud import WordCloud


def analyseSignature(friends):
    keywords = []
    emotions = []
    # Matches the hex code left over from an emoji tag. Note that the greedy
    # '.+' also removes whatever follows the code on the same line.
    pattern = re.compile(r'1f\d.+')
    for friend in friends:
        signature = friend['Signature']
        if signature is not None:
            signature = signature.strip().replace('span', '').replace('class', '').replace('emoji', '')
            signature = pattern.sub('', signature)
            if len(signature) > 0:
                nlp = SnowNLP(signature)
                emotions.append(nlp.sentiments)
                # Keyword extraction: the 5 keywords with the highest TF-IDF weight.
                keywords.extend(jieba.analyse.extract_tags(signature, 5))
    # Join with spaces so keywords from different friends stay separate.
    signatures = ' '.join(keywords)
    with open('signatures.txt', 'wt', encoding='utf-8') as file:
        file.write(signatures)

    # Signature word cloud
    back_coloring = np.array(Image.open('flower.jpg'))
    wordcloud = WordCloud(
        font_path='simfang.ttf',
        background_color="white",
        max_words=1200,
        mask=back_coloring,
        max_font_size=75,
        random_state=45,
        width=960,
        height=720,
        margin=15
    )

    wordcloud.generate(signatures)
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.show()
    wordcloud.to_file('signatures.jpg')

    # Signature sentiment judgment
    count_good = len([x for x in emotions if x > 0.66])             # positive
    count_normal = len([x for x in emotions if 0.33 <= x <= 0.66])  # neutral
    count_bad = len([x for x in emotions if x < 0.33])              # negative
    # Print the share of each sentiment, guarding against an empty list.
    total = len(emotions)
    if total > 0:
        print(count_good * 100 / total)
        print(count_normal * 100 / total)
        print(count_bad * 100 / total)
    labels = [u'Negative', u'Neutral', u'Positive']
    values = (count_bad, count_normal, count_good)
    # Use a Chinese-capable font so Chinese nicknames in the title render.
    plt.rcParams['font.sans-serif'] = ['simHei']
    plt.rcParams['axes.unicode_minus'] = False
    plt.xlabel(u'Sentiment')
    plt.ylabel(u'Count')
    plt.xticks(range(3), labels)
    plt.bar(range(3), values, color=['r', 'g', 'b'])
    plt.title(u"Sentiment analysis of the signatures of %s's WeChat friends" % friends[0]['NickName'])
    plt.savefig("analyseSignatureEmotional.jpg")
    plt.show()
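
The bucketing at the end of the function can be tested without SnowNLP. A minimal sketch, using the same 0.33 and 0.66 thresholds on scores between 0 and 1; the sample scores and the helper names bucket_sentiments and percentages are invented for illustration.

```python
def bucket_sentiments(scores, low=0.33, high=0.66):
    """Count scores below `low`, above `high`, and in between."""
    counts = {'negative': 0, 'neutral': 0, 'positive': 0}
    for score in scores:
        if score < low:
            counts['negative'] += 1
        elif score > high:
            counts['positive'] += 1
        else:
            counts['neutral'] += 1
    return counts


def percentages(counts):
    """Convert counts to percentages; an empty input yields all zeros."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: value * 100 / total for name, value in counts.items()}


scores = [0.1, 0.5, 0.9, 0.95]  # hypothetical sentiment scores
counts = bucket_sentiments(scores)
shares = percentages(counts)
print(counts)
print(shares)
```

Handling the empty case explicitly avoids a division by zero when no friend has a usable signature.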

Mining and analyzing WeChat friends data with Python