Chunkize warning while installing gensim

Problem: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
Solution: add the following before import gensim:

import warnings
warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')
import gensim
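
The filter above silences every UserWarning raised from gensim for the rest of the process. If that is broader than you want, a narrower variant is sketched below: it suppresses only warnings whose message starts with a given text, and only inside a with block. The helpers noisy_import and call_quietly are hypothetical names made up for this illustration; they are not part of gensim.

```python
import warnings


def noisy_import():
    """Hypothetical stand-in for a library that warns when it is imported."""
    warnings.warn("detected Windows; aliasing chunkize to chunkize_serial", UserWarning)
    return "loaded"


def call_quietly(func, message_prefix="detected Windows"):
    """Run func, ignoring UserWarnings whose message starts with message_prefix.

    Returns the function result and the list of warnings that were NOT ignored.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record everything by default ...
        warnings.simplefilter("always")
        # ... except the one warning we want to hide (the newest filter is checked first)
        warnings.filterwarnings("ignore", message=message_prefix, category=UserWarning)
        result = func()
    return result, caught


result, caught = call_quietly(noisy_import)
print(result, len(caught))
```

Because the filter lives inside warnings.catch_warnings, the previous warning settings are restored as soon as the block exits, so other warnings in the program are unaffected.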

Error when upgrading pip

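Problem: upgrading pip from cmd fails with a message that the upgrade cannot be completed. (This matters because most packages are easiest to install through pip.)
Solution: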

python -m pip install -U pip
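
Running pip through the interpreter (python -m pip) instead of through the pip executable matters on Windows, where a running pip.exe cannot replace its own file. If the upgrade is driven from a script, the same command can be built as sketched below. The function names pip_upgrade_command and upgrade_pip are made up for this illustration; sys.executable is used so that the pip being upgraded belongs to the interpreter that runs the script.

```python
import subprocess
import sys


def pip_upgrade_command(package="pip"):
    """Build the 'python -m pip install -U <package>' command for the current interpreter."""
    return [sys.executable, "-m", "pip", "install", "-U", package]


def upgrade_pip():
    """Run the upgrade; raises CalledProcessError if pip exits with an error."""
    subprocess.run(pip_upgrade_command(), check=True)


print(" ".join(pip_upgrade_command()))
```

Calling upgrade_pip() performs the actual upgrade and needs network access; the print line only shows the command that would be run.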

Using word2vec

Problem: how to use word2vec.
Solution: to use the built-in API, import the gensim package.

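# Using Word2vec with Chinese text
# Note: gensim.models.deprecated was removed in gensim 4.0, so this import needs gensim 3.x
from gensim.models.deprecated.word2vec import Word2Vec
# Load the pretrained model (you have to download this model separately)
model = Word2Vec.load(r"D:\pythonworkplace\model\Word60.model")
# Measure how closely two words are related ('漂亮' = pretty, '美丽' = beautiful)
print(model.similarity('漂亮', '美丽'))
# Find the words closest to the given word (topn defaults to 10; 20 are requested here)
result = model.most_similar("美丽", topn=20)
for e in result:
    print(e)
# Find the word least related to the others (breakfast, dinner, lunch, "really beautiful")
print(model.doesnt_match("早餐 晚餐 午餐 真美".split()))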
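
To make the similarity scores less of a black box: word2vec stores one vector per word, and similarity and most_similar rank words by the cosine of the angle between those vectors. The sketch below reproduces that idea with plain numpy. The three-dimensional vectors are invented for illustration (a real model has many more dimensions), and the helper names are hypothetical, not gensim functions.

```python
import numpy as np

# Hypothetical toy vectors, made up for this example only
toy_vectors = {
    "beautiful": np.array([0.9, 0.1, 0.0]),
    "pretty": np.array([0.8, 0.2, 0.1]),
    "breakfast": np.array([0.0, 0.1, 0.9]),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 means same direction, 0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def most_similar(word, vectors, topn=10):
    """Rank every other word by cosine similarity to the query word."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


for entry in most_similar("beautiful", toy_vectors, topn=2):
    print(entry)
```

With these toy vectors, "pretty" ranks above "breakfast" for the query "beautiful", which mirrors what the real model does for related and unrelated words.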

Downloading the Python plotting package

Solution: install matplotlib with pip (python -m pip install matplotlib), then import it:

import matplotlib.pyplot as plt

Plotting with Python

Drawing a straight line

# Draw a straight line
x = [0, 1]
y = [0, 1]
plt.figure()
plt.plot(x, y)
plt.show()
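
The same line can also be drawn through matplotlib's object-oriented interface, which keeps explicit handles to the figure and axes and makes saving straightforward. The following is a minimal sketch: the function name draw_line and the axis labels are made up for illustration, and the Agg backend is selected so the script runs without opening a window.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to a file or buffer, no window
import matplotlib.pyplot as plt


def draw_line(x, y):
    """Create a figure with one axes and plot y against x on it."""
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    return fig, ax


fig, ax = draw_line([0, 1], [0, 1])

# Save the figure as PNG into an in-memory buffer; pass a file name instead to write to disk
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
print(len(buffer.getvalue()), "bytes of PNG data")
```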

Drawing a bar chart (and optionally saving the figure locally)

import matplotlib as mpl
# If you only want to save the figure to a file (no window), switch to the non-interactive Agg backend
# mpl.use('Agg')
import matplotlib.pyplot as plt
import numpy as np

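# A Chinese font must be configured, otherwise Chinese text is rendered as empty boxes
# Note: all Chinese text shown in the chart must be unicode; fname='zh.ttf' is the font file you want to use
# custom_font = mpl.font_manager.FontProperties(fname='zh.ttf')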

font_size = 8  # font size
fig_size = (6, 4)  # figure size
# Name of each series (one per algorithm)
names = (u'BR', u'CLR', u'ML-LOC', u'RAKEL', u'MLCBN')
# Labels along the X axis
subjects = (u'30', u'50', u'70')
# Values for each series (there are five algorithms, hence five tuples)
scores = (
    (0.1624, 0.1678, 0.1698),
    (0.1763, 0.1698, 0.1663),
    (0.1846, 0.1741, 0.1725),
    (0.1835, 0.1824, 0.1752),
    (0.1958, 0.1867, 0.1718)
)

# Update the font size
mpl.rcParams['font.size'] = font_size
# Update the figure size
mpl.rcParams['figure.figsize'] = fig_size
# Set the bar width
bar_width = 0.1

index = np.arange(len(scores[0]))
# Draw the bars for the first algorithm
rects1 = plt.bar(index, scores[0], bar_width, color='#00FF00', label=names[0])
# Draw the bars for the second algorithm, shifted right by one bar width
rects2 = plt.bar(index + bar_width, scores[1], bar_width, color='#FF0000', label=names[1])
# Draw the bars for the third algorithm
rects3 = plt.bar(index + 2 * bar_width, scores[2], bar_width, color='#0000FF', label=names[2])
# Draw the bars for the fourth algorithm
rects4 = plt.bar(index + 3 * bar_width, scores[3], bar_width, color='#000000', label=names[3])
# Draw the bars for the fifth algorithm
rects5 = plt.bar(index + 4 * bar_width, scores[4], bar_width, color='#6A5ACD', label=names[4])
# X-axis tick labels, centered under each group of five bars
plt.xticks(index + 2 * bar_width, subjects)
# Y-axis range
plt.ylim(0, 1)
# Chart title
plt.title(u'Hamming Loss Of Different Algorithms')
# X-axis label
plt.xlabel("Training Set Percentage(%)")
# Y-axis label
plt.ylabel("Hamming Loss Value")
# Legend position; bbox_to_anchor=(1, 0.85) fine-tunes the placement, ncol is the number of columns
plt.legend(loc='center right', bbox_to_anchor=(1, 0.85), fancybox=True, ncol=1)

'''
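# Add data labels (mainly so the values stay readable when the figure is saved)
def add_labels(rects):
    for rect in rects:
        height = rect.get_height()
        plt.text(rect.get_x() + rect.get_width() / 2, height, str(height), ha='center', va='bottom')
        # Draw the bar edges in white, purely for looks
        rect.set_edgecolor('white')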
        rect.set_edgecolor('white')

add_labels(rects1)
add_labels(rects2)
add_labels(rects3)
add_labels(rects4)
add_labels(rects5)
'''
# Save the figure to a file
# plt.savefig('scores_par.png')
# Display the chart
plt.show()
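
The five nearly identical plt.bar calls above differ only in their horizontal offset, so they can also be generated in a loop. The sketch below is one possible refactoring, not the only way to do it: the function name draw_grouped_bars is made up for illustration, the data is the same as above, and the Agg backend is used so nothing needs a display.

```python
import matplotlib
matplotlib.use("Agg")  # render without opening a window
import matplotlib.pyplot as plt
import numpy as np


def draw_grouped_bars(ax, scores, names, colors, bar_width=0.1):
    """Draw one set of bars per series, each shifted right by one more bar width."""
    index = np.arange(len(scores[0]))
    containers = []
    for i, (values, name, color) in enumerate(zip(scores, names, colors)):
        containers.append(
            ax.bar(index + i * bar_width, values, bar_width, color=color, label=name)
        )
    # The middle of n side-by-side bars sits (n - 1) / 2 bar widths right of the first bar
    tick_positions = index + (len(scores) - 1) / 2 * bar_width
    return containers, tick_positions


names = ("BR", "CLR", "ML-LOC", "RAKEL", "MLCBN")
colors = ("#00FF00", "#FF0000", "#0000FF", "#000000", "#6A5ACD")
scores = (
    (0.1624, 0.1678, 0.1698),
    (0.1763, 0.1698, 0.1663),
    (0.1846, 0.1741, 0.1725),
    (0.1835, 0.1824, 0.1752),
    (0.1958, 0.1867, 0.1718),
)

fig, ax = plt.subplots(figsize=(6, 4))
containers, ticks = draw_grouped_bars(ax, scores, names, colors)
ax.set_xticks(ticks)
ax.set_xticklabels(("30", "50", "70"))
ax.set_ylim(0, 1)
ax.legend(loc="center right")
```

Adding a sixth algorithm then only requires one more entry in names, colors and scores; the offsets and tick positions follow automatically.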

The result:
[Figure: grouped bar chart "Hamming Loss Of Different Algorithms"]
PS: the first argument of the legend call (loc) also accepts the following values:

    right
    upper center
    upper right
    center
    center right
    lower left
    upper left
    lower center
    center left
    best
    lower right
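
A short sketch of how these values behave in practice. On its own, loc picks where the legend sits inside the axes. Combined with bbox_to_anchor, loc instead names the point of the legend box that is pinned to the anchor coordinates, which is how a legend can be pushed outside the plot area. The plotted data and the label "demo" are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render without opening a window
import matplotlib.pyplot as plt

# The string values accepted by the loc argument
LEGEND_LOCATIONS = (
    "best", "upper right", "upper left", "lower left", "lower right", "right",
    "center left", "center right", "lower center", "upper center", "center",
)

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], label="demo")

# Try every location in turn; each call replaces the legend created by the previous one
for loc in LEGEND_LOCATIONS:
    ax.legend(loc=loc)

# Pin the legend's center-left point to the right edge of the axes, at half height,
# which places the legend just outside the plot area
outside = ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))
```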

A hands-on guide to solving tricky problems in machine learning
