TextBlob in Practice: Naive Bayes Text Classification

Copyright notice: please credit the source when reposting. Thanks! https://blog.csdn.net/m0_37306360/article/details/85873907

For more regularly updated study notes, follow:
Zhihu: https://www.zhihu.com/people/yuquanle/columns
WeChat subscription account: AI小白入门
ID: StudyForAI



Text classification with TextBlob

1. Prepare the datasets: a training set and a test set

# Each sample is a (text, label) tuple; labels are 'pos' or 'neg'.
train = [
    ('I love this sandwich.', 'pos'),
    ('this is an amazing place!', 'pos'),
    ('I feel very good about these beers.', 'pos'),
    ('this is my best work.', 'pos'),
    ("what an awesome view", 'pos'),
    ('I do not like this restaurant', 'neg'),
    ('I am tired of this stuff.', 'neg'),
    ("I can't deal with this", 'neg'),
    ('he is my sworn enemy!', 'neg'),
    ('my boss is horrible.', 'neg')
]
test = [
    ('the beer was good.', 'pos'),
    ('I do not enjoy my job', 'neg'),
    ("I ain't feeling dandy today.", 'neg'),
    ("I feel amazing!", 'pos'),
    ('Gary is a friend of mine.', 'pos'),
    ("I can't believe I'm doing this.", 'neg')
]
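
Before training, it can help to confirm that every sample really is a (text, label) pair and to see how the labels are distributed. The following is only a sketch: the `label_counts` helper and the three-item `sample` list (taken from the data above) are hypothetical and not part of TextBlob.

```python
from collections import Counter


def label_counts(dataset):
    """Validate (text, label) pairs and count how often each label occurs."""
    counts = Counter()
    for item in dataset:
        text, label = item  # raises ValueError if the item is not a pair
        if not isinstance(text, str):
            raise TypeError(f"expected a string, got {type(text).__name__}")
        counts[label] += 1
    return counts


# A small hypothetical subset of the samples, in the same format.
sample = [
    ('I love this sandwich.', 'pos'),
    ('my boss is horrible.', 'neg'),
    ('the beer was good.', 'pos'),
]
print(label_counts(sample))
```

Running the same check on the full train and test lists shows that both are evenly split between 'pos' and 'neg'.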

2. Import the naive Bayes classifier

from textblob.classifiers import NaiveBayesClassifier

3. Train the classifier on the training set

nb_model = NaiveBayesClassifier(train)
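
To make the training step less of a black box, here is a minimal sketch of the idea behind a naive Bayes text classifier: count how often each word appears under each label, then score a new sentence with the label prior times the smoothed per-word likelihoods. This is not TextBlob's implementation (TextBlob builds on NLTK and has its own feature extraction). The names `tokenize` and `TinyNaiveBayes`, the whitespace tokenizer, the add-one smoothing, and the four-sentence training subset are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: split on whitespace, strip edge punctuation, lowercase.
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return [w for w in words if w]


class TinyNaiveBayes:
    """Illustrative multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self, labeled_texts):
        self.label_counts = Counter()            # number of documents per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()
        for text, label in labeled_texts:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def log_scores(self, text):
        total_docs = sum(self.label_counts.values())
        words = [w for w in tokenize(text) if w in self.vocab]  # skip unseen words
        scores = {}
        for label, n_docs in self.label_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                # Smoothed log likelihood of the word given the label.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# A small subset of the training data above, to keep the sketch short.
tiny_train = [
    ('I love this sandwich.', 'pos'),
    ('this is an amazing place!', 'pos'),
    ('I do not like this restaurant', 'neg'),
    ('my boss is horrible.', 'neg'),
]
tiny_model = TinyNaiveBayes(tiny_train)
print(tiny_model.classify("an amazing sandwich"))
print(tiny_model.classify("my boss is horrible"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.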

4. Predict the label of a new sample

dev_sen = "This is an amazing library!"
print(nb_model.classify(dev_sen))
# Output: pos

You can also compute the probability that the sample belongs to a given class

dev_sen_prob = nb_model.prob_classify(dev_sen)
print(dev_sen_prob.prob("pos"))
# Output: 0.980117820324005
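
Conceptually, a class probability like this comes from normalizing the per-label scores so that they sum to 1. The sketch below illustrates that step only. The `normalize_log_scores` helper and the input scores are hypothetical and are not taken from TextBlob or from the model above.

```python
import math


def normalize_log_scores(log_scores):
    """Turn per-label log scores into probabilities that sum to 1."""
    top = max(log_scores.values())  # subtract the max for numerical stability
    unnormalized = {label: math.exp(s - top) for label, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {label: value / total for label, value in unnormalized.items()}


# Hypothetical unnormalized scores: 'pos' is three times as likely as 'neg'.
probs = normalize_log_scores({"pos": math.log(0.03), "neg": math.log(0.01)})
print(round(probs["pos"], 2), round(probs["neg"], 2))
```

With two labels, the probability of 'neg' is simply one minus the probability of 'pos'.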

5. Compute the model's accuracy on the test set

print(nb_model.accuracy(test))
# Output: 0.8333333333333334
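
Accuracy here is the fraction of test samples whose predicted label matches the gold label, so 0.8333 on six samples means five were classified correctly. A minimal sketch follows. The `accuracy` helper and the `predicted` list are hypothetical, since the source does not say which test sentence was misclassified. The `gold` labels are those of the test set above.

```python
def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


gold = ['pos', 'neg', 'neg', 'pos', 'pos', 'neg']       # labels of the test set
predicted = ['pos', 'neg', 'neg', 'pos', 'neg', 'neg']  # hypothetical: one error
print(accuracy(predicted, gold))
```

With only six test samples, a single error moves the score by about 17 percentage points, so this figure is a rough indication rather than a reliable estimate.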

