1. Bayesian decision theory
1. Bayes' theorem
(1) Bayes' theorem:
Omitted.
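(2) Naive Bayes classifier:
The Naive Bayes classifier assumes that the attributes are conditionally independent of one another given the label. Let the input be a with attributes a_1, ..., a_n, and let ω be the set of all labels, with elements ω_1, ..., ω_m. Since the denominator P(a_1, ..., a_n) is the same for every label, it can be dropped from the maximization, and the classifier is: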
$$
\begin{aligned}
\omega &= \underset{\omega_i \in \omega}{\arg\max}\; P(\omega_i \mid a_1, a_2, \ldots, a_n) \\
&= \underset{\omega_i \in \omega}{\arg\max}\; \frac{P(a_1, a_2, \ldots, a_n \mid \omega_i)\, P(\omega_i)}{P(a_1, a_2, \ldots, a_n)} \\
&= \underset{\omega_i \in \omega}{\arg\max}\; P(a_1, a_2, \ldots, a_n \mid \omega_i)\, P(\omega_i) \\
&= \underset{\omega_i \in \omega}{\arg\max}\; P(\omega_i) \prod_{j=1}^{n} P(a_j \mid \omega_i)
\end{aligned}
$$
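The decision rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `naive_bayes_predict`, the table layout, and the toy probabilities are assumptions made for this example and do not come from the notes. It sums logarithms instead of multiplying probabilities, which yields the same arg max while avoiding floating-point underflow, and it assumes every probability in the tables is strictly positive.

```python
import math


def naive_bayes_predict(priors, cond_probs, attributes):
    """Return the label maximizing P(label) * prod_j P(a_j | label).

    priors:     dict mapping label -> P(label)
    cond_probs: dict mapping label -> list of dicts, where
                cond_probs[label][j][value] = P(a_j = value | label)
    attributes: list of observed attribute values a_1 .. a_n
    """
    best_label, best_score = None, -math.inf
    for label, prior in priors.items():
        # Log of the prior plus the sum of log-likelihoods: same arg max
        # as the product form. Assumes all probabilities are > 0.
        score = math.log(prior)
        for j, value in enumerate(attributes):
            score += math.log(cond_probs[label][j][value])
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy tables with two labels and two attributes.
toy_priors = {"spam": 0.4, "ham": 0.6}
toy_cond_probs = {
    "spam": [{"yes": 0.8, "no": 0.2}, {"short": 0.7, "long": 0.3}],
    "ham": [{"yes": 0.1, "no": 0.9}, {"short": 0.4, "long": 0.6}],
}

print(naive_bayes_predict(toy_priors, toy_cond_probs, ["yes", "short"]))
print(naive_bayes_predict(toy_priors, toy_cond_probs, ["no", "long"]))
```

For the input `["yes", "short"]` the unnormalized scores are 0.4 × 0.8 × 0.7 = 0.224 for `spam` and 0.6 × 0.1 × 0.4 = 0.024 for `ham`, so `spam` is chosen. The denominator P(a_1, ..., a_n) never needs to be computed.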
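(3) Example (text classification):
The Bag of Words model holds that the content of an article is determined mainly by how often different words occur in it and has little to do with word order. Let V be the set of commonly used words, V_k the k-th word in V, a_j the j-th word of the article, n_i the total number of words across all articles labeled ω_i, and n_ik the number of times V_k occurs in those articles. In the formula below, m is the number of words in the article (not the number of labels), and adding 1 to the numerator and |V| to the denominator is add-one (Laplace) smoothing, so a word never seen under ω_i does not force the whole product to zero. Then: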
$$
\omega = \underset{\omega_i \in \omega}{\arg\max}\; P(\omega_i) \prod_{j=1}^{m} P(a_j = V_k \mid \omega_i) = \underset{\omega_i \in \omega}{\arg\max}\; P(\omega_i) \prod_{j=1}^{m} \frac{n_{ik} + 1}{n_i + |V|}
$$
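A minimal Python sketch of this smoothed text classifier follows. Several choices here are assumptions rather than part of the notes: the function names `train` and `classify`, the toy corpus, building V from every word seen in training, estimating P(ω_i) as the fraction of training articles with label ω_i, and skipping words that are not in V. Scores are accumulated in log space to avoid underflow on long articles.

```python
import math
from collections import Counter


def train(docs):
    """Collect counts from docs, a list of (label, list_of_words) pairs."""
    label_counts = Counter()
    word_counts = {}  # label -> Counter of word occurrences (n_ik)
    for label, words in docs:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
    # Assumption: V is every word seen in training.
    vocab = {w for counts in word_counts.values() for w in counts}
    # Assumption: P(label) is the fraction of training articles with it.
    priors = {label: c / len(docs) for label, c in label_counts.items()}
    # n_i: total number of words over all articles with this label.
    totals = {label: sum(c.values()) for label, c in word_counts.items()}
    return priors, word_counts, totals, vocab


def classify(words, priors, word_counts, totals, vocab):
    """Return arg max over labels of P(label) * prod (n_ik + 1) / (n_i + |V|)."""
    best_label, best_score = None, -math.inf
    for label, prior in priors.items():
        score = math.log(prior)
        for w in words:
            if w not in vocab:
                continue  # assumption: ignore words outside V
            n_ik = word_counts[label][w]  # Counter yields 0 if unseen
            score += math.log((n_ik + 1) / (totals[label] + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus.
toy_docs = [
    ("sports", ["ball", "goal", "team"]),
    ("sports", ["team", "win", "goal"]),
    ("tech", ["code", "bug", "team"]),
    ("tech", ["code", "compile", "bug"]),
]
model = train(toy_docs)
print(classify(["goal", "team"], *model))
print(classify(["code", "bug"], *model))
```

In this toy corpus |V| = 7 and n_i = 6 for both labels, so for `["goal", "team"]` the smoothed product is (3/13)(3/13) under `sports` versus (1/13)(2/13) under `tech`, and `sports` wins. Without the +1 term, the word "goal" would give `tech` a probability of exactly zero.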