Hung-yi Lee Machine Learning Notes 4

  • Bayes' theorem:
    $$P\left(A_{i} \mid B\right)=\frac{P\left(B \mid A_{i}\right) P\left(A_{i}\right)}{\sum_{j=1}^{n} P\left(B \mid A_{j}\right) P\left(A_{j}\right)}$$

    • P(A|B) is the probability that A occurs given that B has occurred, i.e. the conditional probability of A given B; because it is conditioned on the observed value of B, it is called the posterior probability of A
    • P(A) is the prior (or marginal) probability of A ("prior" in the sense that it takes no information about B into account)
    • P(B) is the prior (or marginal) probability of B; it also serves as a normalizing constant
    • P(B|A) is the conditional probability of B given that A has occurred; viewed as a function of A, it is the likelihood
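The four quantities above can be tied together in a short numeric sketch. The medical-test scenario and all numbers below (1% prevalence, 90% true-positive rate, 5% false-positive rate) are made-up illustration values, not from the source:

```python
# Bayes' theorem for a binary hypothesis A vs. not-A:
# P(A|B) = P(B|A) P(A) / [ P(B|A) P(A) + P(B|~A) P(~A) ]
def posterior(prior, likelihood, likelihood_not_a):
    # prior          = P(A)        (e.g. P(disease) = 0.01)
    # likelihood     = P(B|A)      (e.g. P(positive | disease) = 0.9)
    # likelihood_not_a = P(B|~A)   (e.g. P(positive | healthy) = 0.05)
    evidence = likelihood * prior + likelihood_not_a * (1 - prior)  # P(B), the normalizer
    return likelihood * prior / evidence

# Even with a fairly accurate test, a rare disease yields a modest posterior.
print(round(posterior(0.01, 0.9, 0.05), 4))
```

Note how the denominator P(B) plays exactly the "normalizing constant" role described above.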
  • Naive Bayes (see Wikipedia)
    Using Bayes’ theorem, the conditional probability can be decomposed as
    $$p\left(C_{k} \mid \mathbf{x}\right)=\frac{p\left(C_{k}\right) p\left(\mathbf{x} \mid C_{k}\right)}{p(\mathbf{x})}$$
    The denominator does not depend on $C_k$ and is effectively constant; the numerator is the joint probability model $p\left(C_{k}, x_{1}, \ldots, x_{n}\right)$, which the chain rule expands as
    $$\begin{aligned} p\left(C_{k}, x_{1}, \ldots, x_{n}\right) &=p\left(x_{1}, \ldots, x_{n}, C_{k}\right) \\ &=p\left(x_{1} \mid x_{2}, \ldots, x_{n}, C_{k}\right) p\left(x_{2}, \ldots, x_{n}, C_{k}\right) \\ &=p\left(x_{1} \mid x_{2}, \ldots, x_{n}, C_{k}\right) p\left(x_{2} \mid x_{3}, \ldots, x_{n}, C_{k}\right) p\left(x_{3}, \ldots, x_{n}, C_{k}\right) \\ &=\cdots \\ &=p\left(x_{1} \mid x_{2}, \ldots, x_{n}, C_{k}\right) p\left(x_{2} \mid x_{3}, \ldots, x_{n}, C_{k}\right) \cdots p\left(x_{n-1} \mid x_{n}, C_{k}\right) p\left(x_{n} \mid C_{k}\right) p\left(C_{k}\right) \end{aligned}$$
    The naive conditional independence assumption states that all features in $\mathbf{x}$ are mutually independent, conditional on the category $C_k$:
    $$p\left(x_{i} \mid x_{i+1}, \ldots, x_{n}, C_{k}\right)=p\left(x_{i} \mid C_{k}\right)$$
    Thus the joint model can be expressed as
    $$\begin{aligned} p\left(C_{k} \mid x_{1}, \ldots, x_{n}\right) & \propto p\left(C_{k}, x_{1}, \ldots, x_{n}\right) \\ &=p\left(C_{k}\right) p\left(x_{1} \mid C_{k}\right) p\left(x_{2} \mid C_{k}\right) p\left(x_{3} \mid C_{k}\right) \cdots \\ &=p\left(C_{k}\right) \prod_{i=1}^{n} p\left(x_{i} \mid C_{k}\right) \end{aligned}$$
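The final product formula can be implemented directly. A minimal sketch of a Naive Bayes classifier over discrete features, with add-one (Laplace) smoothing of the per-feature likelihoods; the tiny weather-style dataset is made up for illustration:

```python
from collections import Counter, defaultdict

def train(X, y):
    priors = Counter(y)            # class counts -> p(C_k) after dividing by n
    counts = defaultdict(Counter)  # (class, feature index) -> value counts
    vocab = defaultdict(set)       # feature index -> set of observed values
    for xs, c in zip(X, y):
        for i, v in enumerate(xs):
            counts[(c, i)][v] += 1
            vocab[i].add(v)
    return priors, counts, vocab

def predict(priors, counts, vocab, xs):
    n = sum(priors.values())
    best, best_p = None, -1.0
    for c, nc in priors.items():
        p = nc / n                                  # prior p(C_k)
        for i, v in enumerate(xs):                  # product of p(x_i | C_k)
            # add-one smoothing so unseen values don't zero out the product
            p *= (counts[(c, i)][v] + 1) / (nc + len(vocab[i]))
        if p > best_p:
            best, best_p = c, p
    return best

X = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
y = ["no", "no", "yes", "yes"]
model = train(X, y)
print(predict(*model, ("rain", "mild")))  # -> yes
```

Because the denominator $p(\mathbf{x})$ is the same for every class, the classifier simply compares the unnormalized products $p(C_k)\prod_i p(x_i \mid C_k)$ across classes.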

  • Differences between logistic regression (LR) and linear regression:
    Both logistic regression and linear regression are generalized linear models.
    The optimization objective of the linear model is least squares, whereas logistic regression maximizes the likelihood function.
    Linear regression outputs a continuous real value; LR's output is mapped into [0, 1] by the sigmoid function and converted into a class label by thresholding.
    Linear regression aims to fit the training data, predicting the result as a linear weighted combination of features; logistic regression trains a maximum-likelihood classifier.
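The contrast above can be shown on a single weighted sum: both models compute the same linear part, but linear regression reads it directly as the prediction, while logistic regression squashes it through the sigmoid and thresholds. The weights and inputs below are arbitrary illustration values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = [0.8, -0.5], 0.1
x = [2.0, 1.0]
z = sum(wi * xi for wi, xi in zip(w, x)) + b  # shared linear part: w . x + b

linear_pred = z                    # linear regression: continuous real output
prob = sigmoid(z)                  # logistic regression: value in (0, 1)
label = 1 if prob >= 0.5 else 0    # threshold at 0.5 to get a class

print(linear_pred, round(prob, 3), label)
```

This is why the two also train differently: the continuous output is naturally scored by squared error, while the probability output is naturally scored by the likelihood of the observed labels.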


Reposted from blog.csdn.net/weixin_37409506/article/details/90573020