Introduction: Frequentist vs. Bayesian
\(X\): data
\[X=\begin{pmatrix}x_{1} & x_{2} & \cdots & x_{N}\end{pmatrix}^{T} = \begin{pmatrix}x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{Np}\end{pmatrix}_{N \times p} \]
\(\theta\): parameters
\(x \sim p(x|\theta)\)
Frequentist school
The frequentist school holds that \(\theta\) is an unknown constant and \(X\) is a random variable (\(r.v.\)).
\[\theta_{MLE}= \arg\max_{\theta} \log P(X|\theta) \]
where:
\[L(\theta) = \log P(X|\theta) \]
\[x_{i} \overset{iid}{\sim} p(x|\theta) \]
\[P(X|\theta) = \prod_{i=1}^{N} p(x_{i}|\theta), \qquad \log P(X|\theta) = \sum_{i=1}^{N} \log p(x_{i} | \theta) \]
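As a concrete illustration, the MLE above can be computed in closed form when the model is assumed to be a one-dimensional Gaussian (this model choice is an illustrative assumption, not from the text):

```python
import numpy as np

# A minimal MLE sketch, assuming p(x|theta) is a 1-D Gaussian with
# theta = (mu, sigma^2); maximizing sum_i log p(x_i|theta) gives the
# sample mean and the (biased, divide-by-N) sample variance.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)   # iid draws from p(x|theta)

mu_mle = x.mean()                      # argmax over mu of the log-likelihood
var_mle = ((x - mu_mle) ** 2).mean()   # MLE of sigma^2 (divides by N, not N-1)

print(mu_mle, var_mle)                 # close to the true (2.0, 2.25)
```

With enough samples the estimates concentrate around the true parameters, which is the frequentist sense in which \(\theta\) is a fixed constant being recovered from random data.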
Bayesian school
The Bayesian school holds that \(\theta\) is a random variable (\(r.v.\)) that obeys some distribution \(\theta \sim p(\theta)\). In general, \(p(\theta)\) is called the prior.
Bayes' theorem
\[P(\theta|X) = \frac{P(X|\theta) P(\theta)}{P(X)} \]
where
\(P(\theta|X)\) is the posterior probability,
\(p(\theta)\) is the prior probability,
\(P(X)= \int_{\theta} P(X|\theta)P(\theta)\,d\theta\) is the evidence,
and \(P(X|\theta)\), viewed as a function of \(\theta\), is the likelihood.
MAP: Maximum posterior probability estimate
\[\begin{aligned} \theta_{MAP} &= \arg \max_{\theta} P(\theta|X)\\ &= \arg \max_{\theta} P(X|\theta) P(\theta) \end{aligned} \]
since the denominator \(P(X)\) does not depend on \(\theta\).
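The MLE and MAP estimates can be contrasted on a small example, assuming a Bernoulli likelihood with a Beta prior (both the model and the numbers are illustrative assumptions):

```python
# A minimal sketch contrasting MLE and MAP for a coin-flip model:
# likelihood P(X|theta) is Bernoulli, prior p(theta) is Beta(a, b).
heads, tails = 7, 3      # observed data X (illustrative)
a, b = 2.0, 2.0          # Beta prior hyperparameters (illustrative)

# argmax_theta P(X|theta): the empirical frequency
theta_mle = heads / (heads + tails)

# argmax_theta P(X|theta) P(theta): closed form for the Beta prior
theta_map = (heads + a - 1) / (heads + tails + a + b - 2)

print(theta_mle, theta_map)   # 0.7 vs 8/12; the prior pulls MAP toward 0.5
```

The prior acts like pseudo-counts, shrinking the MAP estimate toward the prior mode; with a flat Beta(1, 1) prior the two estimates coincide.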
Bayesian estimation
\[p(\theta|x) = \frac{p(x|\theta)p(\theta)}{\int_{\theta}p(x|\theta)p(\theta)d\theta} \]
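The normalizing integral in the denominator usually has no closed form, but for a one-dimensional \(\theta\) the whole posterior can be approximated on a grid. A minimal sketch, assuming a Bernoulli likelihood and a uniform prior (both illustrative):

```python
import numpy as np

# Grid approximation of p(theta|x) = p(x|theta)p(theta) / evidence.
theta = np.linspace(1e-6, 1 - 1e-6, 1001)   # grid over the parameter
prior = np.ones_like(theta)                  # p(theta): uniform on [0, 1]
heads, tails = 7, 3                          # observed data (illustrative)
likelihood = theta**heads * (1 - theta)**tails   # p(x|theta)

unnormalized = likelihood * prior
# Riemann-sum approximation of the evidence integral in the denominator
evidence = unnormalized.sum() * (theta[1] - theta[0])
posterior = unnormalized / evidence          # integrates to ~1 on the grid

print(theta[posterior.argmax()])             # posterior mode, near 0.7
```

With a uniform prior the posterior here is Beta(8, 4), so the grid mode should land near its analytic mode \(7/10\); for a multi-dimensional \(\theta\) this grid approach becomes infeasible, which motivates sampling methods instead.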
Bayesian prediction
Sample data: \(X\)
Data to predict: \(\widehat{x}\)
The bridge: \(\theta\)
\[\begin{aligned} p(\widehat{x}|X) &= \int_{\theta}p(\widehat{x},\theta|X)\,d\theta \\ &= \int_{\theta}p(\widehat{x}|\theta)\,p(\theta|X)\,d\theta \end{aligned} \]
where the second step uses \(p(\widehat{x}|\theta,X)=p(\widehat{x}|\theta)\), i.e. new data is conditionally independent of \(X\) given \(\theta\).
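The predictive integral above can be approximated by Monte Carlo: draw \(\theta\) from the posterior, then average \(p(\widehat{x}|\theta)\) over the draws. A minimal sketch, assuming the same illustrative Beta-Bernoulli setup (Beta(1, 1) prior, 7 heads and 3 tails, so the posterior is Beta(8, 4)):

```python
import numpy as np

# Monte Carlo estimate of p(xhat|X) = ∫ p(xhat|theta) p(theta|X) dtheta.
rng = np.random.default_rng(0)
theta_samples = rng.beta(8, 4, size=100_000)   # draws theta ~ p(theta|X)

# For xhat = heads, p(xhat|theta) = theta, so the integral is E[theta|X].
p_heads = theta_samples.mean()
print(p_heads)   # near the exact predictive 8/12
```

Here the integral happens to have the closed form \(8/12\) (the posterior mean), so the Monte Carlo estimate can be checked against it; in general the sampling approach works even when no closed form exists.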