Confidence intervals and significance tests

Confidence intervals

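The mean of the sampling distribution of the sample proportion equals the population proportion, $\mu_{\hat{p}}=p$.
To estimate $p$, we repeatedly draw samples of size $100$ and obtain a series of $\hat{p}$ values. Provided $np$ and $n(1-p)$ are both reasonably large, $\hat{p}$ is approximately normally distributed with mean $p$ and variance $\sigma^2/n$, where $\sigma^2=p(1-p)$ is the population variance, so $\sigma_{\hat{p}}=\sqrt{p(1-p)/n}$.
$P(p-2\sigma_{\hat{p}}\leq\hat{p}\leq p+2\sigma_{\hat{p}})\approx 0.95$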

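Rearranging the inequality, each observed $\hat{p}$ gives an interval estimate $\hat{p}\pm 2\sigma_{\hat{p}}$ for $p$. As the number of repeated samples grows, the fraction of these intervals that contain $p$ gets closer and closer to 0.95.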
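The coverage claim can be checked with a small simulation. This is only a sketch: the true proportion `p=0.6`, the number of trials, and the seed are made-up values, and because $\sigma_{\hat{p}}$ is unknown in practice the code substitutes the standard error $\sqrt{\hat{p}(1-\hat{p})/n}$.

```python
import math
import random

def coverage(p=0.6, n=100, trials=2000, seed=0):
    """Fraction of intervals p_hat +/- 2*SE that contain the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute the sample proportion.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        # Assumption: estimate sigma_p_hat with the standard error.
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - 2 * se <= p <= p_hat + 2 * se:
            hits += 1
    return hits / trials

print(coverage())  # should land close to 0.95
```

Any single interval either contains $p$ or it does not; the 0.95 describes the long-run share of intervals that do.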

Interpreting confidence levels and confidence intervals

Estimating a population proportion


Significance test (hypothesis testing)

The idea of significance test

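Background: suppose four people draw lots to decide who does the cleaning.
Null hypothesis: the draw is random.
We then compute the probability that Bill is not drawn, assuming the null hypothesis is true.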
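As a rough illustration of this calculation: if the draw is fair and the rounds are independent, Bill is skipped with probability $3/4$ in each round. The numbers of rounds below are hypothetical, and the function name `prob_never_drawn` is made up for this sketch.

```python
def prob_never_drawn(people: int, rounds: int) -> float:
    """Probability that one given person is never drawn in `rounds` fair draws."""
    # Under the null hypothesis each draw is fair and independent, so a
    # given person is skipped with probability (people - 1) / people per round.
    return ((people - 1) / people) ** rounds

for rounds in (1, 3, 12):  # hypothetical numbers of rounds
    print(rounds, prob_never_drawn(4, rounds))
```

The more rounds Bill escapes, the smaller this probability becomes, and the harder it is to keep believing that the draw is random.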

Null hypothesis: $P(\text{new test accurate})=0.99$

Examples of null and alternative hypothesis

null hypothesis: things are happening as expected (no difference hypothesis)
alternative hypothesis: the claim that there is new evidence giving reason to doubt the null hypothesis

Hypotheses are about the population parameters that we care about, not sample statistics.

Significance level and p value

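The basic idea of hypothesis testing: to decide whether to reject the null hypothesis, we draw a sample from the population and compute the sample statistics. If the probability of obtaining these sample statistics, or something more extreme, given that the null hypothesis is true (the p-value) is below a threshold we set in advance (the significance level), we say we have evidence to reject the null hypothesis; otherwise we cannot reject it.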
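For example, the website currently has a white background, and people spend an average of 20 min per day on it. To increase that time, I change the background to yellow. How do we know whether the change actually had the intended effect? This is the question a significance test answers.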

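significance level: to test our hypothesis, we draw a sample from the population and compute the sample statistics. The significance level is a threshold we set in advance: if the probability of obtaining our statistic (or something more extreme), given that the null hypothesis is true, is below the significance level, we have evidence to reject the null hypothesis; if that probability is above the significance level, we cannot reject it.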


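p-value: probability value. It can be understood as a conditional probability: in this example, the probability of obtaining a sample mean >= 25 given that the null hypothesis is true (the probability you get your statistic or something more extreme given the null hypothesis is true; it is also the smallest significance level at which the null hypothesis can be rejected).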
Reference: Wikipedia, P-value
A p-value is the probability of getting a statistic at least as extreme as the one we observed when the null hypothesis is true.
The p-value is used as an alternative to rejection points to provide the smallest level of significance at which the null hypothesis would be rejected.
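The p-value definition can be made concrete with a one-sided z-test sketch. The null mean of 20 and the observed sample mean of 25 are taken from the text; the sample standard deviation `sample_sd=16` and sample size `n=40` are invented for illustration, and the normal approximation is an assumption.

```python
import math
from statistics import NormalDist

def one_sided_p_value(sample_mean, null_mean, sample_sd, n):
    """P(sample mean >= observed | H0), using a normal approximation."""
    standard_error = sample_sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    return 1 - NormalDist().cdf(z)

# sample_sd and n are hypothetical values, not taken from the source.
p_value = one_sided_p_value(sample_mean=25, null_mean=20, sample_sd=16, n=40)

for alpha in (0.05, 0.01):
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"alpha={alpha}: p-value={p_value:.4f} -> {decision}")
```

With these made-up inputs the same data lead to rejection at a significance level of 0.05 but not at 0.01, which illustrates why the p-value is the smallest significance level at which the null hypothesis would be rejected.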


Error probability and power

error
Type I: rejecting a true null hypothesis (rejecting the truth)
Type II: failing to reject the null hypothesis when it is false (retaining a falsehood)
The significance level is the probability of rejecting a true null hypothesis (Type I error).
The power of a test is the probability of correctly rejecting a false null hypothesis: $1-P(\text{Type II error})$.
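Both error rates can be estimated by simulation. The sketch below assumes normally distributed data with a known standard deviation and a one-sided z-test; the values `sd=16`, `n=40`, and the alternative mean of 25 are hypothetical, and `rejection_rate` is a name made up for this example.

```python
import math
import random
from statistics import NormalDist

def rejection_rate(true_mean, null_mean=20.0, sd=16.0, n=40, alpha=0.05,
                   trials=2000, seed=0):
    """Share of simulated samples for which a one-sided z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # rejection point for alpha
    se = sd / math.sqrt(n)                    # assumes sd is known
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        if (sample_mean - null_mean) / se > z_crit:
            rejections += 1
    return rejections / trials

type_1_rate = rejection_rate(true_mean=20.0)  # H0 true: should be near alpha
power = rejection_rate(true_mean=25.0)        # H0 false: estimates the power
print(type_1_rate, power, 1 - power)          # last value estimates P(Type II error)
```

Lowering `alpha` reduces the Type I error rate but also lowers the power, while increasing `n` raises the power at a fixed `alpha`.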