Confidence intervals
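The mean of the sampling distribution of the sample proportion equals the population proportion: $\mu_{\hat{p}} = p$.
To estimate $p$, we draw samples of size $n$ and obtain a series of sample proportions $\hat{p}$. For large $n$, $\hat{p}$ approximately follows a normal distribution with mean $p$ and variance $p(1-p)/n$.
Each $\hat{p}$ yields an interval estimate of $p$. As the number of repetitions grows, the fraction of these intervals that contain $p$ gets closer and closer to 0.95.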
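The repeated-sampling procedure above can be sketched as a small simulation. This is only an illustration under stated assumptions: the true proportion `p = 0.6`, the sample size `n = 200`, the number of repetitions, and the function name `coverage` are all hypothetical choices, and the interval used is the usual normal-approximation interval $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$.

```python
import math
import random

def coverage(p=0.6, n=200, trials=2000, z=1.96, seed=0):
    """Fraction of 95% intervals for a proportion that contain the true p.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute the sample proportion.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        # Standard error estimated from the sample, since p is unknown in practice.
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        lo, hi = p_hat - z * se, p_hat + z * se
        # Count this interval if it captures the true proportion.
        hits += lo <= p <= hi
    return hits / trials

print(coverage())
```

With more repetitions the printed fraction should settle near 0.95, although the normal approximation can leave it slightly off for small `n` or for `p` close to 0 or 1.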
Interpreting confidence levels and confidence intervals
Estimating a population proportion
Significance test (hypothesis testing)
The idea of significance test
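Case background: suppose four people draw lots to decide who does the cleaning.
Null hypothesis: the draw is random.
Probability that Bill is not drawn:
Null hypothesis: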
Examples of null and alternative hypothesis
null hypothesis: things are happening as expected (no difference hypothesis)
alternative hypothesis: the claim, prompted by new evidence, that casts doubt on the null hypothesis
hypotheses are about population parameters that we care about, not sample statistics.
Significance level and p value
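Basic idea of hypothesis testing: to reject or support the null hypothesis, we draw a sample from the population and compute sample statistics. If the probability of obtaining these statistics, or something more extreme, given that the null hypothesis is true (the p-value) is below the threshold we set (the significance level), we say we have evidence to reject the null hypothesis; otherwise we cannot reject it.
For example, the website's background is currently white and people spend 20 min per day on the site on average. To increase that time, I change the background to yellow. How do I know whether the change actually had the intended effect? This is what a significance test answers.
significance level: to test our hypothesis, we draw a sample from the population and compute sample statistics. The significance level is a threshold we set in advance: if the probability of the statistics we obtained (or something more extreme) is below the significance level, we can say we have evidence to reject the null hypothesis; if that probability is above the significance level, we cannot reject the null hypothesis.
p-value: probability value. It can be understood as a conditional probability: the probability, given that the null hypothesis is true, of obtaining a sample mean >= 25 (the probability of getting your statistic or something more extreme given that the null hypothesis is true; it is also the smallest significance level at which the null hypothesis can be rejected).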
wiki-P-value
A P-value is the probability of getting a statistic at least as extreme as the one we observed when the null hypothesis is true.
The p-value is used as an alternative to rejection points to provide the smallest level of significance at which the null hypothesis would be rejected.
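To make the p-value concrete, here is a minimal sketch for the website example, using a one-sided z-test of H0: mean = 20 against Ha: mean > 20 with an observed sample mean of 25. The notes do not give a standard deviation or a sample size, so `sigma = 30` (treated as a known population standard deviation), `n = 100`, `alpha = 0.05`, and the function name `one_sided_p_value` are assumptions made purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_p_value(sample_mean, mu0, sigma, n):
    """P(sample mean >= observed value | H0: mu = mu0) for a one-sided z-test."""
    # Standardize the observed sample mean under the null hypothesis.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Upper-tail probability of the standard normal distribution.
    return 1 - NormalDist().cdf(z)

# sigma and n are hypothetical; 20 and 25 come from the website example.
p = one_sided_p_value(sample_mean=25, mu0=20, sigma=30, n=100)
alpha = 0.05  # significance level, set before looking at the data
print(f"p-value = {p:.4f}, reject H0: {p < alpha}")
```

Under these assumed numbers the p-value falls just below 0.05, so the null hypothesis would be rejected at the 0.05 level but not at 0.01, which matches the reading of the p-value as the smallest significance level at which rejection is possible.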
Error probability and power
type I: rejecting a true null hypothesis (rejecting the true)
type II: failing to reject the null hypothesis when it is false (retaining the false)
Significance level is the probability of rejecting a true null hypothesis (Type I error).
Power of a test is the probability of correctly rejecting a false null hypothesis (1 - P(Type II error)).
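The two definitions above can be checked by simulation: when the null hypothesis is true, the share of tests that reject it should be close to the significance level, and when it is false, the share that reject it estimates the power. The sketch below reuses the website example with a one-sided z-test. The values `sigma = 30`, `n = 100`, the alternative mean of 25, the normality of the sample mean, and the function name `rejection_rate` are all assumptions for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejection_rate(true_mu, mu0=20, sigma=30, n=100, alpha=0.05,
                   trials=20000, seed=0):
    """Share of simulated one-sided z-tests (Ha: mu > mu0) that reject H0."""
    rng = random.Random(seed)
    se = sigma / sqrt(n)
    # Reject H0 when the sample mean exceeds this critical value.
    critical = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Assumption: the sample mean is normal with mean true_mu and sd = se,
    # so each trial draws the sample mean directly.
    rejections = sum(rng.gauss(true_mu, se) > critical for _ in range(trials))
    return rejections / trials

type_i_rate = rejection_rate(true_mu=20)  # H0 true: every rejection is a Type I error
power = rejection_rate(true_mu=25)        # H0 false: every rejection is correct
print(f"Type I error rate ~ {type_i_rate:.3f}, power ~ {power:.3f}")
```

Under these assumed numbers the power comes out at roughly one half, so about half of such experiments would miss a real 5-minute improvement; a larger sample size or a larger true effect would raise it.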
One-tailed and two-tailed