Statistics -- Inferential Statistics Fundamentals

Inferential statistics refers to methods that rely on probability theory, and on distributions in particular, to predict population values based on sample data.
In statistics, when we use the term distribution we usually mean a probability distribution. Common examples are the normal distribution, the binomial distribution, and the uniform distribution.
Distribution: a function that shows the possible values for a variable and how often they occur. A distribution is defined by the underlying probabilities, not by its graph. The graph is just a visual representation.
Discrete uniform distribution: all outcomes have an equal chance of occurring. Each probability distribution has a visual representation.
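As a small illustration of the two points above (a distribution is a function from values to probabilities, and in a discrete uniform distribution every outcome is equally likely), here is a minimal sketch. The fair six-sided die and the variable names are assumptions chosen for the example, not something the notes specify.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die as a discrete uniform distribution.
outcomes = [1, 2, 3, 4, 5, 6]

# The distribution itself is just this mapping from value to probability;
# a bar chart of it would only be a picture of the same information.
distribution = {x: Fraction(1, len(outcomes)) for x in outcomes}

total_probability = sum(distribution.values())

print(distribution[3])      # → 1/6
print(total_probability)    # → 1
```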
The normal distribution is used so often for several reasons. It approximates a wide variety of random variables. Distributions of sample means with large enough sample sizes can be approximated by a normal distribution. All computable statistics are elegant. Decisions based on normal distribution insights have a good track record.
Normal distribution: keep the standard deviation fixed or, in statistical jargon, control for the standard deviation. A lower mean then gives the same shape of distribution, but further to the left on the plane. A bigger mean moves the graph to the right.
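The claim that changing the mean only moves the curve can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the two means, the standard deviation, and the offsets are made-up values for illustration.

```python
from statistics import NormalDist

sigma = 1.0  # standard deviation held fixed ("controlled for")
low_mean = NormalDist(mu=-2.0, sigma=sigma)   # curve sits on the left
high_mean = NormalDist(mu=3.0, sigma=sigma)   # curve sits on the right

# Compare the height of each curve at the same distance from its own mean.
offsets = [-1.5, 0.0, 0.5, 2.0]
low_heights = [low_mean.pdf(low_mean.mean + d) for d in offsets]
high_heights = [high_mean.pdf(high_mean.mean + d) for d in offsets]

for d, a, b in zip(offsets, low_heights, high_heights):
    print(f"offset {d:+.1f}: {a:.4f} vs {b:.4f}")
```

The paired heights agree at every offset, which is what "same shape, different location" means.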
Every distribution can be standardized. Say the mean and the variance of a variable are μ and σ² respectively.
Standardization is the process of transforming this variable into one with a mean of zero and a standard deviation of one.
Logically, a normal distribution can also be standardized. The result is called a standard normal distribution.
The standardized variable is called the z-score. It is equal to the original variable minus its mean, with that difference divided by its standard deviation: z = (x - μ) / σ.
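The standardization step described above can be sketched in a few lines. The helper name `standardize` and the data are hypothetical, and the sketch treats the data as a whole population (so it uses `pstdev`).

```python
from statistics import fmean, pstdev

def standardize(values):
    """Return the z-score (x - mean) / standard deviation for each x."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
z_scores = standardize(data)

# After standardization the mean is (numerically) 0
# and the standard deviation is (numerically) 1.
print(fmean(z_scores))
print(pstdev(z_scores))
```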
Adding the same value to all data points, or subtracting it from them, does not change the standard deviation.
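A quick check of this property, using made-up numbers: shifting every data point by a constant moves the mean but leaves the spread untouched.

```python
from statistics import fmean, pstdev

data = [3.0, 7.0, 7.0, 19.0]             # made-up data
shifted_up = [x + 10.0 for x in data]    # add a constant to every point
shifted_down = [x - 4.0 for x in data]   # subtract a constant from every point

# The means differ, the standard deviations do not.
print(fmean(data), fmean(shifted_up), fmean(shifted_down))
print(pstdev(data), pstdev(shifted_up), pstdev(shifted_down))
```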
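Taking a single value, as we did in descriptive statistics, is definitely suboptimal. Instead, we draw many samples and look at the distribution of their means: the sampling distribution of the mean.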
No matter the underlying distribution of the population, the sampling distribution of the mean approximates a normal distribution, provided the samples are large enough. Not only that, but its mean is the same as the population mean.
Its variance depends on the size of the samples: it is the population variance divided by the sample size, σ²/n. Since the sample size is in the denominator, the bigger the sample size, the lower the variance. In other words, the closer the approximation we get.
This result, the central limit theorem (CLT), allows us to perform tests, solve problems, and make inferences using the normal distribution, even when the population is not normally distributed.
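The three claims above (roughly normal shape, mean equal to the population mean, variance equal to the population variance divided by the sample size) can be illustrated with a small simulation. This is a sketch under assumptions: the exponential population, the seed, and the sample counts are arbitrary choices, and a simulation only suggests the result, it does not prove it.

```python
import random
from statistics import fmean, pstdev, pvariance

rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, which is strongly skewed.
# Its mean is 1 and its variance is 1.
population_mean = 1.0
population_variance = 1.0

sample_size = 50
num_samples = 5000

# Draw many samples and keep the mean of each one.
sample_means = [
    fmean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(num_samples)
]

mean_of_means = fmean(sample_means)
variance_of_means = pvariance(sample_means)

# For a normal distribution, about 68% of values lie within one
# standard deviation of the mean.
sd = pstdev(sample_means)
share_within_one_sd = sum(
    abs(m - mean_of_means) <= sd for m in sample_means
) / num_samples

print(f"mean of sample means:     {mean_of_means:.3f}")
print(f"variance of sample means: {variance_of_means:.4f}")
print(f"population variance / n:  {population_variance / sample_size:.4f}")
print(f"share within one sd:      {share_within_one_sd:.3f}")
```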
Standard error is the standard deviation of the distribution formed by the sample means; in other words, the standard deviation of the sampling distribution. It shows the variability of the sample means. The standard error is used for almost all statistical tests because it shows how well you approximated the true mean.
The standard error is σ divided by the square root of n, that is σ/√n, where σ is the population standard deviation and n is the sample size.
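The formula above translates directly into code. In this sketch the function name `standard_error` and the value of σ are illustrative assumptions.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sigma / math.sqrt(n)

sigma = 12.0  # assumed known population standard deviation (made up)
for n in (4, 16, 64, 256):
    print(n, standard_error(sigma, n))
```

Each line of output is half the previous one: because n sits under a square root, the sample size has to be quadrupled to halve the standard error.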
Estimate: an approximation of a population parameter that depends solely on sample information. A specific value is called an estimate. There are two types of estimates: point estimates and confidence interval estimates.
A point estimate is a single number, while a confidence interval, naturally, is an interval. The two are closely related.
In fact, the point estimate is located exactly in the middle of the confidence interval. However, confidence intervals provide much more information and are preferred when making inferences.
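To make the relationship between the two kinds of estimate concrete, here is a minimal sketch of a z-based confidence interval for the mean. It assumes the population standard deviation is known, which the notes do not state; the function name `mean_confidence_interval`, the data, and σ = 3 are all hypothetical.

```python
import math
from statistics import NormalDist, fmean

def mean_confidence_interval(sample, sigma, confidence=0.95):
    """Point estimate and z-based confidence interval for the mean.

    Assumes the population standard deviation `sigma` is known.
    """
    n = len(sample)
    point_estimate = fmean(sample)
    # Critical value of the standard normal, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)  # z times the standard error
    return point_estimate, (point_estimate - margin, point_estimate + margin)

sample = [98.0, 102.0, 101.0, 97.0, 103.0, 99.0, 100.0, 104.0, 96.0]  # made up
estimate, (low, high) = mean_confidence_interval(sample, sigma=3.0)

print(f"point estimate: {estimate:.2f}")
print(f"95% interval:   ({low:.2f}, {high:.2f})")
```

The interval is built as the point estimate plus or minus a margin, which is why the point estimate sits in its middle.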
Estimators are like judges. We are always looking for the most efficient unbiased estimators.
Unbiased estimator: expected value = population parameter.
The most efficient estimator is the unbiased estimator with the smallest variance.
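Both properties can be illustrated by simulation. The sketch below compares two estimators of the centre of a normal population, the sample mean and the sample median. The choice of estimators, the population parameters, and the seed are assumptions made for the example; the notes themselves do not name a second estimator.

```python
import random
from statistics import fmean, median, pvariance

rng = random.Random(7)           # fixed seed so the run is repeatable
true_mean, true_sd = 10.0, 2.0   # hypothetical normal population
sample_size, num_trials = 25, 4000

mean_estimates, median_estimates = [], []
for _ in range(num_trials):
    sample = [rng.gauss(true_mean, true_sd) for _ in range(sample_size)]
    mean_estimates.append(fmean(sample))
    median_estimates.append(median(sample))

# Unbiased: averaged over many samples, the estimates land near the parameter.
print(fmean(mean_estimates), fmean(median_estimates))

# Efficient: among unbiased estimators, the one whose estimates vary least.
print(pvariance(mean_estimates), pvariance(median_estimates))
```

For this population both estimators centre on the true mean, but the sample mean varies less from sample to sample, so it is the more efficient of the two.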
The word statistic is the broader term: a point estimate is a statistic.
A point estimate is a single number given by an estimator. For example, the formula for the sample mean is a point estimator, and the number it produces for a given sample is the point estimate.
 