Econ 325 (004)
Winter Session, Term 1, 2019
M. Vaney
Lab 2 - Demonstration of the Central Limit Theorem
Due: Monday November 25. Submit your work online.
Purpose
In this lab R is used to demonstrate the Central Limit Theorem, a theorem that provides a
theoretical basis for estimation and inference even for underlying populations that are not
normally distributed. The lab reinforces the use of .do files as an efficient way to execute a
series of commands and the use of loops to automate repetitive tasks. The lab also introduces
a few additional R commands.
Central Limit Theorem
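Given a random sample of size $n$ from an underlying distribution $f(x)$ with
$-\infty < E\{X\} < \infty$ (finite mean, $\mu = E\{X\}$) and $0 < \sigma^2 < \infty$
(finite variance), the sample mean will be distributed as approximately normal with
$\bar{X}_n \overset{a}{\sim} N(\mu, \sigma^2/n)$. This can also be expressed as
$\lim_{n \to \infty} \sqrt{n}\,(\bar{X}_n - \mu)/\sigma \sim N(0, 1)$.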
One implication of this for estimation is that even if the underlying distribution is not normally
distributed, by appealing to the Central Limit Theorem we may treat the sample mean,
$\bar{X}_n$, as an approximately normally distributed random variable. The following figure shows
the underlying distribution of a random variable X as a solid line. Clearly X is not normally
distributed. The random variable X has realizations only over the interval $[0, 3]$ rather than
$(-\infty, \infty)$; X is not symmetric; X is not uni-modal. However, taking random samples of
size n and computing the sample mean for each different random sample, we see that the
distribution of the sample mean (red dashed line) has many of the features characteristic of
a normally distributed random variable (uni-modal, symmetric, bell-shaped).
How closely the sample mean conforms to a normal distribution will depend on features of
the underlying distribution and the sample size. The larger the sample size the more closely
the distribution of the sample mean will resemble a normally distributed random variable.
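The behaviour described above can be previewed with a short simulation before turning to the lab data. The sketch below is only an illustration under stated assumptions: the population is a made-up bimodal mixture restricted to [0, 3] (it is not the distribution shown in the figure), and the names pop and draw_means, along with the seed, are hypothetical.

```r
# Illustrative sketch: a made-up non-normal population on [0, 3].
set.seed(325)  # arbitrary seed so the run is reproducible

# Bimodal mixture of two scaled beta distributions (an assumption for illustration).
pop <- c(3 * rbeta(5000, 2, 8), 3 * rbeta(5000, 8, 2))

# Draw `reps` random samples of size n (without replacement) and
# return the sample mean of each.
draw_means <- function(n, reps = 1000) {
  replicate(reps, mean(sample(pop, n)))
}

means_4   <- draw_means(4)
means_144 <- draw_means(144)

# The sample means centre on the population mean, and their spread
# shrinks as n grows.
c(pop_mean = mean(pop), mean_4 = mean(means_4), mean_144 = mean(means_144))
c(sd_4 = sd(means_4), sd_144 = sd(means_144))

# Two stacked histograms on a common x-axis: n = 4 and n = 144.
op <- par(mfrow = c(2, 1))
hist(means_4, breaks = 30, xlim = c(0, 3), main = "Sample means, n = 4")
hist(means_144, breaks = 30, xlim = c(0, 3), main = "Sample means, n = 144")
par(op)
```

With a population this far from normal, the histogram for n = 4 may still show traces of the two modes, while the one for n = 144 should look much closer to a bell shape.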
Data and Methodology
A number of 'populations' are provided. In order to demonstrate the CLT it will be necessary
to describe the distribution of the sample mean for each of the populations.
Data
The file lab2-variables.csv contains N = 700 observations for each of 5 random variables
(called x1, ..., x5). Each of these can be thought of as a different population with a given
underlying distribution f(x1), g(x2), ..., k(x5).
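Loading and checking the data might look like the sketch below. It assumes that lab2-variables.csv is in the working directory and has a header row naming the columns x1 to x5; if the file is not found, the sketch substitutes a simulated stand-in (with arbitrary distributions) so that it still runs. The name pop_data is hypothetical.

```r
# Sketch: load the lab data, or fall back to a simulated stand-in.
csv_path <- "lab2-variables.csv"  # assumed to sit in the working directory

if (file.exists(csv_path)) {
  pop_data <- read.csv(csv_path)  # assumes a header row
} else {
  # Stand-in only: these distributions are arbitrary, not the lab's.
  set.seed(325)
  pop_data <- data.frame(
    x1 = runif(700, 0, 3),
    x2 = rexp(700),
    x3 = rnorm(700),
    x4 = rchisq(700, df = 2),
    x5 = rbinom(700, size = 1, prob = 0.3)
  )
}

dim(pop_data)    # number of rows and columns
str(pop_data)    # variable names and types
head(pop_data)   # first few observations
```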
Methods
Use R to carry out the following tasks:
1. (a) Generate summary statistics and create histograms for each of the 5 variables.
(b) Draw 1000 random samples of size n = 4, 25 and 144 for each of the random
variables (without replacement). Compute the sample mean for each random
sample and construct a histogram of the sample means.
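Task (a) could be approached along the following lines. This is a minimal sketch using base R only; pop_data is a hypothetical data frame, simulated here as a stand-in for the lab file, and the choice of 30 bins is arbitrary.

```r
# Stand-in population data (arbitrary distributions, for illustration only).
set.seed(325)
pop_data <- data.frame(
  x1 = runif(700, 0, 3),
  x2 = rexp(700),
  x3 = rnorm(700)
)

# Summary statistics: min, quartiles, median, mean and max for every column.
summary(pop_data)

# Standard deviation of each variable.
pop_sd <- sapply(pop_data, sd)
pop_sd

# One histogram per variable, drawn in a loop.
for (v in names(pop_data)) {
  hist(pop_data[[v]], breaks = 30, main = paste("Histogram of", v), xlab = v)
}
```

Looping over names(pop_data) means the same code handles all five variables once the real data are loaded.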
R commands
This lab will make use of some commands that are found in two additional packages available
in R: dplyr and ggplot2. Both of these packages must be loaded in R. You can check to
see which packages are loaded by selecting the packages tab in the lower right corner of the
screen. If a package has not been installed, the following command can be entered in the
console:
install.packages("ggplot2")
The ggplot2 package will then be installed (it may take a minute or two).
In order to make use of the additional commands available in a package, your script file
must refer to the packages through a library command. It is best to start the script with
specification of the required packages:
library(ggplot2)
library(dplyr)
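One way to avoid reinstalling packages on every run is to check first which ones are missing. The sketch below uses requireNamespace() from base R; the variable names needed, is_installed and missing_pkgs are hypothetical, and the install line is left commented out so that the script changes nothing unless you choose to run it.

```r
# Packages this lab relies on.
needed <- c("ggplot2", "dplyr")

# requireNamespace() returns TRUE if a package is installed and can be loaded.
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install the missing packages
} else {
  library(ggplot2)
  library(dplyr)
}
```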
The dplyr package has a number of commands that are useful for re-organizing data. The
command that we will use in this lab is sample_n(data, sample size).
The ggplot2 package is used for making various graphs and figures. A very useful resource
for creating histograms in ggplot2 can be found at the link provided in the Lab folder on
Canvas.
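As a starting point, a histogram in ggplot2 might be built as in the sketch below. The data frame pop_data is a simulated stand-in, and the bin count, colours and labels are arbitrary choices for illustration rather than lab requirements.

```r
library(ggplot2)

# Stand-in data (arbitrary distribution, for illustration only).
set.seed(325)
pop_data <- data.frame(x1 = rexp(700))

# Map x1 to the x-axis and add a histogram layer with 30 bins.
p <- ggplot(pop_data, aes(x = x1)) +
  geom_histogram(bins = 30, fill = "grey70", colour = "black") +
  labs(title = "Population distribution of x1", x = "x1", y = "Count")

print(p)  # an explicit print() is needed inside loops and functions
```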
The sample_n() command will draw a single random sample (of rows of a dataset) of a
specific size, n. To generate 1000 random samples, the sample_n() command, along with a
command to take the mean, can be embedded in the command replicate(), which will repeat
these commands a specified number of times.
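The approach just described might be sketched as follows. The stand-in population and the names pop_data, one_sample, sample_means and means_by_n are hypothetical; only sample_n(), replicate() and taking the mean come from the text above.

```r
library(dplyr)

# Stand-in population (arbitrary distribution, for illustration only).
set.seed(325)
pop_data <- data.frame(x1 = rexp(700))

# One random sample of 25 rows, drawn without replacement (the default).
one_sample <- sample_n(pop_data, 25)
mean(one_sample$x1)

# Repeat the draw-and-average step 1000 times.
sample_means <- replicate(1000, mean(sample_n(pop_data, 25)$x1))

# Loop over the three sample sizes used in the lab.
sizes <- c(4, 25, 144)
means_by_n <- lapply(sizes, function(n) {
  replicate(1000, mean(sample_n(pop_data, n)$x1))
})
names(means_by_n) <- paste0("n", sizes)

# Histogram of the sample means for each sample size.
for (nm in names(means_by_n)) {
  hist(means_by_n[[nm]], breaks = 30,
       main = paste("Sample means,", nm), xlab = "Sample mean of x1")
}
```

Each call to sample_n() inside replicate() draws a fresh sample, so sample_means holds 1000 independent sample means.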
Results and Discussion
Submit your .do file for this lab. Do not submit raw data.
Present and provide some discussion of the following:
1. (a) Consider the summary statistics and graphics for the underlying populations. Do
the underlying distributions appear to be Normally distributed? Comment on
the apparent distributions of each of the variables (symmetric, skewed, number
of modes, difference between mean and median, etc.).
(b) Discuss how changing the size of the sample alters the distribution of the sample
mean for each of the different variables. Do the results conform with the prediction
of the Central Limit Theorem?
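One way to check the results against the theorem is to compare the standard deviation of the simulated sample means with the value sigma/sqrt(n) that the CLT predicts. Because the samples are drawn without replacement from a finite population of N = 700, the sketch below also applies the standard finite population correction sqrt((N - n)/(N - 1)); that correction is an addition here, not something the lab asks for. The population x is a simulated stand-in, base R's sample() is used in place of sample_n() to keep the sketch free of package dependencies, and the names compare_sd and results are hypothetical.

```r
# Stand-in population (arbitrary distribution, for illustration only).
set.seed(325)
x <- rexp(700)
N <- length(x)

# Population standard deviation (divides by N, not N - 1).
sigma <- sqrt(mean((x - mean(x))^2))

# For a given n, compare the simulated spread of the sample mean
# with the CLT value, with and without the finite population correction.
compare_sd <- function(n, reps = 1000) {
  means <- replicate(reps, mean(sample(x, n)))  # without replacement
  c(n = n,
    simulated = sd(means),
    clt = sigma / sqrt(n),
    clt_fpc = sigma / sqrt(n) * sqrt((N - n) / (N - 1)))
}

results <- t(sapply(c(4, 25, 144), compare_sd))
round(results, 3)
```

For n = 4 the correction is negligible, while for n = 144 the sample is a sizeable share of the 700 observations, so the simulated standard deviation should sit closer to the corrected value than to sigma/sqrt(n).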
