MATH 185 – Take-Home Exam 1


Due Sunday, May 5th, by 11:59 PM
AGREEMENT
By taking this exam, you agree not to discuss the exam with anyone, starting now,
whether a classmate or anyone else, and whether in person or through other means,
including electronic ones. Please do not post questions on Piazza. Unless otherwise
specified, it is acceptable to copy-paste from the lecture or homework solution code.
Problem 1. (Student vs Wilcoxon) Suppose we have a numerical sample of size n which we
assume was generated iid from an unknown underlying distribution F with a well-defined mean μ.
Student’s t-test is a test about the mean: Is μ equal to a given value μ0?
Wilcoxon’s signed-rank test is a test for symmetry: Is F symmetric about a given μ0?
That being said, the t-test can be used to test whether F is symmetric about μ0, based on the
fact that ‘symmetric about μ0’ implies that ‘the mean is equal to μ0’. However, the two are not
equivalent, so that the t-test is not consistent against all alternatives. Conversely, for the signed-rank
test to be useful as a test about the mean, we need to assume that F is symmetric about its
mean. With this additional (and nontrivial) assumption on F, testing for symmetry about μ0 is
equivalent to testing whether the mean is equal to μ0. (Convince yourself of that.) In what follows,
we place ourselves in that situation, so that we can directly compare the two tests. There is some
theory on that. For example, it is known that when F is a normal distribution, in which case the
t-test achieves the most power asymptotically (meaning in the large-sample limit), the signed-rank
test performs almost as well. We want to evaluate that with simulations.
Since both tests are scale-free, we may take F to be the normal distribution with mean μ
and variance 1. We consider the two-sided setting where we test μ = 0 versus μ ≠ 0. For each
n ∈ {10, 20, 50, 100, 200, 500} do the following. For each μ in a grid of your choice, denoted M and
of size 10, generate X1, ..., Xn ~ N(μ, 1) and apply the t-test and signed-rank test, both set at
level α = 0.10. Record whether they reject or not. Repeat this B = 1,000 times and compute the
fraction of times each test rejects. This estimates the power of each test against the alternative μ.
The end result is a plot in which the estimated power curves of the two tests are overlaid.
Use colors and a legend to identify the two curves. Make sure to choose M so that we can see the
power go from about α to about 1, zooming in on the action.
Note. When this problem is completed, you will have generated 6 plots altogether, each with
the estimated power curves for the two tests.
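The simulation loop described above can be sketched in R as follows. This is only an illustration for a single sample size, not a reference solution. The function name estimate_power, the seed, the colors, and the grid 0 to 4/sqrt(n) are all assumptions introduced here; the grid is one plausible choice for seeing the power rise from about α to about 1, and the plot uses base graphics.

```r
# Minimal sketch: estimated power of the t-test and the signed-rank test
# for one sample size n. Names and grid are illustrative assumptions.
set.seed(185)  # arbitrary seed, for reproducibility only

estimate_power <- function(n, mu_grid, B = 1000, alpha = 0.10) {
  pw <- sapply(mu_grid, function(mu) {
    # Each replicate records whether each test rejects mu = 0 at level alpha
    rejections <- replicate(B, {
      x <- rnorm(n, mean = mu, sd = 1)
      c(t        = t.test(x, mu = 0)$p.value <= alpha,
        wilcoxon = wilcox.test(x, mu = 0)$p.value <= alpha)
    })
    rowMeans(rejections)  # fraction of rejections for each test
  })
  data.frame(mu = mu_grid, t = pw["t", ], wilcoxon = pw["wilcoxon", ])
}

n <- 50
mu_grid <- seq(0, 4 / sqrt(n), length.out = 10)  # assumed grid M of size 10
res <- estimate_power(n, mu_grid)

# Overlay the two estimated power curves
plot(res$mu, res$t, type = "b", col = "blue", ylim = c(0, 1),
     xlab = expression(mu), ylab = "Estimated power", main = paste("n =", n))
lines(res$mu, res$wilcoxon, type = "b", col = "red")
abline(h = 0.10, lty = 3)  # nominal level alpha
legend("bottomright", legend = c("t-test", "signed-rank test"),
       col = c("blue", "red"), lty = 1, pch = 1)
```

Wrapping the last block in a loop over n ∈ {10, 20, 50, 100, 200, 500} would give the six plots. Scaling the grid by 1/sqrt(n) is a design choice that keeps the interesting part of the curve in view as n grows.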
Problem 2. (Fungi in brassica plants) Consider the following article about how different
brassica plants are affected by different types of Rhizoctonia fungi.¹ Read enough of the article
to understand the premise and the main findings. Beyond that, we will focus on the data given in
Table 6 of the article.
A. Write a function tableObsExp(dat) taking in a two-column data frame, with each column representing
a factor, and then outputting a table of observed and expected (under no association)
counts, similar to what Table 6 in that article looks like.
B. Enter the observed counts from Table 6 (likely by hand, as the data do not seem directly
downloadable) and apply your function to recover a similar table.
C. Continuing with the same dataset, produce a couple of plots using functions from the ggplot2 package.
D. Finally, ask a question and formalize it into a hypothesis testing problem. Perform a test and
offer some brief comments.
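For part A, a minimal sketch of the expected-count computation is given below. Under no association, the expected count in a cell is (row total × column total) / grand total. The output layout (a list with two tables) is an assumption and may differ from Table 6. The toy data frame, with made-up levels P1, P2, F1, F2, is purely hypothetical and is not taken from the article.

```r
# Minimal sketch of tableObsExp: observed and expected counts for two factors.
# The return format (a list of two tables) is an assumption.
tableObsExp <- function(dat) {
  obs <- table(dat[[1]], dat[[2]])                      # observed counts
  expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)  # expected under no association
  list(observed = obs, expected = expd)
}

# Hypothetical toy data (NOT the counts from Table 6)
dat <- data.frame(
  plant  = rep(c("P1", "P2"), times = c(30, 20)),
  fungus = c(rep(c("F1", "F2"), times = c(10, 20)),
             rep(c("F1", "F2"), times = c(15, 5)))
)

out <- tableObsExp(dat)
print(out$observed)
print(round(out$expected, 1))
```

As a cross-check, chisq.test(out$observed)$expected computes the same expected counts in base R.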
¹ The article was published in the scientific journal PLOS ONE and is available online at the following address:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0111750
