
Question 1) The ozone data in the R faraway


Question

Question 1) The ozone data in the R faraway package contains daily measurements of ozone concentrations and eight meteorological variables in the Los Angeles basin for 330 days of 1976. We focus on the relationship between ozone concentration (O3) and day of year (doy). A scatter plot with two smooth curve fits is shown in the figure. Suppose a person did not examine the data and the curves. Instead, he tried to use Pearson's correlation and Spearman's rank correlation to test for association between O3 and doy. You were asked to help the person.

This plot was generated with the R command:

library(faraway)
data(ozone); attach(ozone)
library(ggplot2)
ggplot(ozone, aes(x=doy, y=O3)) + geom_point() +
  geom_smooth(method='gam', formula=y~s(x, bs='cs')) +
  geom_smooth(method='lm', formula=y~x+I(x^2), col='red')

Explanation / Answer

d) The R code for the analysis is given below.

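library(faraway)   # provides the ozone data
data(ozone)
doy <- ozone$doy   # day of year (column 10 of the data frame)
n <- length(doy)
n.sim <- 2500      # number of simulated data sets
b <- 0             # number of rejections by Pearson's test
c <- 0             # number of rejections by Spearman's test
for (m in 1:n.sim)
{
  # simulate O3 from the quadratic mean curve plus normal noise with sd 6
  e <- rnorm(n, 0, 6)
  o3.new <- -0.000500275*doy^2 + 0.216483699*doy - 6.186163324 + e
  p1 <- cor.test(doy, o3.new, alternative = "two.sided", method = "pearson")$p.value
  p2 <- cor.test(doy, o3.new, alternative = "two.sided", method = "spearman")$p.value
  # count the rejections at the 5% level
  if (p1 <= 0.05) b <- b + 1
  if (p2 <= 0.05) c <- c + 1
}
b/n.sim   # estimated power of Pearson's test
c/n.sim   # estimated power of Spearman's test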

So the estimated power of Pearson's test is b/2500 = 493/2500 = 0.1972

and the estimated power of Spearman's test is c/2500 = 445/2500 = 0.178.
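
Because each power value is a proportion estimated from 2500 simulated data sets, it carries some Monte Carlo error. The sketch below is an addition, not part of the original answer: it treats the number of rejections as binomial and uses a normal approximation for the interval. The helper name mc_se is hypothetical, and the two power values are the ones reported above.

```r
# Monte Carlo standard error of an estimated power (a binomial proportion).
# mc_se is a hypothetical helper name used only for this illustration.
mc_se <- function(p_hat, n_sim) sqrt(p_hat * (1 - p_hat) / n_sim)

n_sim <- 2500                    # number of simulated data sets in part d)
power_pearson  <- 493 / n_sim    # reported rejections for Pearson's test
power_spearman <- 0.178          # reported power for Spearman's test

se_pearson  <- mc_se(power_pearson, n_sim)
se_spearman <- mc_se(power_spearman, n_sim)

# Approximate 95% intervals (normal approximation, an assumption)
ci_pearson  <- power_pearson  + c(-1, 1) * 1.96 * se_pearson
ci_spearman <- power_spearman + c(-1, 1) * 1.96 * se_spearman

print(round(c(pearson = se_pearson, spearman = se_spearman), 4))
print(round(rbind(pearson = ci_pearson, spearman = ci_spearman), 3))
```

Both standard errors come out a little under 0.01, so the conclusion that the power is low does not depend on simulation noise.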

e) Both tests have very low estimated power (about 0.20 for Pearson's test and about 0.18 for Spearman's test). In other words, even though the data were simulated under the alternative hypothesis, where O3 really does depend on doy, the null hypothesis of no association is rejected in only about one simulated data set in five. We would want the power to be high. A likely reason is that the simulated relationship is quadratic rather than monotone: the mean curve rises to a peak near doy = 216 and then falls, whereas Pearson's correlation measures linear association and Spearman's rank correlation measures monotone association. So neither test is effective for detecting the association between these two variables.
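
The shape of the simulated mean curve can be checked directly from the coefficients used in part d). The following is a minimal sketch added for illustration, not part of the original answer. The function name mean_o3 is hypothetical, and days 1 and 366 are assumed endpoints (1976 was a leap year); the actual doy values in the data may not cover the whole year.

```r
# Coefficients of the quadratic mean curve used in the simulation in part d)
a  <- -0.000500275   # coefficient of doy^2
b1 <-  0.216483699   # coefficient of doy
b0 <- -6.186163324   # intercept

# Mean of the simulated O3 at a given day of year (hypothetical helper)
mean_o3 <- function(doy) a * doy^2 + b1 * doy + b0

# Turning point of the parabola: where the derivative 2*a*doy + b1 is zero
peak_doy <- -b1 / (2 * a)

# Mean curve at the assumed start of year, at the peak, and at the assumed end
curve_values <- c(start = mean_o3(1), peak = mean_o3(peak_doy), end = mean_o3(366))

print(round(peak_doy, 1))
print(round(curve_values, 2))
```

The mean increases up to the turning point and decreases afterwards, so part of the association works against any single increasing or decreasing trend, which is the kind of trend the two correlation tests look for.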