Question

Enter sleep in RStudio to get the data set.

The dataset is from a trial comparing two soporific drugs. There were 20 total subjects randomly assigned to one of two groups, so that each group had 10 subjects. Group 1 received one drug, while Group 2 received the other drug. The response variable is called “extra” and measures the number of extra hours an individual slept after taking the drug as compared to how many hours the individual usually sleeps (and negative values mean the individual slept fewer hours than usual).

(a) Use the t.test function in R to analyze the sleep data. Make sure to specify var.equal=T in the function so that the test will assume equal population variances. Report the test statistic and P-value, and state your conclusion for testing $H_0: \mu_1 - \mu_2 = 0$ vs. $H_A: \mu_1 - \mu_2 \neq 0$. Also, interpret the conclusion of the hypothesis test in context. That is, do the data provide sufficient evidence that the mean number of extra hours slept differs for the Drug 1 group vs. the Drug 2 group?

(b) Now do the same analysis, but use the aov function. Is the P -value for the F test the same as the P -value for the t test?

(c) If a random variable T follows a t distribution with n degrees of freedom, then $T^2$ follows an F distribution with 1 numerator degree of freedom and n denominator degrees of freedom. Use this fact to determine the relationship between the test statistic in (a) and the F statistic in (b) [that is, I essentially just want you to confirm the fact by squaring the test statistic you got in (a), and comparing that to the F statistic you got in (b)].

(d) Let the notation $T_{df=n_1+n_2-2}$ denote a random variable that follows a t distribution with $n_1+n_2-2$ degrees of freedom. The P-value calculated in (a) to test the two-sided hypotheses $H_0: \mu_1 - \mu_2 = 0$ vs. $H_A: \mu_1 - \mu_2 \neq 0$ is a probability, where half of the probability is calculated from the lower tail and the other half is calculated from the upper tail. Specifically, the P-value from the t test is calculated as:

$$
\begin{aligned}
P\text{-value} &= 2\,P_{H_0}\left(T_{df=n_1+n_2-2} \ge |\text{observed stat.}|\right) \\
&= P_{H_0}\left(T_{df=n_1+n_2-2} \le -|\text{observed stat.}|\right) + P_{H_0}\left(T_{df=n_1+n_2-2} \ge |\text{observed stat.}|\right) \\
&= P_{H_0}\left(T_{df=n_1+n_2-2} \le -|\text{observed stat.}| \ \text{or}\ T_{df=n_1+n_2-2} \ge |\text{observed stat.}|\right) \\
&= P_{H_0}\left(|T_{df=n_1+n_2-2}| \ge |\text{observed stat.}|\right) \\
&= P_{H_0}\left(|T_{df=n_1+n_2-2}|^2 \ge |\text{observed stat.}|^2\right) \quad \text{(squared both sides)} \\
&= P_{H_0}\left(F_{df_1=1,\,df_2=n_1+n_2-2} \ge (\text{observed stat.})^2\right) \quad (F = T^2)
\end{aligned}
$$

Based on the last three lines of the expressions above and the information in (c), describe in your own words why the P-value for the F test in a one-way ANOVA is a probability that is calculated only from the upper tail, even though the hypotheses we are testing are two-sided.

Explanation / Answer

Code for t.test

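# Clear the workspace (optional) and load dplyr for filter()
rm(list = ls())
library(dplyr)

# Split the built-in sleep data by drug group
group_1 <- filter(sleep, group == 1)
group_2 <- filter(sleep, group == 2)

# Two-sided, two-sample t test assuming equal population variances
m <- t.test(group_1$extra, group_2$extra, alternative = "two.sided", var.equal = TRUE)
m  # print the test statistic, degrees of freedom, and p-value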

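Output: the test statistic is t = -1.8608 on n1 + n2 - 2 = 18 degrees of freedom, and the p-value is 0.079.

Because the p-value is larger than the usual 0.05 significance level, we fail to reject the null hypothesis. The data do not provide sufficient evidence that the mean number of extra hours slept differs for the Drug 1 group vs. the Drug 2 group. Note that this is not the same as showing that the two means are equal.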
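As a cross-check on part (a), the same test can be run through the formula interface of t.test, which avoids splitting the data by hand. This is only a sketch; the object names (res, t_stat, df, p_val) are illustrative and not part of the original answer.

```r
# Equal-variance two-sample t test using the formula interface.
# sleep ships with base R (datasets package), so no extra packages are needed.
res <- t.test(extra ~ group, data = sleep, var.equal = TRUE)

# Pull out the pieces that part (a) asks for.
t_stat <- unname(res$statistic)  # test statistic
df     <- unname(res$parameter)  # degrees of freedom, n1 + n2 - 2
p_val  <- res$p.value            # two-sided p-value

print(c(t = t_stat, df = df, p = p_val))
```

The statistic and p-value printed here should agree with the values quoted in the answer (t = -1.8608, p-value about 0.079).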

Code for aov function

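# One-way ANOVA of extra hours slept on drug group
fit <- aov(extra ~ group, data = sleep)

# The ANOVA table reports the F statistic and its p-value, Pr(>F)
summary(fit)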


Output: yes, the p-value for the F test is the same as the p-value for the t test in (a), about 0.079.
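One way to check this without reading numbers off the printed tables is to extract both p-values and compare them directly. The snippet below is a sketch under the assumption that the first element of summary() for an aov fit is the ANOVA table with a "Pr(>F)" column, which is how base R returns it; the variable names are illustrative.

```r
# p-value from the equal-variance t test in (a)
p_t <- t.test(extra ~ group, data = sleep, var.equal = TRUE)$p.value

# p-value from the one-way ANOVA F test in (b); the first row of the
# ANOVA table is the group effect, the second row is the residuals.
anova_table <- summary(aov(extra ~ group, data = sleep))[[1]]
p_f <- anova_table[["Pr(>F)"]][1]

print(c(t_test = p_t, F_test = p_f))
print(all.equal(p_t, p_f))  # TRUE when the two agree up to floating-point error
```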

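In (a), the value of the t statistic is -1.8608. Squaring it gives (-1.8608)^2 ≈ 3.463, which matches the F statistic in the ANOVA table from (b) up to rounding.

So the calculation confirms that the square of the t statistic equals the F statistic.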
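The relationship in (c), and the tail argument behind (d), can also be checked numerically. The sketch below squares the t statistic, compares it with the F statistic, and then shows that the two-tailed t probability equals the upper-tail F probability. The object names are illustrative; pt and pf are the base R distribution functions for the t and F distributions.

```r
t_res   <- t.test(extra ~ group, data = sleep, var.equal = TRUE)
aov_tab <- summary(aov(extra ~ group, data = sleep))[[1]]

t_stat <- unname(t_res$statistic)   # observed t statistic from (a)
df2    <- unname(t_res$parameter)   # n1 + n2 - 2
f_stat <- aov_tab[["F value"]][1]   # observed F statistic from (b)

# (c): the squared t statistic should equal the F statistic
print(c(t_squared = t_stat^2, F = f_stat))

# (d): both tails of the t distribution beyond |t| ...
p_two_tail <- 2 * pt(-abs(t_stat), df = df2)
# ... carry the same probability as the upper tail of F beyond t^2
p_upper_F <- pf(t_stat^2, df1 = 1, df2 = df2, lower.tail = FALSE)

print(c(two_tailed_t = p_two_tail, upper_tail_F = p_upper_F))
```

Squaring maps large negative and large positive values of T onto large positive values of F, so evidence against the null hypothesis in either direction ends up in the upper tail of the F distribution. That is why the F test uses only the upper tail even though the hypotheses are two-sided.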