Statistics using R Use R to conduct hypothesis tests. Do airplane delays get wor
Question
Statistics using R
Use R to conduct hypothesis tests.
Do airplane delays get worse or better over the course of a flight?
I took a random sample of 80 delayed flights that originated at George Bush Intercontinental Airport in Houston, Texas.
The data for this exercise can be read into R using the command: delay<-read.csv("http://www.math.usu.edu/cfairbourn/Stat2300/RStudioFiles/data/delay80.csv")
The first column, dep_delay, indicates the number of minutes the airplane departed after its scheduled departure time. The second column, arr_delay, indicates the number of minutes the airplane landed before or after its scheduled arrival time. The difference between dep_delay and arr_delay tells us how many minutes the flight made up or lost during its flight.
For instance, the second flight in the data set left 2 minutes late, but has an arrival delay of -17, indicating that the plane arrived 17 minutes BEFORE its scheduled arrival time. Thus, this flight made up 2 - (-17) = 19 minutes during the flight.
The sixth flight in the data set left 24 minutes late and arrived 69 minutes late. The difference, 24 - 69 = -45, indicates that the flight lost an additional 45 minutes during the flight.
If airplane delays get worse over the course of the flight, we would expect the mean difference of the delays to be negative. If they get better, we would expect the mean difference to be positive.
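The sign convention can be checked on the two flights quoted above. This is a minimal sketch: the two-row data frame `examples` is a hypothetical stand-in built only from the values given in the text, not the full delay80.csv file.

```r
# Two flights quoted in the text (hypothetical stand-in for the full data set)
examples <- data.frame(
  dep_delay = c(2, 24),    # minutes late at departure
  arr_delay = c(-17, 69)   # minutes late at arrival (negative = early)
)

# Positive difference: time made up in the air; negative: time lost
examples$diff <- examples$dep_delay - examples$arr_delay
print(examples$diff)  # 19 and -45, matching the worked examples
```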
Instructions
Watch the video demonstrating how to conduct matched pairs hypothesis tests in RStudio.
a. Explain why a matched pairs test is used for this scenario.
b. Use R to conduct a hypothesis test to determine whether, on average, airplane delays get worse or better over the course of a flight. Include your hypotheses, your R code, the R output, and your conclusions based on the strength of the evidence.
c. Use a 95% confidence interval for the mean difference to indicate the size of the change in delay.
Explanation / Answer
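a) A paired (matched pairs) t-test is used when two measurements are taken on the same subjects. Here each flight contributes both a departure delay and an arrival delay, so the two columns are not independent samples. The test is carried out on the per-flight differences, dep_delay - arr_delay.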
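One way to see why pairing matters is that a paired t-test is the same as a one-sample t-test on the per-flight differences. The sketch below illustrates this with made-up numbers; the vectors `dep` and `arr` are hypothetical and are not taken from delay80.csv.

```r
# Hypothetical delays (minutes) for five flights; NOT from delay80.csv
dep <- c(10, 5, 30, 12, 0)
arr <- c(4, 9, 25, 20, -3)

# Paired test on the two columns
paired_test <- t.test(dep, arr, paired = TRUE)

# One-sample test on the differences, against a mean of 0
diff_test <- t.test(dep - arr, mu = 0)

# Both calls analyze the same differences, so the results agree
print(all.equal(unname(paired_test$statistic), unname(diff_test$statistic)))
print(all.equal(paired_test$p.value, diff_test$p.value))
```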
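b) Hypotheses: let mu_d be the mean of the differences dep_delay - arr_delay.
H0: mu_d = 0 (on average, delays neither get better nor worse during the flight)
Ha: mu_d ≠ 0 (on average, delays get better or worse during the flight)
R code and output:
> t.test(delay$dep_delay, delay$arr_delay, paired = TRUE)
        Paired t-test
data:  delay$dep_delay and delay$arr_delay
t = 0.44306, df = 79, p-value = 0.6589
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.270098  3.570098
sample estimates:
mean of the differences
                   0.65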
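Conclusion: The p-value (0.6589) is far larger than the usual significance levels (0.05 or 0.10), so we fail to reject the null hypothesis. The sample gives no convincing evidence that, on average, delays get either worse or better over the course of a flight.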
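As a check on the reported p-value, it can be recomputed from the t statistic and degrees of freedom in the output. This is only a sketch that uses the rounded values printed by R, so the result should agree with 0.6589 up to rounding.

```r
t_stat <- 0.44306  # t statistic reported in the output
df     <- 79       # n - 1 = 80 - 1

# Two-sided p-value: probability of a t value at least this extreme in either tail
p_value <- 2 * pt(-abs(t_stat), df)
print(round(p_value, 4))
```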
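c) The 95% confidence interval for the mean difference (dep_delay - arr_delay) is (-2.270098, 3.570098) minutes, around a sample mean difference of 0.65 minutes. We are 95% confident that, on average, flights lose as much as about 2.27 minutes or make up as much as about 3.57 minutes in the air. The interval contains 0, which agrees with the test in part b, and any average change in delay is small, at most a few minutes in either direction.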
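The interval can also be rebuilt by hand from the summary numbers in the output, which shows where its endpoints come from. This sketch recovers the standard error from the rounded t statistic (an assumption that introduces a small rounding error), so it should match the reported interval only approximately.

```r
mean_diff <- 0.65     # sample mean of dep_delay - arr_delay (from the output)
t_stat    <- 0.44306  # reported t statistic
n         <- 80       # number of flights

se     <- mean_diff / t_stat       # standard error, since t = mean_diff / se
t_crit <- qt(0.975, df = n - 1)    # critical value for 95% confidence

# mean difference plus or minus the margin of error
ci <- mean_diff + c(-1, 1) * t_crit * se
print(round(ci, 2))
```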