Using the cheddar data, fit a linear model with taste as the response and the other three variables as predictors
Question
Using the cheddar data, fit a linear model with taste as the response and the other three variables as predictors.
(b) Give the R command to extract the p-value for the test of β_lactic = 0. Hint: look at summary()$coef.
(c) Add normally distributed errors to Lactic with mean zero and standard deviation 0.01 and refit the model. Now what is the p-value for the previous test?
(d) Repeat this same calculation of adding errors to Lactic 1000 times within a for loop. Save the p-values into a vector. Report on the average p-value. Does this much measurement error make a qualitative difference to the conclusions?
(e) Repeat the previous question but with a standard deviation of 0.1. Does this much measurement error make an important difference?
Explanation / Answer
Using the cheddar data, fit a linear model with taste as the response and the other three variables as predictors.
Loading the data
library(faraway)
data <- cheddar
Fitting the linear model
fit <- lm(taste ~ Acetic + H2S + Lactic, data = data)
fit_summary <- summary(fit)
fit_summary
##
## Call:
## lm(formula = taste ~ Acetic + H2S + Lactic, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -17.390  -6.612  -1.009   4.908  25.449
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -28.8768    19.7354  -1.463  0.15540
## Acetic        0.3277     4.4598   0.073  0.94198
## H2S           3.9118     1.2484   3.133  0.00425 **
## Lactic       19.6705     8.6291   2.280  0.03108 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.13 on 26 degrees of freedom
## Multiple R-squared: 0.6518, Adjusted R-squared: 0.6116
## F-statistic: 16.22 on 3 and 26 DF, p-value: 3.81e-06
Extracting the p-value
p.value <- fit_summary$coefficients["Lactic", "Pr(>|t|)"]
p.value
## [1] 0.03107948
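As a cross-check, the value in the Pr(>|t|) column can be reproduced by hand: it is the two-sided tail probability of a t distribution, evaluated at the t statistic with the residual degrees of freedom. The following is a minimal sketch that assumes only the rounded estimate, standard error, and degrees of freedom printed in the summary above; the variable names are illustrative and not part of the original answer.

```r
# Values copied from the printed summary (rounded, so the result is approximate)
estimate  <- 19.6705   # Lactic coefficient
std_error <- 8.6291    # its standard error
df_resid  <- 26        # residual degrees of freedom

# t statistic for H0: coefficient of Lactic = 0
t_stat <- estimate / std_error

# Two-sided p-value: probability of a |t| at least this large under H0
p_manual <- 2 * pt(-abs(t_stat), df = df_resid)
p_manual
```

Because the inputs are rounded, the result should agree with the extracted p-value only up to a few decimal places.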
(c) Add normally distributed errors to Lactic with mean zero and standard deviation 0.01 and refit the model. Now what is the p-value for the previous test?
Adding the errors to Lactic
n <- nrow(data)
data$Lactic1 <- data$Lactic + rnorm(n, mean = 0, sd = 0.01)
Refitting the model
fit1 <- lm(taste ~ Acetic + H2S + Lactic1, data = data)
fit1_summary <- summary(fit1)
fit1_summary
##
## Call:
## lm(formula = taste ~ Acetic + H2S + Lactic1, data = data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -17.1404  -6.8747  -0.8927   4.9951  25.3637
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -29.2777    19.7309  -1.484  0.14987
## Acetic        0.4879     4.4319   0.110  0.91318
## H2S           3.8397     1.2608   3.045  0.00527 **
## Lactic1      19.6451     8.5846   2.288  0.03049 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.12 on 26 degrees of freedom
## Multiple R-squared: 0.6522, Adjusted R-squared: 0.6121
## F-statistic: 16.25 on 3 and 26 DF, p-value: 3.748e-06
p-value for the new fit
p.value1 <- fit1_summary$coefficients["Lactic1", "Pr(>|t|)"]
p.value1
## [1] 0.03048841
(d) Repeat this same calculation of adding errors to Lactic 1000 times within a for loop. Save the p-values into a vector. Report on the average p-value. Does this much measurement error make a qualitative difference to the conclusions?
N <- 1000
# Preallocate a vector to hold one p-value per simulation
p.value.list <- numeric(N)
for (i in 1:N) {
  # Perturb Lactic with fresh N(0, 0.01^2) noise, then refit
  data$Lactic1 <- data$Lactic + rnorm(n, mean = 0, sd = 0.01)
  fit1 <- lm(taste ~ Acetic + H2S + Lactic1, data = data)
  fit1_summary <- summary(fit1)
  p.value.list[i] <- fit1_summary$coefficients["Lactic1", "Pr(>|t|)"]
}
mean(p.value.list)
## [1] 0.03149492
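The mean p-value is about 0.0315, essentially unchanged from the original 0.0311 and still below 0.05. Measurement error of this size therefore makes no qualitative difference to the conclusion: Lactic remains significant at the 5% level.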
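The mean alone can hide how often the conclusion would actually flip, so it may help to also look at the share of simulated p-values that fall below 0.05. The following is a sketch rather than the answer's own code: simulate_lactic_pvalues is a hypothetical helper name, replicate() stands in for the explicit for loop, and the seed value is arbitrary. It assumes a data frame with columns taste, Acetic, H2S, and Lactic, and runs the cheddar example only if the faraway package is installed.

```r
# Hypothetical helper: refit the model n_sim times, each time adding fresh
# N(0, noise_sd^2) noise to Lactic, and return the p-values for Lactic.
simulate_lactic_pvalues <- function(dat, noise_sd, n_sim = 1000) {
  replicate(n_sim, {
    noisy <- dat
    noisy$Lactic <- dat$Lactic + rnorm(nrow(dat), mean = 0, sd = noise_sd)
    fit <- lm(taste ~ Acetic + H2S + Lactic, data = noisy)
    summary(fit)$coefficients["Lactic", "Pr(>|t|)"]
  })
}

if (requireNamespace("faraway", quietly = TRUE)) {
  data(cheddar, package = "faraway")
  set.seed(1)  # arbitrary seed, only so that the run is reproducible
  p_small <- simulate_lactic_pvalues(cheddar, noise_sd = 0.01)
  # Mean p-value, plus the fraction of runs still significant at the 5% level
  print(c(mean_p = mean(p_small), share_below_0.05 = mean(p_small < 0.05)))
}
```

Without a fixed seed, the exact figures change from run to run, which is why the numbers reported here will not be reproduced digit for digit.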
(e) Repeat the previous question but with a standard deviation of 0.1. Does this much measurement error make an important difference?
N <- 1000
# Preallocate a vector to hold one p-value per simulation
p.value.list <- numeric(N)
for (i in 1:N) {
  # Same procedure as in (d), but with noise standard deviation 0.1
  data$Lactic1 <- data$Lactic + rnorm(n, mean = 0, sd = 0.1)
  fit1 <- lm(taste ~ Acetic + H2S + Lactic1, data = data)
  fit1_summary <- summary(fit1)
  p.value.list[i] <- fit1_summary$coefficients["Lactic1", "Pr(>|t|)"]
}
mean(p.value.list)
## [1] 0.06602748
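The mean p-value rises to about 0.066, which is above 0.05, so on average Lactic would no longer be judged significant at the 5% level. Measurement error of this size does make an important difference to the conclusion.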
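One way to see why a standard deviation of 0.1 matters while 0.01 does not is to compare the added noise with the natural spread of Lactic. A common summary is the reliability ratio var(x) / (var(x) + noise_sd^2). In a simple regression with a single predictor, the slope estimate shrinks toward zero by roughly this factor; with several correlated predictors the effect is less clean, so this is only a rough gauge. The sketch below uses a hypothetical helper, reliability_ratio, and applies it to cheddar only if the faraway package is installed.

```r
# Hypothetical helper: share of the observed variance that is "signal"
# when independent noise with standard deviation noise_sd is added to x.
reliability_ratio <- function(x, noise_sd) {
  var(x) / (var(x) + noise_sd^2)
}

if (requireNamespace("faraway", quietly = TRUE)) {
  data(cheddar, package = "faraway")
  # Compare the two noise levels used in parts (d) and (e)
  print(sapply(c(sd_0.01 = 0.01, sd_0.1 = 0.1),
               function(s) reliability_ratio(cheddar$Lactic, s)))
}
```

A ratio very close to 1 means the noise is negligible next to the variation already in Lactic; the further it drops below 1, the more the coefficient and its test are affected.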