Question

Using the cheddar data, fit a linear model with taste as the response and the other three variables as predictors.

(b) Give the R command to extract the p-value for the test of lactic = 0. Hint: look at summary()$coef.

(c) Add normally distributed errors to Lactic with mean zero and standard deviation 0.01 and refit the model. Now what is the p-value for the previous test?

(d) Repeat this same calculation of adding errors to Lactic 1000 times within a for loop. Save the p-values into a vector. Report on the average p-value. Does this much measurement error make a qualitative difference to the conclusions?

(e) Repeat the previous question but with a standard deviation of 0.1. Does this much measurement error make an important difference?

Explanation / Answer

Using the cheddar data, fit a linear model with taste as the response and the other three variables as predictors.

Loading the data

library(faraway)

data <- cheddar

linear model

fit <- lm(taste ~ Acetic + H2S + Lactic, data = data)

fit_summary <- summary(fit)

fit_summary

##
## Call:
## lm(formula = taste ~ Acetic + H2S + Lactic, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -17.390  -6.612  -1.009   4.908  25.449
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -28.8768    19.7354  -1.463  0.15540
## Acetic        0.3277     4.4598   0.073  0.94198
## H2S           3.9118     1.2484   3.133  0.00425 **
## Lactic       19.6705     8.6291   2.280  0.03108 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.13 on 26 degrees of freedom
## Multiple R-squared:  0.6518, Adjusted R-squared:  0.6116
## F-statistic: 16.22 on 3 and 26 DF,  p-value: 3.81e-06

Extracting p.value

p.value <- fit_summary$coefficients["Lactic", "Pr(>|t|)"]

p.value

## [1] 0.03107948
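The hint in the question uses summary()$coef, which works because `$` partially matches the `coefficients` element of the summary object. As a minimal sketch of the equivalent ways to pull out one p-value, the block below uses a simulated stand-in data frame (`toy`, with made-up values, is an assumption for illustration and is not the cheddar data) so that it runs with base R only.

```r
# Simulated stand-in for the cheddar data (hypothetical values, base R only)
set.seed(1)
toy <- data.frame(Acetic = rnorm(30, mean = 5.5, sd = 0.6),
                  H2S    = rnorm(30, mean = 6.0, sd = 2.0),
                  Lactic = rnorm(30, mean = 1.4, sd = 0.3))
toy$taste <- 4 * toy$H2S + 20 * toy$Lactic + rnorm(30, sd = 10)

fit_toy <- lm(taste ~ Acetic + H2S + Lactic, data = toy)

# coef(summary(...)) returns the same matrix as summary(...)$coefficients
coef_table <- coef(summary(fit_toy))

# Index by row and column name ...
p_by_name <- coef_table["Lactic", "Pr(>|t|)"]

# ... or by position: Lactic is row 4 (after the intercept), p-values are column 4
p_by_position <- coef_table[4, 4]

# Partial matching with $coef, as in the hint
p_by_hint <- summary(fit_toy)$coef["Lactic", "Pr(>|t|)"]

all.equal(p_by_name, p_by_position)
all.equal(p_by_name, p_by_hint)
```

Indexing by name is usually the safer choice, since it keeps working if the order of the predictors in the formula changes.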

(c) Add normally distributed errors to Lactic with mean zero and standard deviation 0.01 and refit the model. Now what is the p-value for the previous test?

Add normally distributed errors to Lactic with mean zero and standard deviation 0.01

n <- nrow(data)

data$Lactic1 <- data$Lactic + rnorm(n, mean = 0, sd = 0.01)

refit the model

fit1 <- lm(taste ~ Acetic + H2S + Lactic1, data = data)

fit1_summary <- summary(fit1)

fit1_summary

##
## Call:
## lm(formula = taste ~ Acetic + H2S + Lactic1, data = data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -17.1404  -6.8747  -0.8927   4.9951  25.3637
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -29.2777    19.7309  -1.484  0.14987
## Acetic        0.4879     4.4319   0.110  0.91318
## H2S           3.8397     1.2608   3.045  0.00527 **
## Lactic1      19.6451     8.5846   2.288  0.03049 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.12 on 26 degrees of freedom
## Multiple R-squared:  0.6522, Adjusted R-squared:  0.6121
## F-statistic: 16.25 on 3 and 26 DF,  p-value: 3.748e-06

p.value for new fit

p.value1 <- fit1_summary$coefficients["Lactic1", "Pr(>|t|)"]

p.value1

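## [1] 0.03048841

The p-value is essentially unchanged from the original 0.0311. Its exact value varies from run to run because the added noise is random.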

(d) Repeat this same calculation of adding errors to Lactic 1000 times within a for loop. Save the p-values into a vector. Report on the average p-value. Does this much measurement error make a qualitative difference to the conclusions?

N <- 1000
p.value.list <- numeric(N)  # preallocate one slot per simulation
for (i in 1:N) {
  # perturb Lactic with fresh N(0, 0.01^2) noise, refit, and store the Lactic1 p-value
  data$Lactic1 <- data$Lactic + rnorm(n, mean = 0, sd = 0.01)
  fit1 <- lm(taste ~ Acetic + H2S + Lactic1, data = data)
  fit1_summary <- summary(fit1)
  p.value.list[i] <- fit1_summary$coefficients["Lactic1", "Pr(>|t|)"]
}
mean(p.value.list)

## [1] 0.03149492

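The mean p-value is about 0.0315, essentially the same as the original 0.0311 and still below 0.05. Measurement error with standard deviation 0.01 therefore makes no qualitative difference to the conclusion: Lactic remains significant at the 5% level.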
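The same simulation can also be written without an explicit loop. The following is a minimal sketch, not the method asked for in the question (which requires a for loop): `noisy_pvalue` is a hypothetical helper, `toy` is a simulated stand-in for the cheddar data with made-up values, and the seed is an arbitrary choice that only serves to make the run reproducible.

```r
set.seed(123)  # assumption: any fixed seed; makes the random noise reproducible

# Simulated stand-in for the cheddar data (hypothetical values, base R only)
toy <- data.frame(Acetic = rnorm(30, mean = 5.5, sd = 0.6),
                  H2S    = rnorm(30, mean = 6.0, sd = 2.0),
                  Lactic = rnorm(30, mean = 1.4, sd = 0.3))
toy$taste <- 4 * toy$H2S + 20 * toy$Lactic + rnorm(30, sd = 10)

# Hypothetical helper: add N(0, noise_sd^2) noise to Lactic, refit the model,
# and return the p-value of the Lactic coefficient.
# `dat` is a local copy, so the caller's data frame is left unchanged.
noisy_pvalue <- function(dat, noise_sd) {
  dat$Lactic <- dat$Lactic + rnorm(nrow(dat), mean = 0, sd = noise_sd)
  fit <- lm(taste ~ Acetic + H2S + Lactic, data = dat)
  coef(summary(fit))["Lactic", "Pr(>|t|)"]
}

# replicate() evaluates the expression 1000 times and collects the results
p_values <- replicate(1000, noisy_pvalue(toy, noise_sd = 0.01))

length(p_values)
mean(p_values)
```

With the faraway package loaded, calling `noisy_pvalue(cheddar, noise_sd = 0.01)` inside `replicate()` would run the same experiment on the real data.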

(e) Repeat the previous question but with a standard deviation of 0.1. Does this much measurement error make an important difference?

N <- 1000
p.value.list <- numeric(N)  # preallocate one slot per simulation
for (i in 1:N) {
  # same simulation as in (d), but with noise standard deviation 0.1
  data$Lactic1 <- data$Lactic + rnorm(n, mean = 0, sd = 0.1)
  fit1 <- lm(taste ~ Acetic + H2S + Lactic1, data = data)
  fit1_summary <- summary(fit1)
  p.value.list[i] <- fit1_summary$coefficients["Lactic1", "Pr(>|t|)"]
}
mean(p.value.list)

## [1] 0.06602748

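The mean p-value is about 0.0660, which is above 0.05. With measurement error of standard deviation 0.1, Lactic would on average no longer be judged significant at the 5% level, so this much measurement error does make an important difference to the conclusion.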
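A mean is a coarse summary of 1000 p-values, so it can help to also look at the median and at the share of simulations in which the p-value exceeds 0.05. The sketch below compares the two noise levels side by side. It again runs on a simulated stand-in data frame (`toy`, with made-up values), and `p_for_sd` is a hypothetical helper, so the numbers it produces are not those of the cheddar data.

```r
set.seed(2024)  # assumption: arbitrary seed, for reproducibility only

# Simulated stand-in for the cheddar data (hypothetical values, base R only)
toy <- data.frame(Acetic = rnorm(30, mean = 5.5, sd = 0.6),
                  H2S    = rnorm(30, mean = 6.0, sd = 2.0),
                  Lactic = rnorm(30, mean = 1.4, sd = 0.3))
toy$taste <- 4 * toy$H2S + 20 * toy$Lactic + rnorm(30, sd = 10)

# Hypothetical helper: n_rep p-values for the Lactic coefficient
# after adding N(0, noise_sd^2) noise to Lactic
p_for_sd <- function(noise_sd, n_rep = 1000) {
  vapply(seq_len(n_rep), function(i) {
    noisy <- toy
    noisy$Lactic <- toy$Lactic + rnorm(nrow(toy), mean = 0, sd = noise_sd)
    fit <- lm(taste ~ Acetic + H2S + Lactic, data = noisy)
    coef(summary(fit))["Lactic", "Pr(>|t|)"]
  }, numeric(1))
}

noise_sds <- c(0.01, 0.1)

# One column per noise level; transpose so each noise level is a row
results <- sapply(noise_sds, function(s) {
  p <- p_for_sd(s)
  c(noise_sd         = s,
    mean_p           = mean(p),
    median_p         = median(p),
    share_above_0.05 = mean(p > 0.05))
})
summary_table <- t(results)
summary_table
```

Replacing `toy` with `cheddar` (after `library(faraway)`) would give the corresponding table for parts (d) and (e).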
