Using the sat dataset, the data set is in R in the faraway package. Fit a model wi
Question
Using the sat dataset (available in R in the faraway package), fit a model with the total SAT score as the response and expend, salary, ratio, and takers as predictors. Perform regression diagnostics on this model to answer the following questions. Display any plots that are relevant. Do not provide any plots about which you have nothing to say.
• (a) Check the constant variance assumption for the errors;
• (b) Check the normality assumption;
• (c) Check for large leverage points;
• (d) Check for outliers;
• (e) Check for influential points.
Explanation / Answer
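R code:
> library(faraway)  # sat data
> library(car)      # ncvTest, outlierTest, leveragePlots
> library(MASS)     # stdres
> attach(sat)       # lets the lm() call below refer to the columns directly
> model<-lm(total~expend+salary+ratio)
> # Check the constant variance assumption for the errors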
> ncvTest(model)
Non-constant Variance Score Test
Variance formula: ~ fitted.values
Chisquare = 0.313934 Df = 1 p = 0.5752761
> # Check the normality assumption;
> qqnorm(model$residuals)
> qqline(model$residuals)
> grid()
> #Check for outliers
> outlierTest(model)
No Studentized residuals with Bonferonni p < 0.05
Largest |rstudent|:
rstudent unadjusted p-value Bonferonni p
40 -2.181293 0.034427 NA
> # Check for large leverage points
> leveragePlots(model)
> # Check for influential points
> d1 <- cooks.distance(model)
> r <- stdres(model)
> a <- cbind(sat, d1, r)
> a[d1 > 4/50, ]
[1] expend ratio salary takers verbal math total d1 r
<0 rows> (or 0-length row.names)
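The transcript above fits three predictors and relies on leveragePlots() for part (c). As a rough sketch, assuming the faraway package is installed, the model could be refit with takers included and leverage checked with hat values from base R. The names full, h, and high_leverage are illustrative only, and the 2p/n cutoff is a common rule of thumb rather than a formal test.

```r
library(faraway)  # provides the sat data frame

# Refit with all four predictors named in the question
full <- lm(total ~ expend + salary + ratio + takers, data = sat)

# (c) Leverage: hat values, flagged against the 2p/n rule of thumb
h <- hatvalues(full)
p <- length(coef(full))   # number of coefficients, intercept included
n <- nrow(sat)
high_leverage <- h[h > 2 * p / n]

# States whose hat value exceeds the cutoff, largest first
sort(high_leverage, decreasing = TRUE)
```

Because data = sat is passed to lm(), the flagged observations are labelled by state name rather than by row index.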
Answer:
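Judging by the Q-Q plot, the errors are approximately normally distributed. The score test gives no evidence of non-constant variance (chi-square = 0.31, df = 1, p = 0.58). No observation is flagged as an outlier: observation 40 has the largest studentized residual (-2.18, unadjusted p = 0.034), but it is not significant after the Bonferroni correction. No observation has a Cook's distance above 4/50 = 0.08, so by that rule of thumb there are no influential points. Note that leveragePlots() in car draws generalized added-variable plots rather than hat values, so that call alone does not settle part (c). The fitted formula also leaves out takers, so these conclusions apply to the three-predictor model only.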
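The outlier and influence checks can also be written out with base R functions, which makes the cutoffs explicit. This is a minimal sketch on the four-predictor model the question asks for, assuming the faraway package is installed. The names fit, rs, cutoff, cd, outliers, and influential are illustrative, and the 4/n cutoff is the same rule of thumb as the 4/50 used in the transcript.

```r
library(faraway)  # provides the sat data frame

fit <- lm(total ~ expend + salary + ratio + takers, data = sat)
n <- nrow(sat)
p <- length(coef(fit))   # coefficients, intercept included

# (d) Outliers: externally studentized residuals against a
# two-sided Bonferroni cutoff at overall level 0.05
rs <- rstudent(fit)
cutoff <- qt(1 - 0.05 / (2 * n), df = n - p - 1)
outliers <- rs[abs(rs) > cutoff]

# (e) Influence: Cook's distance against the 4/n rule of thumb
cd <- cooks.distance(fit)
influential <- cd[cd > 4 / n]

rs[which.max(abs(rs))]                 # largest studentized residual
outliers                               # empty if nothing exceeds the cutoff
sort(influential, decreasing = TRUE)   # candidates worth a closer look
```

An observation that exceeds the 4/n cutoff is only a candidate; refitting without it and comparing the coefficients shows whether it actually changes the conclusions.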