Question
Using R.
DATASET - patsat.sas7bdat
Y   X1  X2  X3
48  50  51  2.3
57  36  46  2.3
66  40  48  2.2
70  41  44  1.8
89  28  43  1.8
36  49  54  2.9
46  42  50  2.2
54  45  48  2.4
26  52  62  2.9
77  29  50  2.1
89  29  48  2.4
67  43  53  2.4
47  38  55  2.2
51  34  51  2.3
57  53  54  2.2
66  36  49  2
79  33  56  2.5
88  29  46  1.9
60  33  49  2.1
49  55  51  2.4
77  29  52  2.3
52  44  58  2.9
60  43  50  2.3

A hospital administrator wished to study the relation between patient satisfaction (Y) and patient's age (X1, in years), severity of illness (X2, an index) and anxiety level (X3, an index). The administrator randomly selected 23 patients and collected the data in patsat.sas7bdat, where larger values of Y, X2, and X3 are, respectively, associated with more satisfaction, increased severity of illness and more anxiety.
(a) Fit the regression model to the data and state the estimated regression function. What is the interpretation of β2 here?
(b) Test whether there is a regression relationship here; that is, if the regression as a whole explains variability in the response. Using significance level α = 0.05, state your null and alternative hypotheses, and your conclusions. What does your test imply about β1, β2, and β3?
(c) Test the null hypothesis that β1 is equal to 0 at the 0.05 level of significance. What do you conclude?
(d) Obtain 95% confidence interval estimates of β1, β2, and β3. Interpret your results.
(e) Obtain a 95% confidence interval estimate of mean satisfaction when X1 = [value missing in the source], X2 = 45 and X3 = 2.2. Interpret your confidence interval.
Explanation / Answer
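The R snippet is as follows:
# read the data into an R data frame (assumes the SAS file has been exported to CSV)
# use forward slashes: "\U" in a quoted Windows path is a parse error in R
data.df <- read.csv("C:/Users/586645/Downloads/Chegg/sas.csv", header = TRUE)
str(data.df)
# fit the multiple linear regression of Y on X1, X2 and X3
fit <- lm(Y ~ ., data = data.df)
# summarise the results
summary(fit)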
The results of the model are
summary(fit)
Call:
lm(formula = Y ~ ., data = data.df)
Residuals:
Min 1Q Median 3Q Max
-16.954 -7.154 1.550 6.599 14.888
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 162.8759 25.7757 6.319 4.59e-06 ***
X1 -1.2103 0.3015 -4.015 0.00074 ***
X2 -0.6659 0.8210 -0.811 0.42736
X3 -8.6130 12.2413 -0.704 0.49021
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 10.29 on 19 degrees of freedom
Multiple R-squared: 0.6727, Adjusted R-squared: 0.621
F-statistic: 13.01 on 3 and 19 DF, p-value: 7.482e-05
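The fit can also be run without the CSV file by typing in the 23 rows listed in the question. This is a minimal sketch: the data frame name `patsat` is an illustrative choice, and the output should match the table above only if the rows transcribed here agree with the original file.

```r
# The 23 observations as listed in the question (transcribed by hand).
patsat <- read.table(header = TRUE, text = "
Y X1 X2 X3
48 50 51 2.3
57 36 46 2.3
66 40 48 2.2
70 41 44 1.8
89 28 43 1.8
36 49 54 2.9
46 42 50 2.2
54 45 48 2.4
26 52 62 2.9
77 29 50 2.1
89 29 48 2.4
67 43 53 2.4
47 38 55 2.2
51 34 51 2.3
57 53 54 2.2
66 36 49 2
79 33 56 2.5
88 29 46 1.9
60 33 49 2.1
49 55 51 2.4
77 29 52 2.3
52 44 58 2.9
60 43 50 2.3
")

# Multiple linear regression of satisfaction on age, severity and anxiety.
fit <- lm(Y ~ X1 + X2 + X3, data = patsat)
print(summary(fit))
```

Writing the predictors out explicitly instead of `Y ~ .` guards against an extra column (for example a row ID from the export) being pulled into the model.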
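The estimated regression function is Ŷ = 162.8759 - 1.2103 X1 - 0.6659 X2 - 8.6130 X3.
β2 interpretation: for every one-unit increase in X2 (severity of illness), mean satisfaction Y is estimated to decrease by 0.6659 units, holding X1 and X3 fixed.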
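The "holding the other predictors fixed" reading can be checked numerically with the coefficients from the output table. This is only an illustrative sketch: the helper `y_hat` and the patient values (age 40, anxiety 2.3) are hypothetical choices, not part of the original answer.

```r
# Coefficients copied from the summary(fit) table.
b <- c(intercept = 162.8759, X1 = -1.2103, X2 = -0.6659, X3 = -8.6130)

# Hypothetical helper: estimated mean satisfaction for given predictor values.
y_hat <- function(x1, x2, x3) {
  unname(b["intercept"] + b["X1"] * x1 + b["X2"] * x2 + b["X3"] * x3)
}

# Raise X2 by one unit while keeping X1 and X3 unchanged.
delta <- y_hat(40, 51, 2.3) - y_hat(40, 50, 2.3)
print(delta)  # equals the X2 coefficient
```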
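To test whether there is a regression relationship, the hypotheses are H0: β1 = β2 = β3 = 0 versus Ha: at least one of β1, β2, β3 is not 0. We check the overall F test of the model: F = 13.01 on 3 and 19 degrees of freedom, with
p-value: 7.482e-05. This is much less than 0.05, so we reject the null hypothesis in favor of the alternative hypothesis and conclude that the model is statistically significant. The test implies that at least one of β1, β2 and β3 differs from 0; it does not say which one.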
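The decision for the overall F test can be reproduced from the reported statistic alone. The sketch below uses the F value and degrees of freedom from the output table; the variable names are illustrative.

```r
# Values reported in the summary(fit) output.
f_stat <- 13.01
df_model <- 3   # number of predictors
df_resid <- 19  # n - p = 23 - 4

# Critical value and p-value for the upper-tail F test at alpha = 0.05.
f_crit <- qf(0.95, df_model, df_resid)
p_value <- pf(f_stat, df_model, df_resid, lower.tail = FALSE)

reject_h0 <- f_stat > f_crit
print(c(f_crit = f_crit, p_value = p_value))
print(reject_h0)
```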
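For β1, the t statistic is -4.015 and the p-value is 0.00074, which is less than 0.05, so we reject H0: β1 = 0 and conclude that age (X1) is a significant predictor of satisfaction when X2 and X3 are in the model. The p-value 0.42736 belongs to X2, whose slope is not significant at the 0.05 level.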
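The t test for the X1 slope can be checked from the X1 row of the coefficient table. A minimal sketch, with illustrative variable names and the rounded values printed in the output:

```r
# X1 row of the coefficient table.
est_x1 <- -1.2103
se_x1 <- 0.3015
df_resid <- 19

# t statistic and two-sided p-value for H0: beta1 = 0.
t_stat <- est_x1 / se_x1
p_value <- 2 * pt(abs(t_stat), df_resid, lower.tail = FALSE)

print(c(t_stat = t_stat, p_value = p_value))
```

Because the inputs are rounded, the result agrees with the printed t value and p-value only up to rounding.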
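The 95% CI for each coefficient is given as
estimate ± t(0.975; 19) × SE, where SE is given in the output table and t(0.975; 19) ≈ 2.093
β1: -1.2103 ± 2.093 × 0.3015 = (-1.841, -0.579)
β2: -0.6659 ± 2.093 × 0.8210 = (-2.384, 1.052)
β3: -8.6130 ± 2.093 × 12.2413 = (-34.234, 17.008)
Each interval is a separate 95% statement about one coefficient. Only the interval for β1 excludes 0, which agrees with the t tests: X1 has a significant negative slope, while the slopes of X2 and X3 are not distinguishable from 0 at the 0.05 level.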
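A 95% interval for a regression coefficient is conventionally estimate ± t(0.975; 19) × SE, where 19 is the residual degrees of freedom. The sketch below applies that formula to the estimates and standard errors in the output table; with the fitted model object available, `confint(fit, level = 0.95)` is the usual shortcut.

```r
# Estimates and standard errors copied from the coefficient table.
est <- c(X1 = -1.2103, X2 = -0.6659, X3 = -8.6130)
se <- c(X1 = 0.3015, X2 = 0.8210, X3 = 12.2413)

# Two-sided 95% critical value on 19 residual degrees of freedom.
t_crit <- qt(0.975, 19)

ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
print(round(ci, 3))
```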
Please note that we can answer only 4 subparts of a question at a time, as per the answering guidelines.
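For readers who want to attempt the remaining part themselves, a confidence interval for mean satisfaction at given predictor values can be obtained with `predict()` and `interval = "confidence"`. This is a sketch under stated assumptions: the question text does not show the X1 value, so `x1_assumed` below is a purely hypothetical placeholder to be replaced, and X3 = 2.2 is a reading of the garbled "X3-22". The data frame name `patsat` is also illustrative.

```r
# The 23 observations as listed in the question (transcribed by hand).
patsat <- read.table(header = TRUE, text = "
Y X1 X2 X3
48 50 51 2.3
57 36 46 2.3
66 40 48 2.2
70 41 44 1.8
89 28 43 1.8
36 49 54 2.9
46 42 50 2.2
54 45 48 2.4
26 52 62 2.9
77 29 50 2.1
89 29 48 2.4
67 43 53 2.4
47 38 55 2.2
51 34 51 2.3
57 53 54 2.2
66 36 49 2
79 33 56 2.5
88 29 46 1.9
60 33 49 2.1
49 55 51 2.4
77 29 52 2.3
52 44 58 2.9
60 43 50 2.3
")

fit <- lm(Y ~ X1 + X2 + X3, data = patsat)

# HYPOTHETICAL placeholder: the X1 value is missing from the question text.
x1_assumed <- 35

new_patient <- data.frame(X1 = x1_assumed, X2 = 45, X3 = 2.2)

# interval = "confidence" targets the MEAN response at these values;
# interval = "prediction" would instead cover a single new patient.
ci_mean <- predict(fit, newdata = new_patient,
                   interval = "confidence", level = 0.95)
print(ci_mean)
```

The returned columns `lwr` and `upr` bound the mean satisfaction of all patients with those predictor values, with 95% confidence.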