Question
ANSWER NUMBER 4 - Regression
May, 2017 MATH 433/533 – Exam 3 Name ______________________
44 points
Use these data for the following questions.
1. The following data were collected on a simple random sample of 12 patients with hypertension:
Patient    y    x1    x2     x3    x4    x5   x6
      1  105    41   85.4   1.75   5.1   63   38
      2  115    42   94.2   2.10   3.8   70   19
      3  116    44   95.3   1.98   8.2   72   15
      4  117    55   94.7   2.01   5.8   73   91
      5  112    50   89.4   1.89   7.0   72   90
      6  121    42   99.5   2.25   9.3   71   17
      7  121    47   99.8   2.25   2.5   69   46
      8  110    45   90.9   1.90   6.2   66   11
      9  110    43   89.2   1.83   7.1   69   73
     10  114    46   92.7   2.07   5.6   64   39
     11  119    49   90.1   2.11   6.0   68   47
     12  122    54   95.3   2.35   7.2   73   42
y = mean arterial blood pressure (mm Hg)
x1 = age (yrs)
x2 = weight (kg)
x3 = body surface (m²)
x4 = duration of hypertension (yrs)
x5 = basal pulse
x6 = measure of stress
Regress y against all independent variables.
Give the regression equation, t-values and associated P-values for the coefficients, F and associated P-value, and
R². (12 points)
From the given data, the regression equation computed in Excel (full SUMMARY OUTPUT at the end of this document) is
Y = 33.00884 + 0.087764 x1 + 0.240108 x2 + 20.71313 x3 + 0.174008 x4 + 0.1713 x5 + 0.012416 x6
Regression coefficient    t-value     P-value
x1                        0.439069    0.678937331
x2                        0.907242    0.405878629
x3                        3.327657    0.020832445
x4                        0.50048     0.637984163
x5                        0.629043    0.556937382
x6                        0.392685    0.710734785
The P-values of all regression coefficients exceed 0.05 except that of x3, so at the 5% level we fail to reject H0 (population regression coefficient equal to zero) for every predictor except x3.
F = 14.9463 with an associated P-value of 0.004674.
Since the P-value 0.004674 < alpha = 0.05, we reject H0 and conclude that the regression is significant overall, i.e., the fitted equation explains the data well.
R² = 0.9472, so 94.72% of the variation in y is explained by the six independent variables together.
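For reference, the same fit can be reproduced in R. The following is only a minimal sketch that enters the data from the table above by hand; the data frame name dat is chosen here for illustration and is not part of the original Excel solution.
# Enter the 12 observations from the table above
dat <- data.frame(
  y  = c(105, 115, 116, 117, 112, 121, 121, 110, 110, 114, 119, 122),
  x1 = c(41, 42, 44, 55, 50, 42, 47, 45, 43, 46, 49, 54),
  x2 = c(85.4, 94.2, 95.3, 94.7, 89.4, 99.5, 99.8, 90.9, 89.2, 92.7, 90.1, 95.3),
  x3 = c(1.75, 2.10, 1.98, 2.01, 1.89, 2.25, 2.25, 1.90, 1.83, 2.07, 2.11, 2.35),
  x4 = c(5.1, 3.8, 8.2, 5.8, 7.0, 9.3, 2.5, 6.2, 7.1, 5.6, 6.0, 7.2),
  x5 = c(63, 70, 72, 73, 72, 71, 69, 66, 69, 64, 68, 73),
  x6 = c(38, 19, 15, 91, 90, 17, 46, 11, 73, 39, 47, 42)
)
# Fit the full model; summary() reports the coefficients with their t-values and
# P-values, the overall F statistic with its P-value, and R-squared
full <- lm(y ~ x1 + x2 + x3 + x4 + x5 + x6, data = dat)
summary(full)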
2. Use the regression equation obtained in Question 1 to predict y and calculate the residual for patient 7 (remember that in the table, the y-value is the observed, and the expected value is found using the respective x-values from the table and the regression equation). (10 points)
3. Select from the four independent variables x1, x2, x3, and x6 the two independent variables that produce the
highest R².
For the two variables selected, give the regression equation, t-values and associated P-values for the coefficients,
F and associated P-value, and R². (12 points)
4. Use the regression equation obtained in Question 3 to predict y and calculate the residual for patient 7 (remember that in the table, the y-value is the observed, and the expected value is found using the respective x-values from the table and the regression equation). (10 points)
Explanation / Answer
1. Multiple linear regression equation using Excel:
This is the same equation reported under Question 1 above; see the SUMMARY OUTPUT at the end for the full results.
2. Predicted value for patient 7 from the Question 1 equation (x1 = 47, x2 = 99.8, x3 = 2.25, x4 = 2.5, x5 = 69, x6 = 46):
predicted y = 33.00884 + 0.087764(47) + 0.240108(99.8) + 20.71313(2.25) + 0.174008(2.5) + 0.1713(69) + 0.012416(46) = 120.53
Residual = observed - predicted = 121 - 120.53 = 0.47
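As a hand check, the same arithmetic can be done in R using only the rounded coefficients reported above; this is an illustrative sketch, and the object names b, x7, yhat7, and resid7 are not from the original solution.
# Rounded coefficients from the Question 1 output: intercept, then x1 through x6
b <- c(33.00884, 0.087764, 0.240108, 20.71313, 0.174008, 0.1713, 0.012416)
# Patient 7: intercept term, then x1 = 47, x2 = 99.8, x3 = 2.25, x4 = 2.5, x5 = 69, x6 = 46
x7 <- c(1, 47, 99.8, 2.25, 2.5, 69, 46)
yhat7 <- sum(b * x7)    # predicted mean arterial pressure, about 120.53
resid7 <- 121 - yhat7   # observed minus predicted, about 0.47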
3. x3 and x6 produce the maximum R². Procedure in R:
# Data import (the CSV contains the table above: columns y, x1, ..., x6)
library(readr)
data <- read_csv("D:/Backup/data.csv")
library(heplots)
# Fit the full model with all six predictors
model <- lm(y ~ ., data = data)
summary(model)
# Estimate partial R-squared for each predictor
etasq(model)
# etasq points to x2 and x3 as having the largest partial R-squared, but the question
# restricts the choice to x1, x2, x3, and x6, so we fit every two-variable model from
# that set and compare their R-squared values directly:
model_new <- lm(y ~ x1 + x2, data = data)
summary(model_new)
Multiple R-squared: 0.814
model_new <- lm(y ~ x1 + x3, data = data)
summary(model_new)
Multiple R-squared: 0.9039
model_new <- lm(y ~ x1 + x6, data = data)
summary(model_new)
Multiple R-squared: 0.407
model_new <- lm(y ~ x2 + x3, data = data)
summary(model_new)
Multiple R-squared: 0.9003
model_new <- lm(y ~ x2 + x6, data = data)
summary(model_new)
Multiple R-squared: 0.7221
model_new <- lm(y ~ x3 + x6, data = data)
summary(model_new)
Multiple R-squared: 0.9061
Based on these results, we conclude that x3 and x6 produce the maximum R² of 90.61%.
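The same pairwise comparison can also be automated. A short sketch, assuming the data frame data loaded in the import step above (the helper names candidates, pairs, and rsq are illustrative only):
# Compare R-squared across all two-variable models drawn from x1, x2, x3, x6
candidates <- c("x1", "x2", "x3", "x6")
pairs <- combn(candidates, 2, simplify = FALSE)
rsq <- sapply(pairs, function(p) {
  f <- reformulate(p, response = "y")   # builds e.g. y ~ x3 + x6
  summary(lm(f, data = data))$r.squared
})
names(rsq) <- sapply(pairs, paste, collapse = " + ")
sort(rsq, decreasing = TRUE)            # x3 + x6 should appear at the top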
model:
lm(formula = y ~ x3 + x6, data = data)
Residuals:
Min 1Q Median 3Q Max
-1.8664 -1.2753 -0.1686 0.9027 3.3519
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 56.8689 6.4357 8.836 9.92e-06 ***
x3 27.9576 3.0124 9.281 6.64e-06 ***
x6 0.0282 0.0198 1.424 0.188
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.774 on 9 degrees of freedom
Multiple R-squared: 0.9061, Adjusted R-squared: 0.8853
F-statistic: 43.43 on 2 and 9 DF, p-value: 2.38e-05
4. predicted y = 56.87 + 27.96 x3 + 0.03 x6 = 56.87 + 27.96(2.25) + 0.03(46) = 121.16
Residual = observed - predicted = 121 - 121.16 = -0.16
(Using the unrounded coefficients, 56.8689 + 27.9576(2.25) + 0.0282(46) = 121.07, and the residual is about -0.07.)
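A quick check of this prediction and residual in R, assuming the data frame data from the import step above (fit36 is an illustrative name, not from the original solution):
# Refit the x3 + x6 model and predict for patient 7
fit36 <- lm(y ~ x3 + x6, data = data)
yhat7 <- predict(fit36, newdata = data.frame(x3 = 2.25, x6 = 46))  # patient 7 values of x3 and x6
resid7 <- 121 - yhat7   # observed y for patient 7 (121) minus the fitted value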
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.973237
R Square             0.947189
Adjusted R Square    0.883817
Standard Error       1.785005
Observations         12

ANOVA
              df    SS         MS         F         Significance F
Regression     6    285.7354   47.62257   14.9463   0.004674
Residual       5    15.93122   3.186244
Total         11    301.6667

              Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept     33.00884       15.10471         2.185334   0.080569   -5.81905    71.83672
x1            0.087764       0.199887         0.439069   0.678937   -0.42606    0.601591
x2            0.240108       0.264657         0.907242   0.405879   -0.44021    0.920431
x3            20.71313       6.224538         3.327657   0.020832   4.712442    36.71381
x4            0.174008       0.347682         0.50048    0.637984   -0.71974    1.067752
x5            0.1713         0.272318         0.629043   0.556937   -0.52872    0.871315
x6            0.012416       0.031619         0.392685   0.710735   -0.06886    0.093697