Question

ANSWER NUMBER 4 - Regression

May, 2017    MATH 433/533 – Exam 3    Name ______________________

44 points

Use these data for the following questions.

1. The following data were collected on a simple random sample of 12 patients with hypertension:

Patient    y     x1    x2     x3     x4    x5    x6
   1      105    41    85.4   1.75   5.1   63    38
   2      115    42    94.2   2.10   3.8   70    19
   3      116    44    95.3   1.98   8.2   72    15
   4      117    55    94.7   2.01   5.8   73    91
   5      112    50    89.4   1.89   7.0   72    90
   6      121    42    99.5   2.25   9.3   71    17
   7      121    47    99.8   2.25   2.5   69    46
   8      110    45    90.9   1.90   6.2   66    11
   9      110    43    89.2   1.83   7.1   69    73
  10      114    46    92.7   2.07   5.6   64    39
  11      119    49    90.1   2.11   6.0   68    47
  12      122    54    95.3   2.35   7.2   73    42

     y = mean arterial blood pressure (mm Hg)

     x1 = age (yrs)

     x2 = weight (kg)

     x3 = body surface (m²)

     x4 = duration of hypertension (yrs)

     x5 = basal pulse

     x6 = measure of stress

     Regress y against all independent variables.

     Give the regression equation, t-values and associated P-values for the coefficients, F and associated P-value, and R². (12 points)

From the given data, we calculate the regression equation using Excel as follows (the full Excel output is reproduced at the end of this answer).

The regression equation is y = 33.00884 + 0.087764 x1 + 0.240108 x2 + 20.71313 x3 + 0.174008 x4 + 0.1713 x5 + 0.012416 x6

Regression coefficients with their t-values and P-values (from the Excel output at the end of this answer):

   Coefficient   Estimate    t Stat     P-value
   x1            0.087764    0.439069   0.678937
   x2            0.240108    0.907242   0.405879
   x3            20.71313    3.327657   0.020832
   x4            0.174008    0.50048    0.637984
   x5            0.1713      0.629043   0.556937
   x6            0.012416    0.392685   0.710735

The P-values of all regression coefficients exceed 0.05 except that of x3.

So, at the 5% level, only the population regression coefficient of x3 is significantly different from zero; for the other predictors we fail to reject that their coefficients equal zero.

F = 14.9463 with associated P-value 0.00467.

Since the P-value 0.00467 < α = 0.05, we reject H0.

Thus we conclude that the regression as a whole is significant, i.e., the fitted equation explains the data well.

R² = 0.9472, so 94.72% of the variation in y can be explained by the six independent variables together.
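These overall figures can be checked in R; a minimal sketch, taking the F statistic and its 6 and 5 degrees of freedom from the Excel ANOVA table at the end of this answer:

# upper-tail P-value of the overall F test (6 and 5 degrees of freedom)
pf(14.9463, df1 = 6, df2 = 5, lower.tail = FALSE)   # about 0.00467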

2. Use the regression equation obtained in Question 1 to predict y and calculate the residual for patient 7 (remember that in the table, the y-value is the observed, and the expected value is found using the respective x-values from the table and the regression equation). (10 points)

3. Select from the four independent variables x1, x2, x3, and x6 the two independent variables that produce the highest R².

    For the two variables selected, give the regression equation, t-values and associated P-values for the coefficients, F and associated P-value, and R². (12 points)

4. Use the regression equation obtained in Question 3 to predict y and calculate the residual for patient 7 (remember that in the table, the y-value is the observed, and the expected value is found using the respective x-values from the table and the regression equation).   (10 points)

Patient

y

x1

x2

x3

x4

x5

x6

1

105

41

85.4

1.75

5.1

63

38

2

115

42

94.2

2.10

3.8

70

19

3

116

44

95.3

1.98

8.2

72

15

4

117

55

94.7

2.01

5.8

73

91

5

112

50

89.4

1.89

7.0

72

90

6

121

42

99.5

2.25

9.3

71

17

7

121

47

99.8

2.25

2.5

69

46

8

110

45

90.9

1.90

6.2

66

11

9

110

43

89.2

1.83

7.1

69

73

10

114

46

92.7

2.07

5.6

64

39

11

119

49

90.1

2.11

6.0

68

47

12

122

54

95.3

2.35

7.2

73

42

Explanation / Answer

1. Multiple linear regression equation using Excel:

This is the same as the equation and output given above.

2. Predicted value and residual for patient 7, full model: predicted y = 33.00884 + 0.087764(47) + 0.240108(99.8) + 20.71313(2.25) + 0.174008(2.5) + 0.1713(69) + 0.012416(46) ≈ 120.53, so the residual is 121 - 120.53 ≈ 0.47.
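The same values can be obtained in R; a minimal sketch, assuming the data have been read into the data frame `data` used in the code below (columns y and x1-x6):

# fit the full model and compute prediction and residual for patient 7
full_model <- lm(y ~ x1 + x2 + x3 + x4 + x5 + x6, data = data)
y_hat_7 <- predict(full_model, newdata = data[7, ])   # predicted value, about 120.53
data$y[7] - y_hat_7                                   # residual, about 0.47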

3. x3 and x6 produce the maximum R²:

Procedure in R:

# data import (data.csv is assumed to contain only the columns y and x1-x6,
# so that y ~ . below fits the six-predictor model)
library(readr)
data <- read_csv("D:/Backup/data.csv")
attach(data)

library(heplots)

# full model: y against all six independent variables
model <- lm(y ~ ., data = data)
summary(model)

# estimating the partial R-sq of each predictor
etasq(model)
# x2 and x3 show the largest partial R-sq here, but Question 3 restricts the
# candidates to x1, x2, x3, x6, so every two-variable subset is compared below

model_new <- lm(y ~ x1 + x2, data = data)
summary(model_new)   # Multiple R-squared: 0.814
model_new <- lm(y ~ x1 + x3, data = data)
summary(model_new)   # Multiple R-squared: 0.9039
model_new <- lm(y ~ x1 + x6, data = data)
summary(model_new)   # Multiple R-squared: 0.407
model_new <- lm(y ~ x2 + x3, data = data)
summary(model_new)   # Multiple R-squared: 0.9003
model_new <- lm(y ~ x2 + x6, data = data)
summary(model_new)   # Multiple R-squared: 0.7221
model_new <- lm(y ~ x3 + x6, data = data)
summary(model_new)   # Multiple R-squared: 0.9061

Based on these fits, x3 and x6 produce the maximum R² of 0.9061 (90.61%).
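The pairwise comparison can also be automated; a short sketch, assuming the same `data` frame, that fits every two-variable model drawn from x1, x2, x3, x6 and ranks the R² values:

# rank R-squared over every pair of the Question 3 candidates
vars  <- c("x1", "x2", "x3", "x6")
pairs <- combn(vars, 2, simplify = FALSE)
r2 <- sapply(pairs, function(p) {
  summary(lm(reformulate(p, response = "y"), data = data))$r.squared
})
names(r2) <- sapply(pairs, paste, collapse = " + ")
sort(r2, decreasing = TRUE)   # x3 + x6 should come out on top (about 0.906)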

Summary of the selected model (y ~ x3 + x6):

lm(formula = y ~ x3 + x6, data = data)

Residuals:
Min 1Q Median 3Q Max
-1.8664 -1.2753 -0.1686 0.9027 3.3519

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 56.8689 6.4357 8.836 9.92e-06 ***
x3 27.9576 3.0124 9.281 6.64e-06 ***
x6 0.0282 0.0198 1.424 0.188
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.774 on 9 degrees of freedom
Multiple R-squared: 0.9061,   Adjusted R-squared: 0.8853
F-statistic: 43.43 on 2 and 9 DF, p-value: 2.38e-05

4. Predicted y = 56.8689 + 27.9576 x3 + 0.0282 x6 = 56.8689 + 27.9576(2.25) + 0.0282(46) ≈ 121.07 (about 121.16 if the coefficients are first rounded to 56.87, 27.96, and 0.03). Residual = 121 - 121.07 ≈ -0.07.
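The same numbers can be checked in R; a minimal sketch, assuming model_new is the y ~ x3 + x6 fit shown above:

# prediction and residual for patient 7 from the reduced x3 + x6 model
y_hat_7 <- predict(model_new, newdata = data[7, ])   # about 121.07
data$y[7] - y_hat_7                                  # residual, about -0.07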

SUMMARY OUTPUT (Excel, full six-variable model from Question 1)

Regression Statistics
  Multiple R           0.973237
  R Square             0.947189
  Adjusted R Square    0.883817
  Standard Error       1.785005
  Observations         12

ANOVA
               df   SS         MS         F        Significance F
  Regression    6   285.7354   47.62257   14.9463  0.004674
  Residual      5   15.93122   3.186244
  Total        11   301.6667

             Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
  Intercept  33.00884       15.10471         2.185334   0.080569   -5.81905    71.83672
  x1         0.087764       0.199887         0.439069   0.678937   -0.42606    0.601591
  x2         0.240108       0.264657         0.907242   0.405879   -0.44021    0.920431
  x3         20.71313       6.224538         3.327657   0.020832    4.712442   36.71381
  x4         0.174008       0.347682         0.50048    0.637984   -0.71974    1.067752
  x5         0.1713         0.272318         0.629043   0.556937   -0.52872    0.871315
  x6         0.012416       0.031619         0.392685   0.710735   -0.06886    0.093697