Use the data in Problem 22 and develop a simple regression model IN EXCEL ONLY t
ID: 3390775 • Letter: U
Question
Use the data in Problem 22 and develop a simple regression model IN EXCEL ONLY to predict selling price of a house based on just the square footage. Compare this model with the previous multiple regression model. Which one is better? Should the number of bedrooms be included in the model? Why or why not?
Selling Price ($)
Square Footage
Bedrooms
Age (Years)
84000
1670
2
30
79000
1339
2
25
91400
1712
3
30
120000
1840
3
40
127500
2300
3
18
132500
2234
3
30
145000
2311
3
19
164000
2377
3
7
155000
2736
4
10
168000
2500
3
1
172500
2500
4
3
174000
2479
3
3
175000
2400
3
1
177500
3124
4
0
184000
2500
3
2
195500
4062
4
10
195000
2854
3
3
Selling Price ($)
Square Footage
Bedrooms
Age (Years)
84000
1670
2
30
79000
1339
2
25
91400
1712
3
30
120000
1840
3
40
127500
2300
3
18
132500
2234
3
30
145000
2311
3
19
164000
2377
3
7
155000
2736
4
10
168000
2500
3
1
172500
2500
4
3
174000
2479
3
3
175000
2400
3
1
177500
3124
4
0
184000
2500
3
2
195500
4062
4
10
195000
2854
3
3
Explanation / Answer
The regression outpult can be obtained from excel using the "Data Analysis" tab.
The regression output of the selling price based only on the square footage is as shown below -
So, the R-square value which is also known as coefficient of determination is approximately = 0.7.
The regression output of the selling price based on all the variables is as shown below -
The coefficient of determination in this case is approximately = 0.87.
The coefficient of determination tells us about the amount of variation in the dependent variable explained by the explanatory (independent) variables. So, a higher R-square value represents a better model.
As the second model has higher R-square value, so this is a better model. So, we should consider other variables in the model as well.
The regression output of the "Selling Price" based on the "Bedrooms" and "Square Footage" as the independent variable is as shown below-
We can see that the p-value of the coefficient of the variable "Bedroom" is much greater than the significance level of 0.05. And also there is hardly any variation in the R-square value, so we would conclude that there is no significance of the variable "Bedrooms" in the regression model.
So, the number of bedrooms should not be included in the model.
SUMMARY OUTPUT Regression Statistics Multiple R 0.83661386 R Square 0.69992275 Adjusted R Square 0.6799176 Standard Error 21367.3049 Observations 17 ANOVA df SS MS F Significance F Regression 1 15973783607 1.6E+10 34.98713 2.8378E-05 Residual 15 6848425804 4.57E+08 Total 16 22822209412 Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0% Intercept 26499.0169 21415.37164 1.237383 0.234966 -19146.767 72144.801 -19146.767 72144.80102 Square Footage 51.0385635 8.628678588 5.914992 2.84E-05 32.6469705 69.430157 32.6469705 69.43015657Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.