Question

Wal-Mart is the second largest retailer in the world. The data holds monthly data on Wal-Mart’s revenue, along with several possibly related economic variables.

a. Develop a linear regression model to predict Wal-Mart revenue, using CPI as the only independent variable.

b. Develop a linear regression model to predict Wal-Mart revenue, using Personal Consumption as the only independent variable.

c. Develop a linear regression model to predict Wal-Mart revenue, using Retail Sales Index as the only independent variable.

d. Which of these three models is the best? Use R-square value, Significance F values and other appropriate criteria to explain your answer.

Identify and remove the four cases corresponding to December revenue.

e. Develop a linear regression model to predict Wal-Mart revenue, using CPI as the only independent variable.

f. Develop a linear regression model to predict Wal-Mart revenue, using Personal Consumption as the only independent variable.

g. Develop a linear regression model to predict Wal-Mart revenue, using Retail Sales Index as the only independent variable.

h. Which of these three models is the best? Use R-square values and Significance F values to explain your answer.

i. Comparing the results of parts (d) and (h), which of these two models is better? Use R-square values, Significance F values and other appropriate criteria to explain your answer.

Date        Wal Mart Revenue  CPI      Personal Consumption  Retail Sales Index  December
11/28/2003  14.764            552.7    7868495               301337              0
12/30/2003  23.106            552.1    7885264               357704              1
1/30/2004   12.131            554.9    7977730               281463              0
2/27/2004   13.628            557.9    8005878               282445              0
3/31/2004   16.722            561.5    8070480               319107              0
4/29/2004   13.98             563.2    8086579               315278              0
5/28/2004   14.388            566.4    8196516               328499              0
6/30/2004   18.111            568.2    8161271               321151              0
7/27/2004   13.764            567.5    8235349               328025              0
8/27/2004   14.296            567.6    8246121               326280              0
9/30/2004   17.169            568.7    8313670               313444              0
10/29/2004  13.915            571.9    8371605               319639              0
11/29/2004  15.739            572.2    8410820               324067              0
12/31/2004  26.177            570.1    8462026               386918              1
1/21/2005   13.17             571.2    8469443               293027              0
2/24/2005   15.139            574.5    8520687               294892              0
3/30/2005   18.683            579      8568959               338969              0
4/29/2005   14.829            582.9    8654352               335626              0
5/25/2005   15.697            582.4    8644646               345400              0
6/28/2005   20.23             582.6    8724753               351068              0
7/28/2005   15.26             585.2    8833907               351887              0
8/26/2005   15.709            588.2    8825450               355897              0
9/30/2005   18.618            595.4    8882536               333652              0
10/31/2005  15.397            596.7    8911627               336662              0
11/28/2005  17.384            592      8916377               344441              0
12/30/2005  27.92             589.4    8955472               406510              1
1/27/2006   14.555            593.9    9034368               322222              0
2/23/2006   18.684            595.2    9079246               318184              0
3/31/2006   16.639            598.6    9123848               366989              0
4/28/2006   20.17             603.5    9175181               357334              0
5/25/2006   16.901            606.5    9238576               380085              0
6/30/2006   21.47             607.8    9270505               373279              0
7/28/2006   16.542            609.6    9338876               368611              0
8/29/2006   16.98             610.9    9352650               382600              0
9/28/2006   20.091            607.9    9348494               352686              0
10/20/2006  16.583            604.6    9376027               354740              0
11/24/2006  18.761            603.6    9410758               363468              0
12/29/2006  28.795            604.5    9478531               424946              1
1/26/2007   20.473            606.348  9540335               332797              0

Explanation / Answer

> #Open the Excel sheet containing the entire data set.
> #Select only the numbers in the column "Wal Mart Revenue"
> #(exclude the header cell "Wal Mart Revenue") and press "Ctrl+C".
> #Then run the command below in R to read the copied values
> #into the variable rev. Repeat the copy step for each remaining
> #column (CPI, Personal Consumption, Retail Sales Index, December)
> #immediately before its own scan("clipboard") call.
> rev=scan("clipboard")
Read 39 items
> head(rev)
[1] 14.764 23.106 12.131 13.628 16.722 13.980
> cpi=scan("clipboard")
Read 39 items
> head(cpi)
[1] 552.7 552.1 554.9 557.9 561.5 563.2
> cons=scan("clipboard")
Read 39 items
> head(cons)
[1] 7868495 7885264 7977730 8005878 8070480 8086579
> retail=scan("clipboard")
Read 39 items
> head(retail)
[1] 301337 357704 281463 282445 319107 315278
> dec=scan("clipboard")
Read 39 items
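
The clipboard approach above depends on copying each column by hand in the right order. As a hedged alternative, here is a minimal sketch that reads the table from text into one data frame. It uses only the first four rows of the question's table as an excerpt, and the column names (`date`, `revenue`, `cpi`, `consumption`, `retail`, `december`) are illustrative assumptions, not names from the original sheet.

```r
# Excerpt only: the first four rows of the data table in the question.
# Column names are illustrative assumptions.
raw <- "date revenue cpi consumption retail december
11/28/2003 14.764 552.7 7868495 301337 0
12/30/2003 23.106 552.1 7885264 357704 1
1/30/2004 12.131 554.9 7977730 281463 0
2/27/2004 13.628 557.9 8005878 282445 0"

# Read the whitespace-separated text into a data frame
walmart <- read.table(text = raw, header = TRUE, stringsAsFactors = FALSE)

# Convert the month/day/year strings to Date objects
walmart$date <- as.Date(walmart$date, format = "%m/%d/%Y")

# Row positions of the December cases within this excerpt
dec_rows <- which(walmart$december == 1)

str(walmart)
```

With the full 39-row table in `raw`, a model such as `lm(revenue ~ cpi, data = walmart)` would play the role of `fit1` in the transcript.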
> #a:
> fit1=lm(rev~cpi)
> summary(fit1)

Call:
lm(formula = rev ~ cpi)

Residuals:
Min 1Q Median 3Q Max
-3.674 -2.379 -1.697 1.055 10.015

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -24.40854 19.24848 -1.268 0.2127
cpi 0.07179 0.03296 2.178 0.0358 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.689 on 37 degrees of freedom
Multiple R-squared: 0.1137, Adjusted R-squared: 0.08972
F-statistic: 4.745 on 1 and 37 DF, p-value: 0.03583

> #b:
> fit2=lm(rev~cons)
> summary(fit2)

Call:
lm(formula = rev ~ cons)

Residuals:
Min 1Q Median 3Q Max
-3.908 -2.363 -1.537 1.034 9.696

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -8.895e+00 1.014e+01 -0.877 0.3860
cons 3.028e-06 1.161e-06 2.608 0.0131 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.602 on 37 degrees of freedom
Multiple R-squared: 0.1553, Adjusted R-squared: 0.1324
F-statistic: 6.8 on 1 and 37 DF, p-value: 0.01307

> #c:
> fit3=lm(rev~retail)
> summary(fit3)

Call:
lm(formula = rev ~ retail)

Residuals:
Min 1Q Median 3Q Max
-4.3612 -2.0904 0.0569 1.7792 4.4392

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.380e+01 4.456e+00 -3.098 0.00371 **
retail 9.186e-05 1.302e-05 7.056 2.39e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.559 on 37 degrees of freedom
Multiple R-squared: 0.5737, Adjusted R-squared: 0.5621
F-statistic: 49.79 on 1 and 37 DF, p-value: 2.388e-08

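> #d:
> #The model with Retail Sales Index as the only independent variable
> #has the largest R-squared value (0.5737) and the largest F-statistic (49.79).
> #Its Significance F (the p-value of the F-test) is the smallest: 2.388e-08,
> #compared with 0.03583 for CPI and 0.01307 for Personal Consumption.
> #It also has the smallest residual standard error (2.559).
> #Hence the model with Retail Sales Index as the only independent variable is the best.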
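
The F-statistic and the "Significance F" that the question asks about are two different numbers: the first is the test statistic, the second is its p-value. The sketch below recomputes both from the R-squared reported for the Retail Sales Index model. It assumes the rounded values printed in the transcript (0.5737 and t = 7.056), so the results agree with the printed output only to rounding.

```r
# Values copied from the summary(fit3) output (rounded as printed)
r2 <- 0.5737   # Multiple R-squared
n  <- 39       # number of monthly observations
k  <- 1        # number of predictors

# F-statistic implied by R-squared:
# F = (R^2 / k) / ((1 - R^2) / (n - k - 1))
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

# "Significance F" is the upper-tail probability of that F value
signif_f <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)

# In simple regression the F-statistic equals the squared slope t value
t_slope <- 7.056

c(f_stat = f_stat, t_squared = t_slope^2, signif_f = signif_f)
```

A larger F-statistic therefore goes with a smaller Significance F, which is why the two criteria rank single-predictor models on the same sample in the same order as R-squared.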
> #To identify and remove the four cases corresponding to December revenue
> ind=which(dec==1)
> ind
[1] 2 14 26 38
> rev1=rev[-ind]
> cpi1=cpi[-ind]
> cons1=cons[-ind]
> retail1=retail[-ind]
> length(rev1)
[1] 35
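
The removal step above uses negative indexing with `which()`. The sketch below shows an equivalent logical mask and compares average revenue in December with the other months, which is the reason the December cases stand out. The `month` vector is an assumption derived from the dates in the question (November 2003 through January 2007), and `revenue` is used instead of `rev` only to avoid shadowing the base function of that name.

```r
# Wal-Mart revenue column from the question's data table
revenue <- c(14.764, 23.106, 12.131, 13.628, 16.722, 13.98, 14.388, 18.111,
             13.764, 14.296, 17.169, 13.915, 15.739, 26.177, 13.17, 15.139,
             18.683, 14.829, 15.697, 20.23, 15.26, 15.709, 18.618, 15.397,
             17.384, 27.92, 14.555, 18.684, 16.639, 20.17, 16.901, 21.47,
             16.542, 16.98, 20.091, 16.583, 18.761, 28.795, 20.473)

# Calendar month of each row: Nov 2003, Dec 2003, three full years, Jan 2007
month <- c(11, 12, rep(1:12, times = 3), 1)
dec <- as.integer(month == 12)

# Two equivalent ways to drop the December cases
ind <- which(dec == 1)
by_index <- revenue[-ind]      # negative indexing, as in the transcript
by_mask  <- revenue[dec != 1]  # logical mask

# Average revenue in December versus all other months
mean_dec   <- mean(revenue[dec == 1])
mean_other <- mean(by_mask)

# Edge case: if no row matches, negative indexing drops everything,
# while the logical mask keeps every row
none <- which(dec == 2)
length(revenue[-none])
length(revenue[dec != 2])
```

The logical mask is the safer habit because it still behaves sensibly when the condition matches no rows.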
>
> #e:
> fit4=lm(rev1~cpi1)
> summary(fit4)

Call:
lm(formula = rev1 ~ cpi1)

Residuals:
Min 1Q Median 3Q Max
-2.7317 -1.4797 -0.6024 1.4513 3.9027

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -33.13422 10.24266 -3.235 0.00277 **
cpi1 0.08490 0.01752 4.845 2.91e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.827 on 33 degrees of freedom
Multiple R-squared: 0.4157, Adjusted R-squared: 0.398
F-statistic: 23.48 on 1 and 33 DF, p-value: 2.907e-05

> #f:
> fit5=lm(rev1~cons1)
> summary(fit5)

Call:
lm(formula = rev1 ~ cons1)

Residuals:
Min 1Q Median 3Q Max
-2.8756 -1.4322 -0.5687 1.5765 3.7409

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.004e+01 5.619e+00 -1.787 0.0832 .
cons1 3.041e-06 6.435e-07 4.725 4.13e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.846 on 33 degrees of freedom
Multiple R-squared: 0.4036, Adjusted R-squared: 0.3855
F-statistic: 22.33 on 1 and 33 DF, p-value: 4.133e-05

> #g:
> fit6=lm(rev1~retail1)
> summary(fit6)

Call:
lm(formula = rev1 ~ retail1)

Residuals:
Min 1Q Median 3Q Max
-2.3679 -1.6754 -0.9117 1.8869 4.0977

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.013e-01 4.298e+00 -0.140 0.889582
retail1 5.101e-05 1.280e-05 3.985 0.000351 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.964 on 33 degrees of freedom
Multiple R-squared: 0.3248, Adjusted R-squared: 0.3044
F-statistic: 15.88 on 1 and 33 DF, p-value: 0.0003515

> #h:
> #After removing the four cases corresponding to December revenue,
> #the model with CPI as the only independent variable is the best model:
> #it has the largest R-square value (0.4157), the largest F-statistic (23.48),
> #and the smallest Significance F (p-value = 2.907e-05).
>
> #i:
> summary(fit3)

Call:
lm(formula = rev ~ retail)

Residuals:
Min 1Q Median 3Q Max
-4.3612 -2.0904 0.0569 1.7792 4.4392

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.380e+01 4.456e+00 -3.098 0.00371 **
retail 9.186e-05 1.302e-05 7.056 2.39e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.559 on 37 degrees of freedom
Multiple R-squared: 0.5737, Adjusted R-squared: 0.5621
F-statistic: 49.79 on 1 and 37 DF, p-value: 2.388e-08

> summary(fit4)

Call:
lm(formula = rev1 ~ cpi1)

Residuals:
Min 1Q Median 3Q Max
-2.7317 -1.4797 -0.6024 1.4513 3.9027

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -33.13422 10.24266 -3.235 0.00277 **
cpi1 0.08490 0.01752 4.845 2.91e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.827 on 33 degrees of freedom
Multiple R-squared: 0.4157, Adjusted R-squared: 0.398
F-statistic: 23.48 on 1 and 33 DF, p-value: 2.907e-05

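> #The larger the R-square value and the smaller the Significance F, the better the fit.
> #By these criteria the model with Retail Sales Index as the only independent variable,
> #fitted without removing the four December cases, is the better model
> #(R-square = 0.5737, F-statistic = 49.79, Significance F = 2.388e-08, versus
> #R-square = 0.4157, F-statistic = 23.48, Significance F = 2.907e-05 for the CPI model).
> #Caveat: the two models are fitted to different samples (39 vs 35 months), and the
> #CPI model without December has the smaller residual standard error (1.827 vs 2.559).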
>
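
To make the comparisons in parts (d), (h) and (i) easier to check, the sketch below collects R-squared, residual standard error, F-statistic and Significance F for all six models in one table. The helper `fit_stats` and the variable names are hypothetical, introduced only for this illustration, and the data vectors are copied from the table in the question.

```r
# Columns copied from the question's data table (39 monthly rows)
revenue <- c(14.764, 23.106, 12.131, 13.628, 16.722, 13.98, 14.388, 18.111,
             13.764, 14.296, 17.169, 13.915, 15.739, 26.177, 13.17, 15.139,
             18.683, 14.829, 15.697, 20.23, 15.26, 15.709, 18.618, 15.397,
             17.384, 27.92, 14.555, 18.684, 16.639, 20.17, 16.901, 21.47,
             16.542, 16.98, 20.091, 16.583, 18.761, 28.795, 20.473)
cpi <- c(552.7, 552.1, 554.9, 557.9, 561.5, 563.2, 566.4, 568.2, 567.5,
         567.6, 568.7, 571.9, 572.2, 570.1, 571.2, 574.5, 579, 582.9, 582.4,
         582.6, 585.2, 588.2, 595.4, 596.7, 592, 589.4, 593.9, 595.2, 598.6,
         603.5, 606.5, 607.8, 609.6, 610.9, 607.9, 604.6, 603.6, 604.5,
         606.348)
consumption <- c(7868495, 7885264, 7977730, 8005878, 8070480, 8086579,
                 8196516, 8161271, 8235349, 8246121, 8313670, 8371605,
                 8410820, 8462026, 8469443, 8520687, 8568959, 8654352,
                 8644646, 8724753, 8833907, 8825450, 8882536, 8911627,
                 8916377, 8955472, 9034368, 9079246, 9123848, 9175181,
                 9238576, 9270505, 9338876, 9352650, 9348494, 9376027,
                 9410758, 9478531, 9540335)
retail <- c(301337, 357704, 281463, 282445, 319107, 315278, 328499, 321151,
            328025, 326280, 313444, 319639, 324067, 386918, 293027, 294892,
            338969, 335626, 345400, 351068, 351887, 355897, 333652, 336662,
            344441, 406510, 322222, 318184, 366989, 357334, 380085, 373279,
            368611, 382600, 352686, 354740, 363468, 424946, 332797)

# Rows 2, 14, 26 and 38 are the December cases identified in the transcript
keep <- !(seq_along(revenue) %in% c(2, 14, 26, 38))

# Hypothetical helper: fit y ~ x and return the comparison criteria
fit_stats <- function(y, x) {
  s <- summary(lm(y ~ x))
  f <- s$fstatistic  # named vector: value, numdf, dendf
  c(r_squared = s$r.squared,
    resid_se  = s$sigma,
    f_stat    = unname(f["value"]),
    signif_f  = unname(pf(f["value"], f["numdf"], f["dendf"],
                          lower.tail = FALSE)))
}

# One row per model: all 39 months, then the 35 non-December months
results <- rbind(
  cpi_all            = fit_stats(revenue, cpi),
  consumption_all    = fit_stats(revenue, consumption),
  retail_all         = fit_stats(revenue, retail),
  cpi_no_dec         = fit_stats(revenue[keep], cpi[keep]),
  consumption_no_dec = fit_stats(revenue[keep], consumption[keep]),
  retail_no_dec      = fit_stats(revenue[keep], retail[keep])
)

print(signif(results, 4))
```

Within each group of three rows the sample is the same, so R-squared, F-statistic and Significance F rank the models consistently. Across the two groups the response values differ, which is why the residual standard error column is worth reading alongside R-squared.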