Question

MINI-TAB DATA FILE IS HERE: https://ufile.io/cbd72

This is from question 14.45 in the textbook. Zagat’s publishes restaurant ratings for various locations in the United States. The file Restaurants contains the Zagat rating for food, décor, service and cost per person for a sample of 50 restaurants located in a city and 50 restaurants located in a suburb. Develop a regression model to predict the cost per person, based on a variable that represents the sum of the ratings for food, décor, and service and a dummy variable concerning location (city versus suburban).

** SHOW AS MUCH WORK AS POSSIBLE, I NEED TO UNDERSTAND THIS! -- THANK-YOU **

a) State the multiple regression model.

b) Interpret the slope coefficients.

c) Predict the mean cost for a restaurant with a summated rating of 60 that is located in a city. Also find the 95% confidence interval and 95% prediction interval.

d) Is there a significant linear relationship between the cost per person and the two independent variables (summated rating and location) at the 0.05 level of significance? Follow the 5 step p-value method.

e) At the 0.05 level of significance, determine whether each independent variable makes a significant contribution to the model. **Follow the 5 step p-value method.** (please show all work)

f) Interpret the meaning of the coefficient of multiple determination.

Explanation / Answer

Part-a

The proposed regression model is

Cost = beta_0 + beta_1*Summated Rating + beta_2*Coded Location + epsilon

Here epsilon is the random error term, which is assumed to be independent and normally distributed with mean zero and constant variance.

The Minitab regression output below gives the estimated regression model:

Cost = -26.19 + 1.256 Summated Rating - 5.82 Coded Location

Regression Analysis: Cost versus Summated Rating, Coded Location

Analysis of Variance

Source             DF   Adj SS   Adj MS  F-Value  P-Value
Regression          2   7798.0  3899.01    46.74    0.000
  Summated Rating   1   7116.8  7116.81    85.31    0.000
  Coded Location    1    846.6   846.62    10.15    0.002
Error              97   8092.1    83.42
  Lack-of-Fit      43   3497.9    81.35     0.96    0.557
  Pure Error       54   4594.2    85.08
Total              99  15890.1

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
9.13365  49.07%     48.02%      46.00%

Coefficients

Term               Coef  SE Coef  T-Value  P-Value   VIF
Constant         -26.19     8.06    -3.25    0.002
Summated Rating   1.256    0.136     9.24    0.000  1.00
Coded Location    -5.82     1.83    -3.19    0.002  1.00

Regression Equation

Cost = -26.19 + 1.256 Summated Rating - 5.82 Coded Location

Part-b

The coefficient of summated rating is 1.256. Holding location constant, each one-unit increase in the summated rating for food, décor, and service is associated with an average increase of $1.256 in cost per person.

The coefficient of location is -5.82. Holding the summated rating constant, a restaurant in the city (Coded Location = 1) costs on average $5.82 less per person than a restaurant in the suburbs.
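The two interpretations can be checked numerically with a short Python sketch. It uses the rounded coefficients from the Minitab output; the function name `predicted_cost` is hypothetical, and coding the suburb as 0 is an assumption (the answer only states that city is coded 1).

```python
# Rounded coefficients as reported in the Minitab output.
B0, B_RATING, B_LOCATION = -26.19, 1.256, -5.82

def predicted_cost(summated_rating, coded_location):
    """Fitted mean cost per person.

    coded_location: 1 = city (as in Part-c), 0 = suburb (assumed coding).
    """
    return B0 + B_RATING * summated_rating + B_LOCATION * coded_location

# One extra rating point with location held fixed: the change is the rating slope.
rating_effect = predicted_cost(61, 1) - predicted_cost(60, 1)

# City versus suburb at the same rating: the difference is the location coefficient.
location_effect = predicted_cost(60, 1) - predicted_cost(60, 0)

print(round(rating_effect, 3), round(location_effect, 2))
```

Because the model has no interaction term, these two differences are the same at any rating and in either location.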

Part-c

From the following Minitab results, the predicted mean cost for a restaurant with a summated rating of 60 that is located in the city (location = 1) is $43.37.

95% confidence interval = ($40.7877, $45.9457)

95% prediction interval = ($25.0564, $61.6770)

    Fit   SE Fit              95% CI              95% PI
43.3667  1.29942  (40.7877, 45.9457)  (25.0564, 61.6770)
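To see where the two intervals come from, the sketch below rebuilds them from the Fit, SE Fit, and S values in the Minitab output. The critical value `t_crit` is an assumed table lookup for t(0.975) with 97 error degrees of freedom, so the last digit may differ slightly from Minitab's.

```python
import math

fit = 43.3667      # Fit from the Minitab prediction output
se_fit = 1.29942   # SE Fit from the same output
s = 9.13365        # S (standard error of the regression) from the Model Summary
t_crit = 1.9847    # ASSUMPTION: table value of t(0.975, df = 97)

# Confidence interval for the mean cost: fit +/- t * SE(fit)
ci_half = t_crit * se_fit
ci = (fit - ci_half, fit + ci_half)

# Prediction interval for one restaurant: add the error variance s^2
pi_half = t_crit * math.sqrt(s**2 + se_fit**2)
pi = (fit - pi_half, fit + pi_half)

print("95% CI:", tuple(round(x, 2) for x in ci))
print("95% PI:", tuple(round(x, 2) for x in pi))
```

The prediction interval is much wider because it covers the cost of a single restaurant, not just the uncertainty in the estimated mean.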

Part-d

Step-1: We test the null hypothesis H0: beta_1 = beta_2 = 0

versus the alternative hypothesis H1: at least one of beta_1 and beta_2 is different from zero.

Step-2: We choose level of significance alpha=0.05.

Step-3: From the following ANOVA table of the regression in Minitab, the test statistic is F(2, 97) = 46.74.

Analysis of Variance

Source             DF   Adj SS   Adj MS  F-Value  P-Value
Regression          2   7798.0  3899.01    46.74    0.000
  Summated Rating   1   7116.8  7116.81    85.31    0.000
  Coded Location    1    846.6   846.62    10.15    0.002
Error              97   8092.1    83.42
  Lack-of-Fit      43   3497.9    81.35     0.96    0.557
  Pure Error       54   4594.2    85.08
Total              99  15890.1

Step-4: The p-value of the ANOVA F-test is reported as 0.000 (that is, less than 0.001), which is less than 0.05, so we have enough evidence to reject the null hypothesis.

Step-5: We conclude that there is a significant linear relationship between the cost per person and the two independent variables (summated rating and location) at the 0.05 level of significance.
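The F statistic in Step-3 can be reproduced from the sums of squares and degrees of freedom in the ANOVA table. This is only an arithmetic check on the reported output, not a refit of the model.

```python
# Values copied from the ANOVA table.
ss_regression, df_regression = 7798.0, 2
ss_error, df_error = 8092.1, 97

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

# Overall F statistic for H0: beta_1 = beta_2 = 0
f_stat = ms_regression / ms_error

print(round(ms_regression, 2), round(ms_error, 2), round(f_stat, 2))
```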

Part-e

Step-1: We test the null hypotheses H0: beta_1 = 0 and H0: beta_2 = 0

versus the alternative hypotheses H1: beta_1 ≠ 0 and H1: beta_2 ≠ 0

Step-2: We choose level of significance alpha=0.05.

Step-3: From the following coefficients table of the regression in Minitab, the test statistics are

For summated rating t(97) = 9.24

For location t(97) = -3.19

Coefficients

Term               Coef  SE Coef  T-Value  P-Value   VIF
Constant         -26.19     8.06    -3.25    0.002
Summated Rating   1.256    0.136     9.24    0.000  1.00
Coded Location    -5.82     1.83    -3.19    0.002  1.00

Step-4:

For summated rating, p-value = 0.000 (that is, less than 0.001)

For location, p-value = 0.002

As the p-values for both coefficients are less than 0.05, we reject both null hypotheses.

Step-5: We conclude that each of the variables, summated rating and location, makes a significant contribution to the model for cost at the 0.05 level of significance.
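Each t statistic is the coefficient divided by its standard error. The sketch below recomputes them from the rounded values in the coefficients table, so the results can differ from Minitab's in the second decimal place. The critical value `t_crit` is an assumed table lookup for t(0.975) with 97 degrees of freedom, used here as an equivalent check to comparing the p-values with 0.05.

```python
# (coefficient, standard error) pairs from the Minitab coefficients table.
coefficients = {
    "Summated Rating": (1.256, 0.136),
    "Coded Location": (-5.82, 1.83),
}
t_crit = 1.9847  # ASSUMPTION: table value of t(0.975, df = 97)

t_stats = {}
for term, (coef, se) in coefficients.items():
    t_stats[term] = coef / se  # test statistic for H0: beta_j = 0
    significant = abs(t_stats[term]) > t_crit
    print(f"{term}: t = {t_stats[term]:.2f}, significant at 0.05: {significant}")
```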

Part-f

Coefficient of multiple determination: R-square = 49.07%

This means that summated rating and location together explain 49.07% of the variation in cost per person.

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
9.13365  49.07%     48.02%      46.00%
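R-sq and R-sq(adj) can be recovered from the ANOVA table as a check. The sketch assumes n = 100 restaurants (50 city plus 50 suburban, matching the total DF of 99) and k = 2 predictors.

```python
# Values copied from the ANOVA table.
ss_regression = 7798.0
ss_total = 15890.1
n, k = 100, 2  # sample size and number of predictors

# Share of the total variation in cost explained by the model.
r_sq = ss_regression / ss_total

# Adjusted R-squared penalizes for the number of predictors.
adj_r_sq = 1 - (1 - r_sq) * (n - 1) / (n - k - 1)

print(f"R-sq = {r_sq:.2%}, R-sq(adj) = {adj_r_sq:.2%}")
```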