

Question

3.13. Refer to Copier maintenance Problem 1.20.

a. What are the alternative conclusions when testing for lack of fit of a linear regression function?

b. Perform the test indicated in part (a). Control the risk of Type I error at .05. State the decision rule and conclusion.

c. Does the test in part (b) detect other departures from regression model (2.1), such as lack of constant variance or lack of normality in the error terms? Could the results of the test for lack of fit be affected by such departures? Discuss.

Explanation / Answer

The performance of a linear regression model depends on several assumptions. If the error terms depart from normality, the model's inferences become unreliable, so the error terms should be normally distributed. This is commonly checked with the Shapiro–Wilk test, the Kolmogorov–Smirnov test, or simply with Q–Q plots. We also check that the errors are not autocorrelated, and a further basic assumption is homoscedasticity (equality of error variances, checked using Levene's test).
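These checks can be sketched with `scipy.stats` (illustrative only: the lists `sample1` and `sample2` below are stand-ins for model residuals or groups, not part of the original solution):

```python
# Sketch of the assumption checks mentioned above, using scipy.stats.
# 'sample1'/'sample2' are illustrative stand-ins for residuals or groups.
from scipy import stats

sample1 = [20, 60, 46, 41, 12, 137, 68, 89, 4, 32]
sample2 = [2, 4, 3, 2, 1, 10, 5, 5, 1, 2]

# Shapiro-Wilk test of normality (H0: data are normally distributed)
w, p_sw = stats.shapiro(sample1)

# Kolmogorov-Smirnov test against a normal distribution fitted to the data
mu, sigma = stats.norm.fit(sample1)
d, p_ks = stats.kstest(sample1, "norm", args=(mu, sigma))

# Levene test of equal variances (H0: the groups have equal variances)
w_lev, p_lev = stats.levene(sample1, sample2)

print(p_sw, p_ks, p_lev)
```

A p-value below .05 in any of these tests flags a violation of the corresponding assumption.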

We build a linear regression model for the data in Table 1, taking Variable 1 as the response variable and Variable 2 as the independent variable.

Two alternative methods for testing whether a linear association exists between the predictor x and the response y in a simple linear regression model are the F test and the t test; both tell us formally how well the model fits the data.

We fit the model using the Excel Data Analysis ToolPak.

Table 1: Data

Variable 1   Variable 2
20           2
60           4
46           3
41           2
12           1
137          10
68           5
89           5
4            1
32           2
144          9
156          10
93           6
36           3
72           4
100          8
105          7
131          8
127          10
57           4
66           5
101          7
109          7
74           5
134          9
112          7
18           2
73           5
111          7
96           6
123          8
90           5
20           2
28           2
3            1
57           4
86           5
132          9
112          7
27           1
131          9
34           2
27           2
61           4
77           5
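The least-squares fit can also be reproduced without Excel. A minimal pure-Python sketch on the Table 1 data (variable names are my own):

```python
# Least-squares fit of Variable 1 (response y) on Variable 2 (predictor x)
# for the Table 1 data, using only the Python standard library.
y = [20, 60, 46, 41, 12, 137, 68, 89, 4, 32, 144, 156, 93, 36, 72,
     100, 105, 131, 127, 57, 66, 101, 109, 74, 134, 112, 18, 73, 111,
     96, 123, 90, 20, 28, 3, 57, 86, 132, 112, 27, 131, 34, 27, 61, 77]
x = [2, 4, 3, 2, 1, 10, 5, 5, 1, 2, 9, 10, 6, 3, 4, 8, 7, 8, 10, 4,
     5, 7, 7, 5, 9, 7, 2, 5, 7, 6, 8, 5, 2, 2, 1, 4, 5, 9, 7, 1, 9,
     2, 2, 4, 5]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = sxy / sxx          # slope
b0 = ybar - b1 * xbar   # intercept

# Should reproduce the coefficients in the Excel summary output
# (slope about 15.035, intercept about -0.580).
print(f"b0 = {b0:.5f}, b1 = {b1:.5f}")
```

This matches the coefficients reported in the Excel summary output.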

Test of Homogeneity of Variances (Variable 1)

Levene Statistic   df1   df2   Sig.
109.423            1     88    .000

After performing the homogeneity test on the data in Table 1, the Levene statistic's p-value (< .05) indicates that the variances of the two variables are not homogeneous.

ANOVA (Variable 2)

Source           Sum of Squares   df   Mean Square   F         Sig.
Between Groups   113920.044       1    113920.044    124.199   .000
Within Groups    80717.244        88   917.241
Total            194637.289       89
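Part (b)'s lack-of-fit F test can be sketched in pure Python on the Table 1 data (the implementation is mine, not part of the original output; it assumes Variable 1 is the response). The test uses the replicated x levels to split SSE into a pure-error and a lack-of-fit component:

```python
# Lack-of-fit F test for the regression of Variable 1 (y) on Variable 2 (x).
# Replicated x levels give a pure-error sum of squares; the test compares
# the lack-of-fit mean square against the pure-error mean square.
from collections import defaultdict

y = [20, 60, 46, 41, 12, 137, 68, 89, 4, 32, 144, 156, 93, 36, 72,
     100, 105, 131, 127, 57, 66, 101, 109, 74, 134, 112, 18, 73, 111,
     96, 123, 90, 20, 28, 3, 57, 86, 132, 112, 27, 131, 34, 27, 61, 77]
x = [2, 4, 3, 2, 1, 10, 5, 5, 1, 2, 9, 10, 6, 3, 4, 8, 7, 8, 10, 4,
     5, 7, 7, 5, 9, 7, 2, 5, 7, 6, 8, 5, 2, 2, 1, 4, 5, 9, 7, 1, 9,
     2, 2, 4, 5]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = ybar - b1 * xbar

# Error sum of squares around the fitted line
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Pure error: variation of replicate y values around their x-level means
groups = defaultdict(list)
for xi, yi in zip(x, y):
    groups[xi].append(yi)
sspe = sum(sum((yi - sum(g) / len(g)) ** 2 for yi in g)
           for g in groups.values())
sslf = sse - sspe

c = len(groups)                       # number of distinct x levels
f_star = (sslf / (c - 2)) / (sspe / (n - c))
print(f"F* = {f_star:.3f} on ({c - 2}, {n - c}) df")
# Decision rule: conclude the linear regression function is adequate if
# F* <= F(.95; c-2, n-c); otherwise conclude lack of fit.
```

With these data F* comes out well below the F(.95; c-2, n-c) critical value, supporting the linearity of the regression function.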

Using the Excel Data Analysis tool to examine the significance of the model coefficients:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.978517
R Square            0.957495
Adjusted R Square   0.956507
Standard Error      8.913508
Observations        45

ANOVA
             df   SS          MS          F         Significance F
Regression   1    76960.423   76960.423   968.657   0.0000
Residual     43   3416.377    79.451
Total        44   80376.800

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept   -0.58016       2.803941         -0.20691   0.837059   -6.234843   5.074529
B           15.03525       0.483087         31.12326   0.000      14.061010   16.009486

R² tells us how much of the variation in the dependent variable is explained by the independent variable. The intercept is not significant (p = 0.837), meaning it contributes nothing to the model; it may be better to fit a linear regression model without an intercept.
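A regression through the origin has the closed-form least-squares slope b = Σxy / Σx². A pure-Python sketch on the Table 1 data (my own illustration, not part of the Excel output):

```python
# Regression through the origin (no intercept) for the Table 1 data:
# the least-squares slope is b = sum(x*y) / sum(x^2).
y = [20, 60, 46, 41, 12, 137, 68, 89, 4, 32, 144, 156, 93, 36, 72,
     100, 105, 131, 127, 57, 66, 101, 109, 74, 134, 112, 18, 73, 111,
     96, 123, 90, 20, 28, 3, 57, 86, 132, 112, 27, 131, 34, 27, 61, 77]
x = [2, 4, 3, 2, 1, 10, 5, 5, 1, 2, 9, 10, 6, 3, 4, 8, 7, 8, 10, 4,
     5, 7, 7, 5, 9, 7, 2, 5, 7, 6, 8, 5, 2, 2, 1, 4, 5, 9, 7, 1, 9,
     2, 2, 4, 5]

b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
print(f"no-intercept slope b = {b:.3f}")
```

The no-intercept slope stays close to the slope of the full model, consistent with the intercept being negligible.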
