Question

Data was collected from 40 employees to develop a regression model to predict the employee’s annual salary using their years with the company (Years), their starting salary (Starting), and their Gender (Male = 0, Female = 1). The results from Excel regression analysis are shown below:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.718714957
R Square            0.516551189
Standard Error      10615.63461
Observations        40

ANOVA
              df    SS            MS            F              Significance F
Regression    3     4334682510    1444894170    12.82165585    7.48476E-06
Residual      36    4056901131    112691698.1
Total         39    8391583641

              Coefficients     Standard Error    t Stat          P-value
Intercept     27946.57894      4832.438706       5.783121245     1.35464E-06
Years         1665.251558      425.0829092       3.917474737     0.000383313
Starting      0.266374185      0.12610443        2.112330112     0.041661598
Gender        -3285.541043     5617.145392       -0.584912943    0.56225464

a. What is the regression equation?

b. In testing the null hypothesis that the regression equation is not significant at the 0.05 level, what is the appropriate conclusion?

c. In testing the significance of the partial regression coefficient associated with the Years variable at the 0.05 significance level, what is the appropriate conclusion?

d. In testing the significance of the partial regression coefficient associated with the Starting variable at the 0.05 significance level, what is the appropriate conclusion?

e. In testing the significance of the partial regression coefficient associated with the Gender variable at the 0.05 significance level, what is the appropriate conclusion?

f. For a male employee with 5 years of experience and a starting salary of $30,000, what is the approximate 95% confidence interval for his annual salary?

Explanation / Answer

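a) The regression equation is Y = 27946.57894 + 1665.251558 Years + 0.266374185 Starting - 3285.541043 Gender, where Y is the predicted annual salary and Gender = 0 for a male employee, 1 for a female employee.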
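b) H0: the regression equation is not significant (all slope coefficients are zero).
Significance F = 7.48476E-06 < alpha = 0.05, so we reject H0.
Thus we conclude that the regression equation is significant at the 0.05 level,
i.e. at least one of Years, Starting, and Gender helps explain annual salary. With R Square = 0.516551189, the model explains about 52% of the variation in salary.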
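The F statistic and R Square behind part b can be reproduced from the ANOVA table. The sketch below only re-derives them from the sums of squares and degrees of freedom in the Excel output; the variable names are illustrative, and the Significance F value is copied from the output instead of being recomputed.

```python
# ANOVA values copied from the Excel output in the question.
ss_regression = 4334682510
ss_residual = 4056901131
df_regression = 3
df_residual = 36

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# The overall F statistic is the ratio of the two mean squares.
f_stat = ms_regression / ms_residual

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / (ss_regression + ss_residual)

# Significance F is taken from the output, not recomputed here.
significance_f = 7.48476e-06
alpha = 0.05
reject_h0 = significance_f < alpha

print(f"F = {f_stat:.4f}, R^2 = {r_squared:.4f}, reject H0: {reject_h0}")
```

Both values agree with the F and R Square reported by Excel, which confirms that the table was read correctly.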
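c)
P-value = 0.000383313 < alpha = 0.05, so we reject H0 (the coefficient of Years is zero).
Thus we conclude that the partial regression coefficient associated with the Years variable is significantly different from zero at the 0.05 significance level; Years is a useful predictor given the other variables in the model.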

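d)
P-value = 0.041661598 < alpha = 0.05, so we reject H0 (the coefficient of Starting is zero).
Thus we conclude that the partial regression coefficient associated with the Starting variable is significantly different from zero at the 0.05 significance level; Starting is a useful predictor given the other variables in the model.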

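e) P-value = 0.56225464 > alpha = 0.05, so we fail to reject H0 (the coefficient of Gender is zero).
There is not enough evidence that the partial regression coefficient associated with the Gender variable differs from zero at the 0.05 significance level; Gender is not a significant predictor given Years and Starting.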
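The three coefficient tests in parts c to e follow the same pattern, so they can be checked together. This is a minimal sketch that recomputes each t statistic as coefficient divided by standard error and compares the reported P-value with alpha; the dictionary layout and names are illustrative, and the P-values are copied from the Excel output.

```python
# (coefficient, standard error, P-value) copied from the Excel output.
coefficients = {
    "Years": (1665.251558, 425.0829092, 0.000383313),
    "Starting": (0.266374185, 0.12610443, 0.041661598),
    "Gender": (-3285.541043, 5617.145392, 0.56225464),
}
alpha = 0.05

t_stats = {}
significant = {}
for name, (coef, std_err, p_value) in coefficients.items():
    # The t statistic for H0: coefficient = 0 is the estimate over its standard error.
    t_stats[name] = coef / std_err
    # Reject H0 when the two-sided P-value falls below alpha.
    significant[name] = p_value < alpha
    decision = "reject H0" if significant[name] else "fail to reject H0"
    print(f"{name}: t = {t_stats[name]:.4f}, p = {p_value:.6f}, {decision}")
```

Years and Starting come out significant at the 0.05 level, while Gender does not.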

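f) The predicted value of salary (Gender = 0 for a male employee) is
salary = 27946.57894 + 1665.251558(5) + 0.266374185(30,000) - 3285.541043(0)
= 44264.06228
An approximate 95% interval uses the rule of thumb predicted value ± 2 × Standard Error, with Standard Error = 10615.63461 from the regression output:
44264.06228 ± 2(10615.63461) = 44264.06228 ± 21231.26922
so the interval is approximately $23,032.79 to $65,495.33.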
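The arithmetic for part f can be checked with a short script. The sketch assumes the common textbook approximation of predicted value ± 2 × (standard error of the estimate) for a 95% interval. That rule ignores the extra uncertainty in the estimated coefficients, and an exact interval would need information that the Excel summary does not provide. The function name `predict_salary` is hypothetical.

```python
# Coefficients and standard error of the estimate copied from the Excel output.
INTERCEPT = 27946.57894
B_YEARS = 1665.251558
B_STARTING = 0.266374185
B_GENDER = -3285.541043
STANDARD_ERROR = 10615.63461


def predict_salary(years, starting, gender):
    """Point prediction from the fitted equation (gender: male = 0, female = 1)."""
    return INTERCEPT + B_YEARS * years + B_STARTING * starting + B_GENDER * gender


# Male employee, 5 years with the company, $30,000 starting salary.
y_hat = predict_salary(years=5, starting=30000, gender=0)

# Assumption: approximate 95% interval as y_hat +/- 2 standard errors.
margin = 2 * STANDARD_ERROR
lower, upper = y_hat - margin, y_hat + margin

print(f"predicted salary: {y_hat:.2f}")
print(f"approximate 95% interval: ({lower:.2f}, {upper:.2f})")
```

The interval is wide relative to the prediction, which is consistent with an R Square of only about 0.52.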