Question

Develop a regression model to help them to predict the birthweight of a baby based on the variables in the data supplied. The model could then be used to predict birthweight to identify babies at risk in future.

By using the forward stepwise method, develop a multiple regression model to predict the birthweight.

Step 1: Gestation only

Step 2: Gestation and Smoke

Step 3: Gestation, Smoke and Pre-pregnancy Weight

Step 4: Gestation, Smoke, Pre-pregnancy Weight and Height

Step 5: Gestation, Smoke, Pre-pregnancy Weight, Height and Status

Step 6: Gestation, Smoke, Pre-pregnancy Weight, Height, Status and Age

a) Interpret the regression coefficients of all six (6) independent variables in the model obtained in Step 6, and comment on the statistical significance of each.

b) Use Excel to obtain the correlation matrix for the following variables: Gestation, Pre-pregnancy Weight, Height, Age and Birthweight. Do you think multi-collinearity is a problem in the regression model? Are the correlation coefficients consistent with the regression coefficients obtained in the model in Step 6? Discuss briefly.

c) Focusing on Steps 3 and 4, discuss fully how the introduction of Height in Step 4 affects the regression coefficient of Pre-pregnancy Weight.

d) Based on the results in (a) to (c), explain which independent variables should be included or excluded to formulate the final model. State the final model.

e) Comment on the overall adequacy of the final model.

f) Consider an indigenous mother who is a smoker, 20 years of age, and 160cm tall with a pre-pregnancy weight of 58kg and gestational age of 267 days. What is the expected weight of the child, using the final model you have developed in (d)?

Dataset: https://www.dropbox.com/s/mb19h4ddhi7u3q2/Birthweights.xlsx?dl=0

Explanation / Answer

b.

Because the independent variables are not highly correlated with each other, multicollinearity is not a problem in the regression model.
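The check described in (b) can be sketched in Python as an alternative to Excel's correlation tool. This is a minimal illustration, not the original workings: the `toy` values are made up and are not taken from Birthweights.xlsx, and the 0.8 cut-off in the hypothetical helper `flag_high` is a common rule of thumb rather than a fixed standard.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {(name_a, name_b): r} for every ordered pair of named columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

def flag_high(corr, threshold=0.8):
    """List distinct variable pairs whose |r| exceeds the threshold (assumed rule of thumb)."""
    return sorted({tuple(sorted(pair)) for pair, r in corr.items()
                   if pair[0] != pair[1] and abs(r) > threshold})

# Made-up illustrative values, NOT the real dataset.
toy = {
    "Gestation": [270, 280, 265, 290, 275],
    "Weight": [55, 62, 58, 70, 64],
    "Height": [158, 165, 160, 172, 167],
}
corr = correlation_matrix(toy)
for pair in flag_high(corr):
    print(pair, round(corr[pair], 3))
```

With the real data, any pair of independent variables flagged this way would be a candidate source of multicollinearity and worth comparing against the signs and sizes of the Step 6 coefficients.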

c.

Introducing Height in Step 4 reduces the regression coefficient of Pre-pregnancy Weight to zero.
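The mechanism behind (c) is that when two predictors are correlated, a model containing only one of them attributes part of the other's effect to it, and the coefficient shifts once both are included. The sketch below illustrates this with made-up numbers, not the real dataset: the response is built as exactly 10 × weight + 20 × height, and `solve` and `ols` are hypothetical helper names that fit least squares through the normal equations.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(predictors, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up illustrative values: weight and height are positively correlated.
weight = [50, 55, 60, 65, 70, 75]
height = [155, 160, 158, 168, 170, 175]
# Synthetic response constructed with known coefficients (10 and 20) and no noise.
y = [10 * w + 20 * h for w, h in zip(weight, height)]

weight_only = ols([weight], y)        # analogue of Step 3: Height omitted
both = ols([weight, height], y)       # analogue of Step 4: Height added

print("weight coefficient without height:", round(weight_only[1], 4))
print("weight coefficient with height:   ", round(both[1], 4))
```

In this toy data the weight coefficient is inflated when height is omitted and returns to its constructed value of 10 once height enters the model. Comparing the Step 3 and Step 4 outputs for the real data in the same way shows how much of the apparent Pre-pregnancy Weight effect is actually carried by Height.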

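d.

The Step 6 model, which includes all six independent variables, is chosen as the final model because its R-square value is 1.

e.

The model explains 100% of the variability in the dependent variable. However, an R-square of exactly 1, together with a Smoke coefficient of 1 and every other coefficient effectively zero (of order 1E-16 or smaller), means the fitted values simply reproduce the Smoke column. The regression should therefore be checked to confirm that Birthweight, not Smoke, was selected as the dependent variable.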
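Overall adequacy is usually judged with R-square and adjusted R-square, alongside the overall F-test in the Excel output. A minimal sketch of the two R-square calculations follows; the observed and fitted values and the sample size are made up for illustration and do not come from the dataset.

```python
def r_squared(observed, fitted):
    """Proportion of variability in the observed values explained by the fitted values."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Adjust R-square for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Made-up birthweights (grams) and fitted values, NOT from Birthweights.xlsx.
observed = [3100, 3400, 2900, 3600, 3300, 3000, 3500, 3200]
fitted = [3150, 3350, 2950, 3500, 3300, 3100, 3450, 3200]

r2 = r_squared(observed, fitted)
print("R-square:", round(r2, 4))
print("Adjusted R-square (k = 2):", round(adjusted_r_squared(r2, len(observed), 2), 4))
```

Adjusted R-square penalises extra independent variables, so it is the more appropriate figure when comparing the six forward-stepwise steps against each other.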

Variable                      Coefficient
Intercept                     -3.9854E-16
Gestation (days)              1.17655E-18
Smoke                         1
Pre-pregnancy weight (kg)     3.70607E-19
Height (cm)                   -5.98931E-19
Status                        -6.42909E-17
Age                           1.23166E-18
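As a sanity check, the coefficients above can be applied to the mother described in part (f). The sketch below uses the coefficients exactly as printed. Coding Smoke = 1 for a smoker and Status = 1 for an indigenous mother is an assumption, since the coding used in the dataset is not stated here.

```python
# Coefficients copied from the regression output shown above.
coefficients = {
    "Intercept": -3.9854e-16,
    "Gestation (days)": 1.17655e-18,
    "Smoke": 1.0,
    "Pre-pregnancy weight (kg)": 3.70607e-19,
    "Height (cm)": -5.98931e-19,
    "Status": -6.42909e-17,
    "Age": 1.23166e-18,
}

# Mother from part (f). Smoke = 1 and Status = 1 are ASSUMED codings.
mother = {
    "Gestation (days)": 267,
    "Smoke": 1,
    "Pre-pregnancy weight (kg)": 58,
    "Height (cm)": 160,
    "Status": 1,
    "Age": 20,
}

def predict(coefficients, values):
    """Intercept plus the sum of coefficient * value over the supplied variables."""
    return coefficients["Intercept"] + sum(
        coefficients[name] * x for name, x in values.items()
    )

prediction = predict(coefficients, mother)
print(prediction)
```

The result is approximately 1, which is the mother's Smoke value rather than a plausible birthweight. A prediction for part (f) therefore needs the coefficients from a model refitted with Birthweight as the dependent variable.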