Question
1. Galton’s Height Data
Motivated by the work of his cousin, Charles Darwin, the English scientist Francis Galton studied the degree to which human traits were passed from one generation to the next. In an 1885 study, he measured the heights of 933 adult children and their parents. The data set which Galton created included some sets of siblings. Although one of the assumptions of linear regression is that the observations should be independent, we will ignore the possible dependence of observations from children of the same family when performing our analysis. The data set can be found at the bottom of the question.
a. Convert the categorical variable gender to an indicator variable which takes on the value 0 if the gender is female and the value 1 if the gender is male.
b. Construct a linear regression model that we can use to estimate a child’s height from their mother’s height, their father’s height and their gender. Report the fitted model.
c. By how much does a male’s mean height exceed a female’s mean height for children whose parents’ heights are the same? Report both the estimate and a 95% confidence interval for the estimate.
d. Compute a 95% Prediction Interval for a female whose father’s height is 73 inches and whose mother’s height is 67 inches.
Data Set:
Explanation / Answer
A.
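# Read the comma-separated file chosen interactively.
data5 <- read.table(file.choose(), sep = ",", header = TRUE)
# Recode Gender as a 0/1 indicator. factor() orders the levels alphabetically,
# so the first label (female) becomes 0 and the second (male) becomes 1.
# Calling as.numeric() directly on a character column, which read.table
# returns by default in R 4.0 and later, would give NA instead.
data5$Gender <- as.numeric(factor(data5$Gender)) - 1
# Drop the second column, which is not used as a predictor.
working.data <- data5[, -2]
# Fit Height on all remaining columns (Gender, Father, Mother).
model1 <- lm(Height ~ ., data = working.data)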
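As a quick check of the recoding in part A, the sketch below applies a factor-based conversion to a small made-up data frame. The `toy` data frame and its values are hypothetical, and the sketch assumes gender is stored as the strings "F" and "M".

```r
# Hypothetical toy data; the real file's labels are assumed to be "F" and "M".
toy <- data.frame(Gender = c("M", "F", "F", "M"),
                  Height = c(73.2, 69.0, 65.5, 71.0))

# factor() orders levels alphabetically ("F" then "M"), so subtracting 1
# from the integer codes gives 0 for female and 1 for male.
toy$Gender <- as.numeric(factor(toy$Gender)) - 1

print(toy$Gender)
```

An equivalent and more explicit form is `as.integer(toy$Gender == "M")`, which does not depend on the alphabetical ordering of the labels.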
B.
summary(model1)
Call:
lm(formula = Height ~ ., data = working.data)
Residuals:
    Min      1Q  Median      3Q     Max
-9.5280 -1.4604  0.0996  1.4783  9.1161

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.43221    2.72802   6.023 2.46e-09 ***
Gender       5.21902    0.14188  36.784  < 2e-16 ***
Father       0.39339    0.02868  13.718  < 2e-16 ***
Mother       0.31840    0.03102  10.263  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.165 on 929 degrees of freedom
Multiple R-squared: 0.6358, Adjusted R-squared: 0.6346
F-statistic: 540.5 on 3 and 929 DF, p-value: < 2.2e-16
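Written out, the coefficient table corresponds to the fitted equation Height = 16.43221 + 5.21902 × Gender + 0.39339 × Father + 0.31840 × Mother. A minimal sketch of that equation as an R function, using the rounded coefficients reported in the summary (the names `coefs` and `predict_height` are hypothetical):

```r
# Rounded coefficients copied from the summary output.
coefs <- c(Intercept = 16.43221, Gender = 5.21902,
           Father = 0.39339, Mother = 0.31840)

# Evaluate the fitted equation for one child.
# gender: 0 for female, 1 for male; father and mother: heights in inches.
predict_height <- function(gender, father, mother) {
  unname(coefs["Intercept"] + coefs["Gender"] * gender +
           coefs["Father"] * father + coefs["Mother"] * mother)
}

# Female (gender = 0) with a 73-inch father and a 67-inch mother.
print(predict_height(0, 73, 67))
```

Because the coefficients are rounded to five decimals, the result can differ in the later decimals from what `predict()` returns on the fitted object.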
C.
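The relevant row of the coefficient table is the one for Gender:
Gender 5.21902 0.14188 36.784 < 2e-16 ***
Holding the parents' heights fixed, the estimated difference between the mean height of males and the mean height of females is the Gender coefficient, 5.21902 inches. For a female the indicator is 0, so the term contributes nothing; for a male it is 1, so the term adds 5.21902.
The standard error of the estimate is 0.14188. With 929 residual degrees of freedom the critical t value is about 1.96, so a 95% confidence interval is 5.21902 ± 1.96 × 0.14188, or approximately (4.94, 5.50) inches.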
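Part c also asks for a 95% confidence interval. It can be computed from the estimate, its standard error and the residual degrees of freedom reported in the summary. A minimal sketch using only those three numbers (the variable names are hypothetical):

```r
est <- 5.21902   # Gender coefficient from the summary
se  <- 0.14188   # its standard error
df  <- 929       # residual degrees of freedom

t_crit <- qt(0.975, df)               # two-sided 95% critical value
ci <- est + c(-1, 1) * t_crit * se    # lower and upper limits

print(round(ci, 2))
```

With the fitted object available, `confint(model1, "Gender", level = 0.95)` should give the same interval up to rounding.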
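D.
The new observation (Height is the response, so it is left out):
Gender Father Mother
0 73 67
Substituting these values into the fitted model gives a predicted height of 66.48184 inches. This is the point prediction for a female whose father is 73 inches tall and whose mother is 67 inches tall.
A 95% prediction interval is centered on this value. Using the residual standard error of 2.165 and a critical t value of about 1.96, it is roughly 66.48 ± 4.25, or about (62.2, 70.7) inches; the exact interval is slightly wider because it also accounts for the uncertainty in the estimated coefficients.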
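Part d asks for a prediction interval rather than only a point prediction. In R this comes from `predict()` with `interval = "prediction"`. Because the Galton file is not reproduced here, the sketch below fits the same kind of model to simulated data; the data frame `sim` and the coefficients used to generate it are assumptions for illustration, so the printed numbers will not match the real data.

```r
set.seed(1)
n <- 933

# Simulated stand-in for the Galton data (all values are made up).
sim <- data.frame(
  Gender = rbinom(n, 1, 0.5),
  Father = rnorm(n, mean = 69, sd = 2.5),
  Mother = rnorm(n, mean = 64, sd = 2.3)
)
sim$Height <- 16.4 + 5.2 * sim$Gender + 0.39 * sim$Father +
  0.32 * sim$Mother + rnorm(n, sd = 2.2)

fit <- lm(Height ~ Gender + Father + Mother, data = sim)

# Female (Gender = 0), father 73 inches, mother 67 inches.
new_obs <- data.frame(Gender = 0, Father = 73, Mother = 67)

# Returns a one-row matrix with columns fit, lwr and upr.
pi95 <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)
print(pi95)
```

On the real data, passing `model1` in place of `fit` with the same `new_obs` gives the interval requested in part d.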