Question
Use RStudio to solve and show all code.
The file can be found at my Dropbox link: https://www.dropbox.com/s/bhx1zjbaoqb1rla/birthweights2004.csv?dl=0
2. The birth weight of a baby is of interest to health officials because many studies have shown possible links between this weight and conditions in later life, such as obesity or diabetes. Researchers look for possible relationships between the birth weight of a baby and the age of the mother, or whether or not she smoked cigarettes or drank alcohol during her pregnancy. The Centers for Disease Control and Prevention (CDC), using data provided by the U.S. Department of Health and Human Services, National Center for Health Statistics, the Division of Vital Statistics, as well as the CDC, maintains a database on all babies born in a given year: http://wonder.cdc.gov/natality-current.html. We will investigate different samples taken from the CDC's database of births.

The dataset "birthweights.csv", linked above, consists of a random sample of 1009 babies born in North Carolina during 2004. The babies in the sample had a gestation period of at least 37 weeks and were single births (i.e. not a twin or triplet).

(a) Use the command subset(birthweights, ?????, drop = T) to create new vectors (e.g. weightbabyBoy and weightbabyGirl) from the data by selecting the column Weight (select = Weight) and extracting those rows corresponding to the males (subset = Gender == "Male") or females (subset = Gender == "Female"). The drop = T argument ensures that we have a vector object as opposed to a data frame.

(b) Use the following code to compare the ecdf's:

    plot.ecdf(???, xlab = "grams")
    plot.ecdf(???, col = , pch = , add = TRUE)
    abline(v = mean(???), lty = , col = )
    curve(pnorm(x, mean(???), sd(???)), 1000, 6000, col = "red", add = TRUE)
    legend(???, legend = , col = , pch = )

(i) for male and female newborn babies' birth weights;
(ii) for newborn babies' birth weights with mothers who used alcohol and mothers who did not.

(c) What do you see?

(d) Identify the distribution of weights for babies. Hint: Use ecdf's.

(e) Compute the mean and standard deviation of the weights of the babies born to nonsmoking mothers and smoking mothers.

(f) Is the observed mean difference in weights easily explained by chance, or is there a real difference in the mean weights of North Carolina babies born to nonsmoking and smoking mothers in 2004? (Hint: Use histograms, qq-plots, ...)

(g) Formulate the null and alternative hypotheses H0 and H1 and, using the two-sample t-test for means, answer the question in part (f).

(h) Do parts (d), (e), (f) and (g) for newborn babies' birth weights with mothers who used alcohol and mothers who did not.

Explanation / Answer
#Setting Working Directory
setwd("C:/Users/Adi/Downloads/")
getwd()
# Importing Data-set into R
mydata=read.csv("birthweights2004.csv",header = TRUE)
#Get Summary
summary(mydata)
#Extracting Specific Columns (as plain vectors, so hist() and similar functions accept them)
Weight = mydata$Weight
Gender = mydata$Gender
Smoker = mydata$Smoker
Alcohol = mydata$Alcohol
MotAge = mydata$MothersAge
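#Plotting the data-set: ECDF of Weight, one curve per Gender
#ecdfplot() comes from the latticeExtra package
library(latticeExtra)
ecdfplot(~ Weight, groups = Gender, data=mydata, auto.key=list(space='right'))
#Printing the extracted columns
Weight
Gender
Smoker
Alcohol
MotAge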
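# Histograms
hist(mydata$Weight)
# hist() needs numeric input; for a categorical column such as Smoker, plot the counts instead
barplot(table(mydata$Smoker))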
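# Checking dependencies - logistic regression (Gender is a two-level response, so family = binomial)
Regression=glm(factor(Gender)~Weight+Smoker+Alcohol+MothersAge, data=mydata, family=binomial)
summary(Regression)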
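The code above does not yet address parts (a), (b), (e) and (g) directly. The following is a minimal sketch of how those steps might look, not a definitive solution. It runs on a small synthetic data frame so that it is self-contained. The simulated values, the "Yes"/"No" coding of Smoker, and the colours and plotting symbols are all assumptions made for illustration. With the real file, birthweights would instead come from read.csv("birthweights2004.csv"), and the actual level names should be checked with summary().

```r
# Synthetic stand-in for birthweights2004.csv (all values are made up).
set.seed(1)
n <- 200
birthweights <- data.frame(
  Weight = round(rnorm(n, mean = 3400, sd = 480)),
  Gender = sample(c("Male", "Female"), n, replace = TRUE),
  # The "Yes"/"No" coding is an assumption; the real file may differ.
  Smoker = sample(c("Yes", "No"), n, replace = TRUE, prob = c(0.2, 0.8))
)

# (a) subset() with drop = TRUE returns a numeric vector, not a data frame.
weightbabyBoy  <- subset(birthweights, select = Weight,
                         subset = Gender == "Male", drop = TRUE)
weightbabyGirl <- subset(birthweights, select = Weight,
                         subset = Gender == "Female", drop = TRUE)

# (b) Overlay the two empirical CDFs and a fitted normal CDF.
#     Colours and symbols here are illustrative choices.
plot.ecdf(weightbabyBoy, xlab = "grams", main = "ECDF of birth weight")
plot.ecdf(weightbabyGirl, col = "blue", pch = 2, add = TRUE)
abline(v = mean(weightbabyBoy), lty = 2, col = "gray40")
curve(pnorm(x, mean(weightbabyBoy), sd(weightbabyBoy)),
      1000, 6000, col = "red", add = TRUE)
legend("topleft", legend = c("Male", "Female"),
       col = c("black", "blue"), pch = c(19, 2))

# (e) Mean and standard deviation of Weight within each smoking group.
group_mean <- tapply(birthweights$Weight, birthweights$Smoker, mean)
group_sd   <- tapply(birthweights$Weight, birthweights$Smoker, sd)
print(rbind(mean = group_mean, sd = group_sd))

# (g) Two-sample t-test of H0: equal group means vs H1: unequal means.
#     t.test() uses the Welch (unequal variance) version by default.
tt <- t.test(Weight ~ Smoker, data = birthweights)
print(tt)
```

If the red normal curve tracks the empirical CDF closely, a normal model for the weights is plausible, which supports using the t-test in part (g). The same pattern applies to part (h) by replacing Smoker with Alcohol.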