1. The dataset wbca (in the library faraway) comes from a study of breast cancer
Question
1. The dataset wbca (in the library faraway) comes from a study of breast cancer in Wisconsin. There are 681 cases of potentially cancerous tumors of which 238 are actually malignant. Determining whether a tumor is really malignant is traditionally determined by an invasive surgical procedure. The purpose of this study was to determine whether a new procedure called fine needle aspiration, which draws only a small sample of tissue, could be effective in determining tumor status.
(a) Fit a binomial regression with Class as the response and the other nine variables as predictors using glm in R. Report the residual deviance and associated degrees of freedom.
(b) For the model with all nine variables in (1a), explain how you calculate in notes. Also code the steps needed to calculate from the notes as well, and present your code and the result.
(c) Read the help file for "residuals.glm". Note how you obtain different types of residuals discussed in lecture. Produce residual plots for the model in (1a), include these in your report, and comment.
(d) Use AIC as the criterion to determine the best subset of variables if you only consider models obtained by dropping one explanatory variable out at a time. (Read the help file for the step function.) Side note: this is not necessarily the optimum model if we had considered all possible subsets of the 9 variables, but that is a lot of models and beyond the scope of this particular homework! Here I just ask you to drop out one variable at a time sequentially.
Explanation / Answer
I could not find the wbca data in R, but here is R code for the binomial regression; I hope it is helpful to you. Extract the response y as the Class variable in the data, and extract the other nine regressors as x1, ..., x9. Since y is a binary response (0 or 1), the program is as follows:
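fit=lm(y~x1+x2+x3+x4+x5+x6+x7+x8+x9) # ordinary linear fit, for comparison only
plot(y~x1+x2+x3+x4+x5+x6+x7+x8+x9) # one plot of y against each predictor
scatter.smooth(x1, y) # smoothed scatter plot of y against x1; repeat for x2, ..., x9
f=glm(y~x1+x2+x3+x4+x5+x6+x7+x8+x9,family="binomial") # binomial (logistic) regression
summary(f) # reports the residual deviance and its degrees of freedom
qchisq(0.95, df.residual(f)) ## 95% chi-square quantile on the residual degrees of freedom
## to extract the values of deviance and aic the command is
f$deviance
f$aic
names(f) # gives the names of the components, like aic, deviance, fitted.values, df.residual etc.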
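Because the commands above use placeholder names (y, x1, ..., x9), here is a minimal sketch of the same fit on simulated data, showing how the residual deviance and its degrees of freedom can be pulled from the fitted object. The data frame `sim`, the sample size, and the coefficients are invented for illustration and are not the wbca data; with the real data one would presumably load it with `data(wbca, package = "faraway")` and use `Class` as the response.

```r
# Simulated stand-in for the real data (hypothetical values, not wbca itself)
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 9), nrow = n)
colnames(X) <- paste0("x", 1:9)
eta <- 0.5 + 1.0 * X[, 1] - 0.8 * X[, 2]   # only x1 and x2 affect the response here
sim <- data.frame(y = rbinom(n, size = 1, prob = plogis(eta)), X)

# Binomial (logistic) regression of y on all nine predictors
f <- glm(y ~ ., family = binomial, data = sim)

res_dev <- deviance(f)                  # residual deviance, same value as f$deviance
res_df  <- df.residual(f)               # n minus the number of estimated coefficients
crit    <- qchisq(0.95, df = res_df)    # 95% chi-square quantile on res_df degrees of freedom

cat("Residual deviance:", res_dev, "on", res_df, "degrees of freedom\n")
cat("Chi-square 0.95 quantile:", crit, "\n")
```

Note that for an ungrouped 0/1 response the residual deviance is not well approximated by a chi-square distribution, so the quantile is computed here only to mirror the qchisq line above, not as a goodness-of-fit test.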
I hope these R commands help you work through the problem and draw your conclusions from the results.
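Parts (c) and (d) of the question are not covered by the commands above. As a rough sketch, again on simulated data (the data frame `sim` and its coefficients are invented for illustration), `residuals()` on a glm object accepts a `type` argument, and `step()` performs backward elimination by AIC, dropping one variable at a time.

```r
# Simulated stand-in for the real data (hypothetical values, not wbca itself)
set.seed(2)
n <- 200
X <- matrix(rnorm(n * 9), nrow = n)
colnames(X) <- paste0("x", 1:9)
sim <- data.frame(y = rbinom(n, 1, plogis(0.5 + X[, 1] - 0.8 * X[, 2])), X)

full <- glm(y ~ ., family = binomial, data = sim)

# Different residual types (see ?residuals.glm)
r_dev  <- residuals(full, type = "deviance")   # signed square roots of deviance contributions
r_pear <- residuals(full, type = "pearson")    # (y - fitted) / sqrt(fitted * (1 - fitted))
r_resp <- residuals(full, type = "response")   # y - fitted probability

# A basic residual plot: deviance residuals against fitted probabilities
plot(fitted(full), r_dev,
     xlab = "Fitted probability", ylab = "Deviance residual")
abline(h = 0, lty = 2)

# Backward elimination by AIC, one variable dropped per step
reduced <- step(full, direction = "backward", trace = 0)
print(formula(reduced))
cat("AIC full:", AIC(full), " AIC reduced:", AIC(reduced), "\n")
```

With a binary response the residual plot typically shows two bands (one for each value of y), so it is harder to read than the equivalent plot for a linear model.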