Question

Campaign organizers for both the Republican and Democrat parties are interested in identifying individual undecided voters who would consider voting for their party in an upcoming election. The file BlueOrRed contains data on a sample of voters with tracked variables including: whether or not they are undecided regarding their candidate preference, age, whether they own a home, gender, marital status, household size, income, years of education, and whether they attend church.

Partition the data into training (50 percent), validation (30 percent), and test (20 percent) sets. Use logistic regression to classify observations as undecided (or decided) using Age, HomeOwner, Female, Married, HouseholdSize, Income, and Education as input variables and Undecided as the output variable. Perform an exhaustive-search best subset selection with the number of subsets equal to 2.

a. From the generated set of logistic regression models, select one that you believe is a good fit. Express the model as a mathematical equation relating the output variable to the input variables.

b. Increases in which variables increase the chance of a voter being undecided? Increases in which variables decrease the chance of a voter being decided?

c. Using the default cutoff value of 0.5 for your logistic regression model, what is the overall error rate on the test data?

d. Examine the decile-wise lift chart for your model on the test data. What is the first decile lift? Interpret this value.

Data (link to an Excel spreadsheet, as the data is too big to paste):

https://s3.amazonaws.com/tarynalyssa.com/chegg/BlueOrRed.xlsx

Explanation / Answer

a) The fitted model is

log(p / (1 - p)) = -4.088 - 0.00977*Age + 0.4563*HomeOwner + 1.0231*Female + 0.1767*Married + 0.1598*HouseholdSize - 0.005427*Income + 0.18135*Education

where p is the estimated probability that a voter is undecided.
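Reading the right-hand side of the equation above as the log-odds of being undecided, a predicted probability follows from the logistic function. The snippet below is a minimal sketch of that calculation using the coefficients as printed. The function name and the example voter are hypothetical, and the Income value assumes the same units as the BlueOrRed file, which are not stated here.

```python
import math

# Coefficients copied from the fitted model above.
COEFFICIENTS = {
    "Age": -0.00977,
    "HomeOwner": 0.4563,
    "Female": 1.0231,
    "Married": 0.1767,
    "HouseholdSize": 0.1598,
    "Income": -0.005427,
    "Education": 0.18135,
}
INTERCEPT = -4.088


def predicted_probability(voter):
    """Return the estimated P(Undecided = 1) for one voter (dict of inputs)."""
    # Linear predictor: intercept plus coefficient times value for each input.
    log_odds = INTERCEPT + sum(COEFFICIENTS[name] * voter[name] for name in COEFFICIENTS)
    # The logistic function maps log-odds to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical voter, made up for illustration (not a row from BlueOrRed).
example_voter = {
    "Age": 40,
    "HomeOwner": 1,
    "Female": 1,
    "Married": 0,
    "HouseholdSize": 3,
    "Income": 60,  # assumed to be in the same units as the data file
    "Education": 16,
}

p = predicted_probability(example_voter)
print(f"Estimated probability of being undecided: {p:.3f}")
print("Classified as undecided" if p >= 0.5 else "Classified as decided")
```

With the default cutoff of 0.5, a voter is classified as undecided whenever this probability is at least 0.5, which is the same as the log-odds being at least 0.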

Now to decide whether the fit is good or not:

The p-values for the deviance, Pearson, and Hosmer-Lemeshow goodness-of-fit tests are all reported as 0,

which is less than alpha = 0.05.

So there is statistical evidence to reject the null hypothesis that the model fits the data adequately,

and we may say that the fit is inadequate.

b) From the fitted model, the coefficients of HomeOwner, Female, Married, HouseholdSize, and Education are positive, so increasing these variables increases the chance of a voter being undecided.

The coefficients of Age and Income are negative, so as Age and Income increase, the chance of a voter being undecided decreases (equivalently, the chance of being decided increases).
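The direction of each effect can be read directly from the sign of its coefficient, and exponentiating a coefficient gives the multiplicative change in the odds of being undecided for a one-unit increase in that variable, holding the others fixed. The following is a small sketch of that reading, using the coefficients from part (a); the variable names in the code are illustrative only.

```python
import math

# Slope coefficients from the fitted model in part (a); the intercept is omitted.
coefficients = {
    "Age": -0.00977,
    "HomeOwner": 0.4563,
    "Female": 1.0231,
    "Married": 0.1767,
    "HouseholdSize": 0.1598,
    "Income": -0.005427,
    "Education": 0.18135,
}

# exp(coefficient) is the odds ratio for a one-unit increase in the variable.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

# Positive coefficient: odds ratio above 1, so being undecided becomes more likely.
raises_undecided = sorted(name for name, beta in coefficients.items() if beta > 0)
# Negative coefficient: odds ratio below 1, so being undecided becomes less likely.
lowers_undecided = sorted(name for name, beta in coefficients.items() if beta < 0)

for name, ratio in odds_ratios.items():
    print(f"{name:>13}: odds ratio {ratio:.4f}")
print("Increase chance of undecided:", raises_undecided)
print("Decrease chance of undecided:", lowers_undecided)
```

An odds ratio close to 1, as for Age and Income, means a one-unit change has only a small effect on the odds, although the total effect over a wide range of ages or incomes can still be noticeable.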

c)
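The overall error rate at a cutoff of 0.5 is the share of test observations whose predicted class differs from the actual class, and the first-decile lift asked about in part (d) is the undecided rate among the 10 percent of test observations with the highest predicted probabilities, divided by the undecided rate in the whole test set. The sketch below shows both calculations under stated assumptions: the probabilities and labels are made-up toy values, not results from BlueOrRed, and the function names are hypothetical.

```python
def overall_error_rate(probabilities, actual, cutoff=0.5):
    """Share of observations misclassified when predicting 1 if p >= cutoff."""
    predicted = [1 if p >= cutoff else 0 for p in probabilities]
    errors = sum(1 for pred, act in zip(predicted, actual) if pred != act)
    return errors / len(actual)


def first_decile_lift(probabilities, actual):
    """Positive rate in the top 10% by predicted probability over the overall rate."""
    # Rank observations from highest to lowest predicted probability.
    ranked = sorted(zip(probabilities, actual), key=lambda pair: pair[0], reverse=True)
    decile_size = max(1, len(ranked) // 10)
    top_decile = ranked[:decile_size]
    top_rate = sum(label for _, label in top_decile) / decile_size
    overall_rate = sum(actual) / len(actual)
    return top_rate / overall_rate


# Toy values for illustration only (1 = undecided, 0 = decided).
toy_probabilities = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.3, 0.2, 0.1, 0.1]
toy_actual = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

error_rate = overall_error_rate(toy_probabilities, toy_actual)
lift = first_decile_lift(toy_probabilities, toy_actual)
print(f"Overall error rate: {error_rate:.2f}")
print(f"First-decile lift: {lift:.2f}")
```

A first-decile lift above 1 means that the voters the model ranks as most likely to be undecided are undecided more often than a randomly chosen voter from the test set.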
