


Question

Hi, I need help with Stats HW (R Studio)

Here is the data we are using:

```{r load-data, echo=FALSE}
real.estate <- read.csv("http://www.stat.cmu.edu/~cshalizi/mreg/15/hw/08/real-estate.csv")
```

Questions to answer:

1) Write a few sentences (3-8) describing the goal of this assignment. Summarize what you are to do in your own words.

2) Why would someone want to predict house prices from data?

3) How do people come up with the price they want?

4) What do real estate agents do?

-------------------------

Question part II

1) What is the data and where does it come from?

2) How many observations do we have?

3) What kinds of predictors are available?

4) Are there any other predictors you wish you had?

5) Which predictors are discrete?

6) Produce a pairwise scatterplot of all the continuous variables.

7) Comment on what types of relationships you see.

8) How does this plot inform whether a linear regression model is reasonable?

9) For the discrete predictors, produce boxplots examining the response against the levels of the predictor.

10) Are there any obvious outliers? Why do you say that? Should we remove them?

-------------------------------

Fixing data:

```{r example-code-eda, eval=FALSE}
pairs(price~continuous predictors, data = real.estate)
boxplot(Price/10000~Airconditioning, notch=TRUE, varwidth=TRUE, data=real.estate,
        names=c("No AC", "AC"), main="Price ($10,000) vs. Air-conditioning")
```

Please fix the code above so that it runs (it's generating errors now).
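A minimal sketch of how the EDA chunk above might be made to run. The `pairs()` line fails because `continuous predictors` is a placeholder, not valid R, so it has to be replaced with actual column names. The sketch uses a small simulated data frame (`toy`, hypothetical) so that it runs without the download. The column names `Price`, `Sqft`, and `Airconditioning` are taken from the post, the values are made up, and `AC` is a helper column introduced here for labelling.

```r
# Simulated stand-in for real.estate (values are made up, not the real data)
set.seed(1)
n <- 200
toy <- data.frame(
  Sqft = round(runif(n, 800, 4000)),
  Airconditioning = rbinom(n, 1, 0.7)   # assumed coding: 0 = no AC, 1 = AC
)
toy$Price <- 50000 + 120 * toy$Sqft + 20000 * toy$Airconditioning +
  rnorm(n, sd = 40000)

# pairs() needs real column names; a one-sided formula lists the variables.
# With the real data, extend the formula with the other continuous columns
# reported by names(real.estate).
pairs(~ Price + Sqft, data = toy)

# Label the two AC levels explicitly instead of relying on their order
toy$AC <- factor(toy$Airconditioning, levels = c(0, 1),
                 labels = c("No AC", "AC"))
boxplot(Price / 10000 ~ AC, data = toy, notch = TRUE, varwidth = TRUE,
        ylab = "Price ($10,000)",
        main = "Price ($10,000) vs. Air-conditioning")
```

To apply this to the assignment data, swap `toy` for `real.estate` after checking the column names with `names(real.estate)` and the coding of `Airconditioning` with `table(real.estate$Airconditioning)`.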

1) Do we include discrete predictors as dummies?

2) Do we need any interaction terms?

3) What is the RMSE? What do you think of it?

4) Examine a QQ-plot of the residuals. What do you see? What should you see?

5) Should we make any transformations?

6) Plot the residuals against the fitted values.

7) Plot the residuals against each of the continuous predictors.

8) Comment on the results. Should we see patterns?

9) Make boxplots of the residuals against the discrete predictors. Comment on the results. Should we see patterns?

10) Are there any outliers? Examine a Cook's distance plot.

11) Make a table displaying the predictors for the weird points.

12) Why are they weird?

---------------------------------------------

Fixing code to make it run:

```{r example-code-modelling, eval=FALSE}
# Note that this code isn't going to run or appear. In fact, it will likely
# cause errors if you make it run. To use any of this, you must figure out
# how it works.
initial.mdl <- lm(Price ~ factor(discrete.var) + more predictors,
                  data=real.estate)
qqnorm(rstandard(initial.mdl))
qqline(rstandard(initial.mdl))
# Function to plot residuals vs. a predictor
# Rather than writing the same code over and over
resid.vs.pred <- function(mdl, pred='fitted', data, standardized=TRUE, ...) {
  # Input: an lm model; the name of a predictor variable (defaults to "fitted");
  # the name of the data frame; whether to use standardized residuals;
  # other optional graphical settings
  # Output: none
  if (standardized) {
    resids <- rstandard(mdl)
  } else {
    resids <- residuals(mdl)
  }
  if (pred=="fitted") {
    preds <- fitted(mdl)
  } else {
    preds <- data[,pred]
  }
  plot(preds, resids, xlab=pred, ylab="Residuals", ...)
  abline(h=0, col="red") # Ideal
  # Guide to the eye;
  mean.spline <- smooth.spline(x=preds, y=resids, cv=TRUE)
  lines(mean.spline, col="grey")
  # pm two standard deviations (again, as a guide to the eye)
  abline(h=2*sd(resids), col="red", lty="dotted")
  abline(h=-2*sd(resids), col="red", lty="dotted")
  var.spline <- smooth.spline(x=preds, y=resids^2, cv=TRUE)
  lines(x=var.spline$x, y=mean.spline$y+2*sqrt(var.spline$y), col="grey",
        lty="dotted")
  lines(x=var.spline$x, y=mean.spline$y-2*sqrt(var.spline$y), col="grey",
        lty="dotted")
}
boxplot(rstandard(initial.mdl) ~ real.estate$Bedroom, varwidth=TRUE,
        main="Residuals vs. bedrooms")
plot(cooks.distance(initial.mdl), pch=19,
     col=ifelse(cooks.distance(initial.mdl) > 0.1, "red", "black"),
     ylab="Cook's Distance")
kable(real.estate[bad.houses,c("Price","Sqft","Bedroom","Bathroom")])
```
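A hedged sketch of how the modelling template above might be turned into something runnable. Three things stop the template from running as written: `more predictors` and `discrete.var` are placeholders rather than column names, `bad.houses` is never defined, and `kable()` requires the knitr package to be loaded. The sketch below runs on a simulated data frame (`toy`, hypothetical, with made-up values and one deliberately planted odd row); the choice of predictors and the rule used to define `bad.houses` are assumptions, not the assignment's official answer.

```r
# Simulated stand-in for real.estate (values are made up)
set.seed(2)
n <- 200
toy <- data.frame(
  Sqft = round(runif(n, 800, 4000)),
  Bedroom = sample(2:5, n, replace = TRUE),
  Bathroom = sample(1:3, n, replace = TRUE)
)
toy$Price <- 40000 + 110 * toy$Sqft + 8000 * toy$Bedroom +
  12000 * toy$Bathroom + rnorm(n, sd = 30000)
# Planted odd row for illustration: a very large house with a very low price
toy$Sqft[1] <- 4000
toy$Price[1] <- 50000

# factor() makes lm() expand each discrete predictor into dummy variables
initial.mdl <- lm(Price ~ Sqft + factor(Bedroom) + factor(Bathroom), data = toy)

# In-sample RMSE; summary(initial.mdl)$sigma is similar but divides by the
# residual degrees of freedom rather than n
rmse <- sqrt(mean(residuals(initial.mdl)^2))
print(rmse)

# QQ-plot of standardized residuals
qqnorm(rstandard(initial.mdl))
qqline(rstandard(initial.mdl))

# Residuals vs. fitted values
plot(fitted(initial.mdl), rstandard(initial.mdl),
     xlab = "Fitted values", ylab = "Standardized residuals")
abline(h = 0, col = "red")

# Residuals vs. a discrete predictor
boxplot(rstandard(initial.mdl) ~ toy$Bedroom, varwidth = TRUE,
        main = "Residuals vs. bedrooms")

# Cook's distance, with the same 0.1 cutoff the template uses for colouring
cd <- cooks.distance(initial.mdl)
plot(cd, pch = 19, col = ifelse(cd > 0.1, "red", "black"),
     ylab = "Cook's Distance")

# Assumed definition of bad.houses: rows above the 0.1 cutoff
bad.houses <- which(cd > 0.1)
print(toy[bad.houses, c("Price", "Sqft", "Bedroom", "Bathroom")])
```

In an R Markdown report, `print()` on the last line can be replaced with `knitr::kable()` to get a formatted table. The 0.1 cutoff is only the colouring threshold from the template, so it is worth looking at the plot before deciding which points count as weird.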

Thank you very much!

Explanation / Answer

1) Write a few sentences (3-8) describing the goal of this assignment. Summarize what you are to do in your own words.

Answer: The goal of this assignment is to build a model that uses the data to predict the target variable, which is the house price. I will have to work out which predictors are redundant, for example by checking their VIF values, and which are most relevant to the target variable, i.e. most strongly correlated with it.

Using these variables, we estimate the weight (coefficient) to give to each one and arrive at a regression equation.
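As a rough illustration of the VIF check and the fitted equation mentioned above, here is a minimal base-R sketch. It computes each VIF directly as 1 / (1 - R²), where R² comes from regressing that predictor on the others. The data frame `toy` is simulated and hypothetical, and the discrete predictors are treated as numeric here purely to keep the example short.

```r
# Simulated, deliberately correlated predictors (values are made up)
set.seed(3)
n <- 150
toy <- data.frame(Sqft = runif(n, 800, 4000))
toy$Bedroom <- pmax(1, round(toy$Sqft / 900 + rnorm(n, sd = 0.7)))
toy$Bathroom <- pmax(1, round(toy$Bedroom / 2 + rnorm(n, sd = 0.5)))
toy$Price <- 40000 + 110 * toy$Sqft + 8000 * toy$Bedroom +
  12000 * toy$Bathroom + rnorm(n, sd = 30000)

predictors <- c("Sqft", "Bedroom", "Bathroom")

# VIF for predictor p: regress p on the remaining predictors, then 1 / (1 - R^2)
vif <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = toy))$r.squared
  1 / (1 - r2)
})
print(vif)

# The fitted coefficients are the weights in the regression equation
fit <- lm(Price ~ Sqft + Bedroom + Bathroom, data = toy)
print(coef(fit))
```

A VIF close to 1 means a predictor is nearly uncorrelated with the others, while large values suggest redundancy. What counts as "large" is a judgment call rather than a fixed rule.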

2) Why would someone want to predict house prices from data?

Answer: We predict house prices to see whether investing in a house is likely to be worthwhile in the coming years, and whether a house with a given set of characteristics is good value.

3) How do people come up with the price they want?

Answer: They calculate it from the above equation developed from the model.
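To make "calculating the price from the equation" concrete, here is a small sketch under stated assumptions: `toy` is simulated, the model uses only `Sqft` and `Bedroom`, and `new.house` is a hypothetical house. It computes the predicted price once by plugging values into the fitted equation by hand and once with `predict()`, and the two should agree.

```r
# Simulated data (values are made up)
set.seed(4)
n <- 100
toy <- data.frame(
  Sqft = round(runif(n, 800, 4000)),
  Bedroom = sample(2:5, n, replace = TRUE)
)
toy$Price <- 40000 + 110 * toy$Sqft + 8000 * toy$Bedroom +
  rnorm(n, sd = 30000)

fit <- lm(Price ~ Sqft + Bedroom, data = toy)

# Hypothetical house to price
new.house <- data.frame(Sqft = 2000, Bedroom = 3)

# By hand: intercept + b_Sqft * 2000 + b_Bedroom * 3
by.hand <- sum(coef(fit) * c(1, new.house$Sqft, new.house$Bedroom))

# Same calculation via predict()
by.predict <- predict(fit, newdata = new.house)

print(c(by.hand = by.hand, by.predict = unname(by.predict)))
```

Adding `interval = "prediction"` to the `predict()` call also returns a range around the point estimate, which is often more useful than a single number when setting a price.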

4) What do real estate agents do?

Answer: They check the prices of houses in the surrounding area, or they can use the model built above to predict a price, and then decide whether to buy the house.
