Question

I need help coding this problem. It is about OLS regression with a categorical variable (gear). The dataset is mtcars in R.

The answer should be found without using the lm function (predictors: hp, qsec, wt, and gear as a categorical variable; response: mpg).

Thank you.

Your turn: Consider the variable gear (number of forward gears). Let's handle this variable as a categorical one. Form a new design matrix X by adding two dummy indicators for number of gears 4 and 5. Calculate a new set of coefficient estimates with these new set of predictors. Compare your results with those returned by the following call of lm():

# regression output with lm()
lm(mpg ~ hp + qsec + wt + factor(gear), data = mtcars)
## Call:
## lm(formula = mpg ~ hp + qsec + wt + factor(gear), data = mtcars)
##
## Coefficients:
##   (Intercept)             hp           qsec             wt  factor(gear)4  factor(gear)5
##      24.13327       -0.02164        0.56761       -3.66248        1.15593        2.24468

Explanation / Answer

Ans) To fit the linear regression on mtcars in R, the predictors are hp, qsec, wt, and gear, where gear is a categorical variable, and the response is mpg. For a categorical variable, we extract it from the data and count its categories. Here gear has 3 categories, so we replace it with 2 dummy variables. We then run the usual regression using the matrix formulas: first we compute the estimated regression coefficients, denoted beta_hat, and then SSRes and SST, from which we obtain the R-squared value. The R code is given below:
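The category count described above can be checked directly before building the design matrix. The following is a minimal sketch, assuming gear = 3 is the baseline level (which is what factor(gear) in the question's lm() call implies); the names gear_levels, gear_counts, gear4, gear5, and n_baseline are illustrative and not part of the original answer.

```r
# Sketch: inspect the categories of gear and build the two dummy indicators.
gear <- mtcars$gear

# Distinct values of gear, sorted in increasing order
gear_levels <- sort(unique(gear))

# Number of cars in each category
gear_counts <- table(gear)

# One indicator per non-baseline level (assumption: baseline is gear = 3)
gear4 <- as.numeric(gear == 4)
gear5 <- as.numeric(gear == 5)

# Cars with both indicators equal to 0 belong to the baseline category
n_baseline <- sum(gear4 == 0 & gear5 == 0)

print(gear_levels)
print(gear_counts)
```

With k categories, only k - 1 indicators are needed once the design matrix contains a column of 1s; adding a third indicator would make the columns linearly dependent.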

d=mtcars
d
d1=summary(d)
d2=d$gear
d2

# response

y=d$mpg

# creating 2 dummy variables for a categorical variable of 3 categories
# (gear = 3 is the baseline level, as in factor(gear) used by lm())

x14=as.numeric(d2==4)   # first dummy variable: 1 for gear = 4 and 0 otherwise
x15=as.numeric(d2==5)   # second dummy variable: 1 for gear = 5 and 0 otherwise

# other continuous predictors

x1=d$hp
x2=d$qsec
x3=d$wt

# design matrix with a leading column of 1s for the intercept

n=length(y)
x=cbind(1,x1,x2,x3,x14,x15)

# least squares estimates: beta_hat = (X'X)^(-1) X'y

beta=solve(t(x)%*%x)%*%t(x)%*%y

# hat matrix, total sum of squares and residual sum of squares

phat=x%*%solve(t(x)%*%x)%*%t(x)
id=diag(n)
sut=rep(1,n)
sstot=t(y)%*%(id-((sut%*%t(sut))/n))%*%y # the value is 1126.047
ssres=t(y)%*%(id-phat)%*%y

Rsq=1-(ssres/sstot)

# overall F test with 5 and n-6 = 26 degrees of freedom

F_obs=((sstot-ssres)/ssres)*((n-6)/5)
F_q=qf(0.95,5,n-6) # the value is 2.58679

Since the observed F statistic exceeds the 0.95 quantile of the F distribution with 5 and 26 degrees of freedom, we conclude that the regressors are jointly significant at the 5% level.
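The question also asks for a comparison with lm(). The following is a hedged sketch of that check, assuming an intercept column and indicators for gear = 4 and gear = 5 so that the columns line up with R's default coding of factor(gear). The names X, beta_manual, beta_lm, and rsq_manual are illustrative, and solve(crossprod(X), crossprod(X, y)) is used here in place of an explicit matrix inverse.

```r
# Sketch: normal-equations fit compared against lm() on mtcars.
y <- mtcars$mpg

# Design matrix: intercept, hp, qsec, wt, and indicators for gear = 4 and 5
# (assumption: gear = 3 is the baseline, as with factor(gear) in lm())
X <- cbind(
  "(Intercept)"   = 1,
  hp              = mtcars$hp,
  qsec            = mtcars$qsec,
  wt              = mtcars$wt,
  "factor(gear)4" = as.numeric(mtcars$gear == 4),
  "factor(gear)5" = as.numeric(mtcars$gear == 5)
)

# Solve (X'X) beta = X'y without forming the inverse explicitly
beta_manual <- drop(solve(crossprod(X), crossprod(X, y)))
names(beta_manual) <- colnames(X)

# Reference fit, used only for the comparison
fit <- lm(mpg ~ hp + qsec + wt + factor(gear), data = mtcars)
beta_lm <- coef(fit)

# R-squared from the residual and total sums of squares
res <- y - drop(X %*% beta_manual)
rsq_manual <- 1 - sum(res^2) / sum((y - mean(y))^2)

# Side-by-side coefficients and a numerical equality check
print(cbind(manual = beta_manual, lm = beta_lm))
print(all.equal(unname(beta_manual), unname(beta_lm), tolerance = 1e-6))
```

Choosing a different baseline (for example, indicators for gear = 3 and gear = 4) gives the same fitted values as long as the intercept is present, but the individual coefficients then no longer match the lm() output shown in the question.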
