Question

In this problem we will perform regression analysis on the sepal and petal measurements of the first 50 flowers in the Iris data. We want to use the sepal length and width and the petal length of a flower to predict its petal width. That is, the response variable is Y = iris[1:50, 4] and the explanatory variables are X = (X_1, X_2, X_3) = iris[1:50, 1:3]. Assume a linear regression model $Y = \beta_0 + \sum_{j=1}^{3} \beta_j X_j + \epsilon$, where $\epsilon$ is the error (residual) vector.

(a) Find the least squares estimate of $(\beta_0, \beta_1, \beta_2, \beta_3)$ and the sum of squared residuals.

(b) If we want to use two of the three explanatory variables to predict Y, which two should we choose? Justify your answer.

Explanation / Answer

Answer:

a)

The R output is below.

> y=iris[1:50,4]
> x1=iris[1:50,1]
> x2=iris[1:50,2]
> x3=iris[1:50,3]
> reg=lm(y~x1+x2+x3)
> reg

Call:
lm(formula = y ~ x1 + x2 + x3)

Coefficients:
(Intercept)           x1           x2           x3
   -0.29474      0.04504      0.01984      0.16913

The least squares estimates are:

b0 = -0.29474, b1 = 0.04504, b2 = 0.01984, b3 = 0.16913
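
As a cross-check, the same estimates can be obtained from the normal equations, (X'X) b = X'y, where X is the design matrix with a leading column of ones. The following is a minimal sketch, assuming the built-in iris data and a full-rank design matrix; the names X and beta_hat are illustrative and not part of the original answer.

```r
# Response: petal width of the first 50 flowers.
y <- iris[1:50, 4]

# Design matrix: a column of ones for the intercept, then
# sepal length, sepal width and petal length.
X <- cbind(1, as.matrix(iris[1:50, 1:3]))

# Solve the normal equations (X'X) b = X'y for b.
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

# Estimates in the order b0, b1, b2, b3.
print(round(drop(beta_hat), 5))
```

The values should agree with the coefficients that lm() reports, up to rounding.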

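From summary(reg), the residual standard error is 0.1002 on 46 degrees of freedom (n - p = 50 - 4). The sum of squared residuals is therefore

SSR = 46 × 0.1002^2 ≈ 0.462.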
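
The sum of squared residuals can also be computed directly from the fitted model, which avoids the rounding in the printed standard error. This is a minimal sketch, assuming the built-in iris data; the names ssr and ssr_from_rse are illustrative.

```r
# Refit the model from part (a).
y  <- iris[1:50, 4]
x1 <- iris[1:50, 1]
x2 <- iris[1:50, 2]
x3 <- iris[1:50, 3]
reg <- lm(y ~ x1 + x2 + x3)

# Sum of squared residuals, computed directly.
ssr <- sum(resid(reg)^2)

# The same quantity recovered from the residual standard error:
# SSR = sigma^2 * (n - p), where n - p is the residual degrees of freedom.
ssr_from_rse <- summary(reg)$sigma^2 * df.residual(reg)

print(ssr)
print(ssr_from_rse)
```

Both printed values should match, since the residual standard error is defined as sqrt(SSR / (n - p)).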

b) If we want to use two of the three explanatory variables to predict Y, we should use x1 and x3, because the adjusted R^2 is largest for this pair compared with the other two combinations, (x1, x2) and (x2, x3).
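
The comparison behind this choice can be reproduced by fitting all three two-variable models and tabulating their adjusted R^2. This is a minimal sketch, assuming the built-in iris data; the names fits, adj_r2, ssr and best are illustrative and not part of the original answer.

```r
y  <- iris[1:50, 4]
x1 <- iris[1:50, 1]
x2 <- iris[1:50, 2]
x3 <- iris[1:50, 3]

# Fit each of the three possible two-variable models.
fits <- list(
  "x1 + x2" = lm(y ~ x1 + x2),
  "x1 + x3" = lm(y ~ x1 + x3),
  "x2 + x3" = lm(y ~ x2 + x3)
)

# Adjusted R^2 and sum of squared residuals for each model.
adj_r2 <- sapply(fits, function(fit) summary(fit)$adj.r.squared)
ssr    <- sapply(fits, function(fit) sum(resid(fit)^2))

print(round(cbind(adj_r2, ssr), 4))

# Pair with the largest adjusted R^2.
best <- names(which.max(adj_r2))
print(best)
```

Because all three models have the same number of parameters, ranking them by adjusted R^2 is equivalent to ranking them by smallest sum of squared residuals.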