

Question

Y X1 X2 X3

48 50 51 2.3
57 36 46 2.3
66 40 48 2.2
70 41 44 1.8
89 28 43 1.8
36 49 54 2.9
46 42 50 2.2
54 45 48 2.4
26 52 62 2.9
77 29 50 2.1
89 29 48 2.4
67 43 53 2.4
47 38 55 2.2
51 34 51 2.3
57 53 54 2.2
66 36 49 2.0
79 33 56 2.5
88 29 46 1.9
60 33 49 2.1
49 55 51 2.4
77 29 52 2.3
52 44 58 2.9
60 43 50 2.3
86 23 41 1.8
43 47 53 2.5
34 55 54 2.5
63 25 49 2.0
72 32 46 2.6
57 32 52 2.4
55 42 51 2.7
59 33 42 2.0
83 36 49 1.8
76 31 47 2.0
47 40 48 2.2
36 53 57 2.8
80 34 49 2.2
82 29 48 2.5
64 30 51 2.4
37 47 60 2.4
42 47 50 2.6
66 43 53 2.3
83 22 51 2.0
37 44 51 2.6
68 45 51 2.2
59 37 53 2.1
92 28 46 1.8

9.17. Refer to Patient satisfaction Problems 6.15 and 9.9. The hospital administrator was interested to learn how the forward stepwise selection procedure and some of its variations would perform here.

a. Determine the subset of variables that is selected as best by the forward stepwise regression procedure, using F limits of 3.0 and 2.9 to add or delete a variable, respectively. Show your steps.

b. To what level of significance in any individual test is the F limit of 3.0 for adding a variable approximately equivalent here?

c. Determine the subset of variables that is selected as best by the forward selection procedure, using an F limit of 3.0 to add a variable. Show your steps.

d. Determine the subset of variables that is selected as best by the backward elimination procedure, using an F limit of 2.9 to delete a variable. Show your steps.

e. Compare the results of the three selection procedures. How consistent are these results? How do the results compare with those for all possible regressions in Problem 9.9?

Explanation / Answer

(a) The forward stepwise regression procedure is built from a sequence of ordinary multiple linear regression fits.

Once the dataset is loaded in R we can use the lm() function to regress Y on each of X1, X2 and X3 separately and compute the F* statistic for each slope. The variable with the largest F* enters the model if its F* exceeds 3.0. At each later step we fit every model that adds one more variable to the current one, for example Y = B0 + B1*X1 + B2*X2, and add the candidate with the largest partial F* if it exceeds 3.0. After each addition we test the variables already in the model and delete the one with the smallest partial F* if it falls below 2.9. The procedure stops when no variable can be added or deleted. If all three variables were retained, the final model would be Y = B0 + B1*X1 + B2*X2 + B3*X3.
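The add and delete steps can be sketched in code. The answer refers to R, but the sketch below uses plain Python with no libraries so that every step is visible. The helper names `sse`, `partial_f` and `forward_stepwise` are hypothetical, and the least-squares fit solves the normal equations directly, which is an assumption made for brevity rather than a recommended numerical method.

```python
# Patient satisfaction data copied from the question: Y X1 X2 X3 for each case.
DATA = """
48 50 51 2.3  57 36 46 2.3  66 40 48 2.2  70 41 44 1.8
89 28 43 1.8  36 49 54 2.9  46 42 50 2.2  54 45 48 2.4
26 52 62 2.9  77 29 50 2.1  89 29 48 2.4  67 43 53 2.4
47 38 55 2.2  51 34 51 2.3  57 53 54 2.2  66 36 49 2.0
79 33 56 2.5  88 29 46 1.9  60 33 49 2.1  49 55 51 2.4
77 29 52 2.3  52 44 58 2.9  60 43 50 2.3  86 23 41 1.8
43 47 53 2.5  34 55 54 2.5  63 25 49 2.0  72 32 46 2.6
57 32 52 2.4  55 42 51 2.7  59 33 42 2.0  83 36 49 1.8
76 31 47 2.0  47 40 48 2.2  36 53 57 2.8  80 34 49 2.2
82 29 48 2.5  64 30 51 2.4  37 47 60 2.4  42 47 50 2.6
66 43 53 2.3  83 22 51 2.0  37 44 51 2.6  68 45 51 2.2
59 37 53 2.1  92 28 46 1.8
"""
vals = [float(v) for v in DATA.split()]
ROWS = [vals[i:i + 4] for i in range(0, len(vals), 4)]


def sse(rows, cols):
    """Error sum of squares for Y regressed on an intercept plus columns `cols`."""
    X = [[1.0] + [r[c] for c in cols] for r in rows]
    y = [r[0] for r in rows]
    p = len(cols) + 1
    # Augmented normal equations [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, x))) ** 2
               for x, yi in zip(X, y))


def partial_f(rows, base, k):
    """Partial F* for X_k given the variables in `base` (k must not be in base)."""
    full = base + [k]
    sse_full = sse(rows, full)
    df_error = len(rows) - len(full) - 1
    return (sse(rows, base) - sse_full) / (sse_full / df_error)


def forward_stepwise(rows, candidates, f_in=3.0, f_out=2.9):
    """Add the best candidate if F* > f_in, then drop the weakest if F* < f_out."""
    model = []
    for _ in range(4 * len(candidates)):  # safety bound on the number of steps
        outside = [k for k in candidates if k not in model]
        scores = {k: partial_f(rows, model, k) for k in outside}
        if not scores or max(scores.values()) <= f_in:
            break
        best = max(scores, key=scores.get)
        model.append(best)
        print(f"add X{best}: F* = {scores[best]:.2f}")
        if f_out is None:  # no deletion test: plain forward selection
            continue
        others = [k for k in model if k != best]
        drops = {k: partial_f(rows, [m for m in model if m != k], k)
                 for k in others}
        if drops and min(drops.values()) < f_out:
            worst = min(drops, key=drops.get)
            model.remove(worst)
            print(f"delete X{worst}: F* = {drops[worst]:.2f}")
    return model


selected = forward_stepwise(ROWS, [1, 2, 3])
print("selected:", [f"X{k}" for k in sorted(selected)])
```

Calling `forward_stepwise(ROWS, [1, 2, 3], f_out=None)` skips the deletion test, which turns the same loop into the plain forward selection procedure asked for in part (c).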

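(b) To find the level of significance to which the F limit of 3.0 corresponds, we need the area to the right of F = 3.0 under the F curve with 1 numerator degree of freedom and n - p denominator degrees of freedom, where p is the number of parameters in the model being tested. With n = 46 the denominator degrees of freedom are 44, 43 or 42 depending on the step, and the area, which we can find using technology or an F distribution table, is roughly 0.09 in each case.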
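As a rough check of that tail area without a table, we can use the fact that an F variable with 1 numerator degree of freedom is the square of a t variable, so P(F > 3.0) = P(|t| > sqrt(3.0)). The sketch below integrates the t density numerically with Simpson's rule; the function names `t_pdf` and `f_upper_tail` are hypothetical, and only the standard `math` module is assumed.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def f_upper_tail(f, df_den, steps=2000):
    """P(F(1, df_den) > f), using F(1, df) = t(df)^2 and Simpson's rule."""
    b = math.sqrt(f)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df_den) + t_pdf(b, df_den)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_den)
    central_half = total * h / 3  # area between 0 and sqrt(f)
    return 1 - 2 * central_half


# Denominator degrees of freedom n - p for models with 1, 2 and 3 predictors.
for df in (44, 43, 42):
    print(df, round(f_upper_tail(3.0, df), 4))
```

The three values differ only slightly, which is why a single F limit can stand in for an approximate significance level at every step.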

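(c) This can be done in the open source software R using the lm() function together with add1(), which reports the partial F test for each candidate when called with test = "F". Load the given data as a data frame, start from the intercept-only model, and at each step add the variable with the largest partial F* provided it exceeds 3.0. Unlike the stepwise procedure in part (a), variables already in the model are never tested for deletion. Stop when no remaining variable has an F* above 3.0.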

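(d) This can again be done in the open source software R using the lm() function, this time together with drop1() called with test = "F". Load the given data as a data frame, start from the full model with X1, X2 and X3, and at each step delete the variable with the smallest partial F* provided it falls below 2.9. Stop when every variable left in the model has an F* of at least 2.9.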
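The backward elimination steps in (d) can be sketched the same way. This is a plain-Python illustration rather than the R workflow described above; `sse` and `backward_elimination` are hypothetical helper names, and the fit again solves the normal equations directly for brevity.

```python
# Patient satisfaction data copied from the question: Y X1 X2 X3 for each case.
DATA = """
48 50 51 2.3  57 36 46 2.3  66 40 48 2.2  70 41 44 1.8
89 28 43 1.8  36 49 54 2.9  46 42 50 2.2  54 45 48 2.4
26 52 62 2.9  77 29 50 2.1  89 29 48 2.4  67 43 53 2.4
47 38 55 2.2  51 34 51 2.3  57 53 54 2.2  66 36 49 2.0
79 33 56 2.5  88 29 46 1.9  60 33 49 2.1  49 55 51 2.4
77 29 52 2.3  52 44 58 2.9  60 43 50 2.3  86 23 41 1.8
43 47 53 2.5  34 55 54 2.5  63 25 49 2.0  72 32 46 2.6
57 32 52 2.4  55 42 51 2.7  59 33 42 2.0  83 36 49 1.8
76 31 47 2.0  47 40 48 2.2  36 53 57 2.8  80 34 49 2.2
82 29 48 2.5  64 30 51 2.4  37 47 60 2.4  42 47 50 2.6
66 43 53 2.3  83 22 51 2.0  37 44 51 2.6  68 45 51 2.2
59 37 53 2.1  92 28 46 1.8
"""
vals = [float(v) for v in DATA.split()]
ROWS = [vals[i:i + 4] for i in range(0, len(vals), 4)]


def sse(rows, cols):
    """Error sum of squares for Y regressed on an intercept plus columns `cols`."""
    X = [[1.0] + [r[c] for c in cols] for r in rows]
    y = [r[0] for r in rows]
    p = len(cols) + 1
    # Augmented normal equations [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, x))) ** 2
               for x, yi in zip(X, y))


def backward_elimination(rows, candidates, f_out=2.9):
    """Start from the full model; drop the weakest variable while F* < f_out."""
    model = list(candidates)
    while model:
        sse_full = sse(rows, model)
        df_error = len(rows) - len(model) - 1
        # Partial F* for each variable given all the others still in the model.
        drops = {k: (sse(rows, [m for m in model if m != k]) - sse_full)
                    / (sse_full / df_error) for k in model}
        worst = min(drops, key=drops.get)
        if drops[worst] >= f_out:
            break
        model.remove(worst)
        print(f"delete X{worst}: F* = {drops[worst]:.2f}")
    return model


kept = backward_elimination(ROWS, [1, 2, 3])
print("retained:", [f"X{k}" for k in sorted(kept)])
```

The printed deletion steps, if any, are the "show your steps" record for part (d); the retained set can then be compared with the subsets chosen in parts (a) and (c).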