
1. What is the difference between correlation and regression? 2. Given an exampl


Question

1. What is the difference between correlation and regression?

2. Give an example of a dependent variable and an independent variable.

3. What is a multiple regression model?

4. What are the assumptions underlying multiple regression analysis?

5. Write and explain the equation of a multiple linear regression model. Now apply the general equation to write a model consisting of one dependent variable and two independent variables.
6. What is b0 in a multiple linear regression model equation? What are b1 and b2 referred to as in the same equation?
7. What is the meaning of the coefficient of multiple determination and of the partial regression coefficient?
8. The total variation present in the Y values may be partitioned into two components. What are they called and what do they mean?
9. Briefly explain the “total variation” present in the Y values of a multiple regression equation. What are “explained variation” and “unexplained variation”?
10. Given the MINITAB output below, answer the following questions:
a) What interpretation can you make about the regression equation, where Y is “Score for attention span”, X1 is “Age” and X2 is “Education qualification”?
b) What do you call the “R-Sq”?
c) What does R-Sq = 37% mean?
d) Given that the total sum of squares equals the regression sum of squares plus the error sum of squares, can you show how R-Sq can be manually calculated? (Please show working.)
e) Is the R-Square significant (at a significance level of α = .01)?
f) What are the steps to test the regression hypothesis?
g) What assumptions do you make to do a multiple regression analysis?
h) Write the null and the alternate hypothesis statements.
i) What test statistic is appropriate for testing the regression hypothesis?
j) What is the formula to calculate the variance ratio, and what are the degrees of freedom for each part of the formula, where the sample size is 71?
k) By using the statistics table provided, evaluate whether the calculated variance ratio is equal to or greater than the critical value of F. What is the critical value of F?
l) What is the statistical decision for your null and alternate hypotheses?

Output:

Regression Analysis: Y versus X1, X2

The regression equation is
Y = 5.49 - 0.184 X1 + 0.611 X2

Predictor      Coef   SE Coef      T      P
Constant      5.494     4.443   1.24  0.220
X1         -0.18412   0.04851  -3.80  0.000
X2           0.6108    0.1357   4.50  0.000

S = 3.134   R-Sq = 37.1%   R-Sq(adj) = 35.2%

Analysis of Variance

Source          DF       SS      MS      F      P
Regression       2   393.39  196.69  20.02  0.000
Residual Error  68   667.97    9.82
Total           70  1061.36

Source  DF  Seq SS
X1       1  194.24
X2       1  199.15

Taken from Daniel, W. W. & Cross, C. L. (2014)

Explanation / Answer

1. Correlation quantifies the strength of the linear relationship between a pair of variables, whereas regression expresses the relationship in the form of an equation. The goal of a correlation analysis is to see whether two measurement variables covary and to quantify how strongly they do so; the goal of a regression analysis is to obtain an equation that can be used to predict one variable from the other.

For example, in students taking a Maths and English test, we could use correlation to determine whether students who are good at Maths tend to be good at English as well, and regression to determine whether the marks in English can be predicted for given marks in Maths.
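To make the distinction concrete, here is a minimal sketch using made-up marks for five students (the numbers, and the function names pearson_r and fit_line, are illustrative assumptions, not data from the question). Correlation yields a single number describing the strength of the linear relationship, while regression yields an equation that can be used for prediction.

```python
from math import sqrt

# Hypothetical marks for five students (illustrative values only).
maths = [40, 50, 60, 70, 80]
english = [45, 55, 60, 70, 75]


def pearson_r(x, y):
    """Pearson correlation: strength of the linear relationship (-1 to 1)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Correlation: do students who are good at Maths tend to be good at English?
r = pearson_r(maths, english)

# Regression: predict the English mark for a given Maths mark.
intercept, slope = fit_line(maths, english)
predicted_english = intercept + slope * 65

print(f"r = {r:.3f}")
print(f"English = {intercept:.2f} + {slope:.2f} * Maths")
print(f"Predicted English mark for Maths = 65: {predicted_english:.2f}")
```

Note that the correlation is the same whichever variable is listed first, whereas the regression equation changes if the roles of dependent and independent variable are swapped.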

2. In the above example, if we are trying to determine whether the marks in English can be predicted from given marks in Maths, then the marks in Maths are the independent variable and the marks in English are the dependent variable.

3. If there is more than one independent variable, a simple linear regression model is unsuitable as the prediction model. In such cases, a multiple regression model is used.

A multiple linear regression model shows the relationship between the dependent variable and multiple (two or more) independent variables.
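Written out in general form, the model can be sketched as follows. The symbols k (the number of independent variables) and e (a random error term) are notation assumed here for illustration; they do not appear in the original answer.

```latex
% General multiple linear regression model with k independent variables
Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k + e

% Special case: one dependent variable and two independent variables (k = 2)
Y = b_0 + b_1 X_1 + b_2 X_2 + e
```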

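4. Assumptions underlying multiple regression analysis: the relationship between the dependent variable and the independent variables is linear; the observations (and hence the errors) are independent of one another; the errors have the same variance for every combination of values of the independent variables; the errors are normally distributed with mean zero; and no independent variable is an exact linear combination of the others.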

5. As an example, consider data from children aged 1 to 5 years. The variables are:

— Y is the child’s arm circumference (cm)

— X1 is the age of the child (months)

— X2 is the height of the child (cm)

Does arm circumference increase with increasing child age after controlling for child height?

Multiple linear regression model: Y = B0 + B1 X1 + B2 X2

B0 = the estimated mean arm circumference when the values of age and height are zero

B1 = the change in the estimated mean arm circumference associated with each 1 month increase in age if height is unchanged

B2 = the change in the estimated mean arm circumference associated with each 1 cm increase in height if age is unchanged.
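As an illustration of how B0, B1 and B2 might be estimated, the following minimal sketch fits the two-predictor model by ordinary least squares (solving the normal equations). The data are synthetic: the ages and heights are invented, and the arm circumferences are generated from assumed coefficients (10, 0.02 and 0.05) purely so that the fit can be checked. None of these numbers come from the study mentioned above, and the helper names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(x1, x2, y):
    """Least-squares estimates [b0, b1, b2] for y = b0 + b1*x1 + b2*x2."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]  # leading 1.0 gives the intercept
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data (assumed values, not from the study).
age_months = [12, 24, 36, 48, 60]
height_cm = [75, 86, 94, 104, 108]
arm_cm = [10 + 0.02 * a + 0.05 * h for a, h in zip(age_months, height_cm)]

b0, b1, b2 = fit_multiple_regression(age_months, height_cm, arm_cm)
print(f"Y = {b0:.3f} + {b1:.3f} X1 + {b2:.3f} X2")
```

Because the synthetic Y values contain no noise, the fit recovers the assumed coefficients; with real data the estimates would carry sampling error, which is what the standard errors in regression output describe.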

6. b0, b1 and b2 are as explained in the previous example (where they are written B0, B1 and B2). b0 is the intercept (constant term): the estimated mean value of the dependent variable when all the independent variables are zero.

b1 and b2 are referred to as partial regression coefficients. b1 is the change in the estimated mean value of the dependent variable for a 1 unit change in the first independent variable when the second independent variable is unchanged.

b2 is the change in the estimated mean value of the dependent variable for a 1 unit change in the second independent variable when the first independent variable is unchanged.
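Turning to the MINITAB output in question 10, the arithmetic behind R-Sq and the variance ratio can be cross-checked by hand. The sketch below uses only the sums of squares and the sample size given in the question; the variable names are illustrative, and the comparison with a critical value of F is deliberately left to the statistics table.

```python
# Values read from the MINITAB output in question 10.
ss_regression = 393.39   # explained variation
ss_error = 667.97        # unexplained (residual) variation
n = 71                   # sample size
k = 2                    # number of independent variables (X1, X2)

# Total variation = explained variation + unexplained variation.
ss_total = ss_regression + ss_error

# Coefficient of multiple determination.
r_sq = ss_regression / ss_total

# Degrees of freedom for each part of the variance ratio.
df_regression = k
df_error = n - k - 1
df_total = n - 1

# Mean squares and the variance ratio (F).
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_ratio = ms_regression / ms_error

# Adjusted R-Sq penalises the number of predictors.
r_sq_adj = 1 - ms_error / (ss_total / df_total)

print(f"SS total  = {ss_total:.2f}")
print(f"R-Sq      = {r_sq:.1%}")
print(f"R-Sq(adj) = {r_sq_adj:.1%}")
print(f"F = {f_ratio:.2f} with {df_regression} and {df_error} degrees of freedom")
```

These figures agree with the printed output to rounding (R-Sq = 37.1%, R-Sq(adj) = 35.2%, F of about 20.02 with 2 and 68 degrees of freedom).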