
2. Consider the data provided in the table below, which shows years of education


Question

2. Consider the data provided in the table below, which shows years of education and income for ten individuals.

Education   Income ($)   Income ($ thousands)
10          15,000       15
12          20,000       20
12          35,000       35
12          40,000       40
12          60,000       60
16          50,000       50
16          60,000       60
16          70,000       70
18          60,000       60
21          80,000       80

2a. Calculate the covariance and correlation of education and income, using the first "Income" column for income, which is measured in dollars.

2b. Calculate the covariance and correlation of education and income, this time using the second "Income" column for income, which is measured in thousands of dollars.

2c. Based on what you have done, discuss the merits of using covariance and correlation to measure the relationship between two variables.

Explanation / Answer

2 a.

Sum(X) =10 + 12 + 12 + 12 + 12 + 16 + 16 + 16 + 18 + 21 = 145

XMean = 14.5

Sum(Y) =15000 + 20000 + 35000 + 40000 + 60000 + 50000 + 60000 + 70000 + 60000 + 80000 = 490000

YMean = 49000

Covariance(X,Y) = SUM((xi - xmean)*(yi - ymean))/(samplesize - 1)

= ((10-14.5)*(15000-49000)+(12-14.5)*(20000-49000)+(12-14.5)*(35000-49000)+(12-14.5)*(40000-49000)+(12-14.5)*(60000-49000)+(16-14.5)*(50000-49000)+(16-14.5)*(60000-49000)+(16-14.5)*(70000-49000)+(18-14.5)*(60000-49000)+(21-14.5)*(80000-49000))/9

= 545000/9

= 60555.556
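
The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the list names and the helper sample_covariance are illustrative choices, not part of the original answer, and the data are the ten values used in the sums above.

```python
# Years of education (X) and income in dollars (Y) for the ten individuals.
education = [10, 12, 12, 12, 12, 16, 16, 16, 18, 21]
income_dollars = [15000, 20000, 35000, 40000, 60000,
                  50000, 60000, 70000, 60000, 80000]


def sample_covariance(xs, ys):
    """Sample covariance: sum of deviation products divided by (n - 1)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cross_products = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


cov_dollars = sample_covariance(education, income_dollars)
print(round(cov_dollars, 3))  # 60555.556
```

Dividing by n - 1 rather than n gives the sample covariance, which matches the formula used in this answer.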

Result Details & Calculation for Correlation -

X Values
Sum(X) = 145
Mean (Mx) = 14.5
Sum((X - Mx)^2) = SSx = 106.5

Y Values
Sum(Y) = 490000
Mean (My) = 49000
Sum((Y - My)^2) = SSy = 4040000000

X and Y Combined
N = 10
Sum((X - Mx)(Y - My)) = 545000

R Calculation
r = Sum((X - Mx)(Y - My)) / sqrt((SSx)(SSy))

r = 545000 / sqrt((106.5)(4040000000)) = 0.8309

The value of r is 0.8309. This is a strong positive correlation, which means that high X values tend to go with high Y values (and low with low).

The value of r^2, the coefficient of determination, is 0.6903.
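
As a cross-check on the correlation, the same quantities (SSx, SSy, and the sum of cross products) can be computed in Python. This is a sketch under the same assumptions as the hand calculation; the function name pearson_r is a hypothetical helper, not something from the original answer.

```python
import math

education = [10, 12, 12, 12, 12, 16, 16, 16, 18, 21]
income_dollars = [15000, 20000, 35000, 40000, 60000,
                  50000, 60000, 70000, 60000, 80000]


def pearson_r(xs, ys):
    """Pearson correlation: cross products over sqrt(SSx * SSy)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    ss_x = sum((x - x_mean) ** 2 for x in xs)   # 106.5 for this data
    ss_y = sum((y - y_mean) ** 2 for y in ys)   # 4040000000 for this data
    cross = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return cross / math.sqrt(ss_x * ss_y)


r = pearson_r(education, income_dollars)
print(round(r, 4))      # 0.8309
# Unrounded r squared is 0.6903; squaring the rounded 0.8309 gives 0.6904.
print(round(r * r, 4))  # 0.6903
```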

2b.

Covariance -

Sum(X) =10 + 12 + 12 + 12 + 12 + 16 + 16 + 16 + 18 + 21 = 145
XMean = 14.5
Sum(Y) =15 + 20 + 35 + 40 + 60 + 50 + 60 + 70 + 60 + 80 = 490
YMean = 49
Covariance(X,Y) = SUM((xi - xmean)*(yi - ymean))/(samplesize - 1)
= ((10-14.5)*(15-49)+(12-14.5)*(20-49)+(12-14.5)*(35-49)+(12-14.5)*(40-49)+(12-14.5)*(60-49)+(16-14.5)*(50-49)+(16-14.5)*(60-49)+(16-14.5)*(70-49)+(18-14.5)*(60-49)+(21-14.5)*(80-49))/9
= 545/9
= 60.556

Result Details & Calculation for Correlation -

X Values
Sum(X) = 145
Mean (Mx) = 14.5
Sum((X - Mx)^2) = SSx = 106.5

Y Values
Sum(Y) = 490
Mean (My) = 49
Sum((Y - My)^2) = SSy = 4040

X and Y Combined
N = 10
Sum((X - Mx)(Y - My)) = 545

R Calculation
r = Sum((X - Mx)(Y - My)) / sqrt((SSx)(SSy))

r = 545 / sqrt((106.5)(4040)) = 0.8309

The value of r is 0.8309, the same as in 2a. This is a strong positive correlation, which means that high X values tend to go with high Y values (and low with low).

The value of r^2, the coefficient of determination, is 0.6903.
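
To see the effect of the change of units side by side, the sketch below computes both statistics for income in thousands and for income in dollars. The helper names are illustrative assumptions; the dollar column is generated by multiplying the thousands column by 1000, which matches the data in the question.

```python
import math

education = [10, 12, 12, 12, 12, 16, 16, 16, 18, 21]
income_thousands = [15, 20, 35, 40, 60, 50, 60, 70, 60, 80]
income_dollars = [1000 * y for y in income_thousands]


def sample_covariance(xs, ys):
    """Sample covariance with the (n - 1) divisor."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)


def pearson_r(xs, ys):
    """Correlation as covariance divided by the two standard deviations."""
    sd_x = math.sqrt(sample_covariance(xs, xs))
    sd_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (sd_x * sd_y)


cov_thousands = sample_covariance(education, income_thousands)
cov_dollars = sample_covariance(education, income_dollars)
r_thousands = pearson_r(education, income_thousands)
r_dollars = pearson_r(education, income_dollars)

print(round(cov_thousands, 3), round(cov_dollars, 3))  # 60.556 60555.556
print(round(r_thousands, 4), round(r_dollars, 4))      # 0.8309 0.8309
```

The covariance changes by the same factor of 1000 as the income units, while the correlation is unchanged.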

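2c. The correlation is the same in both cases (r = 0.8309), but the covariance is not: it is 60555.556 when income is measured in dollars and 60.556 when income is measured in thousands of dollars. Covariance depends on the units of the variables, so its magnitude is hard to interpret on its own; only its sign, which gives the direction of the relationship, carries over from one scale to another. Correlation divides the covariance by the product of the two standard deviations, which makes it unit-free and bounded between -1 and 1. It is therefore the better measure of the strength of a linear relationship, and it can be compared across data sets measured in different units.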
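
The way each statistic responds to a change of units follows directly from the definitions. As a short derivation, let c > 0 be a unit-conversion factor (an assumed symbol; here c = 1000 converts thousands of dollars to dollars), and let s_X and s_Y be the sample standard deviations:

```latex
\begin{align*}
\operatorname{cov}(X, cY)
  &= \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(c\,y_i-c\,\bar{y})
   = c\,\operatorname{cov}(X,Y),\\
r(X, cY)
  &= \frac{c\,\operatorname{cov}(X,Y)}{s_X \cdot c\,s_Y}
   = \frac{\operatorname{cov}(X,Y)}{s_X\,s_Y}
   = r(X,Y).
\end{align*}
```

The factor c cancels in the correlation because the standard deviation of cY is c times the standard deviation of Y.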
