

Question

1) An educational psychologist wants to use association analysis to analyze test results. The test consists of 100 questions with four possible answers each.

(a) How would you convert this data into a form suitable for association analysis?

(b) In particular, what type of attributes would you have and how many of them are there?

2) Which of the following quantities is likely to show more temporal autocorrelation: daily rainfall or daily temperature? Why?

3) Many sciences rely on observation instead of (or in addition to) designed experiments. Compare the data quality issues involved in observational science with those of experimental science and data mining.

4) For the following vectors, x and y, calculate the indicated similarity or distance measures.

(a) x: (0,0,1,1), y: (2,2,2,2); cosine, correlation, Euclidean

(b) x: (0,1,0,1), y: (0,1,0,1); cosine, correlation, Euclidean, Jaccard

(c) x: (1,1,0,1), y: (-1,0,-1,0); cosine, correlation, Euclidean

(d) x: (1,0,0,1,0,1), y: (0,1,1,0,0,1); cosine, correlation, Jaccard

(e) x: (2,1,0,2,0,3), y: (1,1,1,0,0,1); cosine, correlation

5) This exercise compares and contrasts some similarity and distance measures. For binary data, the L1 distance corresponds to the Hamming distance; that is, the number of bits that are different between two binary vectors. The Jaccard similarity is a measure of the similarity between two binary vectors. Compute the Hamming distance and the Jaccard similarity between the following two binary vectors.

x: 0111010101

y: 0110011010

Explanation / Answer

2) Which of the following quantities is likely to show more temporal autocorrelation: daily rainfall or daily temperature? Why?

Answer: Daily temperature is likely to show more temporal autocorrelation than daily rainfall. Temperature changes gradually, so the temperature on one day is usually close to the temperature on the days just before and after it. Rainfall is intermittent and scattered: a day with heavy rain is often followed by a dry day, so today's rainfall says much less about tomorrow's.
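To make "temporal autocorrelation" concrete, here is a minimal Python sketch of the lag-1 sample autocorrelation, which measures how strongly each day's value tracks the previous day's. The function name and both ten-day series are assumptions invented for illustration; they are not real measurements and not part of the exercise.

```python
def lag1_autocorrelation(series):
    """Lag-1 sample autocorrelation of a sequence of numbers.

    Numerator: sum of products of consecutive deviations from the mean.
    Denominator: sum of squared deviations from the mean.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    numerator = sum(dev[t] * dev[t + 1] for t in range(n - 1))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


# Hypothetical ten-day series, made up for illustration only.
temperature_c = [20, 21, 22, 22, 23, 24, 23, 22, 21, 20]  # drifts smoothly
rainfall_mm = [0, 12, 0, 0, 30, 0, 0, 5, 0, 0]            # isolated spikes

print("temperature:", lag1_autocorrelation(temperature_c))
print("rainfall:   ", lag1_autocorrelation(rainfall_mm))
```

On these made-up series the temperature value comes out positive and the rainfall value negative, which matches the reasoning in the answer. Real data would need a much longer record before drawing any conclusion.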

5) This exercise compares and contrasts some similarity and distance measures. For binary data, the L1 distance corresponds to the Hamming distance; that is, the number of bits that are different between two binary vectors. The Jaccard similarity is a measure of the similarity between two binary vectors. Compute the Hamming distance and the Jaccard similarity between the following two binary vectors.

x: 0111010101

y: 0110011010

Answer:

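Comparing the two vectors position by position gives f11 = 3 (both bits are 1), f10 = 3 (x = 1, y = 0), f01 = 2 (x = 0, y = 1), and f00 = 2 (both bits are 0).

Hamming distance = f10 + f01 = 3 + 2 = 5, the number of positions where the bits differ.

Jaccard similarity = f11 / (f01 + f10 + f11) = 3 / (2 + 3 + 3) = 3 / 8 = 0.375. The f00 positions are ignored.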
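The counts can be checked mechanically. The following is a small Python sketch, with a helper name (`binary_counts`) chosen here for illustration, that tallies the four match counts for the two bit strings given in the question and derives both measures from them.

```python
def binary_counts(x, y):
    """Return (f11, f10, f01, f00) for two equal-length bit strings."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    pairs = list(zip(x, y))
    f11 = sum(1 for a, b in pairs if a == "1" and b == "1")
    f10 = sum(1 for a, b in pairs if a == "1" and b == "0")
    f01 = sum(1 for a, b in pairs if a == "0" and b == "1")
    f00 = sum(1 for a, b in pairs if a == "0" and b == "0")
    return f11, f10, f01, f00


x = "0111010101"
y = "0110011010"

f11, f10, f01, f00 = binary_counts(x, y)

# Hamming distance: number of positions where the bits differ.
hamming = f10 + f01

# Jaccard similarity: shared 1s over positions where at least one bit is 1.
jaccard = f11 / (f01 + f10 + f11)

print("f11, f10, f01, f00 =", f11, f10, f01, f00)  # 3 3 2 2
print("Hamming distance   =", hamming)              # 5
print("Jaccard similarity =", jaccard)              # 0.375
```

Because the Jaccard denominator leaves out f00, the two 0-0 matches do not raise the similarity, whereas they do reduce the Hamming distance relative to the vector length.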