Q2.a. The 48 observations in a data set were partitioned by the 3-means clustering method
Question
Q2.a. The 48 observations in a data set were partitioned by the 3-means clustering method, and the following information is available:
CLUSTER CENTERS

Original coordinates
Cluster      rating          Box office (2015 dollars)
Cluster 1    17.08333333     36.78625
Cluster 2    29.2            162.2546667
Cluster 3    63              50.37222222

Normalized coordinates
Cluster      rating          Box office (2015 dollars)
Cluster 1    -0.59356616     -0.647117648
Cluster 2    -0.013367708    1.297329632
Cluster 3    1.605122609     -0.436568991
DISTANCE BETWEEN THE CENTERS

Original coordinates
             Cluster 1      Cluster 2      Cluster 3
Cluster 1    0              126.0521209    47.88443295
Cluster 2    126.0521209    0              116.8765219
Cluster 3    47.88443295    116.8765219    0

Normalized coordinates
             Cluster 1      Cluster 2      Cluster 3
Cluster 1    0              2.029163736    2.208746939
Cluster 2    2.029163736    0              2.371901208
Cluster 3    2.208746939    2.371901208    0
DATA ANALYSIS

Original coordinates
Cluster      #Obs    Avg. Dist
Cluster 1    24      23.05251209
Cluster 2    15      32.08746692
Cluster 3    9       25.68957845
Overall      48      26.37038542

Normalized coordinates
Cluster      #Obs    Avg. Dist
Cluster 1    24      0.572324172
Cluster 2    15      0.744118115
Cluster 3    9       0.71100553
Overall      48      0.652012534
1. The most homogeneous cluster is:
a. Cluster 1
b. Cluster 2
c. Cluster 3
2. The most heterogeneous cluster is:
a. Cluster 1
b. Cluster 2
c. Cluster 3
3. The clusters closest to each other are:
a. Cluster 1 and 2
b. Cluster 1 and 3
c. Cluster 2 and 3
4. The clusters most distinct from each other are:
a. Cluster 1 and 2
b. Cluster 1 and 3
c. Cluster 2 and 3
Q2.b. A company specializes in the development of software that tracks the web browsing history of individuals. Using XLMiner, the top association rule was found, and the following information is available about this rule:
Confidence %: (not given)
Antecedent (A): CNN
Consequent (C): Weatherchannel
Support for A: 1693
Support for C: 1837
Support for A and C: 867
Lift Ratio: (not given)
1. The confidence of the top rule is
a. 75.00%
b. 65.25%
c. 51.21%
d. 49.55%
2. Assuming that the total number of transactions is 20000, the lift ratio of the top rule is
a. 5.57
b. 22.30
c. 2.79
d. 1.64
Explanation / Answer
1. The most homogeneous cluster is:
a. Cluster 1
b. Cluster 2
c. Cluster 3
Cluster 1: it has the smallest average distance from its observations to the cluster center.
Cluster      #Obs    Avg. Dist
Cluster 1    24      0.572324172
2. The most heterogeneous cluster is:
a. Cluster 1
b. Cluster 2
c. Cluster 3
Cluster 2: it has the largest average distance from its observations to the cluster center.
Cluster      #Obs    Avg. Dist
Cluster 2    15      0.744118115
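The comparison behind answers 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the dictionary name `avg_dist` is made up here, and the values are the normalized "Avg. Dist" figures copied from the question.

```python
# Average distance of observations to their cluster center
# (normalized coordinates, copied from the question's table).
# The variable names are illustrative, not part of the original output.
avg_dist = {
    "Cluster 1": 0.572324172,
    "Cluster 2": 0.744118115,
    "Cluster 3": 0.71100553,
}

# Smallest average distance = tightest (most homogeneous) cluster.
most_homogeneous = min(avg_dist, key=avg_dist.get)
# Largest average distance = most spread out (most heterogeneous) cluster.
most_heterogeneous = max(avg_dist, key=avg_dist.get)

print(most_homogeneous, most_heterogeneous)
```

The same ranking holds for the original-coordinate averages in the question (23.05, 32.09, 25.69).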
3. The clusters closest to each other are:
a. Cluster 1 and 2
b. Cluster 1 and 3
c. Cluster 2 and 3
Cluster 1 and Cluster 2 are closest.
In the normalized table of distances between the centers, the lowest value is 2.029163736, which is the distance between Cluster 1 and Cluster 2.
4. The clusters most distinct from each other are:
a. Cluster 1 and 2
b. Cluster 1 and 3
c. Cluster 2 and 3
Cluster 2 and Cluster 3 are farthest apart: in the normalized table, the distance between their centers (2.371901208) is the highest.
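As a rough check on answers 3 and 4, the distances between the normalized centers can be recomputed from the center coordinates given in the question. The sketch below assumes plain Euclidean distance, which reproduces the tabulated values; the names `centers` and `pair_dist` are illustrative.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

# Normalized cluster centers (rating, box office) from the question.
centers = {
    "Cluster 1": (-0.59356616, -0.647117648),
    "Cluster 2": (-0.013367708, 1.297329632),
    "Cluster 3": (1.605122609, -0.436568991),
}

# Euclidean distance for every unordered pair of clusters.
pair_dist = {
    (a, b): dist(centers[a], centers[b])
    for a, b in combinations(centers, 2)
}

closest = min(pair_dist, key=pair_dist.get)
farthest = max(pair_dist, key=pair_dist.get)

for pair, d in pair_dist.items():
    print(pair, round(d, 6))
print("closest:", closest, "farthest:", farthest)
```

Working from the centers rather than reading the table also guards against a mistyped cell in the distance matrix, since the matrix must be symmetric.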
Please note that we can answer only 4 subparts of a question at a time, as per the answering guidelines.
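For Q2.b, which is not worked out above, the two quantities follow from the usual association-rule definitions: confidence is the support of A and C together divided by the support of A, and the lift ratio is that confidence divided by the benchmark confidence, support of C over the total number of transactions. The sketch below applies those definitions to the counts in the question; the variable names are illustrative, and the total of 20000 transactions is the assumption stated in part 2.

```python
# Counts from the Q2.b rule (antecedent CNN, consequent Weatherchannel).
support_a = 1693        # transactions containing A
support_c = 1837        # transactions containing C
support_a_and_c = 867   # transactions containing both A and C
n_transactions = 20000  # total assumed in part 2 of the question

# Confidence: share of A-transactions that also contain C.
confidence = support_a_and_c / support_a

# Benchmark confidence: share of all transactions that contain C.
benchmark = support_c / n_transactions

# Lift ratio: how much more likely C is given A than overall.
lift = confidence / benchmark

print(f"confidence = {confidence:.4f}, lift = {lift:.3f}")
```

With these inputs the confidence comes out at about 51.21% and the lift at a little over 5.57, which point to options c and a respectively.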