i) Clustering is an unsupervised learning problem. True or False. Explain why.
Question
i) Clustering is an unsupervised learning problem. True or False. Explain why.
ii) Principal Components Analysis (PCA) can be used to create a low-dimensional projection of the data for use with clustering. True or False. Explain why.
iii) In cluster analysis, the choice of similarity measure will affect the cluster assignments. True or False. Explain why.
iv) When computing principal components, the data should be standardized, i.e. centered and scaled so that each variable has mean 0 and standard deviation 1. True or False. Explain why.
v) Cluster analysis can only be performed on continuous variables. True or False. Explain why.
vi) Since cluster analysis is an unsupervised learning method, two different cluster partitions cannot be compared. True or False. Explain why.
Explanation / Answer
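i) True. Clustering looks for structure in a collection of unlabeled data: no target labels are supplied, so the groups must be inferred from the data alone, which is what makes the problem unsupervised.
ii) True. PCA is a dimensionality-reduction technique: projecting the data onto the first few principal components gives a low-dimensional set of features that can be passed to a clustering algorithm, and a two- or three-component projection can also be used to visualize the resulting clusters.
iii) True. Clustering algorithms group observations that are close under the chosen similarity or distance measure, so different measures (for example Euclidean versus cosine distance) can lead to different cluster assignments.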
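iv) True, when the variables are measured on different scales. PCA finds directions of maximum variance, so centering is required and, without scaling, variables with large variances dominate the leading components. Standardizing gives every variable mean 0 and standard deviation 1; it does not make the data normally distributed. If all variables share the same units, centering alone may be sufficient.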
v) False. The clustering variables can be any combination of nominal, ordinal, interval, or ratio-scaled measures, provided the dissimilarity measure is appropriate for the variable types.
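vi) False. Although there are no true labels, two partitions of the same observations can be compared, for example with agreement measures such as the Rand index or with internal criteria such as within-cluster variance. Comparison is harder to interpret when the partitions were built from different variables, but it is still possible.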
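To illustrate answer iii, here is a minimal sketch in which the same points and the same cluster centers yield different assignments under Euclidean and cosine distance. The points, centers, and helper names (`euclidean`, `cosine_distance`, `assign`) are made up for illustration and are not part of the original question.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """One minus the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def assign(points, centers, dist):
    """Assign each point to the index of its nearest center under `dist`."""
    return [
        min(range(len(centers)), key=lambda k: dist(p, centers[k]))
        for p in points
    ]

# Hypothetical cluster centers and observations.
centers = [(1.0, 0.0), (0.0, 4.0)]
points = [(3.0, 0.5), (0.5, 1.0), (0.2, 3.5)]

euclid_labels = assign(points, centers, euclidean)
cosine_labels = assign(points, centers, cosine_distance)

print("Euclidean:", euclid_labels)
print("Cosine:   ", cosine_labels)
```

The middle point is close to the first center in absolute position but points in nearly the same direction as the second center, so the two measures disagree about where it belongs.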
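For answer iv, the effect of scaling on PCA can be sketched with two variables on very different scales. The data below (income in dollars, height in meters) are invented, and the closed-form eigenvector for a 2x2 covariance matrix is used only to keep the example dependency-free; it assumes the two variables have nonzero covariance.

```python
import math
from statistics import mean, pstdev

def first_component(xs, ys):
    """Unit-length leading eigenvector of the 2x2 covariance matrix of (xs, ys).

    Assumes the covariance between xs and ys is nonzero.
    """
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]].
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    # (sxy, lam - sxx) is an eigenvector for that eigenvalue.
    vx, vy = sxy, lam - sxx
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

def standardize(values):
    """Center to mean 0 and scale to standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical data: income in dollars, height in meters.
income = [30000, 45000, 52000, 61000, 80000]
height = [1.60, 1.75, 1.68, 1.82, 1.79]

raw_pc1 = first_component(income, height)
std_pc1 = first_component(standardize(income), standardize(height))

print("PC1 loadings, raw data:         ", raw_pc1)
print("PC1 loadings, standardized data:", std_pc1)
```

On the raw data the first component is almost entirely income, simply because its variance is many orders of magnitude larger. After standardization both variables contribute equally to the first component.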
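For answer v, one common way to cluster mixed variable types is to use a dissimilarity that treats each type appropriately. The following is a simplified Gower-style sketch, not a full implementation: numeric columns contribute a range-normalized absolute difference, and categorical columns contribute 0 for a match and 1 for a mismatch. The records and the `gower_distance` helper are hypothetical.

```python
def gower_distance(a, b, numeric_ranges):
    """Average per-variable dissimilarity between two records.

    `numeric_ranges` maps the index of each numeric column to its range
    (max - min) in the data set; all other columns are treated as categorical.
    """
    total = 0.0
    for idx, (x, y) in enumerate(zip(a, b)):
        if idx in numeric_ranges:
            total += abs(x - y) / numeric_ranges[idx]
        else:
            total += 0.0 if x == y else 1.0
    return total / len(a)

# Hypothetical records: (age, favorite color, owns a car).
records = [(25, "red", "yes"), (30, "red", "no"), (45, "blue", "no")]
ranges = {0: 45 - 25}  # only column 0 (age) is numeric

d01 = gower_distance(records[0], records[1], ranges)
d02 = gower_distance(records[0], records[2], ranges)
d12 = gower_distance(records[1], records[2], ranges)

print(d01, d02, d12)
```

The resulting pairwise dissimilarities could then be passed to any clustering method that accepts a distance matrix, such as hierarchical clustering.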
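For question vi, two partitions of the same observations can be compared directly from their labels, without any ground truth. A minimal sketch of the (unadjusted) Rand index follows: it is the fraction of observation pairs on which the two partitions agree, either by grouping the pair together in both or by separating it in both. The example partitions are made up.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of pairs treated the same way by both partitions."""
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
        total += 1
    return agree / total

# Hypothetical partitions of six observations.
partition_1 = [0, 0, 0, 1, 1, 1]
partition_2 = [1, 1, 1, 0, 0, 0]  # same grouping, cluster names swapped
partition_3 = [0, 0, 1, 1, 2, 2]  # a genuinely different grouping

ri_12 = rand_index(partition_1, partition_2)
ri_13 = rand_index(partition_1, partition_3)

print(ri_12)
print(ri_13)
```

Because the index depends only on which observations are grouped together, it is unaffected by how the clusters happen to be numbered. In practice a chance-corrected variant, the adjusted Rand index, is often preferred.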