Using the Mahalanobis distance in the programming language R, find the outliers
Question
Using the Mahalanobis distance in the programming language R, find the outliers in the following multidimensional dataset. The dataset contains the height (cm), weight (kg), and systolic blood pressure of 25 patients who have been diagnosed with type II diabetes and are under controlled treatment with the drug metformin.
Height (cm)   Weight (kg)   Blood Pressure (systolic)
164           70            100
167           72            125
180           76            120
168           68            139
156           54            140
147           67            158
178           69            145
180           78            134
185           79            126
154           73            119
144           74            145
134           77            147
156           76            148
178           89            125
179           84            134
196           83            136
156           82            128
153           77            119
153           75            117
152           73            154
167           87            149
168           86            160
170           88            167
189           57            121
190           59            156

Explanation / Answer
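For reference, the quantity that R's mahalanobis() function returns is the squared Mahalanobis distance of an observation x from a center μ, given a covariance matrix Σ. In this answer μ and Σ are estimated from the data with colMeans() and cov(); the symbols below are the usual textbook notation, not names taken from the question.

```latex
D^2(x) = (x - \mu)^{\mathsf{T}} \, \Sigma^{-1} \, (x - \mu)
```

Because the value is already squared, taking sqrt() of the result gives the distance itself. The ranking of the patients is the same either way.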
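library(rgl)  # provides plot3d() for the interactive 3D scatter plot
# Height (cm), weight (kg) and systolic blood pressure of the 25 patients
HWB <- data.frame(
  Height.cm = c(164, 167, 180, 168, 156, 147, 178, 180, 185, 154, 144, 134, 156, 178, 179, 196, 156, 153, 153, 152, 167, 168, 170, 189, 190),
  Weight.kg = c(70, 72, 76, 68, 54, 67, 69, 78, 79, 73, 74, 77, 76, 89, 84, 83, 82, 77, 75, 73, 87, 86, 88, 57, 59),
  BP = c(100, 125, 120, 139, 140, 158, 145, 134, 126, 119, 145, 147, 148, 125, 134, 136, 128, 119, 117, 154, 149, 160, 167, 121, 156)
)
# Squared Mahalanobis distance of each patient from the sample mean
m.dist <- mahalanobis(HWB, colMeans(HWB), cov(HWB))
m.dist.order <- order(m.dist, decreasing = TRUE)
# Flag the two most distant patients as outliers (choosing two is a judgment call)
is.outlier <- rep(FALSE, nrow(HWB))
is.outlier[m.dist.order[1:2]] <- TRUE
# Colour 1 (black) marks regular points, colour 2 (red) marks outliers
col <- is.outlier + 1
plot3d(HWB$Height.cm, HWB$Weight.kg, HWB$BP, type = "s", col = col)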
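The code above always marks the two largest distances as outliers, whether or not those points are actually unusual. A common alternative is to compare each squared distance against a chi-square quantile: if the data are roughly multivariate normal, the squared Mahalanobis distance approximately follows a chi-square distribution with as many degrees of freedom as there are variables. The sketch below assumes a 0.975 quantile as the cutoff. That level is a conventional choice, not something specified in the question, and the variable names d2 and cutoff are illustrative.

```r
# Sketch: flag outliers with a chi-square cutoff instead of a fixed count.
# The 0.975 quantile is an assumed, conventional threshold.
HWB <- data.frame(
  Height.cm = c(164, 167, 180, 168, 156, 147, 178, 180, 185, 154, 144, 134, 156,
                178, 179, 196, 156, 153, 153, 152, 167, 168, 170, 189, 190),
  Weight.kg = c(70, 72, 76, 68, 54, 67, 69, 78, 79, 73, 74, 77, 76,
                89, 84, 83, 82, 77, 75, 73, 87, 86, 88, 57, 59),
  BP        = c(100, 125, 120, 139, 140, 158, 145, 134, 126, 119, 145, 147, 148,
                125, 134, 136, 128, 119, 117, 154, 149, 160, 167, 121, 156)
)

# Squared Mahalanobis distance of every row from the sample mean
d2 <- mahalanobis(HWB, colMeans(HWB), cov(HWB))

# Cutoff: chi-square quantile with df = number of variables (3 here)
cutoff <- qchisq(0.975, df = ncol(HWB))

# Rows whose squared distance exceeds the cutoff are flagged
is.outlier <- d2 > cutoff
print(HWB[is.outlier, ])
```

With only 25 observations the chi-square approximation is rough, so the flagged rows are best treated as candidates to inspect rather than as a final verdict. The resulting is.outlier vector can be passed to the same col <- is.outlier + 1 colouring used for the 3D plot.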