1. Given the data set below, apply the k-Nearest Neighbor algorithm to classify
Question
1. Given the data set below, apply the k-Nearest Neighbor algorithm to classify the test data for k=1 and k=3. Use the Euclidean distance metric.
Training Set
#    x1          x2          true label
1    0.453705    -0.0106     1
2    3.258589    0.169734    1
3    3.184656    -0.83691    0
4    -0.42561    1.385033    0
5    0.658765    -1.87715    0
6    -0.40507    -1.9574     0
7    -4.52775    4.123102    1
8    2.538689    -1.5386     1
9    -1.04649    -3.59664    1
10   2.967113    0.505111    0
Testing Set
#    x1          x2          true label    predicted label
11   -4.69237    -4.77898    1
12   -2.1147     -1.81277    0
13   4.277164    -4.83136    1
14   -1.33862    -0.93995    0
15   -4.02728    -4.96129    1
16   4.968125    3.757161    1
17   -2.19987    -3.48712    0
18   2.849136    -3.33965    0
19   -4.30273    2.530094    1
20   4.690116    -0.36379    1
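The classification asked for in problem 1 can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names TRAIN, TEST, and knn_predict are hypothetical, the rows are copied from the two tables above, and the sketch assumes a plain majority vote over the k nearest training points (with two classes and odd k, no tie can occur).

```python
import math
from collections import Counter

# (x1, x2, true label) rows copied from the training set
TRAIN = [
    (0.453705, -0.0106, 1), (3.258589, 0.169734, 1),
    (3.184656, -0.83691, 0), (-0.42561, 1.385033, 0),
    (0.658765, -1.87715, 0), (-0.40507, -1.9574, 0),
    (-4.52775, 4.123102, 1), (2.538689, -1.5386, 1),
    (-1.04649, -3.59664, 1), (2.967113, 0.505111, 0),
]

# (id, x1, x2) rows copied from the testing set
TEST = [
    (11, -4.69237, -4.77898), (12, -2.1147, -1.81277),
    (13, 4.277164, -4.83136), (14, -1.33862, -0.93995),
    (15, -4.02728, -4.96129), (16, 4.968125, 3.757161),
    (17, -2.19987, -3.48712), (18, 2.849136, -3.33965),
    (19, -4.30273, 2.530094), (20, 4.690116, -0.36379),
]


def knn_predict(point, train, k):
    """Return the majority label among the k training rows nearest to point."""
    # Sort training rows by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda row: math.dist(point, row[:2]))
    votes = Counter(label for _, _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# One list of predicted labels (test points 11..20) per value of k.
predictions = {
    k: [knn_predict((x1, x2), TRAIN, k) for _, x1, x2 in TEST]
    for k in (1, 3)
}

for k, labels in predictions.items():
    print(f"k={k}:", labels)
```

Sorting all training rows per query is fine for ten points; a larger data set would usually call for a spatial index or a library routine instead.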
2. Compute the confusion matrix, accuracy, precision, recall, and F1 measures given your answers to problem 1.
3. Assume you have the data set given below, which provides hypothetical examples of instances when people did or did not get hired for a job. It consists of three categorical attributes and a label that indicates "hired" or "not hired". Using this data, induce a decision tree using information gain for splitting the nodes, showing the calculations at each step.
Training Set
#    Experience (EXP)    Sufficient Qualifications? (QUAL)    Opinions of References (REFOP)    true label
1    good                Yes                                  favorable                         1
2    excellent           Yes                                  favorable                         1
3    none                No                                   favorable                         0
4    good                No                                   not favorable                     0
5    good                Yes                                  not favorable                     0
6    excellent           Yes                                  not favorable                     0
7    excellent           Yes                                  favorable                         1
8    good                Yes                                  favorable                         1
9    none                Yes                                  favorable                         1
10   none                Yes                                  not favorable                     0
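The information-gain calculation behind problem 3 can be sketched as follows. This is only an illustration under stated assumptions: entropy is taken in bits (log base 2), the names ROWS, ATTRS, entropy, and information_gain are hypothetical, and the rows are copied from the hiring table above. The sketch scores the candidate attributes for the root split and then for the "favorable" branch; it does not build the full tree.

```python
import math
from collections import Counter

# (EXP, QUAL, REFOP, true label) rows copied from the hiring table
ROWS = [
    ("good", "Yes", "favorable", 1),
    ("excellent", "Yes", "favorable", 1),
    ("none", "No", "favorable", 0),
    ("good", "No", "not favorable", 0),
    ("good", "Yes", "not favorable", 0),
    ("excellent", "Yes", "not favorable", 0),
    ("excellent", "Yes", "favorable", 1),
    ("good", "Yes", "favorable", 1),
    ("none", "Yes", "favorable", 1),
    ("none", "Yes", "not favorable", 0),
]
ATTRS = {"EXP": 0, "QUAL": 1, "REFOP": 2}  # attribute name -> column index


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, col):
    """Entropy of the labels minus the weighted entropy after splitting on col."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


# Score every attribute at the root and pick the one with the largest gain.
gains = {name: information_gain(ROWS, col) for name, col in ATTRS.items()}
best = max(gains, key=gains.get)
print("root gains:", {name: round(g, 4) for name, g in gains.items()})
print("root split:", best)

# Repeat on the subset where references are favorable (the other branch
# of a REFOP split contains only label 0, so it is already a leaf).
favorable = [row for row in ROWS if row[2] == "favorable"]
sub_gains = {
    name: information_gain(favorable, col)
    for name, col in ATTRS.items() if name != "REFOP"
}
print("gains within REFOP = favorable:",
      {name: round(g, 4) for name, g in sub_gains.items()})
```

Applying the same function recursively to each impure branch, and stopping when a branch holds a single label, yields the remaining levels of the tree.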
Explanation / Answer
Solution:
In MATLAB, the general syntax for predicting labels from a fitted classification model Mdl on data X is:
label = predict(Mdl,X)
[label,score,cost] = predict(Mdl,X)
Each test point takes the majority label of its k nearest training points under Euclidean distance. The predicted labels for k = 1 and k = 3 are:
Testing Set
#    x1          x2          true label    predicted (k=1)    predicted (k=3)
11   -4.69237    -4.77898    1             1                  0
12   -2.1147     -1.81277    0             0                  0
13   4.277164    -4.83136    1             1                  0
14   -1.33862    -0.93995    0             0                  0
15   -4.02728    -4.96129    1             1                  0
16   4.968125    3.757161    1             0                  0
17   -2.19987    -3.48712    0             1                  0
18   2.849136    -3.33965    0             1                  0
19   -4.30273    2.530094    1             1                  1
20   4.690116    -0.36379    1             1                  0
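For problem 2, the confusion matrix and the derived measures follow from counting how the predicted labels line up with the true labels. Below is a minimal Python sketch, assuming label 1 is the positive class. The function name binary_metrics is hypothetical, Y_TRUE is copied from the testing set, and PRED_K1 and PRED_K3 are assumptions of this sketch: they are the labels obtained from a direct Euclidean nearest-neighbor computation on the training set for k = 1 and k = 3.

```python
# True labels of test points 11..20, copied from the testing set
Y_TRUE = [1, 0, 1, 0, 1, 1, 0, 0, 1, 1]

# Assumed inputs: k-NN predictions for the same points (see lead-in)
PRED_K1 = [1, 0, 1, 0, 1, 0, 1, 1, 1, 1]
PRED_K3 = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix counts and derived measures for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision, "recall": recall, "f1": f1,
    }


m1 = binary_metrics(Y_TRUE, PRED_K1)
m3 = binary_metrics(Y_TRUE, PRED_K3)
for name, m in (("k=1", m1), ("k=3", m3)):
    print(name, {key: round(val, 3) for key, val in m.items()})
```

Returning the four raw counts alongside the ratios makes it easy to write out the 2x2 confusion matrix by hand and to check each measure against its definition.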