Data Mining and Knowledge Discovery in Databases Given a database table containi
Question
Data Mining and Knowledge Discovery in Databases
Given a database table containing weather data as follows:
Outlook  | Temperature | Humidity | Windy | Class: Play
Sunny    | Hot         | High     | False | No
Sunny    | Hot         | High     | True  | No
Overcast | Hot         | High     | False | Yes
Rainy    | Mild        | High     | False | Yes
Rainy    | Cool        | Normal   | False | Yes
Rainy    | Cool        | Normal   | True  | No
Overcast | Cool        | Normal   | True  | Yes
Sunny    | Mild        | High     | False | No
Sunny    | Cool        | Normal   | False | Yes
Rainy    | Mild        | Normal   | False | Yes
Sunny    | Mild        | Normal   | True  | Yes
Overcast | Mild        | High     | True  | Yes
Overcast | Hot         | Normal   | False | Yes
Rainy    | Mild        | High     | True  | No
C. For the weather database table given in B, predict a class label for each unknown sample below using the naïve Bayesian classification approach (20 points).
The unknown samples to be classified are:
(Outlook = ‘Sunny’, Temperature = ‘Mild’, Humidity = ‘High’, Windy = ‘False’)
(Outlook = ‘Sunny’, Temperature = ‘Hot’, Humidity = ‘High’, Windy = ‘False’)
Explanation / Answer
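Naïve Bayes assigns the class C that maximizes P(C) * P(outlook | C) * P(temperature | C) * P(humidity | C) * P(windy | C). Each conditional probability is p = n_c / n, where n is the number of rows in class C and n_c is the number of those rows that also have the attribute value. Of the 14 rows, 9 have play yes and 5 have play no, so P(yes) = 9/14 = 0.643 and P(no) = 5/14 = 0.357. None of the counts needed here is zero, so plain relative frequencies are used.
Sample 1: (Outlook = ‘Sunny’, Temperature = ‘Mild’, Humidity = ‘High’, Windy = ‘False’)
Play = yes
--------------
sunny
n = 9 (number of rows with play yes)
n_c = 2 (sunny with play yes)
p = 2/9 = 0.222 (probability of sunny given yes)
temperature mild
n = 9
n_c = 4
p = 4/9 = 0.444
Humidity high
n = 9
n_c = 3
p = 3/9 = 0.333
windy false
n = 9
n_c = 6
p = 6/9 = 0.667
so P(yes) * P(sample | yes) = (9/14) * (2/9) * (4/9) * (3/9) * (6/9) = 0.0141
Play = no
--------------
sunny
n = 5 (number of rows with play no)
n_c = 3 (sunny with play no)
p = 3/5 = 0.6
temperature mild
n = 5
n_c = 2
p = 2/5 = 0.4
Humidity high
n = 5
n_c = 4
p = 4/5 = 0.8
windy false
n = 5
n_c = 2
p = 2/5 = 0.4
so P(no) * P(sample | no) = (5/14) * (3/5) * (2/5) * (4/5) * (2/5) = 0.0274
So finally, since 0.0274 > 0.0141, (Outlook = ‘Sunny’, Temperature = ‘Mild’, Humidity = ‘High’, Windy = ‘False’) is classified as Play = No.
Normalizing the two scores gives P(no | sample) = 0.0274 / (0.0274 + 0.0141) = 0.66.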
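As a cross-check on the hand calculation, the relative-frequency scores can be computed directly from the 14 training rows. This is only a minimal sketch: the names ROWS and naive_bayes_scores are made up for illustration, and it applies no smoothing, which is acceptable here only because none of the counts for this sample is zero.

```python
from collections import Counter

# The 14 training rows from the question:
# (Outlook, Temperature, Humidity, Windy, Play)
ROWS = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]


def naive_bayes_scores(rows, sample):
    """Return {class: P(class) * product of P(value | class)} from raw counts."""
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for label, n in class_counts.items():
        score = n / len(rows)  # prior P(class)
        for i, value in enumerate(sample):
            # n_c: rows of this class that also carry the attribute value
            n_c = sum(1 for row in rows if row[-1] == label and row[i] == value)
            score *= n_c / n  # conditional P(value | class)
        scores[label] = score
    return scores


sample_1 = ("Sunny", "Mild", "High", "False")
scores = naive_bayes_scores(ROWS, sample_1)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

For the first sample this gives about 0.0141 for Yes and 0.0274 for No, so the predicted label is No. Passing ("Sunny", "Hot", "High", "False") instead scores the second sample the same way.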
---------------------------------------------------------------------------------------------------------------------------
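Sample 2: (Outlook = ‘Sunny’, Temperature = ‘Hot’, Humidity = ‘High’, Windy = ‘False’)
Both classes have to be scored and compared. Of the 14 rows, 9 have play yes and 5 have play no, so P(yes) = 9/14 and P(no) = 5/14. For each attribute value, n is the number of rows in the class, n_c is the number of those rows with that value, and p = n_c / n.
Play = yes
--------------
sunny
n = 9
n_c = 2
p = 2/9 = 0.222
temperature hot
n = 9
n_c = 2
p = 2/9 = 0.222
Humidity high
n = 9
n_c = 3
p = 3/9 = 0.333
windy false
n = 9
n_c = 6
p = 6/9 = 0.667
so P(yes) * P(sample | yes) = (9/14) * (2/9) * (2/9) * (3/9) * (6/9) = 0.0071
Play = no
--------------
sunny
n = 5
n_c = 3
p = 3/5 = 0.6
temperature hot
n = 5
n_c = 2
p = 2/5 = 0.4
Humidity high
n = 5
n_c = 4
p = 4/5 = 0.8
windy false
n = 5
n_c = 2
p = 2/5 = 0.4
so P(no) * P(sample | no) = (5/14) * (3/5) * (2/5) * (4/5) * (2/5) = 0.0274
So finally, since 0.0274 > 0.0071, (Outlook = ‘Sunny’, Temperature = ‘Hot’, Humidity = ‘High’, Windy = ‘False’) is classified as Play = No.
Normalizing the two scores gives P(no | sample) = 0.0274 / (0.0274 + 0.0071) = 0.80.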
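If smoothed estimates are wanted instead of raw frequencies, one common choice is the m-estimate of probability, (n_c + m*p) / (n + m), where p is here a prior guess for the attribute value rather than n_c / n. The sketch below applies it to the second sample (Sunny, Hot, High, False). The counts are read off the weather table; setting m to the number of distinct values of each attribute and p = 1/m is an assumption made purely for illustration (it coincides with add-one smoothing) and is not part of the original question. The names m_estimate, COUNTS and NUM_VALUES are likewise hypothetical.

```python
from fractions import Fraction


def m_estimate(n_c, n, p, m):
    """m-estimate of P(value | class): (n_c + m*p) / (n + m)."""
    return (n_c + m * p) / (n + m)


TOTAL_ROWS = 14

# Per class: (n, [n_c for sunny, hot, high, windy=false]), counted from the table.
COUNTS = {
    "Yes": (9, [2, 2, 3, 6]),
    "No": (5, [3, 2, 4, 2]),
}

# Distinct values per attribute: Outlook, Temperature, Humidity, Windy.
# Assumption: m = number of distinct values and p = 1/m for each attribute.
NUM_VALUES = [3, 3, 2, 2]


def smoothed_scores():
    """Return {class: P(class) * product of m-estimated P(value | class)}."""
    scores = {}
    for label, (n, value_counts) in COUNTS.items():
        score = Fraction(n, TOTAL_ROWS)  # prior P(class)
        for n_c, k in zip(value_counts, NUM_VALUES):
            score *= m_estimate(n_c, n, Fraction(1, k), k)
        scores[label] = score
    return scores


scores = smoothed_scores()
prediction = max(scores, key=scores.get)
print({label: float(s) for label, s in scores.items()}, prediction)
```

Under these assumed settings the smoothed scores still favour No for this sample, so smoothing changes the numbers but not the predicted label.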