A database has five transactions. Let min sup = 60% and min conf = 80%. TID item
Question
A database has five transactions. Let min sup = 60% and min conf = 80%.
TID items
T100 = {M, O, N, K, E, Y}
T200 = {D, O, N, K, E, Y}
T300 = {M, A, K, E}
T400 = {M, U, C, K, Y}
T500 = {C, O, O, K, I, E}
(1) Find all frequent itemsets of a single item using the Apriori algorithm.
(2) Find all frequent itemsets of two items continuing the algorithm.
(3) Find all frequent itemsets of three items.
(4) List all the strong association rules (with support s and confidence c) matching
the following metarule, where X is a variable representing customers, and item_i
denotes variables representing items (e.g., “A,” “B,”):
For all X in transaction, buys(X, item1) ^ buys(X, item2) => buys(X, item3) [s, c]
Explanation / Answer
Distinct items in the database: ['A', 'C', 'D', 'E', 'I', 'K', 'M', 'N', 'O', 'U', 'Y']
There are 11 distinct items in total.
Given min sup = 60% and min conf = 80%.
So, min support count = 60/100 * (5 transactions) = 3 transactions.
Definitions:
a) Support(item-set) = number of transactions in which all items of 'item-set' are purchased.
b) Frequent item-set: an item-set is said to be frequent if Support(item-set) >= min support.
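The two definitions above can be sketched in Python as follows. This is only an illustrative sketch: the names `transactions`, `support`, `min_count` and `is_frequent` are assumptions introduced here, not part of the original answer. Each transaction is stored as a set, so the repeated O in T500 counts once.

```python
# Transactions from the question; sets drop the duplicate O in T500.
transactions = {
    "T100": set("MONKEY"),
    "T200": set("DONKEY"),
    "T300": set("MAKE"),
    "T400": set("MUCKY"),
    "T500": set("COOKIE"),
}

MIN_SUP_PERCENT = 60  # min sup from the question


def support(itemset):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions.values() if items <= t)


# Ceiling of 60% of 5 transactions, done in integer arithmetic.
min_count = (MIN_SUP_PERCENT * len(transactions) + 99) // 100


def is_frequent(itemset):
    """An itemset is frequent if its support reaches the minimum count."""
    return support(itemset) >= min_count


print(min_count)      # 3
print(support("K"))   # 5
print(support("EK"))  # 4
```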
(1) Find all frequent itemsets of a single item using the Apriori algorithm.
Ans:
All Possible itemsets of a single item
[A],[C],[D],[E],[I],[K],[M],[N],[O],[U],[Y]
So,
support([A]) = No. of transactions where A is purchased = 1
support([C]) = No. of transactions where C is purchased = 2
support([D]) = 1
support([E]) = 4
support([I]) = 1
support([K]) = 5
support([M]) = 3
support([N]) = 2
support([O]) = 3
support([U]) = 1
support([Y]) = 3
So, all frequent itemsets of a single item = [E],[K],[M],[O],[Y] (as their support is >= min support)
Claim-1: If an item-set X is not frequent, then any superset of X is also not frequent (the Apriori property).
So, to find all frequent itemsets of two items, we only need to consider pairs built from [E],[K],[M],[O],[Y].
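The single-item pass can be reproduced with a short script. This is a minimal sketch; `item_counts`, `frequent_1` and `MIN_COUNT` are illustrative names, and the threshold of 3 transactions is the one derived above.

```python
from collections import Counter

# Each transaction as a set of items (duplicates within a transaction count once).
transactions = [set("MONKEY"), set("DONKEY"), set("MAKE"), set("MUCKY"), set("COOKIE")]
MIN_COUNT = 3  # 60% of 5 transactions

# Count in how many transactions each item appears.
item_counts = Counter(item for t in transactions for item in t)

# Keep the items whose count reaches the minimum support.
frequent_1 = sorted(item for item, n in item_counts.items() if n >= MIN_COUNT)

print(frequent_1)  # ['E', 'K', 'M', 'O', 'Y']
```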
(2) Find all frequent itemsets of two items continuing the algorithm.
Ans:
All candidate itemsets of two items (pairs of the frequent single items):
support(EK) = 4
support(EM) = 2
support(EO) = 3
support(EY) = 2
support(KM) = 3 (K and M appear together in T100, T300 and T400)
support(KO) = 3
support(KY) = 3
support(MO) = 1
support(MY) = 2
support(OY) = 2
So, all frequent itemsets of two items are [E,K], [E,O], [K,M], [K,O], [K,Y].
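The pair supports can be checked by recounting them directly from the transactions. The sketch below is one way to do that; `pair_support` and `frequent_2` are illustrative names, and the list of frequent single items is taken from part (1).

```python
from itertools import combinations

transactions = [set("MONKEY"), set("DONKEY"), set("MAKE"), set("MUCKY"), set("COOKIE")]
MIN_COUNT = 3  # 60% of 5 transactions
frequent_1 = ["E", "K", "M", "O", "Y"]  # result of part (1)


def support(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


# Candidate pairs are all 2-combinations of the frequent single items.
pair_support = {pair: support(pair) for pair in combinations(frequent_1, 2)}
frequent_2 = [pair for pair, n in pair_support.items() if n >= MIN_COUNT]

for pair, n in pair_support.items():
    print("".join(pair), n)
print(frequent_2)
```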
(3) Find all frequent itemsets of three items
Ans:
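All possible itemsets of three items built from E, K, M, O, Y:
support(EKM) = 2
support(EKO) = 3
support(EKY) = 2
support(EMO) = 1
support(EMY) = 1
support(EOY) = 2
support(KMO) = 1
support(KMY) = 2
support(KOY) = 2
support(MOY) = 1
By Claim-1, every triple other than EKO contains an infrequent pair (EM, EY, MO, MY or OY), so Apriori prunes it without counting; the counts above confirm that each of them has support <= 2.
So, [E,K,O] is the only frequent itemset of three items.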
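The join-and-prune step for three items can be sketched as below. Note the assumptions: the join here is simplified (any two frequent pairs sharing one item are merged, rather than the textbook prefix-based join), and all names (`candidates_3`, `frequent_3`, and so on) are illustrative. The frequent pairs are recomputed from the transactions so the block stands alone.

```python
from itertools import combinations

transactions = [set("MONKEY"), set("DONKEY"), set("MAKE"), set("MUCKY"), set("COOKIE")]
MIN_COUNT = 3  # 60% of 5 transactions


def support(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


# Recompute the frequent items and frequent pairs from the data.
items = sorted(set().union(*transactions))
frequent_1 = [i for i in items if support(i) >= MIN_COUNT]
frequent_2 = {p for p in combinations(frequent_1, 2) if support(p) >= MIN_COUNT}

# Join: merge two frequent pairs that share one item into a triple.
# Prune: keep the triple only if all three of its pairs are frequent.
candidates_3 = set()
for a, b in combinations(sorted(frequent_2), 2):
    triple = tuple(sorted(set(a) | set(b)))
    if len(triple) == 3 and all(p in frequent_2 for p in combinations(triple, 2)):
        candidates_3.add(triple)

# Count only the surviving candidates.
frequent_3 = sorted(c for c in candidates_3 if support(c) >= MIN_COUNT)

print(candidates_3)  # {('E', 'K', 'O')}
print(frequent_3)    # [('E', 'K', 'O')]
```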
(4)
Definitions:
a) Confidence(item-set1 => item-set2) = Support(item-set1 ∪ item-set2) / Support(item-set1).
b) item-set1 => item-set2 is a strong association rule if
1) (item-set1 ∪ item-set2) is a frequent item-set, and
2) Confidence(item-set1 => item-set2) >= min-confidence.
The question asks for association rules of the following form:
buys(X, item1) ^ buys(X, item2) => buys(X, item3) [s, c]
We have only one frequent itemset of length 3, [E,O,K], with support 3, so every rule built from it has s = 3/5 = 60%.
For itemset [E,O,K] there are three possibilities:
1) [E,O] => K
CONF = support([E,O,K])/support([E,O])
= 3/3 = 100%
Hence [E,O] => K is a strong association rule [s = 60%, c = 100%].
2) [O,K] => E
CONF = support([E,O,K])/support([O,K])
= 3/3 = 100%
Hence [O,K] => E is a strong association rule [s = 60%, c = 100%].
3) [E,K] => O
CONF = support([E,O,K])/support([E,K])
= 3/4 = 75%
As 75% < 80%, [E,K] => O is not a strong association rule.
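The three confidence checks can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the answer (min conf = 80%, rules of the form two items => one item); `strong_rules` and the other names are illustrative only.

```python
from itertools import combinations

transactions = [set("MONKEY"), set("DONKEY"), set("MAKE"), set("MUCKY"), set("COOKIE")]
MIN_CONF_PERCENT = 80  # min conf from the question


def support(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


itemset = ("E", "K", "O")  # the only frequent itemset of three items
n = len(transactions)
s_count = support(itemset)

strong_rules = []
for antecedent in combinations(itemset, 2):
    (consequent,) = set(itemset) - set(antecedent)
    a_count = support(antecedent)
    # confidence >= 80%  is equivalent to  100 * s_count >= 80 * a_count
    if 100 * s_count >= MIN_CONF_PERCENT * a_count:
        strong_rules.append((antecedent, consequent, s_count / n, s_count / a_count))

for antecedent, consequent, sup, conf in strong_rules:
    print(f"{', '.join(antecedent)} => {consequent} [s = {sup:.0%}, c = {conf:.0%}]")
# E, O => K [s = 60%, c = 100%]
# K, O => E [s = 60%, c = 100%]
```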