Question
The following is a set of ten transactions in a convenience store. Let's assume minimum support = 30%; that is, a frequent itemset must have a support count greater than or equal to 3.
(a) Find all frequent itemsets using the Apriori algorithm. Show your computation steps.
(b) Generate all possible rules from the frequent itemsets that contain the largest number of items, and calculate the confidence and lift values for each rule generated.
(c) Assume that the minimum confidence required is 70%. Identify and rank the rules that have at least the minimum confidence.
Explanation / Answer
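The question relies on three measures: support, confidence, and lift. A minimal base-R sketch of how they are usually defined is given here. The transactions are hypothetical and are not the ten transactions from the question, and the helper names support, confidence, and lift are chosen for illustration only.

```r
# Hypothetical transactions, invented for illustration only
# (NOT the ten transactions referred to in the question).
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter", "milk"),
  c("butter", "milk"),
  c("bread", "butter"),
  c("bread", "butter", "milk")
)

# Support: fraction of transactions that contain every item in `items`.
support <- function(items, trans) {
  mean(vapply(trans, function(tr) all(items %in% tr), logical(1)))
}

# Confidence of lhs => rhs: support(lhs and rhs together) / support(lhs).
confidence <- function(lhs, rhs, trans) {
  support(union(lhs, rhs), trans) / support(lhs, trans)
}

# Lift of lhs => rhs: confidence(lhs => rhs) / support(rhs).
lift <- function(lhs, rhs, trans) {
  confidence(lhs, rhs, trans) / support(rhs, trans)
}

support(c("bread", "milk"), transactions)
confidence("bread", "milk", transactions)
lift("bread", "milk", transactions)
```

On this made-up data, {bread, milk} appears in 3 of 5 transactions (support 0.6), the rule bread => milk has confidence 0.6 / 0.8 = 0.75, and its lift is 0.75 / 0.8 = 0.9375. A lift below 1 suggests the two items occur together slightly less often than independence would predict.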
Pros of the Apriori algorithm
It is an easy-to-implement and easy-to-understand algorithm.
It can be used on large itemsets.
Cons of the Apriori Algorithm
It may need to generate a large number of candidate itemsets, which can be computationally expensive.
Calculating support is also expensive because each pass has to scan the entire database.
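Both costs show up in a small level-wise sketch of the algorithm. The base-R code below is a simplified illustration under stated assumptions, not the arules implementation. The transactions, the threshold min_count, and the function names count_support, generate_candidates, and apriori_sketch are all hypothetical.

```r
# Hypothetical transactions, invented for illustration only.
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter", "milk"),
  c("butter", "milk"),
  c("bread", "butter"),
  c("bread", "butter", "milk")
)
min_count <- 3  # hypothetical minimum support count

# Counting support scans every transaction once per candidate itemset.
count_support <- function(items, trans) {
  sum(vapply(trans, function(tr) all(items %in% tr), logical(1)))
}

# Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
# Prune step: drop candidates that have an infrequent (k-1)-subset.
generate_candidates <- function(frequent, k) {
  keys <- vapply(frequent, paste, character(1), collapse = ",")
  candidates <- list()
  n <- length(frequent)
  if (n < 2) return(candidates)
  for (i in 1:(n - 1)) {
    for (j in (i + 1):n) {
      cand <- sort(union(frequent[[i]], frequent[[j]]))
      if (length(cand) != k) next
      subsets <- combn(cand, k - 1, simplify = FALSE)
      sub_keys <- vapply(subsets, paste, character(1), collapse = ",")
      if (all(sub_keys %in% keys)) {
        # Keying by the joined name removes duplicate candidates.
        candidates[[paste(cand, collapse = ",")]] <- cand
      }
    }
  }
  unname(candidates)
}

# Level-wise search: frequent 1-itemsets first, then 2-itemsets, and so on,
# stopping when a level produces no frequent itemsets.
apriori_sketch <- function(trans, min_count) {
  items <- sort(unique(unlist(trans)))
  is_frequent <- function(s) count_support(s, trans) >= min_count
  level <- Filter(is_frequent, as.list(items))
  result <- list()
  k <- 1
  while (length(level) > 0) {
    result <- c(result, level)
    k <- k + 1
    level <- Filter(is_frequent, generate_candidates(level, k))
  }
  result
}

frequent <- apriori_sketch(transactions, min_count)
vapply(frequent, paste, character(1), collapse = ",")
```

On this made-up data, the three single items and the three pairs are frequent, while the only 3-item candidate occurs in just two transactions and is discarded. Every candidate triggers a full scan of the transactions, which is why the number of candidates dominates the running time.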
R implementation
In R, the Apriori algorithm is available in the arules package. The function demonstrated here for mining association rules is
apriori(data, parameter = NULL)
The arguments of the function apriori are
data: A transactions object, or any data structure that can be coerced into transactions (e.g., a binary matrix or data.frame).
parameter: A named list containing the mining thresholds, such as minimum support and minimum confidence. When it is left as NULL, the defaults are a minimum support of 0.1, a minimum confidence of 0.8, a maximum of 10 items per itemset (maxlen), and a maximum of 5 seconds for subset checking (maxtime).
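As a sketch of how the two arguments fit together, the code below coerces a plain list of baskets into transactions and overrides the default thresholds. The baskets are hypothetical, and the 30% support and 70% confidence values are borrowed from the question only as an example. It assumes the arules package is installed.

```r
library(arules)

# Hypothetical baskets, invented for illustration only.
baskets <- list(
  c("bread", "milk"),
  c("bread", "butter", "milk"),
  c("butter", "milk"),
  c("bread", "butter"),
  c("bread", "butter", "milk")
)

# Coerce the list into the transactions class used by arules.
trans <- as(baskets, "transactions")

# Override the defaults: 30% minimum support, 70% minimum confidence,
# and at least two items per rule (minlen = 2), which skips rules
# with an empty left-hand side.
rules <- apriori(trans, parameter = list(supp = 0.3, conf = 0.7, minlen = 2))

# Print each rule together with its quality measures.
inspect(rules)
```

Any threshold not named in the list keeps its default value.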
> library(arules)
> data("Adult")
> rules <- apriori(Adult,parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
> summary(rules)
# set of 52 rules
#
# rule length distribution (lhs + rhs):sizes
#  1  2  3  4
#  2 13 24 13
#
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#   1.000   2.000   3.000   2.923   3.250   4.000
#
# summary of quality measures:
#     support         confidence          lift
#  Min.   :0.5084   Min.   :0.9031   Min.   :0.9844
#  1st Qu.:0.5415   1st Qu.:0.9155   1st Qu.:0.9937
#  Median :0.5974   Median :0.9229   Median :0.9997
#  Mean   :0.6436   Mean   :0.9308   Mean   :1.0036
#  3rd Qu.:0.7426   3rd Qu.:0.9494   3rd Qu.:1.0057
#  Max.   :0.9533   Max.   :0.9583   Max.   :1.0586
#
# mining info:
#   data ntransactions support confidence
#  Adult         48842     0.5        0.9
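The summary reports only aggregate statistics. To identify and rank individual rules, one option is to filter and sort the rule set, as sketched below. The lift cutoff of 1 and the choice of five rules are arbitrary values picked for illustration, and the names strong and ranked are hypothetical.

```r
library(arules)
data("Adult")

# Same call as in the session above.
rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))

# Keep rules whose lift exceeds 1, then rank them by confidence,
# highest first (sort() on rules is decreasing by default).
strong <- subset(rules, subset = lift > 1)
ranked <- sort(strong, by = "confidence")

# Print the top five rules with their support, confidence, and lift.
inspect(head(ranked, n = 5))
```

Sorting by "lift" or "support" works the same way, since by accepts the name of any quality measure stored with the rules.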