Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Apriori. Let’s look at a concrete example, based on the AllElectronics transacti

ID: 3906062 • Letter: A

Question

Apriori. Let’s look at a concrete example, based on the AllElectronics transaction database, D, of Table 6.1. There are nine transactions in this database, that is, |D|=9. We use Figure 6.2 to illustrate the Apriori algorithm for nding frequent itemsets in D. 1. In the rst iteration of the algorithm, each item is a member of the set of candidate 1-itemsets, C1. The algorithm simply scans all of the transactions to count the number of occurrences of each item.

2. Suppose that the minimum support count required is 2, that is, min sup=2. (Here, we are referring to absolute support because we are using a support count.The corresponding relative support is 2/9=22%.)The set of frequent1-item sets,L1,can then be determined. It consists of the candidate 1-itemsets satisfying minimum support. In our example, all of the candidates in C1 satisfy minimum support.

3. To discover the set of frequent 2-itemsets, L2, the algorithm uses the join L1 1 L1 to generate a candidate set of 2-itemsets, C2.7 C2 consists of|L1| 22-itemsets. Note that no candidates are removed from C2 during the prune step because each subset of the candidates is also frequent. Table 6.1 Transactional Data for an AllElectronics Branch TID List of item IDs

T100 I1, I2, I5

T200 I2, I4

T300 I2, I3

T400 I1, I2, I4

T500 I1, I3

T600 I2, I3

T700 I1, I3

T800 I1, I2, I3, I5

T900 11, 12, 13

Data Mining Page 250

Explanation / Answer

Apriori algorithm is used to find frequent item sets. This is done by generating candidate item sets by testing a few specifications. This can be explained with the help of the given example

TID

List of items_IDs

T100

I1,I2,I5

T200

I2,I4

T300

I2,I4

T400

I1,I2,I4

T500

I1,I3

T600

I2,I3

T700

I1,I3

T800

I1,I2,I3,I5

T900

I1,I2,I3

This is the list of transactions with the column name TID (i.e Transaction ID) and the corresponding list of item sets (i.e I1,I2…etc) for each transaction.

The given minimum support count is 2 (i.e min_sup=2) this means that the total number of times a particular item set repeats itself must be atleast 2. The table below gives the frequency of each item set

Item set

Frequency

{I1}

6

{I2}

7

{I3}

6

{I4}

2

{I5}

2

[A single row in the above table means that itemset {I1} is seen 6 times in the given table]

The above formed table is known is candidate item set 1 (i.e C1).

Now let us form the candidate item set C2 by taking the combinations of item sets in C1 (having 2 item sets as one single set) and lets also find the support count of each such set (i.e frequency that each set appears)

Item set

Support count

{I1,I2}

4

{I1,I3}

4

{I1,I4}

1

{I1,I5}

2

{I2,I3}

4

{I2,I4}

2

{I2,I5}

2

{I3,I4}

0

{I3,I5}

1

{I4,I5}

0

[ A single row in the above table means that item set I1 along with I2 is seen 4 times in the given table]

This forms the candidate item set C2.

But in the given question the minimum support count (min_sup) = 2.

Therefore in the above table we delete the item sets which have support count less than 2.

Hence the table is modified as follows

Item Set

Support count

{I1,I2}

4

{I1,I3}

4

{I1,I5}

2

{I2,I3}

4

{I2,I4}

2

{I2,I5}

2

Now let us generate the candidate item set C3 (by taking 3 item sets from C2 as a single item set) .

This is formed as follows

Item set

Support count

{I1,I2,I3}

2

{I1,I2,I5}

2

Now candidate item set C4 is formed by combining 4 item sets from C3 as a single item set.

But in the given table there is only one transaction with 4 item sets and this does not meet the minimum support count required. So C4 cannot be formed and is considered as empty and the Apriori algorithm ends here giving {I1,I2,I3} and {I1,I2,I5} as the required frequent item sets.

TID

List of items_IDs

T100

I1,I2,I5

T200

I2,I4

T300

I2,I4

T400

I1,I2,I4

T500

I1,I3

T600

I2,I3

T700

I1,I3

T800

I1,I2,I3,I5

T900

I1,I2,I3

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote