Q16: Explain FP Tree Growth Algorithm with a suitable example

Introduction

FP-Growth (Frequent Pattern Growth) is an efficient and scalable method for mining frequent itemsets without the need for candidate generation, unlike the Apriori algorithm. It uses a data structure called the FP-Tree (Frequent Pattern Tree).

Steps of the FP-Growth Algorithm

  1. Scan the transaction database to determine the frequency (support count) of each item.
  2. Remove infrequent items based on a minimum support threshold.
  3. Sort the frequent items in descending order of their frequency.
  4. Construct the FP-Tree:
    • Create the root as a null node.
    • Insert each transaction into the tree. Common prefixes share nodes, and counters are incremented.
  5. Use the FP-Tree to generate frequent itemsets through conditional pattern bases and recursive tree projections.

Example

Consider the following transaction dataset:

Transaction ID Items
T1 A, B
T2 B, C, D
T3 A, C, D, E
T4 A, D, E
T5 A, B, C

Step 1: Frequency Count

  • A – 4
  • B – 3
  • C – 3
  • D – 4
  • E – 2

Assume minimum support = 2, so all items are kept.

Step 2: Sort Items by Frequency

Order: A (4), D (4), B (3), C (3), E (2)

Step 3: Build the FP-Tree

  • T1: A → B
  • T2: D → B → C
  • T3: A → D → C → E
  • T4: A → D → E
  • T5: A → B → C

Tree is built by sharing prefixes and incrementing counters.

Mining Frequent Itemsets

Now recursively mine the tree starting from the least frequent item (E) and generate conditional trees.

For example:

  • Conditional base for E: {A D C}, {A D}
  • Conditional FP-Tree for E: A → D
  • Frequent itemsets: {E}, {D E}, {A E}, {A D E}

Advantages

  • Efficient: Reduces the number of candidate sets generated
  • Scalable: Works better on large datasets compared to Apriori
  • Compact representation through shared paths

Conclusion

FP-Growth is a highly efficient algorithm for frequent pattern mining. Its use of prefix trees helps manage large datasets effectively, and it eliminates the costly candidate generation process seen in older algorithms like Apriori.

Leave a Comment

Your email address will not be published. Required fields are marked *

Disabled !