Title
Finding the frequent itemsets using the itemset tree/
Publisher
Mohamed Ahmed Yakout,
Author
Yakout, Mohamed Ahmed.
Subject
Database. Computer science.
Publication date
2006
Number of pages
iii-xi+61 P.:
Index
Only 14 pages are available for public view


Abstract

The problem of finding frequent itemsets is crucial in data mining. Since the introduction of the Apriori algorithm in 1994, several methods have been proposed to improve its performance. The proposed algorithms follow two main approaches. In the first, algorithms use a candidate generate-and-test process to find frequent itemsets. In the second, more recent approach, algorithms transform the original data into a representation better suited to frequent itemset mining.
The recent dataset-transformation approach suffers either from a possible increase in the number of structures produced during the execution of the algorithm or from the processing time spent projecting or decomposing the datasets. In addition, since the constructed structure is altered or destroyed during execution, and is sometimes built only after filtering the transactions, it cannot be reused in ad hoc mining queries or in other mining processes.
In this thesis, we make use of the Itemset Tree structure to count itemset support effectively. The itemset tree was proposed for ad hoc mining queries. However, there is no efficient way to use it for support counting during the process of finding the frequent itemsets. We propose using Guidance Information Bits to speed up support counting in the Itemset Tree. To find all the frequent itemsets, the proposed algorithm (TDF) explores the frequent itemset search space depth-first. We generate candidates from the search space and count their support in the Itemset Tree. The generated candidates are grouped such that they share the leading itemset and differ only in the last item.
Several experiments have been conducted to analyze the itemset tree size in main memory and identify the parameters that affect its size. Also, the performance of the TDF algorithm, using the itemset tree with its various shapes and reduced size, is compared with the FP-growth algorithm.
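
The depth-first exploration described in the abstract, where candidates in a group share a leading itemset and differ only in the last item, can be illustrated with a minimal Python sketch. This is not the thesis's TDF algorithm: the abstract does not specify the Itemset Tree layout or the Guidance Information Bits, so support is counted here by a plain scan over the transactions. The names `support` and `mine_depth_first` are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> int:
    """Count the transactions that contain every item of `itemset`.

    Assumption: a linear scan stands in for the Itemset Tree lookup.
    """
    return sum(1 for t in transactions if itemset <= t)


def mine_depth_first(
    raw_transactions: Iterable[Iterable[str]], min_support: int
) -> Dict[Tuple[str, ...], int]:
    """Return every frequent itemset (as a sorted tuple) with its support."""
    transactions = [frozenset(t) for t in raw_transactions]
    items = sorted({item for t in transactions for item in t})
    frequent: Dict[Tuple[str, ...], int] = {}

    def extend(prefix: Tuple[str, ...], tail: Tuple[str, ...]) -> None:
        # One candidate group: all candidates share `prefix` (the leading
        # itemset) and differ only in the last item, taken from `tail`.
        group = [
            (item, support(frozenset(prefix + (item,)), transactions))
            for item in tail
        ]
        survivors = [(item, s) for item, s in group if s >= min_support]
        for idx, (item, s) in enumerate(survivors):
            new_prefix = prefix + (item,)
            frequent[new_prefix] = s
            # Go deeper before visiting the next sibling (depth-first),
            # extending only with later items that were frequent here.
            extend(new_prefix, tuple(i for i, _ in survivors[idx + 1:]))

    extend((), tuple(items))
    return frequent


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(mine_depth_first(data, 3).items()):
        print(itemset, count)
```

Grouping candidates by their shared leading itemset means a whole group could in principle be counted in a single pass over the underlying structure, which is presumably where a tree-based counter would replace the linear scan used in this sketch.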