Title
HIGH PERFORMANCE DATA MINING IN DISTRIBUTED DATABASES
Author
Darwish, Mahmoud Fouad Anwar.
Supervisory committee
Supervisor / Mahmoud Fouad Anwar Darwish
Supervisor / Hossam El-Din Mostafa Faheem
Supervisor / Nagwa Badr
Supervisor / Rania El-Gohary
Publication date
2017.
Number of pages
102 p.
Language
English
Degree
Master's
Specialization
Information Systems
Date of approval
1/1/2017
Place of approval
Ain Shams University - Faculty of Computer and Information Sciences - Information Systems
Index
Only 14 of 102 pages are available for public view

Abstract

The massive volume of data generated on a daily basis reduces the ability of current data mining techniques to generate knowledge in a short time. Because data changes constantly, existing patterns must be updated constantly as well. Repeating the knowledge discovery process on the whole database with every update is computationally intensive. Therefore, there is a need to enhance the performance of association rule mining methodologies when dealing with incremental updates.
To enhance the performance of incremental association rule mining, this thesis focuses on exploiting current hardware and software advances in high-performance computing. It proposes a distributed incremental association rule mining approach based on MPI, as well as a hybrid incremental mining approach based on OpenMP and MPI for high-performance computing environments. To reduce the need to reprocess the entire database, the thesis relies on the pre-large and negative border approaches.
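As a rough illustration of the pre-large idea mentioned above, the sketch below classifies itemsets using two support thresholds and computes a safety bound on how many new transactions can be absorbed before the original database may need to be rescanned. It follows the commonly cited formulation of pre-large itemsets, not the thesis's own code, which this record does not show. The function names, the `max_size` limit, and the threshold values are assumptions made for illustration, and the sketch is serial (no MPI or OpenMP).

```python
from itertools import combinations
from math import floor


def count_itemsets(transactions, max_size=2):
    """Count how many transactions contain each itemset of up to max_size items."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return counts


def classify(counts, n_transactions, lower, upper):
    """Split itemsets into large (support >= upper) and
    pre-large (lower <= support < upper); the rest are dropped."""
    large, pre_large = {}, {}
    for itemset, c in counts.items():
        support = c / n_transactions
        if support >= upper:
            large[itemset] = c
        elif support >= lower:
            pre_large[itemset] = c
    return large, pre_large


def safety_bound(n_transactions, lower, upper):
    """Number of new transactions that can be added before a rescan of the
    original database may be needed: floor((upper - lower) * d / (1 - upper))."""
    return floor((upper - lower) * n_transactions / (1 - upper))
```

With an original database of 8 transactions, a lower threshold of 0.25, and an upper threshold of 0.5, `safety_bound(8, 0.25, 0.5)` returns 4, so up to four inserted transactions could be processed using only the stored large and pre-large counts.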
To evaluate the proposed approaches, this thesis uses output accuracy, processing time, and speedup (acceleration) as the primary evaluation metrics. Experimental results show that the distributed method reduces processing time by 40% compared to the existing serial approach, and that the hybrid approach reduces processing time by 19% compared to the distributed approach.
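The two reported reductions can be related to speedup factors with simple arithmetic. The sketch below assumes that both percentages refer to the same workload and that the 19% reduction is measured relative to the distributed run. The abstract does not confirm either assumption, and the helper names are illustrative.

```python
def speedup_from_reduction(reduction):
    """Convert a fractional time reduction into a speedup factor (T_old / T_new)."""
    return 1.0 / (1.0 - reduction)


def combined_reduction(first, second):
    """Overall reduction when a second reduction is applied on top of the first."""
    return 1.0 - (1.0 - first) * (1.0 - second)


# Reported figures: 40% (distributed vs. serial), 19% (hybrid vs. distributed).
distributed_speedup = speedup_from_reduction(0.40)   # about 1.67x over serial
hybrid_vs_serial = combined_reduction(0.40, 0.19)    # about 0.514 under the stated assumptions
```

Under these assumptions, the hybrid approach would run in roughly 48.6% of the serial time. This is a derived estimate, not a result reported in the abstract.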