Title
Efficient query processing using Hadoop environment
Publisher
Afaf Ghazi Abdullah Bin Saadon
Author
Afaf Ghazi Abdullah Bin Saadon
Preparation committee
Researcher / Afaf Ghazi Abdullah Bin Saadon
Supervisor / Hods Mokhtar Omar Mokhtar
Publication date
2018
Number of pages
74 leaves
Language
English
Degree
Master's
Specialization
Information Systems
Approval date
21/10/2018
Place of approval
Cairo University - Faculty of Computers and Information - Information Systems
Index
Only 14 pages are available for public view


Abstract

Data is never static; it keeps growing and changing over time. New data is added, and old data can be modified or deleted. This incremental nature of data motivates the development of new systems that perform large-scale data computations incrementally. MapReduce was recently introduced to provide an efficient approach for handling large-scale data computations. Nevertheless, it turned out to be inefficient in supporting the processing of small incremental data. While many previous systems have extended MapReduce to perform iterative or incremental computations, these systems are still inefficient and too expensive for large-scale iterative computations on changing data. In this paper, we present a new system called iiHadoop, an extension of the Hadoop framework, optimized for incremental iterative computations. iiHadoop accelerates program execution by performing the incremental computations only on the small fraction of data that is affected by changes rather than on the whole data. In addition, iiHadoop improves performance by executing iterations asynchronously and by employing locality-aware scheduling for the map and reduce tasks that takes the incremental and iterative behavior into account. An evaluation of the proposed iiHadoop framework is presented using examples of iterative algorithms, and the results showed significant performance improvements over comparable existing frameworks.
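
The idea of processing only the changed fraction of the data can be illustrated with a small sketch. The Python code below is not iiHadoop's implementation and does not use Hadoop; it is a toy word count, and the function names (`map_words`, `full_count`, `incremental_count`) are hypothetical names chosen for this illustration. It assumes the previous result is kept in memory and that changes arrive as lists of added and removed input lines.

```python
from collections import Counter


def map_words(line):
    """Map step: emit a (word, 1) pair for each word in a line."""
    return [(word, 1) for word in line.split()]


def full_count(lines):
    """Baseline: recompute the word count over the whole input."""
    counts = Counter()
    for line in lines:
        for word, n in map_words(line):
            counts[word] += n
    return counts


def incremental_count(previous, added=(), removed=()):
    """Update a previous result by processing only the changed lines.

    Illustrative assumption: 'previous' is the saved output of an
    earlier run, and every removed line was part of that run's input.
    """
    counts = Counter(previous)
    for line in added:
        for word, n in map_words(line):
            counts[word] += n
    for line in removed:
        for word, n in map_words(line):
            counts[word] -= n
            if counts[word] == 0:
                del counts[word]  # drop keys that no longer occur
    return counts


if __name__ == "__main__":
    old_input = ["a b a", "b c"]
    previous = full_count(old_input)
    # One line is replaced: "b c" is deleted and "c d" is added.
    updated = incremental_count(previous, added=["c d"], removed=["b c"])
    print(dict(updated))
```

In this sketch the incremental update touches only the two changed lines, yet it produces the same counts as rerunning `full_count` on the new input. The cost of the update grows with the size of the change rather than with the size of the whole data set, which is the property the abstract attributes to incremental computation.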