Abstract
MapReduce has become an effective framework for processing and analysing very large data sets on large systems. SQL queries, in turn, require an efficient and flexible SQL-to-MapReduce translator. An optimized SQL translator that can handle advanced queries is needed to improve the performance of data analysis as Big Data grows. Hive supports a query language called HiveQL. HiveQL offers features similar to SQL but still has difficulty handling complex SQL queries. Consequently, manual translation of complex SQL queries into HiveQL often leads to poor performance. Flink has also become an effective framework for Big Data analysis on large cluster systems. Flink, however, does not support any query language, so an SQL to Flink translator must be designed and implemented to execute SQL queries over Flink. The work in this thesis addresses these limitations of SQL translators and proposes two contributions, both SQL-to-MapReduce translators, to improve Big Data analysis. The first contribution is called QRMapper (Query Rewriting Mapper). It is developed to solve the problem of translating complex SQL queries into HiveQL by utilizing and optimizing query rewriting. This translator improves the performance of HiveQL without any changes to the Hive framework and makes it possible to execute subqueries and advanced SQL queries. Its performance has been evaluated using the TPC-H benchmark. The second contribution is called SQL to Flink Translator. This new system has been developed to define an SQL query language and add it to Flink. The translator improves the performance of SQL without any changes to the Flink framework and makes it possible to execute SQL queries on Flink by generating Flink programs that execute them. The SQL to Flink Translator can also execute SQL queries with high performance in cases where other systems perform poorly.
Our system's performance has been evaluated using the TPC-H benchmark. Overall, these two contributions provide a new layer for executing advanced SQL queries over MapReduce, which is considered a main contribution to the Big Data field.