Title
An Algorithm Framework for big data processing based on intelligent data operation
Author
Zaghloul, Mohamed Mustafa.
Preparation committee
Researcher / Mohamed Mustafa Zaghloul
Supervisor / مفرح محمد سالم
Supervisor / عمرو ثابت محمد
Examiner / علي ابراهيم الدسوقي
Subject
Intelligent data.
Publication date
2023.
Number of pages
110 p.
Language
English
Degree
Doctorate
Specialization
Systems and Control Engineering
Date of approval
1/1/2023
Place of approval
Mansoura University - Faculty of Engineering - Department of Computer Engineering and Control Systems
Index
Only 14 of 110 pages are available for public view

Abstract

An Algorithm Framework for big data processing based on intelligent data operation
Data operations play a crucial role in data engineering, which is becoming an increasingly significant challenge in data management. Intelligent automation is necessary to handle the growing number of use cases and to move toward agile data operations. Machine learning and automation can enhance the concept of intelligent data operation. Agile data operation is becoming necessary to fully utilize the value of data, especially in big data environments. To address the agility of data operations on big data, a new methodology for data operations is needed. The proposed framework, the Intelligent DataOps framework, is based on a data-process-centric approach and a data augmentation process. The data-process-centric approach in the proposed framework focuses on reengineering the metadata and establishing data operations as processes. This architecture coordinates operations across the entire data supply chain, which consists of various processes. The principles behind the process-centric design are based on multiple process levels: the data chain process level, the data engineering process level, the data function process level, and the data augmentation process level. The data-process-centric methodology automates and orchestrates data processes based on a process control model that includes process definition, process control, process audit, and process operation management. A metadata repository stores all data related to process definitions, object definitions, process configuration, and audit. The data augmentation process works with the data-process-centric approach to increase the agility of data operations using machine learning. The result is an algorithmic framework for processing big data based on intelligent data operation, which improves data engineering operationalization, from orchestration and monitoring to the establishment of a formal process.
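The process control model described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the class names, fields, and the two toy processes (`drop_nulls`, `double`) are hypothetical, and the in-memory dictionary only stands in for the metadata repository. The sketch shows process definitions being registered, a controller running them in order, and an audit record being written for each step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ProcessDefinition:
    """One data process registered in the repository (hypothetical schema)."""
    name: str
    level: str                      # e.g. "data engineering" or "data function"
    run: Callable[[list], list]     # the operation applied to the data
    enabled: bool = True            # process configuration flag


@dataclass
class MetadataRepository:
    """In-memory stand-in that stores process definitions and audit records."""
    definitions: Dict[str, ProcessDefinition] = field(default_factory=dict)
    audit_log: List[dict] = field(default_factory=list)

    def register(self, definition: ProcessDefinition) -> None:
        self.definitions[definition.name] = definition


def run_chain(repo: MetadataRepository, order: List[str], data: list) -> list:
    """Process control: run enabled processes in order and audit each step."""
    for name in order:
        definition = repo.definitions[name]
        if not definition.enabled:
            repo.audit_log.append({"process": name, "status": "skipped"})
            continue
        rows_in = len(data)
        data = definition.run(data)
        repo.audit_log.append(
            {"process": name, "status": "ok",
             "rows_in": rows_in, "rows_out": len(data)}
        )
    return data


# Hypothetical usage: two toy data-function processes chained together.
repo = MetadataRepository()
repo.register(ProcessDefinition(
    "drop_nulls", "data function",
    lambda rows: [r for r in rows if r is not None]))
repo.register(ProcessDefinition(
    "double", "data function",
    lambda rows: [r * 2 for r in rows]))
result = run_chain(repo, ["drop_nulls", "double"], [1, None, 3])
```

Keeping the definitions and the audit trail in one repository object means the controller needs no knowledge of individual processes, which is one plausible reading of "establishing data operations as processes".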
The proposed framework focuses on the case of data transformation performance prediction, which is used to enhance big data operations. The data-process-centric approach (the data engineering process) automates and coordinates data operations. Data engineering processes contain various data transformations, and ensemble learning techniques are used to predict the performance of these transformations (the data augmentation process). The proposed methodology is implemented in a big data environment. The proposed framework improves the mean time to deliver data and the mean time to operate data. Deployment results indicate that the Mean Time to Deliver data (MTTD) is improved by 22%, while the data engineering process with data augmentation improves the MTTD by 24%. The Mean Time to Operate (MTTO) of the data engineering process is improved by 27% when using the process with data augmentation, resulting in a 30% improvement in the MTTO.
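The abstract does not say which ensemble method or which features are used for performance prediction. As an illustration only, the sketch below assumes bagging (bootstrap aggregation) over simple least-squares lines and a single hypothetical feature, the input size of a transformation. The function names and the history values are invented for the example and are not taken from the thesis.

```python
import random


def fit_line(points):
    """Ordinary least squares for y = a*x + b; needs at least two distinct x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def fit_bagged_ensemble(points, n_models=25, seed=0):
    """Fit one line per bootstrap resample of the history (bagging)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        while True:
            sample = [rng.choice(points) for _ in points]
            if len({x for x, _ in sample}) > 1:   # redraw degenerate samples
                break
        models.append(fit_line(sample))
    return models


def predict(models, x):
    """Ensemble prediction: average the predictions of all member models."""
    return sum(a * x + b for a, b in models) / len(models)


# Hypothetical history: (input rows in millions, runtime in seconds).
history = [(1, 12.0), (2, 22.0), (3, 32.0), (4, 42.0), (5, 52.0)]
models = fit_bagged_ensemble(history)
estimate = predict(models, 6)   # predicted runtime for a 6-million-row input
```

Because the invented history lies exactly on the line 10x + 2, every member model recovers that line and the estimate for 6 million rows is about 62 seconds. With real, noisy measurements the members would differ, and averaging them is what reduces the variance of the prediction.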