Author: Ayman Ramadan Ali Elkilany / Title: Clustering utilization in data mining tasks


Title

Clustering utilization in data mining tasks

Publisher

Ayman Ramadan Ali Elkilany

Author

Ayman Ramadan Ali Elkilany

Preparation Committee

Researcher / Ayman Ramadan Ali Elkilany

Supervisor / Ehab Ezzat Hassanein

Supervisor / Neamat Abdelhadi Eltazi

Examiner / Neamat Abdelhadi Eltazi

Publication Date

2018

Number of Pages

68 leaves

Language

English

Degree

Doctorate

Specialization

Information Systems

Approval Date

12/1/2019

Place of Approval

Cairo University - Faculty of Computers and Information - Information Systems

Index

Only 14 pages are available for public view


Abstract

Data have grown enormously in the last few years, and their influence on everyone's life has grown with them. Data resources such as YouTube videos, images, and news are posted to the web daily in huge amounts. With such a massive amount of resources, there is a continuous need for intelligent methods that process and benefit from them in an unsupervised manner. We define intelligent methods in this context as methods that do not need any human intervention to process web resources, since manual human effort would be cumbersome at this scale. Hence, intelligent methods should depend on the data alone to learn and to provide computational intelligence to the user. In this thesis, we propose clustering as our intelligent method for providing the required computational intelligence. Clustering is the task of finding the different phenomena hidden in the data. We intend to use this phenomena information to benefit from the data in data mining tasks in the best way possible. Toward this goal, we propose two different paths for benefiting from the data in an unsupervised manner via clustering. The first path is to give identification to anonymous data gathered from the web. Converting anonymous data into identifiable data would provide a new source of data to learn from. We propose identifying relations between entities in text from news gathered randomly from the web at publication time. Gathering a news dataset and automatically extracting entities and the relations between them allowed us to build a dataset of relation examples and learning from