Title
Clustering and Relating Research Papers using Self-Organizing Maps
Author
Ahmed, Reham Fathy Mahmoud
Committee
Researcher / ريهام فتحي محمود أحمد محمد
Supervisor / هاني محمد كمال مهدى
Supervisor / شريف رمزي سلامة
Examiner / بسنت محمد محمد الكفراوى
Publication date
2021
Number of pages
86 p.
Language
English
Degree
Master's
Specialization
Electrical and Electronic Engineering
Approval date
1/1/2021
Granting institution
Ain Shams University - Faculty of Engineering - Electrical Engineering (Computers)

Abstract

The volume of text data grows substantially every day. We routinely handle vast quantities of text through the Internet or on our own computer systems, so a method for organizing it would be useful. With this growth, clustering text documents has become an important research topic.
For many years, researchers have been trying to find the best algorithm for text clustering. Many algorithms have been used for this task, including Naïve Bayes, Support Vector Machines (SVMs), and Self-Organizing Maps (SOMs).
Research papers are a special type of text document because they use specific expressions and scientific keywords. This motivated us to develop an algorithm that can cluster research papers. This thesis proposes a method for clustering research papers based on SOMs.
A SOM is an unsupervised machine learning method. It has several parameters that must be tuned to produce the best possible solution. These parameters are usually set either manually or by trial and error. In this work, we propose using the well-known genetic algorithm to search the parameter space in an effort to find the best values automatically. Accordingly, this thesis uses a SOM optimized by a genetic algorithm.
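The idea of tuning a SOM with a genetic algorithm can be illustrated with a minimal sketch. It is not the thesis implementation: the 1-D map, the scalar inputs in [0, 1] (for instance gray intensities), the two tuned parameters (`learning_rate`, `sigma`), the quantization-error fitness, and every function name below are assumptions chosen for illustration.

```python
import math
import random


def train_som(data, n_nodes, learning_rate, sigma, epochs, rng):
    """Train a 1-D SOM on scalar samples and return the node weights."""
    weights = [rng.random() for _ in range(n_nodes)]
    for epoch in range(epochs):
        # Linearly decay the learning rate and the neighbourhood width.
        decay = 1.0 - epoch / epochs
        lr = learning_rate * decay
        width = max(sigma * decay, 1e-3)
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            # Best matching unit: the node whose weight is closest to x.
            bmu = min(range(n_nodes), key=lambda i: abs(weights[i] - x))
            for i in range(n_nodes):
                # Gaussian neighbourhood on the node grid around the BMU.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * width ** 2))
                weights[i] += lr * h * (x - weights[i])
    return weights


def quantization_error(data, weights):
    """Mean distance from each sample to its best matching unit."""
    return sum(min(abs(w - x) for w in weights) for x in data) / len(data)


def fitness(params, data, seed):
    """Lower is better: quantization error of a SOM trained with params."""
    learning_rate, sigma = params
    weights = train_som(data, 8, learning_rate, sigma, 20, random.Random(seed))
    return quantization_error(data, weights)


def genetic_search(data, pop_size=8, generations=5, seed=0):
    """Search (learning_rate, sigma) pairs with a small genetic algorithm."""
    rng = random.Random(seed)
    population = [
        (rng.uniform(0.01, 1.0), rng.uniform(0.1, 4.0)) for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=lambda p: fitness(p, data, seed))
        parents = ranked[: pop_size // 2]  # selection: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover takes one gene from each parent; mutation adds small
            # Gaussian noise, clipped back into the search range.
            children.append((
                min(1.0, max(0.01, a[0] + rng.gauss(0, 0.05))),
                min(4.0, max(0.1, b[1] + rng.gauss(0, 0.2))),
            ))
        population = parents + children
    return min(population, key=lambda p: fitness(p, data, seed))
```

Calling `genetic_search([i / 49 for i in range(50)])` returns the best `(learning_rate, sigma)` pair found for 50 evenly spaced values. Because the better half of each generation is carried over unchanged and the fitness is deterministic for a fixed seed, the best error never gets worse from one generation to the next.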
First, we built our algorithm and tested it on clustering gray colors, a simple case study, to measure its efficiency. Then we applied the algorithm to three different research paper data sets. To achieve better results, we also integrated the algorithm with a pretrained Word2Vec model so that it can match different words that have similar meanings. Finally, we compared our results with previous research on clustering research papers, showing that our work outperforms those results, which had themselves already been compared with many earlier methods.
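To illustrate why word embeddings help match different words with similar meanings, the sketch below averages word vectors into a document vector and compares documents by cosine similarity. The abstract does not say how the Word2Vec model is integrated, so this is only one plausible reading: the 3-dimensional `TOY_VECTORS` table is a hand-made, hypothetical stand-in for a pretrained model, and the function names are assumptions.

```python
import math

# Hypothetical 3-d embeddings standing in for a pretrained Word2Vec model.
TOY_VECTORS = {
    "clustering": [0.9, 0.1, 0.0],
    "grouping": [0.8, 0.2, 0.1],
    "papers": [0.1, 0.9, 0.0],
    "articles": [0.2, 0.8, 0.1],
    "genetic": [0.0, 0.1, 0.9],
}


def document_vector(tokens, vectors):
    """Average the embeddings of all tokens that have a vector."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has an embedding")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


doc_a = ["clustering", "papers"]
doc_b = ["grouping", "articles"]  # shares no word with doc_a
doc_c = ["genetic"]

vec_a = document_vector(doc_a, TOY_VECTORS)
vec_b = document_vector(doc_b, TOY_VECTORS)
vec_c = document_vector(doc_c, TOY_VECTORS)

# doc_a and doc_b have no words in common, yet their vectors are close.
similar = cosine(vec_a, vec_b)
dissimilar = cosine(vec_a, vec_c)
```

With these toy vectors, `similar` is much larger than `dissimilar` even though `doc_a` and `doc_b` share no words, which a plain word-count representation could not capture. Document vectors of this kind could then serve as inputs to a clustering method such as a SOM.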