Title
Sentiment Analysis for Arabic text using Deep Learning Techniques
Author
Moawad, Enas Abd El Hakim Khalil.
Committee
Researcher / Enas Abd El Hakim Khalil Moawad
Supervisor / هدى قرشي محمد اسماعيل
Supervisor / إيناس محمد فهمي محمود الهوبي
Examiner / محسن عبد الرازق على رشوان
Publication date
2022.
Number of pages
117 p.
Language
English
Degree
Doctorate
Specialization
Systems and Control Engineering
Date of approval
1/1/2022
Place of approval
Ain Shams University - Faculty of Engineering - Computer and Systems Engineering
Only 14 of 117 pages are available for public view.

Abstract

Sentiment analysis is the research field that examines people’s language for their views, feelings, assessments, attitudes, and emotions. Its growing importance parallels the growth of social media: reviews, blogs, microblogs, forum discussions, Twitter, and social networks.
These platforms hold a huge amount of data. Analyzing this volume of data requires fast and accurate techniques, which support decision-making in various disciplines and inform policy changes as needed.
Emotion analysis is one of the most common sentiment analysis tasks. It aims to detect and distinguish the different emotions expressed through language. Research interest in Arabic sentiment analysis has grown sharply because of the large number of Arabic-speaking internet users. This work builds a framework for multilabel emotion analysis of Arabic tweets. The Arabic tweet dataset is the one provided by SemEval-2018 Task 1, subtask E-c.
In this framework, the tweets are first collected and preprocessed through several normalization steps.
Normalization comprises stemming, stop-word removal, removal of special characters and digits, and building an emotion lexicon that replaces emotion expressions with their meaning in terms of the emotion classes. Feature extraction follows normalization and uses a pre-trained word embedding model.
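The normalization steps described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the stop-word set and the lexicon entries are tiny hypothetical examples, treating emojis as the symbols that the lexicon maps is an assumption, and stemming is left out because it would need a dedicated Arabic stemmer.

```python
import re

# Hypothetical, illustrative stop-word list (not the one used in the thesis).
STOP_WORDS = {"في", "من", "على", "و"}

# Hypothetical emotion lexicon: symbol -> Arabic word for its emotion class.
# Assumption: the lexicon maps emojis; the abstract does not say so explicitly.
EMOTION_LEXICON = {"😊": "فرح", "😢": "حزن"}


def normalize(tweet: str) -> list:
    """Return the cleaned tokens of one tweet (stemming intentionally omitted)."""
    # 1. Replace each known emotion symbol with its lexicon meaning.
    for symbol, meaning in EMOTION_LEXICON.items():
        tweet = tweet.replace(symbol, f" {meaning} ")
    # 2. Remove Western and Arabic-Indic digits.
    tweet = re.sub(r"[0-9\u0660-\u0669]", " ", tweet)
    # 3. Remove everything that is neither an Arabic letter nor whitespace.
    tweet = re.sub(r"[^\u0621-\u064A\s]", " ", tweet)
    # 4. Tokenize on whitespace and drop stop words.
    return [token for token in tweet.split() if token not in STOP_WORDS]
```

Replacing the symbols before stripping special characters matters here: if the order were reversed, the symbols would be deleted before the lexicon could map them.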
Classification was then implemented with a deep learning Bidirectional Long Short-Term Memory (BiLSTM) model, a recent approach for this multilabel task on Arabic social media. The model achieved an accuracy (Jaccard index) of 49.8%.
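For readers unfamiliar with the metric, the following Python sketch shows how a multilabel Jaccard accuracy can be computed: for each tweet, the size of the intersection of the gold and predicted label sets is divided by the size of their union, and the per-tweet scores are averaged. The function name and the label strings are illustrative, and scoring a tweet as 1 when both sets are empty is an assumption about the convention used.

```python
def jaccard_accuracy(gold, pred):
    """Mean per-sample Jaccard index over paired collections of label sets."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of samples")
    if not gold:
        raise ValueError("at least one sample is required")
    total = 0.0
    for gold_labels, pred_labels in zip(gold, pred):
        g, p = set(gold_labels), set(pred_labels)
        union = g | p
        # Assumed convention: no gold and no predicted labels counts as a match.
        total += 1.0 if not union else len(g & p) / len(union)
    return total / len(gold)


# Illustrative labels only; they are not taken from the dataset.
gold = [{"joy", "love"}, {"anger"}, set()]
pred = [{"joy"}, {"anger", "fear"}, set()]
score = jaccard_accuracy(gold, pred)  # per-tweet scores are 0.5, 0.5 and 1.0
```

Unlike exact-match accuracy, this measure gives partial credit when only some of a tweet's emotions are predicted correctly, which suits a task where a tweet can carry several labels at once.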
Other machine learning techniques were also implemented in the classification stage of the framework: Support Vector Machine (SVM) with an accuracy of 46.3%, K-Nearest Neighbors (KNN) with 37.5%, ensemble Random Forest (RF) with 29.1%, Extra Trees with 26.2%, and Multi-Layer Perceptron (MLP) with 48%. The proposed BiLSTM model outperformed all of these in accuracy. It also improved on the previous best models for the same task from other teams: it surpassed the EMA team's model, built on a Support Vector Classifier (SVC), which achieved 48.8%, and it outperformed the UNCC team's deep neural network based on fully connected layers, which reported 44.6%.
Finally, we adopt transfer learning for the emotion classification task. An adapted Arabic Bidirectional Encoder Representations from Transformers (AraBERT) model achieved a Jaccard accuracy of 59.3%, outperforming all of the deep learning and machine learning models implemented. It also surpassed the best previously known result for the same task, a Jaccard accuracy of 54% obtained by an ensemble of deep learning models (BiLSTM, BiGRU) and a transformer-based model (MARBERT).