Author: El-Demerdash, Alaa Abdel-Mohsen Mohamed. / Title: A novel technique to improve the performance of the internet public sentiment analysis


Title

A novel technique to improve the performance of the internet public sentiment analysis /

Author

El-Demerdash, Alaa Abdel-Mohsen Mohamed.

Committee

Examiner / آلاء عبدالمحسن محمد الدمرداش السيد

Supervisor / شريف السيد حسن

Supervisor / جون فايز ونيس

Examiner / يحيى عبدالعظيم المشد

Examiner / محمد شريف مصطفى القصاص

Subject

Application software. Artificial intelligence. Computer communication systems. Data mining. Information storage and retrieval.

Publication date

2021.

Number of pages

online resource (86 pages)

Language

English

Degree

Master's

Specialization

Systems and Control Engineering

Approval date

1/1/2021

Place of approval

Mansoura University - Faculty of Computers and Information - Department of Computer and Systems Engineering

Index

Only 14 pages are available for public view


Abstract

Sentiment analysis has attracted the attention of Egyptian decision-makers in the education sector. It offers a viable way to assess the quality of education services from students’ feedback, and it provides insight into their needs. Machine learning techniques offer automated strategies for processing big data derived from social media and other digital channels, so we use a dataset of tweet sentiments to assess several such techniques. After preprocessing the dataset to remove symbols and applying stemming and lemmatization for feature extraction, we trained several machine learning classifiers, along with a proposed Long Short-Term Memory (LSTM) classifier optimized by the Salp Swarm Algorithm (SSA), and measured their performance. The validity and accuracy of commonly used classifiers, such as Support Vector Machine (SVM), Logistic Regression (LR), and Naive Bayes (NB), were reviewed, and the SSA-optimized LSTM model was compared against them.

The sentiment of students’ feedback was also assessed, and its correlation with each course’s overall evaluation was investigated. For each course, the independent variable is the percentage of positive feedback relative to all student feedback, and the dependent variable is the overall course evaluation, which was calculated statistically. The Pearson, Kendall rank, Spearman, and point-biserial correlations are the standard statistical tests for this kind of measurement. In this work, correlation measures the association between the percentage of positive feedback and the course evaluation. The correlation coefficient ranges from -1 to +1, where ±1 represents a perfect association between the variables, and the relationship weakens as the coefficient approaches zero.
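For reference, the Pearson coefficient mentioned above has the standard form below. The symbols are notation assumed here, not taken from the thesis: x_i is the percentage of positive feedback for course i, y_i is that course's overall evaluation, and n is the number of courses.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```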
The sign of the coefficient indicates the direction of the relationship, positive or negative. For example, the course CSE423 had 45% positive feedback while our system predicted 58%, MUR233 had 30% while our system predicted 48%, and so on for the other courses. The p-value is the probability that uncorrelated datasets would produce a correlation at least as extreme as the one computed from these datasets. The correlation between the predicted positive sentiment of each course and the course evaluation was measured using the Pearson, Kendall rank, Spearman, and point-biserial correlations. Moreover, the Wilcoxon signed-rank test was applied as an additional statistical test. The p-value was found to be 0.018, which was interpreted as accepting the null hypothesis (H0: means are equal). Because the SSA-based LSTM achieved the highest accuracy, it was applied to predict the sentiment of students’ feedback and to evaluate its association with the course outcome evaluations for education quality purposes. The measured Pearson correlation was approximately 0.994.
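The correlation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the function names, the course identifiers, the labels, and the evaluation scores are all hypothetical placeholders, and only the Pearson and Spearman coefficients are shown, using the Python standard library.

```python
from math import sqrt


def percent_positive(labels):
    """Percentage of feedback items labelled 'positive' for one course."""
    if not labels:
        raise ValueError("no feedback for this course")
    return 100.0 * sum(1 for lab in labels if lab == "positive") / len(labels)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data for illustration only (not from the thesis).
predicted_labels = {
    "course_A": ["positive", "negative", "positive", "positive"],
    "course_B": ["negative", "negative", "positive", "negative"],
    "course_C": ["positive", "positive", "negative", "negative"],
}
course_evaluation = {"course_A": 4.1, "course_B": 2.9, "course_C": 3.4}

courses = sorted(predicted_labels)
x = [percent_positive(predicted_labels[c]) for c in courses]  # independent variable
y = [course_evaluation[c] for c in courses]                   # dependent variable

print("Pearson:", pearson(x, y))
print("Spearman:", spearman(x, y))
```

With real data, a statistics library would normally also report the p-value for each coefficient; the sketch omits that to stay dependency-free.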