Author: Anis, Sarah Osama./ Title: Sentiment Analysis in Tourism /

Title

Sentiment Analysis in Tourism /

Author

Anis, Sarah Osama.

Contributors

Researcher / Sarah Osama Anis

Supervisor / Mostafa Aref

Supervisor / Sally Saad

Publication date

2021.

Number of pages

73 p.

Language

English

Degree

Master's

Specialization

Computer Science (miscellaneous)

Date of approval

1/1/2021

Place of approval

Ain Shams University - Faculty of Computers and Information - Computer Science

Abstract

Sentiment analysis is the automated process of analysing people’s opinions and feelings using natural language processing tools. As more activity shifts online, the demand for sentiment analysis has increased tremendously. In tourism, sentiment analysis can help organizations understand tourists’ concerns and complaints and track customer sentiment accurately, enabling them to improve the customer experience. Tourism-related websites have become a rich data source that affects the tourism industry in many ways, since tourists express their opinions about products and services online every day. Interest in understanding and analysing customer opinions has increased significantly over the past few years, as these opinions are vital to the decision making of both customers and companies. In practice, sentiment analysis applies natural language processing, statistics and machine learning methods to extract and identify the prevailing opinion in a review, blog discussion, news item, comment or any other document. It therefore has great potential for understanding tourists’ opinions directly.
This thesis presents a comprehensive overview of recent work in this field, giving a nearly complete picture of sentiment analysis approaches, techniques, and challenges in capturing the correct meaning of sentiments and detecting the appropriate sentiment polarity in the field of tourism. It discusses the general process of sentiment analysis and its stages, along with recent studies reviewed for each stage. It gives a detailed description of the main sentiment classification approaches: the machine learning approach, the lexicon-based approach and the hybrid approach. A comparative analysis of the sentiment classification approaches, methods and techniques is also presented to highlight the differences between them and the advantages and disadvantages of each. The thesis also highlights several challenges in sentiment analysis, shedding light on areas of the field that are less investigated, and presents recommendations for addressing them.
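As a rough illustration of the lexicon-based approach mentioned above, the sketch below scores a review by summing word polarities from a small dictionary. The lexicon, the negator list, and the function name `lexicon_polarity` are hypothetical and chosen for this example only; they are not taken from the thesis.

```python
import re

# Hypothetical toy lexicon; real lexicon-based systems rely on far larger resources.
LEXICON = {
    "clean": 1, "friendly": 1, "great": 1, "comfortable": 1,
    "dirty": -1, "rude": -1, "noisy": -1, "terrible": -1,
}
# Assumed negation words; a negator flips only the token that directly follows it.
NEGATORS = {"not", "never", "no"}


def lexicon_polarity(review: str) -> str:
    """Label a review by summing the polarity of the lexicon words it contains."""
    tokens = re.findall(r"[a-z']+", review.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        value = LEXICON.get(token, 0)
        score += -value if negate else value
        negate = False  # negation scope ends after one token
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(lexicon_polarity("The staff were friendly but the room was not clean"))
```

The one-token negation scope is a deliberate simplification: a phrase such as "not very clean" would be scored as positive, which hints at why handling negation is often listed among the challenges of lexicon-based methods.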
In this thesis, an approach is introduced that automatically performs sentiment analysis on hotel reviews provided by customers on one of the leading travel sites. Different techniques were investigated. The Fuzzy C-means clustering algorithm was used for sentiment detection, that is, to separate subjective sentences from objective ones. Sentiment detection can be viewed as a preliminary step that increases the accuracy of sentiment classification; it is an important sub-task of sentiment analysis that can prevent a sentiment classifier from considering deceptive or misleading text in online reviews. Sentiment classification determines the overall polarity of an opinion, whether it is positive or negative. For sentiment classification, hotel reviews were analysed using techniques such as Naïve Bayes, K-Nearest Neighbour, Support Vector Machine, Logistic Regression, and Random Forest. An ensemble learning model that combines the five classifiers was also proposed, as ensembles are commonly reported to outperform single classifiers. We also investigated the importance of deep learning in sentiment analysis and its ability to improve the accuracy of sentiment prediction, and proposed a deep learning approach based on word embedding and a gated recurrent unit to solve the sentiment classification problem. Finally, the results of the classifiers were compared. The ensemble classifier achieved 86.2% accuracy, and the best result among the five classifiers was obtained by the Support Vector Machine, with 86.3% accuracy. Our deep learning approach outperformed the other methods, with 89% accuracy and a 92% F-score.
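The abstract does not state how the five classifiers are combined in the ensemble. One common option is hard majority voting, sketched below under that assumption. The stand-in classifiers are hypothetical keyword rules, used only to keep the example self-contained; in practice they would be trained models such as the Naïve Bayes or Support Vector Machine classifiers named above.

```python
from collections import Counter
from typing import Callable, Sequence

# A classifier is modelled here as any function mapping a review to a label.
Classifier = Callable[[str], str]


def keyword_classifier(positive: set, negative: set) -> Classifier:
    """Build a hypothetical stand-in classifier from two keyword sets."""
    def classify(review: str) -> str:
        words = set(review.lower().split())
        # Ties (including no keyword hits at all) default to "positive".
        if len(words & positive) >= len(words & negative):
            return "positive"
        return "negative"
    return classify


def majority_vote(classifiers: Sequence[Classifier], review: str) -> str:
    """Return the label predicted by the largest number of classifiers."""
    votes = Counter(clf(review) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Five stand-ins mirror the five base classifiers; with two labels, an odd
# number of voters rules out tied votes.
BASE_CLASSIFIERS = [
    keyword_classifier({"clean", "friendly"}, {"dirty", "rude"}),
    keyword_classifier({"great", "comfortable"}, {"noisy", "terrible"}),
    keyword_classifier({"clean", "great"}, {"dirty", "noisy"}),
    keyword_classifier({"friendly", "helpful"}, {"rude", "slow"}),
    keyword_classifier({"quiet", "spacious"}, {"noisy", "small"}),
]

if __name__ == "__main__":
    print(majority_vote(BASE_CLASSIFIERS, "noisy dirty room and rude staff"))
```

Majority voting cannot exceed what its members agree on, so an ensemble of correlated classifiers may land close to its strongest member, which is consistent with the ensemble and the Support Vector Machine reporting nearly identical accuracy here.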