Author: El-Rashidy, Nora Mahmud./ Title: Improving feature selection using intelligent techniques /

Title

Improving feature selection using intelligent techniques /

Author

El-Rashidy, Nora Mahmud.

Thesis committee

Researcher / نورا محمود متولى الرشيدى

Supervisor / راشدمحمد العدوى

Supervisor / حازم مختار البكرى

Examiner / محمد السعيد نصر

Examiner / محمد محمد فؤاد

Subject

Information system. Information science.

Publication date

2015.

Number of pages

102 p.

Language

English

Degree

Master's

Specialization

Information Systems

Date of approval

01/01/2015

Place of approval

Mansoura University - Faculty of Computers and Information - Department of Information Systems

Abstract

Sentiment analysis is a research topic that began to attract wider attention around 2001. It is mainly concerned with classifying text according to the sentiment it expresses. Sentiment analysis is also concerned with analyzing textual datasets that contain opinions (e.g., discussion groups, blogs, social media, and internet forums), with the objective of classifying each opinion as negative, positive, or neutral. Classifying text by sentiment is considered more difficult than classifying it by content, because opinions expressed in natural language can be conveyed in complex and subtle ways involving slang, ambiguity, sarcasm, irony, and idioms. Various feature selection methods, such as Rough Set Theory (RST), Information Gain (IG), and Minimum Redundancy Maximum Relevance (MRMR), have been used in classification and sentiment analysis; we apply IG for feature selection. Results show that IG is useful for sentiment classification of text documents.
Results are compared to those obtained with other feature selection methods. Many classification methods are also used for sentiment analysis, such as Support Vector Machine (SVM) and Naïve Bayes (NB). We also applied two weighting schemes, Feature Presence and the Senti-word Lexicon. Experimental results show that, given a dataset containing labeled movie reviews, Naïve Bayes achieves a classification accuracy of 91.1% with 42% of the features. The performance of IG varied depending on the dataset. NB performs better than SVM and decision trees on the Arabic movie review dataset. We can therefore recommend multinomial NB for classification. The Senti-word lexicon results in an accuracy of about 94.6%.
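
The pipeline the abstract describes (rank terms by Information Gain, keep a fraction of them, then classify with Naïve Bayes over feature-presence weights) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the four toy reviews, and the 0.3 selection fraction are all hypothetical, chosen only to make the idea concrete in plain Python.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(docs, labels, term):
    """IG of the binary feature 'term is present' with respect to the labels."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    conditional = sum(len(part) / len(labels) * entropy(part)
                      for part in (present, absent) if part)
    return entropy(labels) - conditional

def select_features(docs, labels, fraction):
    """Keep the top `fraction` of the vocabulary, ranked by IG (ties alphabetical)."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t), reverse=True)
    return ranked[:max(1, round(fraction * len(vocab)))]

def train_nb(docs, labels, features):
    """Multinomial NB over feature presence, with add-one (Laplace) smoothing."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(t for t in features if t in d)  # presence: each term counted once
    loglik = {}
    for c in classes:
        total = sum(counts[c].values()) + len(features)
        loglik[c] = {t: math.log((counts[c][t] + 1) / total) for t in features}
    return prior, loglik

def predict(model, doc):
    """Return the class with the highest log-posterior for a set of tokens."""
    prior, loglik = model
    return max(prior, key=lambda c: prior[c] + sum(ll for t, ll in loglik[c].items() if t in doc))

# Hypothetical toy corpus: each review is represented as a set of tokens.
docs = [
    {"great", "acting", "the", "film"},
    {"great", "story", "the"},
    {"boring", "plot", "the", "film"},
    {"boring", "acting", "the"},
]
labels = ["pos", "pos", "neg", "neg"]

features = select_features(docs, labels, fraction=0.3)  # illustrative fraction
model = train_nb(docs, labels, features)
print(features)
print(predict(model, {"great", "film"}), predict(model, {"boring", "story"}))
```

In this toy corpus, terms that appear in every review or equally often in both classes (such as "the" or "film") have zero Information Gain and are dropped, which is the intuition behind classifying with only a fraction of the original features.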