Author: Hassaan, Ahmed Omar Abdullah./ Title: Text-based Filtering of Online Social Networks /


Title

Text-based Filtering of Online Social Networks /

Author

Hassaan, Ahmed Omar Abdullah.

Preparation committee

Researcher / Ahmed Omar Abdullah Hassaan

Supervisor / طارق مصطفى محمود

Supervisor / طارق عبدالحفيظ عبدالرحمن

Subject

Web servers. Community server. Online social networks - Software.

Publication date

2020.

Number of pages

129 p. :

Language

English

Degree

Doctorate

Specialization

Computer Science (miscellaneous)

Date of approval

1/1/2020

Place of approval

Minia University - Faculty of Science - Computer Science

Index

Only 14 pages are available for public view


Abstract

Online Social Networks (OSNs) are the most popular interactive media for communicating, posting, and sharing indefinite amounts of personal information. However, alongside interesting and attractive content, some users do not want topics outside their interests to fill their personal pages, nor do they wish to see discouraging negative posts that appear repeatedly. In addition, people sometimes post inappropriate or abusive content on these networks, such as insults or pornography.
Most efforts in the field of text classification have focused on the English language. Research on the Arabic language is scarce, and Arabic poses numerous challenges, such as word meaning, variations in lexical category, and diacritical marks.
In this thesis, we therefore constructed a standard multi-label Arabic dataset, using manual annotation and a semi-supervised annotation technique, that can be used for short text classification, sentiment analysis, and hate speech detection. To the best of our knowledge, this is the first Arabic dataset that includes multiple classes to support online short text classification and sentiment analysis across many subjects, with no limitation to a particular area.
In addition, we constructed a standard Arabic dataset that can be used for hate speech and abuse detection. In contrast to most previous work, in which datasets were collected from a single platform, the proposed dataset was collected from several social network platforms (Facebook, Twitter, Instagram, and YouTube). To validate the effectiveness of the proposed datasets, twelve machine learning algorithms and two deep learning architectures were used. We also examined the relationship between topics published in OSNs and hate speech.
Then, we evaluated topic classification, sentiment analysis, and hate speech detection. Based on that evaluation, we proposed a filtering system built on supervised machine learning methods that classify Arabic posts in OSNs. The classification covers topic categorization, sentiment analysis, and hate speech detection. The proposed technique can run in the background of the user's browser, and it neither violates privacy rights nor obtains any personal information. The experimental results validate the effectiveness of the proposed datasets and the filtering system.
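
The filtering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `PostFilter` and `Preferences` classes, the label names, and the keyword-based `toy_*` classifiers are all hypothetical stand-ins for trained models, and the order in which the three checks run is an assumption. The sketch only shows how three per-post classifiers (topic, sentiment, hate speech) might be combined with per-user preferences to decide whether a post is hidden.

```python
from dataclasses import dataclass, field
from typing import Callable, Set

# Hypothetical interface: a classifier maps post text to a single label.
Classifier = Callable[[str], str]


@dataclass
class Preferences:
    """Per-user filter settings (illustrative, not taken from the thesis)."""
    blocked_topics: Set[str] = field(default_factory=set)
    hide_negative: bool = False
    hide_hate: bool = True


@dataclass
class PostFilter:
    """Combines three classifiers into a single hide/show decision."""
    topic_clf: Classifier
    sentiment_clf: Classifier
    hate_clf: Classifier

    def should_hide(self, text: str, prefs: Preferences) -> bool:
        # Assumed ordering: hate speech is checked first, then sentiment,
        # then topic. Any single match is enough to hide the post.
        if prefs.hide_hate and self.hate_clf(text) == "hate":
            return True
        if prefs.hide_negative and self.sentiment_clf(text) == "negative":
            return True
        return self.topic_clf(text) in prefs.blocked_topics


# Placeholder keyword rules standing in for trained supervised models.
def toy_topic(text: str) -> str:
    return "sports" if "match" in text.lower() else "other"


def toy_sentiment(text: str) -> str:
    return "negative" if "terrible" in text.lower() else "neutral"


def toy_hate(text: str) -> str:
    return "hate" if "idiot" in text.lower() else "clean"


post_filter = PostFilter(toy_topic, toy_sentiment, toy_hate)
prefs = Preferences(blocked_topics={"sports"}, hide_negative=True)

posts = [
    "Great match last night",
    "What a terrible day",
    "You are an idiot",
    "New recipe for dinner",
]

# Keep only the posts that the filter does not hide for this user.
visible = [p for p in posts if not post_filter.should_hide(p, prefs)]
```

Because the decision uses only the post text and locally stored preferences, a filter of this shape could in principle run entirely on the client side, which is consistent with the abstract's statement that no personal information is obtained.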