Title
A Formal Approach to Modern Standard Arabic Syntax
Author
Abdelhalim, Amira Ahmed.
Committee
Researcher / أمير أحمد عبد الحليم
Supervisor / سامح أبو المجد الأنصاري
Examiner / سامح أبو المجد الأنصاري
Subject
Arabic Language -- Usage. Phonetics and Phonology.
Publication date
2016.
Number of pages
242 p.
Language
English
Degree
Master's
Specialization
Acoustics and Ultrasonics
Date of approval
14/7/2016
Place of approval
Alexandria University - Faculty of Arts - Phonetics and Linguistics
Index
Only 14 pages are available for public view


Abstract

This study addresses the interdisciplinary characteristics of its field of study. The first part covers the description and identification of syntactic structure, the philosophy of sentence structure analysis (with emphasis on Immediate Constituent Analysis, the approach used here for the syntactic representation of sentences), the requirements of an NLP parsing framework, formal grammars, and automatic syntactic analysis. The approach is demonstrated through a syntactic parser supported by an additional module that automatically extracts the syntactic arguments of verbs (automatic subcategorization) at the sentence level, both tailored to the characteristics of the Arabic language. The work is a formal approach to Arabic syntax rather than parsing alone: it offers a complete formal methodology, both theoretical-linguistic and computationally applicable, for automating the analysis of Modern Standard Arabic (MSA) syntactic structures. The study focuses on verbal sentences. It first parses the different kinds of simple and complex constituents within them, and then combines these with verb subcategorization frame information to parse complete, well-formed Arabic verbal sentences correctly. Subcategorization frame information is well known to increase the efficiency and accuracy of automatic syntactic analysis, because it makes it possible to distinguish complements from adjuncts, i.e. to resolve phrase attachment.
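The role of subcategorization frames in separating complements from adjuncts can be illustrated with a minimal Python sketch. Everything in it is a hypothetical assumption for illustration: the transliterated verbs, the frame labels, the `SUBCAT` lexicon, and the `split_complements` helper are not taken from the thesis, which implements its grammar in UNL and runs it with IAN. The sketch also assumes that complements directly follow the verb and subject, which is a deliberate simplification.

```python
# Hypothetical toy lexicon: verb -> set of allowed complement frames.
# Verbs, glosses, and frame labels are illustrative assumptions only.
SUBCAT = {
    "kataba": {("NP",), ()},    # 'wrote': optionally transitive
    "dhahaba": {("PP",), ()},   # 'went': optional PP complement
    "aEtaa": {("NP", "NP")},    # 'gave': ditransitive
}


def split_complements(verb, phrases):
    """Split post-verbal phrases into (complements, adjuncts).

    `phrases` is a list of (label, text) pairs. The longest frame of the
    verb that matches a prefix of the phrase labels supplies the
    complements; every remaining phrase is treated as an adjunct.
    """
    frames = sorted(SUBCAT.get(verb, {()}), key=len, reverse=True)
    labels = tuple(label for label, _ in phrases)
    for frame in frames:
        if labels[:len(frame)] == frame:
            return phrases[:len(frame)], phrases[len(frame):]
    # No frame matched: treat every phrase as an adjunct.
    return [], list(phrases)


if __name__ == "__main__":
    phrases = [("PP", "ila al-madrasa"), ("PP", "fi al-sabah")]
    complements, adjuncts = split_complements("dhahaba", phrases)
    print("complements:", complements)
    print("adjuncts:", adjuncts)
```

With this toy lexicon, the first prepositional phrase after `dhahaba` is attached as a complement and the second as an adjunct, which is the kind of attachment decision that frame information supports.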
To apply the approach empirically, the analytical descriptive method of Immediate Constituent Analysis, based on categories and phrase structure rules, is exemplified through tree diagrams and applied to the syntactic structures of sentences selected from the MSA Corpus (the Arabic Parkinson Corpus), a real contemporary written corpus of MSA. The sentences are selected to represent diverse Arabic structural patterns, including diverse clause and constituent patterns, and are explained through constituency relations. A dictionary is built in which the data is segmented and tokenized into parts of speech (POS), supplemented with all required linguistic attributes, and supported with transitivity information for verbs, so that verb subcategorization frames can be detected automatically. The data is first analyzed manually into syntactic trees by Immediate Constituent Analysis in order to deduce the required grammar. To test the applicability of the approach and analysis, this linguistic description is then transformed into a formal grammar in the adopted formal language, the Universal Networking Language (UNL), so that sentences can be analyzed and parsed automatically by the Interactive Analyzer tool (IAN), together with automatic identification of verb subcategorization frames. The results are evaluated against the manual analysis using the F-measure, and the output is compared, as a benchmark, with that of the Stanford Parser. Finally, the study attempts to present deductive formal tests for the grammar model.
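The F-measure evaluation against a manual analysis can be sketched as follows. This is a minimal illustration that assumes each parse is represented as a set of labeled spans `(label, start, end)` and that scoring is done over those spans; the abstract does not state the unit of comparison, so this representation, the `f_measure` function name, and the example spans are assumptions rather than the thesis's actual procedure.

```python
def f_measure(gold, predicted):
    """Balanced F-measure (F1) between two collections of labeled spans.

    Each span is a (label, start, end) tuple. Precision is the share of
    predicted spans found in the gold analysis; recall is the share of
    gold spans recovered by the parser.
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical manual (gold) and automatic analyses of one sentence.
    gold = {("S", 0, 4), ("VP", 0, 4), ("NP", 1, 2), ("NP", 2, 4)}
    predicted = {("S", 0, 4), ("VP", 0, 4), ("NP", 1, 2), ("PP", 2, 4)}
    print(f_measure(gold, predicted))
```

In this example three of the four predicted spans match the manual analysis, so precision and recall are both 3/4 and the F-measure is 0.75.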