Title
Intelligent Hybrid Man-Machine Translation Evaluation
Author
Sabek, Ibrahim Ahmed Ibrahim Saleh.
Preparation Committee
Researcher / Ibrahim Ahmed Ibrahim Saleh Sabek
ibrahim.sabek@alex-eng.edu.eg
Supervisor / Nagwa Mostafa El-Makky
nagwamakky@gmail.com
Supervisor / Sohair Ahmed Fouad Bassiouny
Examiner / Amin Ahmed Shoukry
amin.shoukry@gmail.com
Examiner / Saleh Abdel-Shakour El-Shehaby
salma.elshehabi@alex-eng.edu.eg
Subject
Computer Science.
Publication Date
2014.
Number of Pages
75 p.
Language
English
Degree
Master's
Specialization
Engineering (Miscellaneous)
Date of Approval
1/6/2014
Place of Approval
Alexandria University - Faculty of Engineering - Computers and Systems
Table of Contents
Only 14 pages (from 106) are available for public view

Abstract

Machine Translation (MT) has attracted considerable attention in the translation community in recent years and has become a crucial component of almost all search engines. However, the widespread adoption of MT technology depends on the trust associated with its outputs. Different approaches have been introduced to address the issues of evaluating translations from one natural language to another. Automatic metrics have been developed to predict the quality of MT outputs. Although these metrics are efficient in terms of speed, they assume the existence of reference translations. Another research direction, known as Quality Estimation (QE), was proposed to exploit human assessments for evaluation using machine learning techniques, without requiring reference translations.

Both automatic metrics and QE approaches have drawbacks. Automatic metrics pay little attention to capturing linguistic information beyond the lexical level and are therefore considered superficial. QE approaches, on the other hand, rely solely on human assessments, which are much more expensive to obtain; moreover, human assessments can vary for the same translated sentence.

In this thesis, the drawbacks of these two directions are addressed. We extracted a set of linguistic and data-driven features from parallel corpora to evaluate MT outputs. The advantages of these features are twofold. First, they provide deep linguistic insight, which addresses a key issue in automatic metrics. Second, they are extracted from parallel corpora without the need for expensive human assessments. The experimental evaluation shows that our proposed system outperforms state-of-the-art automatic metrics in terms of accuracy.
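To illustrate the general idea of reference-free evaluation described above, the sketch below computes a few simple data-driven features for a source/translation pair. The specific features (length ratio, out-of-vocabulary rate against a target-side corpus vocabulary, punctuation agreement) and the toy vocabulary are illustrative assumptions, not the feature set actually used in the thesis.

```python
# Illustrative sketch: reference-free features for a source/translation pair.
# The feature choices and the toy vocabulary are assumptions for demonstration,
# not the thesis's actual linguistic and data-driven feature set.
def extract_features(source, translation, target_vocab):
    src_tokens = source.lower().split()
    tgt_tokens = translation.lower().split()
    # Length ratio between translation and source (fluency proxy).
    length_ratio = len(tgt_tokens) / max(len(src_tokens), 1)
    # Fraction of translation tokens unseen in the target-side corpus vocabulary.
    oov_rate = sum(1 for t in tgt_tokens if t not in target_vocab) / max(len(tgt_tokens), 1)
    # Whether source and translation agree on sentence-final punctuation.
    punct_match = float(source.strip().endswith(".") == translation.strip().endswith("."))
    return {"length_ratio": length_ratio, "oov_rate": oov_rate, "punct_match": punct_match}

# Hypothetical vocabulary built from a (toy) target-side parallel corpus.
vocab = set("the cat sat on mat .".split())
feats = extract_features("le chat est assis", "the cat sat on the mat", vocab)
```

Such feature vectors would then be fed to a learned scoring model; no human assessments or reference translations are needed to compute them, which is the point the abstract makes about features derived from parallel corpora.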
Moreover, if human assessments are available, the proposed approach can benefit from them while resolving their inconsistencies. A probabilistic inference model was devised to infer the credibility of human assessments; trusted assessments can then be used to further improve the accuracy of the proposed system.
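The credibility-inference step can be pictured with a simplified truth-discovery loop: a consensus score is computed as a credibility-weighted average, and each annotator's credibility is then set inversely to their deviation from that consensus. This is a generic sketch under those assumptions, not the probabilistic inference model actually devised in the thesis.

```python
# Hedged sketch: credibility-weighted consensus over human assessments.
# A simplified iterative truth-discovery loop, standing in for (not
# reproducing) the thesis's probabilistic inference model.
def infer_credibility(scores_by_annotator, iterations=10):
    annotators = list(scores_by_annotator)
    n = len(next(iter(scores_by_annotator.values())))
    cred = {a: 1.0 for a in annotators}  # start with uniform credibility
    for _ in range(iterations):
        # Consensus per sentence: credibility-weighted mean of assessments.
        total = sum(cred.values())
        consensus = [
            sum(cred[a] * scores_by_annotator[a][i] for a in annotators) / total
            for i in range(n)
        ]
        # Credibility: inversely proportional to mean deviation from consensus.
        for a in annotators:
            dev = sum(abs(scores_by_annotator[a][i] - consensus[i]) for i in range(n)) / n
            cred[a] = 1.0 / (dev + 1e-6)
    return cred, consensus

# Annotators A and B agree; C deviates, so C's credibility should end up low.
scores = {"A": [4, 5, 3], "B": [4, 5, 3], "C": [1, 1, 5]}
cred, consensus = infer_credibility(scores)
```

The loop captures the abstract's intuition: assessments that persistently disagree with the emerging consensus are down-weighted, and the trusted (high-credibility) assessments dominate the final scores used to refine the system.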