Title
An enhanced deterministic error correction model for optical character recognition degraded Arabic text
Publisher
Mariam Adel Abdelhady Muhammad
Author
Mariam Adel Abdelhady Muhammad
Preparation Committee
Researcher / Mariam Adel Abdelhady Muhammad
Supervisor / Mervat Gheith
Supervisor / Tarek Elghazaly
Supervisor / Mustafa Ezzat
Publication Date
2016
Number of Pages
109 leaves
Language
English
Degree
Master's
Specialization
Computer Science (miscellaneous)
Approval Date
28/3/2017
Place of Approval
Cairo University - Central Library - Computer and Information Science

Abstract

Recently, spelling correction of optical character recognition (OCR) output has been one of the main focuses of natural language processing research. The challenges of the Arabic language and the lack of resources have made it difficult to build Arabic OCR systems with high accuracy. Post-processing techniques are used to correct degraded Arabic OCR text. This research presents a new correction model for Arabic OCR errors. The proposed model is based mainly on character segmentation and on character alignment over single characters or multi-character sequences. This research investigates four factors that can affect the proposed model: (i) increasing the size of the training set, (ii) adding the words of the training and test sets to the dictionary used to find the correct words among the candidate words, (iii) using different versions of the OCR application at test time, and (iv) using different fonts at test time. The results show that the first and second factors have a positive effect, whereas the third and fourth factors have a negative effect on the performance of the model. The results also show that the proposed model contributes to enhancing performance.