Author: Mariam Adel Abdelhady Muhammad / Title: An enhanced deterministic error correction model for optical character recognition degraded Arabic text

Title

An enhanced deterministic error correction model for optical character recognition degraded Arabic text

Publisher

Mariam Adel Abdelhady Muhammad

Author

Mariam Adel Abdelhady Muhammad

Preparation committee

Researcher / Mariam Adel Abdelhady Muhammad

Supervisor / Mervat Gheith

Supervisor / Tarek Elghazaly

Supervisor / Mustafa Ezzat

Publication date

2016

Number of pages

109 leaves

Language

English

Degree

Master's

Specialization

Computer Science (miscellaneous)

Approval date

28/3/2017

Place of approval

Cairo University - Central Library - Computer and Information Science

Index

Only 14 pages are available for public view

Abstract

In recent years, correcting spelling errors in optical character recognition (OCR) output has been one of the main focuses of natural language processing research. The challenges of the Arabic language and the lack of resources have made it difficult to build Arabic OCR systems with high accuracy. Post-processing techniques are used to correct degraded Arabic OCR text. This research presents a new correction model for Arabic OCR errors. The proposed model is based mainly on character segmentation and on character alignment over single characters or multi-character sequences. The research investigates four factors that can affect the proposed model: (i) increasing the size of the training set, (ii) adding the words of the training and test sets to the dictionary used to find the correct word among the candidate words, (iii) using different versions of the OCR application at test time, and (iv) using different fonts at test time. The results show that the first and second factors have a positive effect on the performance of the model, whereas the third and fourth factors have a negative effect. The results also show that the proposed model contributes to enhancing performance.
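
To make the idea of character alignment over single or multi-character segments more concrete, the following is a minimal Python sketch. It is not the thesis's model: the function names (`extract_confusions`, `train`, `candidates`) are hypothetical, the alignment uses the standard-library `difflib.SequenceMatcher` as a stand-in for whatever alignment the author used, and the word pairs are invented toy data. It assumes a training set of (OCR word, correct word) pairs and a dictionary of valid words, and it applies only one learned substitution at a time.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_confusions(ocr_word, truth_word):
    """Align an OCR word with its correct form and return the differing
    (ocr_segment, truth_segment) pairs. Segments may be one or more
    characters long, or empty for pure insertions and deletions."""
    matcher = SequenceMatcher(None, ocr_word, truth_word, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            pairs.append((ocr_word[i1:i2], truth_word[j1:j2]))
    return pairs


def train(word_pairs):
    """Count how often each confusion occurs in the training pairs."""
    counts = Counter()
    for ocr_word, truth_word in word_pairs:
        counts.update(extract_confusions(ocr_word, truth_word))
    return counts


def candidates(word, confusions, dictionary):
    """Apply each learned confusion at each position where it occurs,
    one substitution at a time, and keep only dictionary words."""
    found = set()
    for wrong, right in confusions:
        if not wrong:
            continue  # this sketch skips confusions with an empty OCR side
        start = word.find(wrong)
        while start != -1:
            candidate = word[:start] + right + word[start + len(wrong):]
            if candidate in dictionary:
                found.add(candidate)
            start = word.find(wrong, start + 1)
    return found


# Toy training data (invented for illustration): a multi-character
# confusion in Latin script and a single-character one in Arabic script.
training_pairs = [("rnodel", "model"), ("كتات", "كتاب")]
confusions = train(training_pairs)
dictionary = {"model", "modern", "كتاب"}
suggestions = candidates("rnodern", confusions, dictionary)
```

In this sketch the dictionary acts as the filter that selects correct words among the generated candidates, which is why adding more words to it, or more training pairs to `train`, can change which corrections are reachable.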