Author: El-Abady, Naglaa Fathy Khalil./ Title: Documents forgery detection /

Title

Documents forgery detection /

Author

El-Abady, Naglaa Fathy Khalil.

Committee

Researcher / Naglaa Fathy Khalil El-Abady

Supervisor / Hala Helmy Zayed

Supervisor / Mohamed Taha Abd El Fattah

Examiner / El-Sayed Mohamed Elhorbaty

Examiner / Mazen Mohamed Selim

Subject

Deep Learning. Neural Networks.

Publication date

2023.

Number of pages

162 P. ;

Language

English

Degree

Doctorate

Specialization

Computer Networks and Communications

Approval date

18/10/2023

Place of approval

Benha University - Faculty of Computers and Information - Computer Science

Abstract

Digital forgery is the illegal alteration of digital content, such as photographs, documents, and music, often for financial gain. It is now widespread because readily available image processing and editing software makes digital images easy to manipulate. Digital image processing is the most advanced technique forgers use to fabricate documents.
Personal computers, scanners, and color printers are widespread because they are inexpensive and produce high-quality output. The combination of these low-cost technologies has raised the risk of document forgery: together they are good enough to produce fraudulent documents such as certificates, agreements, identity cards, and lottery tickets.
Document forgery detection is increasingly important because forgery techniques are now available even to inexperienced users. Forgery is a common problem that affects many areas of daily life. For example, customers may present fake documents to banks to obtain a loan, or tampered documents to insurance companies to obtain a payout.
Document forensics involves gathering evidence from questioned documents, or examining handwriting, ink, and paper, to ascertain their source. Forensic document examination analyzes and compares questioned documents with known material to identify, whenever possible, the author or origin of the questioned document.
There are several approaches to document forgery detection. One, used by forensic specialists, identifies the printer that produced the document. Identifying the origin of a printed document is helpful in criminal investigations and in authenticating digital versions of a document. Source Printer Identification (SPI) has become increasingly popular for detecting fraud in printed documents.
Another approach relies on determining the number and type of inks used to create a document. Ink mismatches convey crucial information about forgeries and help determine whether a document is authentic. Identifying and separating these inks in a multispectral document image is exceedingly difficult. Inks made of different materials exhibit different spectral signatures even when they have the same color. Hyperspectral imaging, which records images in many narrow bands of the electromagnetic spectrum, exposes these signatures and thus the distinctive features of the image under investigation. Because ink analysis can give sufficient detail about the type and composition of an ink, hyperspectral document analysis (HSDA) can be used to authenticate documents.
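The claim that same-colored inks can still differ in their spectral signatures can be illustrated with a spectral angle, a standard way to compare two spectra that is insensitive to overall brightness. This is only a sketch: the abstract does not say the thesis uses this measure, and the band values below are hypothetical numbers chosen for illustration.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectra given as equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos)

# Hypothetical reflectance values over five bands, for illustration only.
ink_a = [0.10, 0.12, 0.15, 0.40, 0.70]
ink_b = [0.10, 0.12, 0.15, 0.18, 0.22]
ink_a_darker = [0.5 * v for v in ink_a]  # same ink, lower intensity

angle_same = spectral_angle(ink_a, ink_a_darker)  # close to zero
angle_diff = spectral_angle(ink_a, ink_b)         # clearly larger
```

The two inks agree in the first three bands, as two inks of the same visible color might, yet the angle separates them because they diverge in the remaining bands.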
This thesis introduces a Source Printer Identification (SPI) technique that identifies the origin of a printed document by classifying the questioned document into one of a set of printer classes. To our knowledge, most earlier studies segmented documents into characters, words, or patches, or cropped them, to obtain large datasets. In this thesis, we identify the source printer in two ways: one uses traditional techniques, and the other uses Deep Learning (DL) techniques.
The traditional techniques extract a global feature descriptor vector from each image using Histogram of Oriented Gradients (HOG) features and a local feature descriptor vector using Local Binary Pattern (LBP) features. The HOG and LBP feature vectors are concatenated and used to train the models. The extracted features are classified using Decision Tree (DT), k-Nearest Neighbors (k-NN), and Support Vector Machine (SVM) classifiers, a combination of them, and Bagging, Boosting, and Random Forest classifiers.
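The idea of concatenating a gradient-orientation descriptor with an LBP descriptor and passing the result to a classifier can be sketched in plain Python. This is a simplified illustration, not the thesis implementation: the orientation histogram is a single global histogram rather than full HOG with cells and block normalization, the classifier is a 1-nearest-neighbor rule, and the striped images and the labels `printer_A` and `printer_B` are hypothetical stand-ins for scanned documents.

```python
import math

def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 8-neighbour LBP codes."""
    h, w = len(img), len(img[0])
    # Neighbour offsets, clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0.0] * 256
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1.0
    return [v / total for v in hist]

def orientation_histogram(img, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = img[r][c + 1] - img[r][c - 1]
            gy = img[r + 1][c] - img[r - 1][c]
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / (180.0 / bins)), bins - 1)
            hist[index] += math.hypot(gx, gy)
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]

def describe(img):
    """Concatenate the global orientation descriptor and the LBP descriptor."""
    return orientation_histogram(img) + lbp_histogram(img)

def nearest_label(query, train):
    """1-nearest-neighbour rule over (feature_vector, label) pairs."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]

def stripes(size, vertical, low, high, shift=0):
    """Toy texture: stripes two pixels wide, vertical or horizontal."""
    return [[high if (((c if vertical else r) + shift) // 2) % 2 == 0 else low
             for c in range(size)] for r in range(size)]

# Hypothetical training set: one toy texture per "printer".
train = [
    (describe(stripes(8, True, 0, 255)), "printer_A"),
    (describe(stripes(8, False, 0, 255)), "printer_B"),
]
# A shifted, lower-contrast vertical texture should match printer_A.
query = stripes(8, True, 50, 200, shift=1)
predicted = nearest_label(describe(query), train)
```

In practice the descriptors would more likely come from a library such as scikit-image (`skimage.feature.hog` and `skimage.feature.local_binary_pattern`), with the classifiers listed above taken from scikit-learn.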
The deep learning techniques train three different Convolutional Neural Network (CNN) models on three separate datasets to determine the most accurate model. In the first technique, 13 pre-trained CNNs are tested; they are used only for feature extraction, and an SVM performs the classification. In the second technique, a pre-trained network is retrained using transfer learning and performs both feature extraction and classification. In the third technique, a CNN is trained from scratch and then used for feature extraction, with an SVM for classification.
This thesis also introduces an Ink Mismatch Detection (IMD) technique. Most earlier studies detected black inks with insufficient accuracy. To improve the accuracy of black ink detection, we propose a supervised deep learning approach that captures spectral features from hyperspectral document images to detect ink mismatches. We assessed the performance of the CNN model on the UWA writing ink hyperspectral image dataset, which contains blue and black inks. To simulate ink mismatch, two to five different inks of the same color were artificially mixed in a range of ratios.
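One way to read "mixed in a range of ratios" is that labelled pixels from several inks are combined in a chosen proportion to form a test sample. The sketch below follows that reading, which is an assumption rather than something the abstract spells out; the function name `mix_ink_pixels`, the ink labels, and the three-band spectra are all hypothetical.

```python
import random

def mix_ink_pixels(pixels_by_ink, ratio, total, seed=0):
    """Sample `total` labelled pixel spectra, split across inks by `ratio`."""
    if len(pixels_by_ink) != len(ratio):
        raise ValueError("need one ratio entry per ink")
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    weight = sum(ratio)
    counts = [total * part // weight for part in ratio]
    counts[0] += total - sum(counts)  # rounding leftovers go to the first ink
    mixed = []
    for (label, pixels), n in zip(pixels_by_ink.items(), counts):
        mixed.extend((rng.choice(pixels), label) for _ in range(n))
    rng.shuffle(mixed)
    return mixed

# Hypothetical per-pixel spectra (three bands) for two same-colored inks.
pixels_by_ink = {
    "ink_1": [[0.10, 0.20, 0.60], [0.12, 0.20, 0.58]],
    "ink_2": [[0.10, 0.20, 0.30]],
}
# Mix the two inks in a 1:3 proportion over 40 pixels.
mixed = mix_ink_pixels(pixels_by_ink, ratio=(1, 3), total=40)
```

Each entry pairs a spectrum with its true ink label, so a per-pixel classifier can be scored against the known mixture.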
Finally, the proposed source printer identification methods achieve a highest accuracy of 99.58% and outperform other recently published algorithms in classification accuracy on the same dataset. According to the experimental results, the second proposed method, for ink detection, achieves higher accuracy than all earlier approaches and proves to be the most accurate for detecting both black and blue inks. Compared with techniques that use only spectral features, it improves average accuracy by up to 4.77% for black inks and 0.86% for blue inks. Compared with methods that use both spectral and spatial features, it improves average accuracy by up to 0.36% for black inks and 0.18% for blue inks. The proposed method accurately detects ink mismatches and identifies different inks from their unique spectral responses, which makes it highly useful in document forensics.