Title
Face Recognition on Heterogeneous Architecture using Parallel Computing Paradigms
Author
Ibrahim, Dalia Shouman El-Shahat.
Committee
Researcher / Dalia Shouman El-Shahat Ibrahim
Supervisor / Hossam Faheem
Supervisor / Salma Hamdy
Publication date
2017.
Number of pages
103 p.
Language
English
Degree
Master's
Specialization
Computer Science (miscellaneous)
Approval date
1/1/2017
Granting institution
Ain Shams University - Faculty of Computer and Information Sciences - Computer Systems

Abstract

Face recognition applications are widely used in many areas, particularly security and biometrics. The recognition decision often must be both highly accurate and fast. Principal Component Analysis (PCA) is a feature extraction algorithm used in facial recognition applications: it projects images onto a new face-space, mainly to reduce the dimensionality of the image. However, PCA consumes a lot of processing time because it is computationally intensive.
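As a rough illustration of the projection idea described above, the following is a minimal pure-Python sketch, not the thesis implementation. It assumes images are already flattened into equal-length lists of pixel values, and it finds only the single dominant principal component by power iteration. The function names (`mean_vector`, `top_component`, `project`) are hypothetical and chosen for this example only.

```python
import math

def mean_vector(images):
    """Per-pixel mean over all training images (the 'mean face')."""
    n = len(images)
    return [sum(col) / n for col in zip(*images)]

def center(images, mean):
    """Subtract the mean face from every image."""
    return [[x - m for x, m in zip(img, mean)] for img in images]

def top_component(centered, iterations=100):
    """Dominant principal direction via power iteration.

    The covariance matrix is never formed; each step applies
    A^T (A v), where the rows of A are the centered images.
    Assumes the data are not all identical (non-zero variance).
    """
    dim = len(centered[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered))
             for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(image, mean, components):
    """Coordinates of one image in the reduced face-space."""
    diff = [x - m for x, m in zip(image, mean)]
    return [sum(d * c for d, c in zip(diff, comp)) for comp in components]

# Toy data: three 2-pixel "images" (purely illustrative values).
train = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mean = mean_vector(train)
component = top_component(center(train, mean))
weights = project([5.0, 6.0], mean, [component])
```

In this toy case each 2-pixel image is reduced to a single coordinate. A real eigenface pipeline would keep several components and would typically rely on a linear algebra library rather than hand-written loops.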
To overcome the limitations of single computing systems, different parallel processing paradigms are used to accelerate the process. In this thesis, two face recognition scenarios are implemented using different parallel programming memory architectures.
First, we show how a cluster of supercomputers can be used to accelerate a face recognition system. The work focuses on speeding up either the training or the testing phase of PCA. In addition, the suggested environment adapts dynamically to different numbers of supercomputers. Experimental results show that the proposed architecture improves execution time by up to 25X in the training phase of the first scenario and by up to 5X in the recognition phase of both scenarios, reaching super-linear and linear speed-up, respectively. The system also scales across different data sizes from the Facial Recognition Technology (FERET) database.
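The terms "linear" and "super-linear" speed-up used above can be made concrete with a small sketch. Speed-up is taken here in its usual sense, sequential time divided by parallel time, and is compared with the number of workers. The timings, the worker count, and the 5% tolerance below are illustrative assumptions only; they are not measurements or settings from the thesis.

```python
def speedup(t_sequential, t_parallel):
    """Ratio of sequential to parallel execution time."""
    return t_sequential / t_parallel

def classify(s, workers, tolerance=0.05):
    """Label a speed-up relative to the ideal value `workers`.

    `tolerance` is an arbitrary band (assumed here) around the
    ideal value within which the speed-up is still called linear.
    """
    if s > workers * (1 + tolerance):
        return "super-linear"
    if s >= workers * (1 - tolerance):
        return "linear"
    return "sub-linear"

# Hypothetical timings in seconds on 4 workers (not thesis data).
linear_case = classify(speedup(80.0, 20.0), workers=4)
super_case = classify(speedup(80.0, 10.0), workers=4)
sub_case = classify(speedup(80.0, 40.0), workers=4)
```

Super-linear results are commonly attributed to effects such as each worker's share of the data fitting better in cache or memory, although the abstract itself does not state the cause.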
Furthermore, a hybrid architecture is suggested to optimize face recognition by combining the benefits of multi-core and distributed systems. Hybrid MPI/OpenMP libraries are used to perform a two-level decomposition: one level across the distributed systems, and the other across the cores inside each supercomputer. The proposed approach significantly reduces the algorithm's complexity when implemented over a cluster with a parallel computing architecture. The first scenario is 2975X and 102X faster than the sequential implementation in the training and recognition phases, respectively. The second scenario, by contrast, is 74X faster than the sequential implementation in the recognition phase.
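The two-level decomposition can be pictured with the following sketch. It does not use MPI or OpenMP: the outer level is a plain loop standing in for distributed processes, and the inner level uses a thread pool from the Python standard library standing in for the cores of one machine. All names (`split`, `project_chunk`, `node_work`, `two_level_project`) and the toy data are hypothetical, and how the thesis actually partitions its work is not specified in the abstract.

```python
from concurrent.futures import ThreadPoolExecutor

def split(items, parts):
    """Contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

def project_chunk(chunk, component):
    """Dot product of every image in the chunk with one component."""
    return [sum(p * c for p, c in zip(image, component)) for image in chunk]

def node_work(node_images, component, cores):
    """Inner level: one node splits its share across its cores."""
    with ThreadPoolExecutor(max_workers=cores) as pool:
        parts = list(pool.map(project_chunk,
                              split(node_images, cores),
                              [component] * cores))
    return [score for part in parts for score in part]

def two_level_project(images, component, nodes, cores):
    """Outer level: the image set is split across nodes.

    The nodes are visited one after another here; in a real
    cluster each share would go to a separate process.
    """
    results = []
    for node_images in split(images, nodes):
        results.extend(node_work(node_images, component, cores))
    return results

# Toy data: five 2-pixel images, 2 "nodes" with 2 "cores" each.
images = [[1, 0], [0, 1], [1, 1], [2, 3], [4, 5]]
scores = two_level_project(images, [1, 1], nodes=2, cores=2)
```

Because the chunks are contiguous and results are gathered in order, the output order matches the input order regardless of how many nodes or cores are used.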
In addition, a heterogeneous computing system architecture is proposed to reduce the projection time of the PCA algorithm. The speed-up reaches up to 290X compared to the sequential implementation of the projection step. This improves the total training time by up to 1.6X.