Title
Hand Gesture Spotting and Recognition
Based on Generative and Discriminative
Models in Stereo Color Image Sequences /
Author
Ebrahim, Fatma Shabaan Abd Elbary.
Preparation committee
Researcher / Fatma Shabaan Abd Elbary Ebrahim
Supervisor / Fayed Fayek Ghaleb
Supervisor / Ebrahim Abd Allah Youness
Examiner / Mahmoud Othman Elmezain
Publication date
2018.
Number of pages
230 p.
Language
English
Degree
Doctorate
Specialization
Theoretical Computer Science
Date of approval
1/1/2018
Place of approval
Ain Shams University - Faculty of Science - Department of Mathematics
Index
Only 14 of 230 pages are available for public view

Abstract

Summary
The main objective of this thesis is to study Human-Computer Interaction (HCI), specifically a system for hand gesture spotting and recognition. This task faces several problems. The first is extracting (spotting) key gestures from a continuous sequence of hand motions. The second is that the trajectory, shape, and duration of the same gesture vary considerably, even for the same person.
In the literature, the start point of a gesture is usually found by first detecting its end point and then tracking back along the optimal path; this technique is called backward spotting. The recognizer receives the trajectory of points between the detected start and end points for classification. Consequently, there is a time delay between spotting key (meaningful) gestures and recognizing them. This delay is not acceptable for online applications.
In this work, we handle meaningful gesture spotting and recognition with two different classification techniques: a generative model, Hidden Markov Models (HMMs), coupled with a Support Vector Machine (SVM) and with a neural network (NNW) separately, and discriminative models, Conditional Random Fields (CRFs), in conjunction with an SVM and an NNW. The aim is to determine which gives the superior recognition results.
A stochastic method is proposed for designing a non-gesture model, with no training data, so that meaningful gestures can be spotted accurately with HMMs in conjunction with an SVM and an NNW versus CRFs in conjunction with an SVM and an NNW. For the HMM, the non-gesture model provides a confidence measure that serves as an adaptive threshold for selecting the start and end points of key gestures embedded in the input videos.
First, the hand is segmented using the YCbCr color space and a 3D depth map. The depth map is captured with a Bumblebee stereo camera, which neutralizes complex background scenes. The depth map, based on passive stereo measurement, is obtained from the camera's calibration data together with the mean absolute difference. To track hand motion, a set of hand postures is extracted by the mean-shift technique coupled with the depth map, yielding an accurate and robust hand trajectory.
Second, in addition to three extracted angle features, we derive Zernike moment and elliptic Fourier features of dynamic affine invariants from the 3D spatio-temporal hand volume.
Finally, the generative model (HMM) and the discriminative models (CRFs) perform both spotting and recognition using the set of angle features. In addition, we employ forward spotting coupled with a circular buffer mechanism, which carries out hand gesture spotting and recognition simultaneously with no time delay. After that, an SVM is employed to verify the hand shape between the start and end points of each key (meaningful) gesture, which makes the process robustly view-invariant.
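The forward spotting idea described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function `forward_spot`, the window size, and both scoring callables are hypothetical names introduced here, and the toy scores stand in for the HMM gesture likelihood and the non-gesture (adaptive threshold) likelihood. The sketch only shows how a fixed-size circular buffer lets start and end points be detected in a single forward pass, without tracking back from the end point.

```python
from collections import deque
from typing import Callable, Iterable, List, Sequence, Tuple

Score = Callable[[Sequence[float]], float]


def forward_spot(
    observations: Iterable[float],
    gesture_score: Score,      # hypothetical: score of the best gesture model on the window
    non_gesture_score: Score,  # hypothetical: score of the non-gesture model (adaptive threshold)
    window: int = 2,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of spotted gestures in one forward pass."""
    buffer = deque(maxlen=window)  # circular buffer: the oldest frame is dropped automatically
    segments: List[Tuple[int, int]] = []
    start = None
    last = -1
    for t, obs in enumerate(observations):
        last = t
        buffer.append(obs)
        recent = list(buffer)
        inside = gesture_score(recent) > non_gesture_score(recent)
        if inside and start is None:
            start = t                        # start point: gesture model beats the threshold
        elif not inside and start is not None:
            segments.append((start, t - 1))  # end point: last frame where it still did
            start = None
    if start is not None:
        segments.append((start, last))       # stream ended inside a gesture
    return segments


# Toy stand-ins for the model scores (assumptions, not real HMM likelihoods).
def toy_gesture(win: Sequence[float]) -> float:
    return sum(win) / len(win)


def toy_non_gesture(win: Sequence[float]) -> float:
    return 1.0 - sum(win) / len(win)


segments = forward_spot([0, 0, 1, 1, 1, 0, 0, 0], toy_gesture, toy_non_gesture, window=2)
print(segments)
```

In a real system, each detected segment would be passed to the recognizer as soon as its end point is found, which is what removes the delay of backward spotting.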
As an application of gesture-based interaction, we implemented gestures for the digits 0 to 9 to demonstrate the interplay of the described components and to validate the gesture spotting and recognition system. The HMMs are trained with the Baum-Welch (BW) algorithm, while the CRFs are trained by gradient ascent with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) optimization technique. Input images were captured with a Bumblebee stereo camera system with a 6 mm focal length at 15 FPS and a frame resolution of 240×320 pixels, and the system was implemented in MATLAB. The classification results are obtained on a dataset of 700 video clips of isolated gestures (i.e., 70 clips for each digit from 0 to 9). For training the NNW, HMMs, CRFs, and SVMs, each isolated digit from 0 to 9 has 42 videos. For testing, the dataset includes 280 videos of hand gestures. Experiments show that the proposed system can effectively spot and recognize hand gestures from continuous hand trajectories, with recognition rates of 94.64%, 91.07%, and 88.93% for HMMs coupled with SVM, HMMs coupled with NNW, and NNW alone, and 86.08% and 92.50% for CRFs in conjunction with NNW and CRFs in conjunction with SVM, respectively.
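The dataset figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract; the variable names are illustrative, and the conversion from recognition rates to counts assumes (this is not stated in the abstract) that each rate is computed over all 280 test videos.

```python
DIGITS = 10            # gesture classes: digits 0 to 9
CLIPS_PER_DIGIT = 70   # isolated-gesture clips per digit
TRAIN_PER_DIGIT = 42   # training clips per digit

total = DIGITS * CLIPS_PER_DIGIT   # all isolated-gesture clips
train = DIGITS * TRAIN_PER_DIGIT   # clips used for training
test = total - train               # clips left for testing

# Recognition rates (%) as reported in the abstract.
rates = {
    "HMM+SVM": 94.64,
    "HMM+NNW": 91.07,
    "NNW": 88.93,
    "CRF+NNW": 86.08,
    "CRF+SVM": 92.50,
}

# Assumption: rates are fractions of the whole test set, so each one implies
# an approximate number of correctly recognized test videos.
implied_correct = {name: round(rate / 100 * test) for name, rate in rates.items()}

print(total, train, test)
print(implied_correct)
```

The training and test counts add up to the full dataset of 700 clips, which matches the 280 test videos stated in the text.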