Author: Ali, Yosra Abd El Moniem Emam Mahmoud./ Title: Hybrid Speech Image Model for Solving Automatic Speech Recognition /


Title

Hybrid Speech Image Model for Solving Automatic Speech Recognition /

Author

Ali, Yosra Abd El Moniem Emam Mahmoud.

Preparation committee

Researcher / Yosra Abd El Moniem Emam Mahmoud

Supervisor / Amr Mohamed Refaat

Supervisor / Nashaat Mohamed Hussein

Examiner / Nashaat Mohamed Hussein

Subject

Speech Image.

Publication date

2019.

Number of pages

149 p.

Language

English

Degree

Master's

Specialization

Electrical and Electronic Engineering

Approval date

2/1/2019

Place of approval

Fayoum University - Faculty of Engineering - Department of Electronics and Communications Engineering

Index

Only 14 of 149 pages are available for public view

Abstract

Speech processing tasks may be classified into three categories: speech synthesis, speech encoding, and speech recognition. Speech recognition is the process of converting the speech signal into a sequence of words or classes. Spoken language consists of units such as words or sub-word units such as syllables. Monophones and triphones are also considered examples of sub-word units. Recognizing the language unit is the objective of automatic speech recognition.
This research is oriented toward enhancing speech recognition by preprocessing the speech content. It focuses on classifying speech into vowels, consonants, silence, and closure sounds. It attempts to enhance the stationarity of the speech signal by moving it into a more stationary 2D domain. A speech duration (frame) is transformed into a 2-dimensional image using a proposed technique called Best Tree Encoded Image (BTEI). BTEI is an algorithm that visualizes, in the frequency domain, the best wavelet tree nodes as selected by entropy. The technique relies on the entropy of the wavelet packet tree nodes: the entropy is used to select the best nodes that represent the signal. The research presents a study and evaluation of context-independent phone recognition using BTE, with a comparison against MFCC as the evaluation baseline. The achieved results show that the recognition rate using the proposed new features (BTE) approaches that of the popular MFCC, while BTE needs less memory to store the feature vector, with an average saving of 66%. This promising achievement makes it worthy.
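
The entropy-based node selection mentioned in the abstract can be illustrated with a generic best-basis search over a wavelet packet tree. The following is only a minimal sketch under stated assumptions, not the thesis's BTEI algorithm: the abstract does not name the wavelet filter, the entropy definition, or how the selected tree is encoded into an image. The Haar filter, the non-normalized Shannon cost, and the names `haar_split`, `shannon_cost`, and `best_tree` are all hypothetical choices made for illustration. The frame length is assumed to be a power of two.

```python
import math

def haar_split(x):
    """One Haar analysis step (assumed filter): return (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def shannon_cost(coeffs):
    """Non-normalized Shannon entropy cost: -sum(c^2 * log(c^2)), skipping zeros."""
    return -sum(c * c * math.log(c * c) for c in coeffs if c != 0.0)

def best_tree(x, max_level, path=""):
    """Return (cost, leaves) of the lowest-cost wavelet packet subtree rooted here.

    leaves is a list of (path, coefficients); a path such as "ad" means
    approximation branch first, then detail branch.
    """
    own_cost = shannon_cost(x)
    if max_level == 0 or len(x) < 2:
        return own_cost, [(path, x)]
    approx, detail = haar_split(x)
    cost_a, leaves_a = best_tree(approx, max_level - 1, path + "a")
    cost_d, leaves_d = best_tree(detail, max_level - 1, path + "d")
    # Keep the split only if the children together are cheaper than the parent.
    if cost_a + cost_d < own_cost:
        return cost_a + cost_d, leaves_a + leaves_d
    return own_cost, [(path, x)]

# Usage: pick the best nodes for one short, made-up frame of 8 samples.
frame = [4.0, 2.0, 5.0, 5.0, 1.0, 0.0, -3.0, 2.0]
cost, leaves = best_tree(frame, max_level=3)
for node_path, coeffs in leaves:
    print(node_path or "root", len(coeffs))
```

Because the Haar step is orthonormal, the selected leaves always hold as many coefficients as the input frame and preserve its energy. Only the shape of the selected tree changes from frame to frame, and that shape is the kind of information a tree-encoding feature could capture.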