Author: Hakeem, Fady Khalaf Fahmy / Title: Text-to-Speech Method Optimization

Title

Text-to-Speech Method Optimization

Author

Hakeem, Fady Khalaf Fahmy

Committee

Researcher / Fady Khalaf Fahmy Hakeem

Supervisor / Hazem Mahmoud Abbas

Supervisor / Mahmoud Ibrahim Khalil

Examiner / Mohsen Abdel Razek Rashwan

Publication date

2020

Number of pages

107 p.

Language

English

Degree

Master's

Specialization

Electrical and Electronic Engineering

Approval date

1/1/2020

Place of approval

Ain Shams University - Faculty of Engineering - Computer and Systems Engineering

Abstract

Speech synthesis is the artificial production of human speech. A typical text-to-speech (TTS) system converts text in a given language into a waveform. Many English TTS systems are mature and produce natural, human-like speech. In contrast, other languages, including Arabic, have received attention only recently. Existing Arabic speech synthesis solutions are slow and of low quality, and the naturalness of their synthesized speech is inferior to that of English synthesizers. They also lack key speech factors such as intonation, stress, and rhythm. Various approaches have been proposed to solve these issues, including concatenative methods such as unit selection, and parametric methods. However, they require a lot of laborious work and domain expertise. Another reason for the poor performance of Arabic speech synthesizers is the lack of speech corpora, unlike English, which has many publicly available corpora and audiobooks.
End-to-end speech synthesis methods have managed to achieve nearly natural, human-like speech. However, they are prone to synthesis errors such as missing or repeated words, or incomplete synthesis. We may argue that this is mainly due to the local information preference between the teacher-forcing input and the learned acoustic features of a conditional autoregressive model. The local information preference prevents the model from depending on the text input when predicting acoustic features, which contributes to synthesis errors at inference time. In this work, we compare two