Title
Generative Adversarial Text to image synthesis
Author
Mohamed Fathallah Salem Abo hedima
Preparation committee
Researcher / Mohamed Fathallah Salem Abo hedima
Supervisor / Sherif Saeed El-Etriby
Examiner / Osama Mohamed Abu Saada
Examiner / Hamdy Mohamed Mousa
Number of pages
200 p.
Language
English
Degree
Master's
Specialization
Computer Science
Approval date
21/12/2023
Place of approval
Menoufia University - Faculty of Computers and Information - Computer Science
Index
Only 14 pages (of 92) are available for public view.

Abstract

Generative adversarial networks (GANs) are a powerful tool for synthesizing realistic images, but they are difficult to train and prone to instability and mode collapse. This thesis proposes two models: the first stabilizes and improves the training of GANs, and the second builds on the first to generate images from text. The thesis introduces a new model, the Identity Generative Adversarial Network (IGAN), that addresses mode collapse and stability issues.
This model is based on three modifications to the baseline deep convolutional generative adversarial network (DCGAN). The first modification adds a non-linear custom identity block to the architecture, which makes it easier for the model to fit complex data distributions and reduces training time.
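As a rough illustration only (the thesis's actual block, layer types, and parameters are not specified here), the core idea behind an identity block is a residual connection: the block's output is its input plus a learned non-linear transformation of that input, so the block can fall back to the identity mapping when the transform contributes nothing.

```python
import math

def identity_block(x, transform):
    """Sketch of an identity (residual) block: output = input + transform(input).

    `transform` stands in for the block's learned non-linear layers;
    here it is any function mapping a list of floats to a list of floats.
    """
    return [xi + ti for xi, ti in zip(x, transform(x))]

def tanh_transform(x):
    # Hypothetical non-linearity standing in for a learned convolutional path.
    return [math.tanh(xi) for xi in x]

out = identity_block([0.0, 1.0, -2.0], tanh_transform)
```

Because the input is added back unchanged, gradients can flow through the skip path even when the transform saturates, which is one reason residual-style blocks tend to train faster.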
The second modification smooths the standard GAN loss function by replacing it with a modified loss function. The third and final modification uses minibatch training, letting the model use other examples from the same minibatch as side information to improve the quality and variety of the generated images.
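One common way to smooth a GAN discriminator loss is one-sided label smoothing, where real labels of 1.0 are softened to, say, 0.9 before computing binary cross-entropy. Whether this matches the thesis's modified loss exactly is an assumption; the sketch below only shows the general technique.

```python
import math

def smoothed_bce(pred, target, smooth=0.1):
    """Binary cross-entropy with one-sided label smoothing.

    Real labels (target == 1.0) are softened to 1.0 - smooth; fake labels
    are left at 0.0. `pred` is the discriminator's probability output.
    This is a generic stabilization trick, not the thesis's exact loss.
    """
    t = target * (1.0 - smooth) if target == 1.0 else target
    eps = 1e-12  # guard against log(0)
    return -(t * math.log(pred + eps) + (1.0 - t) * math.log(1.0 - pred + eps))
```

Softening the real label keeps the discriminator from becoming overconfident, which in turn keeps its gradients to the generator informative for longer.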
These modifications stabilize the training process and improve the model's performance. The GAN models are compared using the inception score (IS) and the Fréchet inception distance (FID), two widely used metrics for evaluating the quality and diversity of generated images. The effectiveness of the approach was tested by comparing the IGAN with other GAN models on the CelebA and stacked MNIST datasets.
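For intuition about FID: it is the Fréchet distance between two Gaussians fitted to Inception features of real and generated images, ||mu1 - mu2||^2 + Tr(C1 + C2 - 2(C1 C2)^{1/2}). In one dimension this collapses to a simple closed form, which the sketch below (my simplification, not code from the thesis) computes.

```python
def fid_1d(mu1, sigma1, mu2, sigma2):
    """Frechet distance between two 1-D Gaussians N(mu, sigma^2).

    In one dimension the trace term reduces to (sigma1 - sigma2)^2, so
    FID = (mu1 - mu2)^2 + (sigma1 - sigma2)^2. Identical distributions
    score 0; lower is better.
    """
    return (mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2
```

Because FID penalizes both mean and spread mismatches, it captures diversity failures (e.g., mode collapse shrinking sigma) that a quality-only metric would miss.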
Results show that the IGAN outperforms all the other models, achieving an IS of 13.95 and an FID of 43.71 after training for 200 epochs. In addition to demonstrating the improvement in the IGAN's performance, the instability, diversity, and fidelity of the models were investigated. The results showed that the IGAN converged to the distribution of the real data more quickly. Furthermore, the experiments