Author: El-Bana, Shimaa Hassan Abd El-Kader. / Title: Efficient Image Processing Using New Structures of Deep Neural Networks for Different Applications

Title

Efficient Image Processing Using New Structures of Deep Neural Networks for Different Applications

Author

El-Bana, Shimaa Hassan Abd El-Kader.

Committee

Researcher / شيماء حسن عبد القادر البنا

Supervisor / سعيد السيد إسماعيل الخامى

Supervisor / حسن محمود محمود الرجال

Supervisor / أحمد فتحى محمد القبانى

Examiner / السيد مصطفى سعد

Examiner / نهى عثمان قرنى غريب

Subject

Electric Communication.

Publication date

2024.

Number of pages

159 p.

Language

English

Degree

Doctorate

Specialization

Engineering (Miscellaneous)

Date of approval

7/3/2024

Place of approval

Alexandria University - Faculty of Engineering - Electrical Engineering

Index

Only 14 of 184 pages are available for public view.

Abstract

Convolutional neural networks (CNNs) have become the most widely used deep learning framework for image-processing applications. This prominence stems from their ability to extract intricate features from raw data. Within CNNs, the pooling operation plays a vital role by reducing the spatial dimensions of the feature maps and introducing transformation invariance to the network. However, conventional pooling schemes, including max and average pooling, have certain limitations, notably a fixed pooling window size and information loss. Additionally, the effectiveness of deep architectures has faced several challenges, primarily related to the limited availability of training data. This limitation has posed significant hurdles to fully harnessing the potential of deep architectures in applications that demand a significant degree of privacy and security. While data augmentation and transfer learning have advanced considerably in narrowing the performance gap, there remains room for further improvement. In this thesis, two variants of wavelet pooling are investigated. The first variant uses all sub-bands of a first-level wavelet decomposition, whereas the second variant adapts to specific sub-bands based on the input images processed by the architecture. This adaptive technique is referred to as Matched Wavelet Pooling (MWP). The first application involves the creation of a novel model based on MobileNet-V1. This model incorporates all proposed wavelet sub-bands, and an in-depth investigation is conducted to assess the impact of wavelet pooling on the model's performance in image recognition tasks. Through evaluations on two widely recognized datasets, the proposed model's performance is compared with that of the baseline MobileNet. The results reveal a notable increase in classification accuracy, with a 10% improvement on CIFAR-10 and a substantial 16% improvement on CIFAR-100.
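As a rough illustration of pooling with a first-level wavelet decomposition, the following minimal sketch applies a 2D Haar transform to a single feature map and stacks all four sub-bands. The Haar wavelet, the function names, the sub-band labels, and the choice to stack the sub-bands as channels are assumptions made for this example; the abstract does not state which wavelet the thesis uses or how the sub-bands are combined.

```python
import numpy as np

def haar_dwt2(x):
    """First-level 2D Haar DWT of an (H, W) array with even H and W.

    Returns four half-resolution sub-bands (LL, LH, HL, HH).
    """
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation (low-pass in both directions)
    lh = (a + b - c - d) / 2.0  # difference between the two rows of each block
    hl = (a - b + c - d) / 2.0  # difference between the two columns of each block
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def wavelet_pool(x):
    """Pool an (H, W) map to (4, H/2, W/2) by stacking all sub-bands (assumed design)."""
    return np.stack(haar_dwt2(x), axis=0)

feature_map = np.arange(16, dtype=np.float64).reshape(4, 4)
pooled = wavelet_pool(feature_map)
print(pooled.shape)  # (4, 2, 2)
```

Like max or average pooling with a 2x2 window, this halves each spatial dimension, but the orthonormal Haar transform keeps the total signal energy across the four sub-bands instead of discarding detail.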
Additionally, a shallower version of the proposed architecture, also equipped with wavelet pooling, is evaluated. Remarkably, this shallower model not only matches or exceeds the classification accuracy of the deeper MobileNet versions but also reduces the number of model parameters by nearly 40%. The second application examines the influence of discrete wavelet transform pooling (DWTPL) on multi-label classification of remote sensing images. This approach, known as MLRS-CNN-DWTPL, leverages wavelet pooling to exploit spectral information, a valuable asset in multi-label remote sensing tasks. Performance is evaluated on two extensively used datasets, including the widely recognized AID dataset, and compared with baseline CNNs. Thirdly, a novel matched wavelet pooling (MWP) approach is proposed, showing great promise, especially when applied to MobileNets. This method selects the wavelet band(s) to include during training by matching input images to specific wavelet sub-bands. This thesis aims to answer a fundamental research question: How does the performance of lightweight models such as MobileNets differ when MWP is employed, compared with standard MobileNets and MobileNets using non-matched wavelet pooling? Addressing this question is the primary contribution of this study. The hypothesis is that MWP, when applied to MobileNets, requires less training data than standard MobileNets and non-matched wavelet pooling to achieve the same level of recognition accuracy. Evaluations on the STL10 and CINIC10 benchmarks consistently show training-data reductions of nearly 30% with MWP relative to both the baseline MobileNet and non-matched wavelet pooling. These benchmarks also consistently show superior recognition accuracy.
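The abstract does not say how input images are matched to sub-bands. Purely to make the idea of image-dependent sub-band selection concrete, the sketch below ranks first-level Haar sub-bands by their energy and keeps the top k. The energy criterion, the Haar wavelet, and the names `haar_subbands` and `select_subbands` are hypothetical stand-ins for this example, not the thesis's MWP procedure.

```python
import numpy as np

def haar_subbands(x):
    """First-level 2D Haar sub-bands of an (H, W) array with even H and W."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return {
        "LL": (a + b + c + d) / 2.0,
        "LH": (a + b - c - d) / 2.0,
        "HL": (a - b + c - d) / 2.0,
        "HH": (a - b - c + d) / 2.0,
    }

def select_subbands(image, k=2):
    """Return the names of the k highest-energy sub-bands.

    Energy ranking is a hypothetical matching criterion used only for illustration.
    """
    bands = haar_subbands(image)
    energy = {name: float(np.sum(band ** 2)) for name, band in bands.items()}
    return sorted(energy, key=energy.get, reverse=True)[:k]

# Vertical stripes: every row is [0, 1, 0, 1]
stripes = np.tile(np.array([0.0, 1.0]), (4, 2))
print(select_subbands(stripes, k=2))  # ['LL', 'HL']
```

For the striped image, all detail energy falls in the sub-band that differences adjacent columns, so a selection rule of this kind would keep that band alongside the approximation band and drop the other two.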
In addition, sub-band selection procedures are designed for image-specific pooling in the final application. This approach is tailored to CNN designs for semantic segmentation, with a focus on the U-Net model. The proposed MWP-based pipeline, using the MWP-UNet architecture, consistently outperforms traditional pooling techniques across three widely used datasets. Compared with the current literature, it yields a notable average improvement in Intersection over Union (IoU) of more than 25%. In summary, wavelet pooling represents a noteworthy advance in deep learning and convolutional neural networks (CNNs). This novel pooling technique offers a range of benefits, primarily aimed at mitigating the inherent shortcomings of conventional pooling methods.
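For reference, IoU for a pair of binary segmentation masks is the number of pixels in their intersection divided by the number in their union. The following minimal sketch computes it with NumPy; the function name `iou`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for this example, and the thesis may average IoU over classes or images differently.

```python
import numpy as np

def iou(pred, target):
    """Intersection over Union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as a perfect match (assumed convention)
    return float(np.logical_and(pred, target).sum() / union)

pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
print(iou(pred, target))  # 0.5
```

Here the masks share 2 pixels and together cover 4, giving an IoU of 0.5.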