Title
SPEEDING UP LARGE SCALE MACHINE LEARNING ALGORITHMS USING GPGPU
Author
Elgarhy, Islam Ahmed Hamed
Thesis Committee
Researcher / Islam Ahmed Hamed Elgarhy
Supervisor / Hossam El-Deen Mostafa Fahim
Supervisor / Rania Abd El-Rahman El Gohary
Supervisor / Heba Ahmed Khaled
Publication Date
2019
Number of Pages
93 p.
Language
English
Degree
Master's
Specialization
Computer Science (miscellaneous)
Award Date
1/1/2019
Awarding Institution
Ain Shams University - Faculty of Computers and Information - Computers and Information
Table of Contents
Only 14 pages are available for public view

Abstract

Like many machine learning algorithms, the support vector machine (SVM) incurs a high computational cost in both memory and time, since training requires solving a complex quadratic programming (QP) optimization problem; SVM therefore demands high-performance computing hardware.
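For context, the QP problem referred to here is the standard SVM dual formulation (a textbook statement, not a result of this thesis); training on n samples means optimizing n variables coupled through a dense n-by-n kernel matrix, which is what drives the memory and time cost:

\max_{\alpha}\; \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{subject to} \quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0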
Because physical limits on transistor miniaturization prevent further increases in central processing unit (CPU) clock frequency, recent performance gains have come instead from packaging multiple CPU cores on the same silicon chip and from using the graphics processing unit (GPU) for general-purpose numerical computing.
The combination of a multi-core CPU and a highly scalable GPU is a promising platform for accelerating algorithms that map well onto parallel architectures, and thus an opportunity to cut the high computational time SVM spends solving its optimization problem.
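As an illustration only (a minimal sketch, not the thesis's code), the hybrid pattern can be shown on a hot spot of SVM training: evaluating rows of the kernel matrix. Here OpenMP distributes rows across host threads and each thread launches a CUDA kernel on its own stream; the function names, the per-row launch granularity, and the RBF kernel choice are assumptions made for this example.

#include <cuda_runtime.h>
#include <omp.h>

// One CUDA thread per column j of row i: Krow[j] = exp(-gamma * ||x_i - x_j||^2)
__global__ void rbf_row(const float *X, int n, int d, int i, float gamma,
                        float *Krow) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j >= n) return;
    float dist2 = 0.0f;
    for (int k = 0; k < d; ++k) {
        float diff = X[i * d + k] - X[j * d + k];
        dist2 += diff * diff;
    }
    Krow[j] = expf(-gamma * dist2);
}

// dX and dK are device pointers; each OpenMP thread gets its own stream so
// kernel launches from different host threads can overlap.
void kernel_rows(const float *dX, float *dK, int n, int d, float gamma) {
    #pragma omp parallel
    {
        cudaStream_t s;
        cudaStreamCreate(&s);
        #pragma omp for schedule(static)
        for (int i = 0; i < n; ++i)
            rbf_row<<<(n + 255) / 256, 256, 0, s>>>(dX, n, d, i, gamma,
                                                    dK + (size_t)i * n);
        cudaStreamSynchronize(s);
        cudaStreamDestroy(s);
    }
}

A real implementation would batch many rows per launch to amortize launch overhead; one launch per row is kept here only to make the CPU/GPU division of labor explicit.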
Moreover, TensorFlow is an open-source machine learning framework that lets developers implement machine learning algorithms through its application programming interfaces (APIs). TensorFlow can migrate to alternative hardware components and reduces the time needed to develop alternative algorithms, so it is a candidate for building a cross-platform SVM implementation with short development time. This thesis presents the design and implementation of a hybrid parallel SVM algorithm, together with a comparative study between this hybrid parallel implementation and a TensorFlow implementation.
The benchmark shows significant speedups for the hybrid parallel implementation over a sequential implementation, other parallel implementations, and the TensorFlow implementation.
The proposed hybrid parallel implementation achieves a speedup of 40X over the sequential open-source library LIBSVM and 7.5X over a CUDA-OpenMP implementation for the training process (44,442 records, 102 features, and 9 classes), a speedup of 13.7X over LIBSVM in the classification process for 60,300 records, and a speedup of 14.9X over the TensorFlow SVM implementation on the Pavia Centre dataset.
CUDA-GPU achieves speedups of 154.3X, 60.5X, and 119.7X over TensorFlow-GPU for three different training datasets (Pavia Centre hyperspectral, breast cancer, and Iris flower, respectively). Experimental results also show that the explicit control offered by the CUDA API yields a speed advantage over the implicit control in TensorFlow. However, TensorFlow is a cross-platform implementation that can be migrated to alternative hardware components, which reduces development time.
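To make the explicit/implicit distinction concrete, the following sketch (hypothetical, not from the thesis; the function name explicit_transfer is invented for illustration) shows the kind of control the CUDA runtime API gives the programmer: pinning a host buffer and issuing an asynchronous copy on a chosen stream, synchronizing only when the data is actually needed. TensorFlow performs the equivalent device placement and transfers implicitly on the programmer's behalf.

#include <cuda_runtime.h>
#include <string.h>

void explicit_transfer(const float *hostX, float *devX, size_t bytes) {
    float *pinned;
    cudaStream_t s;
    cudaStreamCreate(&s);
    cudaMallocHost((void **)&pinned, bytes);  // pinned host memory: faster DMA
    memcpy(pinned, hostX, bytes);
    cudaMemcpyAsync(devX, pinned, bytes,      // copy can overlap other work
                    cudaMemcpyHostToDevice, s);
    cudaStreamSynchronize(s);                 // block only when data is needed
    cudaFreeHost(pinned);
    cudaStreamDestroy(s);
}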