Author: Sabah Sayed Mohammad Abdelghany / Title: A computational framework for colorectal cancer

Title

A computational framework for colorectal cancer

Publisher

Sabah Sayed Mohammad Abdelghany

Author

Sabah Sayed Mohammad Abdelghany

Thesis committee

Researcher / Sabah Sayed Mohammad Abdelghany

Supervisor / Ibrahim Farag Abdelrahman

Supervisor / Amr Ahmed Anwar Ali Badr

Supervisor / Mohammad Nassef Fattoh

Publication date

2019

Number of pages

115 leaves

Language

English

Degree

PhD

Specialization

Computer Science Applications

Approval date

17/10/2019

Place of approval

Cairo University - Faculty of Computers and Information - Computer Science

Index

Only 14 of 131 pages are available for public view

Abstract

Cancer is a dangerous disease and a cause of death worldwide. Discovering a few genes relevant to a particular cancer can lead to effective treatments. The challenge with microarray datasets is their high dimensionality: the number of features is huge compared to the modest number of samples. Recent research efforts have attempted to reduce this dimensionality using different feature selection techniques. This thesis presents two ensemble feature selection techniques based on the t-test and the genetic algorithm: the nested genetic algorithm (NestedGA) and the ensemble feature pool approach (EFPA). After the data are preprocessed with a t-test, the two proposed approaches are used to obtain the optimal subset of features by combining data from two different microarray datasets. NestedGA consists of two nested genetic algorithms (outer and inner) that run on two different kinds of datasets. The outer genetic algorithm (OGA-SVM) works on a microarray gene expression dataset, whereas the inner genetic algorithm (IGA-NNW) runs on a DNA methylation dataset. NestedGA is applied to a colorectal cancer dataset with 5-fold cross-validation. After NestedGA is applied, the incremental feature selection (IFS) strategy is used to obtain the smallest optimal gene subset. The gene subset has been validated on an independent dataset, resulting in 99.9% classification accuracy the non-nested GAs
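
The two steps that bracket the genetic algorithms in the abstract, t-test preprocessing and incremental feature selection (IFS), can be illustrated with a minimal Python sketch. It is not the thesis implementation. All function names, the synthetic toy data, the use of Welch's form of the t-statistic, and the nearest-centroid classifier with leave-one-out evaluation are assumptions made to keep the example self-contained. The thesis itself reports SVM and neural-network based GAs with 5-fold cross-validation, and the genetic algorithm stage is omitted here.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples (unequal variances assumed)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_features(X, y):
    """Return feature indices sorted by |t|, most discriminative first."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(welch_t(pos, neg)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def centroid_predict(X_train, y_train, x, feats):
    """Nearest-centroid classifier (a stand-in) restricted to `feats`."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [mean(r[j] for r in rows) for j in feats]
        dist = sum((x[j] - c) ** 2 for j, c in zip(feats, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy using only the given features."""
    hits = 0
    for i in range(len(X)):
        X_tr = X[:i] + X[i + 1:]
        y_tr = y[:i] + y[i + 1:]
        hits += centroid_predict(X_tr, y_tr, X[i], feats) == y[i]
    return hits / len(X)


def incremental_feature_selection(X, y, ranked, max_k):
    """Evaluate growing prefixes of the ranking; keep the smallest best one."""
    best_feats, best_acc = [], -1.0
    for k in range(1, max_k + 1):
        acc = loo_accuracy(X, y, ranked[:k])
        if acc > best_acc:  # strict '>' keeps the smallest subset on ties
            best_feats, best_acc = ranked[:k], acc
    return best_feats, best_acc


# Synthetic toy data (hypothetical): 20 samples, 30 features,
# where only feature 0 separates the two classes.
rng = random.Random(0)
y = [0] * 10 + [1] * 10
X = [[rng.gauss(0, 1) for _ in range(30)] for _ in y]
for row, label in zip(X, y):
    row[0] += 10.0 * label

ranked = rank_features(X, y)
subset, acc = incremental_feature_selection(X, y, ranked, max_k=5)
```

In this sketch the ranking is computed once on all samples for brevity. In a real study the ranking would normally be recomputed inside each cross-validation fold, so that the held-out samples do not influence which features are selected.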