Title
Computational selection of cancer DNA methylation genes
Publisher
Alhasan Ali Hefdhaldin Alkuhlani
Author
Alhasan Ali Hefdhaldin Alkuhlani
Preparation committee
Researcher / Alhasan Ali Hefdhaldin Alkuhlani
Supervisor / Ibrahim Farag Abdelrahman
Supervisor / Mohammad Nassef Fattoh Abdelrahman
Publication date
2017
Number of pages
115 leaves
Language
English
Degree
Master's
Specialization
Computer Science (miscellaneous)
Approval date
25/2/2018
Place of approval
Cairo University - Faculty of Computers and Information - Computer Science
Index
Only 14 pages are available for public view


Abstract

Cancer is a serious disease that causes death worldwide. There are many types of cancer, which can be caused by either genetic or epigenetic changes. Epigenetic changes are modifications that can be passed on to subsequent generations without any change in the DNA sequence. DNA methylation (DNAm) is a common epigenetic mechanism that controls the regulation of gene expression and is useful for the early detection of cancer. Microarrays are the best tool to identify and analyse DNA methylation. The challenge with DNA methylation microarray datasets is their high dimensionality: specifically, the number of CpG sites is huge compared to the number of samples in these datasets. Recent research efforts have attempted to reduce this high dimensionality with different feature selection techniques. This thesis proposes a new approach, namely multi-stage feature selection (MSFS), to select the optimal CpG sites from three different DNAm cancer datasets (breast, colon and lung). The proposed approach utilizes both Filter and Wrapper feature selection methods. MSFS combines three different Filter feature selection methods, namely the Fisher Criterion, the t-test and the Area Under the ROC Curve (AUC), using an ensemble-based strategy. In addition, as the Wrapper feature selection method, we apply a hybrid Genetic Algorithm with Support Vector Machine - Recursive Feature Elimination (SVM-RFE) as its fitness function and SVM as its evaluator. Using the Incremental Feature Selection (IFS) strategy, subsets of 24, 13 and 27 optimal CpG sites are selected for the breast, colon and lung cancer datasets, respectively.
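
The ensemble filter stage described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say how the three filter scores are combined, so averaging the per-filter ranks is an assumption made here, as are the Welch form of the t statistic, all function names, and the toy data. The sketch covers only the Filter stage and uses only the Python standard library.

```python
from statistics import mean, variance


def fisher_score(pos, neg):
    """Fisher criterion for one feature: (mu1 - mu2)^2 / (var1 + var2)."""
    denom = variance(pos) + variance(neg)
    return (mean(pos) - mean(neg)) ** 2 / denom if denom > 0 else 0.0


def t_score(pos, neg):
    """Absolute Welch t statistic for one feature (assumed variant)."""
    se2 = variance(pos) / len(pos) + variance(neg) / len(neg)
    return abs(mean(pos) - mean(neg)) / se2 ** 0.5 if se2 > 0 else 0.0


def auc_score(pos, neg):
    """AUC from pairwise comparisons, folded so that 0.5 means uninformative."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return max(auc, 1.0 - auc)


def ranks(scores):
    """Rank position of each feature (0 = best) when higher scores are better."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    result = [0] * len(scores)
    for position, j in enumerate(order):
        result[j] = position
    return result


def ensemble_filter(X, y, k):
    """Return the k feature indices with the best mean rank over the three filters.

    Mean-rank aggregation is an assumption, not taken from the thesis.
    """
    n_features = len(X[0])
    per_filter_ranks = []
    for scorer in (fisher_score, t_score, auc_score):
        scores = []
        for j in range(n_features):
            pos = [row[j] for row, label in zip(X, y) if label == 1]
            neg = [row[j] for row, label in zip(X, y) if label == 0]
            scores.append(scorer(pos, neg))
        per_filter_ranks.append(ranks(scores))
    mean_rank = [
        sum(r[j] for r in per_filter_ranks) / len(per_filter_ranks)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: mean_rank[j])[:k]


# Hypothetical toy data: 6 samples x 3 "CpG sites" (methylation-like values),
# labels 1 = tumour, 0 = normal. Not taken from the thesis datasets.
X_toy = [
    [0.90, 0.5, 0.6],
    [0.80, 0.4, 0.7],
    [0.85, 0.6, 0.5],
    [0.10, 0.5, 0.4],
    [0.20, 0.6, 0.5],
    [0.15, 0.4, 0.3],
]
y_toy = [1, 1, 1, 0, 0, 0]

selected = ensemble_filter(X_toy, y_toy, k=2)
print(selected)
```

On this toy data the first column separates the two classes cleanly and the second carries no signal, so the sketch keeps columns 0 and 2. In a pipeline like the one the abstract describes, the retained sites would then be passed to the Wrapper stage.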