Author: El-Sayed, Mohammed El-Sayed Helal./ Title: An intelligent classification system for cancer diagnosis /

Title

An intelligent classification system for cancer diagnosis /

Author

El-Sayed, Mohammed El-Sayed Helal.

Committee

Researcher / محمد السيد هلال السيد

Supervisor / رشيد مختار العوضى

Supervisor / محمد محفوظ الموجى

Examiner / علاء الدين محمد رياض

Examiner / محمود محمد أحمد عبداللطيف

Subject

diagnosis - Data processing. Medicine - Data processing. Medical Informatics Computing.

Publication date

2017.

Number of pages

168 p. :

Language

English

Degree

Master's

Specialization

Information Systems

Date of approval

01/05/2017

Place of approval

Mansoura University - Faculty of Computers and Information - Information System

Abstract

Medical diagnosis is an active research area in which machine learning algorithms are used to increase the efficiency and quality of diagnosis. Medical data is inherently hard to work with: it is often uncertain, inconsistent, and incomplete, and it mixes complex data types. Analyzing it requires knowledge of the medical dataset as well as advanced techniques for processing, storing, and accessing the information it contains. Traditional techniques cannot produce optimal results from incomplete or redundant data. An accurate and reliable diagnosis technique is therefore needed to help experts reach results that can lead to successful treatment. Automated diagnostic systems are also needed because not all specialists are experts across domains. In such systems, the actual data may be irrelevant, redundant, or noisy, and not all attributes are valuable for classification, so feature selection is necessary when dealing with real datasets.
In this thesis, we present our proposed framework and apply it to two case studies. In the first case, a feature selection technique is combined with individual classifiers (C5.0 and SVM) and ensemble classifiers (Boosting C5.0 and Boosting SVM). Rough set (RS) theory is employed as the feature selection technique, and C5.0, SVM, Boosting C5.0, and Boosting SVM are employed as the classification techniques. This case uses the HCV data set, which was collected from clinical trials of a newly developed medication for HCV. In the experiments, we compare the results of each classifier with and without feature selection. The results show that, on the selected feature subset, the proposed hybrid RS-Boosting SVM achieves higher accuracy, sensitivity, and specificity than the hybrid RS-Boosting C5.0 on all snapshot datasets (3 months, 6 months, and 9 months).
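The three rates used in this comparison (accuracy, sensitivity, and specificity) follow directly from a binary confusion matrix. The following is a minimal sketch in Python, assuming labels are encoded as 1 (positive) and 0 (negative). The function name `binary_rates` and the sample labels are illustrative only and are not taken from the thesis.

```python
def binary_rates(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    nan = float("nan")
    accuracy = (tp + tn) / total if total else nan
    # Sensitivity: fraction of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    # Specificity: fraction of actual negatives that were rejected.
    specificity = tn / (tn + fp) if (tn + fp) else nan
    return accuracy, sensitivity, specificity


# Made-up labels for illustration; these are not results from the thesis.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
acc, sens, spec = binary_rates(y_true, y_pred)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for medical data, where the classes are often imbalanced and the cost of a missed positive differs from that of a false alarm.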
In the second case, we concentrate on integrating a feature selection technique with individual and heterogeneous ensemble classifiers (Naïve Bayes, C5.0, and SVM). Two data sets are involved in this case. The first is the HCV data set, which was collected from clinical trials of a newly developed medication for HCV. The second is the 'Ovarian Cancer' data set, which was collected and confirmed by the ethical institutional review board at Mansoura University. In the experiments, we compare the results of each individual classifier with those of the heterogeneous ensemble classifier. The results show that the heterogeneous ensemble model did not achieve higher performance than the other techniques on the HCV dataset, but it did achieve higher performance than the other techniques on the Ovarian dataset. In addition, different classification techniques behave differently on different datasets, depending on the size of the dataset and the nature of its attributes.
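A heterogeneous ensemble combines the outputs of classifiers of different types. The abstract does not state the combination rule used, so the sketch below assumes simple majority (plurality) voting as one common choice. The function `majority_vote` and the three prediction lists are hypothetical and stand in for the outputs of already-trained Naïve Bayes, C5.0, and SVM models.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by plurality vote.

    Ties are resolved in favor of the label that appears first among the
    models, because Counter.most_common keeps first-seen order for
    labels with equal counts.
    """
    if not predictions_per_model:
        raise ValueError("at least one model's predictions are required")
    n = len(predictions_per_model[0])
    if any(len(p) != n for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*...) groups the votes of all models for each sample.
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three different classifier types.
nb_pred = [1, 0, 1, 1, 0]
c50_pred = [1, 1, 0, 1, 0]
svm_pred = [0, 0, 1, 1, 1]
ensemble_pred = majority_vote([nb_pred, c50_pred, svm_pred])
print(ensemble_pred)
```

With an odd number of binary classifiers, as here, ties cannot occur. Other combination rules, such as weighted voting or stacking, would require changing only the body of the loop.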