Author: Ayoub, Ahmed Younis Hanafy. / Title: A Study On Recent Approaches In Cluster Analysis


Title

A Study On Recent Approaches In Cluster Analysis

Author

Ayoub, Ahmed Younis Hanafy.

Committee

Researcher / Ahmed Younis Hanafy Ayoub

Supervisor / اسلام محمد ابراهيم الدسوقي

Examiner / سعيد علي السيد الصيرفي

Examiner / محمود زكي رجب

Subject

Cluster Analysis. Mathematical statistics - Data Processing. Social Sciences - Statistical Methods.

Publication date

2019.

Number of pages

35 p.

Language

English

Degree

Master's

Specialization

Engineering (miscellaneous)

Approval date

26/10/2019

Awarding institution

Menoufia University - Faculty of Engineering - Basic Engineering Sciences

Index

Only 14 of 135 pages are available for public view.

Abstract

Data analysis plays an indispensable role in understanding different phenomena, and cluster analysis is one of its most important methods. Cluster analysis is a statistical method in which the initial data are grouped into clusters. A cluster is a group of relatively homogeneous cases or observations: the members of a cluster are similar to each other, while members of different clusters are less similar. Cluster analysis is applied in many fields, such as biology, data recovery, climate, psychology, medicine, and project management.
There are two traditional families of cluster analysis methods: hierarchical methods and partitional methods. The most important partitional method is the k-means algorithm, which is also the most popular and the easiest way to perform cluster analysis. The clustering problem can be regarded as a nonlinear programming problem. Because of the disadvantages of k-means clustering, many evolutionary methods have emerged in recent decades, such as the genetic algorithm (GA), particle swarm optimization (PSO), ant colony optimization (ACO), fuzzy optimization, artificial bee colony (ABC), simulated annealing (SA), differential evolution (DE), artificial immune systems, improved pigeon-inspired optimization (IPIO), the monkey algorithm (MA), the krill herd algorithm (KHA), the bacterial foraging algorithm (BFA), and cat swarm optimization (CSO).
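The view of clustering as a nonlinear program can be made concrete with the standard minimum sum-of-squares formulation. This is a textbook formulation given for illustration, not necessarily the one used in the thesis; the symbols (n observations x_i, k cluster centers c_j, binary membership variables w_ij) are assumptions introduced here.

```latex
\min_{W,\,C} \; J(W, C) = \sum_{i=1}^{n} \sum_{j=1}^{k} w_{ij} \, \lVert x_i - c_j \rVert^{2}
\quad \text{subject to} \quad
\sum_{j=1}^{k} w_{ij} = 1 \;\; (i = 1, \dots, n), \qquad
w_{ij} \in \{0, 1\}.
```

The k-means algorithm alternates between minimizing J over W with C fixed (assign each observation to its nearest center) and over C with W fixed (move each center to the mean of its cluster). It therefore converges only to a local minimum that depends on the initial centers, which is one of the disadvantages that global search methods try to address.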
This thesis introduces a new evolutionary algorithm that combines the genetic algorithm with the k-means algorithm to perform cluster analysis.
The thesis consists of five main chapters, organized as follows:
CHAPTER 1: This chapter introduces the basic concepts and definitions of cluster analysis and nonlinear programming. In addition, the classification of nonlinear programming problems is introduced.
CHAPTER 2: This chapter discusses the working principle and the implementation of the GA. It presents the different schemes for encoding, selection, crossover, and mutation, and then presents the k-means algorithm. It also surveys research that solves the clustering problem with hybrids of the genetic algorithm and the k-means algorithm.
CHAPTER 3: This chapter proposes a new methodology for cluster analysis based on the genetic algorithm (GA). First, the GA population is initialized by the k-means algorithm to obtain good cluster centers. Second, the GA operators are applied. A new mutation operator, based on the extreme points of the clusters, is proposed to overcome the limitations of the k-means algorithm. Finally, the proposed approach is applied to a set of problems consisting of non-overlapping data and large, high-dimensional datasets from the UCI Machine Learning Repository.
CHAPTER 4: In this chapter, the algorithm presented in Chapter 3 is applied to real-world clustering applications: image segmentation and, in electrical engineering, finding the optimal location for a distributed generator.
CHAPTER 5: This chapter presents concluding remarks, recommendations, and some points for further research.
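The hybrid outlined for Chapter 3 (a GA population seeded by k-means, followed by GA operators and a mutation tied to extreme points) might be sketched roughly as below. This is a minimal illustration and not the thesis's algorithm: the abstract does not specify the operators, so truncation selection, single-point crossover, the farthest-point mutation, and every function and parameter name here are assumptions.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sse(points, centers):
    """Fitness: sum of squared distances to the nearest center (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def kmeans_step(points, centers):
    """One k-means iteration: assign points, then move centers to cluster means."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        clusters[nearest].append(p)
    # An empty cluster keeps its old center.
    return [tuple(sum(col) / len(m) for col in zip(*m)) if m else c
            for m, c in zip(clusters, centers)]

def kmeans(points, k, rng, iters=3):
    """Short k-means run from k randomly sampled points."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        centers = kmeans_step(points, centers)
    return centers

def crossover(a, b, rng):
    """Single-point crossover on two lists of centers (assumed operator)."""
    if len(a) < 2:
        return list(a)
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(centers, points, rng, rate=0.2):
    """Assumed stand-in for an extreme-point mutation: with probability `rate`,
    replace a random center by the point farthest from its nearest center."""
    if rng.random() >= rate:
        return centers
    far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
    out = list(centers)
    out[rng.randrange(len(out))] = far
    return out

def ga_kmeans(points, k, pop_size=8, generations=20, seed=0):
    """GA over sets of centers; requires pop_size >= 4. Returns (centers, sse)."""
    rng = random.Random(seed)
    # Step 1: seed the population with k-means solutions.
    population = [kmeans(points, k, rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda c: sse(points, c))
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0]]              # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = mutate(crossover(a, b, rng), points, rng)
            children.append(kmeans_step(points, child))  # local refinement
        population = children
    best = min(population, key=lambda c: sse(points, c))
    return best, sse(points, best)

if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
    centers, error = ga_kmeans(data, k=2)
    print(centers, error)
```

Because the best individual is always carried over unchanged, the best fitness never gets worse from one generation to the next. Refining each child with one k-means step is a common design choice in such hybrids, but whether the thesis does this is not stated in the abstract.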