Abstract

Context. Cluster analysis is a method of classification without a teacher (unsupervised classification), that is, under conditions where the number of clusters is not known in advance. Determining the optimal number of clusters and validating the resulting partitions of data sets is therefore a complex task that requires further research.

Objective. The aim of the paper is to study how efficiently crisp and fuzzy cluster validity indices find the natural data structure when the partition is produced by the clustering method based on fuzzy binary relations, and to carry out a comparative analysis of these indices.

Method. The data sets were partitioned by the method based on fuzzy binary relations, which makes it possible to perform crisp and fuzzy grouping of objects simultaneously under different types of similarity measures. The study uses the distance similarity measure, which divides data into ellipsoidal clusters. Two synthetic two-dimensional data sets of a special type were generated, each of which admits natural clustering in two ways. Both sets are Gaussian. The most effective and frequently used groups of crisp and fuzzy cluster validity indices, which make it possible to find the optimal data set structure, are described.

Results. The quality of the clustering produced by the method of fuzzy binary relations was evaluated with six indices on two data sets. A comparative analysis was made of how effectively the validity indices determine the cluster and sub-cluster data structures.

Conclusions. In practice, for some cluster validity indices it is important to find not only the global extreme but also the local ones, since they can identify the optimal sub-cluster data structure with less separation. To estimate clustering quality effectively and obtain objective results, it is advisable to take several indices into account rather than just one.
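The remark about global and local extremes can be illustrated with a small sketch. It is not the method of the paper: plain k-means stands in for the clustering method based on fuzzy binary relations, and the Calinski-Harabasz index is used only as a familiar crisp validity index (the abstract does not name the six indices studied). The data generator, its parameters, and all function names are assumptions made for illustration. The synthetic set has three well-separated groups, each made of two closer Gaussian blobs, so it can reasonably be read as either 3 clusters or 6 sub-clusters.

```python
import math
import random


def make_nested_gaussians(seed=0, per_blob=50, sigma=1.0, sub_gap=3.0, group_gap=20.0):
    """Hypothetical data: 3 distant groups, each built from 2 nearby Gaussian blobs."""
    rng = random.Random(seed)
    points = []
    for g in range(3):
        for side in (-0.5, 0.5):
            cx = g * group_gap + side * sub_gap
            points += [(rng.gauss(cx, sigma), rng.gauss(0.0, sigma)) for _ in range(per_blob)]
    return points


def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def mean(pts):
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation (stand-in clusterer)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = mean(members)
    return labels, centers


def calinski_harabasz(points, labels):
    """Crisp validity index: between/within dispersion ratio (higher is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k, n = len(clusters), len(points)
    overall = mean(points)
    between = within = 0.0
    for members in clusters.values():
        centre = mean(members)
        between += len(members) * sq_dist(centre, overall)
        within += sum(sq_dist(p, centre) for p in members)
    return (between / (k - 1)) / (within / (n - k))


def local_maxima(scores):
    """Cluster counts whose score exceeds both neighbours (endpoints need only one)."""
    ks = sorted(scores)
    peaks = []
    for i, k in enumerate(ks):
        left = scores[ks[i - 1]] if i > 0 else -math.inf
        right = scores[ks[i + 1]] if i < len(ks) - 1 else -math.inf
        if scores[k] > left and scores[k] > right:
            peaks.append(k)
    return peaks


points = make_nested_gaussians()
scores = {k: calinski_harabasz(points, kmeans(points, k)[0]) for k in range(2, 9)}
best_k = max(scores, key=scores.get)
print("global maximum at k =", best_k)
print("all local maxima at k =", local_maxima(scores))
```

Reporting every local maximum rather than only the best score is the point of the sketch: on nested data of this kind, one of the two natural structures may appear only as a local peak. Which structure wins globally depends on the separation parameters and on the sample drawn.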
In future studies, it is planned to create a combined criterion that joins the most effective cluster validity indices for the method based on fuzzy binary relations with a distance similarity measure; to implement a generalized cluster validity index for any similarity measure of the fuzzy binary relations method; and to develop a software system that automatically groups objects into clusters shaped as concentric spheres, cones, or ellipses without preliminary determination of the clustering threshold.

Author Biography

N. E. Kondruk, Uzhgorod National University, Uzhgorod. PhD, Associate Professor, Department of Cybernetics and Applied Mathematics.
Journal type: International