Background: The fuzzy c-means (FCM) algorithm is popular for clustering data sets with soft boundaries. However, its performance deteriorates as the dimensionality grows, in the presence of noise, and under high initial bias.
Objective: The article proposes a new approach to improve the performance of the conventional FCM by applying the concept of supervised learning.
Methodology: The proposed method comprises five stages. First, a given data set is clustered with conventional FCM in 10 independent runs. Next, a newly introduced supervised clustering algorithm divides the data points into core and boundary data points, based on the membership values obtained from FCM in each run. The ensemble of the core data points over the multiple FCM runs forms the final set of core data points. The final core data points are then clustered with the K-means algorithm to obtain their cluster labels. Finally, the cluster labels of the boundary data points are estimated by a k-NN classifier trained on the cluster labels of the core data points.
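The five stages can be illustrated with a minimal sketch. Several details are not given in the abstract and are assumptions made here only for illustration: a point counts as core in a run when its largest FCM membership is at least 0.8, the ensemble keeps the points that are core in every run, and k = 3 is used for k-NN. The function names (`fcm`, `kmeans`, `knn_predict`, `core_boundary_fcm`) are hypothetical, and the FCM and K-means routines are bare NumPy versions rather than the authors' implementation.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means; returns the (n, c) membership matrix."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # rows sum to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns hard labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest labelled (core) points."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.array([np.bincount(v).argmax() for v in y_train[nearest]])

def core_boundary_fcm(X, c, runs=10, threshold=0.8, k=3):
    # Stages 1-3: run FCM several times and keep the points that are
    # core in every run (ASSUMED rule: max membership >= threshold).
    core = np.ones(len(X), dtype=bool)
    for r in range(runs):
        core &= fcm(X, c, seed=r).max(axis=1) >= threshold
    # Stage 4: hard labels for the core points via K-means.
    # (This sketch assumes at least c core points survive.)
    labels = np.empty(len(X), dtype=int)
    labels[core] = kmeans(X[core], c)
    # Stage 5: boundary points inherit labels from nearby core points.
    if (~core).any():
        labels[~core] = knn_predict(X[core], labels[core], X[~core], k)
    return labels, core
```

Calling `core_boundary_fcm(X, c=3)` on an `(n, d)` array returns one hard label per point together with a boolean mask marking the core points. The threshold controls how many points are deferred to the k-NN stage.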
Results: We evaluate the proposed method on two well-known data sets from the UCI Machine Learning Repository, namely the Iris data set and the thyroid data set. Three performance measures, the Rand index, the adjusted Rand index, and the Minkowski score, are used to compare the proposed algorithm with the conventional fuzzy c-means clustering algorithm.
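For reference, all three measures can be computed from pair counts between the true labels and the predicted cluster labels. The sketch below uses the standard pair-counting definitions. The function names are hypothetical, and the abstract does not state which implementation the authors used. Higher is better for the Rand index and the adjusted Rand index, while a lower Minkowski score is better, with 0 indicating perfect agreement.

```python
from itertools import combinations
from math import sqrt

def pair_counts(true, pred):
    """Count point pairs by whether they share a class and/or a cluster."""
    n11 = n10 = n01 = n00 = 0
    for i, j in combinations(range(len(true)), 2):
        same_true = true[i] == true[j]
        same_pred = pred[i] == pred[j]
        if same_true and same_pred:
            n11 += 1        # together in both partitions
        elif same_true:
            n10 += 1        # together only in the true partition
        elif same_pred:
            n01 += 1        # together only in the predicted partition
        else:
            n00 += 1        # separated in both partitions
    return n11, n10, n01, n00

def rand_index(true, pred):
    n11, n10, n01, n00 = pair_counts(true, pred)
    return (n11 + n00) / (n11 + n10 + n01 + n00)

def adjusted_rand_index(true, pred):
    n11, n10, n01, n00 = pair_counts(true, pred)
    denom = (n11 + n10) * (n10 + n00) + (n11 + n01) * (n01 + n00)
    if denom == 0:          # degenerate case: both partitions trivial
        return 1.0
    return 2.0 * (n11 * n00 - n10 * n01) / denom

def minkowski_score(true, pred):
    # Undefined when no pair shares a true class (n11 + n10 == 0).
    n11, n10, n01, _ = pair_counts(true, pred)
    return sqrt((n01 + n10) / (n11 + n10))
```

These measures are invariant to how the cluster labels are numbered, so `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` treats the two partitions as identical.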
Conclusion: The simulation results show the efficacy of the proposed algorithm, which may be used in a range of clustering-related applications.