Multilevel K-Means Density Based Flow Clustering Algorithm for Data Streams
Abstract
Data stream clustering is an active area of research that has recently emerged with the goal of discovering new knowledge from large volumes of varied, continuously generated data. In this context, many researchers have proposed unsupervised learning algorithms that cluster multiple data streams, yet there is still a need for a more efficient data analysis method. This paper introduces a multilevel K-Means density-based flow clustering algorithm (MKDCSTREAM) for clustering problems. The approach views clustering as a hierarchical optimization process that proceeds through different levels, from coarse to fine. The original problem is first reduced, level by level, into progressively coarser problems, and an initial clustering is computed for the coarsest one. This coarse clustering is then mapped back level by level toward the original problem, and each intermediate clustering is refined using the general K-means algorithm. The performance of the multilevel approach is compared with that of its single-level counterpart in tests on a set of datasets collected from different areas.
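The coarsen-then-refine scheme described in the abstract can be illustrated with a minimal sketch. It is not the MKDCSTREAM implementation: the abstract does not specify the coarsening rule, the density-based component, or any stream handling, so the sketch assumes a hypothetical rule that merges consecutive pairs of points into weighted means and uses plain weighted K-means at every level. All function names (`coarsen`, `kmeans`, `multilevel_kmeans`) and parameters are illustrative.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: dist2(p, centers[c]))
            for p in points]


def coarsen(points, weights):
    """Assumed coarsening rule: merge consecutive pairs into weighted means."""
    new_pts, new_w = [], []
    for i in range(0, len(points) - 1, 2):
        w = weights[i] + weights[i + 1]
        merged = tuple((weights[i] * a + weights[i + 1] * b) / w
                       for a, b in zip(points[i], points[i + 1]))
        new_pts.append(merged)
        new_w.append(w)
    if len(points) % 2:  # an unpaired last point is carried over unchanged
        new_pts.append(points[-1])
        new_w.append(weights[-1])
    return new_pts, new_w


def kmeans(points, weights, centers, iters=20):
    """Weighted Lloyd iterations starting from the given centers."""
    dim = len(points[0])
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(len(centers)):
            members = [(p, w) for p, w, l in zip(points, weights, labels)
                       if l == c]
            if not members:  # keep the old center of an empty cluster
                new_centers.append(centers[c])
                continue
            total = sum(w for _, w in members)
            new_centers.append(tuple(
                sum(w * p[d] for p, w in members) / total
                for d in range(dim)))
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, assign(points, centers)


def multilevel_kmeans(points, k, levels=2, seed=0):
    """Cluster the coarsest level, then refine level by level up to the input."""
    hierarchy = [(list(points), [1.0] * len(points))]
    for _ in range(levels):
        pts, w = hierarchy[-1]
        if len(pts) // 2 < k:  # stop before a level has fewer than k points
            break
        hierarchy.append(coarsen(pts, w))
    rng = random.Random(seed)
    centers = rng.sample(hierarchy[-1][0], k)  # initial centers, coarsest level
    labels = []
    for pts, w in reversed(hierarchy):
        # Centers found at the coarser level seed K-means at the finer level.
        centers, labels = kmeans(pts, w, centers)
    return centers, labels


demo = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
centers, labels = multilevel_kmeans(demo, k=2, levels=2)
print(labels)
```

Setting `levels=0` in this sketch reduces it to ordinary single-level K-means on the input, which gives a simple baseline for the kind of multilevel versus single-level comparison the abstract mentions.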