A Fully Online Clustering Approach for Enhanced Performance of Health Information System

Authors

  • Ahmed Al-Shammari, Department of Computer Science, College of Computer Science and Information Technology, University of Al-Qadisiyah, Al Diwaniyah

DOI:

https://doi.org/10.29304/jqcm.2023.15.2.1239

Keywords:

Data Mining, Clustering, Data stream, Maintenance, Healthcare

Abstract

Health data clustering is a significant research direction that aims to extract knowledge from a continuous flow of health data to support online health decisions. However, processing health data clusters remains challenging. Existing clustering approaches are limited with respect to handling neighboring clusters and the multiple operations they perform during maintenance. In this paper, we model, design, and implement a novel framework called IClustMaint for efficiently clustering health data and incrementally maintaining the resulting clusters. The framework embeds a two-phase algorithm. In the initial clustering phase, we employ Principal Component Analysis (PCA) to reduce the high cost of clustering. In the maintenance phase, we propose the Incremental Cluster Maintenance (ICM) approach for managing the generated clusters over time. When clusters evolve and must be maintained frequently, ICM improves maintenance performance by tracking only the edge points. Experimental results on a real medical dataset verify the efficiency of the proposed approaches.
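
The maintenance idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's ICM algorithm, whose details the abstract does not give. It assumes that "tracking only the edge points" can be approximated by storing, for each cluster, a running centroid, a point count, and a boundary radius instead of all member points. The names `Cluster`, `maintain`, and `max_radius` are hypothetical, and the PCA step of the initial phase is omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    """Compact cluster summary (hypothetical structure, not from the paper)."""
    centroid: list
    count: int = 1
    # Largest centroid-to-point distance seen at insertion time. This is an
    # approximation of the cluster boundary, since the centroid moves.
    radius: float = 0.0


def maintain(clusters, point, max_radius):
    """Absorb `point` into the nearest cluster or open a new one.

    Returns the index of the cluster that received the point.
    `max_radius` is an assumed threshold on cluster size.
    """
    if clusters:
        idx = min(range(len(clusters)),
                  key=lambda i: math.dist(clusters[i].centroid, point))
        c = clusters[idx]
        d = math.dist(c.centroid, point)
        if d <= max_radius:
            # Incremental mean update: no stored member points are needed.
            c.count += 1
            c.centroid = [m + (x - m) / c.count
                          for m, x in zip(c.centroid, point)]
            # Only a point beyond the current boundary changes the radius.
            c.radius = max(c.radius, d)
            return idx
    # No cluster is close enough: the point starts a new cluster.
    clusters.append(Cluster(centroid=list(point)))
    return len(clusters) - 1


# Toy two-dimensional stream (illustrative values, not medical data).
clusters = []
stream = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (0.0, 1.0), (11.0, 10.0)]
labels = [maintain(clusters, p, max_radius=3.0) for p in stream]
print(labels)
```

Each arriving point costs one distance computation per cluster and a constant-time summary update, which is the kind of saving an incremental maintenance scheme aims for. The actual bookkeeping in IClustMaint may differ.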

Published

2023-09-24

How to Cite

Al-Shammari, A. (2023). A Fully Online Clustering Approach for Enhanced Performance of Health Information System. Journal of Al-Qadisiyah for Computer Science and Mathematics, 15(2), Comp Page 146–154. https://doi.org/10.29304/jqcm.2023.15.2.1239

Section

Computer Articles