Efficient Distributed Clustering with Cuckoo Search Algorithm and GPU Acceleration for Big Data Analysis

Authors

  • Hadjir Zemmouri, MISC Laboratory, University of Constantine 2 Abdelhamid Mehri, Ali Mendjeli, 25000, Constantine, Algeria
  • Said Labed, MISC Laboratory, University of Constantine 2 Abdelhamid Mehri, Ali Mendjeli, 25000, Constantine, Algeria
  • Akram Kout, MISC Laboratory, University of Constantine 2 Abdelhamid Mehri, Ali Mendjeli, 25000, Constantine, Algeria
  • El-Bay Bourennane, ImVia Laboratory, University of Bourgogne, 9, avenue Alain Savary, 21078 Dijon, France

Keywords:

Clustering, Big Data, cuckoo search algorithm, distributed computing, GPU

Abstract

Clustering analysis is a crucial method in data mining, aimed at identifying clusters of data objects in the attribute space. Distributed clustering has gained prominence due to the emergence of Big Data. The rapid growth of data, particularly with the advent of technologies such as the Internet of Things and 5G, has created numerous challenges for data analysis and processing. Traditional clustering methods, such as K-means and hierarchical clustering, were designed for small to moderately sized datasets and struggle with very large ones. Among the various distributed clustering algorithms, meta-heuristic techniques have garnered significant attention due to their ability to deliver high-quality solutions across a wide range of optimization problems. In this study, we propose a new Cuckoo Search (CS) clustering algorithm for distributed clustering to address the challenges of Big Data clustering. First, the CS clustering algorithm is executed on each local site, using GPU acceleration for efficient local data clustering. Second, at the global level, representative data from each site are aggregated and processed, with centroids iteratively updated to generate the final clustering result. This design improves processing efficiency by minimizing transmission costs and eliminating the need for inter-node communication. Furthermore, through parallel processing and distributed computing, our approach handles large datasets with competitive execution times. It proves efficient and scalable across diverse datasets, showing potential for various applications.
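
The two phases described in the abstract can be illustrated with a minimal, CPU-only sketch in pure Python. It is not the authors' implementation, and several choices are assumptions made for illustration only: GPU acceleration is omitted, Lévy steps are drawn with Mantegna's algorithm, a candidate is accepted greedily against its own nest, abandoned nests are rebuilt by random restart, and the "representative data" sent by each site are taken to be its local centroids weighted by cluster size. All function names and parameter values (n_nests, iters, pa, alpha) are hypothetical.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) * (x - y) for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))

def sse(points, centroids):
    """Clustering cost: squared distance of every point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def levy_step(rng, beta=1.5):
    """Heavy-tailed step length drawn with Mantegna's algorithm (assumed here)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1) or 1e-12  # guard against an exact zero
    return u / abs(v) ** (1 / beta)

def local_cuckoo(points, k, rng, n_nests=15, iters=30, pa=0.25, alpha=0.1):
    """Local phase on one site: return k centroids and the size of each cluster."""
    def new_nest():
        # A nest is one candidate solution: k centroids sampled from the data.
        return [list(rng.choice(points)) for _ in range(k)]
    nests = [new_nest() for _ in range(n_nests)]
    for _ in range(iters):
        for i, nest in enumerate(nests):
            # Perturb every coordinate with a Levy flight; keep the move only if it helps.
            cand = [[x + alpha * levy_step(rng) for x in c] for c in nest]
            if sse(points, cand) < sse(points, nest):
                nests[i] = cand
        # Abandon the worst fraction pa of nests and rebuild them at random.
        nests.sort(key=lambda n: sse(points, n))
        keep = n_nests - int(pa * n_nests)
        nests[keep:] = [new_nest() for _ in range(n_nests - keep)]
    best = min(nests, key=lambda n: sse(points, n))
    sizes = [0] * k
    for p in points:
        sizes[nearest(p, best)] += 1
    return best, sizes

def global_merge(reps, weights, k, iters=20):
    """Global phase: weighted centroid updates over the representatives only."""
    centroids = [list(r) for r in reps[:k]]  # seed with the first site's centroids
    for _ in range(iters):
        sums = [[0.0] * len(reps[0]) for _ in range(k)]
        totals = [0] * k
        for r, w in zip(reps, weights):
            j = nearest(r, centroids)
            totals[j] += w
            for d, x in enumerate(r):
                sums[j][d] += w * x
        # An empty cluster keeps its previous centroid.
        centroids = [[s / t for s in sm] if t else c
                     for sm, t, c in zip(sums, totals, centroids)]
    return centroids

def distributed_clustering(sites, k, seed=0):
    """Run the local phase per site, then merge the representatives globally."""
    rng = random.Random(seed)
    reps, weights = [], []
    for points in sites:  # each site works on its own data only
        cents, sizes = local_cuckoo(points, k, rng)
        reps += cents
        weights += sizes
    return global_merge(reps, weights, k)
```

Calling distributed_clustering(sites, k) with sites given as a list of per-site point lists returns k global centroids. In this sketch each site ships only k centroids and k counts rather than its raw data, which is one plausible reading of how transmission costs could be kept low; the per-site loop is sequential here but the sites are independent and could run in parallel.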

Published

2025-10-30

How to Cite

Zemmouri, H., Labed, S., Kout, A., & Bourennane, E.-B. (2025). Efficient Distributed Clustering with Cuckoo Search Algorithm and GPU Acceleration for Big Data Analysis. Computing and Informatics, 44(5). Retrieved from http://147.213.75.17/ojs/index.php/cai/article/view/7091