In this paper, we propose a genetic algorithm (GA)-based unsupervised clustering technique that selects cluster centers directly from the data set. Because every candidate center is itself a data point, the fitness evaluation can be sped up in two ways: by constructing a look-up table in advance that stores the distances between all pairs of data points, and by using a binary representation rather than a string representation to encode a variable number of cluster centers. More effective versions of the reproduction, crossover, and mutation operators are introduced. Finally, the Davies-Bouldin index is employed to measure the validity of the clusters. The proposed algorithm properly clusters a variety of data sets. The experimental results show that it provides more stable clustering performance, in terms of both the number of clusters and the clustering results, and that it requires considerably less computation time than other GA-based clustering algorithms.
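To illustrate how the look-up table and the binary encoding might fit together, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes Euclidean distance, assumes that each point is assigned to its nearest selected center, and assumes that within-cluster scatter is the mean distance from the members to that center; the function names `build_distance_table` and `davies_bouldin` are hypothetical. The GA operators themselves are not shown.

```python
import math


def build_distance_table(points):
    """Precompute the distance between every pair of data points (assumed Euclidean)."""
    n = len(points)
    table = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            table[i][j] = table[j][i] = d
    return table


def davies_bouldin(mask, table):
    """Davies-Bouldin index of a binary chromosome; lower values are better.

    mask[i] == 1 means data point i is selected as a cluster center.
    Only the precomputed table is consulted, never the raw coordinates.
    """
    centers = [i for i, bit in enumerate(mask) if bit]
    k = len(centers)
    if k < 2:
        return math.inf  # the index is undefined for fewer than two clusters

    # Assumption: two coincident centers make the chromosome invalid.
    for a in range(k):
        for b in range(a + 1, k):
            if table[centers[a]][centers[b]] == 0.0:
                return math.inf

    # Assign each point to its nearest selected center and accumulate scatter.
    totals = [0.0] * k
    counts = [0] * k
    for p in range(len(mask)):
        j = min(range(k), key=lambda c: table[p][centers[c]])
        totals[j] += table[p][centers[j]]
        counts[j] += 1
    scatter = [totals[j] / counts[j] for j in range(k)]

    # For each cluster, take its worst (largest) similarity ratio to any other cluster.
    worst = [
        max(
            (scatter[a] + scatter[b]) / table[centers[a]][centers[b]]
            for b in range(k)
            if b != a
        )
        for a in range(k)
    ]
    return sum(worst) / k


if __name__ == "__main__":
    # Toy data (hypothetical): two well-separated pairs of points.
    points = [(0, 0), (0, 1), (10, 0), (10, 1)]
    table = build_distance_table(points)
    print(davies_bouldin([1, 0, 1, 0], table))  # one center per pair
    print(davies_bouldin([1, 1, 0, 0], table))  # both centers in the same pair
```

In this sketch the chromosome length equals the number of data points, so the number of clusters varies simply with the number of bits set, and the table is built once and reused for every fitness evaluation. On the toy data, the chromosome that places one center in each pair should receive the lower (better) index.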
Tamkang Journal of Science and Engineering (淡江理工學刊) 8(2), pp. 113-122