Clustering algorithms have been a big part of the research I have conducted over the last four years.
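The k-Means clustering algorithm works as follows: choose k initial cluster centers, assign each point to its nearest center, recompute each center as the mean of the points assigned to it, and repeat the last two steps until no center moves.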
This is implemented in C, using CUDA kernels for the assignment step, the calculation of new cluster centers and the check for movement. Note that it was written for a specific purpose, so it may not be robust when used for other things. It is published here in case it is useful to others.
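The three steps named above can be sketched in plain, serial C. This is only a minimal illustration of the technique and not the published CUDA code: the function names (`assign_points`, `update_centers`, `kmeans`), the row-major array layout, the `max_iter` cap and the choice to leave an empty cluster's center unchanged are all assumptions made for this example.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Squared Euclidean distance between two points with `dims` coordinates. */
static double sq_dist(const double *a, const double *b, size_t dims)
{
    double sum = 0.0;
    for (size_t d = 0; d < dims; d++) {
        double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

/* Assignment step: label each point with the index of its nearest center. */
static void assign_points(const double *points, size_t n, size_t dims,
                          const double *centers, size_t k, int *labels)
{
    for (size_t i = 0; i < n; i++) {
        double best = DBL_MAX;
        for (size_t c = 0; c < k; c++) {
            double dist = sq_dist(&points[i * dims], &centers[c * dims], dims);
            if (dist < best) {
                best = dist;
                labels[i] = (int)c;
            }
        }
    }
}

/* Update step: move each center to the mean of its points.
 * Returns 1 if any center coordinate changed, 0 otherwise.
 * Assumption: a cluster with no points keeps its previous center. */
static int update_centers(const double *points, size_t n, size_t dims,
                          double *centers, size_t k, const int *labels)
{
    int moved = 0;
    for (size_t c = 0; c < k; c++) {
        for (size_t d = 0; d < dims; d++) {
            double sum = 0.0;
            size_t count = 0;
            for (size_t i = 0; i < n; i++) {
                if (labels[i] == (int)c) {
                    sum += points[i * dims + d];
                    count++;
                }
            }
            if (count == 0)
                continue;
            double mean = sum / (double)count;
            if (mean != centers[c * dims + d])
                moved = 1;
            centers[c * dims + d] = mean;
        }
    }
    return moved;
}

/* Repeats assignment and update until no center moves or max_iter is hit.
 * `centers` must hold the k initial centers on entry.
 * Returns the number of iterations performed. */
int kmeans(const double *points, size_t n, size_t dims,
           double *centers, size_t k, int *labels, int max_iter)
{
    assert(n > 0 && dims > 0 && k > 0);
    int iter = 0;
    int moved = 1;
    while (moved && iter < max_iter) {
        assign_points(points, n, dims, centers, k, labels);
        moved = update_centers(points, n, dims, centers, k, labels);
        iter++;
    }
    return iter;
}
```

In a GPU version the loop over points in the assignment step is the natural candidate for one thread per point, which is presumably why that step is written as a kernel; the serial form here is only meant to make the control flow easy to follow.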
Compile:
To run it, give the filename followed by the value of k and then the number of columns in your data, e.g.
It assumes a space-separated file of points, e.g.
20085 8450
6861 17864
688 2877
5679 2363
10763 19460
4542 19642
7753 6872
7934 22417
12341 27547
12472 19925
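A file in the format shown above could be loaded along the following lines. This is a hedged sketch rather than the loader used in the published code: `read_points` is a hypothetical helper, and the row-major layout, the initial buffer size and the decision to ignore a trailing partial row are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads whitespace-separated numbers from fp into a row-major array
 * with `cols` values per row. On success returns a malloc'd buffer
 * (the caller frees it) and stores the number of complete rows in
 * *rows_out. Returns NULL if cols is 0 or memory runs out.
 * Assumption: a trailing partial row is ignored. */
double *read_points(FILE *fp, size_t cols, size_t *rows_out)
{
    assert(fp != NULL && rows_out != NULL);

    size_t capacity = 1024;
    size_t count = 0;
    double value;

    *rows_out = 0;
    if (cols == 0)
        return NULL;

    double *values = malloc(capacity * sizeof *values);
    if (values == NULL)
        return NULL;

    /* %lf skips spaces and newlines, so rows and columns read the same way. */
    while (fscanf(fp, "%lf", &value) == 1) {
        if (count == capacity) {
            capacity *= 2;
            double *grown = realloc(values, capacity * sizeof *grown);
            if (grown == NULL) {
                free(values);
                return NULL;
            }
            values = grown;
        }
        values[count++] = value;
    }

    *rows_out = count / cols;
    return values;
}
```

With two columns, as in the sample data, row `i` of the result is the pair `values[2 * i]` and `values[2 * i + 1]`.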