What it does
Input: a data set
Output: the number of clusters in that data set and the positions of the clusters
3 enthusiasts. 3 different majors. First time at a hackathon. Like all young guys, we wanted to make something big. So we decided to take on the FlowJo challenge, which is all about creating a data clustering plugin. None of us knew data clustering. No problem. Started learning data clustering algorithms. Came up with awesome ideas.
Realized the plugin must be written in Java. None of us knew anything about Java or plugins. Not a problem. It's not rocket science. Started learning Java and how to make plugins. Still enthusiastic. Also started learning FlowJo's built-in functions, flow cytometry, and smoothing algorithms.
But then we found out that installing a plugin in FlowJo requires a license number. We all felt so depressed finding that out. It seemed like an impassable wall. It was late, we were tired and stuck. One quit. Two left. We continued.
8 am, received an answer to the license problem from a scientist at FlowJo. Fighting spirit rose. But there was not enough time left to write and debug a plugin in Java. Still wanted our idea to be heard. Made a MATLAB prototype. Finished debugging just in time.
Main idea of the program
FlowJo can cluster the data only if the user inputs the number of clusters. Our plugin enables FlowJo to choose the number of clusters by itself. The idea is:
1. Make a histogram of the data with large steps (wide bins).
2. Count the number of maxima.
3. The number of clusters is the same as the number of maxima.
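The idea above can be sketched in a few lines. This is a minimal Python illustration, not our MATLAB prototype: the function name `count_clusters`, the 8-neighbour definition of a maximum, the tie-breaking rule, and the sample points are all assumptions made for the example.

```python
from collections import Counter


def count_clusters(points, step):
    """Bin 2D points on a coarse grid; report each local maximum as a cluster.

    Hypothetical helper for illustration only. Returns the number of maxima
    and the centres of the bins where they occur.
    """
    # Histogram: count the points that fall into each step-by-step grid cell.
    hist = Counter((int(x // step), int(y // step)) for x, y in points)
    peaks = []
    for (i, j), count in hist.items():
        neighbours = [(i + di, j + dj)
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)
                      if (di, dj) != (0, 0)]
        # A cell is a maximum if it beats all 8 neighbours. Comparing
        # (count, cell) pairs breaks ties, so two equal neighbouring
        # cells yield one peak, not two.
        if all((count, (i, j)) > (hist.get(n, 0), n) for n in neighbours):
            peaks.append(((i + 0.5) * step, (j + 0.5) * step))
    return len(peaks), sorted(peaks)


# Made-up sample data: one dense group near (2.5, 2.5), one near (8.5, 8.5).
demo_points = [
    (2.1, 2.1), (2.3, 2.4), (2.5, 2.5), (2.7, 2.2), (2.9, 2.8),
    (3.1, 2.5), (3.2, 2.6), (2.5, 3.1),
    (8.2, 8.2), (8.4, 8.6), (8.6, 8.4), (8.8, 8.8), (7.9, 8.5),
]
print(count_clusters(demo_points, 1.0))  # (2, [(2.5, 2.5), (8.5, 8.5)])
```

The bin centre is only a rough estimate of a cluster's position; averaging the points inside the peak bin would be one way to refine it.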
The most interesting part is the large steps. Small steps create a lot of noise: sparsely filled bins produce many spurious maxima. Large steps eliminate the noise and still give the right answer, because each cluster is still about 5 times the size of a step.
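The effect of the step size is easy to see on a small one-dimensional example. The sketch below is only an illustration under assumptions: the data is made up (two groups, each about 5 units wide), and `count_maxima` is a hypothetical helper, not code from our prototype.

```python
from collections import Counter


def count_maxima(values, step):
    """Count local maxima of a 1D histogram with the given bin width."""
    hist = Counter(int(v // step) for v in values)
    # A bin is a maximum if it beats both neighbours; comparing
    # (count, index) pairs breaks ties between equal adjacent bins.
    return sum(
        1 for i, c in hist.items()
        if (c, i) > (hist.get(i - 1, 0), i - 1)
        and (c, i) > (hist.get(i + 1, 0), i + 1)
    )


# Made-up data: two groups about 5 units wide, denser in the middle.
cluster = [0.5, 1.2, 1.7, 2.1, 2.3, 2.4, 2.6, 2.8, 3.1, 3.6, 4.4]
values = cluster + [v + 10 for v in cluster]

print(count_maxima(values, 1.0))   # 2: one maximum per group
print(count_maxima(values, 0.25))  # many more: isolated points become maxima
```

With a step of 1.0 each group spans about five bins, which matches the ratio described above. With a step of 0.25 most bins hold zero or one point, so nearly every isolated point counts as its own maximum.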
What's next for Density Clustering
Increase the dimension of the data from 2D to n-D, and turn it into a FlowJo plugin.