A Small Research Hack: Clustering and Visualisation of Protein Interaction Networks (and other cool networks)
Protein interaction networks are a rich source of hypotheses about proteins whose interactions are still unknown: the way proteins cluster in the network can hint at protein complexes or functional interactions. The challenging part of this hack was implementing clustering methods that are not typically taught at the undergraduate level, i.e. topological (graph-based) clustering methods. Unlike the usual k-means, which relies on a notion of distance, these methods work directly on binary interaction networks, where each pair of proteins either interacts or does not.
This small research hack implements five modern topological clustering methods and applies them to a protein interaction network extracted from high-throughput studies of Saccharomyces cerevisiae, as well as to other datasets such as a friendship network from a social network like Facebook.
- K Clique Percolation
The clustering algorithms can be applied to many other datasets, such as interaction graphs in transportation, literature, social networks, etc.
I like machine learning and unsupervised methods. Moreover, it's part of an ongoing quest to become better at maths, and graph theory in particular.
What it does
The clustering algorithms produce clusterings of interaction networks. In short, each algorithm attempts to find dense subgraphs within the datasets.
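As a rough illustration of what "finding dense subgraphs" means, here is a minimal pure-Python sketch of k-clique percolation, the one method named above. It is not the project's actual code: the function name `k_clique_communities`, the `toy_edges` list, and the node labels are all made up for this example, and the brute-force clique enumeration is only suitable for tiny graphs. Two k-cliques are treated as adjacent when they share k-1 nodes, and a community is the union of a chain of adjacent k-cliques.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Return k-clique percolation communities as sorted lists of nodes."""
    # Build adjacency sets from the undirected, unweighted edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate every k-clique by brute force (toy graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: merge two cliques that share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A community is the union of all nodes in one group of merged cliques.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy network: two triangles sharing an edge, a bridge D-E,
# and a separate triangle.
toy_edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("B", "D"), ("C", "D"),
    ("D", "E"),
    ("E", "F"), ("E", "G"), ("F", "G"),
]

communities = k_clique_communities(toy_edges, 3)
print(communities)
```

With k = 3, the two triangles that share the edge B-C merge into one community, while the bridge D-E belongs to no triangle and so does not join the two dense regions together.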
How I built it
I implemented the clustering algorithms from their descriptions in academic papers. Each algorithm outputs a format that can be graphed with my own implementation, but because the networks are large, that step was offloaded to GPU clusters in Ontario and in Edinburgh for the sake of speed.
Challenges I ran into
Reading and understanding the papers, and working through the mathematics behind the clustering algorithms used.
Accomplishments that I'm proud of
Getting it finished on time despite having lost 10 hours due to the wifi situation.
What I learned
An immeasurable amount about code design, mathematical notation, and LaTeX.
What's next for Cluster Punk: A Little Exploratory Research
More datasets and attaching weights to edges of the graphs.