Corpus Analysis for Public Reddit Data

Built for Hack&Roll 2016 at School of Computing, National University of Singapore.

Requirements
  • Node 4.0.0
  • npm

Setup
npm install -g grunt-cli  # May require sudo
npm install
grunt browserify
grunt watch  # Rebuild automatically when JS files change
grunt serve  # Serve files on port 8000

Head over to http://localhost:8000/index.html

Today, the heart of political discussion has moved to the Internet and social media. While the mainstream media focusses on Twitter and Facebook, the most challenging conversations happen on Reddit. This project uses topic modelling to find what was big on Reddit's /r/politics in each month, so that news junkies, political enthusiasts, and strategists can get a handle on what is big on the Internet without having to dive into the murky depths of Reddit.

The idea can easily be extended to other subreddits: for instance, /r/music to find popular genres, or /r/programming to find what's big in the tech scene.
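
To give a feel for the per-month analysis described above, here is a minimal sketch. It is not this project's actual implementation, and it is not a real topic model: it swaps in a much simpler stand-in, ranking the words of post titles by TF-IDF across months, so that terms common in one month but rare in the others come out on top. The post shape ({ title, created_utc } with created_utc in seconds since the epoch), the stop-word list, and the function names are all assumptions made for illustration.

```javascript
"use strict";

// Assumed, deliberately tiny stop-word list; a real one would be much longer.
const STOP_WORDS = new Set(["the", "and", "for", "with", "from", "that", "this"]);

// Lowercase a title and split it into words, dropping stop words and short tokens.
function tokenize(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  return words.filter((word) => word.length > 2 && !STOP_WORDS.has(word));
}

// Turn a Unix timestamp in seconds into a "YYYY-MM" key (UTC).
function monthKey(createdUtc) {
  return new Date(createdUtc * 1000).toISOString().slice(0, 7);
}

// Return { "YYYY-MM": [top k terms] }, ranking terms by TF-IDF across months.
function topTermsByMonth(posts, k) {
  // "YYYY-MM" -> Map(term -> number of posts in that month mentioning the term)
  const months = new Map();
  posts.forEach((post) => {
    const key = monthKey(post.created_utc);
    if (!months.has(key)) months.set(key, new Map());
    const counts = months.get(key);
    new Set(tokenize(post.title)).forEach((term) => {
      counts.set(term, (counts.get(term) || 0) + 1);
    });
  });

  // term -> number of months in which the term appears at all
  const monthsContaining = new Map();
  months.forEach((counts) => {
    counts.forEach((count, term) => {
      monthsContaining.set(term, (monthsContaining.get(term) || 0) + 1);
    });
  });

  const result = {};
  months.forEach((counts, key) => {
    const scored = [];
    counts.forEach((count, term) => {
      // Smoothed inverse "month frequency": terms seen in every month score lowest.
      const idf = Math.log(1 + months.size / monthsContaining.get(term));
      scored.push({ term: term, score: count * idf });
    });
    // Highest score first; ties are broken alphabetically for stable output.
    scored.sort((a, b) => b.score - a.score || a.term.localeCompare(b.term));
    result[key] = scored.slice(0, k).map((entry) => entry.term);
  });
  return result;
}
```

Because the sketch needs only post text and timestamps, the same function would work unchanged on posts from any other subreddit. A proper topic model such as LDA would group co-occurring words into topics instead of ranking single terms.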

