I just wanted to build something small and fun. Word clouds are pretty interesting, especially when they're built on Reddit comment data.
What it does
Given a link to a Reddit post, it creates a word cloud of the comments on that post.
How I built it
The app initially presents a blank page with a text input field. When the URL of a Reddit post is submitted, it authenticates with the Reddit API and fetches the comments of that post. It caches the comments, tokenizes them, filters out unwanted stopwords, and sends the result to the wordcloud API, which produces an image that is saved to a file on the server. The Flask server then returns that image as the output word cloud.
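The tokenize-and-filter step above can be sketched roughly like this. It is a minimal illustration using only the standard library, not the project's actual code: the function name `word_frequencies`, the tiny `STOPWORDS` set, and the sample comments are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "of", "to", "in", "this", "that"}


def word_frequencies(comments):
    """Count non-stopword tokens across a list of comment strings."""
    counts = Counter()
    for comment in comments:
        # Lowercase the text and keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", comment.lower())
        # Skip stopwords and single-character tokens.
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 1)
    return counts


# Made-up comments standing in for what the Reddit API would return.
freqs = word_frequencies(["This is a great post", "Great post, and the cat is great"])
print(freqs.most_common(3))
```

If the word cloud is drawn with the `wordcloud` Python package (an assumption, the writeup only says "wordcloud API"), a frequency mapping like `freqs` could be handed to `WordCloud.generate_from_frequencies`.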
Challenges I ran into
The Reddit API limits how many reads I can make per minute, so I had to be careful not to exceed that limit. Also, once the output is produced, the server crashes and Python force quits. This is unexpected behavior and is a bug with macOS Mojave. I couldn't fix it in the given time. I then thought of hosting it on AWS EC2, but there wasn't enough time for that either. Please check the YouTube video link posted below. It works perfectly except for the crash, which I'll fix in later versions.
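One way to stay under a per-minute read limit is a small sliding-window throttle that is called before every request. This is only a sketch of the general idea, not what the project does: the `RateLimiter` class is hypothetical, and the limit of 60 calls per minute is a placeholder, not Reddit's actual number.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls within any sliding window of `period` seconds."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def _drop_expired(self, now):
        # Forget calls that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()

    def wait(self):
        """Block until one more call is allowed, then record it."""
        now = self.clock()
        self._drop_expired(now)
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self._drop_expired(now)
        self.calls.append(now)


# Placeholder limit; check the Reddit API rules for the real value.
limiter = RateLimiter(max_calls=60, period=60.0)
limiter.wait()  # call this right before each request to the API
```

The clock and sleep functions are passed in as parameters so the waiting logic can be exercised with a fake clock instead of real delays.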
Accomplishments that I'm proud of
Though it was a small project, I learned how problems arise unexpectedly. I really didn't expect to complete a project today. I came just to see what the dev community around me is like, and it turned out to be a fun project with its share of challenges.
What I learned
This is my first full-fledged project in Python, and I wrote it from scratch. I feel very good about it.
What's next for RedditWordCloud
Perform sentiment analysis on the words and color them accordingly. Let users choose a cloud shape from a pool of predefined shapes. Try to host it on a cloud platform such as Azure or AWS.
Currently it runs on localhost, so I can't show a demo of a web app hosted on the internet. I've made a video showing how to run it, how it works, and the part where it crashes on macOS Mojave. Please check the YouTube link. You can run it by cloning my repo and running python sample.py.