What it does
Detects communities on Twitter based on the content of what users post.
How I built it
I vectorized 50,000 posts using TF-IDF, then trained a k-means model on those vectors to produce 50 clusters, each treated as a community.
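The pipeline above can be sketched in a few lines of dependency-free Python. This is only an illustration of the two steps (TF-IDF weighting, then k-means), not the project's actual code: the function names, the whitespace tokenizer, the smoothed IDF formula, and the toy posts are all assumptions. In practice a library implementation such as scikit-learn's `TfidfVectorizer` and `KMeans` would be the usual choice; this write-up does not say which one was used.

```python
import math
import random
from collections import Counter


def tfidf_vectors(posts):
    """Turn each post into a sparse {term: weight} dict of L2-normalized TF-IDF."""
    # Assumption: naive lowercase + whitespace tokenization.
    docs = [Counter(post.lower().split()) for post in posts]
    # Document frequency: in how many posts each term appears.
    doc_freq = Counter(term for doc in docs for term in doc)
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        # Smoothed IDF, so a term present in every post keeps a non-zero weight.
        vec = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in doc.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def squared_distance(vec, centroid):
    """Squared Euclidean distance between two sparse vectors."""
    terms = set(vec) | set(centroid)
    return sum((vec.get(t, 0.0) - centroid.get(t, 0.0)) ** 2 for t in terms)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)  # start from k randomly chosen posts
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each post goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_vector(members)
    return labels


# Toy posts, made up for illustration only.
posts = [
    "goal scored in the football match",
    "football match tonight",
    "new phone camera review",
    "phone battery review",
]
vectors = tfidf_vectors(posts)
labels = kmeans(vectors, k=2)
```

Here `labels[i]` is the cluster (community) index assigned to `posts[i]`. The project used 50 clusters; `k=2` is only there to suit the toy data.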
Challenges I ran into
I started with 1.6 million tweets, but my laptop did not have enough memory or computational power to process them in a reasonable time. I considered using an AWS EC2 server, but decided to reduce the number of tweets instead. Training on the reduced dataset took around 2-3 hours.
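One way to shrink a dataset that does not fit in memory is to draw a uniform random sample while streaming over it. The sketch below uses reservoir sampling; it is an assumption about how the reduction could be done, not a description of how it was done here. The function name and the generated stand-in tweets are hypothetical.

```python
import random


def sample_lines(lines, sample_size, seed=0):
    """Keep a uniform random sample of `sample_size` items from any iterable.

    Only the sample itself is held in memory, so `lines` can be a file
    object or generator far larger than available RAM.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    reservoir = []
    for index, line in enumerate(lines):
        if index < sample_size:
            # Fill the reservoir with the first `sample_size` items.
            reservoir.append(line)
        else:
            # Replace a random slot with probability sample_size / (index + 1).
            slot = rng.randint(0, index)
            if slot < sample_size:
                reservoir[slot] = line
    return reservoir


# Stand-in for a large tweet file: a generator of 1,000 fake tweets.
fake_tweets = (f"tweet {i}" for i in range(1000))
sample = sample_lines(fake_tweets, sample_size=50)
```

With a real file, the generator could be replaced by an open file handle, which yields one line at a time.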
What's next for Twitter Social Media Community Detection
Look into word embeddings, such as GloVe, to improve the results.
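As a rough sketch of what that could look like: pre-trained GloVe vectors are distributed as text files with one word per line followed by its space-separated vector components, and a simple way to represent a post is to average the vectors of its words. The function names below are hypothetical, and the two-dimensional vectors are made-up toy values rather than real GloVe data.

```python
import io


def load_glove(handle):
    """Parse GloVe's text format: a word, then its vector components."""
    embeddings = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings


def embed_post(post, embeddings, dim):
    """Average the vectors of the words in `post` that have an embedding."""
    # Assumption: naive lowercase + whitespace tokenization.
    vectors = [embeddings[w] for w in post.lower().split() if w in embeddings]
    if not vectors:
        return [0.0] * dim  # no known words: fall back to a zero vector
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Toy stand-in for a GloVe file (real files have 50 to 300 dimensions).
toy_file = io.StringIO("football 1.0 0.0\nmatch 0.0 1.0\n")
embeddings = load_glove(toy_file)
post_vector = embed_post("Football match tonight", embeddings, dim=2)
# post_vector == [0.5, 0.5]; "tonight" is ignored because it has no embedding
```

The resulting dense vectors could then be clustered with k-means in place of the TF-IDF vectors.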