What it does

Detects communities on social media by grouping posts with similar content.

How I built it

I vectorized 50,000 posts using TF-IDF, then trained a k-means model on those vectors to produce 50 clusters, each treated as a community.
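The pipeline above can be sketched in a few lines of plain Python. This is a minimal illustration under my own assumptions, not the project's actual code: the whitespace tokenizer, the smoothed IDF formula, the farthest-point initialization, and the four toy posts are all made up for the example, and k is 2 rather than 50.

```python
import math
from collections import Counter


def tfidf_vectors(posts):
    """Turn each post into a sparse {term: weight} dict, L2-normalized."""
    docs = [Counter(post.lower().split()) for post in posts]
    n_docs = len(docs)
    # Document frequency: in how many posts each term appears.
    df = Counter(term for doc in docs for term in doc)
    vectors = []
    for doc in docs:
        # Smoothed IDF (an assumption) so no weight is zero or undefined.
        vec = {t: tf * (math.log((1 + n_docs) / (1 + df[t])) + 1)
               for t, tf in doc.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def kmeans(vectors, k, n_iter=20):
    """Lloyd's algorithm; returns one cluster label per vector."""
    # Deterministic farthest-point initialization (an assumption made
    # here for reproducibility; random initialization is also common).
    centroids = [dict(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(dict(far))
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every vector.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if not members:
                continue  # keep the old centroid if a cluster empties
            total = Counter()
            for v in members:
                for t, w in v.items():
                    total[t] += w
            centroids[c] = {t: w / len(members) for t, w in total.items()}
    return labels


# Toy posts, invented for illustration only.
posts = [
    "football match goal tonight",
    "great goal in the football match",
    "new python release adds typing features",
    "python typing and new release notes",
]
labels = kmeans(tfidf_vectors(posts), k=2)
print(labels)
```

At the scale described here, a library implementation such as scikit-learn's TfidfVectorizer and KMeans would likely be a better fit than hand-written loops; the sketch only shows what the two steps compute.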

Challenges I ran into

I started with 1.6 million tweets, but my laptop did not have enough memory or computational power to process them in a reasonable time. I considered using an AWS EC2 server, but decided to reduce the number of tweets instead. Training on the reduced dataset took around 2-3 hours.
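One way to cut a dataset down without loading it all into memory is reservoir sampling, which keeps a uniform random subset while reading the data once. The sketch below is an assumption about how such a reduction could be done, not a record of how the tweets were actually selected; the function name, the seed, and the generated stand-in tweets are hypothetical, and the sizes are scaled down from 1.6 million and 50,000.

```python
import random


def reservoir_sample(items, k, seed=0):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


# Stand-in for reading tweets one by one from a large file.
tweets = (f"tweet {n}" for n in range(160_000))
subset = reservoir_sample(tweets, k=5_000)
print(len(subset))
```

Because the input is consumed as a stream, memory use depends on the sample size rather than on the size of the full dataset.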

What's next for Twitter Social Media Community Detection

Look into word embeddings, such as GloVe, to improve the results.
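As a rough idea of what that could look like, one common approach is to represent each post by the average of its words' vectors and cluster those instead of the TF-IDF vectors. The sketch below assumes that averaging approach and GloVe's plain-text file format (a word followed by its vector components on each line); the three 2-dimensional vectors are invented toy values, not real GloVe embeddings.

```python
def load_glove(lines):
    """Parse GloVe's text format: a word followed by its vector components."""
    table = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        table[parts[0]] = [float(x) for x in parts[1:]]
    return table


def embed_post(post, table):
    """Average the vectors of the words in the post that have an embedding."""
    vecs = [table[w] for w in post.lower().split() if w in table]
    if not vecs:
        return None  # no known words, so no embedding for this post
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]


# Toy 2-dimensional vectors, invented for illustration only.
toy_lines = ["goal 1.0 0.0", "match 0.0 1.0", "python -1.0 0.0"]
table = load_glove(toy_lines)
vec = embed_post("Goal match tonight", table)
print(vec)
```

With a real embedding file, the list of lines would come from iterating over the opened file, and the resulting dense vectors could be passed to the same k-means step.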
