Inspiration
Our inspiration stems from the challenge presented by CANIS to transform a raw dataset into an engaging visual narrative. Motivated by a shared passion for data gathering, cleaning, exploration, and visualization, Team Insightful Four embarked on a mission to uncover hidden patterns and stories within the provided data.
What it does
Our project is a comprehensive exploration of the dataset provided by CANIS and additional data scraped from Twitter. Through systematic data gathering, cleaning, and analysis, we transformed raw figures into polished graphs and insightful visuals that address the scale of Chinese platforms on social media.
How we built it
Our approach was systematic and exploratory:
- Data Gathering: Began with a thorough analysis of the CANIS data, then scraped Twitter for more data on the state actors.
- Data Cleaning: Filtered and cleaned the data so that our ML models would produce higher-quality results.
- Data Analysis: Delved into the data to uncover trends, anomalies, and noteworthy findings.
- Visualization: Transformed our discoveries into an array of engaging, insightful visuals.
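As an illustration of the data cleaning step, here is a minimal sketch of the kind of filtering that tends to help before topic modeling or sentiment analysis. It is not our exact pipeline: the function names (`clean_tweet`, `clean_corpus`), the regular expressions, and the `min_words` threshold are all assumptions made for this example.

```python
import re

# Hypothetical patterns for this sketch; a real pipeline may need more rules.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip URLs and @mentions, then collapse runs of whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def clean_corpus(tweets, min_words=3):
    """Clean each tweet, dropping very short texts and case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for raw in tweets:
        text = clean_tweet(raw)
        key = text.lower()
        # min_words is an assumed threshold: very short tweets carry little topical signal.
        if len(text.split()) < min_words or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned
```

Calling `clean_corpus` on a list of raw tweet strings returns the cleaned, de-duplicated texts in their original order, ready to pass to a downstream model.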
Challenges we ran into
We encountered several challenges throughout the project: scraping data from Twitter under its many access limitations, implementing effective data cleaning strategies, tuning hyperparameters, and ensuring the accuracy of our machine learning models.
Accomplishments that we're proud of
We take pride in our achievements, including:
- Gathering valuable data from Twitter despite many limitations on API access.
- Developing a structured presentation that guides viewers through our data journey.
- Providing insights into 348 topics through visualizations derived from machine learning models.
- Creating semantic search infrastructure for easily finding related tweets.
- Summarizing our methodology, providing insights into the analytical tools and techniques employed.
- Creating a codebase that showcases our technical expertise and the power behind our analysis.
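To give a feel for how the tweet search ranks results, here is a toy sketch of the retrieval loop: embed the query, embed each tweet, and sort by cosine similarity. It deliberately swaps in a bag-of-words vector as a stand-in, so it matches words rather than meaning; a real semantic search would replace `embed` with a sentence-embedding model. The names `embed`, `cosine`, and `search` are hypothetical and are not taken from our codebase.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: word counts. A semantic system would use a learned model here."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, tweets, top_k=3):
    """Return up to top_k (score, tweet) pairs with non-zero similarity, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(t)), t) for t in tweets]
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: p[0], reverse=True)
    return ranked[:top_k]
```

For example, `search("trade deal", tweets, top_k=2)` returns the two tweets that share the most vocabulary with the query, each paired with its similarity score.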
What we learned
Our journey provided valuable insights into data exploration, visualization techniques, and effective collaboration. We honed our skills in data gathering, cleaning, and analysis, learning how to transform complex datasets into compelling narratives. We learned topic modeling techniques, the BERTopic library, and sentiment analysis with fine-tuned models from Hugging Face. We also learned how to work with Streamlit.
What's next for our team
Expanding our data visualization knowledge by exploring different datasets.