Inspiration
We were inspired by the surge of misinformation and biased reporting on social media, especially among younger audiences who rely on Instagram for their daily news. Recognizing that users often don’t have the time or tools to verify content, we wanted to create an easy, one-stop solution that flags questionable posts and highlights potential biases—ultimately helping people make more informed judgments about what they see online.
What it does
InstaCheck pulls in Instagram posts (via the Instagram Basic Display API or manual uploads) and uses natural language processing (NLP) to:
- Fact-Check: Compares captions or text extracted from posts against known fact-checking resources (such as PolitiFact and Snopes) to identify potential misinformation.
- Bias Detection: Classifies the post’s language to determine if it leans left, right, or appears neutral.
- User-Friendly Dashboard: Shows users each post’s potential factual accuracy and a bias label, making it simple to spot unverified or heavily biased content.
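To make the dashboard idea concrete, here is a minimal sketch of what one row of results could look like and how per-account filtering and overall statistics might be computed. The `PostReport` class, its field names, and the status strings are illustrative assumptions, not the project's actual data model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class PostReport:
    """One dashboard row. All field names are hypothetical."""
    username: str
    caption: str
    fact_check_status: str  # assumed values, e.g. "unverified", "disputed", "supported"
    bias_label: str         # "Left", "Right", or "Neutral"


def filter_by_account(reports, username):
    """Return only the reports that belong to one account."""
    return [r for r in reports if r.username == username]


def bias_summary(reports):
    """Count how many posts fall under each bias label."""
    return Counter(r.bias_label for r in reports)


# Made-up rows, purely for illustration.
reports = [
    PostReport("news_a", "Example caption one", "unverified", "Left"),
    PostReport("news_a", "Example caption two", "supported", "Neutral"),
    PostReport("news_b", "Example caption three", "disputed", "Right"),
]

print(bias_summary(filter_by_account(reports, "news_a")))
```

Keeping each post as a small flat record like this would make it straightforward to export to CSV or feed into a table view.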
How we built it
- Data Collection: We manually collected post captions, usernames, and timestamps.
- Preprocessing: Cleaned and tokenized text with Python’s spaCy and applied language detection if needed.
- Fact-Checking: We plan to incorporate automated semantic search for scaling up.
- Bias Detection: We built a straightforward text classification model (Logistic Regression + TF-IDF) to label posts as “Left,” “Right,” or “Neutral.” Future versions might use fine-tuned Transformer models for more nuanced classification.
- User Interface: A simple dashboard displays each post’s content, fact-check status, and bias score. Users can also see overall statistics or filter by account.
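The bias detection step (TF-IDF features fed into Logistic Regression) can be sketched roughly as follows. This assumes scikit-learn, which the write-up does not name explicitly. The captions, labels, and n-gram setting are made up for illustration and are not the project's training data or configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, invented captions; a real model needs a much larger labeled dataset.
captions = [
    "Raise the minimum wage and expand public healthcare",
    "Stronger unions and more funding for public schools",
    "Cut taxes and reduce government regulation",
    "Secure the border and lower government spending",
    "The city council meets on Tuesday at 6 pm",
    "Local library extends its weekend opening hours",
]
labels = ["Left", "Left", "Right", "Right", "Neutral", "Neutral"]

# TF-IDF turns each caption into a weighted word/bigram vector;
# Logistic Regression then learns one set of weights per label.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(captions, labels)

new_caption = "New proposal would cut taxes for small businesses"
predicted = model.predict([new_caption])[0]
probabilities = dict(zip(model.classes_, model.predict_proba([new_caption])[0]))

print(predicted)
print(probabilities)
```

Exposing the class probabilities rather than only the top label is one way a dashboard could show a bias score alongside the label.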
Challenges we ran into
- Limited Training Data: Building a robust bias detection model requires large, well-labeled datasets, which can be time-consuming to create.
What we learned
- Importance of Human-in-the-Loop: Fully automated fact-checking is still a challenge—expert or community feedback can significantly improve results.
- Ethical & Privacy Considerations: Handling user-generated data responsibly and respecting platform terms is crucial for a sustainable project.
- Model Flexibility: Bias can be subtle or context-specific; building a flexible system that can adapt to evolving social discourse is key.
What's next for InstaCheck
- Automated Claim Detection: Integrate a semantic similarity engine or BERT-based model to automatically match Instagram captions with known fact-check entries.
- OCR for Image-Based Posts: Many Instagram posts are text on images; we’ll add OCR to extract and analyze text from these images.
- Multi-Lingual Support: Expand language detection and modeling to address misinformation in non-English captions.
- User Feedback & Crowdsourcing: Allow users to report inaccuracies or provide additional context, improving the model over time.
- Mobile Integration: Eventually, build a mobile app or browser extension for quick, on-the-go verification.
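As a rough illustration of the planned claim matching, the sketch below pairs a caption with the closest entry in a list of fact-check statements. It uses plain TF-IDF cosine similarity from the standard library as a lightweight stand-in; it is not a semantic or BERT-based model, which would replace the vectors with sentence embeddings. The function names, the 0.2 threshold, and the sample entries are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(documents)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency keeps weights positive.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(caption, fact_checks, threshold=0.2):
    """Return (index, score) of the closest entry, or (None, score) below the threshold."""
    if not fact_checks:
        return (None, 0.0)
    vectors = tfidf_vectors(fact_checks + [caption])
    caption_vec = vectors[-1]
    scores = [cosine(caption_vec, vec) for vec in vectors[:-1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return (best, scores[best]) if scores[best] >= threshold else (None, scores[best])


# Invented entries, not real fact-check data.
fact_checks = [
    "Claim that drinking eight glasses of water a day is medically required",
    "Claim that the Great Wall of China is visible from space",
]
caption = "Did you know the Great Wall of China can be seen from space?"

index, score = best_match(caption, fact_checks)
print(index, round(score, 2))
```

A word-overlap measure like this misses paraphrases, which is the main reason an embedding-based approach would be the natural next step.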
Built With
- csv
- figma
- python