Inspiration

While browsing social media, I came across someone who consistently left negative comments on posts by various creators. Each creator took the comments seriously and tried to explain themselves, yet the commenter stayed negative no matter who the recipient was. This prompted me to look for online tools that could analyze such behavior and help creators avoid taking it as a personal attack.

In our hyper-connected world, digital communications reveal a great deal about our mental state and well-being. CyberPrint was born from the recognition that millions of people leave digital breadcrumbs across social platforms (comments, posts, and interactions) that collectively paint a picture of their emotional landscape. I wanted to create a tool that could analyze these patterns and provide meaningful insights for digital well-being, mental health awareness, and personal growth.

What it does

CyberPrint is a sentiment analysis platform that analyzes users' digital communications across Reddit and YouTube to generate comprehensive mental health and well-being reports. The system:

  • Analyzes Digital Footprints: Processes comments from Reddit users and YouTube channels
  • Sentiment Classification: Uses state-of-the-art DeBERTa transformer models achieving over 93% accuracy
  • Granular Sub-Label Detection: Identifies specific emotional categories like gratitude, sarcasm, concern, excitement, and more
  • Mental Health Monitoring: Detects potential mental health concerns and provides supportive resources
  • Personalized PDF Reports: Generates professional, visually appealing reports with insights and recommendations
  • Dual Perspective Analysis: Provides insights for both self-reflection ("if this is you") and understanding others ("if this is a stranger")

How I built it

AI/ML Architecture

  • Primary Model: Fine-tuned DeBERTa (Decoding-enhanced BERT with Disentangled Attention), achieving 93%-97% sentiment classification accuracy
  • Fallback System: Logistic regression model for reliability
  • Enhanced Sub-Label Classification: Rule-based system for granular emotional categorization
  • Active Learning Pipeline: Continuous model improvement through misclassification detection
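
The primary/fallback arrangement listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not CyberPrint's actual code: `Prediction`, `predict_with_fallback`, and the two stand-in predictor functions are hypothetical names, and a real system would call the fine-tuned DeBERTa model and the logistic regression model in their place.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Prediction:
    label: str         # e.g. "positive", "negative", or "neutral"
    confidence: float  # 0.0 to 1.0
    source: str        # which model produced the prediction


def predict_with_fallback(
    text: str,
    primary: Callable[[str], Tuple[str, float]],
    fallback: Callable[[str], Tuple[str, float]],
) -> Prediction:
    """Try the primary model; use the fallback if the primary raises."""
    try:
        label, confidence = primary(text)
        return Prediction(label, confidence, "primary")
    except Exception:
        label, confidence = fallback(text)
        return Prediction(label, confidence, "fallback")


# Hypothetical stand-ins for the two models (assumptions for illustration).
def broken_primary(text: str) -> Tuple[str, float]:
    raise RuntimeError("model not loaded")


def simple_fallback(text: str) -> Tuple[str, float]:
    return ("positive", 0.6) if "thanks" in text.lower() else ("neutral", 0.5)


result = predict_with_fallback("Thanks for the video!", broken_primary, simple_fallback)
print(result)
```

Recording which model answered (the `source` field here) makes it easier to spot how often the fallback is actually being used.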

Backend Infrastructure

  • FastAPI Server: High-performance API with comprehensive endpoints
  • Data Processing Pipeline: Robust text preprocessing and batch processing capabilities
  • Multi-Platform Integration: Reddit API and YouTube Data API v3 integration
  • PDF Generation: Beautiful report generation with ReportLab
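
The text preprocessing and batching step of the data processing pipeline could look something like the sketch below. The cleaning rules (stripping URLs, collapsing whitespace, dropping empty comments) and the names `clean_comment` and `batched` are assumptions chosen for illustration; the project's actual preprocessing may differ.

```python
import re
from typing import Iterable, Iterator, List

URL_RE = re.compile(r"https?://\S+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_comment(text: str) -> str:
    """Remove URLs and collapse runs of whitespace."""
    text = URL_RE.sub("", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def batched(comments: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield cleaned, non-empty comments in lists of at most batch_size."""
    batch: List[str] = []
    for raw in comments:
        cleaned = clean_comment(raw)
        if not cleaned:
            continue  # skip comments that are empty after cleaning
        batch.append(cleaned)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


comments = ["Great  video!\n", "see https://example.com", "   ", "meh"]
for batch in batched(comments, batch_size=2):
    print(batch)
```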

Frontend Experience

  • React: Responsive single-page application
  • Animated UI: Engaging user interface with smooth animations
  • Real-time Analysis: Live sentiment analysis with progress indicators
  • Mobile-Responsive: Optimized for all device sizes

Challenges I ran into

Gratitude Classification Bias

  • The Problem: DeBERTa consistently misclassified gratitude expressions as neutral
  • Root Cause: Training data imbalance and model bias toward neutral predictions
  • Solution: Developed a sophisticated post-processing override system for gratitude detection
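
A post-processing override of this kind might be sketched as below. The phrase lists and the function name `apply_gratitude_override` are hypothetical and only illustrate the idea of correcting a neutral prediction when the text clearly expresses thanks; the real system's rules are not shown in this write-up.

```python
import re

# Hypothetical gratitude phrases; a real list would be tuned on labeled data.
GRATITUDE_RE = re.compile(
    r"\b(thank you|thanks|thankful|grateful|appreciate[ds]?)\b", re.IGNORECASE
)
# Hypothetical phrases where "thanks" is not sincere gratitude.
NOT_GRATITUDE_RE = re.compile(r"\b(thanks for nothing|no thanks)\b", re.IGNORECASE)


def apply_gratitude_override(text: str, predicted_label: str) -> str:
    """Promote a neutral prediction to positive when the text expresses gratitude."""
    if predicted_label != "neutral":
        return predicted_label  # only neutral predictions are overridden
    if NOT_GRATITUDE_RE.search(text):
        return predicted_label  # leave insincere or declining phrases alone
    if GRATITUDE_RE.search(text):
        return "positive"
    return predicted_label


print(apply_gratitude_override("Thank you so much for this!", "neutral"))
print(apply_gratitude_override("Thanks for nothing", "neutral"))
```

Restricting the override to neutral predictions keeps the rule from flipping comments the model already judged negative.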

Sub-Label Classification Precision

  • The Problem: The rule-based sub-label classifier was misclassifying neutral comments (e.g., gaming discussions) as negative sentiment due to overly broad keyword matching
  • Solution: Refined the pattern matching algorithms and implemented context-aware classification rules, achieving over 80% accuracy across 15+ sub-categories.
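
To illustrate what a context-aware rule can look like, here is a small sketch. The keyword sets and the function `is_negative_by_keywords` are hypothetical examples, not the project's actual rules. The sketch matches whole words rather than substrings, and it ignores ambiguous words such as "kill" when gaming vocabulary appears in the same comment.

```python
import re

# Hypothetical keyword sets, chosen only for illustration.
STRONG_NEGATIVE = {"hate", "awful", "disgusting"}
AMBIGUOUS_NEGATIVE = {"kill", "dead", "destroy"}  # often harmless in gaming talk
GAMING_CONTEXT = {"boss", "level", "game", "player", "respawn", "match"}

TOKEN_RE = re.compile(r"[a-z']+")


def is_negative_by_keywords(text: str) -> bool:
    """Flag a comment as negative using whole-word, context-aware matching."""
    tokens = set(TOKEN_RE.findall(text.lower()))
    if tokens & STRONG_NEGATIVE:
        return True
    if tokens & AMBIGUOUS_NEGATIVE:
        # Ambiguous words count only when no gaming vocabulary is present.
        return not (tokens & GAMING_CONTEXT)
    return False


print(is_negative_by_keywords("How do I kill the final boss?"))
print(is_negative_by_keywords("That skill is amazing"))
print(is_negative_by_keywords("I hate this channel"))
```

Tokenizing first avoids the substring problem where "skill" would otherwise match "kill".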

Training Data Quality vs Quantity

  • The Problem: Achieving high confidence required significantly more high-quality labeled data than was initially available
  • Solution: Built an active learning pipeline with data augmentation techniques (paraphrasing, context variations) and human-in-the-loop feedback to iteratively improve model performance.
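
One common way to feed a human-in-the-loop step is uncertainty sampling: send the predictions the model is least confident about to a reviewer first. The sketch below shows that idea only; the `Scored` type, the `select_for_review` function, and the 0.7 threshold are assumptions for illustration and may not match how CyberPrint selects samples.

```python
from typing import List, NamedTuple


class Scored(NamedTuple):
    text: str
    label: str
    confidence: float  # model confidence in the predicted label


def select_for_review(
    predictions: List[Scored], threshold: float = 0.7, limit: int = 100
) -> List[Scored]:
    """Return up to `limit` predictions below `threshold`, least confident first."""
    uncertain = [p for p in predictions if p.confidence < threshold]
    uncertain.sort(key=lambda p: p.confidence)
    return uncertain[:limit]


preds = [
    Scored("Thanks a lot", "neutral", 0.55),
    Scored("Love it", "positive", 0.98),
    Scored("ok I guess", "neutral", 0.62),
    Scored("worst upload", "negative", 0.91),
]
for item in select_for_review(preds):
    print(item.text, item.confidence)
```

The reviewed items, once relabeled, would be added to the training set for the next fine-tuning round.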

Accomplishments that I'm proud of

  • Achieved >93% sentiment classification accuracy with a fine-tuned DeBERTa model
  • Implemented a sophisticated gratitude detection system, solving a critical AI bias issue
  • Created beautiful, professional PDF reports that users actually want to share
  • Built a comprehensive active learning pipeline for continuous model improvement
  • Developed robust multi-platform integration supporting both Reddit and YouTube
  • Achieved a seamless user experience from URL input to detailed analysis report
  • Managed to create a logo for CyberPrint by using Canva

Last but not least, I successfully brought my vision to life!

What I learned

  • Advanced Transformer Fine-tuning: Deep understanding of DeBERTa architecture and optimization
  • Production ML Deployment: Critical lessons about model versioning and deployment strategies & limitations
  • Bias Detection and Mitigation: Identifying and solving AI bias in sentiment analysis
  • Full-Stack Integration: Seamless connection between React frontend and FastAPI backend
  • User Experience Design: Creating intuitive interfaces for complex AI-powered applications
  • Mental Health Technology: Responsible development of tools that impact user well-being

What's next for CyberPrint

Enhanced AI Capabilities

  • Multi-language Support: Expand beyond English to global audiences
  • Temporal Analysis: Track sentiment changes over time for trend identification
  • Advanced Mental Health Models: Specialized models for depression, anxiety detection

Platform Expansion

  • Twitter/X Integration: Analyze tweets and social media presence
  • Instagram & TikTok: Expand to visual platform comment analysis
  • LinkedIn Professional Analysis: Career-focused sentiment insights

Enterprise Features

  • Team Analytics: Organizational communication health monitoring
  • API Access: Developer-friendly API for third-party integrations
  • Custom Model Training: Industry-specific sentiment analysis models

Wellness Integration

  • Intervention Recommendations: Personalized mental health resource suggestions
  • Therapist Integration: Professional mental health provider dashboard
  • Wellness Tracking: Long-term digital well-being monitoring

CyberPrint - Where AI meets digital well-being
