CyberPrint

Inspiration

While browsing social media, I came across someone who left negative comments on many different creators' posts. Each creator took the comments seriously and tried to explain themselves, yet the commenter stayed negative no matter who was on the receiving end. That prompted me to look for online tools that could analyze this kind of behavior and help creators avoid taking it as a personal attack.

In our hyper-connected world, our digital communications can reveal a great deal about our mental state and well-being. CyberPrint was born from the recognition that millions of people leave digital breadcrumbs across social platforms (comments, posts, and interactions) that collectively paint a picture of their emotional landscape. I wanted to create a tool that could analyze these patterns and provide meaningful insights for digital well-being, mental health awareness, and personal growth.

What it does

CyberPrint is a sentiment analysis platform that analyzes users' digital communications across Reddit and YouTube to generate comprehensive mental health and wellbeing reports. The system:

Analyzes Digital Footprints: Processes comments from Reddit users and YouTube channels
Sentiment Classification: Uses state-of-the-art DeBERTa transformer models achieving over 93% accuracy
Granular Sub-Label Detection: Identifies specific emotional categories like gratitude, sarcasm, concern, excitement, and more
Mental Health Monitoring: Detects potential mental health concerns and provides supportive resources
Personalized PDF Reports: Generates professional, visually appealing reports with insights and recommendations
Dual Perspective Analysis: Provides insights for both self-reflection ("if this is you") and understanding others ("if this is a stranger")

How I built it

AI/ML Architecture

Primary Model: Fine-tuned DeBERTa (Decoding-enhanced BERT with Disentangled Attention), achieving 93%-97% sentiment classification accuracy
Fallback System: Logistic regression model for reliability
Enhanced Sub-Label Classification: Rule-based system for granular emotional categorization
Active Learning Pipeline: Continuous model improvement through misclassification detection
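The primary/fallback arrangement described above can be illustrated with a minimal sketch. It assumes the fallback is used only when the primary model raises an error; the write-up does not say what actually triggers the fallback. The function names and the stub classifiers are hypothetical stand-ins, not CyberPrint's actual code.

```python
from typing import Callable, Tuple

# A classifier takes a text and returns (label, confidence).
Classifier = Callable[[str], Tuple[str, float]]


def classify_with_fallback(
    text: str, primary: Classifier, fallback: Classifier
) -> Tuple[str, float, str]:
    """Try the primary model first; use the fallback if it fails.

    Returns (label, confidence, source), where source records which
    model produced the prediction.
    """
    try:
        label, confidence = primary(text)
        return label, confidence, "primary"
    except Exception:
        # Assumption: any failure of the primary model (not loaded,
        # out of memory, timeout) routes the request to the fallback.
        label, confidence = fallback(text)
        return label, confidence, "fallback"


# Hypothetical stubs standing in for DeBERTa and logistic regression.
def broken_primary(text: str) -> Tuple[str, float]:
    raise RuntimeError("model not loaded")


def keyword_fallback(text: str) -> Tuple[str, float]:
    if "thank" in text.lower():
        return "positive", 0.6
    return "neutral", 0.5
```

Recording the source alongside the label makes it possible to see later how often the fallback path was actually used.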

Backend Infrastructure

FastAPI Server: High-performance API with comprehensive endpoints
Data Processing Pipeline: Robust text preprocessing and batch processing capabilities
Multi-Platform Integration: Reddit API and YouTube Data API v3 integration
PDF Generation: Beautiful report generation with ReportLab

Frontend Experience

React: Responsive single-page application
Animated UI: Engaging user interface with smooth animations
Real-time Analysis: Live sentiment analysis with progress indicators
Mobile-Responsive: Optimized for all device sizes

Challenges I ran into

Gratitude Classification Bias

The Problem: DeBERTa consistently misclassified gratitude expressions as neutral

Root Cause: Training data imbalance and model bias toward neutral predictions
Solution: Developed a sophisticated post-processing override system for gratitude detection
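A post-processing override of this kind might look roughly like the sketch below. The patterns, the confidence cutoff, and the function name are illustrative assumptions; the real rules are not given in this write-up. The idea is to relabel a "neutral" prediction as "positive" only when the text contains an explicit gratitude phrase and no obviously sarcastic one.

```python
import re

# Hypothetical patterns; the project's actual rule set is not published here.
GRATITUDE_PATTERN = re.compile(
    r"\b(thank you|thanks|thankful|grateful|appreciate[sd]?)\b", re.IGNORECASE
)
SARCASM_PATTERN = re.compile(
    r"\b(no thanks|thanks for nothing|not grateful)\b", re.IGNORECASE
)


def apply_gratitude_override(
    text: str, label: str, confidence: float, max_neutral_confidence: float = 0.9
) -> str:
    """Return the final label after the gratitude override is applied."""
    if label != "neutral":
        return label  # only neutral predictions are ever overridden
    if confidence > max_neutral_confidence:
        return label  # assumed cutoff: trust very confident predictions
    if SARCASM_PATTERN.search(text):
        return label  # a phrase like "thanks for nothing" is not gratitude
    if GRATITUDE_PATTERN.search(text):
        return "positive"
    return label
```

For example, `apply_gratitude_override("Thank you so much for this video", "neutral", 0.7)` returns `"positive"`, while a neutral prediction for "thanks for nothing" is left unchanged.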

Sub-Label Classification Precision

The Problem: The rule-based sub-label classifier was misclassifying neutral comments (e.g., gaming discussions) as negative sentiment due to overly broad keyword matching

Solution: Refined the pattern matching algorithms and implemented context-aware classification rules, achieving over 80% accuracy across 15+ sub-categories.
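To illustrate the difference between broad keyword matching and a context-aware rule, here is a small sketch. The keyword and context lists are invented for the example and are not the project's actual rules. The naive version matches substrings, so "skills" triggers on "kill". The refined version matches whole tokens and suppresses the negative flag when gaming vocabulary is present.

```python
import re

# Hypothetical word lists, chosen only to demonstrate the idea.
NEGATIVE_KEYWORDS = {"kill", "killed", "destroy", "destroyed", "dead", "hate"}
GAMING_CONTEXT = {"boss", "level", "raid", "respawn", "match", "loot", "speedrun"}


def tokenize(text: str) -> list:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def naive_is_negative(text: str) -> bool:
    """Overly broad rule: substring match against negative keywords."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in NEGATIVE_KEYWORDS)


def context_aware_is_negative(text: str) -> bool:
    """Whole-token match, suppressed when gaming vocabulary is present."""
    tokens = set(tokenize(text))
    if not tokens & NEGATIVE_KEYWORDS:
        return False
    if tokens & GAMING_CONTEXT:
        return False  # "killed the boss" is game talk, not hostility
    return True
```

With these lists, "I killed the boss on my first try" is flagged by the naive rule but not by the context-aware one, while "I hate this channel" is flagged by both.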

Training Data Quality vs Quantity

The Problem: Achieving high confidence required significantly more high-quality labeled data than was initially available

Solution: Built an active learning pipeline with data augmentation techniques (paraphrasing, context variations) and human-in-the-loop feedback to iteratively improve model performance.
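One common way to decide which comments go to a human reviewer in such a pipeline is least-confidence sampling. The sketch below shows that technique under assumed names and thresholds; the write-up does not specify the selection strategy CyberPrint actually uses.

```python
from typing import Dict, List


def select_for_review(
    predictions: List[Dict], budget: int, threshold: float = 0.6
) -> List[Dict]:
    """Pick the least confident predictions for human labeling.

    Each prediction is a dict with "text", "label", and "confidence" keys.
    Only predictions below `threshold` are candidates, and at most
    `budget` of them are returned, least confident first.
    """
    uncertain = [p for p in predictions if p["confidence"] < threshold]
    uncertain.sort(key=lambda p: p["confidence"])
    return uncertain[:budget]


# Illustrative data, not real model output.
sample_predictions = [
    {"text": "a", "label": "neutral", "confidence": 0.95},
    {"text": "b", "label": "neutral", "confidence": 0.41},
    {"text": "c", "label": "negative", "confidence": 0.55},
    {"text": "d", "label": "positive", "confidence": 0.58},
]
review_queue = select_for_review(sample_predictions, budget=2)
```

Here `review_queue` holds items "b" and "c", in that order. Once a reviewer corrects their labels, they can be added back to the training set for the next fine-tuning round.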

Accomplishments that I'm proud of

Achieved >93% sentiment classification accuracy with a fine-tuned DeBERTa model
Implemented a sophisticated gratitude detection system, solving a critical AI bias issue
Created beautiful, professional PDF reports that users actually want to share
Built a comprehensive active learning pipeline for continuous model improvement
Developed robust multi-platform integration supporting both Reddit and YouTube
Achieved seamless user experience from URL input to detailed analysis report
Managed to create a logo for CyberPrint using Canva
Last but not least, I successfully brought my vision to life!

What I learned

Advanced Transformer Fine-tuning: Deep understanding of DeBERTa architecture and optimization
Production ML Deployment: Critical lessons about model versioning and deployment strategies & limitations
Bias Detection and Mitigation: Identifying and solving AI bias in sentiment analysis
Full-Stack Integration: Seamless connection between React frontend and FastAPI backend
User Experience Design: Creating intuitive interfaces for complex AI-powered applications
Mental Health Technology: Responsible development of tools that impact user well-being

What's next for CyberPrint

Enhanced AI Capabilities

Multi-language Support: Expand beyond English to global audiences
Temporal Analysis: Track sentiment changes over time for trend identification
Advanced Mental Health Models: Specialized models for depression, anxiety detection

Platform Expansion