SafeGuard AI

Intelligent Real-Time Harassment Detection System


Overview

SafeGuard AI is an intelligent harassment detection system that protects users from online threats in real time by combining artificial intelligence and blockchain technology.

It is designed to detect, record, and prevent online harassment, providing victims with tamper-proof evidence and instant protection.


Quick Stats

  • Detection Speed: < 500ms per message
  • Accuracy: 93%+ on threat detection
  • Blockchain: 100% tamper-proof evidence
  • Languages: English

Average detection speed is about 500ms per message; longer messages may take slightly more time.


Problem Statement

The Crisis

Online harassment and cyberbullying are serious concerns in India, especially for women.

  • Reports show a steady rise in harassment across social media and digital platforms.
  • Cyberbullying cases have increased significantly since the pandemic, with many incidents going unreported.
  • Victims often struggle to collect digital evidence to support their cases.
  • Major platforms face challenges in real-time detection and moderation.
  • Psychological impacts such as stress, anxiety, and fear of online spaces remain widespread.

The Challenge

Current solutions often lack:

  • Real-time detection
  • Immutable evidence for legal cases
  • Pattern recognition for coordinated attacks
  • Victim-centric design

Solution

SafeGuard AI addresses these challenges through a three-layer protection system:

1. AI Detection Layer

  • Uses a pre-trained Toxic-BERT model
  • Real-time analysis (< 500ms)
  • Detects: sexual harassment, violent threats, hate speech, abusive language
  • Categorizes severity: HIGH, MEDIUM, LOW

2. Blockchain Evidence Layer

  • Logs every threat to an immutable blockchain
  • Generates tamper-proof evidence for court use
  • Stores: timestamp, threat type, severity, content hash
  • Victims can download evidence reports anytime
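
The evidence layer described above can be sketched as a small hash chain. This is a minimal illustration, not the project's actual implementation: the function names (sha256_hex, block_hash, add_evidence, verify_chain), the block field names, and the all-zero genesis link are assumptions. Only the stored fields (timestamp, threat type, severity, content hash) and the use of SHA-256 come from the description.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_LINK = "0" * 64  # assumed placeholder for the first block's back-link


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def block_hash(block: dict) -> str:
    """Hash every field of a block except its own stored hash."""
    payload = {k: v for k, v in block.items() if k != "hash"}
    return sha256_hex(json.dumps(payload, sort_keys=True))


def add_evidence(chain: list, content: str, threat_type: str, severity: str) -> dict:
    """Append a block holding a hash of the message rather than the message."""
    block = {
        "index": len(chain),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "threat_type": threat_type,
        "severity": severity,
        "content_hash": sha256_hex(content),
        "previous_hash": chain[-1]["hash"] if chain else GENESIS_LINK,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def verify_chain(chain: list) -> bool:
    """Return True only if every block hash and every back-link is intact."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_LINK
        if block["previous_hash"] != expected_prev:
            return False
    return True


chain = []
add_evidence(chain, "example threatening message", "Violent Threat", "HIGH")
add_evidence(chain, "example abusive message", "Abusive Language", "MEDIUM")
intact_before = verify_chain(chain)  # True: nothing has been altered
chain[0]["severity"] = "LOW"         # simulate tampering with a stored record
intact_after = verify_chain(chain)   # False: the stored hash no longer matches
```

Because each block's hash covers the previous block's hash, changing any logged field invalidates verification from that block onward, which is what makes the log tamper-evident.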

3. Pattern Detection Layer

  • Identifies coordinated harassment campaigns
  • Links multiple accounts targeting the same user
  • Detects escalation patterns
  • Alerts users about organized attacks

Why Blockchain?

  • Stores harassment incidents securely
  • Tamper-proof: even if the original content is deleted, evidence remains
  • Includes timestamp and threat type for each user

Pattern Recognition for Coordinated Attacks

Detects when a user receives multiple threatening or toxic comments from different accounts in a short period.

Flags possible coordinated harassment campaigns instead of treating each comment individually.

Summarizes the attack:

  • Number of threats
  • Accounts involved
  • Time span of the attack

Helps protect users by providing early warnings and allowing for escalation or reporting of organized attacks.
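
One way to sketch this kind of detection is a sliding time window over the threats received by a single user. The example below is illustrative only: the function name detect_coordinated_attack, the 10-minute window, and the threshold of three distinct accounts are assumptions, not values taken from the project.

```python
from datetime import datetime, timedelta


def detect_coordinated_attack(threats, window=timedelta(minutes=10), min_accounts=3):
    """Find the first window with threats from at least `min_accounts` accounts.

    `threats` is a list of (timestamp, account) tuples for one target user.
    Returns a summary dict, or None if no such window exists.
    """
    events = sorted(threats)
    start = 0
    for end in range(len(events)):
        # Move the left edge forward until the window spans at most `window`.
        while events[end][0] - events[start][0] > window:
            start += 1
        in_window = events[start:end + 1]
        accounts = {account for _, account in in_window}
        if len(accounts) >= min_accounts:
            return {
                "threat_count": len(in_window),
                "accounts": sorted(accounts),
                "time_span": in_window[-1][0] - in_window[0][0],
            }
    return None


t0 = datetime(2024, 1, 1, 12, 0)
threats = [
    (t0, "acct_a"),
    (t0 + timedelta(minutes=2), "acct_b"),
    (t0 + timedelta(minutes=3), "acct_a"),
    (t0 + timedelta(minutes=5), "acct_c"),
]
summary = detect_coordinated_attack(threats)
```

The summary carries the three items listed above: the number of threats, the accounts involved, and the time span. Returning at the first qualifying window keeps the warning early, at the cost of not counting threats that arrive afterwards.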


Key Features

Beautiful UI/UX

  • Modern gradient design (UX/UI designed with Claude AI)
  • Responsive layout
  • Intuitive navigation
  • Dual views: Post Owner & Commenter

AI-Powered Detection

  • Real-time threat analysis
  • Confidence scoring
  • Multi-category detection

Blockchain Security

  • Immutable evidence logging
  • SHA-256 hashing
  • Chain verification
  • Built-in Blockchain Explorer

Smart Alerts

  • Pattern attack detection
  • Severity gauges
  • Real-time notifications
  • Coordinated attack warnings

Analytics Dashboard

  • Live threat statistics
  • Severity and type distribution
  • Timeline visualizations
  • Graphs created using Plotly
  • System performance metrics

Evidence Export

  • Download threat reports (CSV)
  • Court-ready documentation
  • Complete incident history
  • Blockchain proof included

Tech Stack

  • Backend: Python, Pandas, Hashlib, Transformers (Hugging Face), PyTorch
  • Frontend: Streamlit, Plotly, Custom CSS
  • AI/ML: Pre-trained Toxic-BERT (used for real-time threat detection), NLP techniques, BERT architecture understanding and implementation
  • Blockchain: Custom Python implementation, SHA-256 cryptographic hashing, chain integrity verification

Personal Contributions and Learning

  • Learned about BERT and real-time harassment detection.
  • Studied coordinated attack detection linking multiple accounts.
  • Created interactive graphs using Plotly for attack patterns and threat statistics.
  • Assisted with UX/UI design using Claude AI.
  • User testing: Preeti Va acted as a user tester, providing feedback on usability, interface clarity, and overall experience, helping refine SafeGuard AI’s design.

Challenges & Mitigation

Multilingual Support

  • Challenge: Initially aimed to support multiple languages (English + Hindi), but multilingual harassment detection models are limited. Collecting and labeling raw datasets would take too much time.
  • Mitigation: Focused on English-language detection using pre-trained Toxic-BERT for high accuracy and real-time performance. Future work includes fine-tuning multilingual models for regional Indian languages.

Custom Model Training

  • Challenge: Training a custom BERT model could improve classification accuracy and better capture tone/context of messages, but it requires extensive data collection and labeling not feasible during the hackathon.
  • Mitigation: Used pre-trained Toxic-BERT for real-time toxicity detection, combined with a keyword-based categorization system for Sexual Harassment, Violent Threat, Hate Speech, and Abusive Language. While robust, this approach may misclassify subtle or nuanced messages. Severity is assigned as HIGH (score > 0.9), MEDIUM (score > 0.7), or LOW (score ≤ 0.7).
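
The severity thresholds and keyword-based categorization described above could look roughly like the sketch below. The thresholds and the four category names come from the text; the keyword lists, the function names, and the fallback to Abusive Language when no keyword matches are placeholders assumed for illustration. The score is assumed to be a toxicity score between 0 and 1 for a message already flagged as toxic.

```python
import re

# Hypothetical keyword lists; the project's real lists are not documented here.
CATEGORY_KEYWORDS = {
    "Sexual Harassment": ["send nudes", "sexy"],
    "Violent Threat": ["kill", "hurt", "find you"],
    "Hate Speech": ["your kind", "go back to"],
    "Abusive Language": ["idiot", "stupid", "loser"],
}


def severity_from_score(score: float) -> str:
    """Map a toxicity score to HIGH (> 0.9), MEDIUM (> 0.7), or LOW (<= 0.7)."""
    if score > 0.9:
        return "HIGH"
    if score > 0.7:
        return "MEDIUM"
    return "LOW"


def categorize(text: str, default: str = "Abusive Language") -> str:
    """Return the first category whose keyword appears as a whole word or phrase."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        for keyword in keywords:
            # Word boundaries stop "kill" from matching inside "skill".
            if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
                return category
    return default  # assumed fallback for toxic text with no keyword match


label = categorize("I will find you and hurt you")
level = severity_from_score(0.95)
```

Keyword matching is cheap and easy to audit, but it only sees surface wording, which is consistent with the limitation on subtle or nuanced messages noted above.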

Interdisciplinary Approach

SafeGuard AI is not just code; it combines multiple fields:

  • AI/ML: Real-time harassment detection using pre-trained models
  • Blockchain Security: Tamper-proof evidence storage
  • Social Sciences / Psychology: Understanding victim behavior and harassment patterns
  • UX/UI Design: Creating accessible, user-friendly interfaces for victims
  • Legal / Ethics Awareness: Ensuring evidence is admissible and data handling is ethical

This combination of disciplines ensures the system is technically strong, socially aware, and legally usable.


Future Scope

  • Language Expansion: Add support for Indian regional languages (Tamil, Bengali, Marathi, Telugu, etc.)
  • Victim Support Network: Integration with helplines, NGOs, and law enforcement for direct reporting
  • Custom Dataset Training: Train on custom datasets to improve accuracy and label subtle harassment types
