HH26 Team Code ID: 17E4D1B532F34273

Inspiration

Many data leaks don't start with a hacker. They start with a screenshot of your seed phrase, a photo with your location embedded, or a QR code hiding a malicious URL. We built VIGIL because what you can't see in an image can hurt you. Digital forensics tools exist, but they're slow, siloed, and built for experts. We wanted one-click intel for everyone.

What it does

VIGIL is a visual intelligence scanner that hunts for hidden threats in your images. Drop in a folder and it instantly:

  • OCRs text to catch exposed passwords, crypto seed phrases, and sensitive keywords
  • Decodes QR codes and flags crypto wallets or sketchy URLs
  • Detects objects like credit cards, phones, and laptops using YOLO + Google Cloud Vision
  • Extracts GPS coordinates from EXIF data (your photos are snitching on you)
  • Cracks steganography using entropy analysis, LSB extraction, and trailing byte detection

Every finding is severity-ranked and compiled into a clean HTML report with thumbnails, badges, and drill-down details.
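A minimal sketch of what severity ranking could look like, assuming each module emits a small record per finding. The `Severity` levels, the `Finding` fields, and `rank_findings` are hypothetical names for illustration, not VIGIL's actual data model.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical levels; a higher value means more urgent.
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Finding:
    image: str          # path of the scanned image
    module: str         # e.g. "ocr", "qr", "exif", "stego"
    severity: Severity
    detail: str         # human-readable description for the report


def rank_findings(findings):
    """Return findings sorted most severe first, then by image path."""
    return sorted(findings, key=lambda f: (-f.severity, f.image))


if __name__ == "__main__":
    sample = [
        Finding("beach.jpg", "exif", Severity.MEDIUM, "GPS coordinates present"),
        Finding("wallet.png", "ocr", Severity.CRITICAL, "possible seed phrase"),
    ]
    for finding in rank_findings(sample):
        print(finding.severity.name, finding.image, finding.detail)
```

Using an `IntEnum` keeps the ordering explicit, so a report template can sort or filter on the level without string comparisons.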

How we built it

Python-powered backend integrating:

  • Tesseract + EasyOCR for text extraction
  • pyzbar + OpenCV for QR decoding
  • YOLOv8 for real-time object detection
  • Google Cloud Vision API for cloud-powered label/logo/object recognition
  • PIL + exif libraries for GEOINT metadata extraction
  • scikit-image + custom LSB decoders for steganography detection
  • Jinja2 templating for slick HTML reports

Graceful fallbacks everywhere—if one library fails, another picks up the slack.
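The fallback idea can be sketched roughly as below, assuming each backend is a plain function that takes an image path and returns text. `first_successful` and the two stand-in backends are hypothetical; the real modules would wrap Tesseract, EasyOCR, and so on.

```python
import logging
from typing import Callable, Optional, Sequence

logger = logging.getLogger("vigil.sketch")  # hypothetical logger name


def first_successful(
    backends: Sequence[Callable[[str], str]], image_path: str
) -> Optional[str]:
    """Try each backend in order; return the first non-empty result, else None."""
    for backend in backends:
        try:
            result = backend(image_path)
        except Exception as exc:
            # Broad on purpose: a missing binary or model should not stop the scan.
            logger.warning("%s failed: %s", getattr(backend, "__name__", backend), exc)
            continue
        if result:
            return result
    return None


# Stand-in backends for illustration only.
def broken_ocr(path: str) -> str:
    raise RuntimeError("engine not installed")


def backup_ocr(path: str) -> str:
    return f"text from {path}"


if __name__ == "__main__":
    print(first_successful([broken_ocr, backup_ocr], "photo.png"))
```

Returning `None` instead of raising lets the caller record "no result" for that module and move on to the next image.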


Challenges we ran into

  • Steganography is subtle. Detecting hidden data without false positives required tuning entropy thresholds, chi-square tests, and Laplacian frequency analysis.
  • OCR accuracy varies wildly. Tesseract chokes on stylized text; we added EasyOCR as a fallback and preprocessing pipelines with Otsu thresholding.
  • EXIF GPS decoding is a nightmare. DMS-to-decimal conversion across different library formats took way too many edge-case fixes.
  • Google Vision API credentials. Supporting inline JSON, file paths, base64-encoded secrets, and .env files meant writing a mini credential resolver.
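To illustrate the GPS point above: EXIF stores latitude and longitude as degrees, minutes, and seconds plus a hemisphere reference (N/S/E/W), and libraries hand those values back in different shapes. The sketch below assumes each component is either a number-like object that supports `float()` or a `(numerator, denominator)` pair; `dms_to_decimal` is a hypothetical helper, not VIGIL's actual function.

```python
from fractions import Fraction


def _to_float(value) -> float:
    """Accept a number-like value or a (numerator, denominator) pair."""
    if isinstance(value, (tuple, list)):
        numerator, denominator = value
        # Some files carry a zero denominator; treat that component as 0.
        return float(numerator) / float(denominator) if denominator else 0.0
    return float(value)


def dms_to_decimal(dms, ref: str) -> float:
    """Convert (degrees, minutes, seconds) plus a hemisphere ref to decimal degrees."""
    degrees, minutes, seconds = (_to_float(part) for part in dms)
    decimal = degrees + minutes / 60.0 + seconds / 3600.0
    # South and West are negative in the signed decimal convention.
    return -decimal if ref.strip().upper() in ("S", "W") else decimal


if __name__ == "__main__":
    print(dms_to_decimal(((40, 1), (26, 1), (46, 1)), "N"))
    print(dms_to_decimal((Fraction(79), 58, 56.0), "W"))
```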

Accomplishments that we're proud of

  • Multi-layer steganography decoder that catches stylesuxx-encoded messages, LSB payloads, trailing file data, and embedded files (PNG inside JPEG? We extract it).
  • Self-contained HTML reports with base64 thumbnails—no external assets, just one file you can share.
  • Zero-crash scanning. Every module is wrapped to fail gracefully. Missing Tesseract? YOLO model not found? Vision API down? VIGIL keeps running.
  • BIP39 seed phrase detection. We literally catch screenshots of crypto recovery phrases.

What we learned

  • Image forensics is deep. Every pixel can hide data, every metadata field can leak location, every barcode can redirect to malware.
  • Real-world OCR needs preprocessing—raw image-to-text rarely works out of the box.
  • API fallbacks matter. Cloud services fail; local inference saves the day.
  • Hackathon code can still be clean code. Dataclasses + Jinja2 + modular functions = maintainable chaos.

What's next for VIGIL

  • Direct in-GUI upload of the Live photos a user wants to scan
  • Live camera feed scanning for real-time threat detection
  • Browser extension to scan images before download
  • Video frame analysis for surveillance footage intel
  • Malware signature matching in QR/barcode payloads
  • PDF and document scanning beyond just images
  • Enterprise dashboard for bulk organizational audits
