SafeSight AI is a real-time safety intelligence desktop app for detecting people, masks, and distancing violations from images, videos, or live webcams — powered by YOLOv8 ONNX and OpenCV.
Features
- 🛡️ Multi-mode input: Image, Video File, or Live Webcam
- ⚡ Real-time YOLOv8n ONNX person detection on CPU (no PyTorch required)
- 😷 Face mask classification with OpenCV DNN + MobileNetV2 ONNX
- 📏 Social distancing enforcement with homography + metre calibration
- 🛰️ Bird's-eye view map with violation lines and labeled dots
- 📊 Live metrics bar (people, compliance %, mask violations, distance violations)
- 📈 Streaming violations chart for video/webcam sessions
- 🎯 Adjustable confidence threshold and minimum safe distance
- 🧵 QThread-powered pipeline keeps the UI fully responsive
- 💾 Save annotated images or rendered videos
- 🧠 Smart fallbacks when a model is missing (no blank screens)
- 🧩 Clean PyQt6 UI with custom frameless chrome
Screenshots
- Main UI overview (the only screenshot currently available)
- Detection overlays + metrics (same screenshot until more are added)
Demo / How to Run
python3 main.py
Installation
- `git clone https://github.com/Vihaan-Singhal1/safesight-AI.git`
- `cd safesight-AI`
- `python3 -m venv venv && source venv/bin/activate` (Mac/Linux) OR `venv\Scripts\activate` (Windows)
- `pip install -r requirements.txt`
- `mkdir -p models && curl -L "https://huggingface.co/webml/yolov8n/resolve/main/onnx/yolov8n.onnx" -o models/yolov8n.onnx`
- `python3 main.py`
Usage
Image Mode
- Select Image in the sidebar.
- Click Browse… and choose a .jpg/.png/.webp file.
- Adjust Confidence Threshold and Min. Safe Distance if needed.
- Click Analyse to run detection.
- Click Save Output to export the annotated image.
Video Mode
- Select Video File.
- Click Browse… and choose a .mp4/.mov/.avi/.mkv file.
- Click Analyse to start playback with overlays.
- Use Stop anytime; click Save Output to export the annotated video.
Webcam Mode
- Select Webcam (Live).
- Choose the Camera index (0–9).
- Click Analyse to start live detection.
- Click Stop to end the session; Save Output writes the latest frame/video.
Settings Reference
| Setting | Type | Range | Default | Effect |
|---|---|---|---|---|
| Input Source | Radio | Image / Video / Webcam | Image | Chooses pipeline mode and source type |
| Camera Index | Spinbox | 0–9 | 0 | Selects webcam device ID |
| Confidence Threshold | Slider | 0.10–1.00 | 0.50 | Filters low-confidence person detections |
| Min. Safe Distance | Slider | 0.50–5.00 m | 1.50 m | Triggers distance violations below this threshold |
| Show Distance Lines | Checkbox | On/Off | On | Draws violation lines between people |
| Show Bird's-Eye View | Checkbox | On/Off | On | Displays top-down map of positions |
How It Works
1. Person Detection
YOLOv8n is exported to ONNX and executed with onnxruntime on CPU. Frames are letterboxed to 640, run through the model, and filtered with NMS to keep clean person boxes. No GPU, PyTorch, or Ultralytics runtime required.
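The letterbox and NMS steps can be illustrated without any model loaded. The following is a minimal pure-Python sketch, not the code in `core/detector.py`: the function names, the square 640×640 target, the centred padding, and the 0.45 IoU threshold are all assumptions made for illustration. Only the 0.50 confidence default comes from the settings table.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def letterbox_params(width: int, height: int, size: int = 640) -> Tuple[float, float, float]:
    """Return (scale, pad_x, pad_y) for fitting a frame into a size x size square.

    Assumes the resized frame is centred, with equal padding on both sides.
    """
    scale = min(size / width, size / height)
    pad_x = (size - width * scale) / 2
    pad_y = (size - height * scale) / 2
    return scale, pad_x, pad_y


def to_original(box: Box, scale: float, pad_x: float, pad_y: float) -> Box:
    """Map a box from letterboxed coordinates back to the source frame."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        conf_threshold: float = 0.50, iou_threshold: float = 0.45) -> List[int]:
    """Greedy non-maximum suppression; returns kept indices, best score first."""
    # Drop low-confidence detections, then visit the rest from best to worst.
    order = sorted((i for i, s in enumerate(scores) if s >= conf_threshold),
                   key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap heavily with one already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept
```

Raising `conf_threshold` in this sketch has the same effect as moving the Confidence Threshold slider: fewer, higher-confidence person boxes reach the NMS stage.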
2. Face Mask Detection
Each person crop is scanned using OpenCV's Caffe face detector, then classified by a MobileNetV2 ONNX head. The UI surfaces a three-class outcome: Mask On, No Mask, or Unknown (when no face/model is available).
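The three-class outcome can be read as a small decision rule. The sketch below is hypothetical: it assumes the classifier yields a single "mask" probability and that a 0.5 cut-off separates the two confident classes, neither of which is stated in this README. `MaskStatus` and `mask_status` are illustrative names, not identifiers from the project.

```python
from enum import Enum
from typing import Optional


class MaskStatus(Enum):
    MASK_ON = "Mask On"
    NO_MASK = "No Mask"
    UNKNOWN = "Unknown"


def mask_status(face_found: bool, mask_prob: Optional[float],
                threshold: float = 0.5) -> MaskStatus:
    """Map detector and classifier results to the label shown in the UI.

    mask_prob is None when the mask model is unavailable (assumed convention).
    """
    # No face in the person crop, or no classifier loaded: report Unknown
    # rather than guessing.
    if not face_found or mask_prob is None:
        return MaskStatus.UNKNOWN
    return MaskStatus.MASK_ON if mask_prob >= threshold else MaskStatus.NO_MASK
```

Treating a missing model the same way as a missing face keeps the pipeline running with person detection alone, which matches the fallback behaviour listed under Features.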
3. Social Distancing
Bottom-centre foot-points of each person box are projected through a perspective homography into a bird's-eye plane. Pairwise distances are computed in real-world metres using calibrated pixels-per-metre scaling, then rendered as violation lines and colored dots.
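A rough pure-Python sketch of that projection and distance check follows. It assumes a 3×3 homography `H` is already known (for example from a one-off calibration) and that `pixels_per_metre` describes the bird's-eye plane; the function names are illustrative and are not taken from `core/distancing.py`.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels
Point = Tuple[float, float]


def foot_point(box: Box) -> Point:
    """Bottom-centre of a person box, taken as the ground contact point."""
    x1, _, x2, y2 = box
    return ((x1 + x2) / 2, y2)


def project(H: Sequence[Sequence[float]], point: Point) -> Point:
    """Apply a 3x3 homography to an image point (homogeneous divide included)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def find_violations(boxes: Sequence[Box], H: Sequence[Sequence[float]],
                    pixels_per_metre: float,
                    min_distance_m: float = 1.5) -> List[Tuple[int, int, float]]:
    """Return (i, j, distance_in_metres) for every pair closer than the minimum."""
    ground = [project(H, foot_point(b)) for b in boxes]
    violations = []
    for i, j in combinations(range(len(ground)), 2):
        metres = math.dist(ground[i], ground[j]) / pixels_per_metre
        if metres < min_distance_m:
            violations.append((i, j, metres))
    return violations
```

Measuring distances after the projection rather than in raw image pixels is what makes the Min. Safe Distance slider meaningful in metres: two people far from the camera appear close together in the frame but are separated correctly on the ground plane.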
Architecture
.
├── __main__.py — python -m safesight launcher
├── main.py — Qt application entry point
├── convert_model.py — builds mask_detector.onnx from Keras weights
├── requirements.txt — Python dependencies
├── assets/
│ ├── banner.svg — README hero banner
│ ├── screenshots/
│ │ └── image.png — UI screenshot
│ └── icons/
│ └── .gitkeep — keeps icons directory in git
├── core/
│ ├── __init__.py — core package init
│ ├── annotator.py — supervision-based drawing + labels
│ ├── detector.py — YOLOv8 ONNX + mask pipeline
│ ├── distancing.py — homography + bird's-eye map
│ └── video_thread.py — threaded processing pipeline
├── ui/
│ ├── __init__.py — ui package init
│ ├── main_window.py — frameless window + signal wiring
│ ├── sidebar.py — input controls + settings
│ ├── display_panel.py — main preview, bird's-eye, charts
│ ├── metrics_bar.py — realtime metrics cards
│ └── styles.qss — Qt stylesheet
├── models/
│ ├── yolov8n.onnx — person detector (auto-download)
│ ├── mask_detector.onnx — mask classifier (MobileNetV2)
│ ├── _mobilenetv2_base.onnx — base ONNX for conversion script
│ └── face_detector/
│ ├── deploy.prototxt — Caffe face detector config
│ └── res10_300x300_ssd_iter_140000.caffemodel — Caffe weights
├── output_videos/
│ └── IMG_0872.mp4 — sample output video
└── venv/ — local virtualenv (not committed)
Tech Stack
| Component | Technology | Why This Choice |
|---|---|---|
| GUI | PyQt6 | Native desktop UI with strong threading support (QThread) |
| Person Detection | YOLOv8n ONNX | Fast inference without PyTorch |
| Mask Detection | OpenCV DNN (Caffe face detector) + MobileNetV2 ONNX | Lightweight face detection and mask classification pipeline |
| Distancing | NumPy + OpenCV homography | Accurate real-world coordinate mapping |
| Video Processing | OpenCV VideoCapture | Reliable cross-platform capture |
| Annotation | Supervision (Roboflow) | Clean, consistent overlays |
| Graphing | pyqtgraph | Real-time charts inside Qt |
| Runtime | onnxruntime | Cross-platform CPU inference |
Models
| Model | Format | Size | Purpose | Auto-download |
|---|---|---|---|---|
| yolov8n.onnx | ONNX | ~12 MB | Person detection | Yes (on first run) |
| mask_detector | Caffe (face) + ONNX (mask) | ~4 MB + ~3–5 MB | Mask classification | Bundled |
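To confirm which model files are in place before launching, a small pre-flight check along these lines may help. It is a hypothetical helper, not part of the project: the relative paths are copied from the Architecture tree, while `MODEL_FILES` and `model_status` are names invented for this sketch.

```python
from pathlib import Path
from typing import Dict

# Relative paths as listed in the Architecture tree.
MODEL_FILES = {
    "person_detector": "models/yolov8n.onnx",
    "mask_classifier": "models/mask_detector.onnx",
    "face_config": "models/face_detector/deploy.prototxt",
    "face_weights": "models/face_detector/res10_300x300_ssd_iter_140000.caffemodel",
}


def model_status(root: Path) -> Dict[str, bool]:
    """Report, for each expected model file, whether it exists under root."""
    return {name: (root / rel).is_file() for name, rel in MODEL_FILES.items()}


if __name__ == "__main__":
    # Run from the repository root to list any missing files.
    for name, present in model_status(Path(".")).items():
        print(f"{name}: {'found' if present else 'MISSING'}")
```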
Datasets & Training (Separate Branches)
Model training assets are stored on separate branches to keep this repo clean:
- datasets branch: raw training data and annotations
- model-training branch: training scripts, Jupyter notebooks, experiment logs

To explore the training pipeline: `git checkout datasets` / `git checkout model-training`
Project Background
Originally built during COVID-19 (2020) using YOLOv3 and a basic Tkinter UI. Rebuilt from scratch in 2026 with:
- YOLOv3 → YOLOv8n (3x faster, better accuracy)
- Tkinter → PyQt6 (professional desktop app)
- Pixel distance → Perspective homography (real metre estimates)
- Basic CNN → MobileNetV2 DNN (lighter, more accurate)
- No threading → QThread pipeline (fully responsive UI)
- Partially implemented → Fully working including live webcam
Contributing
Issues and pull requests are welcome. Please keep changes focused, document new behavior, and include screenshots for UI changes.
License
MIT