VisionProbe: A Zero-Shot AI System

Inspiration

We were inspired by a simple but common real-world problem: “How do you quickly find a specific object in a camera feed?” Traditional detectors recognize only a fixed set of classes (person, bag, bottle, and so on) and fail when users need to search for unique, custom, or never-before-seen objects. This gap in surveillance, retail security, and industrial monitoring motivated us to build VisionProbe, a system that can find any object from just an image or a text prompt, instantly and without retraining.

What it does

VisionProbe performs zero-shot detection, real-time tracking, and cross-camera re-identification. Users provide a text prompt like “blue helmet” or an image of an object, and the system immediately searches for it across live video feeds. It offers:

Open-world detection
Multi-object tracking
Object memory & heatmaps
Trajectory prediction
Cross-camera identity matching
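To make the matching idea concrete, here is a minimal sketch of how a query (from a text prompt or a reference image) could be compared against candidate detections from several cameras. It assumes each query and each detected region has already been turned into an embedding vector by a model such as CLIP or DINOv2, and it scores them with cosine similarity. The project description does not say how VisionProbe actually scores matches, so the similarity measure, the threshold, and the names `cosine_similarity`, `match_query`, and the `cam1_box0`-style IDs are all illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_query(query_embedding, candidates, threshold=0.8):
    """Return (candidate_id, score) pairs scoring at least `threshold`, best first.

    `candidates` maps a hypothetical detection ID to its embedding vector.
    """
    scored = [
        (cid, cosine_similarity(query_embedding, emb))
        for cid, emb in candidates.items()
    ]
    hits = [(cid, score) for cid, score in scored if score >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits


# Toy 3-dimensional embeddings standing in for real model outputs (assumed values).
query = [1.0, 0.0, 0.0]
candidates = {
    "cam1_box0": [0.9, 0.1, 0.0],  # close to the query
    "cam1_box1": [0.0, 1.0, 0.0],  # unrelated object
    "cam2_box0": [0.8, 0.0, 0.6],  # same object seen from another camera
}

matches = match_query(query, candidates, threshold=0.75)
for candidate_id, score in matches:
    print(f"{candidate_id}: {score:.3f}")
```

Because the same scoring function works on any pair of embeddings, the sketch covers both zero-shot search (query versus detections) and cross-camera identity matching (detection versus detection). A real deployment would need a threshold tuned on its own footage.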

Built With

bytetrack
clip
detection
dinov2
groundingdino
multi-object
oc-sort
open-world
openai
opencv
python
pytorch
sam
sam2
segmentation
tracking

Updates

Sarthak Verma started this project — Nov 16, 2025 09:52 AM EST
