Inspiration

We were inspired by a simple but common real-world problem: “How do you quickly find a specific object in a camera feed?” Traditional detectors recognize only a fixed set of classes (person, bag, bottle, etc.) and fail when users need to search for unique, custom, or never-before-seen objects. This gap in surveillance, retail security, and industrial monitoring motivated us to build VisionProbe, a system that can find any object from just an image or a text prompt, instantly and without retraining.

What it does

VisionProbe performs zero-shot detection, real-time tracking, and cross-camera re-identification. Users provide a text prompt like “blue helmet” or an image of an object, and the system immediately searches for it across live video feeds. It offers:

  • Open-world detection
  • Multi-object tracking
  • Object memory & heatmaps
  • Trajectory prediction
  • Cross-camera identity matching
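
To illustrate the cross-camera identity matching idea, here is a minimal sketch of one common approach: compare an appearance embedding of a newly detected object against embeddings stored for tracks from another camera, and accept the best match above a similarity threshold. This is an assumption about how such matching can work, not a description of VisionProbe's actual implementation. The function names, the track IDs, the 0.8 threshold, and the three-dimensional toy vectors are all hypothetical; in a real system the embeddings would come from an image encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_identity(query, gallery, threshold=0.8):
    """Return the gallery track ID most similar to `query`, or None if
    no stored embedding reaches `threshold` (hypothetical default)."""
    best_id, best_score = None, threshold
    for track_id, embedding in gallery.items():
        score = cosine_similarity(query, embedding)
        if score >= best_score:
            best_id, best_score = track_id, score
    return best_id


# Toy embeddings standing in for encoder outputs (illustrative values only).
gallery = {
    "cam1-track7": [0.9, 0.1, 0.0],
    "cam1-track9": [0.0, 1.0, 0.2],
}
query = [0.88, 0.15, 0.02]  # object seen on a second camera
matched = match_identity(query, gallery)
print(matched)
```

Returning None when nothing clears the threshold lets the caller treat the detection as a new identity instead of forcing a match to the closest stored track.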

Built With

  • bytetrack
  • clip
  • detection
  • dinov2
  • groundingdino
  • multi-object
  • oc-sort
  • open-world
  • openai
  • opencv
  • python
  • pytorch
  • sam
  • sam2
  • segmentation
  • tracking