Inspiration

I wanted to build a system that could track and count objects in real time, even under occlusions, lighting changes, and motion.

What it does

OS2D+ takes a query image and a live video feed, detects the target object, tracks it across frames, and outputs real-time counts with persistent IDs.
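To illustrate how persistent IDs turn into counts, here is a minimal sketch. It assumes the tracker emits a list of integer track IDs per frame; the function name `count_unique_tracks` and the sample data are hypothetical and not taken from the OS2D+ code.

```python
from typing import Iterable, List, Set, Tuple


def count_unique_tracks(frames: Iterable[List[int]]) -> Tuple[List[int], int]:
    """Return (objects visible in each frame, total distinct IDs seen so far)."""
    seen: Set[int] = set()
    per_frame: List[int] = []
    for ids in frames:
        # An ID counts once per frame, even if it is reported twice.
        per_frame.append(len(set(ids)))
        # Persistent IDs mean a re-appearing object is not counted again.
        seen.update(ids)
    return per_frame, len(seen)


# Hypothetical tracker output: object 2 is occluded in the middle frame.
frames = [[1, 2], [1], [1, 2, 3]]
per_frame, total = count_unique_tracks(frames)
print(per_frame, total)
```

Because object 2 keeps its ID after the occlusion, the running total only grows when a new ID appears.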

How we built it

I used a shared CNN backbone for feature extraction, feature correlation for matching, a detection head for bounding boxes, re-identification embeddings for occlusions, and OC-SORT for smooth tracking.
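The feature-correlation step can be sketched roughly as follows. This is a simplified illustration, not the project's implementation: it assumes the query has been pooled to a single feature vector and the frame to a small grid of feature vectors, and it uses cosine similarity as the matching score. The names `cosine`, `correlation_map`, and `best_match` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def correlation_map(query_vec: Sequence[float],
                    feature_map: List[List[Sequence[float]]]) -> List[List[float]]:
    """Score every grid cell of the frame features against the query vector."""
    return [[cosine(query_vec, cell) for cell in row] for row in feature_map]


def best_match(corr: List[List[float]]) -> Tuple[int, int]:
    """Return (row, col) of the highest-scoring cell."""
    best_pos, best_score = (0, 0), -math.inf
    for r, row in enumerate(corr):
        for c, score in enumerate(row):
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos


# Toy 2x2 feature grid with 2-dimensional features (made-up values).
query_vec = [1.0, 0.0]
feature_map = [[[0.0, 1.0], [1.0, 0.0]],
               [[1.0, 1.0], [-1.0, 0.0]]]
corr = correlation_map(query_vec, feature_map)
print(best_match(corr))
```

In a real pipeline the correlation map would feed the detection head, which turns high-scoring regions into bounding boxes.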

Challenges we ran into

The main challenges were sustaining real-time performance, maintaining IDs through occlusion, tuning detection thresholds, and integrating the tracker with the detector.
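As a rough illustration of why the threshold matters for keeping IDs through occlusion, the sketch below matches a new detection against tracks that were recently lost. It assumes L2-normalized embeddings (so the dot product equals cosine similarity) and a hypothetical `min_similarity` of 0.7; neither the function name nor the value comes from OS2D+ or OC-SORT.

```python
from typing import Dict, Optional, Sequence


def reidentify(embedding: Sequence[float],
               lost_tracks: Dict[int, Sequence[float]],
               min_similarity: float = 0.7) -> Optional[int]:
    """Return the ID of the most similar lost track, or None if none is close enough.

    Assumes all embeddings are L2-normalized, so the dot product is the cosine similarity.
    """
    best_id: Optional[int] = None
    best_sim = min_similarity
    for track_id, stored in lost_tracks.items():
        sim = sum(x * y for x, y in zip(embedding, stored))
        if sim >= best_sim:
            best_id, best_sim = track_id, sim
    return best_id


# Hypothetical embeddings of two tracks that disappeared behind an occluder.
lost = {7: [1.0, 0.0], 9: [0.0, 1.0]}
match = reidentify([0.8, 0.6], lost)      # close to track 7
no_match = reidentify([0.6, -0.8], lost)  # below the threshold for both
print(match, no_match)
```

Setting the threshold too low merges different objects under one ID, while setting it too high creates a new ID after every occlusion and inflates the count.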

Accomplishments that we're proud of

I built a robust end-to-end pipeline that maintains object identity and counts accurately in real time.

What we learned

I gained hands-on experience in query-based detection, re-identification, and real-time object tracking optimization.

What's next for OS2D+

I plan to add multi-object queries, transformer backbones, improved embeddings, and edge-device deployment.
