NTCS: NextGen Traffic Challan System

Inspiration

In developing nations like India, traffic management faces a critical infrastructure gap. While developed countries rely on expensive hardware such as LiDAR and radar guns for speed enforcement, our roads are primarily monitored by standard, uncalibrated CCTV cameras.

These cameras can see who is driving, but they cannot tell how fast they are going.

We were inspired to bridge this gap not by adding more hardware, but by building smarter software. We asked ourselves: can we turn a simple, low-cost CCTV feed into a precise speed detection instrument using Generative AI? This question led to the birth of NTCS, a system designed to democratize road safety technology.

What it does

NTCS is an end-to-end intelligent traffic monitoring system that transforms standard video feeds into calibrated speed detection units.

  • Auto-Calibration: It tackles one of the hardest problems in single-camera computer vision, depth perception, by using Gemini Flash to "look" at the road and estimate physical distances without manual measurements.
  • Precision Tracking: It tracks vehicles in real time, calculates their speed, and identifies lane violations.
  • Automated Evidence Generation: When a violation occurs, it instantly packages a 5-second video clip, an annotated snapshot, and the vehicle metadata.
  • Gemini-Powered OCR: It reads number plates from difficult angles and blurry frames where standard OCR engines fail, ensuring high accuracy for challan generation.
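
One way to have a 5-second clip ready the moment a violation fires is to keep a rolling buffer of recent frames. The sketch below is illustrative only: the class name EvidenceBuffer, its fields, and the 25 fps figure are assumptions, not the project's actual code, and the frames are placeholder strings rather than real images.

```python
from collections import deque


class EvidenceBuffer:
    """Keeps the most recent `seconds` of frames so a clip can be cut on demand."""

    def __init__(self, fps: int, seconds: float = 5.0):
        self.fps = fps
        # A deque with maxlen silently drops the oldest frame once it is full.
        self.frames = deque(maxlen=int(fps * seconds))

    def push(self, frame_index: int, frame) -> None:
        self.frames.append((frame_index, frame))

    def package(self, track_id: int, speed_kmh: float, plate: str) -> dict:
        """Bundle the buffered clip, a snapshot, and metadata for one violation."""
        clip = list(self.frames)
        return {
            "clip_frame_indices": [i for i, _ in clip],
            "snapshot": clip[-1][1] if clip else None,
            "metadata": {"track_id": track_id, "speed_kmh": speed_kmh, "plate": plate},
        }


buffer = EvidenceBuffer(fps=25)          # assumed frame rate
for i in range(300):                     # 12 seconds of placeholder frames
    buffer.push(i, f"frame-{i}")
evidence = buffer.package(track_id=7, speed_kmh=72.4, plate="UNKNOWN")
```

Because the buffer is bounded, memory use stays constant however long the stream runs; only the last 125 frames (5 s at 25 fps) are ever held.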

How we built it

We built NTCS as a hybrid AI system, combining deterministic Computer Vision with probabilistic Generative AI.

1. The Vision Pipeline (Python & OpenCV)

  • Scene Understanding: We used DeepLabV3+ for semantic segmentation to mask the drivable road area, ensuring we only track vehicles on the actual road.
  • Detection & Tracking: We implemented YOLOv8 for vehicle detection, coupled with ByteTrack and Kalman Filters. This combination allows us to maintain stable vehicle IDs even when cars overlap or are temporarily occluded in heavy traffic.
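
The road-masking step can be pictured as a simple filter applied to detector output. The following is a minimal sketch under stated assumptions: the mask is a plain 2D list of 0/1 values standing in for a DeepLabV3+ segmentation result, boxes are (x1, y1, x2, y2) pixel tuples standing in for YOLOv8 detections, and the helper names on_road and filter_detections are hypothetical.

```python
def on_road(box, road_mask):
    """Return True if the box's bottom-center point lies on a road pixel.

    box is (x1, y1, x2, y2) in pixels; road_mask[y][x] is 1 for road, 0 otherwise.
    """
    x1, y1, x2, y2 = box
    cx = int((x1 + x2) / 2)
    cy = int(y2)
    height, width = len(road_mask), len(road_mask[0])
    # Clamp so boxes touching the frame edge do not index out of range.
    cx = min(max(cx, 0), width - 1)
    cy = min(max(cy, 0), height - 1)
    return road_mask[cy][cx] == 1


def filter_detections(boxes, road_mask):
    """Keep only the boxes whose ground-contact point is on the road."""
    return [b for b in boxes if on_road(b, road_mask)]


# Toy 6x8 mask: the right half of the frame (columns 4..7) is road.
road_mask = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(6)]
boxes = [(0, 1, 2, 4), (5, 1, 7, 5)]     # first is off the road, second is on it
kept = filter_detections(boxes, road_mask)
```

The bottom-center of the box is used because it approximates where the vehicle touches the ground; the box center can fall on a non-road pixel for tall vehicles seen at an angle.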

2. The Intelligence Layer (Gemini 3 Flash)

  • Spatial Reasoning: This is our core innovation. We feed road frames to Gemini and ask it to estimate the camera's perspective and the dimensions of tracked vehicles (e.g., "This is a sedan, approx. 4.5m long"). We use these estimates to mathematically derive the Pixels-Per-Meter (PPM) ratio, allowing us to calculate speed without physical sensors.
  • Cognitive OCR: For evidence processing, we send the violation crop to Gemini Vision. Its multimodal capabilities allow it to "read" number plates that are dirty, angled, or low-resolution.
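
The arithmetic behind the PPM idea is short enough to sketch. This is a simplified illustration, not the project's implementation: it assumes a single scale factor that holds near the spot where the reference vehicle was measured, whereas a real scene needs a perspective correction because the scale changes with distance from the camera. The function names and the sample numbers are made up for the example.

```python
def pixels_per_meter(length_px: float, length_m: float) -> float:
    """Scale factor from one vehicle whose real length has been estimated."""
    if length_px <= 0 or length_m <= 0:
        raise ValueError("lengths must be positive")
    return length_px / length_m


def speed_kmh(displacement_px: float, frames_elapsed: int, fps: float, ppm: float) -> float:
    """Convert a pixel displacement over a number of frames into km/h."""
    meters = displacement_px / ppm
    seconds = frames_elapsed / fps
    return meters / seconds * 3.6        # m/s to km/h


# A sedan estimated at 4.5 m that spans 180 px gives 40 px per metre.
ppm = pixels_per_meter(length_px=180.0, length_m=4.5)
# 400 px travelled in 15 frames at 30 fps: 10 m in 0.5 s, i.e. 20 m/s.
v = speed_kmh(displacement_px=400.0, frames_elapsed=15, fps=30.0, ppm=ppm)
```

Any error in the estimated vehicle length carries straight through to the speed: a 5% error in length_m produces a 5% error in the result, so averaging the estimate over several vehicles is a natural refinement.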

3. The Full-Stack App

  • Frontend: Built with React, offering a dashboard for live monitoring, manual/auto calibration controls, and evidence review.
  • Infrastructure: The system is containerized with Docker and stores violation evidence securely on Azure Blob Storage.

Challenges we ran into

  • The "Calibration" Paradox: To measure speed, you need distance. To get distance from a 2D image, you usually need to physically measure the road. We struggled to make this "software-only."
    • Solution: We utilized Gemini's "world knowledge." By tracking a standard vehicle (like a popular car model) and asking Gemini to estimate its size and position, we could reverse-engineer the road's geometry using perspective math.
  • Occlusion in Traffic: Indian roads are chaotic. Vehicles constantly block each other.
    • Solution: Standard trackers failed. We implemented ByteTrack, which utilizes low-confidence detection boxes (that other trackers throw away) to keep tracking vehicles even when they are partially hidden.
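
The two-stage idea described above can be sketched in a few lines. This is a deliberately simplified illustration and not ByteTrack itself: the real algorithm matches against Kalman-predicted boxes and uses an optimal assignment step, while this sketch uses the last known box and greedy matching. The threshold values are illustrative assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, detections, min_iou):
    """Greedily pair each track with its best unclaimed detection."""
    matches, used = {}, set()
    for track_id, track_box in tracks.items():
        best, best_iou = None, min_iou
        for i, (box, _score) in enumerate(detections):
            if i in used:
                continue
            overlap = iou(track_box, box)
            if overlap >= best_iou:
                best, best_iou = i, overlap
        if best is not None:
            matches[track_id] = detections[best][0]
            used.add(best)
    return matches


def associate(tracks, detections, high=0.5, low=0.1, min_iou=0.3):
    """Match high-confidence boxes first, then retry leftovers with low-confidence ones."""
    high_dets = [d for d in detections if d[1] >= high]
    low_dets = [d for d in detections if low <= d[1] < high]
    matches = greedy_match(tracks, high_dets, min_iou)
    remaining = {t: b for t, b in tracks.items() if t not in matches}
    matches.update(greedy_match(remaining, low_dets, min_iou))
    return matches


tracks = {1: (10, 10, 50, 50), 2: (100, 10, 140, 50)}
detections = [((12, 11, 52, 51), 0.9),      # clear view of vehicle 1
              ((102, 12, 141, 50), 0.25)]   # vehicle 2, partly occluded
matched = associate(tracks, detections)
```

In this toy frame, a tracker that discarded every box below 0.5 confidence would lose vehicle 2; the second pass recovers it because the low-confidence box still overlaps the existing track.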

Accomplishments that we're proud of

  • LLM-Calibrated Speed System (to our knowledge, the first of its kind): We successfully demonstrated that an LLM can be used for physics calibration, not just text generation.
  • High-Accuracy OCR: Our Gemini-assisted OCR significantly outperformed standard Python OCR libraries (like EasyOCR) on real-world, low-quality CCTV footage.
  • Full Production Pipeline: We didn't just build a model in a notebook; we built a deployable full-stack application with a React dashboard, live video streaming, and cloud storage.

What we learned

  • LLMs have "Spatial Intuition": We learned that modern multimodal models possess a surprising amount of understanding about 3D space and object permanence, which can be leveraged for mathematical tasks.
  • Hybrid AI is the Future: Pure GenAI is too slow for real-time tracking, and pure CV is too "dumb" for calibration. Combining them (CV for speed, GenAI for reasoning) creates a system that is both fast and smart.

What's next for NTCS

  • Self-Healing Infrastructure (Agentic AI): We plan to integrate OpenClaw (Moltbot) to create an autonomous agent that monitors camera health. If the camera shifts due to wind, the agent will detect the drift and automatically trigger the Gemini calibration script to fix it.
  • Night Vision Mode: Implementing low-light enhancement models (like Zero-DCE) to allow speed detection at night.
  • Edge Deployment: Optimizing the pipeline to run on edge devices like NVIDIA Jetson for decentralized deployment in remote areas.
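
As a rough idea of what the planned drift check could look like, the sketch below compares the positions of a few static landmarks (for example lane-marking corners) between a reference frame and the current frame. Everything here is an assumption made for illustration: the function names, the 5-pixel threshold, and the premise that landmark positions are already available from some feature matcher.

```python
from statistics import median


def camera_drift_px(reference_points, current_points):
    """Median displacement (in pixels) of static landmarks between two frames."""
    if len(reference_points) != len(current_points) or not reference_points:
        raise ValueError("need the same non-zero number of points in both lists")
    distances = [
        ((cx - rx) ** 2 + (cy - ry) ** 2) ** 0.5
        for (rx, ry), (cx, cy) in zip(reference_points, current_points)
    ]
    return median(distances)


def needs_recalibration(reference_points, current_points, threshold_px=5.0):
    """True when the landmarks have moved further than the allowed tolerance."""
    return camera_drift_px(reference_points, current_points) > threshold_px


reference = [(100, 200), (400, 220), (250, 50)]
shifted = [(106, 208), (406, 228), (256, 58)]   # every landmark moved by (6, 8)
drift = camera_drift_px(reference, shifted)
```

The median is used rather than the mean so that one mismatched landmark, such as a point briefly covered by a passing truck, does not trigger a recalibration on its own.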
