🧠 Inspiration

Visual inspections are still largely manual, slow, and prone to human error. In industries like manufacturing, infrastructure, and brand compliance, even a small unnoticed defect or deviation can lead to huge costs, safety risks, or brand damage. We wanted to build a universal visual change detector that can automatically spot differences across time or versions — just like the human eye, but faster, more consistent, and scalable.


⚙️ What it does

VizCom (Visual Difference Engine) analyzes two or more images of the same scene over time, detects visual changes, classifies their type (e.g., defect, corrosion, missing part, logo mismatch), and highlights them for quick review. It outputs both visual overlays and structured reports, making it easy to act on findings. VizCom works across multiple domains — from factory floor inspections to monitoring infrastructure health or ensuring brand compliance in retail displays.

Key features:

  • Detects pixel-level changes using deep learning–based change segmentation.
  • Classifies type and severity of visual differences.
  • Visualizes diffs via an intuitive side-by-side viewer.
  • Generates automated reports (JSON/CSV/PDF) for audit and traceability.
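To illustrate the structured-report output, here is a minimal sketch of what a JSON report for detected change regions could look like. The field names and the `build_report` helper are hypothetical, for illustration only, not the actual VizCom schema:

```python
import json

# Hypothetical sketch of a VizCom-style structured report.
# Field names are illustrative, not the real schema.
def build_report(image_pair, regions):
    """Serialize detected change regions into a JSON-ready report."""
    return {
        "before": image_pair[0],
        "after": image_pair[1],
        "num_changes": len(regions),
        "changes": [
            {
                "bbox": r["bbox"],          # [x, y, width, height] in pixels
                "type": r["type"],          # e.g. "defect", "corrosion"
                "severity": r["severity"],  # e.g. "low" | "medium" | "high"
            }
            for r in regions
        ],
    }

regions = [
    {"bbox": [120, 40, 32, 18], "type": "corrosion", "severity": "high"},
    {"bbox": [300, 210, 12, 12], "type": "missing part", "severity": "medium"},
]
report = build_report(("before.png", "after.png"), regions)
print(json.dumps(report, indent=2))
```

A flat, key-per-finding layout like this keeps the JSON trivially convertible to CSV rows for audit trails.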

🛠️ How we built it

  • Data Pipeline: Created paired datasets of “before” and “after” images, including synthetic changes to simulate real-world variations.
  • Preprocessing: Used image alignment via OpenCV (ORB + RANSAC) and photometric normalization to handle viewpoint and lighting differences.
  • Model: Built a Siamese U-Net for pixel-level change detection, fine-tuned with PyTorch using contrastive loss.
  • Change Classification: Added a lightweight CNN (ResNet-based) to classify each detected change region.
  • Backend & API: Flask-based microservice for inference and report generation.
  • Frontend: React + Tailwind dashboard for visualization, built with an interactive image comparison slider and timeline view.
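As a simplified, classical stand-in for the pipeline above (the real system uses the aligned Siamese U-Net for segmentation), the core idea of photometric normalization followed by pixel-level differencing can be sketched in a few lines of NumPy. The threshold and image sizes here are illustrative, not values from the actual model:

```python
import numpy as np

def normalize(img):
    """Photometric normalization: zero mean, unit variance per image.
    A simple proxy for handling global lighting differences."""
    img = img.astype(np.float64)
    return (img - img.mean()) / (img.std() + 1e-8)

def change_mask(before, after, thresh=3.0):
    """Crude pixel-level change mask: threshold the absolute difference
    of the normalized images. A stand-in for learned segmentation."""
    diff = np.abs(normalize(before) - normalize(after))
    return diff > thresh

# Toy scene: a global brightness shift (no real change) that
# normalization cancels out, plus one localized "defect".
rng = np.random.default_rng(0)
before = rng.normal(100, 5, size=(64, 64))
after = before + 20.0            # global lighting change only
after[10:14, 20:24] += 80.0      # 4x4 defect patch
mask = change_mask(before, after)
print(mask.sum(), "changed pixels")
```

Note that without the normalization step, the +20 brightness shift alone would flag the entire frame, which mirrors the false-positive problem described below.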

🧩 Challenges we ran into

  • Aligning images captured from slightly different angles or lighting conditions was harder than expected — small registration errors caused false positives.
  • Labeling change types consistently across domains required careful curation and data balancing.
  • Optimizing model inference speed for near real-time detection without sacrificing accuracy.
  • Designing a visualization interface that conveys differences clearly without overwhelming the user.

🏆 Accomplishments that we're proud of

  • Achieved over 90% accuracy in detecting and classifying visual changes across diverse test sets.
  • Built an end-to-end working prototype, from image ingestion to visual diff output and structured reporting.
  • Designed a modular pipeline that can plug into manufacturing QA systems or inspection drones.
  • Created a clean, intuitive UI that makes complex visual analysis accessible to non-technical operators.

📚 What we learned

  • Robust image alignment and preprocessing are just as crucial as the deep model itself.
  • Training with synthetic augmentations (e.g., brightness shifts, blur, small occlusions) dramatically improves real-world performance.
  • Effective communication between model outputs and human operators requires thoughtful UX — clarity beats complexity.
  • Building general-purpose visual inspection AI requires balancing domain specificity with flexibility.
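The synthetic augmentations mentioned above can be sketched roughly as follows. This is a minimal NumPy illustration with made-up parameters (shift range, blur kernel, patch size), not the actual training pipeline:

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(img):
    """Hypothetical augmentation sketch: brightness shift, 3x3 box
    blur, and a small random occlusion. Parameters are illustrative."""
    out = img.astype(np.float64)
    # Global brightness shift to mimic lighting changes.
    out += rng.uniform(-20, 20)
    # 3x3 box blur (image borders kept as-is for brevity).
    blurred = out.copy()
    blurred[1:-1, 1:-1] = sum(
        out[1 + dy : out.shape[0] - 1 + dy,
            1 + dx : out.shape[1] - 1 + dx]
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
    ) / 9.0
    out = blurred
    # Small random occlusion: an 8x8 dark patch.
    y = rng.integers(0, out.shape[0] - 8)
    x = rng.integers(0, out.shape[1] - 8)
    out[y : y + 8, x : x + 8] = 0.0
    return np.clip(out, 0, 255)

img = rng.uniform(50, 200, size=(32, 32))
aug = augment(img)
print(aug.shape, float(aug.min()))
```

Applying these perturbations to the "after" image of a synthetic pair teaches the model to ignore nuisance variation and fire only on genuine changes.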

🚀 What's next for VizCom

  • Integrate temporal tracking: extend to continuous monitoring (video or multi-timepoint sequences).
  • Expand dataset: gather real-world labeled data from manufacturing, retail, and infrastructure partners.
  • Deploy at the edge: optimize and quantize models for embedded inspection cameras and drones.
  • Add explainability layer: provide natural-language summaries explaining why a change was flagged.
  • API release: offer VizCom as a cloud-based visual inspection service.

Built With

  • lovable
  • tailwindcss