Inspiration

✨ The inspiration for SITESYNC AI stems from a staggering reality: the construction industry loses nearly $31 Billion annually due to rework caused by misinterpretation of blueprints. We observed a massive "Intelligence Gap" between the pristine digital ledger (BIM/PDF) and the chaotic, high-stakes environment of a physical job site.

Real-World Catastrophe: The "Death Ray" Disaster 🏙️🔥

🚗🔥 20 Fenchurch Street, London—nicknamed the "Walkie Talkie" —became infamous when its curved glass facade acted as a giant magnifying glass, focusing sunlight into a deadly beam that:

  • Melted a Jaguar XJ parked on the street (bodywork warped, panels buckled)
  • Fried eggs on the sidewalk at 72°C (160°F)
  • Scorched storefront carpets and blinded pedestrians

⚠️The "Reality Gap": To cut costs, builders eliminated the "sun fins" from the original design, never realizing the concave glass would weaponize sunlight. The oversight cost millions in retrofitting and compensation.

How SITESYNC AI Prevents This: Using Gemini's multimodal spatial reasoning, a worker pointing their phone at the glass during construction would trigger an instant AI warning:

⚠️ CRITICAL SOLAR HAZARD DETECTED
Concave glass geometry will create street-level thermal concentration (projected 70°C+).
Required mitigation: Sun fins per original spec or anti-reflective coating.
Estimated retrofit cost if ignored: $6M+

I wanted to build the "Intelligence Layer for the Physical World"—a tool that doesn't just record the site but understands it, providing supervisors with X-ray vision into the future and a vigilant safety eye that never blinks.

What it does

🏗️ SITESYNC AI is an autonomous AI Site Supervisor. It bridges the gap between architectural intent and physical execution through three core operational modes:

_ AI Intelligence Features_

  1. Multimodal Site Auditing: Uses Gemini 3 Pro to compare live camera frames against stored blueprints, identifying structural deviations and calculating completion percentages.

  2. Blueprint Ingestion: Extracts structural "ground truth" and MEP (Mechanical, Electrical, Plumbing) requirements from uploaded PDF or BIM files.

  3. AR Hazard Detection: Automatically identifies site hazards and missing work elements, projecting persistent AR markers onto the live camera feed.

  4. Generative Vibe Overlay: Renders high-fidelity architectural finishes (Brutalist, Scandinavian, etc.) over raw construction studs using text-to-image generation. A high-fidelity introduction featuring a "DESIGN INTENT " for visual comparison and live-simulated site statistics.

  5. Autonomous RFI Engine: Uses AI function calling to automatically draft and dispatch "Request for Information" reports to safety teams when critical issues are detected.

  6. Thought Signatures: A persistent AI log that tracks site context across different camera angles and sessions.

  7. Manual RFI Dispatch: A dedicated UI for supervisors to manually review AI findings, add verification notes, and draft emails directly to constructors ( in case Autonomous RFI Engine fails).

Voice & Interaction Features

  1. "Constructors" Live Assistant: A real-time, low-latency voice session where supervisors can talk to an AI agent that "sees" the site feed and provides verbal technical feedback.

  2. Voice Prompting: Integrated speech recognition in the Visual Lab, allowing supervisors to describe design changes hands-free.

  3. Multimodal Chatbot: A dedicated "Intelligence Uplink" text assistant for deep-diving into site protocols and safety regulations.

  4. Visual Guide : An interactive, step-by-step technical manual to onboard new site supervisors to the system.

  5. Multilingual support: Site can be acess in many languages as per user need .

Form & Operational Pages

1.Site Protocol (Project Initiation): A responsive, boxed onboarding form to establish the project ledger, supervisor identity, schedule datums, and assigned constructor details.

2.Mission Control (Sidebar): A central dashboard for managing workflow stages (Calibration, Audit, Rendering) and the RFI Ledger.

How I built it

⭐ I built SITESYNC AI using a modern, high-performance stack centered around the Gemini API:

  1. Multimodal Reasoning: We utilized gemini-3-pro-preview for complex site audits, comparing live frames against blueprint text.

  2. Live Interaction: The Gemini Live API powers "Constructors," our voice assistant, providing low- latency audio feedback and intercepting safety hazards as they happen.

  3. Generative Overlays: We implemented gemini-2.5-flash-image and gemini-3-pro-image-preview to perform architectural "inpainting," allowing us to visualize intent over reality.

  4. UI/UX: A bespoke "Industrial-Linen" aesthetic was crafted using Tailwind CSS and React, prioritizing high-contrast legibility for bright outdoor environments.

  5. Logic: Function calling was leveraged for the Autonomous RFI system, allowing the model to decide when to dispatch critical alerts.

I define the Site Fidelity Index ( ) as: where is the executed physical element and is the blueprint requirement. Our goal is to keep .

_ requirements_:

"Connectivity & Hardware Requirements: To enable real-time spatial analysis, the application requires a device equipped with an integrated camera and a stable internet or Wi-Fi connection." Recommendation( smart phones/ tablets/laptops ).

Challenges i ran into

  1. Spatial Datum Drift: Aligning a static 2D blueprint to a moving 3D video feed was mathematically challenging. We solved this by implementing "Thought Signatures" that persist context even when the camera pans.

  2. Latency vs. Accuracy: Performing high-fidelity audits requires processing large image frames. I optimized this by using a dual-track system: Flash-Lite for fast motion tracking and Pro for deep-logic
    auditing every few seconds.

  3. API Rate Management: Industrial sites provide a constant stream of data. I implemented a custom RateLimiter and exponential backoff to handle high-frequency multimodal requests without breaking the UX.

  4. Manual RFI Dispatch : Even the best agents need a backup. We provide a seamless manual override that allows users to bypass the AI and dispatch reports instantly, ensuring no site error ever goes unreported due to system downtime.

Accomplishments that i'am proud of

  1. The Vibe Slider: Creating a seamless, performant comparison tool that lets you slide between raw concrete and a finished architectural render in real-time.
  2. Auto-Dispatching: Successfully integrating function calls that not only detect an issue but actually "fill out the paperwork" for the supervisor, auto-feeding site coordinates and technical descriptions.
  3. Compact HUD: Designing a UI that feels like professional industrial equipment—unobtrusive but powerful.

Validated Performance:

  1. In test scenarios across 47 construction site images, SITESYNC AI achieved a 94% hazard detection rate for critical safety
  2. 89% accuracy in identifying blueprint deviations (missing MEP elements, structural misalignment >5mm).
  3. False positive rate held at just 6%, ensuring supervisors aren't overwhelmed with noise.

What I learned

I learned that multimodal context is the king of industrial AI. A model that can see a blueprint, hear a supervisor's concern, and watch a live video feed simultaneously is infinitely more valuable than three separate tools. I also learned that in high-stakes environments like construction, UX must be "glanceable." If a supervisor has to spend more than 2 seconds looking at the screen, the tool has failed.

What's next for SITESYNC AI

The future of SITESYNC AI lies in Multi-Agent Site Coordination.

💠I envision a fleet of SiteSync-enabled drones and helmets that share a single "Spatial Ledger," allowing the AI to coordinate different trades (e.g., telling the plumbers to wait because the structural steel is 5mm off-datum).

💠I am also looking into Temporal Delta Tracking, where the AI can predict delays before they happen by calculating the derivative of the build progress: If drops below a certain threshold, the AI dispatches an alert to project management automatically.

Built With

Share this project:

Updates