Inspiration

Commercial HVAC contractors operate on thin margins. A project bid at 15% gross margin routinely closes at 7%. By the time a project manager opens a spreadsheet and notices the overrun, it is already too late.

We built the HVAC Margin Rescue Agent to help contractors catch margin bleed in real time with data-driven actions. A 405-project portfolio generates over 1.4 million records; no PM has time to read them, but an agent does.


What We Built

A fully autonomous AI agent that ingests raw HVAC project data, detects margin erosion across an entire portfolio, explains root causes in plain language, and surfaces dollar-quantified recovery actions.

The pipeline runs in three phases:

Phase 1 — Data Cleaning (Claude + DuckDB)
Claude orchestrates a tool-use loop that calls validated DuckDB queries to clean the raw CSVs: deduplicating labor logs, normalizing 33+ role name variants (e.g. "JM Pipefitter", "Journeyman P.F.", "Pipefitter JM" → one role), and fixing mixed date formats across ten files.
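The cleaning queries Claude issues look roughly like the sketch below. File, table, and column names (labor_logs_all.csv, role, work_date) and the role_map contents are assumptions for illustration; in the real pipeline the role mapping is proposed by Claude inside its tool-use loop rather than hand-written.

```python
import duckdb

con = duckdb.connect()

# Canonical role mapping -- in practice proposed by Claude, not hardcoded.
role_map = {
    "JM Pipefitter": "Journeyman Pipefitter",
    "Journeyman P.F.": "Journeyman Pipefitter",
    "Pipefitter JM": "Journeyman Pipefitter",
}
con.execute("CREATE OR REPLACE TABLE role_map(role_raw VARCHAR, role_canonical VARCHAR)")
con.executemany("INSERT INTO role_map VALUES (?, ?)", list(role_map.items()))

con.execute("""
    CREATE OR REPLACE TABLE labor_clean AS
    SELECT DISTINCT                                    -- drop exact duplicate log rows
        l.* EXCLUDE (role, work_date),
        COALESCE(m.role_canonical, l.role) AS role,    -- collapse role-name variants
        COALESCE(                                      -- reconcile mixed date formats
            try_strptime(l.work_date, '%m/%d/%Y'),
            try_strptime(l.work_date, '%Y-%m-%d')
        )::DATE AS work_date
    FROM read_csv_auto('data/labor_logs_all.csv', all_varchar = true) AS l
    LEFT JOIN role_map AS m ON l.role = m.role_raw
""")
```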

Phase 2 — Aggregation
DuckDB collapses ~1.2M rows into per-project, per-SOV-line summaries using a six-stage SQL pipeline. Key metrics computed (a SQL sketch follows the formulas):

$$\text{Labor Cost} = \sum \left( h_{ST} + h_{OT} \times 1.5 \right) \times r \times \mu$$

$$\text{Variance} = \text{Actual Cost} - \text{Budget}$$

$$\text{Billing Gap} = \%\text{Complete} - \%\text{Billed}$$

$$\text{Realized Margin} = \frac{\text{Contract Value} - \text{Actual Cost}}{\text{Contract Value}}$$
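A single-pass sketch of these formulas, assuming hypothetical table and column names (labor_clean, sov_lines; st_hours/ot_hours for the straight-time and overtime hours, rate for r, burden for the multiplier μ) and reducing "actual cost" to labor cost for brevity; the real pipeline spreads this work across six stages.

```python
import duckdb

con = duckdb.connect()

con.execute("""
    CREATE OR REPLACE TABLE sov_summary AS
    SELECT
        s.project_id,
        s.sov_line_id,
        -- Labor Cost = sum((h_ST + h_OT * 1.5) * r * mu)
        SUM((l.st_hours + l.ot_hours * 1.5) * l.rate * l.burden)            AS labor_cost,
        -- Variance = Actual Cost - Budget (actual cost reduced to labor cost here)
        SUM((l.st_hours + l.ot_hours * 1.5) * l.rate * l.burden) - s.budget AS variance,
        -- Billing Gap = % Complete - % Billed
        s.pct_complete - s.pct_billed                                       AS billing_gap,
        -- Realized Margin = (Contract Value - Actual Cost) / Contract Value
        (s.contract_value - SUM((l.st_hours + l.ot_hours * 1.5) * l.rate * l.burden))
            / NULLIF(s.contract_value, 0)                                   AS realized_margin
    FROM sov_lines AS s
    LEFT JOIN labor_clean AS l USING (project_id, sov_line_id)
    GROUP BY s.project_id, s.sov_line_id, s.budget,
             s.pct_complete, s.pct_billed, s.contract_value
""")
```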

Aggregated summaries are written to a persistent Supabase (Postgres + pgvector) table for the agent to read from.

Phase 3 — Claude as Analyst (RAG)
Field notes are embedded via the Voyage API and stored as vector(1024) columns. At analysis time, Claude retrieves the most semantically relevant notes via cosine similarity, combines them with the structured summaries, and generates a proactive findings report with prioritized recovery actions.
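A minimal retrieval sketch, assuming a field_notes(note_text, embedding vector(1024)) table, the voyage-3 embedding model (1024-dim), and a SUPABASE_DB_URL connection string; the returned notes are placed into Claude's prompt alongside the structured summaries.

```python
import os
import psycopg
import voyageai

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment

def top_field_notes(question: str, k: int = 8) -> list[tuple[str, float]]:
    """Return the k field notes most similar to the question, with cosine similarity."""
    emb = vo.embed([question], model="voyage-3", input_type="query").embeddings[0]
    vec = "[" + ",".join(str(x) for x in emb) + "]"   # pgvector literal format
    with psycopg.connect(os.environ["SUPABASE_DB_URL"]) as conn:
        return conn.execute(
            """
            SELECT note_text, 1 - (embedding <=> %s::vector) AS cosine_similarity
            FROM field_notes
            ORDER BY embedding <=> %s::vector          -- <=> is pgvector cosine distance
            LIMIT %s
            """,
            (vec, vec, k),
        ).fetchall()

notes = top_field_notes("Which projects show labor overruns tied to rework or delays?")
```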

The frontend is built with v0 + Next.js + Shadcn/ui + Recharts and deployed on Vercel.


Challenges

Data quality was brutal. The labor logs alone contained 33 distinct spellings of the same handful of roles. Without normalization, labor cost calculations were meaningless. A "Journeyman Pipefitter" and a "JM Pipefitter" looked like two different cost centers. We used Claude's tool-use loop to identify and collapse variants programmatically rather than hardcoding a lookup table.
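A sketch of that loop against the Anthropic Messages API, with a single hypothetical run_sql tool; the real agent also validates the SQL before executing it, which is omitted here.

```python
import json
import anthropic
import duckdb

client = anthropic.Anthropic()
con = duckdb.connect("pipeline.duckdb")   # assumed to already hold the raw labor logs

TOOLS = [{
    "name": "run_sql",
    "description": "Run a read-only DuckDB query against the raw labor logs; returns rows as JSON.",
    "input_schema": {
        "type": "object",
        "properties": {"sql": {"type": "string"}},
        "required": ["sql"],
    },
}]

messages = [{"role": "user", "content": (
    "List the distinct role spellings in labor_logs, then return a JSON mapping "
    "from each raw spelling to a canonical role name."
)}]

while True:
    resp = client.messages.create(
        model="claude-sonnet-4-5",   # model choice is an assumption
        max_tokens=2000,
        tools=TOOLS,
        messages=messages,
    )
    messages.append({"role": "assistant", "content": resp.content})
    if resp.stop_reason != "tool_use":
        break   # final answer: the proposed role mapping
    tool_results = []
    for block in resp.content:
        if block.type == "tool_use":
            rows = con.execute(block.input["sql"]).fetchall()
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(rows, default=str),
            })
    messages.append({"role": "user", "content": tool_results})
```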

Making the agent truly proactive was harder than expected. It's easy to build a dashboard. It's much harder to build something that acts without the user asking. We had to resist the temptation to build query interfaces and instead force the output into a findings feed with concrete, dollar-quantified actions.

The full dataset (labor_logs_all.csv alone is >100 MB) is too large for browser upload. We architected a local-dir ingest mode so the agent can process the full 1.46M-row dataset server-side while still supporting browser uploads for smaller datasets.
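A sketch of how the two ingest modes can share one entry point, with DuckDB doing the heavy reads server-side; the directory layout and table naming are assumptions.

```python
from pathlib import Path
import duckdb

def ingest(source: str) -> duckdb.DuckDBPyConnection:
    """Load either a single browser-uploaded CSV or a whole local data directory."""
    con = duckdb.connect()
    path = Path(source)
    if path.is_dir():
        # Local-dir mode: the >100 MB labor log never travels through the browser.
        for csv_path in sorted(path.glob("*.csv")):
            con.execute(
                f"CREATE TABLE {csv_path.stem} AS SELECT * FROM read_csv_auto('{csv_path}')"
            )
    else:
        # Browser-upload mode: one smaller CSV saved to a temporary path.
        con.execute(f"CREATE TABLE uploaded AS SELECT * FROM read_csv_auto('{path}')")
    return con

con = ingest("data/")   # server-side run over the full 1.46M-row dataset
```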


What We Learned

That the gap between "data exists" and "decision gets made" is almost entirely a workflow problem, not a data problem. Every piece of information needed to catch a -91% margin project existed in the source files on day one. The agent's job isn't to find new information, but to collapse the distance between raw records and a CFO-level action item.

Built With

Claude (Anthropic API), DuckDB, Supabase (Postgres + pgvector), Voyage AI embeddings, v0, Next.js, Shadcn/ui, Recharts, Vercel
