MarginGuard AI
Inspiration
We come from the commercial construction world — a $2 trillion industry where profit margins average just 5–8%. We've watched contractors lose entire project margins not because of bad estimates, but because nobody caught the bleed early enough. A foreman verbally agrees to extra work, it never becomes a change order, and $50K vanishes. We built MarginGuard to be the analyst who never sleeps, never misses a field note, and always follows the money.
How We Built It
MarginGuard is a Next.js app powered by Gemini 2.5 Flash via the Vercel AI SDK's `streamText` with multi-step tool use (`maxSteps: 15`). The agent has 7 tools, from portfolio scanning to field note search to sending live email reports via a Google Apps Script webhook.
The core insight was designing tools that cascade. A single user prompt like "How's my portfolio?" triggers `scanPortfolio`, which flags a critical project, which autonomously triggers `investigateProject`, then `analyzeLaborDetails`, then `searchFieldNotes`: 4+ chained tool calls with zero user intervention.
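Here is roughly how that agent loop is wired with the AI SDK (a simplified sketch: the Zod schemas are trimmed, and `scanAllProjects` / `investigate` stand in for our real data helpers):

```ts
// app/api/chat/route.ts (simplified sketch of the agent route)
import { google } from '@ai-sdk/google';
import { streamText, tool } from 'ai';
import { z } from 'zod';
import { scanAllProjects, investigate } from '@/lib/analysis'; // hypothetical helpers

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: google('gemini-2.5-flash'),
    system:
      'Always quantify findings in dollars. ' +
      'Never stop after one tool call when problems exist.',
    messages,
    maxSteps: 15, // lets the model chain tool calls without user intervention
    tools: {
      scanPortfolio: tool({
        description: 'Scan all projects and flag margin risks by severity.',
        parameters: z.object({}),
        // Returns pre-aggregated JSON (variances, risk levels, dollar impacts),
        // never raw rows; the tools do the math, the model narrates.
        execute: async () => scanAllProjects(),
      }),
      investigateProject: tool({
        description: 'Drill into one flagged project: SOV lines, labor, notes.',
        parameters: z.object({ projectId: z.string() }),
        execute: async ({ projectId }) => investigate(projectId),
      }),
      // ...analyzeLaborDetails, searchFieldNotes, and 3 more tools
    },
  });

  return result.toDataStreamResponse();
}
```

Because each tool's description points at the next step, a portfolio scan that returns a critical flag naturally pulls the model into the deeper tools.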
All project data (~18K records across 10 CSVs) is parsed with PapaParse and cached in memory on first request. Every cost calculation follows:
$$\text{Labor Cost} = (h_{st} + 1.5 \cdot h_{ot}) \times r \times b$$
where $h_{st}$ is straight-time hours, $h_{ot}$ is overtime hours, $r$ is the hourly rate, and $b$ is the burden multiplier. The overtime premium — the pure waste from poor scheduling — is isolated as:
$$\text{OT Premium} = 0.5 \cdot h_{ot} \times r \times b$$
Margin erosion is tracked per SOV line item using earned value analysis:
$$\text{EAC} = \frac{\text{Actual Cost}}{\text{\% Complete} / 100}, \quad \text{Projected Overrun} = \text{EAC} - \text{Budget}$$
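In code, those formulas are one-liners. A sketch in TypeScript (function and parameter names are illustrative):

```ts
// Fully burdened labor cost with time-and-a-half overtime.
function laborCost(hSt: number, hOt: number, rate: number, burden: number) {
  return (hSt + 1.5 * hOt) * rate * burden;
}

// The 0.5x overtime surcharge: pure waste from poor scheduling.
function otPremium(hOt: number, rate: number, burden: number) {
  return 0.5 * hOt * rate * burden;
}

// Earned value per SOV line item: estimate at completion vs. budget.
function projectedOverrun(actualCost: number, pctComplete: number, budget: number) {
  const eac = actualCost / (pctComplete / 100);
  return eac - budget;
}

// Example: 400 ST hours + 60 OT hours at $45/hr with a 1.35 burden
// costs laborCost(400, 60, 45, 1.35) = $29,767.50, of which
// otPremium(60, 45, 1.35) = $1,822.50 is avoidable overtime premium.
```

The load-once cache is equally plain; a sketch assuming Node's `fs` and PapaParse's string API (file list and row types simplified):

```ts
import { readFileSync } from 'node:fs';
import Papa from 'papaparse';

let cache: Record<string, Record<string, unknown>[]> | null = null;

// Parse every CSV once on the first request; later tool calls hit memory.
export function loadData(files: string[]) {
  cache ??= Object.fromEntries(
    files.map((f) => [
      f,
      Papa.parse<Record<string, unknown>>(readFileSync(f, 'utf8'), {
        header: true,        // first row becomes object keys
        dynamicTyping: true, // numbers parse as numbers, not strings
      }).data,
    ]),
  );
  return cache;
}
```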
What We Learned
- Tool design is prompt engineering. The shape of what a tool returns matters more than the system prompt. When our tools returned rich, pre-analyzed JSON (variances, risk levels, dollar impacts), the LLM produced vastly better reasoning than when we returned raw data.
- Autonomous chaining requires trust. Setting `maxSteps: 15` and telling the model to investigate on its own felt risky, but the system prompt guardrails ("always quantify in dollars," "never stop after one tool call when problems exist") kept it on track.
- Construction data is messy. Field notes contain the most valuable signals (verbal approvals, scope creep) but are unstructured text. Keyword search with context turned out to be surprisingly effective; a sketch follows this list.
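The core of that search is almost embarrassingly simple. A sketch (the real `searchFieldNotes` tool returns more metadata):

```ts
interface FieldNote {
  projectId: string;
  date: string;
  text: string;
}

// Find notes mentioning any keyword, returning a context window
// around each hit so the model sees who said what, and when.
function searchNotes(notes: FieldNote[], keywords: string[], window = 120) {
  return notes.flatMap((note) => {
    const lower = note.text.toLowerCase();
    return keywords
      .map((kw) => ({ kw, i: lower.indexOf(kw.toLowerCase()) }))
      .filter(({ i }) => i !== -1)
      .map(({ kw, i }) => ({
        projectId: note.projectId,
        date: note.date,
        match: kw,
        context: note.text.slice(Math.max(0, i - window), i + kw.length + window),
      }));
  });
}

// e.g. searchNotes(notes, ['verbal', 'approved', 'extra work', 'change order'])
```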
Challenges
- Token budget vs. depth. 18K records across 10 CSVs is a lot. We had to pre-aggregate in tool `execute` functions rather than passing raw data to the LLM: the tools do the math, the model does the storytelling.
- Streaming tool invocations. Getting tool call status cards to render in real time while the agent is still thinking required careful handling of the `toolInvocations` array states (`call` → `result`) from the Vercel AI SDK; a sketch follows this list.
- Email delivery under pressure. Google Apps Script webhooks have quirks with CORS and content types. We burned 15 minutes debugging a `302` redirect before realizing GAS web apps require following redirects on POST.
- Time. 2.5 hours. No safety net. Every architectural decision was filtered through one question: does this help us win?
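For reference, the status-card pattern boils down to reading each message's `toolInvocations` and branching on `state`. A simplified sketch of our chat component (markup trimmed):

```tsx
'use client';
import { useChat } from '@ai-sdk/react';

export function ToolStatusCards() {
  const { messages } = useChat();
  return (
    <>
      {messages.flatMap((m) =>
        (m.toolInvocations ?? []).map((inv) => (
          <div key={inv.toolCallId} className="tool-card">
            {inv.state === 'result' ? (
              // Finished: show the tool's pre-analyzed JSON payload.
              <pre>{JSON.stringify(inv.result, null, 2)}</pre>
            ) : (
              // 'partial-call' / 'call': the agent is still working this step.
              <span>Running {inv.toolName}…</span>
            )}
          </div>
        )),
      )}
    </>
  );
}
```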
Built With
- vercel