Inspiration

Supply chains fail silently. A flood hits a specialty chemicals plant in Gujarat — a Tier-3 supplier nobody in procurement has ever heard of, three contractual hops away from anything with your logo on it. Nothing happens for weeks. Then a blender misses a shipment, then a production line pauses, and six weeks after the flood there's an empty shelf and a forecast miss nobody can explain. McKinsey estimates disruptions like this cost the average company almost half a year's profits every decade, with a month-long shock landing every 3.7 years.

Here's what kept us up at night: the warning was public the entire time. USGS publishes every earthquake within minutes. NOAA publishes every storm warning. GDACS, the FDA, the world's news — it's all free, real-time, and machine-readable. The information to prevent the empty shelf exists. What's missing is something that can read all of it, fuse it with your supplier graph, and act.

That is the gap Faultline is built for.

What it does

Faultline is a living control tower: a rotating globe of today's actual world events, an agent reasoning across them in real time, and a single human approval standing between a flood in Gujarat and an empty shelf in Seattle.

It runs a continuous SENSE → TRACE → ASSESS → ACT → VERIFY loop:

  • 🌍 Sense — the real world, live. Five public feeds (USGS, NOAA, GDACS, FDA, GDELT) stream into Elasticsearch every five minutes, where ELSER embeds every event by meaning on ingest. The intel ticker shows today's actual headlines; hover any monitored-event dot for the source report, minutes old.
  • 🔎 Trace — semantic, not keyword. When the Watcher (Gemini 3) flags an event, the Tracer asks Elastic to match it against supplier records by meaning: "Gujarat flooding" and "specialty chemicals plant, Vadodara" share almost no words — ELSER connects them anyway. Then a graph traversal walks Tier-3 → blenders → finished products.
  • 💰 Assess — in business terms. Days of inventory cover vs. estimated disruption length → dollars of revenue at risk, ranked. (The demo incident: 9 days of cover vs. a 21-day disruption = $460,000 exposed.)
  • 🤝 Act — but only with a human. Analysis is autonomous; spending money is not. Faultline drafts the plan and stops at an approval gate. Once approved: it discovers qualified alternates via Elastic, drafts a contingent PO as a real PDF in Cloud Storage, and the Negotiator runs a supplier call on Gemini's native-audio Live API — transcript streaming into the UI, with a 🔊 button to hear it.
  • ✅ Verify — the math that matters. Does the alternate's 7-day lead beat the 9-day runway? Gap closed, two days of margin — and on the globe, a brand-new mint supply line draws itself from Singapore. Motion is information: the baseline network stays faint, only threatened paths pulse coral, and the re-route appears only when it's real.
  • 📜 Decision Log — every claim cited. Each step writes an auditable decision to Elasticsearch with evidence event ids; every entry in the UI carries chips linking to the live source events (click one — the globe flies you there). Ends in a downloadable situation report.
  • 🧪 What-If console — inject a hypothetical (frost in Minas Gerais, Suez closure) and the identical live pipeline runs it, clearly badged SIMULATED. Typed locations geocode via the Maps API.
  • 📈 It remembers. Sixty days of risk history served live from BigQuery, surfacing recurring chokepoints.
  • 🔁 It's a platform, not a demo. Swap one JSON config and the same tower runs a pharma cold chain.
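The Assess and Verify arithmetic above can be sketched in a few lines. This is a minimal illustration, not Faultline's actual code: the linear exposure model (revenue lost only on days that inventory does not cover) and the `daily_revenue` figure are assumptions, and the function names are hypothetical. Only the 9-day cover, 21-day disruption, and 7-day alternate lead come from the demo incident.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    gap_days: int           # days the disruption outlasts inventory cover
    revenue_at_risk: float  # gap_days * daily_revenue (assumed linear model)


def assess(cover_days: int, disruption_days: int, daily_revenue: float) -> Exposure:
    """Assumed model: revenue is lost only on days not covered by inventory."""
    gap = max(0, disruption_days - cover_days)
    return Exposure(gap_days=gap, revenue_at_risk=gap * daily_revenue)


def verify(runway_days: int, alternate_lead_days: int) -> int:
    """Margin in days; a negative value means the alternate arrives too late."""
    return runway_days - alternate_lead_days


# Demo figures from the write-up: 9 days of cover, 21-day disruption, 7-day lead.
# daily_revenue is a made-up placeholder, not the demo company's number.
exposure = assess(cover_days=9, disruption_days=21, daily_revenue=10_000.0)
margin = verify(runway_days=9, alternate_lead_days=7)
print(exposure.gap_days, exposure.revenue_at_risk, margin)
```

Keeping this math in plain deterministic code, outside the model, is what lets the numbers reconcile on every run.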

Judges: open the hosted URL and press ▶ WATCH DEMO — a 70-second scripted replay of the full verified incident. Everything else on screen is live.

How we built it

The brain — Elastic Agent Builder over MCP. Six custom tools (five parameterized ES|QL tools + one workflow write-tool), built in Agent Builder and exposed through Elastic's MCP server, called by the agent over streamable HTTP: search_events, match_event_to_suppliers (hybrid BM25+ELSER), traverse_supply_graph, lookup_exposure, find_alternate_suppliers, write_decision. The UI badges every Elastic MCP call as it streams, so the partner integration is visible, not claimed.
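To make the traversal step concrete, here is a rough sketch of the logic a tool like traverse_supply_graph has to perform: walk from a flagged Tier-3 supplier to everything downstream of it. The real tool is an ES|QL tool; this in-memory breadth-first walk only illustrates the idea. The edge map, node names, and the `downstream` helper are all hypothetical.

```python
from collections import deque

# Hypothetical supply edges: node -> the nodes it feeds. Names are illustrative.
EDGES = {
    "tier3:chem-plant": ["blender:a", "blender:b"],
    "blender:a": ["product:line-1"],
    "blender:b": ["product:line-1", "product:line-2"],
}


def downstream(start, edges):
    """Breadth-first walk; returns {node: hops from start}, excluding start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in hops:  # visit each node once, even on multi-root paths
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    hops.pop(start)
    return hops


affected = downstream("tier3:chem-plant", EDGES)
products = sorted(n for n in affected if n.startswith("product:"))
print(products)
```

Tracking visited nodes matters here because a product can be reachable through more than one blender, and it should be counted once.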

The judgment — Gemini 3 on Vertex AI, built with Google's ADK. An Orchestrator (gemini-3.1-pro-preview) drives Watcher / Tracer / Assessor / Resourcer / Negotiator / Verifier agents (gemini-3.5-flash), plus a pluggable depth registry (Briefer, multimodal Enricher, BigQuery export). Every emission is validated against frozen JSON-Schema contracts — the LLM gets judgment, never bookkeeping. A WebSocket streams every plan step and tool call to the UI from Cloud Run.
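The "judgment, never bookkeeping" rule can be illustrated with a small contract check. This is a stdlib-only stand-in that tests required keys and exact types; it is not JSON-Schema validation itself, and a real deployment would presumably use a proper JSON-Schema validator. The contract fields and the `validate` helper are hypothetical.

```python
from types import MappingProxyType

# Hypothetical contract for one agent emission. MappingProxyType makes it
# read-only, loosely mirroring the idea of a "frozen" contract.
ASSESSMENT_CONTRACT = MappingProxyType({
    "event_id": str,
    "supplier_id": str,
    "cover_days": int,
    "disruption_days": int,
})


class ContractError(ValueError):
    """Raised when an emission does not match its contract."""


def validate(emission, contract):
    """Return the emission unchanged, or raise listing every violation."""
    problems = []
    for key, expected in contract.items():
        if key not in emission:
            problems.append(f"missing: {key}")
        elif type(emission[key]) is not expected:  # exact type, so bool is not int
            got = type(emission[key]).__name__
            problems.append(f"{key}: expected {expected.__name__}, got {got}")
    for key in emission:
        if key not in contract:
            problems.append(f"unexpected: {key}")
    if problems:
        raise ContractError("; ".join(problems))
    return emission


ok = validate(
    {"event_id": "e1", "supplier_id": "s1", "cover_days": 9, "disruption_days": 21},
    ASSESSMENT_CONTRACT,
)
```

Rejecting a model output such as `"cover_days": "9"` at this boundary keeps string-typed numbers out of the downstream math.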

The voice — Gemini Live API (gemini-live-2.5-flash-native-audio): two-party negotiation calls with real native audio; voice commands transcribe via multimodal Flash.

The face — React + deck.gl. A GlobeView world where every visual derives from the agent's semantic event stream — the backend never sends pixels. Great-circle flight paths, ripple labels, narration line, an accordion rail that follows the agent through the run.
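For readers curious how a great-circle path with altitude lift can be produced, here is a minimal sketch of the geometry: spherical interpolation between two endpoints, with a sine-shaped lift that peaks mid-route. It is written in Python for illustration rather than in the UI's own code, and the function name, step count, and peak altitude are assumptions. The [lon, lat, altitude] ordering follows the usual deck.gl position convention.

```python
import math


def great_circle_path(lon1, lat1, lon2, lat2, steps=32, peak_altitude_m=200_000.0):
    """Return [lon, lat, altitude_m] points along the great circle from 1 to 2."""
    def to_vec(lon, lat):
        lo, la = math.radians(lon), math.radians(lat)
        return (math.cos(la) * math.cos(lo), math.cos(la) * math.sin(lo), math.sin(la))

    a, b = to_vec(lon1, lat1), to_vec(lon2, lat2)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)  # central angle between the endpoints
    if math.pi - omega < 1e-9:
        raise ValueError("antipodal endpoints: the great circle is not unique")

    path = []
    for i in range(steps + 1):
        t = i / steps
        if omega < 1e-12:  # identical endpoints: nothing to interpolate
            v = a
        else:
            wa = math.sin((1 - t) * omega) / math.sin(omega)
            wb = math.sin(t * omega) / math.sin(omega)
            v = tuple(wa * x + wb * y for x, y in zip(a, b))
        lon = math.degrees(math.atan2(v[1], v[0]))
        lat = math.degrees(math.asin(max(-1.0, min(1.0, v[2]))))
        alt = peak_altitude_m * math.sin(math.pi * t)  # zero at both ends, peak mid-route
        path.append([lon, lat, alt])
    return path


# Illustrative endpoints: roughly Singapore to Seattle.
route = great_circle_path(103.82, 1.35, -122.33, 47.61)
```

Lifting the middle of the path keeps the line visibly above the globe surface instead of intersecting it.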

The ground — Google Cloud. Cloud Run ×5 (agents · feed-ingest · po-generator · voice-gateway · web), Cloud Scheduler (OIDC) driving ingestion, BigQuery, Cloud Storage, Secret Manager, Maps Geocoding.

The process — eight parallel AI coding sessions, zero merge conflicts. We froze interface contracts and golden fixtures in a 90-minute Phase 0 (including a scripted WebSocket replay of the whole incident), gave every session sole ownership of its directories, and built mock-first. Eight agents built simultaneously for ~24 hours; the merge train never hit a conflict.

Challenges we ran into

1. Elastic Agent Builder is young — we hit its edges and engineered through them. ES|QL tool parameters can't be optional (every param must always bind); array membership needs MV_CONTAINS(?arr, field), not IN; and workflow templating only carries scalar strings — arrays arrive stringified. Auditable decision writes seemed impossible until we passed the whole decision as one JSON string and parsed it server-side with an ingest pipeline, idempotent on decision id, restoring full type fidelity. Verified 29/29 through the production MCP endpoint.

2. Model availability is region-shaped. Gemini 3.x returned 404 everywhere we looked — until we discovered it lives only in Vertex's global location (us-central1 serves the 2.5 family). One session nearly downgraded the whole stack before a cross-region probe settled it. And the native-audio Live model turned out to be audio-out only — it returns no text and no input transcription — so voice-in re-routes through multimodal Flash transcription.

3. deck.gl's ArcLayer silently doesn't render under GlobeView. Our supply routes vanished on the globe with zero errors. We rebuilt them as great-circle PathLayers with altitude lift — and fixed country-shaped black holes by discovering that large concave land polygons sag below the ocean sphere and get depth-culled (the fix: lower the ocean radius).

4. The live world fought our demo. The real planet is noisy: actual Iowa storm warnings out-ranked our seeded flood in triage, and ELSER matched a downstream blender almost as strongly as the true chokepoint. That pressure produced real engineering: severity-weighted triage, match-strength deduplication across multi-root paths, and operator-pinned focus runs.

5. Contract drift between two AI sessions, caught red-handed. The events endpoint returned flat lat/lon; the UI client expected location:{lat,lon} — and silently fell back to fixture data, so the "live" ticker showed a stale earthquake. The fix was small; the lesson wasn't: every cross-agent boundary needs a schema, and every fallback needs to be loud.

6. ELSER inference made bulk ingestion outrun its own HTTP timeout — documents landed while the client reported failure. Longer request timeouts + smaller chunks: 423 events per tick, clean.
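The workaround from challenge 1, passing the whole decision as one JSON string and parsing it on the server side, can be sketched as a round trip. This is an in-memory illustration only: `DecisionStore` stands in for the index plus ingest pipeline, and the `decision_id` and `evidence_event_ids` field names are assumptions rather than the project's actual schema.

```python
import json


def pack_decision(decision: dict) -> str:
    """Client side: collapse the whole decision into one scalar string."""
    return json.dumps(decision, sort_keys=True)


class DecisionStore:
    """In-memory stand-in for the index plus ingest pipeline (hypothetical)."""

    def __init__(self):
        self.docs = {}

    def ingest(self, payload: str) -> str:
        decision = json.loads(payload)    # restores lists, numbers, and nesting
        doc_id = decision["decision_id"]  # assumed field name
        self.docs[doc_id] = decision      # same id overwrites, so retries are idempotent
        return doc_id


store = DecisionStore()
payload = pack_decision({
    "decision_id": "d-001",
    "step": "ASSESS",
    "evidence_event_ids": ["evt-1", "evt-2"],  # would arrive stringified if sent as an array
    "revenue_at_risk": 460000,
})
store.ingest(payload)
store.ingest(payload)  # a retried write does not create a duplicate
print(len(store.docs), type(store.docs["d-001"]["evidence_event_ids"]).__name__)
```

Keying the write on the decision id is what makes a retry after a timeout safe.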

Accomplishments that we're proud of

  • The full loop is real, end to end, in production: live flood event → ELSER ranks the right Tier-3 plant #1 → graph traversal to two product lines → $460k quantified → human approval → alternate discovered → PO PDF in GCS → native-audio negotiation → verified gap closure → auditable trail in Elasticsearch. Golden-path suite: 13/13 against live Elastic + live Gemini simultaneously.
  • 423 real world events ingested per tick, ELSER-embedded, refreshing every 5 minutes — the ticker shows today's headlines during judging.
  • Honesty as a feature: live data is real, the company is seeded, simulations are badged SIMULATED, the analytics backfill is labeled — every boundary between real and demo is visible in the UI itself.
  • A globe where motion is information: nothing pulses unless it means something.

What we learned

  • Semantic retrieval is the load-bearing wall for agents that face the real world. Remove Elastic and there is no fusion of messy reality with structured records — no matching, no traversal, no agent.
  • Let the LLM judge; never let it bookkeep. Deterministic math, frozen schemas, and validated structured outputs everywhere — Gemini spends its intelligence on what matters and what to do, and the numbers always reconcile.
  • Contracts are how AI agents scale as a team. Eight parallel coding agents, frozen interfaces, golden fixtures, sole-ownership directories: zero merge conflicts in ~24 hours. The one bug that slipped through was exactly the one boundary where two agents drifted from the contract.

What's next for Faultline

Real telephony transport for negotiation calls (the AI stays Google; only the audio pipe changes) · more verticals (the pharma cold-chain profile already runs) · ERP/procurement write-back for the POs · learned severity models trained on the BigQuery history · persistent MCP sessions (~15s faster per run).

