Sir Alibi 🛡️

AI-powered relationship-repair agent — YHack 2026


Inspiration

We've all had moments where a small mistake turns into a much bigger problem because we freeze and avoid it. We wanted a tool that doesn't just "generate text," but actually helps you recover the relationship with a concrete plan — drafting the apology, sending a real gift, and scheduling the follow-up, all autonomously. The knight theme came from the idea of having a loyal squire who helps you restore honor after you've messed up.


What it does

Sir Alibi is an agentic relationship-repair assistant. You describe the situation ("I forgot Sarah's birthday, she's my coworker"), and it:

  1. Perceives the context and stakes — work vs. personal, severity, confidence level, and what info might be missing
  2. Researches practical gift options with search queries and links, and a suggested follow-up window
  3. Reasons about the right tone, alibi policy, budget range, and risks (including whether an explanation is even appropriate)
  4. Writes a personalized apology message + a follow-up message, respecting the alibi policy
  5. Acts by creating a Gmail draft, scheduling a Calendar follow-up, sending a real brand gift card (Amazon, Starbucks, Subway, and more) via Tremendous, scaled to severity, and generating a Spotify playlist matched to the recipient's probable taste as a personalized apology gift
  6. Speaks the apology in your own cloned voice via ElevenLabs — you record a 60-second sample during onboarding and every apology is delivered as audio that actually sounds like you

How we built it

Agent pipeline: A multi-step workflow (Perception → Research → Pause Gate → Reason → Write) where every step returns strict JSON validated with schemas, with one automatic retry and deterministic fallbacks to keep the system stable under real inputs.
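
The validate-retry-fallback loop can be sketched as follows; `runStep`, `callModel`, `validator`, and `fallback` are illustrative names under our assumptions, not the project's actual identifiers:

```javascript
// Minimal sketch of the per-step contract: ask the model for one JSON
// object, validate it against the step's schema, retry once, then fall
// back deterministically so the pipeline never dies on bad output.

function validateJson(raw, validator) {
  // Models sometimes wrap JSON in prose; grab the first {...} span.
  const match = raw.match(/\{[\s\S]*\}/);
  if (!match) return null;
  try {
    const obj = JSON.parse(match[0]);
    return validator(obj) ? obj : null;
  } catch {
    return null;
  }
}

async function runStep(step, input, callModel, validator, fallback) {
  for (let attempt = 0; attempt < 2; attempt++) { // one automatic retry
    const raw = await callModel(step, input, attempt);
    const parsed = validateJson(raw, validator);
    if (parsed) return parsed;
  }
  return fallback(input); // deterministic fallback keeps the system stable
}
```

The same wrapper runs every stage (Perception, Research, Reason, Write), so the UI always receives a schema-conformant object regardless of how the model misbehaved.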

LLMs: Routed through the Lava AI Gateway, which forwards to OpenAI-compatible chat completions. Each stage has its own system prompt and schema — the model is asked for one JSON object per step, never a free-form answer.

Pause gate: An agentic safety mechanism — if perception surfaces clarifying questions under high-stakes heuristics (birthday/anniversary patterns + low confidence), the pipeline stops before Reason/Write and returns needs_user_input so the UI can ask the human first, rather than confidently guessing wrong.
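
In code, the gate reduces to a small check on the Perception output; the field names and the 0.6 confidence threshold here are assumptions for illustration, not the project's actual schema:

```javascript
// Sketch of the pause-gate decision: only when the situation is
// high-stakes AND the model is unsure AND it has questions to ask
// do we halt before Reason/Write.
const HIGH_STAKES = /birthday|anniversary/i; // assumed heuristic patterns

function pauseGate(perception) {
  const highStakes = HIGH_STAKES.test(perception.incidentType || "");
  const lowConfidence = (perception.confidence ?? 1) < 0.6; // assumed threshold
  if (perception.questions?.length && highStakes && lowConfidence) {
    // The UI surfaces these questions to the human before anything acts.
    return { status: "needs_user_input", questions: perception.questions };
  }
  return { status: "proceed" };
}
```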

Integrations:

  • Google OAuth 2.0 + Gmail API — creates email drafts (draft-only, no auto-send)
  • Tremendous REST API (v2, direct fetch) — sends real brand gift cards scaled to severity score
  • ElevenLabs — Audio Isolation cleans the voice sample, Voices API clones it, Text-to-Speech delivers the apology in the user's own voice
  • Spotify API — generates a playlist matched to the recipient's probable musical taste as a supplemental apology gift
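
The Gmail piece maps onto `drafts.create`: build an RFC 822 message, base64url-encode it, and never touch `messages.send`. A minimal sketch, assuming an OAuth access token is already in hand (helper names are ours):

```javascript
// Draft-only by construction: the only endpoint this code knows about
// is drafts.create, so an accidental auto-send is impossible.

function buildRawMessage({ to, subject, body }) {
  const rfc822 = [`To: ${to}`, `Subject: ${subject}`, "", body].join("\r\n");
  return Buffer.from(rfc822).toString("base64url");
}

async function createDraft(accessToken, message) {
  const res = await fetch(
    "https://gmail.googleapis.com/gmail/v1/users/me/drafts",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${accessToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ message: { raw: buildRawMessage(message) } }),
    }
  );
  if (!res.ok) throw new Error(`drafts.create failed: ${res.status}`);
  return res.json();
}
```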

Backend: Node.js + Express, SSE streaming for live agent progress in the UI. MongoDB for user persistence.

Frontend: React + Vite + Three.js, showing step-by-step agent progress as "quest cards" — gift ideas, apology draft, follow-up plan, and audio playback.


Challenges we ran into

Reliability: LLMs wrap their output in extra text and emit malformed JSON — we enforced strict schemas, one automatic retry per step, and repair fallbacks so the demo never hard-crashes on a bad model output.

Voice cloning UX: Getting a clean 60-second recording in a browser without external libraries, then running it through Audio Isolation before cloning, required careful sequencing of the ElevenLabs endpoints and graceful fallback to a default voice if cloning fails.
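
The sequencing looks roughly like this; the endpoint paths follow ElevenLabs' public API, but the helper names, form fields as written, and the null-on-failure contract are our illustrative assumptions:

```javascript
// Isolate -> clone -> speak, degrading to a default voice if cloning
// fails at any point. Error handling is deliberately simplified.
const BASE = "https://api.elevenlabs.io/v1";

function chooseVoice(clonedVoiceId, defaultVoiceId) {
  // Graceful degradation: a failed clone yields null, so we speak anyway.
  return clonedVoiceId ?? defaultVoiceId;
}

async function cloneVoice(apiKey, sampleBlob) {
  try {
    // 1. Clean the 60-second browser recording via Audio Isolation.
    const form = new FormData();
    form.append("audio", sampleBlob, "sample.webm");
    const isolated = await fetch(`${BASE}/audio-isolation`, {
      method: "POST",
      headers: { "xi-api-key": apiKey },
      body: form,
    });
    if (!isolated.ok) return null;

    // 2. Clone a voice from the cleaned sample.
    const cloneForm = new FormData();
    cloneForm.append("name", "user-voice");
    cloneForm.append("files", await isolated.blob(), "clean.mp3");
    const clone = await fetch(`${BASE}/voices/add`, {
      method: "POST",
      headers: { "xi-api-key": apiKey },
      body: cloneForm,
    });
    return clone.ok ? (await clone.json()).voice_id : null;
  } catch {
    return null;
  }
}
```

Text-to-Speech then targets whichever voice ID `chooseVoice` returns, which is what makes the default-voice fallback a one-liner.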

Integration complexity: Google OAuth scopes, refresh tokens, and redirect URIs are finicky. We scoped down to draft-only Gmail to avoid accidental sends and kept tokens in-memory for the hackathon build.

Tone correctness: The right apology for a coworker is very different from one for a partner — so we built severity scoring, incident type classification, and an alibi policy into the Reason step to prevent inappropriate or manipulative framing.

Gift calibration: Tremendous gift card amounts are tiered to the agent's severity score (≤0 → skip, <20 → $15, <50 → $30, <100 → $75, ≥100 → $150) so the gesture always feels proportional, not random.
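
The tiering above reduces to a pure lookup (function name illustrative):

```javascript
// Gift amount in dollars from the agent's severity score,
// per the tiers described in the writeup.
function giftAmount(severity) {
  if (severity <= 0) return 0; // skip the gift entirely
  if (severity < 20) return 15;
  if (severity < 50) return 30;
  if (severity < 100) return 75;
  return 150;
}
```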


Accomplishments that we're proud of

  • A working multi-step agentic loop that produces one stable JSON contract — robust enough for real UI and live integrations
  • Voice cloning that makes the apology sound like it actually came from you, not a robot
  • Clear separation between thinking (Perception/Research/Reason) and doing (Write/Act), with a pause gate that keeps humans in control on high-stakes situations
  • Gmail drafting, Tremendous gift cards, Spotify playlists, and ElevenLabs voice — all firing from a single confession form

What we learned

"Agentic" isn't about more LLM calls — it's about state, tool boundaries, and reliability. Schemas and fallbacks are the difference between a cool idea and a demo that survives real inputs. A pause gate is not a weakness — stopping to ask a clarifying question before acting on a high-stakes situation is exactly what a trustworthy agent should do.


What's next for Sir Alibi

  • More channels: Slack/Teams drafts, SMS drafts, WhatsApp DMs
  • Recipient vibe check — agent autonomously sends a one-question "preference nudge" to a trusted contact before finalizing the gift
  • A relationship repair tracker showing which relationships are mending and which still need attention
