Inspiration

Vibe coding has made shipping software almost frictionless—which means founders can now build 100× faster… and still build things nobody wants. The bottleneck isn’t code anymore. It’s truth: finding the buyer’s trigger moment, the real pain, the switching friction, and what people will actually pay for.

Vis-AI-Vis was inspired by that gap: founders need customer discovery interviews, but interviews are hard to run well—especially when the most important signals aren’t the words, they’re the micro-reactions (hesitation, confusion, discomfort, conviction) that tell you where the real pain lives.

What it does

Vis-AI-Vis is a privacy-first, real-time customer discovery copilot.

  • Guides discovery interviews live, helping you drill into the buyer’s trigger moment
  • Detects nonverbal cues (in the moment) to ask better follow-ups and surface deeper truth
  • Produces structured outcomes: problem statements, triggers, objections, switching constraints, and willingness-to-pay signals

🚫 No recording. Just reasoning.

Unlike “meeting AI” tools that record and store liabilities, Vis-AI-Vis processes the live video stream ephemerally for real-time insight and then discards it. No video files. No “upload and wait.” Just coaching at the speed of conversation.

How we built it

We designed an “ephemeral stream” pipeline:

Camera → Live multimodal stream → Real-time signals + interview guidance → Structured insights → /dev/null

The model observes the live conversation, identifies moments worth probing, and generates the next best question while the interview is happening. Instead of producing a transcript-first workflow, the system is optimized for discovery outcomes.

Challenges we ran into

  • Privacy + trust: The project had to be useful without becoming a recorder. We leaned into zero-retention design and kept only structured, consented outputs.
  • Latency: Interview coaching only matters if it’s real-time. The pipeline was designed to minimize delay and avoid record/upload/chunk steps.
  • Signal vs. noise: Not every facial reaction is meaningful. We focused on patterns that correlate with discovery moments: uncertainty, hesitation, contradiction, and conviction.

What we learned

  • Builders don’t need more speed to ship—they need more speed to validate.
  • The fastest route to problem/solution fit is finding the trigger moment and proving willingness to switch/pay.
  • Privacy-first architecture isn’t just safer—it improves interview quality because people speak more honestly when there’s no permanent record.

What’s next

  • Add a remote interview “extension” mode for Zoom/Meet/Teams sessions
  • Build a repeatable “Discovery Score” that measures evidence quality (trigger clarity, pain intensity, switching friction, willingness-to-pay)
  • Expand from discovery to “solution tests” (pricing, positioning, and landing-page experiments) driven by interview evidence

Built With

Share this project:

Updates