Inspiration

I was building an iOS app last month and had agents handle the entire development process. They got the functionality working, but the UI was always a mess. No matter how I prompted or what I tried, I couldn't get them to produce something that actually looked good. After digging around for a solution and coming up empty, I realized the problem was fundamental: agents have no real way to see and understand polished interfaces. So I decided to build one myself.

What it does

Spectre takes screen recordings of any UI and converts them into MD files that AI agents can ingest and replicate. Instead of guessing what a good interface looks like, agents now have a concrete visual reference to work from. Point it at any app, record it, and hand the output to your agent.

How we built it

We built Spectr as a two-part system: a Next.js frontend for upload, tracking, and spec delivery, and a Python FastAPI worker that processes app recordings behind the scenes. When a user uploads a mobile screen recording, the video is stored in Supabase, then the worker extracts key frames with FFmpeg, analyzes the UI and design language with Claude, and assembles everything into a structured frontend spec. The result is a product that turns visual inspiration into something concrete: a shareable document that agents can actually build from

Challenges we ran into

Getting the MD file format right took a lot of iteration. It needed to carry enough information about layout, spacing, and visual hierarchy without becoming so complex that agents struggled to parse it. There was also the challenge of handling different screen sizes and resolutions cleanly.

Accomplishments that I am proud of

Getting an agent to replicate a real, polished UI from a Spectre recording for the first time was the moment it clicked. The output actually looked like something a human designer built, which had never happened for me before when working with agents alone.

What we learned

The gap between what agents can reason about and what they can actually produce visually is a tooling problem, not a model problem. Give them the right reference and the quality jumps immediately. The modality matters more than we give it credit for.

What's next for Spectre

Support for more input formats beyond screen recordings, tighter integration with popular agent frameworks, and a library of curated MD files for common UI patterns so you do not even need to record anything to get started.

Built With

Share this project:

Updates