Inspiration

As an indie game developer, I spend countless hours learning new tools and teaching others. I love watching tutorials and sharing knowledge with my community—but there's always been a painful gap: creating those tutorials takes forever.

I'd finish implementing a cool feature in Unity, or figure out a tricky bug fix, and think "I should document this." But the process of opening a screen recorder, editing the footage, adding voiceover, and uploading it would take hours. So most of the time, I just... didn't.

I looked at existing tools like Scribe, but they only work for web apps. As a developer, 90% of my workflow happens in native apps—IDEs, game engines, design tools, terminal windows. I needed something that could watch everything I do on my Mac and turn it into documentation automatically.

So I built Trace.

What it does

Trace is a native macOS application that lives in your menu bar and transforms any workflow into three formats:

  1. 📄 HTML Documentation – Beautiful step-by-step guides you can share instantly
  2. 🎮 Interactive Tutorials – Guided walkthroughs that run directly on macOS (sharing with viewers is coming soon)
  3. 🎥 Video Explainers – Narrated MP4 tutorials generated in seconds with AI voiceover

How it works:

  1. Record: Click "New Recording" and perform your task in any app—Chrome, Xcode, Figma, Terminal, anything.
  2. Analyze: For every click, Trace captures a screenshot and the click coordinates, then sends the visual data to Google Gemini.
  3. Generate: Gemini analyzes the UI context (e.g., "User clicked the 'Deploy' button in Xcode") and writes a concise instruction.
  4. Produce: Trace creates your documentation in any of these forms:
    • HTML Export: A static step-by-step guide you can share as-is.
    • Interactive Tutorial: A guided overlay that highlights exactly where to click, running natively on macOS.
    • ✨ AI Video Mode: Trace asks Gemini to write a natural voiceover script, then uses text-to-speech and AVFoundation to stitch screenshots and audio into an .mp4 tutorial, with no video editor required.
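
As a rough illustration of the HTML export path, here is a minimal sketch that turns a list of recorded steps into a static page. The `RecordedStep` type, its fields, and the `renderGuideHTML` function are hypothetical names chosen for this example, not Trace's actual data model.

```swift
import Foundation

// Hypothetical model of one recorded click; the field names are assumptions.
struct RecordedStep {
    let instruction: String      // instruction text written by the model
    let screenshotPath: String   // relative path to the saved screenshot
}

// Escape characters that are unsafe inside HTML text and attribute values.
func escapeHTML(_ text: String) -> String {
    var result = ""
    for character in text {
        switch character {
        case "&": result += "&amp;"
        case "<": result += "&lt;"
        case ">": result += "&gt;"
        case "\"": result += "&quot;"
        default: result.append(character)
        }
    }
    return result
}

// Render the steps as an ordered list, one screenshot per step.
func renderGuideHTML(title: String, steps: [RecordedStep]) -> String {
    var items = ""
    for (index, step) in steps.enumerated() {
        items += "  <li>\n"
        items += "    <p>\(escapeHTML(step.instruction))</p>\n"
        items += "    <img src=\"\(escapeHTML(step.screenshotPath))\" alt=\"Step \(index + 1)\">\n"
        items += "  </li>\n"
    }
    return """
    <!DOCTYPE html>
    <html>
    <head><meta charset="utf-8"><title>\(escapeHTML(title))</title></head>
    <body>
    <h1>\(escapeHTML(title))</h1>
    <ol>
    \(items)</ol>
    </body>
    </html>
    """
}
```

Escaping matters here because the instruction text comes from a model and may contain quotes or angle brackets copied from button labels.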

How I built it

Trace is a native SwiftUI app for macOS. I built it in a week.

  • Vision & Reasoning: I use the Gemini 2.0 Flash API for its incredible speed and multimodal capabilities. I feed it compressed, high-fidelity screenshots, and it returns structured instructions and JSON-formatted video scripts.
  • System Integration: I use ScreenCaptureKit for low-latency screen recording across all apps and AXUIElement (Accessibility API) to detect window focus.
  • Video Engineering: Instead of relying on generative video (which can hallucinate UI details), I built a Deterministic Rendering Engine using AVAssetWriter. I combine real screenshots with synthesized audio tracks (NSSpeechSynthesizer), so every frame of the video is an actual capture of the user's actions.
  • Performance: I implemented aggressive background threading and image compression to handle Retina-quality screenshots without blocking the main UI thread.
  • Development: I used Gemini 3 as my development assistant throughout the build process.
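
To make the "JSON-formatted video scripts" idea concrete, here is a minimal sketch of decoding such a script with `Codable`. The schema (`stepIndex`, `narration`) and the names `ScriptSegment` and `parseScript` are assumptions for illustration; the real response format is not shown in this write-up. The sketch also tolerates a reply wrapped in a Markdown code fence, which language models sometimes add.

```swift
import Foundation

// Hypothetical schema for one narrated segment of the video script.
struct ScriptSegment: Codable, Equatable {
    let stepIndex: Int      // which screenshot this narration belongs to
    let narration: String   // sentence to be spoken over that screenshot
}

enum ScriptParseError: Error {
    case noArrayFound
}

// Keep only the text from the first "[" to the last "]", dropping any
// surrounding prose or Markdown fence.
func extractJSONArray(from response: String) -> String? {
    guard let start = response.firstIndex(of: "["),
          let end = response.lastIndex(of: "]"),
          start < end else { return nil }
    return String(response[start...end])
}

func parseScript(_ response: String) throws -> [ScriptSegment] {
    guard let json = extractJSONArray(from: response) else {
        throw ScriptParseError.noArrayFound
    }
    // Decoding fails loudly if a field is missing or has the wrong type.
    let segments = try JSONDecoder().decode([ScriptSegment].self, from: Data(json.utf8))
    // Sort so the renderer can rely on screenshot order.
    return segments.sorted { $0.stepIndex < $1.stepIndex }
}
```

Decoding into a typed struct means a malformed reply surfaces as a thrown error that can trigger a retry, rather than as a broken video.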

Challenges I ran into

  • The "OOM" Crash: Handling dozens of high-res Retina screenshots initially caused memory spikes that crashed the app. I had to rewrite my entire image processing pipeline to use streaming data and background actors.
  • Audio/Video Sync: Generating a video programmatically is hard. I had to manually calculate the duration of every spoken sentence to ensure the video frame changes exactly when the voiceover finishes that sentence.
  • Cross-App Recording: Unlike web-only tools like Scribe, I needed to capture every macOS app. This required deep integration with ScreenCaptureKit and careful permission handling.
  • Gemini JSON Parsing: Getting an LLM to return strictly formatted arrays for my video engine was tricky. I iterated on the prompt until the output was reliably machine-readable.
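
The audio/video sync problem above comes down to converting spoken durations into frame boundaries. Here is a minimal sketch, assuming the duration of each narrated sentence has already been measured (for example from the synthesized audio). `FrameSlot`, `buildTimeline`, and the frame rate used in the usage note are hypothetical choices for this example, not Trace's actual implementation.

```swift
import Foundation

// Hypothetical description of when one screenshot is on screen.
struct FrameSlot: Equatable {
    let stepIndex: Int
    let startFrame: Int   // first video frame showing this screenshot
    let frameCount: Int   // number of frames it stays on screen
}

// Convert per-sentence audio durations (in seconds) into frame-aligned slots.
// Each boundary is derived from the cumulative time, so rounding error does
// not build up from one step to the next.
func buildTimeline(narrationDurations: [Double], framesPerSecond: Int) -> [FrameSlot] {
    var slots: [FrameSlot] = []
    var elapsed = 0.0
    var startFrame = 0
    for (index, duration) in narrationDurations.enumerated() {
        elapsed += max(0, duration)
        let endFrame = Int((elapsed * Double(framesPerSecond)).rounded())
        // Keep every screenshot visible for at least one frame.
        let count = max(1, endFrame - startFrame)
        slots.append(FrameSlot(stepIndex: index, startFrame: startFrame, frameCount: count))
        startFrame += count
    }
    return slots
}
```

For example, `buildTimeline(narrationDurations: [1.0, 2.5], framesPerSecond: 30)` puts the first screenshot on frames 0 to 29 and the second on frames 30 to 104. In a writer-based pipeline, each `startFrame` would then be turned into a presentation timestamp using the frame rate as the timescale.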

Accomplishments that I'm proud of

  • Building a native macOS experience that feels like a system app, not a web wrapper.
  • Universal App Coverage: Unlike web-only tools, Trace works with Xcode, Unity, Blender, Terminal—any app on your Mac.
  • The "AI Video" button. Seeing the app generate a full MP4 with voiceover from scratch in under 10 seconds was a magical moment.
  • Interactive Tutorial Overlays: Successfully implementing a guided click-through system that highlights exactly where users should click.
  • Achieving near-instant analysis speeds by optimizing my image compression before sending to Gemini.

What's next for Trace

  • Distribution System for Interactive Tutorials: The interactive overlay mode currently works on the creator's Mac, but viewers also need Trace installed to see the overlays. I need to build a distribution system so viewers can see the interactive tutorial with just a link.
  • Multi-language Support: Using Gemini to translate the guide and voiceover into languages like Spanish, Chinese, and Japanese instantly.
  • Direct Integration: Exporting guides directly to Notion, Confluence, or Jira through their APIs.
  • Focus Highlighting: Automatically drawing red boxes around the clicked elements in the final video.
