Inspiration

Most AI media tools today feel like a black box: creators type a prompt, press a button, and hope something good comes out. This process strips away creative control and leaves both beginners and experienced editors frustrated.

On one end of the spectrum, beginners and solo creators lack the skill, time, or resources to learn traditional tools like those from Adobe. On the other end, experienced users feel constrained by AI tools that are too abstracted or too rigid for their creative process.

VIX bridges that gap.

We combine the best parts of traditional editing workflows, like the granular annotation flow pioneered by tools such as Frame.io, with an AI-native agent that works alongside the creator rather than replacing them.

This balance between traditional control and AI-native power allows VIX to serve both beginners and experts, along with the large group of creators who sit between the two extremes.

What it does

For this hackathon, we built The Focus Editor, a core piece of the VIX platform.

The Focus Editor demonstrates a new kind of AI-driven editing flow where creators can:

  • Annotate or highlight what they want changed
  • Have the AI agent understand context via multimodal analysis
  • Execute editing actions (cutting, masking, focusing, etc.) intuitively
  • Maintain full creative control while leveraging AI in the background

It’s editing powered by AI without sacrificing the creator's vision.
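As a rough illustration of the flow above, the sketch below shows how a creator's annotation might become a proposed edit that the creator can still approve or reject. The `Annotation` and `EditAction` shapes and the `propose_action` helper are hypothetical names invented for this example, not the actual VIX schema, and the action kind is assumed to come from the agent's multimodal analysis.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Region as (x, y, width, height) in pixels. Hypothetical convention.
Region = Tuple[int, int, int, int]

ALLOWED_KINDS = {"cut", "mask", "focus"}


@dataclass(frozen=True)
class Annotation:
    """A creator's mark on the timeline (assumed shape, not VIX's real model)."""
    start_s: float
    end_s: float
    note: str
    region: Optional[Region] = None


@dataclass(frozen=True)
class EditAction:
    """An edit the agent proposes; nothing runs until the creator approves it."""
    kind: str
    start_s: float
    end_s: float
    region: Optional[Region] = None
    approved: bool = False


def propose_action(annotation: Annotation, kind: str) -> EditAction:
    """Combine an annotation with an agent-chosen action kind.

    Validation happens here so a bad suggestion never reaches the editor.
    """
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unsupported action kind: {kind!r}")
    if annotation.end_s <= annotation.start_s:
        raise ValueError("annotation must cover a positive time range")
    if kind in {"mask", "focus"} and annotation.region is None:
        raise ValueError(f"{kind!r} needs a highlighted region")
    return EditAction(kind, annotation.start_s, annotation.end_s, annotation.region)
```

Keeping `approved` false by default is one way to reflect the idea that the AI works in the background while the creator makes the final call.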

How we built it

We intentionally kept our stack fast, modern, and AI-native:

  • Next.js for a smooth, reactive front-end editing experience
  • Convex for real-time data sync, state management, and media handling
  • Docker to containerize editing microservices
  • Python powering our editing harness and agent-side execution
  • FFmpeg as our low-level powerhouse for media manipulation
  • Gemini for quick multimodal understanding that provides contextual intelligence to downstream services
  • n8n for rapid prototyping and iteration during development

This combination let us move incredibly quickly while still building an architecture that can scale and evolve into a full AI-native editing platform.
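To make the Python and FFmpeg pairing concrete, here is a minimal sketch of how a harness could build FFmpeg commands for two of the editing actions: cutting a time range and focusing (cropping) on a region. The function names and file paths are assumptions made for illustration and do not come from the VIX codebase; only standard FFmpeg options (`-i`, `-ss`, `-to`, `-vf crop=w:h:x:y`, `-c:a copy`, `-y`) are used.

```python
from typing import List


def build_cut_command(src: str, dst: str, start_s: float, end_s: float) -> List[str]:
    """Build an FFmpeg command that keeps only [start_s, end_s] of src.

    The video is re-encoded with FFmpeg's defaults so the cut does not
    have to land on a keyframe.
    """
    if start_s < 0 or end_s <= start_s:
        raise ValueError("expected 0 <= start_s < end_s")
    return [
        "ffmpeg", "-y",            # overwrite dst if it already exists
        "-i", src,
        "-ss", f"{start_s:.3f}",   # start of the kept range, in seconds
        "-to", f"{end_s:.3f}",     # end of the kept range, in seconds
        dst,
    ]


def build_focus_command(src: str, dst: str, x: int, y: int, w: int, h: int) -> List[str]:
    """Build an FFmpeg command that crops the frame to a w x h box at (x, y)."""
    if w <= 0 or h <= 0 or x < 0 or y < 0:
        raise ValueError("crop box must have positive size and non-negative origin")
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-vf", f"crop={w}:{h}:{x}:{y}",  # crop filter takes width:height:x:y
        "-c:a", "copy",                   # audio is untouched, so copy it through
        dst,
    ]
```

A containerized service could pass the returned list to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file names.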

Challenges we ran into

We ran into several major challenges:

  • Time constraint vs. complexity: Building a backend harness capable of real editing tasks (cutting, masking, etc.) took significant groundwork.
  • AI integration: Getting the agent to intelligently interact with our editing functions was doable but required more time to fully unlock.
  • Data modeling: Designing a clean data layer between front-end, backend, and microservices was a known weak spot and required extra thought.

We accomplished all the foundational work, but simply ran out of time to fully connect the agent to the editing harness in the way we envisioned.

Accomplishments that we're proud of

Despite the time crunch, we achieved several milestones we’re extremely proud of:

  • Built a functioning microservice-based editing harness
  • Integrated multimodal AI understanding into the workflow
  • Demonstrated a workflow that merges traditional annotation with AI-native capabilities

And truthfully, we’re confident that with just a few more hours we would have fully integrated the agent and made this demo shine even brighter. We reached the edge of something really exciting, and that momentum is carrying us forward.

What we learned

The importance of time management and quality task delegation 😅

What's next for VIX

VIX is not just a project; it’s the beginning of an entire platform. Next steps:

  • Fully flesh out and ship the Focus Editor to real users
  • Build a node-based workflow editor, merging intuitive visual flows with powerful AI actions
  • Develop video retrieval features to supercharge agent reasoning and editing intelligence
  • Expand the platform with deeper agentic behavior, collaborative workflows, and editing primitives
