Inspiration

GitHub branch: https://github.com/mrlancelot/sniply/tree/audio-maya-geni

The internet is drowning in AI-generated slop: generic, hallucinated content that erodes trust. At the same time, creators face a painful choice: spend $500+ per video on freelancers or waste hours doing it themselves. We asked: What if AI could create videos that are fast, cheap, AND factually accurate?

We built Sniply to fight AI slop by putting real-time research at the heart of video generation. Every video starts with facts, not hallucinations.

What it does

Sniply transforms any topic into a professional explainer video in under 60 seconds:

  1. User inputs a topic (e.g., "How quantum computing works")
  2. AI Research Agent conducts real-time web research to gather current, factual information
  3. Script Agent writes an engaging narration based on actual data
  4. Production Pipeline generates professional voiceover and relevant visuals in parallel
  5. Video Assembly merges everything into a ready-to-publish MP4
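The five steps above can be sketched as a minimal async pipeline. This is an illustrative skeleton, not Sniply's actual code: every function name here is a hypothetical stand-in for the real agents.

```python
import asyncio

# Hypothetical stand-ins for the real agents; bodies are placeholders.
async def research(topic: str) -> str:
    return f"facts about {topic}"          # 2. Research Agent: real-time web search

async def write_script(facts: str) -> str:
    return f"narration based on: {facts}"  # 3. Script Agent: writes from actual data

async def tts(script: str) -> bytes:
    return b"audio"                        # voiceover generation

async def gen_images(script: str) -> list[bytes]:
    return [b"img1", b"img2"]              # visual generation

def assemble(audio: bytes, images: list[bytes]) -> str:
    return "out.mp4"                       # 5. FFmpeg merge step

async def make_video(topic: str) -> str:
    facts = await research(topic)           # research first: facts, not hallucinations
    script = await write_script(facts)
    audio, images = await asyncio.gather(   # 4. voiceover + visuals run in parallel
        tts(script), gen_images(script)
    )
    return assemble(audio, images)
```

The key design point is that research blocks everything downstream, while the two production steps run concurrently to hit the sub-60-second target.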

  • Cost: $0.06 per video (vs $500+ market rate)
  • Time: 45 seconds (vs hours of manual work)
  • Quality: Factually accurate (vs AI hallucinations)

How we built it

Tech Stack:

  • Backend: FastAPI + Pydantic AI for orchestrating multi-agent workflows
  • AI Models: OpenRouter (Google Gemini for research & scripting), OpenAI (TTS for voiceover, DALL-E for images)
  • Frontend: React + Vite for a fast, modern UI
  • Auth: Clerk with webhook sync to Supabase
  • Database: Supabase (PostgreSQL + Storage) with Row Level Security
  • Video Processing: FFmpeg for merging audio/visuals
  • Prompts: Jinja2 templates for dynamic, maintainable AI prompts
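To illustrate the Jinja2 prompt approach, here is a minimal sketch; the template text and variable names are made-up examples, not Sniply's actual prompts.

```python
from jinja2 import Template

# Hypothetical script-agent prompt; real templates would live in their own files.
SCRIPT_PROMPT = Template(
    "Write a {{ duration }}-second narration about {{ topic }}.\n"
    "Use only these researched facts:\n"
    "{% for fact in facts %}- {{ fact }}\n{% endfor %}"
)

prompt = SCRIPT_PROMPT.render(
    duration=45,
    topic="quantum computing",
    facts=["qubits can be in superposition", "entanglement links qubit states"],
)
```

Keeping prompts in templates rather than f-strings scattered through agent code makes them easy to diff, review, and iterate on independently.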

Architecture:

User → Clerk Auth → FastAPI
  ↓
Research Agent (real-time search)
  ↓
Script Agent (factual writing)
  ↓
Parallel: TTS + Image Generation
  ↓
FFmpeg Video Assembly
  ↓
Supabase Storage + Database
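The Clerk → Supabase user sync can be sketched as a webhook handler. The payload shape, table name, and starting-credit value below are all assumptions for illustration; real code must also verify the Svix signature Clerk attaches before trusting the payload.

```python
# Stand-in for the Supabase "users" table (assumed schema).
users_table: dict[str, dict] = {}

def handle_clerk_event(event: dict) -> bool:
    """Process one Clerk webhook event; returns True if a user was synced."""
    if event.get("type") != "user.created":
        return False
    user = event["data"]
    # Upsert keyed on the Clerk user id, so replayed webhooks stay idempotent.
    users_table[user["id"]] = {
        "email": user["email_addresses"][0]["email_address"],
        "credits": 10,  # assumed free starting balance
    }
    return True
```

In the real app this function would sit behind a FastAPI route and write through the Supabase client instead of an in-memory dict.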

Key Innovation: Multi-agent pipeline with research-first approach to ensure factual accuracy.

Challenges we ran into

  1. Research Quality vs Speed: Balancing thorough research with the 45-second generation goal required careful prompt engineering and model selection (Gemini Flash Lite for research, Flash for scripting)
  2. Parallel Processing: Coordinating simultaneous TTS and image generation while handling API rate limits and errors gracefully
  3. Video Sync: Ensuring audio and visuals align perfectly, which required precise timing calculations and FFmpeg parameter tuning
  4. Auth Flow Complexity: Implementing Clerk → Webhook → Supabase sync with proper error handling and user creation edge cases
  5. Cost Optimization: Getting from $0.25/video down to $0.06 by optimizing model choices and prompt efficiency
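Challenge 2 above (parallel generation with graceful rate-limit handling) can be sketched with `asyncio.gather` plus a retry wrapper. The retry counts, delays, and the simulated failing call are invented for illustration.

```python
import asyncio

async def with_retry(coro_fn, *args, attempts: int = 3, base_delay: float = 0.01):
    """Retry a flaky async call with exponential backoff (e.g. on 429s)."""
    for attempt in range(attempts):
        try:
            return await coro_fn(*args)
        except RuntimeError:                 # stand-in for a rate-limit error
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)

calls = {"tts": 0}

async def flaky_tts(script: str) -> bytes:
    calls["tts"] += 1
    if calls["tts"] < 2:                     # simulate one rate-limit failure
        raise RuntimeError("429 Too Many Requests")
    return b"audio"

async def gen_images(script: str) -> list[bytes]:
    return [b"frame1", b"frame2"]

async def produce(script: str):
    # Voiceover and visuals run concurrently; each retries independently,
    # so one provider's rate limit never stalls the other branch.
    return await asyncio.gather(
        with_retry(flaky_tts, script),
        with_retry(gen_images, script),
    )

audio, images = asyncio.run(produce("hello world"))
```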

Accomplishments that we're proud of

✅ Fighting AI Slop: Built a research-first pipeline that generates factual, trustworthy content - not hallucinations

✅ 45-Second Generation: Achieved blazing-fast video creation through parallel processing and optimized workflows

✅ $0.06 Per Video: 100x cheaper than hiring freelancers, making professional video accessible to everyone

✅ Production-Ready: Full authentication, database with RLS, credit system, and error handling - not just a hackathon demo

✅ Clean Architecture: Maintainable codebase with Jinja2 prompt templates, type-safe Pydantic models, and modular agent design

✅ Complete Product: From user signup to video download - the entire flow works end-to-end

What we learned

  • Multi-Agent AI is powerful but complex: Orchestrating research → scripting → generation requires careful state management and error handling
  • Prompt engineering is critical: Jinja2 templates let us iterate quickly and maintain consistency across agents
  • Model selection matters: Gemini Flash Lite for research, Flash for scripting, and OpenAI for TTS gave us the best speed/cost/quality balance
  • Real-time research is feasible: With proper caching and parallel processing, we can do live research without sacrificing speed
  • Auth is hard but essential: Clerk + Supabase webhooks provided enterprise-grade security from day one
  • FFmpeg is magic: Video processing that would otherwise take mountains of custom code comes down to a few well-tuned FFmpeg commands
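As one example of that "few well-tuned commands" point, merging a narration track with a still image can be done by building a standard FFmpeg argument list. The flags below are ordinary FFmpeg options; the file paths are placeholders, and this may differ from Sniply's actual invocation.

```python
def build_merge_cmd(audio: str, image: str, out: str) -> list[str]:
    """Build an ffmpeg command that loops one image under a narration track."""
    return [
        "ffmpeg",
        "-loop", "1",            # repeat the still image for the whole duration
        "-i", image,
        "-i", audio,
        "-c:v", "libx264",       # H.264 video for broad player support
        "-tune", "stillimage",   # encoder tuning for static frames
        "-c:a", "aac",
        "-shortest",             # stop when the audio ends (keeps A/V in sync)
        "-pix_fmt", "yuv420p",   # pixel format many players require
        out,
    ]

cmd = build_merge_cmd("voiceover.mp3", "slide.png", "video.mp4")
# Execute with: subprocess.run(cmd, check=True)
```

`-shortest` is the flag doing the sync work here: it trims the looped video stream to the exact length of the voiceover.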

What's next for sniply.fun

Short-term (next sprint):

  • ✨ Source citations - Display research sources in video descriptions for transparency
  • 🎨 Custom branding - Logo overlays, color schemes, custom fonts
  • 🗣️ Voice selection - Multiple voices, accents, and speaking styles
  • 📊 Analytics dashboard - View counts, engagement metrics, export history

Medium-term:

  • 🌍 Multi-language support - Generate videos in 20+ languages
  • ⏱️ Longer formats - 2-5 minute deep dives with chapters
  • 🎬 Video styles - Documentary, tutorial, listicle formats
  • 🔗 Integrations - YouTube, TikTok, LinkedIn direct publishing

Long-term vision:

  • 🚀 API access - Let developers embed Sniply into their products
  • 🤖 Custom agents - Users can train domain-specific research agents
  • 📚 Knowledge base - Upload your docs for brand-aligned, factual videos
  • 🏢 Enterprise features - Team collaboration, approval workflows, white-labeling

The Big Picture: Become the standard for AI-generated video that people actually trust. In a world of AI slop, Sniply is the mark of quality.
