Inspiration
GitHub branch: https://github.com/mrlancelot/sniply/tree/audio-maya-geni
The internet is drowning in AI-generated slop: generic, hallucinated content that erodes trust. At the same time, creators face a painful choice: spend $500+ per video on freelancers or lose hours doing it themselves. We asked: what if AI could create videos that are fast, cheap, AND factually accurate?
We built Sniply to fight AI slop by putting real-time research at the heart of video generation. Every video starts with facts, not hallucinations.
What it does
Sniply transforms any topic into a professional explainer video in under 60 seconds:
- User inputs a topic (e.g., "How quantum computing works")
- AI Research Agent conducts real-time web research to gather current, factual information
- Script Agent writes an engaging narration based on actual data
- Production Pipeline generates professional voiceover and relevant visuals in parallel
- Video Assembly merges everything into a ready-to-publish MP4
- Cost: $0.06 per video (vs $500+ market rate)
- Time: 45 seconds (vs hours of manual work)
- Quality: Factually accurate (vs AI hallucinations)
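The four stages above can be sketched as an async pipeline. This is a minimal illustration, not Sniply's actual code: the agent functions are placeholder stand-ins, and only the control flow (research first, then scripting, then parallel production, then assembly) reflects the description above.

```python
import asyncio

# Placeholder agents; the real versions call Gemini (research, scripting),
# OpenAI TTS/DALL-E (production), and FFmpeg (assembly).
async def research_agent(topic: str) -> str:
    return f"facts about {topic}"          # stage 1: real-time web research

async def script_agent(facts: str) -> str:
    return f"script grounded in: {facts}"  # stage 2: narration from real data

async def generate_voiceover(script: str) -> str:
    return "voiceover.mp3"                 # stage 3a: TTS

async def generate_visuals(script: str) -> list[str]:
    return ["scene1.png", "scene2.png"]    # stage 3b: image generation

async def assemble_video(audio: str, images: list[str]) -> str:
    return "output.mp4"                    # stage 4: FFmpeg merge

async def create_video(topic: str) -> str:
    facts = await research_agent(topic)    # facts before anything else
    script = await script_agent(facts)
    audio, images = await asyncio.gather(  # TTS and visuals run in parallel
        generate_voiceover(script),
        generate_visuals(script),
    )
    return await assemble_video(audio, images)

print(asyncio.run(create_video("How quantum computing works")))
```

The research-before-scripting ordering is the key point: the script agent only ever sees gathered facts, never the raw topic alone.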
How we built it
Tech Stack:
- Backend: FastAPI + Pydantic AI for orchestrating multi-agent workflows
- AI Models: OpenRouter (Google Gemini for research & scripting), OpenAI (TTS for voiceover, DALL-E for images)
- Frontend: React + Vite for a fast, modern UI
- Auth: Clerk with webhook sync to Supabase
- Database: Supabase (PostgreSQL + Storage) with Row Level Security
- Video Processing: FFmpeg for merging audio/visuals
- Prompts: Jinja2 templates for dynamic, maintainable AI prompts
Architecture:
User → Clerk Auth → FastAPI
→ Research Agent (real-time search)
→ Script Agent (factual writing)
→ Parallel: TTS + Image Generation
→ FFmpeg Video Assembly
→ Supabase Storage + Database
Key Innovation: A multi-agent pipeline with a research-first approach that ensures factual accuracy.
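To illustrate the Jinja2 prompt templating mentioned in the stack: a template like the one below keeps prompts maintainable and lets variables be swapped per request. The variable names (`topic`, `tone`, `facts`) are illustrative assumptions, not Sniply's actual template.

```python
from jinja2 import Template

# Hypothetical script-agent prompt; grounding the model in researched
# facts is the point, the exact wording here is made up.
SCRIPT_PROMPT = Template("""\
You are a scriptwriter. Write a {{ tone }} 45-second narration
about "{{ topic }}" using ONLY the facts below. Do not invent details.

Facts:
{% for fact in facts -%}
- {{ fact }}
{% endfor %}""")

prompt = SCRIPT_PROMPT.render(
    topic="How quantum computing works",
    tone="engaging",
    facts=[
        "Qubits can exist in superposition",
        "Entanglement links qubit states",
    ],
)
print(prompt)
```

Because the template is plain text in version control, prompt changes can be reviewed and iterated on like any other code.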
Challenges we ran into
- Research Quality vs Speed: Balancing thorough research with the 45-second generation goal required careful prompt engineering and model selection (Gemini Flash Lite for research, Flash for scripting)
- Parallel Processing: Coordinating simultaneous TTS and image generation while handling API rate limits and errors gracefully
- Video Sync: Ensuring audio and visuals align perfectly required precise timing calculations and FFmpeg parameter tuning
- Auth Flow Complexity: Implementing Clerk → Webhook → Supabase sync with proper error handling and user creation edge cases
- Cost Optimization: Getting from $0.25/video down to $0.06 by optimizing model choices and prompt efficiency
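The parallel-processing challenge above comes down to running TTS and image generation concurrently while surviving rate limits. Here is a hedged sketch of one way to do it with `asyncio.gather` plus exponential backoff; `RateLimitError` and the generator functions are placeholders, not the real API clients.

```python
import asyncio

class RateLimitError(Exception):
    """Stand-in for a provider's 429 response."""

async def with_retries(coro_factory, attempts=3, base_delay=0.01):
    """Retry an async call with exponential backoff on rate limits."""
    for attempt in range(attempts):
        try:
            return await coro_factory()
        except RateLimitError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error
            await asyncio.sleep(base_delay * 2 ** attempt)

calls = {"tts": 0}

async def flaky_tts():
    calls["tts"] += 1
    if calls["tts"] < 3:       # simulate two rate-limit failures
        raise RateLimitError()
    return "voiceover.mp3"

async def image_gen():
    return ["scene.png"]

async def produce():
    # Both branches run concurrently; each retries independently.
    audio, images = await asyncio.gather(
        with_retries(flaky_tts),
        with_retries(image_gen),
    )
    return audio, images

print(asyncio.run(produce()))
```

Retrying inside each branch (rather than around the whole `gather`) means one flaky provider doesn't force re-running the other.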
Accomplishments that we're proud of
✅ Fighting AI Slop: Built a research-first pipeline that generates factual, trustworthy content - not hallucinations
✅ 45-Second Generation: Achieved blazing-fast video creation through parallel processing and optimized workflows
✅ $0.06 Per Video: 100x cheaper than hiring freelancers, making professional video accessible to everyone
✅ Production-Ready: Full authentication, database with RLS, credit system, and error handling - not just a hackathon demo
✅ Clean Architecture: Maintainable codebase with Jinja2 prompt templates, type-safe Pydantic models, and modular agent design
✅ Complete Product: From user signup to video download - the entire flow works end-to-end
What we learned
- Multi-Agent AI is powerful but complex: Orchestrating research → scripting → generation requires careful state management and error handling
- Prompt engineering is critical: Jinja2 templates let us iterate quickly and maintain consistency across agents
- Model selection matters: Gemini Flash Lite for research, Flash for scripting, and OpenAI for TTS gave us the best speed/cost/quality balance
- Real-time research is feasible: With proper caching and parallel processing, we can do live research without sacrificing speed
- Auth is hard but essential: Clerk + Supabase webhooks provided enterprise-grade security from day one
- FFmpeg is magic: Video processing that would otherwise take hundreds of lines of custom code comes down to a few well-tuned FFmpeg commands
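As a rough illustration of that last point, the command below shows each still image for an equal slice of the voiceover's duration and concatenates them over the audio track. The flags follow standard FFmpeg usage, but the exact parameters Sniply tunes are assumptions; the command is only constructed here, not executed.

```python
def build_ffmpeg_cmd(images, audio, audio_seconds, out="video.mp4"):
    """Build an FFmpeg command pairing looped stills with one audio track."""
    per_image = audio_seconds / len(images)  # equal screen time per visual
    cmd = ["ffmpeg", "-y"]
    for img in images:
        # Loop each still for its slice of the timeline.
        cmd += ["-loop", "1", "-t", f"{per_image:.2f}", "-i", img]
    cmd += ["-i", audio]
    n = len(images)
    # Concatenate the image streams, then map the audio input alongside.
    filt = "".join(f"[{i}:v]" for i in range(n)) + f"concat=n={n}:v=1:a=0[v]"
    cmd += [
        "-filter_complex", filt,
        "-map", "[v]", "-map", f"{n}:a",
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-shortest", out,
    ]
    return cmd

cmd = build_ffmpeg_cmd(["a.png", "b.png"], "voice.mp3", 45.0)
print(" ".join(cmd))
```

With a 45-second voiceover and two images, each still gets 22.50 seconds; `-shortest` trims the output to the audio length so the tracks stay in sync.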
What's next for sniply.fun
Short-term (next sprint):
- ✨ Source citations - Display research sources in video descriptions for transparency
- 🎨 Custom branding - Logo overlays, color schemes, custom fonts
- 🗣️ Voice selection - Multiple voices, accents, and speaking styles
- 📊 Analytics dashboard - View counts, engagement metrics, export history
Medium-term:
- 🌍 Multi-language support - Generate videos in 20+ languages
- ⏱️ Longer formats - 2-5 minute deep dives with chapters
- 🎬 Video styles - Documentary, tutorial, listicle formats
- 🔗 Integrations - YouTube, TikTok, LinkedIn direct publishing
Long-term vision:
- 🚀 API access - Let developers embed Sniply into their products
- 🤖 Custom agents - Users can train domain-specific research agents
- 📚 Knowledge base - Upload your docs for brand-aligned, factual videos
- 🏢 Enterprise features - Team collaboration, approval workflows, white-labeling
The Big Picture: Become the standard for AI-generated video that people actually trust. In a world of AI slop, Sniply is the mark of quality.