Prompt Reels

Inspiration

AI models are only as good as their prompts—but prompt engineering is tedious, manual work. We asked: what if prompts could optimize themselves across multiple deployments, learning from real-world usage? News videos are perfect for this: they need accurate scene descriptions AND relevance ratings. We built Prompt Reels to prove Federated Prompt Optimization (FPO) can make AI continuously improve at both tasks without centralized training.

What it does

Prompt Reels automatically fetches news articles with videos, analyzes them using AI, and learns to do it better over time:

Fetches news with videos (Tavily API + BrowserBase scraping)
Downloads videos, including HLS streams
Describes video scenes with AI vision (Gemini 2.5 Pro / Azure GPT-4o)
Transcribes dialogue with Whisper
Rates video-article match (0-100 score)
Optimizes prompts across deployments using FPO

Two optimization loops run continuously:

Scene Description FPO: Learns to describe video content more accurately
Match Rating FPO: Learns to better evaluate video-article relevance
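To make the idea of a loop concrete, here is a minimal sketch of a versioned prompt store that keeps feedback scores per prompt and selects the best-scoring version. It is an illustration only: the write-up does not show the project's data model, so every name here (createStore, addCandidate, recordFeedback, bestPrompt) and the use of a mean score are assumptions.

```javascript
// Hypothetical sketch of a prompt store for one optimization loop.
// All field and function names are illustrative, not taken from the project.

function createStore(baselineText) {
  return { prompts: [{ id: 1, text: baselineText, scores: [] }] };
}

// Add a candidate prompt (for example one proposed locally or received from another node).
// Identical text is not stored twice.
function addCandidate(store, text) {
  const existing = store.prompts.find((p) => p.text === text);
  if (existing) return existing;
  const id = Math.max(...store.prompts.map((p) => p.id)) + 1;
  const candidate = { id, text, scores: [] };
  store.prompts.push(candidate);
  return candidate;
}

// Record one feedback score (assumed to be on a 0-100 scale) for a prompt version.
function recordFeedback(store, promptId, score) {
  const prompt = store.prompts.find((p) => p.id === promptId);
  if (!prompt) throw new Error(`Unknown prompt id: ${promptId}`);
  prompt.scores.push(score);
}

// Mean score of a prompt; prompts without feedback rank lowest.
function meanScore(prompt) {
  if (prompt.scores.length === 0) return -Infinity;
  return prompt.scores.reduce((sum, s) => sum + s, 0) / prompt.scores.length;
}

// The prompt to use next: the best-scoring version so far.
function bestPrompt(store) {
  return store.prompts.reduce((best, p) => (meanScore(p) > meanScore(best) ? p : best));
}

const store = createStore('Describe what is visible in these frames.');
const v2 = addCandidate(store, 'Describe the people, setting, and on-screen text in these frames.');
recordFeedback(store, 1, 60);
recordFeedback(store, v2.id, 80);

// The store is plain data, so it survives a JSON round trip for storage or sharing.
const restored = JSON.parse(JSON.stringify(store));
console.log(bestPrompt(restored).id); // → 2
```

Because the store is plain data, the same structure could be written to a JSON file or sent to another deployment, which would then merge candidates with addCandidate.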

The system tracks everything with W&B Weave, providing full observability into prompt evolution and performance improvements.

How we built it

Tech Stack:

Backend: Node.js + Express
AI: Google Gemini 2.5 Pro, Azure OpenAI (GPT-4o, Whisper)
Video Processing: FFmpeg (scene detection, frame extraction, audio extraction for transcription)
News APIs: Tavily (search), BrowserBase (headless browser for video extraction)
Tracking: Weights & Biases Weave
Deployment: Docker on VPS (reels.hurated.com)

Architecture:

News Fetcher: Requests progressively larger batches (3→6→12→96 articles) until it finds an article with a video
Video Downloader: Detects and handles HLS streams with ffmpeg
Scene Analyzer: FFmpeg scene detection → frame extraction → AI description + Whisper transcription
FPO Engine: Stores prompts in JSON, iterates with feedback, shares improvements across nodes
Workflow Manager: Tracks articles through states (fetched → described → rated)
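The News Fetcher's widening search can be sketched roughly as follows. This is a sketch under assumptions: the write-up lists batch sizes 3, 6, 12 and 96, and the code assumes the size doubles at each step in between. The search and hasVideo callbacks are hypothetical stand-ins for the Tavily search and the video check.

```javascript
// Hypothetical sketch: request larger and larger batches until a video turns up.
// Assumption: the batch size doubles each step, from 3 up to a cap of 96.
function* batchSizes(start = 3, max = 96) {
  for (let size = start; size <= max; size *= 2) yield size;
}

// `search(query, size)` resolves to an array of articles.
// `hasVideo(article)` returns true when the article carries a video.
// Both are supplied by the caller and are illustrative only.
async function findArticleWithVideo(search, hasVideo, query) {
  for (const size of batchSizes()) {
    const articles = await search(query, size);
    const match = articles.find((article) => hasVideo(article));
    if (match) return { article: match, requested: size };
  }
  return null; // nothing found even at the largest batch size
}
```

Starting small keeps the common case cheap, and the cap guarantees the loop ends even when no article has a video.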

Key Features:

Duplicate prevention (URL-based across sessions)
Provider failover (Azure ↔ Gemini with retry limits)
Markdown rendering on article pages
Interactive scene viewer with video playback
Clean text extraction (removes junk like "watch now VIDEO05:11")
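As an illustration of the text cleanup, here is one way such a filter could look. The project's actual regex is not shown in this write-up, so the pattern below is an assumption built from the single example given ("watch now VIDEO05:11").

```javascript
// Hypothetical cleanup filter; the pattern is inferred from one example only.
// Matches an optional "watch now", then "VIDEO" followed by an m:ss or mm:ss duration.
const VIDEO_TEASER = /(?:[Ww]atch now\s*)?VIDEO\s*\d{1,2}:\d{2}/g;

function cleanArticleText(text) {
  return text
    .replace(VIDEO_TEASER, ' ')  // drop the teaser, leave a separator
    .replace(/[ \t]{2,}/g, ' ')  // collapse the spaces left behind
    .replace(/ +\n/g, '\n')      // no trailing spaces before newlines
    .trim();
}

console.log(cleanArticleText('Markets rallied. watch now VIDEO05:11 Stocks closed higher.'));
// → Markets rallied. Stocks closed higher.
```

Matching "VIDEO" only in upper case is a deliberate choice in this sketch, so that ordinary sentences mentioning a video and a time are left alone.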

Challenges we ran into

HLS Video Streams: ABC News and many sites use .m3u8 playlists, not direct MP4s. Our axios downloader saved 1.9KB playlist files, causing ffprobe to fail with "moov atom not found." We fixed it by detecting HLS URLs and using ffmpeg to download/stitch segments.
Infinite Recursion Bug: Provider failover had no retry limit. When both Gemini and Azure failed, the system kept switching providers indefinitely and crashed with a stack overflow. We added MAX_RETRIES = 1 and a retry counter.
Missing Scene Parameters: We called describeScene(framePaths), but the function expects four parameters. The missing values caused undefined.toFixed() errors. We fixed it by also passing sceneId, start, and end.
Duplicate Articles: listArticles() returned source as a string, not an object. The code read source.url on that string, got undefined, and so never detected duplicates. We changed it to return a {domain, url} object.
Junk Text Extraction: CNBC articles included "watch now VIDEO05:11" patterns from related videos. We created a regex-based cleanArticleText() to remove them.
Video Not Showing in Scene Viewer: Scene data didn't store videoPath, and the fallback used the wrong directory. We fixed it by storing the path and pointing the fallback at uploads/articles/{id}.mp4.
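The retry limit from the recursion fix can be sketched like this. Only MAX_RETRIES = 1 and the idea of a retry counter come from the write-up. The function name, the provider objects, and their complete() method are hypothetical.

```javascript
// Hypothetical sketch of bounded provider failover.
// MAX_RETRIES = 1 is from the write-up; all other names are illustrative.
const MAX_RETRIES = 1;

// `providers` is a two-element array: [primary, fallback].
// Each provider is assumed to expose an async complete(request) method.
async function callWithFailover(providers, request, retryCount = 0) {
  const [primary, fallback] = providers;
  try {
    return await primary.complete(request);
  } catch (error) {
    // Stop once the retry budget is spent instead of switching forever.
    if (retryCount >= MAX_RETRIES || !fallback) {
      throw new Error(`All providers failed after ${retryCount + 1} attempt(s): ${error.message}`);
    }
    // Swap the order so the fallback is tried next, and count the retry.
    return callWithFailover([fallback, primary], request, retryCount + 1);
  }
}
```

With MAX_RETRIES = 1 each provider is tried at most once per request, so the recursion depth is bounded at two calls.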

Accomplishments that we're proud of

Federated Prompt Optimization: Implemented a working FPO system with prompt versioning, feedback loops, and cross-deployment sharing

Full Pipeline: End-to-end workflow from news search → video download → AI analysis → rating, all automated

Audio + Visual Analysis: Combined computer vision (scene description) with speech recognition (Whisper) for richer context

Production-Ready: Deployed live at reels.hurated.com with Docker, proper error handling, and monitoring

Smart Video Handling: Automatically detects and downloads HLS streams that typical scrapers miss
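A minimal sketch of the HLS check, assuming detection is based on the .m3u8 extension in the URL path. The helper names are hypothetical, and the ffmpeg argument list (stream copy into a single output file) is one common way to stitch segments, not necessarily the one used in the project.

```javascript
// Hypothetical helpers; names and argument choices are illustrative.

// Treat a URL as HLS when its path ends in .m3u8, ignoring any query string.
function isHlsUrl(url) {
  const { pathname } = new URL(url);
  return pathname.toLowerCase().endsWith('.m3u8');
}

// Arguments for an ffmpeg run that reads the playlist and copies the
// streams into one output file without re-encoding.
function ffmpegHlsArgs(playlistUrl, outputPath) {
  return ['-y', '-i', playlistUrl, '-c', 'copy', outputPath];
}

console.log(isHlsUrl('https://example.com/video/master.m3u8?token=abc')); // → true
console.log(isHlsUrl('https://example.com/video/clip.mp4'));              // → false
```

The argument array can be passed to child_process.spawn('ffmpeg', args), which avoids shell quoting problems with URLs that contain query strings.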

Zero Duplicate Strategy: Prevents fetching the same articles across sessions using URL deduplication
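URL deduplication could look roughly like the sketch below. The normalization rules (dropping the fragment, utm_ tracking parameters, a leading "www." and a trailing slash) are assumptions for illustration, as are the function names. The article shape with source.url follows the {domain, url} object described in the challenges section.

```javascript
// Hypothetical sketch of URL-based duplicate detection.
// The normalization rules are assumptions, not the project's documented behavior.
function normalizeUrl(rawUrl) {
  const url = new URL(rawUrl);
  url.hash = '';
  // Copy the keys first so deleting while iterating is safe.
  for (const key of [...url.searchParams.keys()]) {
    if (key.startsWith('utm_')) url.searchParams.delete(key);
  }
  url.hostname = url.hostname.replace(/^www\./, '');
  return url.toString().replace(/\/$/, '');
}

// Keep only articles whose normalized URL has not been seen before.
// `seenUrls` is a Set that the caller persists between sessions.
function filterNewArticles(articles, seenUrls) {
  const fresh = [];
  for (const article of articles) {
    const key = normalizeUrl(article.source.url);
    if (seenUrls.has(key)) continue;
    seenUrls.add(key);
    fresh.push(article);
  }
  return fresh;
}

const seen = new Set();
const batch = [
  { title: 'Story', source: { domain: 'example.com', url: 'https://www.example.com/news/story/?utm_source=x#top' } },
  { title: 'Story again', source: { domain: 'example.com', url: 'https://example.com/news/story' } },
];
console.log(filterNewArticles(batch, seen).length); // → 1
```

To carry the set across sessions it can be saved as a JSON array with JSON.stringify([...seen]) and rebuilt with new Set(JSON.parse(text)).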

Dual AI Provider: Seamless failover between Gemini and Azure with intelligent retry logic

Clean UX: Dashboard, article pages, and scene viewer with proper navigation and markdown rendering

What we learned

Federated Learning IRL: FPO is more than theory—it works! We saw prompts genuinely improve across iterations. The key is having good feedback signals (scene accuracy, match scores) that guide optimization.

Prompt Engineering is Critical: Small changes in prompts drastically affect output quality. FPO automates this iteration, but you still need strong baseline prompts and clear evaluation criteria.

Video is Messy: Every news site handles video differently (direct MP4, HLS, YouTube embeds, JSON-LD metadata). Robust extraction requires multiple strategies and graceful fallbacks.

AI Observability Matters: W&B Weave was essential for debugging. Seeing exact prompts, responses, and latencies helped us identify infinite loops and parameter issues fast.

Error Handling Saves Lives: Production AI systems fail in creative ways. We hit infinite recursion, undefined parameters, and missing files, so defensive coding and retry limits are non-negotiable.

Audio Adds Context: Transcripts significantly improved video-article matching. News reporters often say things not captured in visuals (names, details, context).

What's next for Prompt Reels

Multi-Node FPO: Deploy across multiple servers and implement true federated optimization with prompt sharing and voting

Evaluation Metrics: Add ground truth labels and automated scoring for scene descriptions and ratings

Fine-Tuned Models: Use FPO-optimized prompts to create training data for fine-tuning smaller, faster models

More Sources: Expand beyond news to YouTube, podcasts, educational content

Search & Discovery: Build semantic search over transcripts and descriptions

Mobile App: Native apps for browsing and discovering relevant video content

Agent Workflows: Let AI agents automatically find, analyze, and curate video collections based on topics

Reel Generation: Automatically create short-form reels from long videos using scene analysis and transcripts

Monetization: Partner with news orgs to provide video relevance scoring and content recommendations

Built With

browserbase
docker
express.js
ffmpeg
google-gemini-2.5-pro
gpt-4.1
node.js
openai-gpt-4o
tavily-api
w&b
whisper

Updates

Denis Bystruev started this project — Oct 12, 2025 04:29 PM EDT
