VoxShift

Inspiration

Manual dubbing and publishing workflows are slow, error-prone, and hard to audit.
We wanted a practical tool that automates multilingual video localization while keeping strict publishing controls.

What it does

VoxShift is a Gemini-first dubbing pipeline that:

Transcribes and translates media
Generates dubbed audio with TTS
Produces output media, subtitles, and segment JSON
Runs YouTube intake risk checks when a source URL is provided
Uploads to a specific YouTube channel with metadata, dry-run validation, and audit manifests

How we built it

Node.js + TypeScript CLI architecture
Gemini API for transcription/translation and TTS
ffmpeg/ffprobe for media processing and muxing
YouTube Data API for intake metadata checks
YouTube OAuth upload flow with channel-ID enforcement
CI-style checks with typecheck, build, and smoke tests
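
As a rough illustration of the muxing step, the sketch below builds an ffmpeg argument list that keeps the original video stream and swaps in the dubbed audio track. The helper name buildMuxArgs, the MuxInput shape, and the file names are hypothetical; the exact ffmpeg invocation VoxShift uses is not stated here, so treat this as one plausible arrangement of standard ffmpeg options.

```typescript
// Hypothetical input shape for the mux step (not taken from the VoxShift source).
interface MuxInput {
  videoPath: string;       // original media file
  dubbedAudioPath: string; // audio produced by the TTS step
  outputPath: string;      // final dubbed media file
}

// Builds ffmpeg arguments that copy the video stream untouched and
// replace the audio with the dubbed track.
function buildMuxArgs({ videoPath, dubbedAudioPath, outputPath }: MuxInput): string[] {
  return [
    "-y",                  // overwrite the output file if it exists
    "-i", videoPath,       // input 0: original video
    "-i", dubbedAudioPath, // input 1: dubbed audio
    "-map", "0:v:0",       // take the first video stream from input 0
    "-map", "1:a:0",       // take the first audio stream from input 1
    "-c:v", "copy",        // do not re-encode video
    "-c:a", "aac",         // encode audio as AAC
    "-shortest",           // stop at the end of the shorter stream
    outputPath,
  ];
}

// Example with placeholder file names.
const muxArgs = buildMuxArgs({
  videoPath: "input.mp4",
  dubbedAudioPath: "dub.es.wav",
  outputPath: "output.es.mp4",
});
console.log(["ffmpeg", ...muxArgs].join(" "));
```

Returning the arguments as an array, rather than a single shell string, lets a caller hand them to a process spawner without shell quoting problems.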

Challenges we ran into

OAuth scope mismatches (youtube.upload versus the scopes needed for channel verification)
Handling structured model output reliably across edge cases
Keeping upload automation flexible without weakening safety
Managing API auth differences between Gemini and YouTube APIs
Designing duplicate protection and idempotent run behavior
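
One way to approach the duplicate-protection item above is to derive a deterministic key for each run and check it against earlier manifest entries before uploading. The sketch below assumes a manifest held as an in-memory array; the RunRequest and ManifestEntry shapes, the key format, and the function names are all hypothetical and are not VoxShift's actual manifest schema.

```typescript
// Hypothetical description of what a run is asked to publish.
interface RunRequest {
  sourceId: string;       // e.g. a source file name or video ID
  targetLanguage: string; // e.g. "es"
  channelId: string;      // target YouTube channel
}

// Hypothetical manifest record written after each run.
interface ManifestEntry {
  runKey: string;
  status: "dry-run" | "uploaded" | "failed";
  videoId?: string;
}

// Same inputs always produce the same key, so repeated runs can be detected.
function makeRunKey(req: RunRequest): string {
  return [req.channelId, req.sourceId.trim(), req.targetLanguage.toLowerCase()].join("|");
}

// Only a completed upload counts as a duplicate; dry runs and failures may be retried.
function findPriorUpload(manifest: ManifestEntry[], req: RunRequest): ManifestEntry | undefined {
  const key = makeRunKey(req);
  return manifest.find((entry) => entry.runKey === key && entry.status === "uploaded");
}

// Example with placeholder identifiers.
const request: RunRequest = { sourceId: "talk-01", targetLanguage: "ES", channelId: "UC_example" };
const manifest: ManifestEntry[] = [
  { runKey: makeRunKey(request), status: "dry-run" },
];
console.log(findPriorUpload(manifest, request) ? "skip: already uploaded" : "ok to proceed");
```

Treating only the uploaded status as a duplicate keeps re-runs idempotent while still allowing a failed or dry-run attempt to be repeated.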

Accomplishments that we're proud of

End-to-end dubbing pipeline with production-style outputs
youtube:run supports both pipeline mode and upload-only mode
Optional --source-url with policy-based intake checks
Strong safety controls: target channel enforcement, dry-run upload, manifest trail
A real-speech test fixture and automated smoke tests, including model-variant checks
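
The safety controls listed above can be pictured as a single gate that runs before any upload call. The sketch below is a minimal version under stated assumptions: the UploadContext shape and decideUpload function are hypothetical, and the authenticated channel ID is assumed to have been resolved elsewhere (for example through the YouTube Data API channels.list call with mine=true, which is not shown).

```typescript
// Possible outcomes of the pre-upload gate.
type UploadDecision =
  | { action: "abort"; reason: string }
  | { action: "dry-run"; reason: string }
  | { action: "upload" };

// Hypothetical inputs to the gate.
interface UploadContext {
  expectedChannelId: string;             // channel the operator configured
  authenticatedChannelId: string | null; // channel the OAuth token resolves to, if known
  dryRun: boolean;                       // validate only, do not publish
}

function decideUpload(ctx: UploadContext): UploadDecision {
  // Refuse to publish when the authenticated channel cannot be confirmed.
  if (ctx.authenticatedChannelId === null) {
    return { action: "abort", reason: "authenticated channel could not be resolved" };
  }
  // Refuse to publish to any channel other than the configured target.
  if (ctx.authenticatedChannelId !== ctx.expectedChannelId) {
    return { action: "abort", reason: "authenticated channel does not match target channel" };
  }
  // Channel is correct, but the operator asked for validation only.
  if (ctx.dryRun) {
    return { action: "dry-run", reason: "dry-run requested; nothing will be uploaded" };
  }
  return { action: "upload" };
}

// Example with placeholder channel IDs.
const decision = decideUpload({
  expectedChannelId: "UC_example_target",
  authenticatedChannelId: "UC_example_target",
  dryRun: true,
});
console.log(decision.action);
```

Checking the channel before the dry-run flag means a dry run also reports a channel mismatch, so the validation pass exercises the same gate as a real upload.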

What we learned

Automation needs guardrails as much as speed
Strong schemas and validation save time in LLM-driven pipelines
Channel-level publishing checks are essential for real operations
Dry-run validation plus manifest logging makes runs far easier to trust and debug
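
To make the point about schemas and validation concrete, here is a minimal sketch of checking model output before it enters the rest of the pipeline. The Segment fields (start, end, text) and the function names are assumptions for illustration; the actual segment JSON schema used by VoxShift is not given here, and a schema library could replace the hand-written guard.

```typescript
// Hypothetical shape of one transcribed or translated segment.
interface Segment {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

type ParseResult =
  | { ok: true; segments: Segment[] }
  | { ok: false; error: string };

// Type guard: accepts only objects with sane timings and non-empty text.
function isSegment(value: unknown): value is Segment {
  if (typeof value !== "object" || value === null) return false;
  const { start, end, text } = value as Record<string, unknown>;
  return (
    typeof start === "number" && Number.isFinite(start) && start >= 0 &&
    typeof end === "number" && Number.isFinite(end) && end > start &&
    typeof text === "string" && text.trim().length > 0
  );
}

// Parses raw model output and reports the first problem instead of throwing.
function parseSegments(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "output is not valid JSON" };
  }
  if (!Array.isArray(data)) {
    return { ok: false, error: "expected a JSON array of segments" };
  }
  const segments: Segment[] = [];
  for (let i = 0; i < data.length; i++) {
    const item: unknown = data[i];
    if (!isSegment(item)) {
      return { ok: false, error: `segment ${i} is malformed` };
    }
    segments.push(item);
  }
  return { ok: true, segments };
}

// Example: a well-formed response and a response with inverted timings.
console.log(parseSegments('[{"start":0,"end":1.5,"text":"Hola"}]').ok);
console.log(parseSegments('[{"start":2,"end":1,"text":"Hola"}]').ok);
```

Returning a result object rather than throwing lets the caller decide whether to retry the model call, fall back, or record the failure in the manifest.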

What's next for VoxShift

Add rights-aware source ingestion workflow (with explicit policy gates)
Improve dubbing quality (speaker consistency, pacing, prosody control)
Add batch job orchestration and queue-based processing
Build a lightweight UI on top of the CLI engine
Expand monitoring, retry logic, and publish-state observability

Updates

Private user started this project — Feb 08, 2026 02:50 PM EST