ShotSpec – JSON-Native Visual Control for FIBO
Inspiration
Modern image generation tools are powerful, but they are hard to control in real production workflows. A small change, such as adjusting the camera POV, can unexpectedly alter the subject, lighting, or overall look. Creative teams don’t just need images; they need control, continuity, and accountability.
We were inspired by how film, advertising, and game studios actually work: shots are defined by specifications, not vague prompts. Camera, lighting, identity, and color are intentional decisions that must persist across iterations. ShotSpec was built to bring that production mindset to visual AI using FIBO’s structured generation capabilities.
What it does
ShotSpec is a JSON-native creative control system built on top of Bria’s FIBO model.
Instead of generating isolated images, ShotSpec treats every output as the result of a versioned shot specification. Each specification explicitly defines:
- Subject identity (anchored and lockable)
- Camera POV, FOV, and framing
- Composition rules
- Lighting setup
- Color palette with HDR / 16-bit intent
- Output constraints and seed handling
ShotSpec allows creators to safely change parameters like camera POV without changing the subject, compare versions visually and structurally, and understand exactly what changed and why.
FIBO generates images.
ShotSpec manages creative intent.
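To make the idea of a versioned shot specification concrete, here is a minimal sketch of what such a spec might look like in TypeScript. Every field name, the `ShotSpec` interface, and the `withCamera` helper are illustrative assumptions, not the project's actual schema.

```typescript
// Hypothetical shape of a shot specification; all field names are assumptions.
interface ShotSpec {
  version: number;
  identity: { anchor: string; locked: boolean };
  camera: { pov: string; fovDegrees: number; framing: string };
  composition: { rules: string[] };
  lighting: { setup: string };
  color: { palette: string[]; bitDepth: 8 | 16; hdr: boolean };
  output: { width: number; height: number; seed: number | null };
}

const exampleSpec: ShotSpec = {
  version: 1,
  identity: { anchor: "hero-character", locked: true },
  camera: { pov: "eye-level", fovDegrees: 35, framing: "medium" },
  composition: { rules: ["rule-of-thirds"] },
  lighting: { setup: "soft key from the left" },
  color: { palette: ["#0f4c5c", "#e36414"], bitDepth: 16, hdr: true },
  output: { width: 1024, height: 1024, seed: 42 },
};

// Changing only the camera produces a new version; every other block
// is carried over by reference and therefore cannot drift.
function withCamera(spec: ShotSpec, camera: ShotSpec["camera"]): ShotSpec {
  return { ...spec, version: spec.version + 1, camera };
}
```

Because each concern lives in its own block, a camera change touches exactly one key of the spec, which is what makes version-to-version comparison straightforward.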
How we built it
ShotSpec is a full-stack web application built with:
- Next.js + TypeScript for the UI and API layer
- Monaco Editor for schema-validated JSON editing
- Prisma + SQLite for versioned shot storage
- Zustand for predictable state management
- Bria’s FIBO model, served through the Hugging Face Inference API, as the image generation engine
We designed a deterministic translation layer that converts structured ShotSpec JSON into constrained FIBO inputs. Identity, camera, lighting, and color are handled as separate, canonical blocks, ensuring that changing one does not unintentionally affect the others.
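As an illustration of what a deterministic translation layer can look like, the sketch below renders each block in a fixed order with sorted keys, so two specs with the same content always produce the same string. The `Spec` type, the block names, and the output format are assumptions made for this example; the real FIBO input format is not shown here.

```typescript
type Block = Record<string, string | number | boolean>;
type BlockName = "identity" | "camera" | "lighting" | "color";
type Spec = Record<BlockName, Block>;

// Fixed block order: the output never depends on how the JSON was authored.
const BLOCK_ORDER: BlockName[] = ["identity", "camera", "lighting", "color"];

// Render one block with its keys sorted, so key order in the source is irrelevant.
function renderBlock(name: string, block: Block): string {
  const parts = Object.keys(block)
    .sort()
    .map((key) => `${key}=${String(block[key])}`);
  return `${name}: ${parts.join(", ")}`;
}

// Each block is rendered independently, so editing one block
// changes only its own segment of the result.
function toPrompt(spec: Spec): string {
  return BLOCK_ORDER.map((name) => renderBlock(name, spec[name])).join("; ");
}

const prompt = toPrompt({
  camera: { pov: "low", fov: 35 },
  identity: { name: "hero" },
  color: { palette: "teal" },
  lighting: { key: "soft" },
});
// prompt === "identity: name=hero; camera: fov=35, pov=low; lighting: key=soft; color: palette=teal"
```

The design choice worth noting is that canonicalization happens before any text reaches the model, which removes authoring order and whitespace as sources of variance.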
To keep demos reliable, ShotSpec includes a mock inference provider that mirrors real generation behavior. The full FIBO integration is implemented server-side and can be activated when inference credits or local compute are available.
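One common way to structure this kind of split is a shared provider interface with a mock and a live implementation. The sketch below is an assumption about how that could be arranged, not the project's code: `InferenceProvider`, `MockProvider`, `LiveProvider`, and `selectProvider` are hypothetical names, and `callModel` stands in for whatever function actually calls FIBO.

```typescript
interface GenerationRequest {
  prompt: string;
  seed: number;
}

interface GenerationResult {
  imageRef: string;
  provider: "mock" | "live";
}

// Both providers satisfy the same contract, so the UI never needs to know which one ran.
interface InferenceProvider {
  generate(request: GenerationRequest): Promise<GenerationResult>;
}

// Mock provider: returns a placeholder reference derived from the seed, with no network call.
class MockProvider implements InferenceProvider {
  async generate(request: GenerationRequest): Promise<GenerationResult> {
    return { imageRef: `mock://shot/${request.seed}`, provider: "mock" };
  }
}

// Hypothetical stand-in for the real model call; resolves to an image reference.
type ModelCall = (request: GenerationRequest) => Promise<string>;

class LiveProvider implements InferenceProvider {
  private readonly callModel: ModelCall;

  constructor(callModel: ModelCall) {
    this.callModel = callModel;
  }

  async generate(request: GenerationRequest): Promise<GenerationResult> {
    const imageRef = await this.callModel(request);
    return { imageRef, provider: "live" };
  }
}

// Fall back to the mock whenever no live model call is configured.
function selectProvider(callModel?: ModelCall): InferenceProvider {
  return callModel ? new LiveProvider(callModel) : new MockProvider();
}
```

With this arrangement, switching from demo mode to live inference is a configuration decision made in one place rather than a change to the creative-control code.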
Challenges we ran into
Identity drift across shots
Changing POV caused the subject to change. We solved this by introducing explicit, lockable identity anchors in the JSON spec.
Managed inference randomness
Pixel-level determinism isn’t guaranteed via APIs. We reframed the system around shot-level determinism, which aligns with real production needs.
Prompt ambiguity
Even small text changes caused variance. We solved this with canonical JSON-to-prompt translation.
Reliability vs live inference
To avoid flaky demos, we separated creative control from execution using a mock provider while keeping real FIBO integration intact.
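A lockable identity anchor can be enforced at the point where edits are applied to a shot. The following is a minimal sketch under assumed names (`Shot`, `ShotPatch`, `applyPatch`); it shows the guard idea only and is not taken from the project.

```typescript
interface Identity {
  anchor: string;
  locked: boolean;
}

interface Shot {
  identity: Identity;
  camera: { pov: string };
}

type ShotPatch = Partial<Shot>;

// Apply an edit, refusing any patch that would change a locked identity anchor.
function applyPatch(shot: Shot, patch: ShotPatch): Shot {
  const changesAnchor =
    patch.identity !== undefined && patch.identity.anchor !== shot.identity.anchor;
  if (shot.identity.locked && changesAnchor) {
    throw new Error("identity is locked; unlock it before changing the anchor");
  }
  return { ...shot, ...patch };
}

const base: Shot = {
  identity: { anchor: "hero-character", locked: true },
  camera: { pov: "eye-level" },
};

// A camera-only edit passes through and leaves the identity block untouched.
const lowAngle = applyPatch(base, { camera: { pov: "low-angle" } });
```

Rejecting the edit outright, rather than silently ignoring it, keeps the version history honest about what was and was not changed.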
Accomplishments that we’re proud of
- Built a JSON-first creative control system, not just a prompt UI
- Achieved consistent subject identity across camera POV changes
- Implemented versioning, JSON diffs, and visual comparisons
- Designed a system that is honest about model limits while still production-ready
- Created a tool aligned with real professional workflows
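For readers curious how a structural comparison between two spec versions might work, here is a small sketch of a path-based JSON diff. The `diffSpecs` function and the `Change` shape are assumptions for illustration; the project's own diffing may work differently.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

interface Change {
  path: string;
  before: Json | undefined;
  after: Json | undefined;
}

function isPlainObject(value: Json | undefined): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Walk both versions in parallel and report every path whose value differs.
// Arrays and primitives are compared by their JSON serialization.
function diffSpecs(before: Json | undefined, after: Json | undefined, path = ""): Change[] {
  if (isPlainObject(before) && isPlainObject(after)) {
    const keys = Array.from(new Set(Object.keys(before).concat(Object.keys(after)))).sort();
    const changes: Change[] = [];
    for (const key of keys) {
      const childPath = path === "" ? key : `${path}.${key}`;
      for (const change of diffSpecs(before[key], after[key], childPath)) {
        changes.push(change);
      }
    }
    return changes;
  }
  return JSON.stringify(before) === JSON.stringify(after) ? [] : [{ path, before, after }];
}

const v1: Json = { identity: { anchor: "hero" }, camera: { pov: "eye-level", fov: 35 } };
const v2: Json = { identity: { anchor: "hero" }, camera: { pov: "low-angle", fov: 35 } };

const changes = diffSpecs(v1, v2);
// changes contains a single entry with path "camera.pov"
```

A diff expressed as paths maps directly onto the spec's blocks, so a reviewer can see at a glance that only the camera changed between two versions.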
What we learned
- Determinism in generative AI is about intent, not pixels
- Identity must be a first-class concept, not an emergent side effect
- Structured control outperforms prompt engineering for real workflows
- Reliability and clarity matter more than flashy demos in professional tools
What’s next for ShotSpec – JSON-Native Visual Control for FIBO
Next, we plan to expand ShotSpec into a full production tool by adding:
- Multi-shot sequences and storyboard timelines
- Team roles with locked permissions (director, brand, junior designer)
- Deeper integration with FIBO’s advanced parameters
- Export pipelines for game engines, ad platforms, and creative suites
- Optional local inference support for pixel-level determinism
ShotSpec is a step toward a future where visual AI is controlled, repeatable, and trustworthy—not just impressive.
Built With
- bria-fibo
- hugging-face-inference-api
- monaco-editor
- next.js
- node.js
- prisma
- react
- shadcn/ui
- sqlite
- tailwind-css
- typescript
- zustand