Streamberry: Finding the Cinematic in the Everyday

What inspired you to build this project?

The inspiration for Streamberry came directly from a piece of science fiction that many of us found both terrifying and fascinating: the Black Mirror episode, "Joan is Awful." The episode introduced a dystopian streaming service that used AI to turn the mundane, private details of people's lives into binge-worthy television. It was a powerful warning about privacy, data, and the unchecked power of AI.

While watching, I couldn't shake a counter-intuitive thought: what if we could flip that concept on its head? Instead of a tool for exploitation, what if we could use that same technology as a tool for creative empowerment?

That question became the foundation of this project. I wanted to build the "good version" of Streamberry—a platform that helps people craft and visualize their own personal stories. The goal was to create a tool that takes a simple memory, a fleeting feeling, or a daily frustration and transforms it into something cinematic, helping users see the beauty, drama, and significance in their own lives.

How did you build it?

Streamberry is a front-end-only web application built to be a lightweight but powerful demonstration of multi-API orchestration. The entire project lives within a single index.html file, a classic hackathon constraint that forces creative solutions.

The Tech Stack:

Prototyping: The initial concept and code were rapidly prototyped using bolt.new.

Front-End Framework: None! I used vanilla JavaScript to handle all the logic, demonstrating direct manipulation of the DOM and API data.

Styling: Tailwind CSS was used for its utility-first approach, allowing for the quick creation of a modern, responsive UI that echoes the Netflix/Streamberry aesthetic.

Icons & Fonts: Lucide Icons and the "Inter" Google Font were pulled in via CDN, so the project needs no build step or bundled assets.

The AI Orchestra: The magic of Streamberry lies in how it conducts a symphony of four distinct AI APIs:

The Screenwriter (Google Gemini): When a user submits their text, the first call goes to the Gemini API. I instruct it to act as a creative screenwriter, generating a dramatic title and a short, cinematic synopsis for the "show." The structured JSON output feature was crucial here.

The Voice Actor (ElevenLabs): The generated synopsis is then sent to the ElevenLabs API to create a rich, realistic voiceover. This gives the final video its narrative soul.

The Poster Designer (Google Imagen): In parallel with the audio generation, the synopsis is also sent to the Imagen API with a prompt to create a photorealistic, text-free movie poster. This runs concurrently to save time.

The Production Studio (Vadoo.tv): Once the audio and image are ready, they are sent to the Vadoo.tv API. I used a clever feature where you can upload an image as a thumbnail and an audio file as the video, and Vadoo automatically combines them into a video file.
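The four-step flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (runStep, producePilot) and the injected step functions (writeScript, narrate, drawPoster, renderVideo) are hypothetical stand-ins for the real Gemini, ElevenLabs, Imagen, and Vadoo.tv calls, whose request details are not shown in this write-up.

```javascript
// Hypothetical pipeline sketch: the four step functions are injected, so this
// code makes no assumptions about any vendor's real endpoints or payloads.

// Runs one step and labels any failure with the step name,
// so the UI can report which stage broke.
async function runStep(name, fn) {
  try {
    return await fn();
  } catch (err) {
    throw new Error(`${name} failed: ${err.message}`, { cause: err });
  }
}

async function producePilot(userText, steps) {
  // 1. Screenwriter: everything downstream depends on the synopsis.
  const script = await runStep("screenwriter", () => steps.writeScript(userText));

  // 2. Voice actor and poster designer both need only the synopsis,
  //    so they are started together and awaited as a pair.
  const [audio, poster] = await Promise.all([
    runStep("voice", () => steps.narrate(script.synopsis)),
    runStep("poster", () => steps.drawPoster(script.synopsis)),
  ]);

  // 3. Production studio: needs both media files.
  const video = await runStep("studio", () => steps.renderVideo(audio, poster));
  return { title: script.title, video };
}
```

Because Promise.all rejects as soon as either branch fails, a broken voiceover or poster call stops the pipeline before anything is uploaded, and the labeled error message says which step to blame.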

The "No Webhook" Challenge: Since this is a front-end-only app, I couldn't use webhooks to be notified when Vadoo's video processing was complete. I solved this by implementing a polling mechanism with setInterval. After the initial upload, the app checks the Vadoo.tv API every 10 seconds. It shows the user a "processing" status and, once it receives a "success" message, it grabs the final video URL and displays the player.
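A polling loop of this kind might look like the sketch below. The checkStatus callback, the "failed" status value, and the { status, url } result shape are assumptions made for illustration; the real Vadoo.tv response format is not given here. Only the setInterval approach, the 10-second interval, and the "success" status come from the description above.

```javascript
// Polls `checkStatus` on a fixed interval until it reports success or failure.
// `checkStatus` is a hypothetical async function resolving to { status, url? }.
function pollForVideo(checkStatus, { intervalMs = 10_000, maxAttempts = 60 } = {}) {
  return new Promise((resolve, reject) => {
    let attempts = 0;
    let inFlight = false;

    const timer = setInterval(async () => {
      if (inFlight) return; // skip this tick if the previous check is still pending
      inFlight = true;
      attempts += 1;
      try {
        const result = await checkStatus();
        if (result.status === "success") {
          clearInterval(timer);
          resolve(result.url);
        } else if (result.status === "failed" || attempts >= maxAttempts) {
          // Give up instead of polling forever.
          clearInterval(timer);
          reject(new Error(`Video not ready after ${attempts} checks (${result.status})`));
        }
      } catch (err) {
        // A network or parsing error also ends the loop.
        clearInterval(timer);
        reject(err);
      } finally {
        inFlight = false;
      }
    }, intervalMs);
  });
}
```

The attempt cap is an addition to the approach described above: without it, a job that never finishes would keep the interval running for as long as the tab stays open.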

What did you learn?

This project was a deep dive into the practical challenges of building modern, AI-powered applications.

The Art of Orchestration: My biggest takeaway was learning how to manage a complex, multi-step asynchronous workflow. It’s not just about calling one API; it’s about chaining them, running them in parallel where possible (Promise.all was my best friend), and handling the data transformations between each step (JSON to text, text to audio blob, text to Base64 image, etc.).

Graceful State Management: On the front end, you have to keep the user informed. I learned how to meticulously track the application's state (generating_text, generating_media, uploading, polling_video) and reflect it in the UI in real time. This prevents the user from feeling like the app is broken during the inevitable wait times.

Front-End Limitations are Real: Building without a backend forces you to confront challenges like API key security and the limitations of browser-based requests. While fine for a hackathon, it highlighted the necessity of a server-side component for any production-level version of this application to protect credentials and manage CORS policies.
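The state tracking described under Graceful State Management could be sketched along these lines. The four middle state names come from the text; the "idle", "done", and "error" states, the label strings, and the createStatusTracker helper are assumptions added for illustration.

```javascript
// Ordered stages: the four middle names are from the write-up,
// "idle" and "done" are assumed bookends.
const STAGES = ["idle", "generating_text", "generating_media", "uploading", "polling_video", "done"];

// Hypothetical status messages; the real UI copy is not given in the source.
const LABELS = {
  idle: "Ready",
  generating_text: "Writing the script...",
  generating_media: "Recording voiceover and designing poster...",
  uploading: "Sending files to the studio...",
  polling_video: "Rendering video...",
  done: "Your show is ready",
  error: "Something went wrong",
};

// `render(state, label)` is whatever updates the page, e.g. setting textContent.
function createStatusTracker(render) {
  let current = "idle";
  const set = (next) => {
    current = next;
    render(current, LABELS[current]);
  };
  return {
    get state() {
      return current;
    },
    // Only allow the next stage in order, or a jump to "error" from anywhere.
    advance(next) {
      const expected = STAGES[STAGES.indexOf(current) + 1];
      if (next !== "error" && next !== expected) {
        throw new Error(`Unexpected transition: ${current} -> ${next}`);
      }
      set(next);
    },
    reset() {
      set("idle");
    },
  };
}
```

Rejecting out-of-order transitions is a design choice rather than a requirement: it surfaces a skipped step as an immediate error during development instead of a status message that silently lies to the user.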

What challenges did you face?

Asynchronous Hell: The dependency chain is long: the video depends on the audio and image, which both depend on the text. Managing this with async/await while keeping the UI responsive was the primary challenge. A single error in the chain could break the entire flow, so robust try...catch blocks at every step were essential.

API Latency: Video rendering is not instant. The biggest user experience challenge was the wait time for Vadoo.tv to process the video. The polling mechanism and clear status updates on the UI were my solution to keep the user engaged and informed.

Working with FormData: Sending files to an API from the browser requires constructing a FormData object. Converting a Base64 image string from Imagen into a Blob that could be appended to the form data was a tricky but necessary step that required some research.
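The Base64-to-Blob conversion can be sketched as below. This is one common way to do it, not necessarily the code the project uses. The helper names, the form field names ("thumbnail", "audio"), the file names, and the image/png default are placeholders; Vadoo.tv's actual field names are not given in this write-up.

```javascript
// Decode a raw Base64 string (no "data:...;base64," prefix) into a Blob.
function base64ToBlob(base64, mimeType = "image/png") {
  const binary = atob(base64); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: mimeType });
}

// Build the multipart payload. Field and file names here are placeholders,
// not Vadoo.tv's documented ones.
function buildUploadForm(posterBase64, audioBlob) {
  const form = new FormData();
  form.append("thumbnail", base64ToBlob(posterBase64), "poster.png");
  form.append("audio", audioBlob, "voiceover.mp3");
  return form;
}
```

When the resulting FormData is passed as the body of a fetch call, the Content-Type header should be left unset so the browser can add the multipart boundary itself.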

Built With

  • elevenlabs