Inspiration

Long-form generation means that we can produce high-quality podcasts at length, up to 4+ hours, diving comprehensively into a subject of interest.

What it does

Our users come with a prompt, at any level of detail. We generate a full cast of characters, comprehensive dialogue, and play it through distinct voices.

How we built it

We used a novel long-form generation technique that takes full advantage of Claude's 100k context window: at each step we pass in the full transcript of the podcast generated so far and ask the model to continue it, repeating until the podcast is complete. We first generate a structure for the podcast, which we step through at each generation. We use SDXL (Stable Diffusion XL) to create the art for the podcast, ElevenLabs for voice generation, and Claude for a lot of auxiliary generation (creating characters, writing the image prompt for Stable Diffusion, etc.).
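The continuation loop described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: `generate` is a hypothetical stand-in for whatever function calls the model, `max_prompt_chars` is an assumed crude proxy for the context limit, and `fake_model` exists only so the sketch runs without an API key.

```python
from typing import Callable, List


def generate_podcast(
    outline: List[str],
    generate: Callable[[str], str],
    max_prompt_chars: int = 300_000,  # assumed rough character budget, not a real token count
) -> str:
    """Build a transcript section by section, feeding the full transcript back in each time."""
    transcript = ""
    for section in outline:
        # The whole partial transcript goes into every prompt so the model
        # can stay consistent with everything said so far.
        prompt = (
            "Transcript so far:\n"
            + transcript
            + "\n\nContinue the podcast with the next section: "
            + section
        )
        if len(prompt) > max_prompt_chars:
            raise ValueError("prompt exceeds the assumed context budget")
        transcript += generate(prompt).strip() + "\n"
    return transcript


def fake_model(prompt: str) -> str:
    """Hypothetical stub: echoes the requested section instead of calling a model."""
    section = prompt.rsplit(": ", 1)[-1]
    return f"HOST: Let's talk about {section}."


if __name__ == "__main__":
    print(generate_podcast(["intro", "history"], fake_model))
```

Swapping `fake_model` for a real model call is the only change needed to try the loop end to end; the outline list plays the role of the generated podcast structure.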

Challenges we ran into

We had to get good at structured generation, continually filling dicts or lists with parsed generated content. Audio bytes generation can hit exceptions when the returned bytes cannot be loaded properly from the buffer. Generating an MP4 by combining an audio file with an image produced several strange ffmpeg / moviepy errors. In parts, we couldn't find a way around using regular expressions for text parsing.
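As an illustration of the kind of regex parsing involved, here is a small sketch that turns raw generated dialogue into a list of dicts. The `SPEAKER: text` line format, the `parse_dialogue` name, and the pattern itself are assumptions made for this example, not our exact implementation.

```python
import re
from typing import Dict, List

# Assumed format: a short speaker name, a colon, then the spoken text.
# The lazy name group stops at the first colon, so times like "10:30" stay in the text.
LINE_RE = re.compile(r"^\s*([A-Za-z][\w .'-]{0,40}?)\s*:\s*(.+?)\s*$")


def parse_dialogue(raw: str) -> List[Dict[str, str]]:
    """Parse generated text into [{"speaker": ..., "text": ...}, ...]."""
    turns: List[Dict[str, str]] = []
    for line in raw.splitlines():
        match = LINE_RE.match(line)
        if match:
            turns.append({"speaker": match.group(1), "text": match.group(2)})
        elif turns and line.strip():
            # A non-matching, non-empty line is treated as a continuation
            # of the previous speaker's turn.
            turns[-1]["text"] += " " + line.strip()
        # Anything before the first speaker line (e.g. a preamble) is dropped.
    return turns


if __name__ == "__main__":
    sample = "Here is the script:\nALICE: Hello there.\nand welcome.\n\nBOB: It is 10:30 now.\n"
    for turn in parse_dialogue(sample):
        print(turn)
```

Each parsed turn can then be routed to the voice assigned to that speaker. A pattern like this is brittle by nature, which is part of why parsing was a challenge.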

Accomplishments that we're proud of

As far as we know, this is the first use of our Claude long-form generation technique! The end product is really incredible in some podcasts, and it opens up the platform to the creativity of users, who can now try to create any podcast they can imagine.

What we learned

How to effectively parse generative text. How to generate extremely coherent long sequences.

What's next for Echos of AGI

Having the world create podcasts on our platform! We may also Dreambooth SDXL images of our personalities and combine them with Runway2 to create generative movies.

Built With

  • claude
  • elevenlabs
  • pydub
  • python
  • stablediffusion