Inspiration
I listen to a lot of podcasts because they are a fun, engaging way to learn and broaden my knowledge. Sometimes, though, I want a podcast on a specific topic and find that none exists. That problem inspired me to build TailoredTales.
What it does
TailoredTales uses AI to create podcasts on the topics you choose, so you get highly specific podcasts tailored to you. All you have to do is give the podcast a title, describe the content in a few words, and set the number of episodes you want; the AI does the rest.
How I built it
I used the Google Gemini API to plan the structure of each podcast: the episode titles, what each episode covers, and so on. I then passed this structure back to the Gemini API to generate a full-length script for each episode, instructing it to open with an introduction based on the previous episode's script and then cover the content planned for the current episode. I used the DALL·E API to generate the podcast's cover image and the OpenAI TTS API to turn the generated scripts into audio files. The audio files are stored in blob storage, and each episode's link is stored in Postgres along with the rest of the podcast metadata.
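As a rough illustration of the flow described above, here is a minimal TypeScript sketch. It is not the project's actual code: every type and function name (`Services`, `buildPodcast`, `planEpisodes`, and so on) is hypothetical, and each external service is passed in as a plain function rather than called through a real SDK, so no provider API signatures are assumed.

```typescript
// Hypothetical shapes; the real project's types are not shown in the write-up.
interface EpisodePlan {
  title: string;
  outline: string;
}

interface PodcastRequest {
  title: string;
  description: string;
  episodeCount: number;
}

// Each external service is injected as a function, so the flow can be read
// and tested without any real API client.
interface Services {
  planEpisodes(req: PodcastRequest): Promise<EpisodePlan[]>; // e.g. Gemini
  writeScript(plan: EpisodePlan, previousScript: string | null): Promise<string>; // e.g. Gemini
  generateCover(req: PodcastRequest): Promise<string>; // e.g. DALL·E, returns an image URL
  synthesizeSpeech(script: string): Promise<Uint8Array>; // e.g. OpenAI TTS
  uploadAudio(key: string, audio: Uint8Array): Promise<string>; // blob storage, returns a URL
  saveEpisode(row: { title: string; audioUrl: string; position: number }): Promise<void>; // e.g. Postgres
}

async function buildPodcast(
  req: PodcastRequest,
  s: Services
): Promise<{ coverUrl: string; audioUrls: string[] }> {
  const plans = await s.planEpisodes(req);
  const coverUrl = await s.generateCover(req);

  const audioUrls: string[] = [];
  let previousScript: string | null = null;
  let position = 0;

  // Episodes are generated in order because each script depends on the
  // script of the episode before it.
  for (const plan of plans) {
    position += 1;
    const script = await s.writeScript(plan, previousScript);
    const audio = await s.synthesizeSpeech(script);
    const audioUrl = await s.uploadAudio(`episode-${position}.mp3`, audio);
    await s.saveEpisode({ title: plan.title, audioUrl, position });
    audioUrls.push(audioUrl);
    previousScript = script;
  }

  return { coverUrl, audioUrls };
}
```

Injecting the services keeps the orchestration logic separate from any one provider, which would also make it easier to swap in a different TTS endpoint later.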
Challenges I ran into
I faced quite a few challenges along the way, the main one being the structure of the podcast. The scripts were often not cohesive and didn't read like a podcast. I solved this by adding the previous episode's script to the prompt for the current episode and asking Gemini to write a free-flowing script that follows on from where the previous one ended.
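The fix above might look something like the following prompt builder. This is only a sketch: the function name `buildScriptPrompt`, the prompt wording, and the choice to pass just the tail of the previous script (capped by the assumed constant `PREVIOUS_TAIL_CHARS`) are my own illustrations, not the prompts actually used in TailoredTales.

```typescript
// Hypothetical helper: the prompt wording and the tail length are assumptions.
interface EpisodeOutline {
  title: string;
  outline: string;
}

// Assumed cap so that long previous scripts do not bloat the prompt.
const PREVIOUS_TAIL_CHARS = 2000;

function buildScriptPrompt(episode: EpisodeOutline, previousScript: string | null): string {
  const parts: string[] = [
    `Write a full-length podcast script for the episode "${episode.title}".`,
    `Cover the following planned content: ${episode.outline}`,
  ];

  if (previousScript === null) {
    parts.push("This is the first episode, so open by introducing the podcast.");
  } else {
    // Only the end of the previous script is included, since the new
    // episode needs to pick up from where the last one stopped.
    const tail = previousScript.slice(-PREVIOUS_TAIL_CHARS);
    parts.push(
      "Open with a short recap that picks up where the previous episode ended.",
      "Keep the script free-flowing and consistent in tone with the previous episode.",
      `End of the previous episode's script:\n${tail}`
    );
  }

  return parts.join("\n\n");
}
```

Passing `null` for the first episode keeps the recap instruction out of the prompt when there is nothing to recap.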
What I learned
I learned quite a lot while making this project:
- How to orchestrate multiple AI APIs and make them work in sync
- How to use blob storage to effectively store and retrieve large files
- How to use LLMs for planning and then use the plan to generate subsequent outputs
What's next for TailoredTales
Some ideas I have for where to take this in the future are:
- Make it a PWA for mobile listening
- Automatically create podcasts without user input, based on the podcasts a user has created previously
- Improve latency by caching and exploring other TTS endpoints
- Monetization
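The caching idea in the list above could start as simply as the sketch below. It is an assumption-heavy illustration, not a planned design: `withCache` and the `Synthesize` type are hypothetical names, the wrapped function stands in for whichever TTS call is used, and the cache lives in memory within a single process, so a persistent store would be needed for it to survive across serverless invocations.

```typescript
// Hypothetical cache wrapper; `synthesize` stands in for whichever TTS call is used.
type Synthesize = (text: string, voice: string) => Promise<Uint8Array>;

function withCache(synthesize: Synthesize): Synthesize {
  // The promise is stored rather than the result, so concurrent requests
  // for the same input share one in-flight call.
  const cache = new Map<string, Promise<Uint8Array>>();

  return (text, voice) => {
    const key = JSON.stringify([voice, text]);
    let pending = cache.get(key);
    if (pending === undefined) {
      pending = synthesize(text, voice).catch((err: unknown) => {
        // Failed calls are evicted so a later request can retry.
        cache.delete(key);
        throw err;
      });
      cache.set(key, pending);
    }
    return pending;
  };
}
```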
Built With
- dalle
- gemini
- next.js
- node.js
- postgresql
- react
- tailwind
- typescript
- vercel