Inspiration

Reading has always been a major hobby of mine, giving me new perspectives through both fiction and non-fiction. I am often in awe of the complex personalities authors create, and I found myself wondering what it would be like to talk to those characters directly. I wanted to ask them about their feelings and choices, and that desire to interact with the story is how Fourth Wall was conceived.

What it does

Fourth Wall enables you to talk to book characters in a realistic way. You can have casual conversations with your favourites, asking them questions about their lives and choices directly within the context of the story.

How I built it

I built the backend in Python, starting with FastAPI before switching to Flask for better web service integration. I used Google Cloud Vertex AI (Gemini) to process book text, extract character personalities, and map each character to a specific ElevenLabs voice ID, storing the resulting metadata in Firestore. The frontend was built with vanilla HTML and JavaScript alongside the ElevenLabs SDK to generate interactive character cards. When a conversation starts, that character's voice configuration is triggered for real-time dialogue. The entire application is hosted on Google Cloud Run so it can scale with demand.
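As a rough illustration of the extraction step, the sketch below turns a JSON response from the model into character records that could be stored as metadata. It is not the project's actual code: the JSON shape, the `CharacterCard` fields, and the placeholder voice IDs are all assumptions made for the example, and no real Vertex AI, ElevenLabs, or Firestore calls are shown.

```python
import json
from dataclasses import dataclass

# Hypothetical placeholder IDs; real ElevenLabs voice IDs would go here.
ALLOWED_VOICES = {"voice_a", "voice_b", "voice_c"}
DEFAULT_VOICE = "voice_a"


@dataclass
class CharacterCard:
    name: str
    personality: str
    voice_id: str


def parse_characters(model_output: str) -> list[CharacterCard]:
    """Parse an assumed JSON list of characters and validate each voice ID."""
    cards = []
    for item in json.loads(model_output):
        voice = item.get("voice_id", DEFAULT_VOICE)
        # Fall back to a known voice if the model names one we do not have.
        if voice not in ALLOWED_VOICES:
            voice = DEFAULT_VOICE
        cards.append(CharacterCard(item["name"], item.get("personality", ""), voice))
    return cards
```

Validating the voice ID against a fixed set is one way to keep a model response from pointing a character at a voice that does not exist on the account.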

Challenges I ran into

My primary challenges were architectural efficiency and API constraints. I initially planned to process entire books, but ingesting 200k tokens per request was cost-prohibitive. I pivoted to using a 30,000-word sample (typically about one fifth of the book) and implemented a "community bookshelf" with database caching to conserve GCP credits. Additionally, the ElevenLabs free tier limited me to three voices, so I upgraded to the standard plan to access ten voices for proper testing. I also had to overcome significant stability issues with voice overrides to ensure smooth character interactions.
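The sampling and caching idea can be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's code: it assumes the sample is simply the first 30,000 words, uses an in-memory dictionary in place of the Firestore cache, and the names `sample_text`, `cache_key`, and `Bookshelf` are hypothetical.

```python
import hashlib

SAMPLE_WORDS = 30_000  # sample size described above


def sample_text(book_text: str, max_words: int = SAMPLE_WORDS) -> str:
    """Keep only the first max_words words (assumed sampling strategy)."""
    return " ".join(book_text.split()[:max_words])


def cache_key(title: str, author: str) -> str:
    """Derive a stable key so the same book maps to the same cache entry."""
    normalized = f"{title.strip().lower()}|{author.strip().lower()}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class Bookshelf:
    """In-memory stand-in for a shared, database-backed cache."""

    def __init__(self, analyze):
        self._analyze = analyze  # the expensive model call, injected
        self._store = {}

    def get_characters(self, title: str, author: str, book_text: str):
        key = cache_key(title, author)
        if key not in self._store:
            # Only the sample is sent to the model, and only on a cache miss.
            self._store[key] = self._analyze(sample_text(book_text))
        return self._store[key]
```

With this shape, a second request for a book that is already on the shelf returns the stored result without another model call.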

Accomplishments that I'm proud of

I am most proud of overcoming roadblocks to build a project that matches my original vision. Hosting the application live instead of running it locally was ambitious, requiring me to implement rate limiting and safety logic. As a solo developer, I had to be ruthless with feature scoping to manage my time. Through this process, I learned that software development is effectively 50% planning, 30% debugging, and only 20% actual coding.
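For readers curious what rate limiting can look like, here is a minimal sliding-window limiter. It is a generic sketch, not the limiter used in Fourth Wall: the class name, the per-client keying, and the limits are assumptions, and a live service with multiple instances would need shared storage rather than process memory.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True
```

A request handler would call `allow()` with something like the caller's IP address and return an error response when it comes back `False`.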

What I learned

Completing this project has made me significantly more proficient with Cloud and AI tools. Earlier this year, I found the idea of implementing voice automation daunting, but this hackathon provided the perfect opportunity to master that technology through a subject I am passionate about. Beyond the technical skills, seeing a concept evolve from a rough sketch to a live product has been incredibly satisfying, giving me the confidence to tackle any engineering challenge in the future.

What's next for Fourth Wall Audio

My objective with Fourth Wall was to make reading more enjoyable, but the next step is adding real utility. I plan to develop an educational "Scholar Mode" where students can ask characters to retrieve specific quotes or explain themes for their reports. I also want to create a "Living Book Club" experience where you chat with the character while reading to boost engagement and comprehension. With these updates, I can evolve Fourth Wall from a fun immersion tool into a valuable asset for literacy and learning.

General Access Note

If you have an access key, you must manually append it to the URL in this format: website.com/?access=YOUR_KEY. Without this parameter, the application will remain in read-only mode.
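A server-side check for this parameter might look roughly like the sketch below. The function name, the `"full"` mode label, and the set of valid keys are assumptions for illustration; only the `access` query parameter and the read-only fallback come from the note above.

```python
import hmac
from urllib.parse import parse_qs, urlparse


def access_mode(url: str, valid_keys: set[str]) -> str:
    """Return "full" if the URL carries a valid access key, else "read-only"."""
    query = parse_qs(urlparse(url).query)
    supplied = query.get("access", [""])[0]
    if not supplied:
        return "read-only"
    # compare_digest avoids leaking key contents through timing differences.
    granted = any(
        hmac.compare_digest(supplied.encode("utf-8"), key.encode("utf-8"))
        for key in valid_keys
    )
    return "full" if granted else "read-only"
```

For example, `access_mode("https://website.com/?access=YOUR_KEY", {"YOUR_KEY"})` returns `"full"`, while the same URL without the parameter returns `"read-only"`.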

