Inspiration

As the children of immigrant parents ourselves, we struggled to learn English at a young age. Even though our parents wanted the best for our education, they often worked long hours, struggled to read English, and were forced to balance between English and culturally relevant stories. From these experiences, we wanted to give children an opportunity to interact with the language around them, even if their family wasn’t fluent. However, we also recognized the emotional and cultural connection that a parent’s voice brings to the storytelling process. Kahani preserves these crucial aspects of storytelling, helping young English learners without asking them to give up the parental connection.

What it does

Kahani turns children’s books into interactive English learning experiences that are narrated in a parent’s own voice. After a PDF is uploaded and captured with the Gemini Vision API, Kahani translates the text if needed, sections it into small phrases, and adds simple bilingual definitions. Each page is read aloud using an ElevenLabs model of the parent’s voice, followed by multiple-choice comprehension questions generated with Gemini, allowing for active and focused learning. Users can also ask an ElevenLabs conversational agent questions about the book. Retrieval-Augmented Generation (RAG) keeps the conversation safe and on topic.
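To illustrate how retrieval can keep a conversation on topic, here is a minimal sketch. It is not Kahani's actual implementation: it assumes a simple bag-of-words cosine similarity in place of a real embedding model, and the names `retrieve`, `STOPWORDS`, and the `threshold` value are hypothetical. The idea is that a question is only answered when some page of the book is similar enough to it; otherwise the agent can decline or redirect.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "is", "did", "what", "where", "for", "to", "and", "in"}

def tokenize(text):
    """Lowercase the text and keep only non-stopword word tokens."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, pages, threshold=0.2):
    """Return the book page most similar to the question, or None if off topic."""
    if not pages:
        return None
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p))), i) for i, p in enumerate(pages)]
    best_score, best_index = max(scored)
    # Below the (assumed) threshold, treat the question as unrelated to the book.
    return pages[best_index] if best_score >= threshold else None

# Illustrative data only, not taken from a real book.
book_pages = [
    "The little fox ran through the snowy forest.",
    "Grandmother baked sweet bread for the festival.",
]
context = retrieve("Where did the fox run?", book_pages)
off_topic = retrieve("What is the capital of France?", book_pages)
```

In a setup like this, `context` would be passed to the conversational agent as grounding text, while a `None` result signals that the question falls outside the book.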

How we built it

We used Gemini’s OCR to extract text from uploaded book PDFs, then used the Gemini API to adapt and translate the content into age-appropriate English. The processed text is structured into pages and enhanced with highlighted vocabulary terms. We then used ElevenLabs to generate a natural reading with a cloned model of the parent’s voice. Supabase handles our frontend and data flow between text processing and audio generation, allowing the experience to feel seamless and responsive.
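The page-structuring step can be sketched roughly as follows. This is an assumption about the shape of the data, not our exact schema: the `Page` class, the `build_pages` function, and the sample glossary are hypothetical names and values chosen for illustration. Each page keeps its text plus the vocabulary terms found on it, paired with a bilingual definition.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Page:
    """One page of processed book text with its highlighted vocabulary."""
    number: int
    text: str
    vocabulary: dict = field(default_factory=dict)  # term -> bilingual definition

def build_pages(raw_pages, glossary):
    """Split extracted text into numbered pages and attach matching glossary terms."""
    pages = []
    for number, text in enumerate(raw_pages, start=1):
        found = {}
        for term, definition in glossary.items():
            # Whole-word, case-insensitive match so "bread" does not match "breadth".
            if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
                found[term] = definition
        pages.append(Page(number=number, text=text, vocabulary=found))
    return pages

# Illustrative English-to-Spanish glossary; in practice the definitions
# would come from the translation step.
sample_glossary = {"brave": "valiente", "bread": "pan"}
sample_pages = build_pages(["The fox was brave.", "She baked bread."], sample_glossary)
```

A structure like this lets the frontend render each page's text, highlight the terms in `vocabulary`, and hand the same text to the narration step.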

Challenges we ran into

One of our biggest challenges was tuning the ElevenLabs voice model so that the parent’s original audio became high-quality educational narration. Small changes in tone and pacing affected how natural the narration felt, so we iterated on input quality and length to preserve the parent’s voice. Designing the user interface was also a challenge. Because our target users are both busy parents and young children, the design needed to be intuitive for non-English-speaking users and younger audiences alike. Balancing simplicity and functionality required layout adjustments to keep the experience accessible and comfortable.

Accomplishments that we're proud of

We’re proud to have built a project that is meaningful and relevant to each of our English learning stories. Growing up, we didn’t have any good options for English exposure, so we’re glad to have made a project that can help other immigrant families on their journey with English literacy.

We’re also proud to have used the ElevenLabs API in a way that maintains authenticity and follows their mission of “mak[ing] content accessible in any language and in any voice”.

What we learned

As we built Kahani, we learned how to work with the ElevenLabs and Gemini APIs, in addition to building an intuitive frontend website for a specific purpose.

What's next for Kahani

We would love to bring Kahani to a medium that is easier for children to use, such as an iOS app, to make the user experience more comfortable and accessible. We also hope to partner with libraries and other organizations to create a large catalog of cultural and traditional books, allowing users to quickly select a book for their children to learn from.
