What was I building?
I built Kiki, a web app designed to solve the problem of forgetting information learned passively through audio during commutes or chores. The core idea was to combine AI-generated personalized audio lessons with the powerful FSRS spaced repetition algorithm to ensure long-term retention.
How did it evolve?
Kiki started in early October as just a concept. The first week focused on designing the core user flow (Topic, Context, Goal) and researching the best AI tools. Week two involved wrestling with the Gemini API prompts to get usable scripts and integrating the ElevenLabs API for audio. Week three was the deep dive into implementing FSRS – understanding the state variables, getting the calculations right, and connecting it to the database felt like a major milestone. The final week was a push on the frontend – building the UI with vanilla JS proved challenging but rewarding, alongside integrating everything and deploying via ngrok for the demo and pushing it to GitHub. It definitely felt like "building through the fall" -- steady progress, tackling one complex piece at a time, rather than a frantic rush. I had a lot of fun building Kiki.
One of the many funny events during the making of Kiki: During testing, I discovered the rapid depletion rate of ElevenLabs free credits. My sophisticated solution to continue debugging the audio generation flow involved temporarily downgrading the AI's vocabulary. For several trials, its only output was "woof" :p
Why does it matter?
Kiki matters because it addresses a real, common problem: wasted learning potential during unavoidable "dead time." By making audio learning active and personalized, it significantly boosts retention and makes education more accessible, especially for auditory learners or those with dyslexia/ADHD. It demonstrates a practical way to integrate cutting-edge AI and learning science into a tool that fits modern, busy lives. Finishing the core loop during Hacktober proves this isn't just an idea, but a viable concept worth pursuing.
The inspiration behind Kiki
I have been an Anki user for quite a while now. It's gotten me through countless hurdles I have faced in my academic life. I have also been fascinated by the algorithm that powers Anki and how effectively it helps to lock in knowledge.
As a sophomore, my commute and gym time add up. While I could technically review Anki cards then, pulling out my phone for flashcards feels like effort when I'd much rather just listen to music or a podcast. This, to me, was a missed opportunity. What if that time could be used to learn effectively, like I did with Anki?
That was the idea behind Kiki. The name combines the Japanese verb for listening, 'kiku' (聞く), with the memorization power of Anki (暗記). It aims to fuse the convenience of audio learning with the proven science of spaced repetition, turning those passive listening hours into truly productive study sessions.
What does Kiki do?
Kiki is an AI-powered app that creates personalized, human-sounding audio lessons on any topic you want, tailored to your knowledge level. More importantly, Kiki automatically generates flashcards from each lesson and uses the battle-tested FSRS algorithm (also available as a scheduler in Anki) to help you actually remember what you learned, scheduling reviews for high retention with minimal effort. It turns your wasted time into efficient study sessions and makes learning accessible, especially for those who struggle with text, like learners with dyslexia and ADHD.
How I built Kiki
Kiki's backend is built with Python (Django), using the Django REST Framework (DRF) to handle the API endpoints. The database is PostgreSQL, an industry standard for structured data that also pairs well with Django's built-in ORM. The frontend is written in vanilla HTML, CSS and JavaScript to keep the user experience minimal and functional. For the AI pipeline, I integrated the Google Gemini API (Gemini 2.5 Pro model) to generate both the personalized audio lesson scripts and the flashcard content. The prompt was refined iteratively to ensure quality and relevance. The text-to-speech conversion uses the ElevenLabs API for its natural-sounding voices. The crucial spaced repetition scheduling is powered by a Python implementation of FSRS (Free Spaced Repetition Scheduler) within the backend.
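To give a feel for the scheduling step, here is a minimal sketch of how a review interval and due datetime can be derived from a card's stability. It uses the power forgetting curve published for FSRS-4.5; this write-up doesn't state which FSRS version Kiki uses, and the function names (`retrievability`, `next_interval`, `next_due`) are hypothetical, not Kiki's actual code. Updating stability and difficulty after each rating depends on trained weights and is left out.

```python
from datetime import datetime, timedelta, timezone

# Constants of the FSRS-4.5 forgetting curve (assumed version, see above).
DECAY = -0.5
FACTOR = 19 / 81  # chosen so that recall probability is 0.9 when elapsed == stability


def retrievability(elapsed_days: float, stability: float) -> float:
    """Estimated probability of recalling a card after `elapsed_days`."""
    return (1 + FACTOR * elapsed_days / stability) ** DECAY


def next_interval(stability: float, desired_retention: float = 0.9) -> int:
    """Days to wait until recall probability falls to `desired_retention`."""
    interval = stability / FACTOR * (desired_retention ** (1 / DECAY) - 1)
    return max(1, round(interval))  # never schedule less than one day ahead


def next_due(stability: float, reviewed_at: datetime,
             desired_retention: float = 0.9) -> datetime:
    """Due datetime to store on the card after a review (hypothetical helper)."""
    return reviewed_at + timedelta(days=next_interval(stability, desired_retention))
```

With the default 90% retention target the interval equals the stability, so a card with a stability of 10 days comes due 10 days after the review. Lowering the target retention stretches the interval.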
Kiki's potential impact
For busy students and professionals, Kiki transforms passive "wasted" time - commutes, chores and workouts - into highly efficient, active learning opportunities. Also, unlike simply listening to educational audio (where retention is often low), Kiki pairs personalized lessons with FSRS-powered active recall. This ensures that the information learned during those mobile sessions actually sticks long-term.
More importantly, Kiki offers a powerful alternative for individuals who struggle with traditional text-based learning. For learners with dyslexia, audio-first lessons bypass the challenges of text decoding, allowing them to focus purely on comprehension. For learners with ADHD, engaging, human-sounding audio can be more effective at maintaining focus than static text.
Kiki’s mission is to make high-retention, personalized education accessible to anyone with a smartphone and a pair of earbuds.
Challenges I ran into
- Understanding and integrating the FSRS algorithm's scheduling logic was a significant hurdle. It took careful thought to calculate review intervals correctly from user feedback and to update the database with each card's new due datetime.
- The ElevenLabs API has a strict usage limit on its free tier, which made it difficult to test the application. This limitation could pose challenges in the future as well.
- One of the primary challenges was ensuring that the AI-generated audio lessons were engaging and not robotic, while staying technically accurate without dumbing down complex topics, so that users of all knowledge levels can use the product to their liking. This required careful engineering of the prompts given to Gemini.
Accomplishments that I am proud of
- I am proud of successfully building a functional end-to-end prototype entirely by myself, realizing the core vision: to design an application that transforms a user's learning goals into personalized audio lessons coupled with an advanced spaced repetition system.
- I am proud of integrating the FSRS algorithm for flashcard scheduling. It required a deep dive into its mechanics and allowed me to truly grasp how it optimizes spaced repetition.
- Getting the AI pipeline to work also required significant thought, especially deciding what user input to take (the topic, the context and the goal) so that the audio lessons are as personalized as possible. I am happy with the final product.
- Completing this as my first full-stack project and delivering a functional, coherent application that meets the initial vision is something I am happy with :)
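The three inputs mentioned above (topic, context and goal) can be pictured as a small request object that is turned into a single prompt. The sketch below is illustrative only: the prompt wording, the JSON reply shape and the names `LessonRequest`, `build_prompt` and `parse_lesson` are assumptions, not Kiki's real prompt or code, and the call to Gemini itself is left out.

```python
import json
from dataclasses import dataclass


@dataclass
class LessonRequest:
    topic: str    # what the learner wants to study
    context: str  # what the learner already knows
    goal: str     # what the learner wants to be able to do afterwards


def build_prompt(req: LessonRequest) -> str:
    """Combine the three user inputs into one prompt (hypothetical wording)."""
    return (
        "Write a conversational script for a spoken audio lesson.\n"
        f"Topic: {req.topic}\n"
        f"Learner context: {req.context}\n"
        f"Learning goal: {req.goal}\n"
        "Reply with JSON containing a 'script' string and a 'flashcards' "
        "list of objects with 'question' and 'answer' keys."
    )


def parse_lesson(raw: str) -> tuple[str, list[dict]]:
    """Split the model's JSON reply into the script and the flashcards."""
    data = json.loads(raw)
    return data["script"], data["flashcards"]
```

Asking for the script and the flashcards in one structured reply keeps the cards tied to what the lesson actually said; the script would then go to text-to-speech and the cards to the scheduler.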
What I learned
This project was a deep dive into full-stack development. I significantly improved my skills in backend development with Django, API design, managing external API integrations, and also in frontend development for creating the user interface. I also gained some valuable practical experience integrating a complex algorithm like FSRS and understanding its underlying principles. It was a fantastic learning experience that provided insight into how modern spaced repetition systems (like Anki) function.
What's next for Kiki
The immediate next steps focus on polishing the user interface and moving beyond ngrok to a proper deployment. I will also have to find a solution for the strict limitations on ElevenLabs API's free tier. The vision for Kiki is not currently commercial as the project has been released under an open-source license to share it with the wider learning and development community.