Gallery
- Choosing a verse for practice.
- Choosing whether to continue from where we left off last time or to select a particular verse to practice.
- Choosing the language we want Guru to converse in with us.
- Guru's guidance after Paarth has recited.
- After selecting the verse, the UI is ready for Paarth to start reciting.
- The UI while Paarth is reciting.
- Thumbnail
- Architecture
- Google Cloud Run backend
Inspiration
The Bhagavad Gita has always been something many people want to learn and recite, but practicing correctly can be difficult without a teacher present. Traditional learning often happens through a guru who listens carefully, corrects pronunciation, and explains the deeper meaning of the verses. I wanted to explore whether modern AI could recreate a small part of that experience — an AI companion that listens, guides, and encourages practice.
The idea behind PAARTH is to combine ancient wisdom with modern technology. By using voice interaction and AI reasoning, the system listens to a user reciting a verse, evaluates how accurately it was spoken, and responds with helpful feedback in a gentle “guru-style” manner. The goal is not just to check correctness, but to create a supportive learning environment that makes practicing verses more accessible and engaging.
This project was built as part of the Gemini Live Agent Challenge, where the focus is on real-time interaction. PAARTH demonstrates how AI can listen to spoken recitation, analyze it using Gemini, and provide immediate guidance and meaning — creating a voice-driven learning companion for spiritual practice.
What it does
PAARTH (Personalized AI Assisted Recitation and Textual Helper) is an AI-powered recitation companion designed to help users practice verses from the Bhagavad Gita through real-time voice interaction.
Users select a chapter and verse and then recite it aloud. The application listens to the recitation using browser speech recognition and converts the spoken verse into text. This transcript is sent to the backend, where Gemini analyzes the recitation by comparing it with the correct verse.
The system evaluates pronunciation and accuracy, generates a score, and produces helpful guidance explaining where the recitation was correct and where improvements are needed. The feedback is displayed visually and also spoken aloud in a gentle “guru-style” voice so the experience feels like practicing with a teacher.
PAARTH also provides the meaning of the verse when recitation is successful, encouraging learners to not only memorize verses but also understand their deeper significance.
How I built it
PAARTH is built as a real-time voice interaction system combining a modern web frontend with AI-powered evaluation.
The frontend is built with React and TypeScript, providing the recitation interface, verse display, and guidance panel. The browser’s Speech Recognition API captures the user’s spoken recitation and converts it into text. This transcript is sent to a Node.js backend service which retrieves the correct verse and prepares the evaluation prompt.
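The backend step described above (receive the transcript, look up the correct verse) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `RecitationRequest` shape, the `VERSES` store, and the `prepareEvaluation` helper are hypothetical names, not taken from the PAARTH source.

```typescript
// Hypothetical request body sent by the frontend once recognition finishes.
interface RecitationRequest {
  chapter: number;
  verse: number;
  transcript: string;
}

// Hypothetical in-memory verse store keyed by "chapter.verse". A real backend
// might load verses from a file or a database instead.
const VERSES: Record<string, string> = {
  "1.1":
    "dharmakṣetre kurukṣetre samavetā yuyutsavaḥ māmakāḥ pāṇḍavāś caiva kim akurvata sañjaya",
};

type Prepared =
  | { ok: true; reference: string; transcript: string }
  | { ok: false; error: string };

// Validates the incoming body and pairs the transcript with its reference verse.
function prepareEvaluation(body: unknown): Prepared {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const { chapter, verse, transcript } = body as Partial<RecitationRequest>;
  if (!Number.isInteger(chapter) || !Number.isInteger(verse)) {
    return { ok: false, error: "chapter and verse must be integers" };
  }
  if (typeof transcript !== "string" || transcript.trim() === "") {
    return { ok: false, error: "transcript must be a non-empty string" };
  }
  const reference = VERSES[`${chapter}.${verse}`];
  if (reference === undefined) {
    return { ok: false, error: `verse ${chapter}.${verse} not found` };
  }
  return { ok: true, reference, transcript: transcript.trim() };
}
```

Returning a tagged result instead of throwing lets a route handler map failures to a 4xx response without a try/catch, which is one reasonable design among several.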
The backend integrates with Gemini 2.5 Flash, which analyzes the recitation transcript against the original verse. Gemini generates structured feedback including an accuracy score, subtitle-style guidance, spoken feedback for the tutor voice, and the meaning of the verse. The frontend then presents this feedback visually and reads the guidance aloud using browser speech synthesis to simulate a conversational guru-style tutor.
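Because the frontend depends on every field being present, it can help to validate the model's reply before using it. The sketch below assumes the reply is a JSON object with the fields `score`, `subtitle`, `spokenFeedback`, and `meaning`, and that the score runs from 0 to 100. Those names and that range are illustrative assumptions, not PAARTH's actual schema.

```typescript
// Hypothetical shape of the structured feedback; field names are illustrative.
interface RecitationFeedback {
  score: number;          // accuracy score, assumed to be 0-100
  subtitle: string;       // short on-screen guidance
  spokenFeedback: string; // text handed to speech synthesis
  meaning: string | null; // verse meaning, when the model provides one
}

// Extracts the first JSON object from the model output and validates its fields.
function parseFeedback(raw: string): RecitationFeedback {
  // Tolerate extra text around the JSON by slicing from the first "{" to the last "}".
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in model output");
  }
  const data = JSON.parse(raw.slice(start, end + 1)) as Record<string, unknown>;
  const { score, subtitle, spokenFeedback, meaning } = data;

  if (typeof score !== "number" || !Number.isFinite(score)) {
    throw new Error("score must be a finite number");
  }
  if (typeof subtitle !== "string" || typeof spokenFeedback !== "string") {
    throw new Error("subtitle and spokenFeedback must be strings");
  }
  return {
    score: Math.min(100, Math.max(0, Math.round(score))), // clamp to the assumed range
    subtitle,
    spokenFeedback,
    meaning: typeof meaning === "string" && meaning.trim() !== "" ? meaning : null,
  };
}
```

A caller could catch the thrown error and show a generic "please try again" message rather than rendering partial feedback.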
This architecture allows PAARTH to provide instant feedback during recitation practice, combining voice interaction, AI reasoning, and real-time guidance.
Challenges I ran into
One of the main challenges was creating a smooth real-time voice interaction flow. Speech recognition can sometimes stop early or misinterpret Sanskrit words, so careful handling was needed to capture the user’s full recitation without interrupting the experience.
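One possible way to cope with the recognizer stopping early is to restart it whenever it ends on its own and keep accumulating finalized segments until the user explicitly finishes. The sketch below is not PAARTH's actual code: `RecognizerLike` is a simplified stand-in for the browser's `SpeechRecognition` object, and `startRecitationCapture` is a hypothetical helper.

```typescript
// Simplified structural view of the browser SpeechRecognition API. Only the
// members this sketch touches are modelled (an assumption made so the example
// type-checks outside a browser).
interface ResultLike {
  isFinal: boolean;
  readonly [index: number]: { transcript: string };
}
interface ResultEventLike {
  resultIndex: number;
  results: ArrayLike<ResultLike>;
}
interface RecognizerLike {
  continuous: boolean;
  interimResults: boolean;
  lang: string;
  onresult: ((event: ResultEventLike) => void) | null;
  onend: (() => void) | null;
  start(): void;
  stop(): void;
}

// Hypothetical helper: keeps the recognizer running across early stops and
// collects finalized segments until the caller invokes finish().
function startRecitationCapture(recognizer: RecognizerLike, lang: string) {
  const segments: string[] = [];
  let finished = false;

  recognizer.continuous = true;      // ask for more than one result per session
  recognizer.interimResults = false; // only finalized text is collected
  recognizer.lang = lang;

  recognizer.onresult = (event) => {
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result && result.isFinal) {
        const text = (result[0]?.transcript ?? "").trim();
        if (text !== "") segments.push(text);
      }
    }
  };

  // The recognizer may end on its own after a pause; restart unless the user is done.
  recognizer.onend = () => {
    if (!finished) recognizer.start();
  };

  recognizer.start();

  return {
    // Stops listening and returns everything finalized so far.
    finish(): string {
      finished = true;
      recognizer.stop();
      return segments.join(" ");
    },
  };
}
```

In a browser, the recognizer passed in would be the real speech recognition instance and the language tag would come from the user's selection. A production version would probably also wait for the last `onend` before reading the transcript, since a final result can arrive after `stop()` is called.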
Another challenge was designing prompts for Gemini that could produce structured feedback. The system needed consistent outputs such as scores, subtitles, spoken responses, and verse meaning so the frontend could display and speak them correctly. Prompt design played an important role in making the feedback both helpful and natural.
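To make the idea of a structure-enforcing prompt concrete, here is a minimal sketch of a prompt builder. The wording, the field names, and the `buildEvaluationPrompt` helper are all illustrative assumptions and are not the prompt PAARTH actually uses.

```typescript
// Hypothetical prompt builder; the wording and field names are illustrative
// and not taken from the PAARTH source.
function buildEvaluationPrompt(
  reference: string,
  transcript: string,
  replyLanguage: string,
): string {
  return [
    "You are a patient, encouraging guru helping a student recite the Bhagavad Gita.",
    // JSON.stringify quotes and escapes the text so it stays on a single line
    // and cannot break the layout of the prompt.
    `Reference verse: ${JSON.stringify(reference)}`,
    `Student transcript: ${JSON.stringify(transcript)}`,
    `Reply in ${replyLanguage}.`,
    "Compare the transcript with the reference verse and respond with ONLY a JSON object:",
    '{"score": <integer 0-100>, "subtitle": <string>, "spokenFeedback": <string>, "meaning": <string or null>}',
    'Set "meaning" to null unless the recitation is substantially correct.',
  ].join("\n");
}
```

Spelling out the exact keys in the prompt is one common way to get consistent output. Pairing it with validation on the backend covers the cases where the model still deviates.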
A third challenge was deploying the application to Google Cloud Run within the hackathon time limit. While the service ran locally, resolving container configuration and deployment timing issues proved difficult under the deadline. This was a valuable learning experience about containerized deployments and cloud runtime environments.
Finally, creating a tutor-like experience required balancing technical accuracy with a supportive tone. Instead of sounding like a strict evaluator, PAARTH aims to respond like a patient guide — encouraging users to keep practicing and improving.
Accomplishments that I am proud of
The most rewarding part of this project was building a system that can listen to spoken recitation and respond immediately with meaningful guidance. Seeing the application evaluate a verse and provide feedback in real time made the idea feel very real.
I’m also proud of combining AI technology with a culturally meaningful learning experience. The Bhagavad Gita has been studied for centuries, and using modern AI to support that learning process opens interesting possibilities for the future.
PAARTH shows how AI can act not only as a tool, but as a learning companion — helping people practice, improve, and stay connected to knowledge traditions.
What I learned
Building PAARTH highlighted how powerful real-time voice interaction can be when combined with modern AI models. I learned that creating a natural experience requires carefully orchestrating several components (speech recognition, AI evaluation, structured responses, and voice feedback) so that the interaction feels smooth and supportive.
Another important learning was prompt design. To make the feedback useful for the application, Gemini needed to return structured outputs such as a score, subtitle guidance, spoken feedback, and meaning. Designing prompts that consistently produce this format was key to making the system reliable.
Finally, this project reinforced how AI can be used not only for productivity tools, but also for learning and cultural preservation. Combining voice interfaces with AI reasoning opens new possibilities for helping people practice, understand, and stay connected with traditional knowledge.
What's next for PAARTH
There are many directions this project could evolve. A natural next step would be integrating the Gemini Live API for fully conversational voice interaction, allowing users to ask questions, interrupt the tutor, or request explanations while practicing.
Future improvements could also include deeper pronunciation analysis using phonetic similarity, personalized progress tracking across chapters, and support for additional languages so learners from different backgrounds can practice comfortably.
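As a rough illustration of what a similarity-based check might look like, the sketch below compares a romanized transcript with the reference verse using normalized Levenshtein distance. This is a character-level proxy rather than true phonetic analysis, it assumes both strings are in Latin script, and the function names are hypothetical.

```typescript
// Lowercases, strips diacritics and punctuation, and folds "sh" into "s" so
// that "kshetre" and "kṣetre" compare as equal. Assumes romanized (Latin) input;
// text in other scripts would be stripped entirely.
function normalize(text: string): string {
  return text
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // drop combining marks left by NFD
    .toLowerCase()
    .replace(/[^a-z\s]/g, "")
    .replace(/sh/g, "s")
    .replace(/\s+/g, " ")
    .trim();
}

// Classic dynamic-programming edit distance, keeping only two rows in memory.
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns a value between 0 (nothing in common) and 1 (identical after normalization).
function similarity(expected: string, heard: string): number {
  const a = normalize(expected);
  const b = normalize(heard);
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}
```

A score like this could complement the model's judgment by giving a deterministic signal. Applying the same comparison word by word would make it possible to point at the specific words that differ.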
Ultimately, PAARTH could grow into a complete AI-powered recitation tutor, helping people practice scripture with guidance anytime, anywhere.
Built With
- browser-speech-recognition-api
- gemini-2.5-flash
- google-ai-api
- google-cloud-run
- node.js
- react
- rest-api
- typescript
- web-speech-synthesis-api