💡 Inspiration
In a world that is more connected than ever, we are facing a loneliness epidemic. Traditional mental health apps often feel clinical, cold, or barrier-heavy. They ask you to fill out forms, pay subscription fees, or type out your deepest feelings on a glass screen.
But when we are overwhelmed, anxious, or in crisis, we don't want to type. We want to be heard.
We were inspired by the "Next Billion Users"—people in emerging markets like India who are mobile-first and voice-first. We wanted to build a digital companion that breaks the literacy barrier and provides a safe, judgment-free space to vent, powered by the empathy of Gemini 2.5.
Mood-Mantra isn't just an app; it's a digital shoulder to lean on.
🤖 What it does
Mood-Mantra is a Full-Duplex Voice Companion that acts as a bridge between casual wellness apps and professional therapy.
- Immersive Voice Interface: No typing. You speak, and the AI replies with a human-like voice (powered by ElevenLabs).
- Real-Time "Barge-In": Unlike standard chatbots, you can interrupt Mood-Mantra mid-sentence. If you change your mind or want to clarify, just speak over it, and it stops talking instantly—mimicking a natural human conversation.
- Silence Detection ("Holding Space"): If you stop speaking, the app detects the silence. Instead of an error, it responds gently, understanding that sometimes presence is more important than words.
- Multilingual & Vernacular: Whether you speak English, Hindi, or a mix ("Hinglish"), the AI understands context and responds in the appropriate cultural tone.
- Adaptive Personas:
- Therapist Mode: Validates feelings and offers grounding techniques.
- Interviewer Mode: Shifts to a professional tone to help users practice for job interviews (a major source of anxiety).
- Crisis Safety: Automatically detects self-harm intent and triggers a "Safe Mode" with helpline resources.
⚙️ How we built it
We built Mood-Mantra as a Progressive Web App (PWA) using Next.js 16, ensuring it works seamlessly on low-end mobile devices without a heavy download.
1. The Voice Engine (Web Audio API)
The core challenge was creating a "Hands-Free" loop. We utilized the Web Audio API to analyze microphone input in real-time.
To detect when a user is speaking vs. background noise, we implemented a Root Mean Square (RMS) volume calculation algorithm in our useVolume hook:
$$V_{rms} = \sqrt{ \frac{1}{N} \sum_{i=0}^{N-1} x_i^2 }$$
Where $x_i$ is the audio sample amplitude and $N$ is the buffer size. When the RMS level stays below a dynamic silence threshold $T_{silence}$ (approximately $-50\,\text{dB}$), we trigger the "Auto-Send" logic.
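The RMS and threshold logic above can be sketched as a small pure function, the kind of helper a `useVolume` hook would call on each analyser frame. This is an illustrative sketch, not the production hook; the constant names and the $10^{-8}$ floor are assumptions:

```typescript
// Compute RMS over a buffer of audio samples (amplitudes in [-1, 1]),
// as pulled from an AnalyserNode via getFloatTimeDomainData in the browser.
function computeRms(samples: Float32Array): number {
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Convert RMS to decibels, guarding against log(0) on a silent buffer.
function toDb(rms: number): number {
  return 20 * Math.log10(Math.max(rms, 1e-8));
}

const SILENCE_THRESHOLD_DB = -50;

// True when this frame falls below the silence threshold; the hook would
// require several consecutive silent frames before firing Auto-Send.
function isSilent(samples: Float32Array): boolean {
  return toDb(computeRms(samples)) < SILENCE_THRESHOLD_DB;
}
```

In the real hook, the buffer would come from the Web Audio API's `AnalyserNode` and the silence decision would be debounced over time rather than made on a single frame.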
2. The Brain (Google Vertex AI & Gemini 2.5)
We utilized Gemini 2.5 Flash via Google Vertex AI for its low latency and high reasoning capabilities. We engineered a complex System Prompt that acts as a state machine:
- Input: Audio Transcript + Conversation History
- Logic: Analyze Sentiment $\rightarrow$ Detect Crisis Keywords $\rightarrow$ Select Persona
- Output: JSON structure containing the reply, the detected mode, and the required voice tone.
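The JSON contract between the prompt's state machine and the client might look like the following sketch. The field names and persona values here are illustrative assumptions, not the exact production schema:

```typescript
// Personas the system prompt's state machine can select.
type Persona = "therapist" | "interviewer" | "crisis";

// Shape of the structured reply parsed from the Gemini response.
interface CompanionReply {
  reply: string;           // text handed to ElevenLabs for synthesis
  mode: Persona;           // persona chosen from sentiment + keyword analysis
  voiceTone: string;       // e.g. "warm", "professional", "calm"
  crisisDetected: boolean; // flips the UI into Safe Mode when true
}

// Defensive parse: LLMs occasionally emit malformed JSON, so fail soft.
function parseReply(raw: string): CompanionReply | null {
  try {
    const data = JSON.parse(raw) as CompanionReply;
    return typeof data.reply === "string" ? data : null;
  } catch {
    return null;
  }
}
```

Treating the model output as untrusted input and validating it before playback keeps one malformed response from wedging the voice loop.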
3. Voice Synthesis & Barge-In Logic
We stream the response text to ElevenLabs for high-fidelity audio. To enable "Barge-In," we enforced Acoustic Echo Cancellation (AEC) constraints on the browser's getUserMedia stream:
```typescript
const stream = await navigator.mediaDevices.getUserMedia({
  audio: { echoCancellation: true, noiseSuppression: true }
});
```
We combined this with a "Grace Period" timestamp check ($t_{now} - t_{start} > 500\,\text{ms}$) to prevent the AI from hearing its own echo during the first half-second of playback.
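The grace-period gate described above can be sketched as one predicate. Constant values and names (`GRACE_PERIOD_MS`, `BARGE_IN_THRESHOLD_DB`) are illustrative assumptions:

```typescript
// Ignore mic volume spikes during the first 500 ms of AI playback so the
// assistant cannot mistake its own voice for a user interruption.
const GRACE_PERIOD_MS = 500;
const BARGE_IN_THRESHOLD_DB = -35; // must be clearly above ambient noise

function shouldBargeIn(
  micLevelDb: number,        // current mic level from the RMS meter
  playbackStartedAt: number, // timestamp when AI playback began (ms)
  now: number                // current timestamp (ms)
): boolean {
  const withinGracePeriod = now - playbackStartedAt <= GRACE_PERIOD_MS;
  return !withinGracePeriod && micLevelDb > BARGE_IN_THRESHOLD_DB;
}
```

Once `shouldBargeIn` returns true, the client would pause the ElevenLabs audio and hand the floor back to the user.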
🚧 Challenges we ran into
- The "Infinite Echo" Loop: Initially, when the AI spoke, the microphone would pick up the AI's voice, think the user was speaking, and interrupt itself.
- Solution: We implemented a strict Volume Threshold Gate combined with hardware AEC constraints. We also added a logic check that ignores volume spikes for the first 0.5 seconds of AI speech.
- Race Conditions in State Management: Users would click "Stop" to manually send a message, but the "Silence Detection" would trigger simultaneously, causing the app to send the request twice or get stuck in a "Thinking" loop.
- Solution: We rewrote the React state logic using `useRef` flags (`shouldLoop.current`) to explicitly handle the "One-Shot" vs. "Continuous Loop" intentions.
- Latency: Voice-to-Voice latency is critical.
- Solution: We optimized the pipeline by using Gemini 2.5 Flash (faster inference) and lightweight audio blobs (WebM/Opus) to reduce upload times.
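The double-send race above comes down to a guard that both triggers read and set synchronously. In the app this lives in React via `useRef`; the framework-free sketch below shows the core pattern with illustrative names (`createSendGuard`, `trySend`):

```typescript
// A useRef-style mutable cell: reads and writes are synchronous, so they
// cannot race the way queued React state updates can.
interface MutableRef<T> { current: T; }

function createSendGuard(): MutableRef<boolean> {
  return { current: false };
}

// Both the manual "Stop" click and the silence detector call trySend;
// whichever fires first claims the flag, and the other becomes a no-op.
function trySend(guard: MutableRef<boolean>, send: () => void): boolean {
  if (guard.current) return false; // a send is already in flight
  guard.current = true;
  send();
  return true;
}
```

The same mutable-flag idea distinguishes "One-Shot" from "Continuous Loop" intent: the loop checks `shouldLoop.current` synchronously before re-arming the microphone, instead of waiting on an asynchronous state update.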
🏆 Accomplishments that we're proud of
- Web-Based Barge-In: Achieving true "interruptibility" in a browser without native code is notoriously difficult. We are proud of the fluid feel we achieved.
- Crisis Safety Architecture: We didn't just build a chatbot; we built a safe one. The logic that detects "Crisis" intent and silently modifies the UI to offer help (while maintaining a compassionate voice) is a feature we believe is essential for ethical AI.
- PWA Performance: The app feels completely native on mobile, with a responsive "Particle Orb" visualization that runs at 60fps even on mid-range devices.
🧠 What we learned
- Prompt Engineering is UX: We learned that the "personality" of the app isn't code—it's the System Prompt. Tweaking Gemini to avoid "toxic positivity" and instead "hold space" made the difference between a robot and a friend.
- Audio is hard: Handling different browser policies (Safari vs. Chrome) for AudioContext autoplay permissions taught us the importance of robust error handling and fallbacks.
🔮 What's next for Mood-Mantra
- Biometric Analysis: Using the audio stream to analyze pitch and jitter to detect stress levels, allowing the AI to preemptively suggest calming exercises.
- Long-Term Memory: Implementing a Vector Database (like Pinecone) to let Mood-Mantra remember details from conversations weeks ago (e.g., "How did that interview go last Tuesday?").
- Gamified Wellness: Introducing streaks and calming mini-games within the PWA to encourage daily check-ins.
Mood-Mantra is just the beginning of Empathetic AI.
Built With
- elevenlabs
- gemini
- google-cloud
- google-vertex-ai
- next.js
- pwa
- typescript