Tala: Where Your Voice Lights the Way

Inspiration

Mental health support in the Philippines is still out of reach for a lot of people: long waitlists, stigma, cost, and the simple exhaustion of having to explain yourself before you can be heard. We kept coming back to a different question: what if the first step didn't have to be a clinic? What if it could be a quiet moment with something that just listens, in Tagalog, in English, in whatever half-formed thought comes out first?

That's Tala. The name means "star" in Tagalog, and the idea is simple: when you feel lost in the dark, you don't need someone to solve it. You need a steady point of light to walk toward.

What it does

Tala is a gentle, always-available wellness companion that meets you where you are.

Voice journaling - Tap and talk. Tala listens in real time via the Gemini Live API, responding with its own voice and synced captions.

Text journaling - Guided mode offers prompt flows for different moods. Freeform gives you a blank page and, when you pause, soft inline nudges powered by Gemini 2.5 Flash-Lite.

Wellness toolkit - Guided breathing, a box-breathing visualizer, AI-validated 5-4-3-2-1 grounding, and a "Bring Me" camera game, powered by Gemini 2.5 Flash-Lite, that grounds you in your surroundings.

Memory - Tala remembers mood patterns and prior entries via vector embeddings, gently noticing when you've been here before.

Safety first - Every response passes through a crisis-detection gate (keyword + intent scanning). If something concerning surfaces, Tala shifts tone and surfaces Philippine hotlines.

PWA - Installable on iOS or Android.
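As an illustration of the memory feature above, here is a minimal sketch of recalling prior entries by embedding similarity. It runs in memory for clarity; in a setup like ours the equivalent nearest-neighbour query would more likely be pushed down to pgvector. The `MemoryEntry` shape, the `recallSimilar` helper, and the `minScore` threshold are all hypothetical names chosen for this example, not Tala's actual code.

```typescript
// Hypothetical shape for a stored journal entry; field names are illustrative.
interface MemoryEntry {
  id: string;
  text: string;
  embedding: number[];
}

interface MemoryHit {
  entry: MemoryEntry;
  score: number;
}

// Cosine similarity between two equal-length vectors (1 = same direction).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return up to k stored entries whose similarity to the query is at least
// minScore, most similar first. minScore is an assumed tuning knob.
function recallSimilar(
  query: number[],
  entries: MemoryEntry[],
  k: number,
  minScore = 0.8,
): MemoryHit[] {
  return entries
    .map((entry) => ({ entry, score: cosineSimilarity(query, entry.embedding) }))
    .filter((hit) => hit.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The threshold matters as much as the ranking: without it, the companion would "notice" a pattern even when the closest past entry is only loosely related.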

How we built it

We built Tala on Next.js 16 with React 19, using the App Router and React Server Components for fast initial page loads. Styling is handled by Tailwind CSS 4 with a deliberately minimal palette (cream, ink, and a brand purple), paired with Instrument Serif for an editorial feel. Google Gemini powers the AI layer: Flash-Lite for journaling follow-ups, Flash for transcription and grounding validation, Live for the real-time voice companion, and embeddings for long-term memory. The agent logic runs on LangGraph, structured as a directed graph flowing through input classification, safety gating, persona handling, distress detection, crisis escalation, memory retrieval, reply composition, reflection, and growth tracking. On the data side, Neon Postgres with pgvector stores users, journal entries, mood logs, and embeddings, with Drizzle ORM keeping the schema type-safe end-to-end. Serwist handles the PWA layer: precache, runtime cache, offline fallback, and a branded install prompt. For voice, Web Audio and an AudioWorklet capture 16 kHz PCM for Gemini Live, with a custom PCM playback queue on the client to keep responses smooth.
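The voice path above depends on getting microphone audio into 16 kHz PCM. Web Audio hands you 32-bit float samples, usually at 44.1 or 48 kHz, so something has to downsample and quantize them. The sketch below shows one simple way to do that; the function names are ours for this example, and the block-averaging downsampler is a crude stand-in for a proper low-pass filter, not necessarily what runs in Tala's worklet.

```typescript
// Downsample by averaging each block of input samples that maps to one
// output sample. Averaging acts as a rough low-pass filter; a production
// resampler would likely use a proper anti-aliasing filter instead.
function downsample(
  input: Float32Array,
  inputRate: number,
  outputRate: number,
): Float32Array {
  if (outputRate > inputRate) {
    throw new Error("Upsampling is not supported in this sketch");
  }
  if (outputRate === inputRate) return input.slice();

  const ratio = inputRate / outputRate;
  const outLength = Math.floor(input.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(input.length, Math.floor((i + 1) * ratio));
    let sum = 0;
    for (let j = start; j < end; j++) {
      sum += input[j] ?? 0;
    }
    out[i] = end > start ? sum / (end - start) : 0;
  }
  return out;
}

// Convert float samples in [-1, 1] to signed 16-bit PCM, clamping anything
// outside that range so loud input clips instead of wrapping around.
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i] ?? 0));
    out[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  }
  return out;
}

// Typical use: take a 48 kHz capture buffer and produce 16 kHz PCM.
function toPcm16k(input: Float32Array, inputRate: number): Int16Array {
  return floatTo16BitPCM(downsample(input, inputRate, 16000));
}
```

In a browser this would typically sit inside or just behind an AudioWorklet processor, with the resulting `Int16Array` buffers forwarded to the socket in small chunks.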

Challenges we ran into

Getting Gemini Live to work on the web was our biggest technical hurdle: WebSocket audio with ephemeral tokens, pause/resume, and teardown took multiple rewrites before it was stable. Caption sync was deceptively hard too; aligning Tala's spoken output with word-by-word captions required gating on audio playback and adding a tail lookahead so text didn't race ahead of the voice. Building the LangGraph agent was its own challenge: we needed enough guardrails to keep responses safe and on-tone, but enough judgment for the system to know when to listen, when to nudge, and when to escalate, without it feeling rigid or over-engineered. Safety calibration sat at the center of all of this: over-tune and users never feel heard; under-tune and we fail the people who need us most. Finding that middle ground took real iteration.
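To make the caption-sync idea concrete, here is a rough sketch of gating word reveal on the audio playback clock. It assumes words are spread evenly across a chunk's audio duration and reads "tail lookahead" as a small, bounded lead that captions are allowed over the voice. Both are our simplifications for this example, and `CaptionChunk`, `visibleWordCount`, and `captionText` are hypothetical names.

```typescript
// One spoken chunk and when its audio plays, measured on the playback clock.
interface CaptionChunk {
  words: string[];
  startMs: number; // playback time at which this chunk's audio begins
  durationMs: number; // length of this chunk's audio
}

// How many words of the chunk should be visible at a given playback time.
// Nothing is shown before the audio starts (the gate); after that, a word
// appears once playback time plus the lookahead reaches its estimated start.
function visibleWordCount(
  chunk: CaptionChunk,
  playbackMs: number,
  lookaheadMs: number,
): number {
  if (chunk.words.length === 0 || chunk.durationMs <= 0) return 0;
  if (playbackMs < chunk.startMs) return 0;

  // Assumption: every word takes an equal share of the chunk's duration.
  const perWordMs = chunk.durationMs / chunk.words.length;
  const elapsedMs = playbackMs - chunk.startMs + lookaheadMs;
  const count = Math.floor(elapsedMs / perWordMs) + 1;
  return Math.min(chunk.words.length, count);
}

// Caption text to render for the chunk at the current playback time.
function captionText(
  chunk: CaptionChunk,
  playbackMs: number,
  lookaheadMs: number,
): string {
  return chunk.words
    .slice(0, visibleWordCount(chunk, playbackMs, lookaheadMs))
    .join(" ");
}
```

Driving this from the playback queue's clock, rather than from when text arrives over the socket, is what keeps captions from running ahead when audio is buffered.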

Accomplishments that we're proud of

Tala ships as a real PWA. The voice companion responds in under a second, making it feel like a conversation rather than a query. Our safety layer is built into the agent graph itself, not the prompt, so it runs on every turn no matter how the input is worded. Tala switches between Tagalog and English based on what the user actually speaks, rather than defaulting to English.

What we learned

Latency is empathy. When an AI companion pauses half a second too long, the user feels ignored; when it responds too fast, it feels like it didn't listen. The sweet spot is surprisingly narrow. We learned that safety is a product feature, not a disclaimer: moving crisis detection from a line in the system prompt to a dedicated node in the agent graph was the single biggest trust upgrade we made. Graph-based agents scale better than mega-prompts; once we broke the brain into LangGraph nodes, adding capabilities like memory, reflection, and growth tracking stopped fighting each other and started composing.
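The point about safety living in the graph rather than the prompt can be sketched without any framework. The example below is a dependency-free toy graph, not LangGraph's API and not Tala's real node set: a safety gate always runs first, and its result decides which node runs next. The patterns and reply strings are illustrative placeholders only.

```typescript
type NodeName = "safetyGate" | "crisisEscalation" | "composeReply";

interface AgentState {
  input: string;
  crisis: boolean;
  reply: string;
  visited: NodeName[];
}

// Illustrative phrases only; a real list would be curated with clinicians
// and paired with intent classification, not keywords alone.
const CRISIS_PATTERNS: RegExp[] = [
  /\bhurt myself\b/i,
  /\bend my life\b/i,
  /\bayoko nang mabuhay\b/i,
];

// Each node takes the current state and returns an updated copy.
const nodes: Record<NodeName, (s: AgentState) => AgentState> = {
  safetyGate: (s) => ({
    ...s,
    crisis: CRISIS_PATTERNS.some((p) => p.test(s.input)),
  }),
  crisisEscalation: (s) => ({
    ...s,
    reply: "[placeholder: supportive message plus verified hotline details]",
  }),
  composeReply: (s) => ({
    ...s,
    reply: "[placeholder: normal reflective reply]",
  }),
};

// Edges decide the next node from state; null ends the run. Routing is
// code, so the model's output cannot skip the gate.
const edges: Record<NodeName, (s: AgentState) => NodeName | null> = {
  safetyGate: (s) => (s.crisis ? "crisisEscalation" : "composeReply"),
  crisisEscalation: () => null,
  composeReply: () => null,
};

function runGraph(input: string): AgentState {
  let state: AgentState = { input, crisis: false, reply: "", visited: [] };
  let current: NodeName | null = "safetyGate";
  while (current !== null) {
    state = nodes[current](state);
    state = { ...state, visited: [...state.visited, current] };
    current = edges[current](state);
  }
  return state;
}
```

Because the entry point is fixed and the branch is ordinary code, a crisis message reaches `crisisEscalation` regardless of what instructions the input text contains. New capabilities become new nodes and edges rather than more lines in one prompt.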

What's next for Tala: Where Your Voice Lights the Way

We want to deepen Tala's memory on the user's terms: opt-in longitudinal reflections with transparent controls to delete, export, or reset at any time. Language support is a priority: beyond Tagalog and English, we're looking at Cebuano and Ilocano, languages millions of Filipinos think in but that mainstream AI still treats as second-class. We're pursuing crisis partnerships and expanding a verified professional network for direct, opt-in warm handoffs to licensed counselors and organizations like In Touch, HOPELINE PH, and the NCMH crisis line, so Tala never has to be the last line of defense. We're also adding a "Scan Your Diary" feature that lets users photograph pages from a physical notebook and bring handwritten entries into Tala, bridging the gap between pen-and-paper journaling and AI-supported reflection. Beyond that, we'll keep improving the AI itself (better context, sharper empathy, and stronger safety) and expand offline-first wellness so the full toolkit works without internet, for users on the jeepney, in the province, at 2 AM with bad signal.