Inspiration
Most food tracking apps are built around a broken premise: that calories are the primary metric of good eating. The reality is that a 200-calorie candy bar and a 200-calorie serving of Greek yogurt have wildly different impacts on hunger. We built SatiateAI around a different question — how long will this meal actually keep you full? The inspiration came from a frustration shared by all of us: logging food is tedious, calorie counts are misleading, and no tool accounts for the biological drivers of satiety — protein, fiber, healthy fats, and glycemic impact. We wanted to fix all three problems at once.
What It Does
SatiateAI is a voice-first food logging web app that combines conversational AI, speech processing, and real-time nutritional visualization to give users a satiety score (0.0–1.0) for every meal — not just a calorie count.
Core loop:
1. User clicks the mic and describes their meal in natural speech
2. ElevenLabs Scribe transcribes the audio to text
3. Gemini 2.5 Flash asks 1–2 intelligent clarifying questions (e.g. "Was the dressing on the side or mixed in?") to resolve satiety-critical ambiguities
4. The AI finalizes a structured nutritional breakdown: calories, protein, carbs, fat, fiber, sugar, satiety score, and a warm voice summary spoken back via ElevenLabs TTS
5. The dashboard updates in real time: macro progress bars fill, an animated SVG figure colorizes by nutrient zone, and a meal card is added to the log
Personalized daily targets are calculated from the user's BMI, BMR, and TDEE using standard exercise-adjusted formulas, so every score is relative to their body, not population averages.
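The personalized targets described above can be illustrated with a minimal sketch. The write-up does not name the formulas, so this assumes the Mifflin-St Jeor equation for BMR and the commonly used activity multipliers for TDEE; the function names and the profile shape are hypothetical.

```javascript
// Assumption: Mifflin-St Jeor BMR and conventional activity multipliers.
// The write-up only says "standard exercise-adjusted formulas".
const ACTIVITY_MULTIPLIERS = {
  sedentary: 1.2,
  light: 1.375,
  moderate: 1.55,
  active: 1.725,
  veryActive: 1.9,
};

// Body mass index: weight in kg divided by height in metres squared.
function bmi(weightKg, heightCm) {
  const heightM = heightCm / 100;
  return weightKg / (heightM * heightM);
}

// Basal metabolic rate in kcal/day (Mifflin-St Jeor).
function bmr({ weightKg, heightCm, age, sex }) {
  const base = 10 * weightKg + 6.25 * heightCm - 5 * age;
  return sex === "male" ? base + 5 : base - 161;
}

// Total daily energy expenditure: BMR scaled by an activity multiplier.
function tdee(profile, activity) {
  const multiplier = ACTIVITY_MULTIPLIERS[activity];
  if (multiplier === undefined) {
    throw new Error(`Unknown activity level: ${activity}`);
  }
  return bmr(profile) * multiplier;
}
```

A call such as tdee({ weightKg: 70, heightCm: 175, age: 30, sex: "male" }, "moderate") would then give the daily calorie target against which the macro progress bars could be scaled.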
How We Built It
Frontend: Vanilla JavaScript ES6 modules, HTML5, Tailwind CSS. No framework — deliberate choice to keep the bundle lightweight and the state machine fully explicit.
Backend: Node.js + Express 5 serving a REST API with four core endpoints: /api/stt, /api/tts, /api/chat, and /api/voices.
AI pipeline:
- Speech-to-Text: ElevenLabs /v1/speech-to-text using the scribe_v1 model, receiving raw WebM/Ogg blobs from the browser's MediaRecorder API
- Conversational reasoning: Google Gemini 2.5 Flash with two distinct system prompts: a CLARIFY phase that generates context-sensitive follow-up questions based on food category heuristics, and a FINALIZE phase that outputs a strict JSON schema with all macro fields including sugar_g and fiber_g
- Text-to-Speech: ElevenLabs /v1/text-to-speech/{voiceId} using eleven_turbo_v2 with tuned stability (0.5) and similarity boost (0.75) for natural-sounding meal feedback
Satiety scoring algorithm is a weighted composite:
- Protein: 30% weight
- Fiber: 25%
- Healthy fats: 20%
- Water content/volume: 15%
- Glycemic impact (inverted): 10%
Auth: Supabase Google OAuth with JWT session management, routing through an /auth/callback.html handler.
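The weighted satiety composite listed in the AI pipeline above can be sketched roughly as follows. The weights come from the write-up; everything else is an assumption for illustration, in particular that each factor arrives already normalized to a 0–1 sub-score (how grams of protein or fiber map to that range is not described) and the field names used here.

```javascript
// Weights as stated in the write-up; they sum to 1.0.
const SATIETY_WEIGHTS = {
  protein: 0.30,
  fiber: 0.25,
  healthyFats: 0.20,
  waterVolume: 0.15,
  glycemicImpact: 0.10, // applied to the inverted sub-score
};

// Clamp to [0, 1]; missing or non-numeric values count as 0.
const clamp01 = (x) => (Number.isFinite(x) ? Math.min(1, Math.max(0, x)) : 0);

// Hypothetical input: sub-scores in [0, 1] for each factor.
// A missing glycemicImpact is treated as 0, i.e. the best case after inversion.
function satietyScore(sub) {
  const score =
    SATIETY_WEIGHTS.protein * clamp01(sub.protein) +
    SATIETY_WEIGHTS.fiber * clamp01(sub.fiber) +
    SATIETY_WEIGHTS.healthyFats * clamp01(sub.healthyFats) +
    SATIETY_WEIGHTS.waterVolume * clamp01(sub.waterVolume) +
    SATIETY_WEIGHTS.glycemicImpact * (1 - clamp01(sub.glycemicImpact));
  // Round to two decimals to match the 0.0–1.0 score shown to users.
  return Math.round(score * 100) / 100;
}
```

Inverting the glycemic term means a meal with a high glycemic impact loses up to 0.10 of its score, while the other four factors only ever add to it.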
Visualization: SVG human figure with four nutrient-mapped fill zones (head = sugar, arms = protein, torso = carbs, legs = fat), animated with cubic-bezier easing. Fill opacity scales linearly from baseline (0.08) to full saturation (0.72) based on consumed-vs-target ratio.
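The linear opacity mapping described above might look like the sketch below. The 0.08 and 0.72 endpoints are from the write-up; the function name, the clamping of ratios above 1, and the handling of a zero target are assumptions.

```javascript
const BASE_OPACITY = 0.08; // empty zone
const FULL_OPACITY = 0.72; // target reached

// Map a consumed-vs-target ratio onto the opacity range linearly.
function fillOpacity(consumed, target) {
  // Assumption: no target (or an invalid one) leaves the zone at baseline.
  if (!(target > 0)) return BASE_OPACITY;
  // Assumption: overshooting the target does not go beyond full saturation.
  const ratio = Math.min(1, Math.max(0, consumed / target));
  return BASE_OPACITY + (FULL_OPACITY - BASE_OPACITY) * ratio;
}
```

The result could be written to a zone's fill-opacity attribute, with a CSS transition using a cubic-bezier timing function supplying the easing.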
Challenges We Ran Into
TTS silent failure after voice selector was introduced. When activeVoiceId was null at call time (because loadVoices() hadn't resolved), the server was silently constructing https://api.elevenlabs.io/v1/text-to-speech/null and receiving a 400 back from ElevenLabs with no client-side surfacing. Fixed with a sentinel check on the server — explicitly rejecting "null" and "undefined" strings and falling back to the ELEVENLABS_VOICE_ID env var.
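A minimal sketch of the kind of sentinel check described above. The helper name is hypothetical and the real server code may be structured differently; the point is that the stringified values "null" and "undefined" are rejected before the ElevenLabs URL is built.

```javascript
// Return a usable voice ID, or the fallback when the client sent nothing useful.
function resolveVoiceId(requested, fallback) {
  const id = typeof requested === "string" ? requested.trim() : "";
  // A null or undefined value interpolated on the client arrives as these strings.
  if (id === "" || id === "null" || id === "undefined") {
    return fallback;
  }
  return id;
}

// Possible use inside the /api/tts handler (the request field name is an assumption):
//   const voiceId = resolveVoiceId(req.body.voiceId, process.env.ELEVENLABS_VOICE_ID);
```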
Gemini JSON extraction with hallucinated markdown. Gemini would occasionally wrap output in triple-backtick code fences despite explicit instructions not to. We added a regex strip pass on the response before JSON.parse().
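The strip pass could be as small as the sketch below. The exact regex used in the project is not shown in the write-up, so this pattern and the helper name are assumptions; it handles a single fence wrapped around the whole response, with or without a json language tag.

```javascript
// Build the fence marker programmatically so this snippet stays readable
// when it is itself shown inside a fenced block.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(
  "^" + FENCE + "(?:json)?\\s*([\\s\\S]*?)\\s*" + FENCE + "$",
  "i"
);

// Remove an enclosing code fence if present, then parse.
// JSON.parse still throws on malformed output, so callers should catch that.
function parseModelJson(raw) {
  const text = String(raw).trim();
  const match = text.match(FENCED_BLOCK);
  return JSON.parse(match ? match[1] : text);
}
```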
MediaRecorder codec negotiation across browsers. Chrome outputs WebM/Opus; Safari outputs MP4/AAC. ElevenLabs Scribe accepts both, but the Content-Type header had to be dynamically set from MediaRecorder.mimeType rather than hardcoded.
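One way to derive the header from the recorder rather than hardcoding it is sketched here. MediaRecorder.mimeType is a real browser property, but it can be an empty string in some situations, so the sketch includes a fallback; the helper name and the audio/webm default are assumptions.

```javascript
// Pick the Content-Type for the upload from what the recorder actually produced.
function uploadContentType(recorderMimeType, fallback = "audio/webm") {
  const type = typeof recorderMimeType === "string" ? recorderMimeType.trim() : "";
  return type !== "" ? type : fallback;
}

// Possible use in the browser (request shape is an assumption):
//   const type = uploadContentType(mediaRecorder.mimeType);
//   const blob = new Blob(chunks, { type });
//   await fetch("/api/stt", { method: "POST", headers: { "Content-Type": type }, body: blob });
```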
Browser autoplay policy. Audio playback initiated outside a direct user gesture chain gets blocked silently by modern browsers. Since the entire voice loop is triggered by the mic button click and awaited end-to-end, the gesture chain stays intact — but we added a .catch(resolve) guard on audio.play() to prevent the state machine from freezing if the policy triggers unexpectedly.
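The guard described above can be sketched as a small promise wrapper. The function name is hypothetical, and the extra "error" listener is an addition not mentioned in the write-up; the essential part is that a rejected play() resolves the promise instead of leaving the caller waiting forever.

```javascript
// Resolve when playback ends, errors, or is blocked by the autoplay policy.
function playToEnd(audio) {
  return new Promise((resolve) => {
    audio.addEventListener("ended", resolve, { once: true });
    audio.addEventListener("error", resolve, { once: true });
    // In modern browsers play() returns a promise that rejects when blocked.
    const started = audio.play();
    if (started && typeof started.catch === "function") {
      started.catch(resolve);
    }
  });
}
```

Because the promise always settles, the state machine can move on to its next state after awaiting playToEnd(audio), whether or not any sound was actually played.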
Accomplishments That We're Proud Of
- A fully voice-driven food logging experience where the entire interaction, from speech to nutritional analysis to spoken confirmation, completes in under 6 seconds
- A proprietary satiety scoring system that meaningfully differentiates meals that calorie counters treat as identical
- A real-time animated nutrition visualization that turns abstract macro data into an intuitive body-mapped display
- Clean separation of concerns across the codebase: voice.js owns all audio I/O, dashboard.js owns state and UI, server.mjs owns all external API calls, so no API keys ever touch the client
What We Learned
- Conversational AI is most useful when the prompts encode domain knowledge, not just instructions. The CLARIFY prompt performs meaningfully better because it includes food-category heuristics, not because Gemini is smarter with more tokens.
- Voice UX design is fundamentally different from visual UX: latency tolerance, error recovery, and feedback loops all need to be redesigned from scratch
- Satiety science is genuinely underrepresented in consumer nutrition tooling despite being well-established in the academic literature (protein and fiber are far stronger satiety predictors than total caloric density)
- Building stateful multi-turn voice interactions without a framework requires rigorous state machine design; implicit state through nested callbacks leads to race conditions that are nearly impossible to debug
What's Next for SatiateAI
- Persistent meal history synced to Supabase PostgreSQL so users can track satiety patterns over days and weeks
- Meal photo input as an alternative to voice: Gemini's vision capabilities can identify foods from a photo and feed the same clarify → finalize pipeline
- Wearable integration: correlating satiety scores with actual hunger signals from Apple HealthKit or Fitbit data to validate and refine the scoring model
- Pattern analytics: the patterns.html foundation is already built; the next step is real data powering weekly heatmaps, time-of-day satiety trends, and "your most filling meal this week" summaries
- Mobile-first redesign and a React Native wrapper to bring the voice-first experience to iOS/Android natively