- Hero page: start of the wellness journey
- Selection of the track that you want
- Meal logging and chatting with your live professional and seeing the workout to do
- Live chatting with your wellness coach using Gemini Live
- Squat tracking to make sure you get your reps in with perfect form
- Google Cloud deployment image proof
- Architecture diagram for the code
Inspiration
Most wellness apps are built like dashboards: they expect people to manually log meals, manually interpret plans, manually stay motivated, and manually recover when life gets busy. That breaks down for the exact people who need support the most.
We built Swasthya to feel less like a tracker and more like a live wellness coach. The idea was simple: if Gemini can see, listen, speak, and respond in real time, then wellness support should feel conversational, proactive, and human, not like filling out forms.
We were especially inspired by the gap between knowing what to do and actually doing it. People often know they should eat better, train consistently, or recover properly, but they lose momentum because the tools are too rigid. Swasthya is designed to reduce that friction.
What it does
Swasthya is a live AI wellness coach that helps users with meals, workouts, and recovery in one focused experience.
The app supports four core flows:
Camera-first meal logging
Users can tap Meal, open the camera, and show what they are eating. Gemini analyzes the image and returns a concise estimate of foods, calories, protein, and confidence.
Natural-language coaching
Users can type naturally to the coach instead of navigating rigid forms. For example, they can say things like:
- I ate 4 apples
- I only have 20 minutes today
- I feel sore
- Start recovery
Laptop-camera squat tracking
Swasthya uses browser-based pose tracking to detect body keypoints and count squat reps, giving users a more interactive workout experience than a static plan.
Live recovery coaching with Gemini Live
Users can start a voice session, speak naturally, and receive spoken support back from Gemini in real time. This makes recovery feel like an actual coach check-in rather than a scripted meditation screen.
Swasthya also keeps a lightweight daily calorie and protein log and allows exporting those logs for review.
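To make the daily log and export idea concrete, here is a minimal sketch of how such a log could be totaled and exported as CSV. The `MealEntry` shape, the field names, and the CSV column order are assumptions for illustration, not Swasthya's actual schema.

```typescript
// Hypothetical shape of one logged meal; field names are assumptions.
interface MealEntry {
  loggedAt: string; // ISO 8601 timestamp, e.g. "2025-01-02T08:00:00Z"
  description: string;
  calories: number;
  proteinGrams: number;
}

// Sum calories and protein for entries whose timestamp starts with `day` (YYYY-MM-DD).
function dailyTotals(
  entries: MealEntry[],
  day: string,
): { calories: number; proteinGrams: number } {
  return entries
    .filter((e) => e.loggedAt.startsWith(day))
    .reduce(
      (acc, e) => ({
        calories: acc.calories + e.calories,
        proteinGrams: acc.proteinGrams + e.proteinGrams,
      }),
      { calories: 0, proteinGrams: 0 },
    );
}

// Quote a CSV field when it contains a comma, a double quote, or a newline.
function csvField(value: string): string {
  return /[",\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Serialize the log with a header row, one line per entry.
function exportCsv(entries: MealEntry[]): string {
  const header = "loggedAt,description,calories,proteinGrams";
  const rows = entries.map((e) =>
    [e.loggedAt, csvField(e.description), String(e.calories), String(e.proteinGrams)].join(","),
  );
  return [header, ...rows].join("\n");
}
```

Matching on the timestamp prefix groups entries by the day written in the timestamp itself, so a real log would need to decide whether that day is in UTC or in the user's local time zone.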
Architecture
Swasthya is built as a mobile-first Next.js application with a lightweight server layer for Gemini orchestration and state updates.
At a high level:
- the frontend handles the user experience, onboarding, coach chat, camera capture, squat tracking, and the live voice UI
- Next.js API routes handle meal analysis, coaching actions, workout adaptation, onboarding state, and live session bootstrap
- Gemini core models are used for meal understanding, natural-language coaching, and workout/recovery intelligence
- Gemini Live is used for the real-time voice recovery conversation
- MediaPipe Tasks Vision runs in the browser for squat pose tracking and rep counting
- Firestore is available for persistence, while in-memory mode supports the fastest demo path
- the deployed backend runs on Google Cloud Run
System flow:
- the user interacts with the Next.js app
- camera input and text input are sent into the Swasthya experience
- API routes call Gemini for meal and coaching intelligence
- the browser connects to Gemini Live for voice sessions
- workout tracking runs client-side with MediaPipe
- app state is stored in memory or Firestore depending on configuration
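The last step, choosing between in-memory and persistent state, can be sketched as a small store interface. This is an illustration only: `CoachState`, `StateStore`, `MemoryStore`, `createStore`, and the `"firestore"` mode string are all hypothetical names, and the Firestore-backed store is passed in rather than implemented so the sketch needs no SDK.

```typescript
// Hypothetical per-user state; the real app's shape is not described here.
interface CoachState {
  onboarded: boolean;
  caloriesToday: number;
}

// Minimal contract that both storage modes would satisfy.
interface StateStore {
  get(userId: string): Promise<CoachState | undefined>;
  set(userId: string, state: CoachState): Promise<void>;
}

// In-memory mode: fast and dependency-free, but state is lost on restart.
class MemoryStore implements StateStore {
  private readonly data = new Map<string, CoachState>();

  async get(userId: string): Promise<CoachState | undefined> {
    return this.data.get(userId);
  }

  async set(userId: string, state: CoachState): Promise<void> {
    this.data.set(userId, { ...state }); // store a copy so callers cannot mutate it
  }
}

// Pick a store from configuration. `persistent` stands in for a
// Firestore-backed StateStore (assumed, not shown).
function createStore(mode: string | undefined, persistent?: StateStore): StateStore {
  if (mode === "firestore" && persistent) return persistent;
  return new MemoryStore();
}
```

Keeping API routes dependent only on the interface means the demo path and the persistent path can share the same route code.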
How we built it
Frontend
- Next.js 16
- React 19
- TypeScript
- Framer Motion
- A custom mobile-first UI centered on one conversational coach experience
AI and multimodal stack
- Gemini core models for:
  - meal image understanding
  - natural-language coaching
  - meal parsing from text
  - workout adaptation and recovery messaging
- Gemini Live for:
  - real-time voice recovery conversations
  - microphone input + spoken response output
- MediaPipe Tasks Vision for:
  - squat pose tracking
  - visible keypoints
  - browser-side rep counting
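Rep counting from keypoints usually starts with a joint angle. The sketch below computes the knee angle from hip, knee, and ankle positions; it is a generic geometric helper, not Swasthya's actual code, and the `Point` type and `jointAngle` name are assumptions. In the MediaPipe pose landmark model the left hip, knee, and ankle are landmarks 23, 25, and 27 (right side: 24, 26, 28).

```typescript
// A 2D keypoint, e.g. the x/y of a pose landmark.
interface Point {
  x: number;
  y: number;
}

// Interior angle at `b`, in degrees, between segments b->a and b->c.
// For a knee angle: a = hip, b = knee, c = ankle.
function jointAngle(a: Point, b: Point, c: Point): number {
  const v1 = { x: a.x - b.x, y: a.y - b.y };
  const v2 = { x: c.x - b.x, y: c.y - b.y };
  const dot = v1.x * v2.x + v1.y * v2.y;
  const mag = Math.hypot(v1.x, v1.y) * Math.hypot(v2.x, v2.y);
  if (mag === 0) return NaN; // degenerate: two keypoints coincide
  // Clamp to guard against floating-point drift outside [-1, 1].
  const cos = Math.min(1, Math.max(-1, dot / mag));
  return (Math.acos(cos) * 180) / Math.PI;
}
```

A straight leg gives an angle near 180 degrees and a deep squat a much smaller one. Landmark coordinates normalized to the frame are stretched by a non-square aspect ratio, so scaling them to pixels before computing angles is likely to be more accurate.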
Backend / state / cloud
- Next.js API routes for meal analysis, coaching actions, onboarding, live session bootstrap, and workout actions
- Ephemeral Live token bootstrap for browser-direct Gemini Live sessions
- In-memory state for the fastest demo path
- Optional Firestore integration for persistent state
- Google Cloud Run for deployment
Challenges we ran into
One of the biggest challenges was making the Live experience feel real instead of fake. Early versions behaved more like a scripted overlay than a true live call. We had to rework the flow so it used proper Gemini Live session bootstrapping, microphone capture, spoken responses, and interruption handling.
Another major challenge was meal reliability. Multimodal demos feel great when they work, but they break trust fast when image parsing is wrong or chat logging is too brittle. We had to tighten the meal flows so both camera-based logging and text-based logging felt believable.
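One common way to tighten a flow like this is to validate the model's output before anything is logged. The following is a minimal sketch under assumptions: it supposes the model is prompted to reply with a JSON object, and the `MealEstimate` fields and `parseMealEstimate` name are hypothetical rather than taken from the project.

```typescript
// Hypothetical structured result of a meal analysis.
interface MealEstimate {
  foods: string[];
  calories: number;
  proteinGrams: number;
  confidence: number; // 0 to 1
}

// Accept only finite, non-negative numbers.
function isCount(v: unknown): v is number {
  return typeof v === "number" && Number.isFinite(v) && v >= 0;
}

// Parse model text expected to contain one JSON object.
// Returns null instead of logging a malformed or implausible estimate.
function parseMealEstimate(raw: string): MealEstimate | null {
  // The reply may wrap the JSON in extra prose, so take the outermost braces.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  let data: unknown;
  try {
    data = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const d = data as Record<string, unknown>;
  const foods = Array.isArray(d.foods)
    ? d.foods.filter((f): f is string => typeof f === "string")
    : [];
  const calories = d.calories;
  const proteinGrams = d.proteinGrams;
  const rawConfidence = d.confidence;
  if (foods.length === 0 || !isCount(calories) || !isCount(proteinGrams)) return null;

  // A missing confidence is treated as 0 so the UI can ask the user to confirm.
  const confidence = isCount(rawConfidence) ? Math.min(1, rawConfidence) : 0;
  return { foods, calories, proteinGrams, confidence };
}
```

Returning `null` gives the caller one clear place to show a retry or a manual-entry prompt instead of silently logging a bad meal.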
We also ran into issues around frontend simplicity. The app originally had too many surfaces and too much dashboard energy. Over time, we cut that down into a much more intuitive flow where the coach is the center of the experience.
On the workout side, squat rep counting also needed multiple iterations. A naive threshold approach was too noisy, so we moved to a more stable phase-based model to make counting more believable.
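To illustrate the difference, here is a minimal sketch of a phase-based counter driven by the knee angle. It is one possible design, not necessarily the model Swasthya uses; the `SquatCounter` class, the four phase names, and the 160 and 100 degree thresholds are assumptions.

```typescript
type Phase = "standing" | "descending" | "bottom" | "ascending";

// Counts a rep only after a full standing -> bottom -> standing cycle.
class SquatCounter {
  reps = 0;
  phase: Phase = "standing";
  private readonly standAngle: number;
  private readonly bottomAngle: number;

  // Thresholds in degrees; the defaults are assumed values, not tuned ones.
  constructor(standAngle = 160, bottomAngle = 100) {
    this.standAngle = standAngle;
    this.bottomAngle = bottomAngle;
  }

  // Feed one knee-angle sample per video frame; returns the rep count so far.
  update(kneeAngle: number): number {
    if (!Number.isFinite(kneeAngle)) return this.reps; // skip frames with no pose
    switch (this.phase) {
      case "standing":
        if (kneeAngle < this.standAngle) this.phase = "descending";
        break;
      case "descending":
        if (kneeAngle <= this.bottomAngle) this.phase = "bottom";
        else if (kneeAngle >= this.standAngle) this.phase = "standing"; // shallow dip, no rep
        break;
      case "bottom":
        if (kneeAngle > this.bottomAngle) this.phase = "ascending";
        break;
      case "ascending":
        if (kneeAngle >= this.standAngle) {
          this.phase = "standing";
          this.reps += 1; // full cycle completed
        } else if (kneeAngle <= this.bottomAngle) {
          this.phase = "bottom"; // dipped back down before standing up
        }
        break;
    }
    return this.reps;
  }
}
```

Because a rep requires crossing both thresholds in order, jitter around either threshold only moves the counter between adjacent phases. A single-threshold counter would register each of those crossings as a rep.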
Finally, deployment introduced its own challenges:
- making sure the Cloud Run build worked cleanly
- keeping lockfiles in sync
- validating a real Google Cloud deployment path for submission requirements
Accomplishments that we're proud of
We are proud that Swasthya is not just a concept demo. It is a working multimodal product with real interaction loops.
Highlights we are especially proud of:
- a camera-first meal logging flow that feels natural
- a real Gemini Live voice recovery session
- browser-based squat tracking with visible body keypoints and rep counting
- a cleaner, more premium mobile-first UX than a typical hackathon dashboard
- successful Google Cloud Run deployment
- a product direction that feels like a believable live wellness coach rather than a generic chatbot
What we learned
We learned that building a good live AI product is not mainly about adding more features. It is about reducing friction and making the interaction feel natural.
A few key lessons stood out:
- multimodal only matters if the workflow is intuitive
- voice experiences feel fake immediately if they are scripted
- small UX details matter far more than feature count
- live agents need strong fallbacks and error visibility
- demo quality depends heavily on reliability, not just ambition
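The point about fallbacks and error visibility can be sketched as a small wrapper around any model call. This is a generic pattern under assumptions, not code from the project: `withFallback`, the `Outcome` type, and the 5 second default timeout are hypothetical.

```typescript
// Result that always carries a usable value plus what went wrong, if anything.
interface Outcome<T> {
  value: T;
  usedFallback: boolean;
  error?: string;
}

// Run `primary`; if it throws or exceeds `timeoutMs`, return `fallback`
// together with the error message so the UI can surface it.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: T,
  timeoutMs = 5000,
): Promise<Outcome<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${timeoutMs} ms`)), timeoutMs);
  });
  try {
    const value = await Promise.race([primary(), timeout]);
    return { value, usedFallback: false };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { value: fallback, usedFallback: true, error };
  } finally {
    if (timer !== undefined) clearTimeout(timer); // avoid a dangling timer
  }
}
```

For example, a coaching route could pass a canned "I could not reach the coach, try again" reply as the fallback and log `error` whenever `usedFallback` is true.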
We also learned that browser-based AI experiences can be surprisingly strong when you combine:
- camera input
- live audio
- local pose tracking
- a focused conversational UI
What's next for Swasthya - Live Wellness Coach
The next steps for Swasthya are to deepen the sense of continuity and personalization.
We want to extend the product with:
- persistent user history and coach memory
- richer nutrition tracking and trend views
- broader workout tracking beyond squats
- scheduled proactive check-ins and reminders
- better logging/export integrations
- more polished recovery and adherence coaching loops
Long term, we see Swasthya as a wellness product where the user does not need to constantly manage the app. Instead, the coach becomes the interface: you show, speak, ask, and respond naturally, and the system helps keep you on track.
Built With
- firestore
- framer-motion
- gemini-api
- gemini-live-api
- google-cloud-run
- mediapipe-tasks-vision
- next.js
- node.js
- react
- typescript

