Inspiration
The idea started with a book — Scarcity by Mullainathan and Shafir. The central finding is that scarcity of any resource — time, money, cognitive bandwidth — taxes the mind in ways that spill over into unrelated decisions. When you're under deadline pressure, your brain tunnels onto the stressor and defaults to avoidance on everything else. Skipping a workout under pressure isn't a discipline failure. It's a bandwidth failure.
That reframing felt important. Most fitness tools respond to low adherence by adding features — more nudges, better plans, streak counters. But if the root cause is cognitive depletion rather than motivation, the solution isn't more friction. It's a system that knows what your day looks like before it tells you what to do.
We also kept coming back to a finding from Marcora et al. (2009): 90 minutes of demanding cognitive work reduced time to exhaustion in a subsequent cycling test by roughly 15%, not through any measurable physiological change, but through elevated perceived exertion alone. The muscles are fine. The brain is lying. That result sat at the center of what we wanted to build: a training assistant that treats mental fatigue as a real variable, not an excuse.
What it does
Adapt is a multi-agent AI system that generates a personalized daily training plan by jointly modeling physiological readiness and cognitive load.
Three agents run in sequence:
Schedule Agent authenticates with Google Calendar via OAuth 2.0 and uses Gemini 2.0 Flash to classify each event's cognitive cost on a 1–10 scale — capturing not just time usage but mental demand, stakes, and emotional labor. A client presentation and a routine standup are not the same stressor, and Adapt quantifies the difference. The agent outputs a daily cognitive load score with per-event reasoning.
Biometric Agent pulls live data from Garmin Connect — HRV, sleep score, body battery, resting HR, and 7-day training load history — and uses Gemini to compute a readiness score from 0 to 100 with a signal-by-signal breakdown. The key insight here is that HRV reflects total allostatic load, not just training stress. A packed meeting schedule and an interval session draw from the same physiological reserve.
Synthesis Agent receives both outputs and applies constraint-aware reasoning grounded in the research:
$$\text{Intensity ceiling} = f(\text{readiness score}, \text{cognitive load}, \text{ACWR}, \text{stressor proximity})$$
It generates a concrete plan — what to do, what to skip, how hard to push — with a full reasoning chain: biometric factor, schedule factor, scarcity insight, and long-term training logic.
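To make the shape of that function concrete, here is a minimal sketch of a rule-based intensity ceiling. In Adapt the synthesis itself is performed by Gemini, so this is only an illustration: the `DayInputs` type, the weights, and the cut-offs (ACWR above 1.3, a 3-hour stressor window) are all assumptions, not values taken from the project.

```python
from dataclasses import dataclass


@dataclass
class DayInputs:
    readiness: float          # 0-100, from the Biometric Agent
    cognitive_load: float     # 1-10, from the Schedule Agent
    acwr: float               # acute:chronic workload ratio
    hours_to_stressor: float  # time until the next high-stakes event


def intensity_ceiling(d: DayInputs) -> float:
    """Return a ceiling in [0, 1]; 1.0 means no restriction.

    All weights and cut-offs below are illustrative assumptions.
    """
    ceiling = d.readiness / 100.0
    # Scale down linearly with cognitive load: load 1 -> x1.0, load 10 -> x0.6.
    ceiling *= 1.0 - 0.4 * (d.cognitive_load - 1.0) / 9.0
    # Penalize a workload spike (assumed threshold: ACWR above 1.3).
    if d.acwr > 1.3:
        ceiling *= 0.8
    # Penalize sessions close to a major stressor (assumed 3-hour window).
    if d.hours_to_stressor < 3.0:
        ceiling *= 0.9
    return max(0.0, min(1.0, ceiling))
```

A fixed rule like this is easy to audit, while the model-driven version trades that for richer reasoning; either way the four inputs are the same as in the formula above.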
The full pipeline streams live over SSE so users can watch each agent complete in sequence and inspect the reasoning before they see the final plan. The result includes live-rendered charts (14-day HRV trend, sleep quality, acute vs. chronic training load with ACWR risk zones), a KPI dashboard, an interactive session checklist with "if tired" fallbacks per step, and domain-specific recommendations.
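For readers unfamiliar with ACWR (acute:chronic workload ratio), the following sketch shows one common way to compute it and map it to risk zones. The 7-day and 28-day windows and the zone boundaries (0.8, 1.3, 1.5) are widely used conventions assumed here for illustration; the write-up does not state which windows or thresholds Adapt uses.

```python
from statistics import mean


def acwr(daily_loads: list[float]) -> float:
    """Acute:chronic workload ratio from daily load values, oldest first."""
    if len(daily_loads) < 28:
        raise ValueError("need at least 28 days of load data")
    acute = mean(daily_loads[-7:])     # average load over the last 7 days
    chronic = mean(daily_loads[-28:])  # average load over the last 28 days
    if chronic == 0:
        return 0.0
    return acute / chronic


def risk_zone(ratio: float) -> str:
    """Map a ratio to a zone label (assumed boundaries)."""
    if ratio < 0.8:
        return "undertraining"
    if ratio <= 1.3:
        return "optimal"
    if ratio <= 1.5:
        return "caution"
    return "high risk"
```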
How we built it
Backend: Python + Flask handles orchestration, routing, and SSE streaming. The pipeline is threaded — each agent run is launched asynchronously and emits events into an in-memory store that the SSE endpoint polls and forwards to the client in real time.
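The threading pattern described here can be sketched with the standard library alone. This is a simplified illustration rather than the project's code: it assumes a `queue.Queue` as the in-memory store, and the names `run_pipeline`, `sse_stream`, and `start` are hypothetical.

```python
import json
import queue
import threading
from typing import Callable, Iterator

_DONE = object()  # sentinel marking the end of the pipeline


def run_pipeline(agents: list[tuple[str, Callable[[], dict]]], events: queue.Queue) -> None:
    """Run agents in order on a background thread, emitting one event per agent."""
    for name, agent in agents:
        try:
            events.put({"agent": name, "status": "done", "result": agent()})
        except Exception as exc:  # report the failure instead of killing the stream
            events.put({"agent": name, "status": "error", "error": str(exc)})
    events.put(_DONE)


def sse_stream(events: queue.Queue) -> Iterator[str]:
    """Yield one SSE frame per event as it arrives; blocks between events."""
    while True:
        item = events.get()
        if item is _DONE:
            break
        yield f"data: {json.dumps(item)}\n\n"


def start(agents: list[tuple[str, Callable[[], dict]]]) -> Iterator[str]:
    """Launch the pipeline in the background and return the SSE generator."""
    events: queue.Queue = queue.Queue()
    threading.Thread(target=run_pipeline, args=(agents, events), daemon=True).start()
    return sse_stream(events)
```

In a Flask route, the generator would typically be returned as `Response(start(agents), mimetype="text/event-stream")`. A blocking queue removes the need to poll and makes each event available to the client as soon as the agent finishes.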
AI layer: Gemini 2.0 Flash runs inside all three agents. It isn't used as a text generator — each agent instructs Gemini to return validated JSON only, with strict schema constraints. The schedule agent produces structured cognitive cost objects; the biometric agent produces a readiness assessment with signal-level breakdowns; the synthesis agent produces a plan with typed fields for recommendation, workout structure, decision rationale, week adjustments, and athlete message.
Data integrations: Google Calendar via the official Python client with OAuth 2.0 token persistence. Garmin Connect via the garminconnect library, which provides access to HRV summaries, sleep data, body battery timelines, stress scores, and activity history. All biometric data is fetched fresh on each run — nothing is cached between sessions.
Frontend: Vanilla HTML/CSS/JavaScript with Chart.js for the analytics layer. No framework dependencies, which kept the build fast and deployment simple. Charts render progressively as agents complete — the HRV chart populates while the synthesis agent is still running.
Deployment: Google Cloud Run, which satisfied the "built on Google Cloud" requirement and gave us a clean stateless deployment model compatible with the SSE streaming architecture.
Challenges we ran into
Streaming architecture with blocking agents. SSE requires a persistent connection that emits events as they happen, but each agent call to Gemini is a blocking HTTP request. Getting the threading model right — emitting events from background threads into a store that the SSE generator polls without race conditions — took significant iteration. The naive approach caused events to batch rather than stream, which defeated the purpose of real-time transparency.
Garmin API reliability. Garmin's Connect API is undocumented and changes without notice. Several endpoints that worked during initial testing stopped returning data mid-build. We had to wrap each metric in its own try/except with graceful null handling throughout the biometric agent so that a single missing field (e.g., body battery unavailable on some devices) didn't cascade into a pipeline failure. The fallback logic, which runs synthesis on partial or missing biometric data with conservative defaults, became a feature rather than a workaround.
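A minimal sketch of that per-metric isolation follows. The fetchers are passed in as plain callables, so no specific garminconnect method names are assumed; `fetch_metrics` and the metric names in the usage example are hypothetical.

```python
from typing import Any, Callable, Optional


def fetch_metrics(
    fetchers: dict[str, Callable[[], Any]],
) -> tuple[dict[str, Optional[Any]], list[str]]:
    """Call each fetcher independently; a failure yields None for that metric only."""
    data: dict[str, Optional[Any]] = {}
    missing: list[str] = []
    for name, fetch in fetchers.items():
        try:
            value = fetch()
        except Exception:
            value = None  # one broken endpoint must not fail the whole pipeline
        if value is None:
            missing.append(name)
        data[name] = value
    return data, missing
```

For example, `fetch_metrics({"hrv": get_hrv, "body_battery": get_body_battery})` would return whatever succeeded plus a list of missing metrics, which the downstream agent can use to decide when to fall back to conservative defaults.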
Getting Gemini to return consistent JSON. Early versions of the synthesis agent produced markdown-wrapped JSON, truncated outputs, and occasional schema violations when the reasoning was complex. We solved this by making the schema explicit in the prompt, adding post-processing to strip code fences, and capping max_output_tokens tightly enough to force conciseness without truncation. Setting the temperature to 0.45 for synthesis (lower than the default) also significantly improved structural consistency.
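The post-processing step might look roughly like the sketch below: strip an optional markdown fence, parse the JSON, and check for required top-level fields. The function name `parse_model_json` and the field names in the usage note are assumptions for illustration, not the project's actual schema.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the marker never appears literally
_WRAPPED = re.compile(rf"^{FENCE}[\w-]*\s*\n(.*?)\n?{FENCE}\s*$", re.DOTALL)


def parse_model_json(raw: str, required: set[str]) -> dict:
    """Strip an optional code fence, parse JSON, and verify required keys."""
    text = raw.strip()
    match = _WRAPPED.match(text)
    if match:
        text = match.group(1)  # keep only the fenced payload
    data = json.loads(text)    # raises json.JSONDecodeError on truncated output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = required - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

Calling `parse_model_json(response_text, {"recommendation", "workout"})` either returns a usable dict or raises, which gives the caller one place to decide whether to retry or fall back.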
Scoping the demo without real data. We wanted the demo to be indistinguishable from the live product — same charts, same pipeline timing, same reasoning quality — without requiring judges to have a Garmin and a Google Calendar. Building three fully realized demo scenarios with complete 14-day biometric histories, realistic chart data, and well-reasoned plans took more time than expected. But it was the right call: the contrast between the exam day scenario (HRV suppressed 29%, rest recommendation) and the green light scenario (HRV 16% above baseline, threshold intervals) is the clearest possible demonstration of what the product does.
Accomplishments that we're proud of
The reasoning chain is genuinely useful. When Adapt tells you to shorten today's ride by 40 minutes, it explains: your readiness is 61/100 because your HRV is slightly suppressed after a moderate-high training week, your cognitive load is 6.8/10 because you have a client presentation this morning, and the post-presentation cortisol crash will make perceived exertion unreliable — so strict Z2 is more valuable than a harder session that doesn't get absorbed. That specificity is what separates an adaptive recommendation from a generic one, and getting Gemini to produce it consistently at that level of precision required careful prompt engineering.
The live analytics dashboard renders data that users can actually interpret and verify. Every recommendation is backed by visible numbers: the HRV trend chart shows you why the readiness score is what it is; the ACWR panel shows where you sit relative to the optimal training zone. The system earns trust rather than asking for it.
The architecture degrades gracefully and works end-to-end even in the demo — the full SSE streaming pipeline, chart rendering, interactive checklist, and domain-specific recommendations all function without any external API keys.
What we learned
The most important thing we learned had nothing to do with the code. The real design challenge wasn't building the agents — it was deciding what to do with the outputs. A readiness score of 61 and a cognitive load score of 6.8 could reasonably produce any of several decisions. Getting the synthesis layer to apply those inputs with consistent, principled logic — and to communicate the why in a way that builds trust rather than just issuing commands — required us to think carefully about how behavioral science and exercise physiology actually interact.
We also learned that the Garmin API is a good stress test for graceful degradation. Building a system that's robust to partial data, API failures, and inconsistent schema responses — while still producing a useful output — forced us to think about failure modes as carefully as the happy path.
On the technical side: SSE streaming in Flask with threaded background agents works cleanly once you get the threading model right, but the path to getting it right is not obvious from the documentation.
What's next for Adapt
Longitudinal personalization. Current readiness thresholds are population baselines. With enough user data, the system could learn individual baselines — your HRV range, your typical cognitive load patterns, your recovery signatures after specific stressor types — and calibrate thresholds to the person rather than the average.
Weekly cycle adaptation. Today Adapt adapts a single day. The next version propagates that change forward: if today's session is shortened due to cognitive load, Thursday's quality session gets protected; if this week's load is tracking low due to a stressful schedule, the weekend long run gets a small volume bump. The planning horizon extends from 24 hours to 7 days.
Reinforcement learning on execution data. Did the user complete the adapted session? Did the following day's HRV improve or degrade? A feedback loop trained on these outcomes could optimize for long-term adherence rather than just daily plan quality — which is ultimately the harder and more important problem.
Mobile just-in-time delivery. The plan should arrive when it's decision time, not after the user opens the app. A push notification at 5:30pm that says "your readiness is 84, your calendar cleared at 4 — here's today's threshold session" closes the intention-execution gap at the moment it matters.
Broader wearable coverage. Apple Health and Whoop integration would extend Adapt to users outside the Garmin ecosystem, who make up the majority of the amateur athlete market.
Built With
- python
- flask
- javascript (vanilla)
- html
- css
- chart.js
- gemini 2.0 flash
- google-genai
- google calendar api
- google-api-python-client
- google-auth-oauthlib
- oauth 2.0
- garmin connect
- garminconnect
- python-dateutil
- google cloud run
- server-sent events (sse)
- multi-agent pipeline