Inspiration
Over 100 million people wear sleep trackers, yet most of that data sits underutilized in app dashboards. We asked: what if disrupted sleep isn't just a sign of being tired, but an early biomarker for serious health conditions? Sleep is one of the body's most sensitive indicators — changes in sleep architecture can precede clinical disease by months or years. We built SOMNI AI to transform wearable sleep data into evidence-based clinical intelligence, empowering people to detect health risks early and have informed conversations with their doctors.
What it does
SOMNI AI is a multi-agent clinical evidence synthesis platform that analyzes 30 days of sleep data from Apple Health, Fitbit, or Oura and generates two comprehensive reports:
Statistical Analysis:
- Computes z-scores using standard error of the mean (SEM) to detect deviations from personal baseline
- Calculates Sleep Health Deviation Index (SHDI) — a weighted composite of fragmentation, deep sleep, REM, efficiency, and variability
- Identifies sleep phenotypes (fragmentation-dominant, deep sleep reduction, REM instability, efficiency instability)
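The baseline math above can be sketched in a few lines. This is a minimal illustration of SEM-based z-scores and a weighted SHDI composite; the domain weights and metric names here are assumptions for the example, not the values shipped in the engine.

```typescript
// Mean of a series of nightly values (e.g. awakenings per night).
function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Standard error of the mean: sample SD divided by sqrt(n).
function sem(xs: number[]): number {
  const m = mean(xs);
  const variance = xs.reduce((a, x) => a + (x - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance) / Math.sqrt(xs.length);
}

// z-score of tonight's value against the 30-day personal baseline.
function zScore(baseline: number[], tonight: number): number {
  return (tonight - mean(baseline)) / sem(baseline);
}

// SHDI: weighted composite of per-domain deviation magnitudes.
// Weights are illustrative placeholders.
const WEIGHTS = {
  fragmentation: 0.3,
  deepSleep: 0.25,
  rem: 0.2,
  efficiency: 0.15,
  variability: 0.1,
} as const;

function shdi(z: Record<keyof typeof WEIGHTS, number>): number {
  return (Object.keys(WEIGHTS) as (keyof typeof WEIGHTS)[]).reduce(
    (sum, k) => sum + WEIGHTS[k] * Math.abs(z[k]),
    0
  );
}
```

Using |z| per domain keeps the index sensitive to deviations in either direction — too little deep sleep and abnormally long deep sleep both raise the score.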
Patient Report: Written at 8th-grade reading level, empathetic and non-alarming, designed to be shared with family members. Includes "What This Does NOT Mean" section and clear next steps.
Clinical Report: GRADE-style evidence for healthcare providers with structured tables, effect sizes, evidence strength ratings (strong/moderate/emerging), limitations, and screening recommendations.
How we built it
Multi-Agent Architecture: We built a team of sub-agents with Claude as the main orchestrator. Each sub-agent has its own context window and specializes in a specific task:
- Literature Search Agent: Queries PubMed using NCBI E-utilities
- Medical Reasoning Agent: GPT-4.5 with specialized prompts for ranking risk domains and generating structured reasoning traces
- Evidence Assessment Agent: Evaluates consistency across sources and provides refinement instructions
- Guidelines Agent: BrightData integration for CDC/AHA-style clinical guidelines and health disparity data
- Consensus Agent: Perplexity Sonar queries for scientific consensus and controversy
Agents communicate like a real research team — when conflicts are detected, the assessment agent instructs the search agent to refine queries, and the system re-searches before finalizing reports.
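As one concrete example of a sub-agent's tool surface, the Literature Search Agent's PubMed call can be sketched against the public NCBI E-utilities `esearch` endpoint. The function names and query term below are illustrative; only the endpoint and its `db`/`retmode`/`retmax`/`term` parameters come from the E-utilities API.

```typescript
// Build an esearch URL for the public NCBI E-utilities API.
function buildEsearchUrl(term: string, retmax = 5): string {
  const params = new URLSearchParams({
    db: "pubmed",
    retmode: "json",
    retmax: String(retmax),
    term,
  });
  return `https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?${params}`;
}

// Returns PMIDs; abstracts would then be pulled with efetch.
async function searchPubMed(term: string): Promise<string[]> {
  const res = await fetch(buildEsearchUrl(term));
  const json = await res.json();
  return json.esearchresult.idlist as string[];
}
```

Keeping URL construction pure makes the agent's queries easy to log and replay — useful when the assessment agent asks for a refined re-search.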
Technical Implementation:
- Frontend: Next.js 14 App Router, React, Tailwind CSS
- Backend: Serverless API routes on Vercel (single deployment, no separate backend)
- Statistical Engine: Pure TypeScript implementation of sleep analysis algorithms (30-day baseline, SEM-based z-scores, SHDI calculation, phenotype classification)
- APIs: Claude (orchestration), GPT-4.5 (reasoning), PubMed (literature), BrightData (guidelines), Perplexity Sonar (consensus)
- Safety: Prompt injection protection at every API boundary, context compaction for token efficiency
- Data Parsing: Support for Apple Health XML, Fitbit CSV, and Oura CSV formats
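To make the safety item above concrete, here is a minimal sketch of an API-boundary guard for untrusted wearable-export text before it reaches a model prompt. The patterns and the `<user_data>` fencing convention are illustrative examples, not the shipped rule set.

```typescript
// Illustrative injection patterns to neutralize in user-supplied text.
const SUSPICIOUS: RegExp[] = [
  /ignore (all )?previous instructions/gi,
  /you are now/gi,
  /system prompt/gi,
];

// Strip known injection phrasings, then fence the remainder so the
// model treats it as data rather than instructions.
function sanitizeUserText(text: string): string {
  let clean = text;
  for (const re of SUSPICIOUS) clean = clean.replace(re, "[removed]");
  return `<user_data>\n${clean}\n</user_data>`;
}
```

Pattern stripping alone is not a complete defense; the fencing plus explicit system-prompt instructions to treat fenced content as inert data is what does most of the work.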
Challenges we ran into
1. Context Window Management Across Sub-Agents: With 30 days of sleep data, multiple research papers, and multi-turn agent conversations, we hit token limits quickly. We solved this with context compaction strategies that preserve critical information while summarizing less relevant details, and by giving each sub-agent its own focused context window.
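The compaction idea can be sketched simply: keep the most recent turns verbatim and collapse older ones into a single summary slot. The `keepRecent` budget and truncation length are illustrative; in practice a model call would produce the summary rather than naive truncation.

```typescript
interface Turn {
  role: string;
  text: string;
}

// Collapse all but the last `keepRecent` turns into one summary turn.
function compact(history: Turn[], keepRecent = 4): Turn[] {
  if (history.length <= keepRecent) return history;
  const older = history.slice(0, history.length - keepRecent);
  const summary: Turn = {
    role: "system",
    text:
      `Summary of ${older.length} earlier turns: ` +
      older.map((t) => t.text.slice(0, 40)).join(" | "),
  };
  return [summary, ...history.slice(-keepRecent)];
}
```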
2. Evidence Synthesis Without Over-Claiming: Medical information requires extraordinary precision. We had to build a feedback loop where the evidence assessment agent evaluates source quality and consistency, flagging conflicts before the system writes reports. This ensures we never make claims stronger than the evidence supports.
3. Balancing Technical Sophistication with Readability: The patient report needed to be accessible to someone with no medical background, while the clinical report needed to meet evidence-based medicine standards. We iterated extensively on tone, reading level, and structure to serve both audiences without compromising either.
4. Autonomous Agent Reliability: Getting Claude to consistently choose the right sub-agent at the right time required careful prompt engineering and tool design. We implemented explicit pipeline tracking so we could debug and optimize the decision-making flow.
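The pipeline tracking mentioned above amounts to recording which sub-agent the orchestrator chose at each step, with its stated reason, so the decision flow can be replayed during debugging. This is a minimal sketch with hypothetical agent names; it is not the production tracker.

```typescript
interface PipelineStep {
  step: number;
  agent: string;
  reason: string;
  ts: number;
}

class PipelineTracker {
  private steps: PipelineStep[] = [];

  // Record one orchestrator decision.
  record(agent: string, reason: string): void {
    this.steps.push({
      step: this.steps.length + 1,
      agent,
      reason,
      ts: Date.now(),
    });
  }

  // Human-readable trace of the decision flow.
  trace(): string {
    return this.steps.map((s) => `${s.step}. ${s.agent}: ${s.reason}`).join("\n");
  }
}
```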
Accomplishments that we're proud of
Sub-Agent Team Architecture: We didn't just chain API calls — we built agents that genuinely collaborate, with separate context windows and inter-agent communication protocols.
Evidence-Based Medicine Standards: Our clinical reports follow GRADE methodology with explicit evidence strength ratings, effect sizes, and limitations — this is the standard used by top medical journals.
Human-Centered Design: The patient report opens with validation, not alarm. It's written to empower, not frighten. We tested it with non-technical readers to ensure accessibility.
Production-Ready on Day One: Single Vercel deployment, TypeScript-only analysis engine, full PDF export, parser support for three major wearable platforms. This isn't a demo — it's deployable.
Prompt Injection Protection: We hardened every API boundary to prevent malicious inputs from compromising the integrity of medical analysis.
What we learned
Agent orchestration is harder than it looks. Building a system where multiple AI agents communicate and refine each other's work required deep thinking about tool design, context management, and feedback loops. We learned that autonomous doesn't mean hands-off — it means building the right constraints.
Medical applications demand a different standard. We couldn't treat this like a typical hackathon project. Every claim needed a citation. Every report needed disclaimers. Every edge case needed careful handling. We learned to think like clinical researchers, not just engineers.
Sleep data is incredibly information-rich. Even "simple" metrics like awakenings or REM percentage can correlate with metabolic, cardiovascular, and neurological health. The challenge isn't finding signals — it's interpreting them responsibly.
What's next for SOMNI AI
1. Longitudinal Tracking: Extend analysis beyond 30 days to detect long-term trends and seasonal patterns. Add alerts for significant trajectory changes.
2. Provider Integration: Build FHIR-compliant export so clinical reports can be sent directly to electronic health records with patient consent.
3. Expanded Biomarkers: Integrate heart rate variability, SpO2, and activity data for multi-modal health intelligence.
4. Clinical Validation Studies: Partner with sleep labs and primary care clinics to validate our deviation indices against clinical outcomes.
5. Disparity-Aware Recommendations: Use BrightData more extensively to ensure recommendations account for population-specific risk factors and access barriers.
6. Explainability Dashboard: Add visualizations showing exactly which sleep patterns triggered which research associations, making the "black box" transparent.
SOMNI AI: Detecting tomorrow's health crisis in tonight's sleep.
Built With
- brightdata
- claude
- nextjs
- openai
- openevidence
- perplexity
- typescript
- vercel
