Inspiration
Parkinson's disease affects over 10 million people worldwide, yet motor state monitoring still largely happens in a
15-minute clinical check. The neurologist asks "how bad was the tremor this week?" and the patient
shrugs; there's no data. Consumer wearables are one plausible solution, but they're expensive and inaccessible to many patients. Everyone, however, already has a smartphone with a high-frequency accelerometer in their pocket. So the
question became: what if the sensor you already carry could track your motor state, explain its reasoning out loud,
and adapt to you over time — entirely on-device?
What It Does
AuraPD Voice is a fully on-device iOS app that:
- Listens for the wake phrase "Check my condition" (or a tap)
- Captures 10 seconds of accelerometer data via CoreMotion at 50 Hz
- Classifies motor state as ON / OFF / Tremor using an adaptive agent
- Explains the result in plain English and speaks it aloud via TTS
- Learns from thumbs-up / thumbs-down feedback in real time
- Matches the user to a de-identified patient library using GNN-style similarity
Everything runs on-device. No raw sensor data ever leaves the phone.
How I Built It
Signal Processing
Three features are extracted from each 10-second accelerometer window:
$$\mu = \frac{1}{N}\sum x_i, \quad \sigma = \sqrt{\frac{1}{N}\sum (x_i - \mu)^2}, \quad E = \sum x_i^2$$
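The app itself is Swift, but the arithmetic is language-agnostic. A minimal Python sketch of this feature extraction (the function name and toy window are mine, not the app's):

```python
import math

def extract_features(window):
    """Mean, standard deviation, and energy of one accelerometer window."""
    n = len(window)
    mu = sum(window) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in window) / n)
    energy = sum(x * x for x in window)
    return mu, sigma, energy

# 10 s at 50 Hz would give n = 500 samples; a short toy window suffices here.
mu, sigma, energy = extract_features([0.01, -0.02, 0.03, 0.0])
```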
The agent forms a weighted $\sigma_w$:
$$\sigma_w = w_0 \sigma + w_1 \frac{E}{E_{max}} \sigma \cdot 0.15 + w_2 |\mu| \cdot 0.05$$
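As a sketch of the weighting step, again in illustrative Python (the argument names, and passing $E_{max}$ in as `e_max`, are assumptions on my part):

```python
def weighted_sigma(mu, sigma, energy, w, e_max):
    """sigma_w = w0*sigma + w1*(E/E_max)*sigma*0.15 + w2*|mu|*0.05"""
    return (w[0] * sigma
            + w[1] * (energy / e_max) * sigma * 0.15
            + w[2] * abs(mu) * 0.05)
```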
Adaptive Threshold
Rather than a fixed cutoff, the agent blends the user's manual slider with a learned base:
$$\tau_{adaptive} = 0.65\tau_{user} + 0.35\tau_{base}$$
Classification:
$$\text{state}(\sigma_w) = \begin{cases} \text{OFF} & \text{if } \sigma_w > 2\tau \\ \text{Tremor} & \text{if } \tau < \sigma_w \leq 2\tau \\ \text{ON} & \text{if } \sigma_w \leq \tau \end{cases}$$
Confidence is the distance to the nearest decision boundary pushed through a sigmoid:
$$\text{conf} = \frac{1}{1 + e^{-\lambda \cdot \min(|\sigma_w - \tau|,|\sigma_w - 2\tau|)}}$$
where $\lambda$ is a hyperparameter adjusted manually.
Online Learning from Feedback
Each thumbs-up / thumbs-down triggers a three-step online update ($\delta = +1$ confirm, $-1$ reject):
Step 1 — threshold gradient descent:
$$\tau_{base} \leftarrow \tau_{base} + \alpha \delta (\sigma_w - \tau_{base})$$
Step 2 — feature weight update (perceptron rule, then re-normalise):
$$w \leftarrow w + \alpha \delta \frac{\phi}{|\phi|}$$
Step 3 — accuracy EMA ($\beta = 0.6$):
$$\text{acc} \leftarrow \text{acc}(1 - \alpha\beta) + y(\alpha\beta)$$
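The three-step update can be sketched as one function. One hedge: the writeup only says the weights are re-normalised, so the sum-to-one normalisation below is my assumption:

```python
import math

def feedback_update(tau_base, w, acc, phi, sigma_w, delta,
                    alpha=0.1, beta=0.6):
    """One thumbs-up/down update (delta = +1 confirm, -1 reject)."""
    # Step 1: nudge the learned base threshold toward (or away from) sigma_w.
    tau_base = tau_base + alpha * delta * (sigma_w - tau_base)
    # Step 2: perceptron-style step along the unit feature vector phi/|phi|.
    norm = math.sqrt(sum(p * p for p in phi)) or 1.0
    w = [wi + alpha * delta * (p / norm) for wi, p in zip(w, phi)]
    total = sum(w) or 1.0          # re-normalise (sum-to-one assumed)
    w = [wi / total for wi in w]
    # Step 3: exponential moving average of accuracy (y = 1 iff confirmed).
    y = 1.0 if delta > 0 else 0.0
    acc = acc * (1 - alpha * beta) + y * (alpha * beta)
    return tau_base, w, acc
```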
GNN Patient Matching
Each patient is a node in a feature-vector graph. Three similarity dimensions are computed:
$$S_{cos} = \frac{A \cdot B}{|A||B|}, \quad S_{rbf} = \exp\left(-\frac{|A-B|^2}{2\sigma^2}\right), \quad S_{eucl} = \exp\left(-\sqrt{\sum w_i (a_i - b_i)^2}\right)$$
These fuse into a Treatment Inspiration Value:
$$\text{TIV} = 0.30S_{cos} + 0.45S_{rbf} + 0.25S_{eucl}$$
The 0.45 weight on $S_{rbf}$, the term computed over treatment-response features, reflects the clinical insight that how a patient responds to levodopa is the most actionable reference signal.
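A sketch of the fused score under these formulas (the uniform default for the Euclidean weights $w_i$ is my placeholder):

```python
import math

def tiv(a, b, sigma=1.0, w=None):
    """Fuse cosine, RBF, and weighted-Euclidean similarity into one TIV score."""
    w = w or [1.0] * len(a)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    s_cos = dot / (na * nb)
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    s_rbf = math.exp(-sq / (2 * sigma ** 2))
    s_eucl = math.exp(-math.sqrt(sum(wi * (x - y) ** 2
                                     for wi, x, y in zip(w, a, b))))
    return 0.30 * s_cos + 0.45 * s_rbf + 0.25 * s_eucl
```

Identical feature vectors score 1.0; each term decays toward 0 as patients diverge.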
Stack
Swift + SwiftUI, CoreMotion, AVFoundation (TTS), Speech framework — all iOS-native. No backend, no model files,
just math running live on the A-series chip.
Challenges
Microphone ownership conflicts. The Speech recogniser for wake-word detection and AVFoundation for TTS both fight over the audio session. I had to build a careful state machine that pauses the wake-word listener before TTS fires, then restarts it after — with a guard against re-triggering mid-speech.
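The real coordinator sits on AVAudioSession, SFSpeechRecognizer, and AVSpeechSynthesizer in Swift; this Python sketch only illustrates the guard logic of that state machine, with all names mine:

```python
from enum import Enum, auto

class AudioState(Enum):
    LISTENING = auto()   # wake-word recogniser owns the audio session
    SPEAKING = auto()    # TTS owns the session; listener is paused
    IDLE = auto()

class AudioCoordinator:
    """Hands the single audio session between wake-word listening and TTS."""
    def __init__(self):
        self.state = AudioState.IDLE

    def start_listening(self):
        # Guard: never restart the listener mid-speech, or the recogniser
        # would hear the synthesised voice and re-trigger itself.
        if self.state == AudioState.SPEAKING:
            return False
        self.state = AudioState.LISTENING
        return True

    def speak(self):
        self.state = AudioState.SPEAKING   # pause the listener before TTS fires

    def finish_speaking(self):
        self.state = AudioState.IDLE
        self.start_listening()             # safe to resume wake-word listening
```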
Variance compression in live data. At rest, $\sigma$ sits around 0.01–0.03 g. A moderate tremor is 0.08–0.15 g. The
naive fixed threshold either misses mild tremors or fires constantly during walking. The blended $\tau_{adaptive}$
and per-session calibration fixed this, but tuning $\beta$ so accuracy converged within ~10 feedback taps — not
100 — took a lot of trial and error.
Building something clinically humble. Every string in the UI reminds the user this is a reference tool, not a
diagnosis. Writing hypothesis text that is informative but explicitly non-prescriptive — especially for
medication-timing inferences — was harder than the signal processing.
What I Learned
I came in knowing SwiftUI. I left understanding why on-device ML matters beyond the privacy pitch: latency. A cloud
round-trip for real-time tremor visualisation would be unusable. Doing everything locally means the avatar
responds to the accelerometer within one frame.