Inspiration

Parkinson's disease affects over 10 million people worldwide, yet motor state monitoring still largely happens during a patient's 15-minute clinic visit. The neurologist asks "how bad was the tremor this week?" and the patient shrugs — there's no data. Consumer wearables are one plausible solution, but they're expensive and inaccessible to many patients. Meanwhile, everyone already has a smartphone with a high-frequency accelerometer in their pocket. So the question became: what if the sensor you already carry could track your motor state, explain its reasoning out loud, and adapt to you over time — entirely on-device?


What It Does

AuraPD Voice is a fully on-device iOS app that:

  1. Listens for the wake phrase "Check my condition" (or a tap)
  2. Captures 10 seconds of accelerometer data via CoreMotion at 50 Hz (see the capture sketch below)
  3. Classifies motor state as ON / OFF / Tremor using an adaptive agent
  4. Explains the result in plain English and speaks it aloud via TTS
  5. Learns from thumbs-up / thumbs-down feedback in real time
  6. Matches the user to a de-identified patient library using GNN-style similarity

Everything runs on-device. No raw sensor data ever leaves the phone.
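
A minimal sketch of the capture in step 2, assuming samples are collected as a flat array of acceleration magnitudes; `MotionCapture`, the main-queue delivery, and the 500-sample cutoff are illustrative choices rather than the app's exact code:

```swift
import CoreMotion

final class MotionCapture {
    private let manager = CMMotionManager()
    private var samples: [Double] = []

    func start(completion: @escaping ([Double]) -> Void) {
        guard manager.isAccelerometerAvailable else { return }
        manager.accelerometerUpdateInterval = 1.0 / 50.0     // 50 Hz
        samples.removeAll()

        manager.startAccelerometerUpdates(to: .main) { [weak self] data, _ in
            guard let self = self, let a = data?.acceleration else { return }
            // Magnitude of the 3-axis acceleration vector, in g.
            self.samples.append((a.x * a.x + a.y * a.y + a.z * a.z).squareRoot())
            if self.samples.count >= 500 {                    // 10 s x 50 Hz
                self.manager.stopAccelerometerUpdates()
                completion(self.samples)
            }
        }
    }
}
```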


How I Built It

Signal Processing

Three features are extracted from each 10-second accelerometer window:

$$\mu = \frac{1}{N}\sum x_i, \quad \sigma = \sqrt{\frac{1}{N}\sum (x_i - \mu)^2}, \quad E = \sum x_i^2$$

The agent combines these into a weighted composite $\sigma_w$:

$$\sigma_w = w_0 \sigma + w_1 \frac{E}{E_{max}} \sigma \cdot 0.15 + w_2 |\mu| \cdot 0.05$$
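
A minimal sketch of the feature extraction and the composite above, assuming the window is the `[Double]` of magnitudes produced during capture; `WindowFeatures`, `weights`, and `maxEnergy` are illustrative names for the weight vector $w$ and the normaliser $E_{max}$:

```swift
struct WindowFeatures {
    let mean: Double      // μ
    let std: Double       // σ
    let energy: Double    // E

    init(window: [Double]) {
        let n = Double(window.count)
        let m = window.reduce(0, +) / n
        let variance = window.reduce(0) { $0 + ($1 - m) * ($1 - m) } / n
        mean = m
        std = variance.squareRoot()
        energy = window.reduce(0) { $0 + $1 * $1 }
    }

    /// σ_w = w0·σ + w1·(E / E_max)·σ·0.15 + w2·|μ|·0.05
    func weightedSigma(weights w: [Double], maxEnergy: Double) -> Double {
        w[0] * std +
            w[1] * (energy / maxEnergy) * std * 0.15 +
            w[2] * abs(mean) * 0.05
    }
}
```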

Adaptive Threshold

Rather than a fixed cutoff, the agent blends the user's manual slider with a learned base:

$$\tau_{adaptive} = 0.65\tau_{user} + 0.35\tau_{base}$$

Classification:

$$\text{state}(\sigma_w) = \begin{cases} \text{OFF}, & \sigma_w > 2\tau \\ \text{ON}, & \tau < \sigma_w \leq 2\tau \end{cases}$$

Confidence is the distance to the nearest decision boundary pushed through a sigmoid:

$$\text{conf} = \frac{1}{1 + e^{-\lambda \cdot \min(|\sigma_w - \tau|,|\sigma_w - 2\tau|)}}$$
where $\lambda$ is a hyperparameter adjusted manually.
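
Putting the threshold blend, the two-boundary rule, and the confidence together, a sketch (with an assumed default $\lambda$ and a placeholder for the $\sigma_w \leq \tau$ band, which the formula above leaves implicit):

```swift
import Foundation

enum MotorState { case on, off, tremor }

func classify(sigmaW: Double,
              userThreshold: Double,
              baseThreshold: Double,
              lambda: Double = 25.0) -> (state: MotorState, confidence: Double) {
    // τ_adaptive = 0.65·τ_user + 0.35·τ_base
    let tau = 0.65 * userThreshold + 0.35 * baseThreshold

    let state: MotorState
    if sigmaW > 2 * tau {
        state = .off                 // σ_w > 2τ
    } else if sigmaW > tau {
        state = .on                  // τ < σ_w ≤ 2τ
    } else {
        state = .tremor              // σ_w ≤ τ: not spelled out above, placeholder only
    }

    // Confidence: distance to the nearest boundary pushed through a sigmoid.
    let distance = min(abs(sigmaW - tau), abs(sigmaW - 2 * tau))
    let confidence = 1.0 / (1.0 + exp(-lambda * distance))
    return (state, confidence)
}
```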

Online Learning from Feedback

Each thumbs-up / thumbs-down triggers a three-step online update ($\delta = +1$ confirm, $-1$ reject):

Step 1 — threshold gradient descent:

$$\tau_{base} \leftarrow \tau_{base} + \alpha \delta (\sigma_w - \tau_{base})$$

Step 2 — feature weight update (perceptron rule, then re-normalise):

$$w \leftarrow w + \alpha \delta \frac{\phi}{|\phi|}$$

Step 3 — accuracy EMA ($\beta = 0.6$):

$$\text{acc} \leftarrow \text{acc}(1 - \alpha\beta) + y(\alpha\beta)$$
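
A sketch of the full feedback update, assuming $\phi$ is the three-element feature vector and $y = 1$ for a confirm, $0$ for a reject; the default $\alpha$ and the sum-to-one re-normalisation are illustrative choices:

```swift
struct AdaptiveAgent {
    var baseThreshold: Double
    var weights: [Double]          // w, one entry per feature
    var accuracy: Double = 0.5

    mutating func applyFeedback(confirmed: Bool,
                                sigmaW: Double,
                                features phi: [Double],
                                alpha: Double = 0.1,
                                beta: Double = 0.6) {
        let delta = confirmed ? 1.0 : -1.0

        // Step 1: threshold gradient step toward (or away from) σ_w.
        baseThreshold += alpha * delta * (sigmaW - baseThreshold)

        // Step 2: perceptron-style update on the normalised feature vector φ.
        let norm = phi.map { $0 * $0 }.reduce(0, +).squareRoot()
        if norm > 0 {
            weights = zip(weights, phi).map { $0 + alpha * delta * $1 / norm }
            // Re-normalise (shown here as sum-to-one; the exact scheme is assumed).
            let sum = weights.reduce(0, +)
            if sum > 0 { weights = weights.map { $0 / sum } }
        }

        // Step 3: exponential moving average of accuracy.
        let y = confirmed ? 1.0 : 0.0
        accuracy = accuracy * (1 - alpha * beta) + y * (alpha * beta)
    }
}
```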

GNN Patient Matching

Each patient is a node in a feature-vector graph. Three similarity dimensions are computed:

$$S_{cos} = \frac{A \cdot B}{|A||B|}, \quad S_{rbf} = \exp\left(-\frac{|A-B|^2}{2\sigma^2}\right), \quad S_{eucl} = \exp\left(-\sqrt{\sum w_i (a_i - b_i)^2}\right)$$

These fuse into a Treatment Inspiration Value:

$$\text{TIV} = 0.30S_{cos} + 0.45S_{rbf} + 0.25S_{eucl}$$

The 0.45 weight on treatment response reflects the clinical insight that how a patient responds to levodopa is the most actionable reference signal.
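
A sketch of the three kernels and the fusion, treating each patient node as a plain feature vector; the RBF bandwidth `sigma` and the optional per-dimension `euclWeights` are assumed parameters:

```swift
import Foundation

func treatmentInspirationValue(_ a: [Double], _ b: [Double],
                               sigma: Double = 1.0,
                               euclWeights: [Double]? = nil) -> Double {
    // Cosine similarity.
    let dot = zip(a, b).map(*).reduce(0, +)
    let normA = a.map { $0 * $0 }.reduce(0, +).squareRoot()
    let normB = b.map { $0 * $0 }.reduce(0, +).squareRoot()
    let sCos = dot / (normA * normB)

    // RBF kernel on squared distance.
    let sqDist = zip(a, b).map { ($0 - $1) * ($0 - $1) }.reduce(0, +)
    let sRBF = exp(-sqDist / (2 * sigma * sigma))

    // Weighted Euclidean similarity.
    let w = euclWeights ?? Array(repeating: 1.0, count: a.count)
    let weightedDist = zip(zip(a, b), w)
        .map { $1 * ($0.0 - $0.1) * ($0.0 - $0.1) }
        .reduce(0, +)
        .squareRoot()
    let sEucl = exp(-weightedDist)

    // TIV = 0.30·S_cos + 0.45·S_rbf + 0.25·S_eucl
    return 0.30 * sCos + 0.45 * sRBF + 0.25 * sEucl
}
```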

Stack

Swift + SwiftUI, CoreMotion, AVFoundation (TTS), Speech framework — all iOS-native. No backend, no model files,
just math running live on the A-series chip.


Challenges

Microphone ownership conflicts. The Speech recogniser for wake-word detection and AVFoundation for TTS both fight over the audio session. I had to build a careful state machine that pauses the wake-word listener before TTS fires, then restarts it after — with a guard against re-triggering mid-speech.
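
A minimal sketch of that hand-off, using `AVSpeechSynthesizer`'s delegate callback to hand the microphone back only after speech finishes; the class name and the simplified listener restart are illustrative:

```swift
import AVFoundation
import Speech

final class VoiceCoordinator: NSObject, AVSpeechSynthesizerDelegate {
    private let synthesizer = AVSpeechSynthesizer()
    private let audioEngine = AVAudioEngine()
    private var isSpeaking = false

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    func speak(_ text: String) {
        guard !isSpeaking else { return }      // guard against re-triggering mid-speech
        isSpeaking = true
        stopWakeWordListener()                 // release the mic before TTS fires
        synthesizer.speak(AVSpeechUtterance(string: text))
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didFinish utterance: AVSpeechUtterance) {
        isSpeaking = false
        startWakeWordListener()                // hand the mic back to the Speech recogniser
    }

    private func stopWakeWordListener() {
        audioEngine.inputNode.removeTap(onBus: 0)
        audioEngine.stop()
    }

    private func startWakeWordListener() {
        // Re-install the tap feeding the SFSpeechAudioBufferRecognitionRequest
        // and restart the engine (omitted here for brevity).
    }
}
```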

Variance compression in live data. At rest, $\sigma$ sits around 0.01–0.03 g. A moderate tremor is 0.08–0.15 g. The naive fixed threshold either misses mild tremors or fires constantly during walking. The blended $\tau_{adaptive}$ and per-session calibration fixed this, but tuning $\beta$ so accuracy converged within ~10 feedback taps — not
100 — took a lot of trial and error.

Building something clinically humble. Every string in the UI reminds the user this is a reference tool, not a
diagnosis. Writing hypothesis text that is informative but explicitly non-prescriptive — especially for medication-timing inferences — was harder than the signal processing.


What I Learned

I came in knowing SwiftUI. I left understanding why on-device ML matters beyond the privacy pitch: latency. A cloud round-trip for real-time tremor visualisation would be unusable. Doing everything locally means the avatar
responds to the accelerometer within one frame.
