Inspiration

Parkinson’s disease reaches into every corner of a person’s life. It often begins quietly, with a slight tremor in the hand or a soft change in speech that others barely notice. Over time, those small signs can grow into stiffness, imbalance, and fatigue that make even the simplest tasks feel exhausting. It affects not only the body but also the mind and spirit.

When Parkinson’s is recognized early, there is real hope. Early diagnosis lets doctors start treatments that ease symptoms sooner and help people keep their function for longer. Medication, exercise, and therapy can help people stay active, independent, and connected to what they love.

Catching Parkinson’s early can protect precious time: time to move with freedom, to speak with clarity, and to continue living with purpose. Awareness and early action give people the chance to hold on to more of their life, their identity, and their strength.

What it does

ParkinSense records a sustained “aaahh,” extracts two 768-dimensional self-supervised speech embeddings (Wav2Vec2-Base and WavLM-Base-Plus), concatenates them into a 1,536-feature fingerprint, and feeds that into a calibrated XGBoost model. The app returns a Parkinson’s probability, a confidence band, and a clearly annotated mel-spectrogram so the result feels real, not abstract.
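The fusion step can be sketched in a few lines. This is a minimal illustration, not the project's code: the function names are hypothetical, and mean pooling over time is an assumption, since the write-up does not say how frame-level outputs are reduced to one 768-value vector per model.

```python
from __future__ import annotations

from typing import Sequence

EMBED_DIM = 768  # size of each pooled embedding, as stated in the description


def mean_pool(frames: Sequence[Sequence[float]]) -> list[float]:
    """Average frame-level vectors over time (assumed pooling strategy)."""
    if not frames:
        raise ValueError("no frames to pool")
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def fuse_embeddings(wav2vec2_vec: Sequence[float],
                    wavlm_vec: Sequence[float]) -> list[float]:
    """Concatenate two 768-value embeddings into one 1,536-value feature vector."""
    for name, vec in (("wav2vec2", wav2vec2_vec), ("wavlm", wavlm_vec)):
        if len(vec) != EMBED_DIM:
            raise ValueError(
                f"{name} embedding has {len(vec)} values, expected {EMBED_DIM}"
            )
    return list(wav2vec2_vec) + list(wavlm_vec)
```

Checking the length before concatenating makes a mismatched model or a skipped pooling step fail loudly instead of silently shifting feature columns.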

How we built it

  • Verified every Figshare Parkinson/healthy vowel asset (audio + demographics) by checksum.
  • Applied a clinical hygiene chain: silence trim, spectral denoise, loudness normalization, and deterministic clip/pad. Then generated deterministic augmentations (noise, gain, pitch, shift, soft clipping) to mimic real microphones.
  • Extracted SSL embeddings, fused them with MFCC/spectral and Praat perturbation features, scaled and PCA-reduced the vectors, and trained XGBoost with Platt scaling under stratified group K‑fold CV plus a 10% grouped holdout to eliminate leakage from duplicate utterances.
  • Logged fold metrics, ROC/PR, calibration curves, demographics, feature importance, and embedding projections into FastAPI-served artifacts for the dashboard.
  • Built a Next.js + ShadCN frontend that records 16 kHz WAV in-browser, shows waveform quality tips, calls the FastAPI endpoints, and renders the insights with exportable PDF reports so both clinicians and patients can trust the context.
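The grouped holdout from the list above can be illustrated with a small standard-library sketch. It is not the project's pipeline: the `<source>__<augmentation>` clip naming is a hypothetical convention, the function names are invented for the example, and class stratification is left out to keep it short.

```python
import hashlib
from collections import defaultdict


def group_of(clip_id: str) -> str:
    """Map a clip back to its source recording.

    Assumes a hypothetical naming scheme "<source>__<augmentation>".
    """
    return clip_id.split("__", 1)[0]


def grouped_holdout(clip_ids, holdout_fraction=0.10):
    """Split clips so every variant of a source recording lands on one side."""
    groups = defaultdict(list)
    for clip_id in clip_ids:
        groups[group_of(clip_id)].append(clip_id)

    # Order groups by a stable hash so the split is the same on every run.
    ordered = sorted(
        groups, key=lambda g: hashlib.sha256(g.encode("utf-8")).hexdigest()
    )
    n_holdout = max(1, round(len(ordered) * holdout_fraction))
    holdout_groups = set(ordered[:n_holdout])

    train, holdout = [], []
    for group in ordered:
        target = holdout if group in holdout_groups else train
        target.extend(groups[group])
    return train, holdout
```

Because the split is made over groups rather than individual clips, an augmented copy can never sit in the holdout while its original sits in training.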

How we built it

  • Pulled the Parkinson vs healthy vowel dataset plus demographics from Figshare and checked every checksum.
  • Scrubbed the audio like a clinic visit: silence trim, spectral noise reduction, loudness normalization, duration control, then hammered it with noise, gain, pitch, shift, and gentle clipping augmentations.
  • Extracted both SSL embeddings, concatenated them, standardized the vectors, trained XGBoost with Platt scaling, and enforced stratified group folds plus an 80/20 grouped holdout so no duplicated utterance leaked into validation.
  • Logged cross-validation, holdout metrics, ROC, PR, calibration, demographics, and feature importances into artifacts served by FastAPI.
  • Built a Next.js + ShadCN interface that records true WAV audio in the browser, sends it to the API, and renders everything with exportable reports that still feel human.
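For readers unfamiliar with Platt scaling, the idea is to fit a one-dimensional logistic curve that maps raw classifier scores to probabilities. The sketch below is a hand-rolled illustration using plain gradient descent; it is not the project's code, and the learning rate, step count, and function names are assumptions chosen for the example.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for score, label in zip(scores, labels):
            # Derivative of the log loss with respect to the logit.
            err = sigmoid(a * score + b) - label
            grad_a += err * score
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score: float, a: float, b: float) -> float:
    """Turn a raw score into a calibrated probability."""
    return sigmoid(a * score + b)
```

In practice the calibration parameters should be fitted on data the classifier did not train on, which is why it pairs naturally with the grouped folds described above.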

Challenges we ran into

  • The dataset was messy and unforgiving, so getting silence trimming and noise reduction right without killing tremor cues took a lot of trial and error.
  • Ensuring grouped splits meant rewriting the pipeline so augmented versions of the same clip never leaked; fake accuracy numbers were unacceptable.
  • Browsers love giving you random codecs, so forcing consistent 16 kHz WAV capture required custom audio plumbing on both client and server.
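On the server side, one cheap guard is to reject uploads that are not the expected WAV format before any processing starts. The following is a sketch using Python's standard `wave` module, not the project's actual plumbing: the `check_wav` name is hypothetical, and the mono and 16-bit requirements are assumptions (only the 16 kHz rate comes from the write-up).

```python
import io
import wave

EXPECTED_RATE = 16_000  # Hz, the capture rate named in the write-up


def check_wav(data: bytes, expected_rate: int = EXPECTED_RATE) -> list:
    """Return a list of problems found in an uploaded WAV payload (empty if none)."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            problems = []
            if wav.getframerate() != expected_rate:
                problems.append(
                    f"sample rate is {wav.getframerate()} Hz, expected {expected_rate}"
                )
            if wav.getnchannels() != 1:  # assumption: mono capture
                problems.append(f"{wav.getnchannels()} channels, expected mono")
            if wav.getsampwidth() != 2:  # assumption: 16-bit PCM
                problems.append(f"{wav.getsampwidth() * 8}-bit samples, expected 16-bit")
            return problems
    except (wave.Error, EOFError):
        # Raised when the bytes are not a PCM WAV container at all,
        # for example when the browser sent a compressed codec instead.
        return ["payload is not a readable PCM WAV file"]
```

Returning a list of problems rather than raising lets the API send every issue back to the client in one response.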

Accomplishments that we're proud of

  • Metrics: holdout accuracy of about 72% and AUROC of about 0.84 after fixing leakage, so the reported numbers are honest rather than inflated.
  • A dashboard that protects the original design but adds everything patients and researchers need, including detailed diagnostics and PDF exports.
  • One command now rehydrates data, retrains, refreshes artifacts, and updates the live API so iteration is painless.

What we learned

  • XGBoost is useless if you do not respect group boundaries; once I did, the metrics finally looked realistic.

What's next for ParkinSense

I plan to share it on Reddit and in Parkinson’s support communities so people can try it. I also want to add quality checks so people know instantly if their recording is too noisy or too short, and to offer opt-in data donations so the model keeps improving without exploiting anyone.
