Kynedge — real-time IMU motion-event recognition, TinyML-style
Inspiration
Micromobility (e-scooters, e-bikes) has exploded, but understanding what is actually happening to a vehicle in real time — a hard brake, a crash, a fall — usually takes dedicated hardware or heavy models. We wanted to show that 6 axes of IMU at 50 Hz (accelerometer + gyroscope) plus a handful of classic features are enough to classify motion events in real time, with a model small enough to run "at the edge." And we wanted to prove generalization: the exact same code should work on a completely different domain — a "horizontality test" with three arm gestures — by changing only the data, not a single line of logic.
What it does
An end-to-end pipeline:
$$ \text{raw IMU} \;\to\; \text{sliding window} \;\to\; \text{feature extraction} \;\to\; \text{RandomForest} \;\to\; \text{real-time inference} \;\to\; \text{dashboard} $$
- Domain 1 (riding): `normal_riding` / `hard_braking` / `crash`
- Domain 2 (arm): three gestures, with a 3D reconstruction of arm orientation
- Live dashboard over WebSocket: acc/gyro signals, predicted class, confidence, per-class probability bars, and a red alert on crash.
How we built it
We sample at $f_s = 50\,\text{Hz}$ and segment the stream into windows of 1 s at 50% overlap:
$$ W = f_s \cdot 1\,\text{s} = 50 \quad\text{samples}, \qquad H = \tfrac{W}{2} = 25 \quad\text{(hop)} $$
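As a minimal sketch of the windowing step, assuming a recording is simply a list of 6-axis samples: the function name `sliding_windows` and the choice to drop an incomplete trailing window are illustrative assumptions, not details taken from the project.

```python
def sliding_windows(samples, window=50, hop=25):
    """Split a sequence of samples into fixed-size, overlapping windows.

    Trailing samples that do not fill a whole window are dropped
    (an assumption; the write-up does not say how the tail is handled).
    """
    return [samples[start:start + window]
            for start in range(0, len(samples) - window + 1, hop)]


# 3 s of placeholder 6-axis samples at 50 Hz: (ax, ay, az, gx, gy, gz)
recording = [(0.0, 0.0, 9.81, 0.0, 0.0, 0.0)] * 150
windows = sliding_windows(recording)
print(len(windows), len(windows[0]))
```

With $W = 50$ and $H = 25$, every sample except those at the very start and end of a recording lands in two consecutive windows, which is what the 50% overlap means in practice.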
From each window $\mathbf{X}\in\mathbb{R}^{50\times 6}$ we extract a frozen 43-feature vector: per-axis statistics (mean, std, min, max, RMS, energy), accelerometer and gyroscope magnitudes, plus two dynamics features — the jerk
$$ \text{jerk}_{\max} = \max_t \left| \Delta \lVert \mathbf{a}_t \rVert \right| $$
and the zero-crossing rate, which separates the periodic (riding) from the transient (crash). We classify with a **RandomForest*arning.
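The two dynamics features can be sketched in a few lines of plain Python. This is an illustration under assumptions: the function names are hypothetical, the zero-crossing rate is computed on a mean-removed signal (the write-up does not say whether the mean is removed or which signal is used), and a real implementation would likely be vectorized.

```python
import math


def jerk_max(acc):
    """Largest absolute sample-to-sample change of the acceleration magnitude."""
    mags = [math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in acc]
    return max((abs(b - a) for a, b in zip(mags, mags[1:])), default=0.0)


def zero_crossing_rate(signal):
    """Fraction of consecutive sample pairs whose mean-removed values change sign."""
    if len(signal) < 2:
        return 0.0
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]  # assumption: remove the mean first
    crossings = sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)
    return crossings / (len(signal) - 1)


# A 5 Hz oscillation sampled at 50 Hz versus a single spike in a quiet signal.
periodic = [math.sin(2 * math.pi * 5 * (t + 0.5) / 50) for t in range(50)]
transient = [0.0] * 24 + [8.0] + [0.0] * 25
print(zero_crossing_rate(periodic), zero_crossing_rate(transient))
```

The oscillating signal crosses its mean many times per window, while the spike crosses it only twice, which is the separation between periodic and transient motion described above.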
Three architectural choices protected us from the classic production-ML bugs:
- A single feature implementation, imported by both training and inference → no train/serve skew.
- Grouped split (`GroupShuffleSplit` on `recording_id`): windows from the same recording never land in both train and test; otherwise accuracy is inflated by leakage.
- A single bundle (`model + scaler + feature_names + labels + params`): inference reads everything from it, and we enforce $\texttt{labels} = \texttt{model.classes_}$ so that `predict_proba` columns can never drift out of alignment with the labels.
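Two of the invariants above can be illustrated without any ML library. The sketch below is a standard-library stand-in for the idea behind scikit-learn's `GroupShuffleSplit`, not the project's code, and it assumes the bundle is a dict with `model` and `labels` keys; `grouped_split` and `check_bundle` are hypothetical names.

```python
import random
from types import SimpleNamespace


def grouped_split(groups, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) so that no group appears on both sides."""
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    n_test = max(1, round(len(unique) * test_fraction))
    test_groups = set(unique[:n_test])
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx


def check_bundle(bundle):
    """Fail loudly if the stored labels do not match the model's class order."""
    if list(bundle["labels"]) != list(bundle["model"].classes_):
        raise ValueError("labels are not aligned with model.classes_")


# One recording id per window; whole recordings go to either train or test.
groups = ["rec_a"] * 4 + ["rec_b"] * 4 + ["rec_c"] * 4 + ["rec_d"] * 4
train_idx, test_idx = grouped_split(groups)
assert {groups[i] for i in train_idx}.isdisjoint({groups[i] for i in test_idx})

# Stand-in for a fitted classifier; only the classes_ attribute matters here.
model = SimpleNamespace(classes_=["crash", "hard_braking", "normal_riding"])
check_bundle({"model": model, "labels": ["crash", "hard_braking", "normal_riding"]})
```

Running the alignment check at load time rather than at prediction time means a mismatched bundle fails immediately instead of silently mislabelling probabilities.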
For the 3D arm we reconstruct orientation from the direction of gravity measured by the accelerometer, and therefore *drift-free*:
$$ \phi_{\text{roll}} = \operatorname{atan2}(a_y, a_z), \qquad \theta_{\text{pitch}} = \operatorname{atan2}\!\left(-a_x, \sqrt{a_y^2 + a_z^2}\right) $$
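The two formulas translate directly into code. This is a sketch that assumes the reading is dominated by gravity (the arm is moving slowly) and returns radians; `roll_pitch` is a hypothetical name.

```python
import math


def roll_pitch(ax, ay, az):
    """Roll and pitch (radians) from a gravity-dominated accelerometer reading."""
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return roll, pitch


# Gravity split evenly between the y and z axes gives a 45 degree roll.
roll, pitch = roll_pitch(0.0, 6.94, 6.94)
print(math.degrees(roll), math.degrees(pitch))
```

Because each angle is computed from a single reading rather than by integrating gyroscope rates, errors do not accumulate over time. Rotation about the gravity axis (yaw) cannot be recovered this way.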
Challenges
- Synthetic data that was too easy. Our generators are synthetic (a declared PoC). Domain 2 was hitting 100% accuracy, and anyone would rightly suspect the generator was encoding the labels. We found that adding sensor noise wasn't enough (window features aggregate it away, so the pattern stays separable). The right lever was simulating sloppy human execution: blending a fraction of another gesture into each recording. That *overlap* brought us down to a credible 0.94, with physically sensible confusions.
- The temptation of a "crash" 3D reconstruction. Recovering an impact trajectory from IMU means double integration, $\;p(t) = \iint \mathbf{a}\,dt^2\,$, and the error grows as $t^2$: massive drift, pure fiction. We took the honest path: the 3D shows only **orientation**, derived from gravity, and we state this explicitly on the page.
- Integration reliability. A skew-proof bundle contract, server-side file validation (no path traversal, no crash on invalid its` version so the server API doesn't break.
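The quadratic drift mentioned in the second challenge can be checked numerically. The sketch below double-integrates a constant accelerometer bias with simple Euler steps; the bias of 0.1 m/s² is a made-up illustrative value, not a measurement from the project, and `position_error` is a hypothetical name.

```python
def position_error(bias, duration_s, fs=50):
    """Double-integrate a constant accelerometer bias (m/s^2) with Euler steps."""
    dt = 1.0 / fs
    velocity = 0.0
    position = 0.0
    for _ in range(int(round(duration_s * fs))):
        velocity += bias * dt      # first integration: bias becomes velocity error
        position += velocity * dt  # second integration: velocity error becomes position error
    return position


for seconds in (1, 2, 4):
    print(seconds, round(position_error(0.1, seconds), 3))
```

Doubling the duration roughly quadruples the position error, matching the closed form $\tfrac{1}{2} b\, t^2$ for a constant bias $b$.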
What we learned
- In production ML the worst bugs are *silent*: leakage, train/serve skew, misaligned labels. You beat them with *contracts and invariants*, not bigger models.
- A synthetic 100% is a red flag, not a win. Credibility comes from calibrated difficulty, not perfect numbers.
- Classic features + RandomForest remain *readable*: the feature importances tell a story (gyroscope for gestures, acceleration magnitude for crashes) that a black-box model wouldn't.
Results
| Domain | Accuracy (grouped split) |
|---|---|
| Domain 1 (riding) | 0.99 |
| Domain 2 (arm) | 0.94 |
Same pipeline, same 43 features, same code — just different data.
Built With
- claude-code
- codex
- javascript
- python