finna crashout

we're finna crashout

Inspiration

You know that feeling when you’ve been stuck on a bug for a long time? You open a file, stare at it, close it, open it again. At that point it’s barely debugging; you’re just making noise in the codebase and hoping something changes. The annoying thing is that git can't help you here, because nobody commits mid-spiral. By the time you snap out of it, you've made a dozen changes you can't account for and there's no clean state to go back to. We wanted something that would protect your code specifically during the moments when you're least capable of protecting it yourself. What got us started was the realization that a crashout isn’t something magical. You can find traces of it in the way you type, how often you move around in files, and even in your face and behaviours. If you could learn those patterns automatically, you could catch the spiral before it does real damage.

What it does

FinnaCrashout sits in VS Code and watches your coding behavior in real time. When it detects a sustained spiral, it automatically snapshots your code non-destructively, so nothing you have gets moved or overwritten. Then you can restore to that snapshot in one click.

It also shows a live risk meter in the status bar and a full dashboard with per-signal breakdowns and a rolling behavior timeline. And there's an opt-in Ragebait Mode. Think of it as a way to prepare for a high-stress environment (like a job!)

How we built it

The stack is a TypeScript VS Code extension talking to two local Python services: a FastAPI scorer on :8765 and a Flask camera/audio service on :5000. Raw video, audio, and source code stay local.

The detection pipeline. We used River's HalfSpaceTrees, an online anomaly detector. "Normal" is different for every developer, there are no labels for "crashout," and we can't ask people to collect training data. So the model just watches how you work, learns your baseline, and scores deviation from you.

The problem is that a raw anomaly score is too noisy. A fast, clean burst of productive typing is statistically unusual, but it's not a crashout. So we built four stages on top of the raw score:

$$\mathrm{riskInput}_t = \mathrm{pressure}_t \cdot (0.75 + 0.25 \cdot \mathrm{score}_{\mathrm{raw},t})$$

$$s_t = \alpha \cdot \mathrm{riskInput}_t + (1-\alpha) \cdot s_{t-1}, \quad \alpha = 0.4$$

$$\mathrm{anomaly}_t = (t \geq 12) \land (\mathrm{streak}_t \geq 3) \land (s_t \geq 0.9)$$
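Here's a minimal Python sketch of how those formulas could fit together. It is illustrative, not the actual scorer code. The `RiskTracker` name is made up, and a few things are assumptions the formulas don't pin down: `pressure` and `score_raw` are taken to lie in [0, 1], the smoothed score starts at 0, and `streak` is read as the number of consecutive windows where the smoothed score stays at or above 0.9.

```python
from dataclasses import dataclass

ALPHA = 0.4          # EWMA smoothing factor (alpha in the formulas)
WARMUP_WINDOWS = 12  # t >= 12
MIN_STREAK = 3       # streak >= 3
THRESHOLD = 0.9      # s_t >= 0.9


@dataclass
class RiskTracker:
    """Hypothetical helper that applies the gating formulas one window at a time."""

    t: int = 0       # number of windows seen so far
    s: float = 0.0   # smoothed risk s_t (assumed to start at 0)
    streak: int = 0  # consecutive windows with s_t >= THRESHOLD (assumed definition)

    def update(self, pressure: float, score_raw: float) -> bool:
        """Feed one window; return True when the anomaly condition holds."""
        self.t += 1
        # The raw anomaly score can only modulate pressure, never replace it.
        risk_input = pressure * (0.75 + 0.25 * score_raw)
        # Exponentially weighted moving average of the risk input.
        self.s = ALPHA * risk_input + (1 - ALPHA) * self.s
        # Persistence: count how long the smoothed score has stayed high.
        self.streak = self.streak + 1 if self.s >= THRESHOLD else 0
        return (
            self.t >= WARMUP_WINDOWS
            and self.streak >= MIN_STREAK
            and self.s >= THRESHOLD
        )
```

Because the raw score is multiplied by pressure, a window that is statistically unusual but low-pressure (the productive burst case) produces a small risk input and never builds a streak.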

Editor features come from the extension itself: 14 features per 30-second window including typing rate, delete/insert ratio, backspace rate, pause count, file switches, and an editor_crashout_score that blends them.

Multimodal signals are optional and layered on top. We used MediaPipe Face Landmarker for eyebrow frown, jaw tension, forehead creases, and face redness. Each of these is calibrated against a personal 60-frame baseline so ambient lighting doesn't trigger it. For audio, we only capture a sounddevice RMS envelope: volume, no recording, no transcription. The mic signal is useful because a frustrated desk-slap or sigh is a good indicator of a crashout.
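To make the baseline idea concrete, here's a small sketch of per-user calibration for a single facial signal. Only the 60-frame baseline length comes from the description above. The `BaselineCalibrator` class, the z-score style normalization, and the `min_std` floor are assumptions for illustration, not necessarily how the camera service does it.

```python
from statistics import mean, pstdev
from typing import List, Optional

BASELINE_FRAMES = 60  # personal baseline length from the text


class BaselineCalibrator:
    """Hypothetical helper: scores a signal relative to the user's first 60 frames."""

    def __init__(self, min_std: float = 1e-3) -> None:
        self.samples: List[float] = []
        self.mean = 0.0
        self.std = min_std
        self.min_std = min_std  # floor so a perfectly flat baseline can't divide by ~0

    @property
    def ready(self) -> bool:
        return len(self.samples) >= BASELINE_FRAMES

    def update(self, value: float) -> Optional[float]:
        """Return None while collecting the baseline, then a deviation score."""
        if not self.ready:
            self.samples.append(value)
            if self.ready:
                # Freeze the baseline once enough frames have been collected.
                self.mean = mean(self.samples)
                self.std = max(pstdev(self.samples), self.min_std)
            return None
        # Deviation from this user's own normal, in baseline standard deviations.
        return (value - self.mean) / self.std
```

Since every score is relative to frames captured in the same room, a dim or reddish light shifts the baseline along with the live signal instead of reading as frustration.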

The checkpoint. This one's pretty cool. The obvious implementation is git stash, which moves your work around and can create conflicts you have to untangle later. Ours writes a commit into a throwaway index under a private ref, completely outside your branch history and stash stack.

Your working tree and staged changes are never touched. Nothing moves. You can keep coding and restore any time, and the restore itself takes a safety snapshot first before it does anything.

Ragebait Mode uses ElevenLabs Instant Voice Cloning to make a clone of your voice, then a roast agent fires probabilistically every 5 seconds with risk-weighted speak-chances (base 0.92 at confirmed anomaly, down to 0.40 when things are mild). An LLM writes one short line with hard rails: it can roast the mess, the panic, the bug and more. It only ever sees numeric summaries. No code, no filenames, no audio is passed to the model.
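As a sketch of the speak-chance logic: only the two endpoints (0.92 at a confirmed anomaly, 0.40 when things are mild) come from the description above. The linear interpolation between them, the function names, and the injectable `rng` parameter are assumptions made up for this example.

```python
import random

MILD_CHANCE = 0.40       # from the text: chance when things are mild
CONFIRMED_CHANCE = 0.92  # from the text: chance at a confirmed anomaly


def speak_chance(risk: float, confirmed: bool) -> float:
    """Probability of firing a roast on this tick.

    Assumption: between the two documented endpoints, the chance scales
    linearly with the smoothed risk, clamped to [0, 1].
    """
    if confirmed:
        return CONFIRMED_CHANCE
    risk = min(max(risk, 0.0), 1.0)
    return MILD_CHANCE + (CONFIRMED_CHANCE - MILD_CHANCE) * risk


def should_roast(risk: float, confirmed: bool, rng=random.random) -> bool:
    """Meant to be called once per tick (every 5 seconds in the text)."""
    return rng() < speak_chance(risk, confirmed)
```

Passing `rng` in makes the roll easy to test with a fixed value instead of real randomness.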

Challenges we ran into

The biggest challenges were cold starts and false positives. An online anomaly detector has to warm up before it can trust itself. During warmup, everything's suppressed. After warmup, we still had the problem that "anything unusual" was firing. That’s why we have the entire pressure, EWMA, persistence pipeline. Tuning the numbers there was a game of trial and error.

Accomplishments that we're proud of

The checkpoint is not destructive and we’d trust it in our own repos. The risk meter is stable enough that you actually believe it when it starts climbing, which took a lot more work than the detection itself. The whole pipeline runs locally, which means no API calls for the core loop and no data leaving your machine.

What we learned

Through building finna crashout, we learned how to bridge the gap between Python-based ML and a TypeScript VS Code extension. We got hands-on experience with MediaPipe, online anomaly detection with HalfSpaceTrees (an online variant of Isolation Forest), and the challenges of personalizing a model to individual behavior. But most importantly, we walked away with a much deeper appreciation for how much user behavior varies, and how difficult it is to build a system that adapts to every single person.