The Moment This Became Real
I was reading my own journal from six months ago.
Every entry had the same themes. The same goals. The same promises to myself. The same frustrations.
Six months apart. Word for word identical.
That was not a motivation problem. That was not a discipline problem. That was a visibility problem. I could not see my own patterns because I was inside them. No tool I had ever used — journaling, habit tracking, productivity systems, therapy — had ever shown me my behavior objectively. They all waited for me to report what I thought was happening. And I was wrong, every time.
That is why I built SŌMA.
The Problem Is Bigger Than One Person
Research by organizational psychologist Tasha Eurich found that while 95% of people believe they are self-aware, only 10–15% actually are. There is less than a 30% correlation between how competent people believe they are and how competent they actually are.
This is not rare. This is the default human condition.
Every self-improvement tool ever built has one fatal flaw: it relies on your self-report. You tell it your goals. You log your mood. You describe your decisions. But you are the least reliable narrator of your own life. You misremember. You rationalize. You omit the embarrassing parts.
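$$r(\text{self-assessed competence},\ \text{actual competence}) < 0.3 \;\Rightarrow\; r^2 < 0.09$$
A correlation that weak means self-assessment accounts for less than 9% of the variance in actual performance. Most of what you tell any tool about yourself is noise.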
What SŌMA Actually Is
SŌMA is the first objective observer of a human life.
Not a second brain. Not a memory tool. Not a habit tracker.
An always-on, local-first behavioral intelligence system that:
- Captures raw behavioral signal across audio, screen, and text — entirely on your device, never uploaded
- Builds a model of who you actually are from what you do, not what you say
- Surfaces your real patterns, your blind spots, and the gap between your stated values and actual behavior
- Intervenes at the moment you are about to repeat a mistake
It does not ask you anything. It watches what you do.
How I Built It
The entire stack runs locally. Zero cloud. Zero data leaving the device. Zero infrastructure cost.
| Layer | Tool |
|---|---|
| Audio Transcription | Whisper.cpp (real-time, on-device) |
| Screen Understanding | Moondream2 (local vision model) |
| Behavioral Inference | Llama 3.3 via Ollama |
| Semantic Embeddings | nomic-embed-text via Ollama |
| Memory + Pattern Store | LanceDB + NetworkX |
| Intervention Engine | Custom temporal pattern detection |
| Frontend | Electron (local desktop app) |
| Encryption | AES-256, user holds the only key |
The core architecture is a five-stage pipeline:
$$\text{Raw Signal} \rightarrow \text{Semantic Embedding} \rightarrow \text{Behavioral Graph} \rightarrow \text{Pattern Detection} \rightarrow \text{Intervention}$$
Every piece of raw data (audio, screen frames, keystrokes) is processed locally and immediately converted into semantic embeddings. The raw data is deleted after processing. What persists is meaning, not content: the stored embeddings are a lossy representation from which the original signal cannot be exactly reconstructed.
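A minimal sketch of the embed-then-discard step, under stated assumptions: `toy_embed` is a stand-in for the local embedding model (it only hashes words into buckets and carries no real semantics), and `Event` and `ingest` are illustrative names rather than SŌMA's actual code. The point it shows is that the stored record has no field for raw content.

```python
import hashlib
import math
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    """What persists: when, which channel, and an embedding. No raw content."""
    timestamp: float
    source: str                 # "audio", "screen", or "text"
    vector: Tuple[float, ...]   # embedding only


def toy_embed(text: str, dim: int = 8) -> List[float]:
    """Hypothetical stand-in for a real embedding model: bucket words by hash."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def ingest(raw: str, source: str, store: list,
           embed: Callable[[str], List[float]] = toy_embed) -> Event:
    """Embed a raw chunk, store the embedding, and keep no reference to `raw`."""
    event = Event(time.time(), source, tuple(embed(raw)))
    store.append(event)
    return event


store: list = []
event = ingest("skipped the gym again", "text", store)
```

In a real build the `embed` argument would presumably wrap the local embedding model, and `store` would be the vector database rather than a Python list.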
What I Learned
Three things surprised me during the build:
1. The hardest problem is not technical — it is trust. An always-on system that watches your behavior will face immediate skepticism. The only answer is radical transparency: open-source the data pipeline, let anyone verify no exfiltration occurs, and make local-first architecture non-negotiable from day one.
2. Pattern detection across time is harder than pattern detection across content. Most AI systems find patterns in what is there. SŌMA needs to find patterns in sequences across days and weeks — what follows what, how often, under what conditions. That required building a custom temporal graph rather than relying on standard vector similarity.
3. The intervention timing problem is everything. Surfacing a pattern after the fact is useful. Surfacing it at the exact moment of deviation is transformative. Getting that timing right — early enough to matter, not so early it becomes noise — is the core product insight that separates SŌMA from every existing tool.
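Point 2 can be illustrated with a small sketch of sequence-based detection: counting which labelled event follows which within a time window, kept as weighted directed edges. This is an assumption about what a temporal graph could look like, not the project's implementation. It uses a plain dictionary as a stand-in for a graph library, and the event labels and the `window_s` parameter are hypothetical.

```python
from collections import defaultdict


def build_transitions(events, window_s=3600):
    """Count A -> B edges where B is the next event after A within window_s seconds.

    events: list of (timestamp_seconds, label) tuples.
    Returns a dict mapping (a, b) to the number of times b directly followed a.
    """
    edges = defaultdict(int)
    ordered = sorted(events)
    for (t0, a), (t1, b) in zip(ordered, ordered[1:]):
        if t1 - t0 <= window_s:
            edges[(a, b)] += 1
    return dict(edges)


def follow_probability(edges, a, b):
    """Estimate P(next = b | current = a) from the edge counts."""
    total = sum(n for (src, _), n in edges.items() if src == a)
    return edges.get((a, b), 0) / total if total else 0.0


DAY = 86400
log = [
    (0 * DAY + 100, "deadline_email"), (0 * DAY + 400, "open_social_feed"),
    (1 * DAY + 100, "deadline_email"), (1 * DAY + 500, "open_social_feed"),
    (2 * DAY + 100, "deadline_email"), (2 * DAY + 300, "start_task"),
]
edges = build_transitions(log)
p = follow_probability(edges, "deadline_email", "open_social_feed")
```

Here two of the three `deadline_email` events are followed by `open_social_feed`, so `p` comes out to 2/3. Vector similarity alone would not surface that ordering, which is the distinction the point above draws.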
The Challenges
Privacy and trust — solved through uncompromising local-first architecture and open-source transparency.
Cold start problem — the behavioral model needs time to become meaningful. Solved by combining heuristic pattern detection in week one with ML-based detection after sufficient data accumulates.
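One way the cold-start handoff could be sketched: a selector that returns simple heuristic rules until enough observation has accumulated, then switches to the learned detector. The seven-day default mirrors the "week one" above; the `min_events` value and all function names are hypothetical placeholders, not figures from the project.

```python
def choose_detector(days_observed, n_events, heuristic, learned,
                    min_days=7, min_events=500):
    """Return the heuristic detector until both data thresholds are met."""
    if days_observed >= min_days and n_events >= min_events:
        return learned
    return heuristic


def repeated_label_heuristic(labels, min_repeats=3):
    """Week-one rule of thumb: flag any label seen at least min_repeats times."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return sorted(label for label, n in counts.items() if n >= min_repeats)


early = choose_detector(3, 1000, "heuristic", "learned")
later = choose_detector(10, 1000, "heuristic", "learned")
flagged = repeated_label_heuristic(["late_start", "snack", "late_start", "late_start"])
```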
Intervention fatigue — too many nudges and users ignore them all. Solved by a strict relevance threshold: SŌMA only intervenes when pattern confidence exceeds 80% and the stakes of the current action are above baseline.
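The relevance threshold can be written as a single gate. This is a sketch under assumptions: `should_intervene` and its parameter names are illustrative, and how confidence and stakes are actually scored is not specified here.

```python
def should_intervene(confidence, stakes, baseline_stakes, min_confidence=0.80):
    """Nudge only when confidence exceeds 80% AND stakes are above baseline."""
    return confidence > min_confidence and stakes > baseline_stakes


high_conf_high_stakes = should_intervene(0.90, stakes=2.0, baseline_stakes=1.0)
exactly_at_threshold = should_intervene(0.80, stakes=2.0, baseline_stakes=1.0)
high_conf_low_stakes = should_intervene(0.95, stakes=0.5, baseline_stakes=1.0)
```

Both comparisons are strict, matching "exceeds" and "above" in the description, so a pattern sitting exactly at 80% confidence does not trigger a nudge.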
Why Now
Four technical constraints that made this impossible broke simultaneously in the last 18 months:
- Local real-time transcription → Whisper v3
- On-device vision understanding → Moondream2
- Consumer-grade local LLMs → Llama 3.3, Mistral
- Local vector databases → LanceDB, ChromaDB
$$\text{2026} = \text{First year the entire stack fits on a consumer device}$$
The technology was waiting for someone to assemble it into this.
The Vision
In three years: 1,000,000 people have replaced their therapist, coach, or self-help routine with SŌMA as their primary tool for self-understanding — measured by active subscribers who have used the system for a minimum of six consecutive months.
The six-month threshold matters. It is the point at which the behavioral model becomes dense enough to surface patterns invisible to any shorter observation window.
It is also the point at which no user has ever voluntarily left.
Because leaving means losing the most honest record of yourself that has ever existed.
SŌMA. The first system that tells you the truth about yourself.