Inspiration
There is a quiet, painful divide in technology today. The people who need speech-to-text the most (people with disabilities, students in remote villages, elders who struggle to type, journalists in sensitive environments, patients recovering from injury) are the same people who cannot rely on expensive, cloud-dependent, subscription-based tools.
They are asked to trust servers they have never seen, buy devices they cannot afford, or maintain internet connections they do not have.
We refused to accept that.
We built Nugget because your ability to use your voice should not depend on your income, your bandwidth, your hardware, or your geography. Voice is the most human interface we have. But today, access to it is unequal.
Nugget’s mission is simple and deeply personal: Make high-quality speech-to-text available to anyone, anywhere, entirely offline, with complete privacy, on hardware they already own.
And in many parts of the world, the only affordable computing devices are Arm-based laptops and single-board computers, which makes offline, low-power, privacy-first speech technology not just helpful but essential.
What it does
Nugget gives users a frictionless, almost magical experience:
Hold a shortcut → speak → Nugget transcribes your voice → it types into any input field on your device.
- Google Docs? Works.
- WhatsApp? Works.
- Email? Works.
- VS Code? Works.
Every text box becomes voice-enabled instantly, no integrations required. The UI is intentionally minimal: one button, one action, zero learning curve. Users can transcribe instantly after a 30-second setup.
And all of it happens fully offline, powered by optimized Whisper and Parakeet models running directly on Arm CPUs.
- No cloud.
- No subscription.
- No server logs.
- No waiting for WiFi.
No compromise on privacy.
For people with limited mobility, this means independence.
For people in rural communities, this means access.
For journalists and field workers, this means safety.
For students and creators, this means speed.
Nugget gives back control, one sentence at a time.
How we built it
We engineered Nugget to run efficiently on Arm-powered devices, from modern Arm laptops to single-board computers like the Raspberry Pi 5.
Under the hood:
- Rust for secure, low-latency native performance
- Tauri + React for a lightweight, cross-platform UI
- whisper-rs for Whisper INT8 inference with NEON acceleration
- transcription-rs (Parakeet V3) for CPU-optimized Arm performance
- Silero VAD to filter silence and reduce compute cost
- cpal for real-time, cross-platform audio capture
- rdev to support global hotkeys and accessibility workflows
- rubato for efficient multi-rate audio resampling
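As a rough illustration of how the hotkey, audio-capture, and silence-filtering pieces above could fit together, here is a minimal, std-only Rust sketch. It is not Nugget's actual code. `HotkeyEvent`, `PushToTalk`, and `drop_silence` are hypothetical names; the hotkey events stand in for what a listener such as rdev would deliver, the audio chunks stand in for a cpal input callback, and a simple RMS energy gate is swapped in for Silero VAD, which is a neural model.

```rust
/// Events a (hypothetical) global hotkey listener would deliver.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HotkeyEvent {
    Pressed,
    Released,
}

/// Push-to-talk recorder: buffers audio only while the shortcut is held.
#[derive(Default)]
pub struct PushToTalk {
    recording: bool,
    buffer: Vec<f32>,
}

impl PushToTalk {
    pub fn new() -> Self {
        Self::default()
    }

    /// Handle a hotkey event. On release, returns the samples captured
    /// while the key was held; otherwise returns `None`.
    pub fn on_hotkey(&mut self, event: HotkeyEvent) -> Option<Vec<f32>> {
        match event {
            HotkeyEvent::Pressed => {
                if !self.recording {
                    self.recording = true;
                    self.buffer.clear();
                }
                None
            }
            HotkeyEvent::Released => {
                if self.recording {
                    self.recording = false;
                    Some(std::mem::take(&mut self.buffer))
                } else {
                    None
                }
            }
        }
    }

    /// Called with each chunk from the audio input; chunks that arrive
    /// while the shortcut is not held are ignored.
    pub fn on_audio(&mut self, chunk: &[f32]) {
        if self.recording {
            self.buffer.extend_from_slice(chunk);
        }
    }
}

/// Root-mean-square energy of one frame of samples.
pub fn rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = frame.iter().map(|s| s * s).sum();
    (sum_sq / frame.len() as f32).sqrt()
}

/// Keep only frames whose RMS exceeds `threshold`.
/// Assumption: a plain energy gate, used here as a stand-in for a real VAD.
pub fn drop_silence(samples: &[f32], frame_len: usize, threshold: f32) -> Vec<f32> {
    assert!(frame_len > 0, "frame_len must be non-zero");
    samples
        .chunks(frame_len)
        .filter(|frame| rms(frame) > threshold)
        .flatten()
        .copied()
        .collect()
}
```

In a full pipeline, the samples returned on release would be resampled to the model's input rate, passed through the silence filter, and handed to the speech model; those steps are omitted here.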
Arm-specific optimizations
This is where Nugget stands out:
- INT8 models reduce model size by up to 60%, crucial for low-memory Arm devices
- NEON vectorization gives 1.4× – 2.1× speedups in decoding
- On Raspberry Pi 5 (Arm Cortex-A76), Parakeet V3 runs at ~1.8× real-time, enabling usable offline STT even in remote educational settings
- Dynamic buffering reduces CPU spikes on thermally constrained Arm devices
- Designed to run inside 4–6 GB RAM environments common in low-cost Arm laptops and SBCs
Nugget isn’t just compatible with Arm. It is engineered for Arm from the ground up.
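To make the "~1.8× real-time" figure concrete, the short Rust sketch below works through the arithmetic. It assumes that "N× real-time" means N seconds of audio are transcribed per second of compute, which is a common convention but is our reading rather than a stated definition; the function names are hypothetical.

```rust
/// Speed factor: seconds of audio transcribed per second of compute.
/// Assumption: this is what "N× real-time" means in the figures above.
pub fn real_time_factor(audio_secs: f64, processing_secs: f64) -> f64 {
    audio_secs / processing_secs
}

/// Estimated compute time for a clip at a given speed factor.
pub fn processing_secs(audio_secs: f64, speed_factor: f64) -> f64 {
    audio_secs / speed_factor
}

/// A factor of at least 1.0 means transcription keeps pace with speech.
pub fn keeps_up_with_speech(speed_factor: f64) -> bool {
    speed_factor >= 1.0
}
```

Under that reading, a 60-second recording at 1.8× real-time would take roughly 33 seconds to transcribe, and any factor above 1.0 means the transcriber finishes faster than the speaker talks.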
Challenges we ran into
Building a tool for everyone means facing the problems most AI tools avoid.
- Handling Whisper crashes caused by fragmented GPU memory on Linux and older Windows machines
- Working around Wayland compositors that broke paste behavior in unpredictable ways
- Ensuring consistent global hotkeys for accessibility across operating systems
- Keeping memory usage stable when users load large Whisper models
- Guaranteeing real-time performance on low-power Arm CPUs
- Making the app intuitive enough that a non-technical user can use it without reading documentation
Every challenge made Nugget more stable, more inclusive, and more technically refined.
Accomplishments we're proud of
- Achieved fully offline, real-time transcription on mainstream Arm devices
- Created a universal push-to-talk interface that works in any application
- Designed a privacy-first STT engine that never touches the cloud
- Built an open-source foundation other developers can fork, extend, and build on
- Developed a reliable accessibility workflow for users with limited mobility
- Enabled offline transcription on $60–$100 Arm boards, making education & accessibility tools viable in developing regions
- Delivered a cross-platform app with a tiny Tauri footprint, safe and production-ready
The most meaningful accomplishment?
A beta tester told us:
“For the first time, my disability doesn’t slow my thoughts down.”
That one sentence validated the entire project.
For many users in low-connectivity regions, Nugget makes speech-to-text accessible for the first time, because it finally works without internet, without subscriptions, and on the hardware they already have.
What we learned
- AI-for-Good requires empathy, not just accuracy.
- Privacy isn’t just a feature; it’s a form of protection.
- Arm devices are powerful enough to democratize offline AI globally when optimized properly.
- Users value trust and predictability more than fancy features.
- Open-source matters most when the project is simple, elegant, and hackable.
- The biggest barrier to accessibility is not technology; it’s neglect.
What’s next for Nugget
We’re turning Nugget into a long-term mission:
- Android Arm build for low-cost smartphones
- Streaming (token-by-token) real-time transcription
- Timestamped subtitles (SRT) for teachers & creators
- Plugin system for new languages and workflows
- Better Wayland-native interactions
- Education-focused version for rural schools
- Ultra-light models optimized specifically for Arm Cortex-series CPUs
- Integration with Google Classroom and Sheets
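The timestamped subtitle item in the list above targets the SubRip (SRT) format, where each cue is an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line. The sketch below shows one way to format such a cue in Rust. It is an illustration of the file format only, not planned Nugget code, and `srt_timestamp` and `srt_cue` are hypothetical names.

```rust
/// Format a millisecond offset as an SRT timestamp: HH:MM:SS,mmm.
pub fn srt_timestamp(ms: u64) -> String {
    let (hours, rem) = (ms / 3_600_000, ms % 3_600_000);
    let (minutes, rem) = (rem / 60_000, rem % 60_000);
    let (seconds, millis) = (rem / 1_000, rem % 1_000);
    format!("{:02}:{:02}:{:02},{:03}", hours, minutes, seconds, millis)
}

/// Build one SRT cue: index, time range, text, then a blank separator line.
pub fn srt_cue(index: usize, start_ms: u64, end_ms: u64, text: &str) -> String {
    format!(
        "{}\n{} --> {}\n{}\n\n",
        index,
        srt_timestamp(start_ms),
        srt_timestamp(end_ms),
        text
    )
}
```

Concatenating cues numbered from 1 yields a complete `.srt` file; the start and end offsets would come from segment timestamps reported by the speech model.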
Our goal is bold but clear: Make private, offline, accessible speech technology a global default, not a luxury.
Nugget is not just a tool. It is a commitment to equity. And on Arm, Nugget doesn’t just transcribe voice; it redefines what on-device AI can be, turning every text box into a private, offline, universal voice interface with no integrations required.
Built With
- arm
- armneon
- cpal
- neon
- onnx
- parakeet
- rdev
- react
- rubato
- rust
- silerovad
- tauri
- typescript
- whisper