Voice Notes

🧠 Inspiration

Jotting down a thought before it disappears is harder than it sounds, especially when distractions are everywhere. We wanted to capture those fleeting ideas with as little friction as possible and let speech flow naturally into structured, editable notes. Voice Notes turns thinking aloud into a simple workflow, built from the ground up in C++.

🎙️ What it does

Voice Notes listens, transcribes, and remembers. It records audio directly from your microphone, converts it to text with whisper.cpp, a native C/C++ port of OpenAI’s Whisper model, and displays synchronized voice and text notes inside a minimalist, SFML-powered interface. Each note exists as a pair of .wav and .txt files that are editable, replayable, and fully offline. With a single hotkey, your voice becomes a saved, structured note.
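
The paired-file layout described above could be handled with a few lines of std::filesystem. This is only a minimal sketch, not the project's actual code: the names NoteFiles, pathsFor, and findCompleteNotes are hypothetical, and it assumes both files of a note share one stem in the same folder.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Assumed layout: one note = <stem>.wav + <stem>.txt in the same folder.
struct NoteFiles {
    fs::path audio;
    fs::path text;
};

// Build both paths for a note from its folder and stem.
NoteFiles pathsFor(const fs::path& dir, const std::string& stem) {
    assert(!stem.empty());
    return { dir / (stem + ".wav"), dir / (stem + ".txt") };
}

// Return the stems in `dir` that have both a .wav and a .txt file, sorted.
std::vector<std::string> findCompleteNotes(const fs::path& dir) {
    std::vector<std::string> stems;
    if (!fs::is_directory(dir)) return stems;  // missing folder: no notes
    for (const auto& entry : fs::directory_iterator(dir)) {
        const fs::path& p = entry.path();
        if (p.extension() != ".wav") continue;
        fs::path txt = p;
        txt.replace_extension(".txt");
        if (fs::exists(txt)) stems.push_back(p.stem().string());
    }
    std::sort(stems.begin(), stems.end());
    return stems;
}
```

Keeping the transcript in a plain .txt file next to the audio means a note can be edited in any text editor, and a recording without a transcript is easy to detect.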

🧩 How we built it

Voice Notes was built entirely in C++, combining:

SFML for real-time graphics, window management, and microphone input
whisper.cpp (built on ggml) for efficient on-device transcription
Low-level file I/O and threading to synchronize recording, saving, and UI rendering
Custom state management for settings, hotkeys, and data persistence

No frameworks, no web stack, just system-level code designed for raw performance and control.
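
One common way to keep recording, saving, and rendering from stepping on each other is to pass audio between threads through a small blocking queue. The sketch below illustrates that pattern under assumptions: ChunkQueue and AudioChunk are hypothetical names, and the project may synchronize its threads differently.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>
#include <vector>

// Assumed unit of work: a block of 16-bit PCM samples from the recorder.
using AudioChunk = std::vector<std::int16_t>;

class ChunkQueue {
public:
    // Called by the recording thread.
    void push(AudioChunk chunk) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(chunk));
        }
        ready_.notify_one();
    }

    // Signals that no more chunks will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Called by the worker thread. Blocks until a chunk is available;
    // returns std::nullopt once the queue is closed and fully drained.
    std::optional<AudioChunk> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty()) return std::nullopt;
        AudioChunk chunk = std::move(queue_.front());
        queue_.pop();
        return chunk;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<AudioChunk> queue_;
    bool closed_ = false;
};
```

A worker can loop on pop() until it returns an empty optional, so shutting down only requires the recorder to call close(). The UI thread never waits on this queue and can keep rendering.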

⚙️ Challenges we ran into

The main challenge was working close to the metal: managing multiple threads for recording, processing, and rendering without crashes or deadlocks. Integrating Whisper natively meant dealing with memory alignment, audio resampling (whisper.cpp expects 16 kHz mono floating-point samples), and cross-platform quirks. We also had to build the GUI from scratch in SFML, handling input, focus, and asynchronous behavior without the safety net of a full UI toolkit.
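
To make the resampling step concrete, here is a rough sketch of converting interleaved 16-bit PCM into the 16 kHz mono float samples whisper.cpp works with. The function names toMonoFloat and resampleLinear are hypothetical, and linear interpolation is a simplification chosen for brevity: it applies no anti-aliasing filter, so a production resampler would likely do more.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Downmix interleaved 16-bit PCM to mono floats in [-1, 1).
std::vector<float> toMonoFloat(const std::vector<std::int16_t>& pcm,
                               unsigned channels) {
    assert(channels > 0);
    std::vector<float> mono(pcm.size() / channels);
    for (std::size_t i = 0; i < mono.size(); ++i) {
        float sum = 0.0f;
        for (unsigned c = 0; c < channels; ++c) sum += pcm[i * channels + c];
        mono[i] = (sum / channels) / 32768.0f;
    }
    return mono;
}

// Resample by linear interpolation between neighbouring input samples.
std::vector<float> resampleLinear(const std::vector<float>& in,
                                  unsigned srcRate, unsigned dstRate) {
    assert(srcRate > 0 && dstRate > 0);
    if (in.empty() || srcRate == dstRate) return in;
    const std::size_t outLen = static_cast<std::size_t>(
        static_cast<std::uint64_t>(in.size()) * dstRate / srcRate);
    std::vector<float> out(outLen);
    const double step = static_cast<double>(srcRate) / dstRate;
    for (std::size_t i = 0; i < outLen; ++i) {
        const double pos = i * step;
        std::size_t i0 = static_cast<std::size_t>(pos);
        if (i0 >= in.size()) i0 = in.size() - 1;  // guard against rounding
        const std::size_t i1 = (i0 + 1 < in.size()) ? i0 + 1 : i0;
        const float frac = static_cast<float>(pos - static_cast<double>(i0));
        out[i] = in[i0] + (in[i1] - in[i0]) * frac;
    }
    return out;
}
```

For example, a one-second stereo recording at 48 kHz would go through toMonoFloat(samples, 2) and then resampleLinear(mono, 48000, 16000), which yields 16000 samples.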

🏆 Accomplishments that we're proud of

Running Whisper transcription natively in C++ with no Python bindings
Building a complete offline voice-to-text workflow with real-time audio capture
Designing a responsive and minimal UI from the ground up
Creating a system that’s both low-level and user-friendly, where efficiency meets simplicity

📚 What we learned

We learned how much depth lies beneath “simple” applications. From managing buffers and sample rates to handling Unicode input and multithreaded state synchronization, Voice Notes forced us to think like systems engineers. We also gained a deeper appreciation for how low-level optimization and thoughtful UX can coexist beautifully.
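
Unicode input is a good example of that hidden depth. SFML's text-entered event reports each typed character as a UTF-32 code point, so a text field that stores UTF-8 has to encode on insert and delete whole code points on Backspace. The sketch below shows one way to do this; appendUtf8 and popUtf8 are hypothetical helper names, and we are assuming the notes are stored as UTF-8.

```cpp
#include <string>

// Append one Unicode code point to a UTF-8 string.
// Returns false (and appends nothing) for surrogates or out-of-range values.
bool appendUtf8(std::string& out, char32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return true;
}

// Remove the last code point (for Backspace), not just the last byte.
void popUtf8(std::string& s) {
    while (!s.empty()) {
        const unsigned char last = static_cast<unsigned char>(s.back());
        s.pop_back();
        if ((last & 0xC0) != 0x80) break;  // stop once a lead byte is removed
    }
}
```

Deleting a single byte instead of a full code point would leave an invalid sequence behind, which is the kind of bug that only shows up once someone types an accented letter or an emoji.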

🚀 What's next for Voice Notes

Next, we plan to:

Integrate speaker diarization and sentiment detection
Add optional, privacy-respecting cloud synchronization
Introduce searchable transcriptions and note tagging
Port the core engine to a cross-platform mobile version
Optimize inference using GPU acceleration via Vulkan or CUDA

Ultimately, we want Voice Notes to evolve into a personal companion for thought, one that listens, understands, and organizes, all while keeping your data yours.

Built With

c++
sfml
whisper

Updates

Danton Soares started this project — Nov 09, 2025 02:13 PM EST
