Nottaflow
Inspiration
The inspiration for Nottaflow came from observing how learning naturally happens through conversation and dialogue, yet traditional note-taking tools remain static and disconnected from this organic process. We recognized that students and researchers often struggle to capture the dynamic flow of ideas that emerge during discussions with AI or while thinking out loud.
We wanted to bridge the gap between conversational learning and structured knowledge with a tool that turns spoken thoughts into organized, editable content.
What it Does
Nottaflow transforms natural conversations into structured knowledge through:
- Voice Call Interface: Users can initiate professional voice calls with AI, complete with user and AI profiles, live transcription, and call duration tracking.
- Real-time Note Generation: Every conversation, whether typed or spoken, automatically generates structured workspace blocks using our Gemini-powered processing pipeline.
- Continuous Dialogue: The system maintains context across multiple exchanges, creating a natural back-and-forth learning experience.
- Live Editing: Generated notes can be immediately edited, rearranged, and refined by users.
- Multi-modal Input: Supports text chat, voice calls, and document uploads for comprehensive knowledge capture.
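To make "structured workspace blocks" more concrete, here is a minimal sketch of how a JSON reply from the model could be checked before it reaches the workspace. The WorkspaceBlock shape, the three block kinds, and the parseBlocks helper are all hypothetical names chosen for illustration; they are not Nottaflow's actual schema.

```typescript
// Hypothetical block shape (an assumption; the real schema is not documented here).
type BlockKind = "heading" | "paragraph" | "bullet";

interface WorkspaceBlock {
  id: string;
  kind: BlockKind;
  text: string;
}

const KINDS: readonly string[] = ["heading", "paragraph", "bullet"];

// Parse a JSON string (for example a model reply) into blocks,
// silently dropping entries that do not match the expected shape.
function parseBlocks(raw: string): WorkspaceBlock[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return []; // not valid JSON: produce no blocks rather than throw
  }
  if (!Array.isArray(data)) return [];

  const blocks: WorkspaceBlock[] = [];
  data.forEach((item, index) => {
    if (typeof item !== "object" || item === null) return;
    const { kind, text } = item as { kind?: unknown; text?: unknown };
    if (typeof kind !== "string" || !KINDS.includes(kind)) return;
    if (typeof text !== "string" || text.trim() === "") return;
    // The id is derived from the position in the reply so it stays stable per response.
    blocks.push({ id: `block-${index}`, kind: kind as BlockKind, text: text.trim() });
  });
  return blocks;
}
```

Dropping malformed entries instead of rejecting the whole reply is one possible design choice: a single bad block does not cost the user the rest of their notes.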
How We Built It
Our architecture follows a clear processing pipeline:
Input → Processing → AI Analysis → Structured Output
Frontend Stack
- Next.js 16 with TypeScript for type-safe development
- Custom React Hooks for Web Speech API integration
- Tailwind CSS for responsive, theme-aware styling
Voice Processing Pipeline
Speech Input → Web Speech API → Continuous Recognition → Auto-submission → Gemini 3.0 Processing → Live Note Generation
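One way to picture this pipeline is as a small state machine that the UI can follow. The sketch below is illustrative only: the state names, event names, and the nextState function are assumptions, not the identifiers used in the Nottaflow codebase.

```typescript
// Hypothetical states and events for a single voice call.
type CallState = "idle" | "listening" | "processing" | "generating";
type CallEvent = "start" | "transcriptSubmitted" | "aiResponded" | "notesReady" | "hangUp";

// Allowed transitions; any event not listed for a state is ignored.
const transitions: Record<CallState, Partial<Record<CallEvent, CallState>>> = {
  idle: { start: "listening" },
  listening: { transcriptSubmitted: "processing" },
  processing: { aiResponded: "generating" },
  generating: { notesReady: "listening" }, // loop back for continuous dialogue
};

function nextState(state: CallState, event: CallEvent): CallState {
  if (event === "hangUp") return "idle"; // hanging up is valid from every state
  return transitions[state][event] ?? state;
}
```

Keeping the transitions in one table makes it harder for the voice, AI, and workspace layers to drift into inconsistent states.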
Key Implementation Details
- Built useVoiceRecognition hook with continuous listening and auto-restart capabilities
- Implemented state synchronization between voice input, AI processing, and workspace updates
- Created professional voice call modal with theme-aware logos and real-time status updates
- Designed drag-and-drop workspace with live block editing
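The auto-restart behaviour mentioned above can be sketched independently of React. The ContinuousListener class and the RecognitionLike interface below are hypothetical; RecognitionLike only mirrors the handful of Web Speech API members (continuous, interimResults, onend, start, stop) that this sketch touches, and the actual useVoiceRecognition hook may be structured quite differently.

```typescript
// Minimal subset of a speech recognition object used by this sketch (an assumption).
interface RecognitionLike {
  continuous: boolean;
  interimResults: boolean;
  onend: (() => void) | null;
  start(): void;
  stop(): void;
}

class ContinuousListener {
  private recognition: RecognitionLike;
  private wantListening = false;

  constructor(recognition: RecognitionLike) {
    this.recognition = recognition;
    recognition.continuous = true;
    recognition.interimResults = true;
    // Recognition can end on its own (silence, errors); restart it
    // unless the user has explicitly asked to stop.
    recognition.onend = () => {
      if (this.wantListening) recognition.start();
    };
  }

  start(): void {
    if (this.wantListening) return; // avoid starting an already running session
    this.wantListening = true;
    this.recognition.start();
  }

  stop(): void {
    this.wantListening = false; // cleared first so onend does not restart
    this.recognition.stop();
  }
}
```

Injecting the recognition object keeps the restart logic testable outside the browser; in a React hook the same class could be created once and stopped in the effect cleanup.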
Challenges We Ran Into
- Voice Recognition Accuracy: Achieving reliable continuous speech recognition while preventing false triggers and handling ambient noise.
- State Synchronization: Managing complex state flow from voice input through transcript processing to AI analysis and UI updates.
- Hydration Mismatches: Resolving SSR/client-side conflicts caused by browser extensions injecting DOM attributes.
- Real-time UX: Creating smooth transitions between listening, processing, and note generation states.
- Auto-submission Logic: Balancing automatic note creation with user control and preventing unwanted submissions.
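For the auto-submission trade-off, one simple approach is to submit only after a pause in speech and only when the utterance is long enough to be meaningful. The sketch below assumes this approach; shouldAutoSubmit, its options, and the default thresholds are hypothetical values for illustration, not the ones Nottaflow ships with.

```typescript
interface AutoSubmitOptions {
  silenceMs: number; // how long the speaker must be quiet before submitting
  minWords: number;  // ignore very short utterances such as filler words or noise
}

// Decide whether the current transcript should be sent to the AI.
function shouldAutoSubmit(
  transcript: string,
  msSinceLastSpeech: number,
  paused: boolean,
  options: AutoSubmitOptions = { silenceMs: 1500, minWords: 3 },
): boolean {
  if (paused) return false; // the user has switched auto-submit off
  const words = transcript.trim().split(/\s+/).filter(Boolean);
  return words.length >= options.minWords && msSinceLastSpeech >= options.silenceMs;
}
```

The paused flag is what keeps the user in control: a visible toggle can set it, and no amount of silence will trigger a submission while it is on.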
Accomplishments We're Proud Of
- Seamless Voice Integration: Achieved natural voice-to-note workflow with sub-500ms response time.
- Professional Call Interface: Built enterprise-grade voice call modal with live transcription and user profiles.
- Continuous Learning Loop: Implemented automatic listening restart after AI responses for uninterrupted dialogue.
- Theme-Aware Design: Created comprehensive dark/light mode system with dynamic logo switching.
- Live Note Generation: Successfully transformed abstract conversations into concrete, editable workspace content.
- Mobile Responsiveness: Ensured full functionality across desktop and mobile devices.
What We Learned
Technical Insights
- Advanced Web Speech API integration patterns and continuous recognition strategies
- Complex React state management for real-time multi-modal applications
- Gemini 3.0 API optimization for conversational AI processing
- SSR hydration best practices for browser extension compatibility
UX/Design Lessons
- Voice interface design requires different mental models than traditional UI
- Users need clear visual feedback during voice processing states
- Auto-generation must balance convenience with user control
- Professional voice call aesthetics significantly impact user trust and engagement
AI Integration
- Conversation context preservation across multiple exchanges
- Balancing AI creativity with structured output requirements
- Optimizing prompt engineering for consistent note generation quality
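Context preservation usually means deciding how much of the earlier conversation to resend with each request. As a minimal sketch, assuming a simple character budget rather than real token counting, the helper below keeps the most recent turns that fit. The Turn type and trimHistory are hypothetical names introduced for illustration.

```typescript
interface Turn {
  role: "user" | "model";
  text: string;
}

// Keep the most recent turns whose combined text length fits within maxChars.
// Older turns are dropped first; the order of the kept turns is preserved.
function trimHistory(history: Turn[], maxChars: number): Turn[] {
  const kept: Turn[] = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = history[i].text.length;
    if (used + cost > maxChars) break;
    kept.unshift(history[i]);
    used += cost;
  }
  return kept;
}
```

A character budget is only a rough stand-in for the model's real context limit; a production version would more likely count tokens or summarize older turns instead of dropping them.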
What's Next for Nottaflow AI
Short-term Enhancements (≤ 3 months)
- Multi-language Support: Expand voice recognition to support 10+ languages
- Advanced Note Types: Implement specialized blocks for equations, diagrams, and code snippets
- Collaboration Features: Enable shared voice sessions and real-time collaborative editing
Medium-term Vision (3-12 months)
- AI Voice Synthesis: Add AI voice responses for true conversational experience
- Smart Summarization: Implement intelligent conversation summarization algorithms
- Integration Ecosystem: Connect with popular tools like Notion, Obsidian, and Google Drive
Long-term Goals (12+ months)
- Personalized Learning Paths: Develop AI tutoring capabilities based on conversation history
- Knowledge Graph Visualization: Transform conversations into interactive concept maps
- Enterprise Solutions: Scale for team-based knowledge management and organizational learning
The ultimate vision is an AI learning companion that makes acquiring knowledge as natural as having a conversation.
Built With
- clerk-authentication-api
- css
- eslint
- git
- google-gemini-3.0-api
- html
- javascript
- lucide-react
- next.js-16
- node.js
- npm
- react-19
- speech
- swr
- tailwind-css
- typescript
- vercel
- web
- web-speech-api
- zustand