Inspiration
During our studies and professional meetings, we were constantly torn between active participation and taking comprehensive notes. This struggle led us to create NotedAI, a solution that leverages AI to handle note-taking so users can be fully present in their conversations.
Our team observed that while audio recording tools exist, they often lack intelligent processing capabilities to make the content immediately useful. Additionally, enterprise users require heightened security for sensitive discussions. These insights shaped our vision for NotedAI.
What it does
NotedAI is an AI-powered web application that:
Captures audio through direct recording or file upload.
Transcribes speech into text using Google Speech-to-Text.
Generates concise, bullet-pointed summaries using Gemini AI.
Enables users to ask questions about the content with AI-powered answers.
Secures private conversations with enterprise-grade encryption (Midnight integration).
Provides a personalized dashboard to manage and search all sessions.
Offers wellness tips based on usage patterns to promote better meeting habits.
Automatically detects meeting scheduling mentions in conversations and creates Google Calendar events.
How we built it
NotedAI is built with a modern tech stack focusing on security, scalability, and user experience:
Backend: We developed a Node.js/Express server with RESTful APIs for session management, transcription, and AI processing.
Frontend: We created a responsive React application with Tailwind CSS for a clean, intuitive interface.
Database: MongoDB Atlas provides flexible document storage with search capabilities.
Authentication: We implemented Google OAuth for secure, seamless user authentication.
AI Integration: We leveraged Google's Speech-to-Text API for accurate transcription and Gemini 1.5 Pro for summarization and question-answering.
Security: For enterprise users, we built a simulated Midnight integration for encrypted storage of sensitive transcripts.
Wellness: We developed an analytics system that tracks usage patterns and provides personalized wellness recommendations.
Calendar Integration: We integrated with the Google Calendar API to extract meeting times from transcripts and create calendar events automatically, without user intervention.
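As a rough illustration of the calendar step, the sketch below uses a deliberately simple regular expression to find phrases such as "Thursday at 3:30 pm" and turns each match into an event draft with summary, start.dateTime, and end.dateTime fields, which is the shape of a Google Calendar event resource. The names findSchedulingMentions and toEventDraft, the pattern itself, the 30-minute default duration, and the UTC-only date handling are all assumptions made for this example. NotedAI's actual detection logic is not shown in this write-up and is likely more sophisticated.

```typescript
// Hypothetical sketch: names, pattern, and defaults are illustrative, not NotedAI's actual code.
interface SchedulingMention {
  weekday: number; // 0 = Sunday ... 6 = Saturday
  hour: number;    // 24-hour clock
  minute: number;
}

// Mirrors the summary/start/end fields of a Google Calendar event resource.
interface CalendarEventDraft {
  summary: string;
  start: { dateTime: string };
  end: { dateTime: string };
}

const WEEKDAYS = ["sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday"];

// Finds phrases like "Thursday at 3:30 pm" or "monday at 9am".
function findSchedulingMentions(transcript: string): SchedulingMention[] {
  // A fresh regex per call avoids carrying lastIndex state between calls.
  const pattern =
    /\b(sunday|monday|tuesday|wednesday|thursday|friday|saturday)\s+at\s+(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b/gi;
  const mentions: SchedulingMention[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(transcript)) !== null) {
    const hour12 = Number(match[2]);
    const minute = match[3] ? Number(match[3]) : 0;
    if (hour12 < 1 || hour12 > 12 || minute > 59) continue; // skip impossible times
    const isPm = match[4].toLowerCase() === "pm";
    // 12 am maps to 0 and 12 pm maps to 12.
    const hour = (hour12 % 12) + (isPm ? 12 : 0);
    mentions.push({ weekday: WEEKDAYS.indexOf(match[1].toLowerCase()), hour, minute });
  }
  return mentions;
}

// Resolves a mention to the next matching weekday after `reference`.
// Assumption: all times are treated as UTC to keep the example deterministic.
function toEventDraft(
  mention: SchedulingMention,
  reference: Date,
  summary: string,
  durationMinutes: number = 30
): CalendarEventDraft {
  // A mention of the reference day's own weekday is pushed to the following week.
  const daysAhead = ((mention.weekday - reference.getUTCDay() + 7) % 7) || 7;
  const startMs = Date.UTC(
    reference.getUTCFullYear(),
    reference.getUTCMonth(),
    reference.getUTCDate() + daysAhead,
    mention.hour,
    mention.minute
  );
  const endMs = startMs + durationMinutes * 60 * 1000;
  return {
    summary,
    start: { dateTime: new Date(startMs).toISOString() },
    end: { dateTime: new Date(endMs).toISOString() },
  };
}
```

A draft produced this way could be passed as the request body when creating an event. A production version would also need to handle the user's time zone, relative phrases such as "tomorrow", and confirmation of ambiguous matches.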
Challenges we ran into
Real-time Audio Processing: Implementing efficient audio capture and processing directly in the browser required overcoming various browser-specific limitations and permissions.
Summary Quality: Generating concise yet comprehensive summaries required careful prompt engineering with the Gemini API to ensure the most important information was captured.
Enterprise Security: Simulating the Midnight secure storage integration required implementing proper encryption while maintaining a seamless user experience.
Contextual Question Answering: Teaching Gemini to answer questions specifically based on the transcript context, rather than its general knowledge, required careful parameter tuning.
User Experience: Balancing feature richness with simplicity was challenging: we wanted to offer powerful capabilities without overwhelming users.
Natural Language Understanding: Accurately detecting and parsing meeting scheduling information from natural conversation required sophisticated pattern recognition and contextual analysis.
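One common way to keep answers tied to the transcript, as in the Contextual Question Answering challenge above, is to wrap the transcript in explicit markers and instruct the model to refuse when the answer is absent. The sketch below shows only that prompt-building step. The function name buildGroundedPrompt, the marker strings, the fallback sentence, and the character budget are assumptions for illustration, not NotedAI's actual prompt, and the parameter tuning mentioned above is not covered here.

```typescript
// Hypothetical sketch: prompt wording, markers, and names are illustrative assumptions.
const FALLBACK_ANSWER = "The transcript does not cover this.";

// Builds a prompt that restricts the model to the supplied transcript.
function buildGroundedPrompt(
  transcript: string,
  question: string,
  maxTranscriptChars: number = 20000
): string {
  const trimmed = transcript.trim();
  if (trimmed.length === 0) {
    throw new Error("transcript is empty");
  }
  // Crude length control: keep only the most recent part of a long transcript.
  // A real system would more likely chunk the transcript or retrieve relevant passages.
  const excerpt =
    trimmed.length > maxTranscriptChars ? trimmed.slice(-maxTranscriptChars) : trimmed;
  return [
    "Answer the question using only the transcript between the markers.",
    `If the transcript does not contain the answer, reply exactly: "${FALLBACK_ANSWER}"`,
    "Do not use outside knowledge.",
    "",
    "<<<TRANSCRIPT",
    excerpt,
    "TRANSCRIPT>>>",
    "",
    `Question: ${question.trim()}`,
  ].join("\n");
}
```

The resulting string would be sent to the model as the user prompt. A fixed fallback sentence makes it straightforward for the application to detect when a question falls outside the transcript.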
Accomplishments that we're proud of
Creating a fully functional end-to-end solution that addresses a real problem we face.
Successfully integrating Google Speech-to-Text for accurate transcription across various accents and audio qualities.
Implementing the Gemini AI integration that delivers remarkably useful summaries and contextual answers.
Designing an intuitive UI that makes complex functionality accessible.
Building enterprise-grade security features with proper encryption.
Implementing intelligent automation that turns conversation into calendar events with no manual steps.
What we learned
Through this project, we gained valuable experience with:
Modern authentication patterns and security best practices.
Working with audio processing in web applications.
Prompt engineering for generative AI models.
Designing systems with both consumer and enterprise use cases.
Creating responsive web interfaces that handle complex workflows.
What's next for NotedAI
We have an exciting roadmap ahead:
Browser extension for capturing audio directly from video calls.
Mobile application for on-the-go recording.
Team collaboration features for shared access to transcripts.
Advanced analytics to identify meeting patterns and suggest improvements.
Integration with calendar applications for automatic meeting recording.
Additional language support for global accessibility.