MeetingMind AI


Inspiration

Meetings are crucial for collaboration, but manually taking notes is tedious and error-prone. We wanted to create an intelligent assistant that doesn't just transcribe—it understands, organizes, and makes meeting content actionable.

What it does

MeetingMind AI transforms meeting audio into clean, structured minutes using Gemini 3 Pro's native multimodal capabilities. Key features include:

Dual Input Modes: Live recording with real-time audio visualization or file upload (MP3, WAV, M4A, WEBM)
Intelligent Analysis: Generates executive summaries, key discussion points, and action items with priorities
Multilingual Support: Seamlessly switches between languages (e.g., Chinese/English) for both input and output
AI Q&A: Ask questions about specific meeting details without re-listening to the entire recording
Professional Delivery: Export as PDF or copy formatted text for immediate sharing
Complete Transcription: View full meeting transcripts alongside structured summaries
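
As a rough illustration of the file-upload mode, the sketch below checks a file name against the four listed formats and picks a MIME type to send along with the audio. The helper name `resolveAudioMimeType` and the extension-to-MIME mapping are assumptions made for this example, not the project's actual code; browsers may report other types for the same files (for example `audio/x-m4a`).

```typescript
// Hypothetical mapping from supported upload extensions to MIME types.
const SUPPORTED_AUDIO_TYPES = new Map<string, string>([
  ["mp3", "audio/mpeg"],
  ["wav", "audio/wav"],
  ["m4a", "audio/mp4"],
  ["webm", "audio/webm"],
]);

// Returns the MIME type for a supported file name, or null if the
// extension is missing or not one of the supported formats.
function resolveAudioMimeType(fileName: string): string | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0 || dot === fileName.length - 1) return null;
  const extension = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_AUDIO_TYPES.get(extension) ?? null;
}
```

A caller could reject the upload early when the helper returns null, before any audio is sent to the model.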

How we built it

Frontend: React with Tailwind CSS for a modern UI design
AI Engine: Gemini 3 Pro API with native audio processing, so no separate speech-to-text (STT) pipeline is needed
Audio Processing: Browser's MediaRecorder API and Web Audio API for real-time visualization
Structured Output: responseSchema in Gemini API ensures consistent JSON formatting
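
To make the structured-output step more concrete, here is a minimal sketch of what a response schema and a defensive parser might look like. The field names (`summary`, `keyPoints`, `actionItems`, `priority`) are hypothetical, and the schema is written as a plain object using the uppercase type names of the Gemini API's schema format rather than through an SDK call, so treat it as an assumption about the shape and not the project's real schema.

```typescript
// Hypothetical shape of the generated minutes; field names are assumptions.
type Priority = "high" | "medium" | "low";
interface ActionItem { task: string; priority: Priority; }
interface MeetingMinutes {
  summary: string;
  keyPoints: string[];
  actionItems: ActionItem[];
}

// Schema object that would be passed as responseSchema (assumed layout).
const minutesSchema = {
  type: "OBJECT",
  properties: {
    summary: { type: "STRING" },
    keyPoints: { type: "ARRAY", items: { type: "STRING" } },
    actionItems: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: {
          task: { type: "STRING" },
          priority: { type: "STRING", enum: ["high", "medium", "low"] },
        },
        required: ["task", "priority"],
      },
    },
  },
  required: ["summary", "keyPoints", "actionItems"],
};

const PRIORITIES: readonly string[] = ["high", "medium", "low"];

// Type guard for a single action item.
function isActionItem(value: unknown): value is ActionItem {
  if (typeof value !== "object" || value === null) return false;
  const item = value as Record<string, unknown>;
  return (
    typeof item.task === "string" &&
    typeof item.priority === "string" &&
    PRIORITIES.includes(item.priority)
  );
}

// Parses the model's JSON text and throws if it does not match the shape.
function parseMinutes(jsonText: string): MeetingMinutes {
  const data: unknown = JSON.parse(jsonText);
  if (typeof data !== "object" || data === null) {
    throw new Error("Minutes must be a JSON object");
  }
  const { summary, keyPoints, actionItems } = data as Record<string, unknown>;
  if (typeof summary !== "string") throw new Error("summary must be a string");
  if (!Array.isArray(keyPoints) || !keyPoints.every((p: unknown) => typeof p === "string")) {
    throw new Error("keyPoints must be an array of strings");
  }
  if (!Array.isArray(actionItems) || !actionItems.every(isActionItem)) {
    throw new Error("actionItems must be an array of action items");
  }
  return {
    summary,
    keyPoints: keyPoints as string[],
    actionItems: actionItems as ActionItem[],
  };
}
```

Validating on the client even when a schema is supplied is a design choice: it keeps the UI from rendering a malformed response if the model output is truncated or the schema changes.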

Challenges we ran into

Handling microphone permissions across different browsers and secure contexts
Tuning the prompts to get reliable multilingual responses
Balancing detail in transcriptions vs. concise action items
Implementing smooth PDF generation from complex HTML layouts
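
The microphone-permission challenge can be sketched as a small helper that turns a failed `getUserMedia` call into a user-facing hint. The function name `describeMicError` and the message wording are hypothetical; the error names are the standard ones a rejected `getUserMedia` promise can carry, and the secure-context check reflects that microphone access is only available on HTTPS or localhost.

```typescript
// Hypothetical helper: maps a getUserMedia failure to a readable hint.
// errorName is the `name` of the rejected error; secureContext would come
// from `window.isSecureContext` in the browser.
function describeMicError(errorName: string, secureContext: boolean): string {
  if (!secureContext) {
    return "Microphone access requires HTTPS or localhost.";
  }
  switch (errorName) {
    case "NotAllowedError":
      return "Microphone permission was denied. Allow access in the browser settings and try again.";
    case "NotFoundError":
      return "No microphone was found on this device.";
    case "NotReadableError":
      return "The microphone is unavailable, possibly because another application is using it.";
    case "OverconstrainedError":
      return "No microphone satisfies the requested audio settings.";
    default:
      return "Could not start recording (" + errorName + ").";
  }
}
```

In the browser this would typically be called from the `catch` branch around the recording start, with the result shown next to the record button.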

Accomplishments

Successfully leveraged Gemini's multimodal capabilities for direct audio understanding
Built a production-ready tool that saves hours of manual work per meeting
Created an intuitive bilingual interface that adapts to user preferences
Achieved professional-grade document formatting suitable for client delivery

What we learned

The power of native multimodal AI vs. traditional two-step STT+LLM approaches
Importance of relaxing audio capture constraints so that recording still works across different audio devices
User experience design for AI-powered productivity tools
Advanced prompt engineering for structured JSON outputs
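
One way to read the constraint-relaxation lesson is as a fallback loop: request the strictest audio settings first and retry with looser ones if the device cannot satisfy them. The sketch below is an assumption about how that could look, not the project's code; `getStreamWithFallback`, `RELAXATION_STEPS`, and the specific constraint values are hypothetical, and the stream getter is injected so the logic can run outside a browser.

```typescript
// Minimal structural types so the sketch compiles outside a browser.
type AudioConstraints = boolean | Record<string, unknown>;
type GetStream<T> = (constraints: { audio: AudioConstraints }) => Promise<T>;

// Hypothetical relaxation order, from strictest to most permissive.
const RELAXATION_STEPS: AudioConstraints[] = [
  { echoCancellation: true, noiseSuppression: true, sampleRate: 48000 },
  { echoCancellation: true },
  true,
];

// Tries each constraint set in order and returns the first stream obtained.
// A permission denial is rethrown immediately, since relaxing constraints
// cannot fix it.
async function getStreamWithFallback<T>(
  getStream: GetStream<T>,
  steps: AudioConstraints[],
): Promise<T> {
  let lastError: unknown = new Error("No audio constraints were provided");
  for (const audio of steps) {
    try {
      return await getStream({ audio });
    } catch (error) {
      if (error instanceof Error && error.name === "NotAllowedError") {
        throw error;
      }
      lastError = error;
    }
  }
  throw lastError;
}
```

In the browser, `getStream` would wrap `navigator.mediaDevices.getUserMedia`, and the returned stream would feed both the recorder and the visualizer.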

What's next

Speaker diarization to identify different participants
Calendar integration for automatic meeting scheduling
Team collaboration features for shared meeting repositories
Custom templates for different meeting types (standup, review, planning)

Built With

gemini
google
mediarecorder api
react
tailwind css
typescript
web audio api

Updates

Janka Rong started this project — Feb 07, 2026 06:57 PM EST
