MeetingCopilot

Inspiration

MeetingCopilot was born from the frustration of "meeting fog": the mental drain of digging through local documents while trying to stay engaged in a live conversation. I wanted an "Iron Man HUD" that could stay sharp throughout the meeting and promptly surface facts from my knowledge base.

What it does

It is a native macOS agent that provides real-time transcription and meeting intelligence. It combines ASR, RAG, and prompt engineering to answer questions in real time from your knowledge base, displaying insights in a non-intrusive stealth overlay that stays invisible in a meeting screen share.

How we built it

Python powers the ASR, OCR, RAG, and Gemini QA pipeline, including prompt construction; the native macOS UI is built with Swift.
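The retrieval step of that pipeline can be sketched as follows. This is a minimal illustration, not our production code: it uses a toy bag-of-words similarity in place of a real embedding model, and all function names are illustrative.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline would use a vector model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank knowledge-base chunks by similarity to the transcribed question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question: str, chunks: list[str]) -> str:
    """Ground the LLM answer in the retrieved context."""
    context = "\n".join(retrieve(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In the real app, the transcribed audio feeds `question`, the indexed documents feed `chunks`, and the resulting prompt goes to Gemini.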

Challenges we ran into

The primary hurdle was navigating macOS system-audio permissions to capture multi-channel output and using the channels for diarization: the mic carries our own voice, while system audio carries the other meeting attendees. Balancing real-time diarization (speaker detection) against LLM response latency required an optimized concurrency model. The stealth-mode UI was its own challenge: it has to assist the user in real time without distracting viewers of a shared screen.
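The channel-based diarization heuristic above can be sketched like this. The chunk structure and speaker labels are assumptions for illustration; the real capture path goes through macOS audio APIs.

```python
from dataclasses import dataclass

MIC_CHANNEL = 0      # our own microphone
SYSTEM_CHANNEL = 1   # loopback of system audio (remote attendees)

@dataclass
class AudioChunk:
    channel: int
    pcm: bytes  # raw samples; the format here is illustrative

def label_speaker(chunk: AudioChunk) -> str:
    """Cheap first-pass diarization: the capture channel tells us who is talking."""
    return "Me" if chunk.channel == MIC_CHANNEL else "Other attendee"

def route(chunks: list[AudioChunk]) -> dict[str, list[bytes]]:
    """Group incoming chunks into per-speaker streams before sending them to ASR."""
    streams: dict[str, list[bytes]] = {"Me": [], "Other attendee": []}
    for c in chunks:
        streams[label_speaker(c)].append(c.pcm)
    return streams
```

Because the channel alone resolves "us vs. them," the expensive model-based diarization only has to distinguish speakers within the system-audio stream.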

Accomplishments that we're proud of

We created a seamless "Intelligence Loop" in which the AI reasons across audio, text, and vision modalities. Seeing the app pull a specific figure from a 50-page PDF the moment it was asked about in a meeting was a wow moment for attendees.

What we learned

I learned that the true power of AI isn't just in the model but in the data orchestration. Building this project deepened my expertise in ASR, diarization, OCR, RAG, prompt engineering, and the Google Gemini API.

What's next for MeetingCopilot

The next step is automated action-item extraction that syncs directly with project management tools. We also plan real-time ticketing and task management, which gives us an edge over meeting assistants that only take action after the meeting ends. Beyond that, we want to expand the "Visual Context" engine to real-time video analysis of shared screens for deeper meeting insights and prompt guidance, and add real-time sentiment analysis to help guide the user toward successful meetings.

Built With

Python, Swift, Google Gemini
