
SpeakDeck: The Presentation That Listens 🎙️✨

Inspiration

In meetings and lectures, you usually have to choose: engage in the moment or take detailed notes. We wanted to eliminate that trade-off. SpeakDeck lets professionals simply speak and sketch while AI builds the visual narrative and structured notes in real time.

What it does

SpeakDeck is a real-time, multi-agent whiteboard that instantly transforms conversational speech and rough sketches into professional, structured presentations.

Powered by Gemini 3, it orchestrates five specialized AI agents that listen, track context, and visualize ideas as they are spoken. It offers a "Silent Observer" mode for passive note-taking and a "Conversational" mode for active brainstorming.

Real-World Use Cases

🎓 Education (Lecture to Slides): A professor delivers a lecture on biology. SpeakDeck listens silently, detects the topic, and automatically generates a structured slide deck with anatomical diagrams and key bullet points, ready for students to download immediately.
🏥 Healthcare (Clinical Assistant): A doctor examines a patient, describing symptoms aloud. SpeakDeck's "Medical Mode" captures the dialogue, structures it into a clinical report, and even visualizes the described condition for the patient to see, acting as a real-time scribe and visual aid.
🏗️ Architecture (Sketch-to-Reality): An architect roughly sketches a site map on the canvas while talking about "sustainable materials." SpeakDeck interprets the stroke data and voice context to generate a realistic rendering of the building layout instantly.

How we built it

We used the Gemini Multimodal Live API for low-latency audio streaming and function calling. The frontend is built with React and Vite and features a custom Canvas Board that captures raw stroke data. An Agent Orchestrator coordinates independent agents (including the Transcriber, Visual Artist, and Slide Architect), which communicate over an event bus so that updates to the shared state are applied without race conditions.
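
To make the event-bus idea concrete, here is a minimal TypeScript sketch of one way agents could share state without stepping on each other. It is an illustration under stated assumptions, not SpeakDeck's actual code: the event names, the `EventBus` class, the `reduce` function, and the stand-in agents are all hypothetical. The approach shown is a single queue that is drained one event at a time, with one reducer as the only writer to the shared state.

```typescript
// Hypothetical event and state shapes; none of these names come from SpeakDeck.
type DeckEvent =
  | { type: "transcript"; text: string }
  | { type: "slide"; title: string }
  | { type: "image"; slideIndex: number; url: string };

interface Slide { title: string; imageUrl?: string }
interface DeckState { transcript: string[]; slides: Slide[] }

type Handler = (event: DeckEvent, bus: EventBus) => void;

class EventBus {
  private handlers: Handler[] = [];
  private queue: DeckEvent[] = [];
  private draining = false;

  subscribe(handler: Handler): void {
    this.handlers.push(handler);
  }

  // Events published while another event is being handled are queued,
  // so handlers always run to completion one event at a time.
  publish(event: DeckEvent): void {
    this.queue.push(event);
    if (this.draining) return;
    this.draining = true;
    try {
      while (this.queue.length > 0) {
        const next = this.queue.shift()!;
        for (const handler of this.handlers) handler(next, this);
      }
    } finally {
      this.draining = false;
    }
  }
}

// The only place the shared state changes: a pure function from state + event.
function reduce(state: DeckState, event: DeckEvent): DeckState {
  switch (event.type) {
    case "transcript":
      return { ...state, transcript: [...state.transcript, event.text] };
    case "slide":
      return { ...state, slides: [...state.slides, { title: event.title }] };
    case "image":
      return {
        ...state,
        slides: state.slides.map((s, i) =>
          i === event.slideIndex ? { ...s, imageUrl: event.url } : s
        ),
      };
  }
}

function createDeck() {
  const bus = new EventBus();
  let state: DeckState = { transcript: [], slides: [] };

  // Store: the single writer to the shared state.
  bus.subscribe((event) => {
    state = reduce(state, event);
  });

  // Stand-in for a slide-building agent: one slide per transcript line.
  bus.subscribe((event, b) => {
    if (event.type === "transcript") b.publish({ type: "slide", title: event.text });
  });

  // Stand-in for an image agent: attaches a placeholder URL to each new slide.
  let slideCount = 0;
  bus.subscribe((event, b) => {
    if (event.type === "slide") {
      b.publish({ type: "image", slideIndex: slideCount, url: `placeholder://${slideCount}` });
      slideCount++;
    }
  });

  return { bus, getState: () => state };
}

// Usage: one spoken line flows through all three subscribers in order.
const demo = createDeck();
demo.bus.publish({ type: "transcript", text: "Cell biology" });
console.log(demo.getState().slides);
```

Because agents never touch the state directly and only publish events, the order of updates is determined by the queue alone. A real system with asynchronous model calls would need more than this, for example handling responses that arrive after the slide they refer to has changed.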

What's next

We plan to integrate Veo to generate video summaries of the presentation and add multi-speaker support for panel discussions.

Built With

gemini-2.5-flash
gemini-2.5-flash-image
gemini-multimodal-live-api
lucide-react
pptxgenjs
react
tailwind-css
typescript
vite

Updates

Abdul Basit started this project — Feb 09, 2026 12:27 PM EST
