Slidate

Inspiration

Most interactions with AI today happen through large blocks of text in chat interfaces. While this works for quick answers, it is a poor medium for learning, storytelling, and explanation. Humans understand complex ideas better through visuals, narration, and structured progression, which is the same reason explainer videos, diagrams, and whiteboards are so effective.

We asked a simple question:

What if AI didn’t answer questions with paragraphs, but instead created a live visual learning experience?

Instead of generating text, an AI could behave like a creative director and teacher, dynamically building a visual canvas that combines narration, diagrams, images, and interactive exploration.

That idea became Slidate, an AI-powered canvas where explanations unfold as a multimodal story, powered by Gemini’s interleaved output.

What it does

Slidate is an LLM-powered learning canvas that turns AI responses into interactive visual experiences.

Users interact with Slidate using voice, text, or images, and the AI generates a live canvas-based explanation combining narration, diagrams, images, animations, and structured slides.

Instead of reading long text responses, users watch and interact with a dynamic AI-generated explainer session.

Key features include:

Multimodal AI explanations

Using Gemini’s interleaved output, Slidate seamlessly combines:

text narration
AI-generated visuals
diagrams and SVG illustrations
animations and transitions
voiceover synced with visual elements

This allows the AI to tell a story visually, similar to an explainer video but generated in real time.

Interactive learning canvas

The core of Slidate is an AI-driven canvas workspace where the agent constructs explanations as visual slides.

The AI can:

create diagrams
generate charts and illustrations
highlight concepts
animate transitions
narrate explanations

This creates a visual-first AI experience instead of a chat interface.

Adjustable depth meter

Users can control how deeply the AI explains a topic using a depth slider.

The same query can produce:

quick overview
detailed conceptual explanation
advanced technical deep dive

This makes Slidate useful for beginners and experts alike.
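
As an illustration of how a depth slider could steer the same query toward three kinds of output, here is a minimal sketch. It is not Slidate's actual code: the 0 to 100 slider range, the band boundaries, the slide budgets, and the names `depthFromSlider` and `buildPrompt` are all assumptions made for the example.

```typescript
// Hypothetical depth levels; the names mirror the three outputs listed above.
type DepthLevel = "overview" | "conceptual" | "deep-dive";

interface DepthSettings {
  level: DepthLevel;
  maxSlides: number;   // assumed slide budget, not a value from Slidate
  instruction: string; // text appended to the model prompt
}

const DEPTH_PRESETS: Record<DepthLevel, Omit<DepthSettings, "level">> = {
  overview: { maxSlides: 3, instruction: "Give a quick overview in plain language." },
  conceptual: { maxSlides: 6, instruction: "Explain the underlying concepts in detail." },
  "deep-dive": { maxSlides: 10, instruction: "Give an advanced technical deep dive." },
};

// Map a slider value (assumed range 0..100) to one of three depth bands.
function depthFromSlider(value: number): DepthSettings {
  const safe = Number.isFinite(value) ? value : 0;
  const clamped = Math.min(100, Math.max(0, safe));
  const level: DepthLevel =
    clamped < 34 ? "overview" : clamped < 67 ? "conceptual" : "deep-dive";
  return { level, ...DEPTH_PRESETS[level] };
}

// Combine the user's query with the depth instruction for the model.
function buildPrompt(query: string, sliderValue: number): string {
  const depth = depthFromSlider(sliderValue);
  return `${query}\n\n${depth.instruction} Use at most ${depth.maxSlides} slides.`;
}
```

Keeping the presets in one table means the bands can be retuned without touching the prompt-building logic.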

Interactive learning detours

Users can click any term or sentence within a slide to explore it further.

Slidate opens a stacked canvas detour, allowing the AI to explain that concept in depth while preserving the context of the original topic.

Users can then navigate back to the main explanation seamlessly.

This creates a non-linear exploration experience similar to how humans naturally learn.
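
The stacked detour behavior described above maps naturally onto a stack of canvas frames. The sketch below is one possible shape for it, not the project's implementation; `DetourStack`, `CanvasFrame`, and their methods are hypothetical names.

```typescript
// One frame per open canvas: the root topic plus any detours on top of it.
interface CanvasFrame {
  topic: string;
  slideIndex: number; // where the user was when they left this frame
}

class DetourStack {
  private frames: CanvasFrame[];

  constructor(rootTopic: string) {
    this.frames = [{ topic: rootTopic, slideIndex: 0 }];
  }

  // The frame currently shown to the user (top of the stack).
  get current(): CanvasFrame {
    return this.frames[this.frames.length - 1];
  }

  goToSlide(index: number): void {
    this.current.slideIndex = index;
  }

  // Clicking a term pushes a new frame; the parent frame keeps its position.
  openDetour(term: string): CanvasFrame {
    const frame: CanvasFrame = { topic: term, slideIndex: 0 };
    this.frames.push(frame);
    return frame;
  }

  // Going back pops the detour and returns the parent frame.
  // The root frame is never popped.
  closeDetour(): CanvasFrame {
    if (this.frames.length > 1) this.frames.pop();
    return this.current;
  }

  // Topic breadcrumb, which could be sent to the model to preserve context.
  contextPath(): string[] {
    return this.frames.map((f) => f.topic);
  }
}
```

Because each frame remembers its own slide index, closing a detour returns the user to the slide they left.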

Visual problem solving

Users can upload photos or screenshots, such as homework problems or diagrams.

Slidate analyzes the image and generates a step-by-step visual explanation directly on the canvas, helping users understand the solution process.

How we built it

Slidate is built as a live multimodal agent system powered by Google AI technologies.

AI models

We use multiple Gemini models for different stages of the experience:

Gemini 2.0: handles query summarization, structuring, and depth calibration.
Gemini 3.1 Pro: drives the live learning experience by generating interleaved multimodal outputs including narration, visual instructions, and canvas content.

These outputs are interpreted by our canvas renderer to construct the interactive explanation.

Agent architecture

Slidate is implemented as a learning agent using the Google Agent Development Kit (ADK).

The agent acts as a creative director, orchestrating:

narration
visual layout
diagrams
slide sequencing
interactive detours

This allows the AI to construct a structured learning experience rather than returning static responses.

Frontend

The interactive learning interface is built with:

React.js for UI architecture
HeroUI for interface components
HTML Canvas for rendering dynamic visual content and animations

The canvas acts as the core storytelling surface where the agent populates slides, diagrams, and transitions.

Backend

The backend stack includes:

Hono.js for lightweight server infrastructure
Google Agent Development Kit (ADK) for building and orchestrating the Slidate learning agent, along with the tools and MCP integrations it needs
Drizzle ORM for type-safe database access
PostgreSQL / SQLite for storing sessions, canvas state, and interaction history
Google App Engine for scalable hosting on Google Cloud

This architecture enables the agent to generate and stream interleaved multimodal content in real time.

Challenges we ran into

Giving AI something beyond a chat thread to express itself

LLMs naturally generate text, but Slidate required the AI to describe visual structures and layout instructions that could be rendered dynamically on the canvas.

We had to design a structured output format that Gemini could generate reliably and that our canvas renderer could interpret to build a rich experience.
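
To make the idea of a structured format concrete, here is a hedged sketch of what such a schema might look like. The writeup does not publish the actual format, so the instruction kinds, field names, and the one-JSON-object-per-line convention below are assumptions chosen for illustration.

```typescript
// Hypothetical instruction schema: a discriminated union keyed on `kind`.
type CanvasInstruction =
  | { kind: "narration"; text: string }
  | { kind: "svg"; markup: string }
  | { kind: "image"; prompt: string }
  | { kind: "transition"; effect: "fade" | "slide" };

// Validate one untrusted value from the model; return null if it is malformed.
function parseInstruction(raw: unknown): CanvasInstruction | null {
  if (typeof raw !== "object" || raw === null) return null;
  const { kind, text, markup, prompt, effect } = raw as Record<string, unknown>;
  switch (kind) {
    case "narration":
      return typeof text === "string" ? { kind: "narration", text } : null;
    case "svg":
      return typeof markup === "string" ? { kind: "svg", markup } : null;
    case "image":
      return typeof prompt === "string" ? { kind: "image", prompt } : null;
    case "transition":
      if (effect === "fade" || effect === "slide") return { kind: "transition", effect };
      return null;
    default:
      return null;
  }
}

// Parse model output assumed to contain one JSON instruction per line.
// Lines that are not valid JSON or do not match the schema are skipped.
function parseInstructionLines(output: string): CanvasInstruction[] {
  const instructions: CanvasInstruction[] = [];
  for (const line of output.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    try {
      const parsed = parseInstruction(JSON.parse(trimmed));
      if (parsed) instructions.push(parsed);
    } catch {
      // Not JSON: ignore the line rather than break the whole canvas.
    }
  }
  return instructions;
}
```

Validating every instruction before rendering keeps a single malformed line from corrupting the rest of the slide.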

Synchronizing narration and visuals

Another challenge was aligning voice narration with visual elements so explanations felt natural and cohesive.

This required careful orchestration between AI output streams and frontend rendering logic.
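
One simple way to think about this alignment is a cue timeline: each visual element gets a start offset into the narration, and the renderer reveals every element whose offset has passed. The sketch below assumes no audio timestamps are available and estimates offsets from word counts at a fixed speaking rate; that heuristic, the 150 words-per-minute default, and the names `estimateCues` and `activeCues` are illustrative assumptions rather than how Slidate does it.

```typescript
// A visual element and the narration offset at which it should appear.
interface Cue {
  elementId: string;
  startMs: number;
}

interface NarratedSegment {
  elementId: string;
  narration: string;
}

// Estimate cue offsets from word counts, assuming a constant speaking rate.
function estimateCues(segments: NarratedSegment[], wordsPerMinute = 150): Cue[] {
  let cursorMs = 0;
  return segments.map((segment) => {
    const cue: Cue = { elementId: segment.elementId, startMs: cursorMs };
    const words = segment.narration.trim().split(/\s+/).filter(Boolean).length;
    cursorMs += Math.round((words / wordsPerMinute) * 60000);
    return cue;
  });
}

// Return the cues that should already be visible at a given playback time.
function activeCues(cues: Cue[], playbackMs: number): Cue[] {
  return cues.filter((cue) => cue.startMs <= playbackMs);
}
```

In a browser, `activeCues` could be called from an audio element's `timeupdate` handler with `audio.currentTime * 1000`, so visuals follow the actual playback position even when the user pauses or seeks.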

Maintaining coherence in multimodal output

Using Gemini’s interleaved output meant handling responses that included multiple content types in a single stream.

We built parsing and rendering layers to ensure the AI’s mixed outputs translated into clean, understandable visual experiences.
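
A rough sketch of such a parsing layer follows. The `Part` type is modeled loosely on the text and inline-data parts of a Gemini response, but it is defined locally here, and the grouping rule (each text part opens a new slide, following images attach to it) is an assumption for illustration, not the project's actual logic.

```typescript
// Locally defined part shape: either narration text or base64 image data.
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

interface Slide {
  narration: string;
  images: { mimeType: string; data: string }[];
}

// Group an interleaved list of parts into slides.
// Rule (assumed): a non-empty text part starts a new slide, and each image
// part attaches to the most recent slide.
function groupIntoSlides(parts: Part[]): Slide[] {
  const slides: Slide[] = [];
  for (const part of parts) {
    if ("text" in part) {
      const narration = part.text.trim();
      if (narration) slides.push({ narration, images: [] });
    } else {
      // An image that arrives before any text gets a slide of its own.
      if (slides.length === 0) slides.push({ narration: "", images: [] });
      slides[slides.length - 1].images.push(part.inlineData);
    }
  }
  return slides;
}
```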

What we learned

Building Slidate taught us that AI interfaces are evolving beyond chat.

While chat-based responses are useful, they are not the best format for:

education
storytelling
complex explanations

We also learned that multimodal AI becomes far more powerful when combined with structured interfaces like canvases, where the AI can express ideas visually rather than purely through text.

Most importantly, we discovered that AI agents can function as creative directors, orchestrating multiple forms of media into a cohesive narrative.

What's next for Slidate

Our vision is to push Slidate beyond a hackathon prototype into a full AI learning platform.

Future directions include:

Live AI whiteboard drawing

Allow the agent to draw diagrams step-by-step in real time, similar to a human teacher explaining on a whiteboard.

Collaborative learning

Enable multiple users to explore the same AI-generated canvas session together, allowing classrooms or teams to learn interactively.

Richer multimodal generation

Expand the agent’s capabilities to generate:

richer animations
dynamic simulations
video segments
interactive visualizations

Broader applications

While Slidate is powerful for learning, the same system could power:

product explainers
technical documentation walkthroughs
marketing storytelling
onboarding guides
interactive knowledge bases

Slidate reimagines how humans interact with AI, replacing static answers with living explanations.

Instead of reading responses, users experience them.

Built With

drizzle-orm
gemini
google-adk
google-cloud
honojs
react
sqlite/postgresql

Updates

Akshat Batra started this project — Mar 16, 2026 05:53 PM EDT
