Spatial Summarizer - Devpost Submission

Inspiration

Traditional documents are barriers for millions of students with ADHD, dyslexia, and autism. We wanted to transform how people learn by making educational content visual, interactive, and multi-sensory—turning flat PDFs into explorable 3D worlds.

What it does

Spatial Summarizer converts a document (PDF, PowerPoint, or plain text) into an interactive 3D experience with AI narration. Upload your biology notes, and explore a 3D brain where you can click on regions to hear explanations. The output is a single HTML file that is shareable, offline-capable, and accessible.
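To illustrate what "a single HTML file" can mean in practice, here is a minimal sketch of one way narration audio could travel inside the page: each clip is base64-encoded into a data URI on an `<audio>` element. The function names, the `narration-<region>` id scheme, and the MP3 format are assumptions made for this example, not a description of our actual packaging code.

```python
import base64
import html


def audio_data_uri(mp3_bytes: bytes) -> str:
    """Encode raw MP3 bytes as a data URI so no separate audio file is needed."""
    encoded = base64.b64encode(mp3_bytes).decode("ascii")
    return "data:audio/mpeg;base64," + encoded


def build_single_file_page(title: str, narrations: dict) -> str:
    """Return one HTML document; narrations maps a region id to MP3 bytes."""
    # One <audio> element per region (hypothetical id scheme: narration-<region>).
    audio_tags = "\n".join(
        '<audio id="narration-{}" src="{}"></audio>'.format(
            html.escape(region, quote=True), audio_data_uri(data)
        )
        for region, data in narrations.items()
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        '<head><meta charset="utf-8"><title>{}</title></head>\n'
        "<body>\n{}\n</body>\n"
        "</html>\n"
    ).format(html.escape(title), audio_tags)
```

For the page to work fully offline, the 3D library and the scene script would also have to be inlined in `<script>` tags; that part is left out of the sketch.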

How we built it

Backend: FastAPI with Python for document processing (PyMuPDF, python-pptx)
AI: Google Gemini, called through LangChain, analyzes content and designs 3D scenes
3D Rendering: Three.js primitives (no external models needed)
Audio: ElevenLabs TTS generates narrations
Frontend: Premium glassmorphism UI with vanilla CSS
Three generation modes: Model matching (V1), anatomical generator (V2), and LLM-generated primitives (V3)
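As a rough illustration of the LLM-generated primitives mode (V3), the sketch below turns a JSON scene description into Three.js code built only from primitive geometries. The JSON fields (`name`, `shape`, `position`, `size`, `color`, `narration`) and the function names are hypothetical, chosen for this example; the Three.js classes it emits (`SphereGeometry`, `BoxGeometry`, `CylinderGeometry`, `MeshStandardMaterial`, `Mesh`) are real, and the generated code assumes a `scene` variable already exists in the page.

```python
import json
from dataclasses import dataclass
from typing import Tuple

# Hypothetical spec vocabulary mapped to real Three.js geometry constructors.
GEOMETRIES = {
    "sphere": "new THREE.SphereGeometry({size}, 32, 16)",
    "box": "new THREE.BoxGeometry({size}, {size}, {size})",
    "cylinder": "new THREE.CylinderGeometry({size}, {size}, {size} * 2, 32)",
}


@dataclass
class Primitive:
    name: str
    shape: str
    position: Tuple[float, float, float]
    size: float
    color: str
    narration: str


def primitive_to_js(p: Primitive) -> str:
    """Emit a JavaScript block that adds one clickable mesh to `scene`."""
    if p.shape not in GEOMETRIES:
        raise ValueError(f"unsupported shape: {p.shape}")
    geometry = GEOMETRIES[p.shape].format(size=p.size)
    x, y, z = p.position
    # userData carries the label and narration text for click handlers.
    user_data = json.dumps({"name": p.name, "narration": p.narration})
    return (
        "{\n"
        f"  const mesh = new THREE.Mesh({geometry}, "
        f"new THREE.MeshStandardMaterial({{ color: {json.dumps(p.color)} }}));\n"
        f"  mesh.position.set({x}, {y}, {z});\n"
        f"  mesh.userData = {user_data};\n"
        "  scene.add(mesh);\n"
        "}"
    )


def scene_to_js(spec_json: str) -> str:
    """Parse the (assumed) LLM output and emit code for every primitive."""
    items = json.loads(spec_json)
    primitives = [
        Primitive(i["name"], i["shape"], tuple(i["position"]),
                  i["size"], i["color"], i["narration"])
        for i in items
    ]
    return "\n".join(primitive_to_js(p) for p in primitives)
```

Rejecting unknown shapes up front keeps a malformed model response from producing broken JavaScript. Text that may contain `</script>` would need extra escaping before being placed inside a script tag.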

Challenges we ran into

Getting the LLM to design meaningful spatial relationships instead of random placements
Balancing visual appeal with accessibility requirements
Generating complete, self-contained HTML files with embedded interactions
Creating content-aware particle systems for anatomical visualizations
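One way to approach the spatial-placement challenge is to check a proposed layout before rendering it and feed any problems back into the next prompt. The sketch below is an illustration under stated assumptions, not our exact method: it treats every object as a sphere with a `radius`, uses a cubic scene of half-width `bounds`, and the function name and dictionary keys are hypothetical.

```python
import math
from itertools import combinations


def find_layout_problems(objects, bounds=10.0):
    """Return human-readable problems for a proposed layout.

    objects: list of dicts with 'name', 'position' (x, y, z) and 'radius'.
    bounds: half-width of the cubic scene volume (an assumed convention).
    """
    problems = []
    # An object is out of bounds if it pokes past the cube on any axis.
    for obj in objects:
        if any(abs(c) + obj["radius"] > bounds for c in obj["position"]):
            problems.append(f"{obj['name']} extends outside the scene bounds")
    # Two bounding spheres overlap when their centers are closer than
    # the sum of their radii.
    for a, b in combinations(objects, 2):
        distance = math.dist(a["position"], b["position"])
        if distance < a["radius"] + b["radius"]:
            problems.append(f"{a['name']} overlaps {b['name']}")
    return problems
```

An empty result means the layout passes these two checks. It does not show that the arrangement is meaningful, only that it is not obviously broken.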