Inspiration

We wanted a spatial study flow that feels less like a flat web app and more like having materials around you in XR: a main “command center” for review, with flashcards and a tutor as separate panels you can place and focus on. The spark was mixing quick capture from a page (scan) with AI that turns noise into structured study assets so you spend time learning, not reformatting notes.

What it does

parse lets you pick a preset study image (image upload is in progress), runs a vision + language pipeline to extract a title, summary, concepts, flashcards, key terms, and takeaways, then opens a dashboard with a quiz and a breakdown. It automatically launches WebSpatial windows for flashcards and an AI tutor that answers from the scanned content. The tutor supports voice input and spoken replies via ElevenLabs TTS. It’s packaged as a Vite PWA with a FastAPI backend.
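
To make the extracted fields concrete, here is a minimal sketch of how a scan result could be modeled on the backend. The class names, field names, and the quiz_questions helper are assumptions for illustration, not the project's actual schema; the only grounded part is the list of extracted fields.

```python
from dataclasses import dataclass, field


@dataclass
class Flashcard:
    # Hypothetical shape: one prompt side and one answer side.
    front: str
    back: str


@dataclass
class ScanResult:
    # Field names mirror the extracted items listed above; the exact
    # names used by the real API are an assumption.
    title: str
    summary: str
    concepts: list[str] = field(default_factory=list)
    flashcards: list[Flashcard] = field(default_factory=list)
    key_terms: list[str] = field(default_factory=list)
    takeaways: list[str] = field(default_factory=list)


def quiz_questions(result: ScanResult) -> list[tuple[str, str]]:
    """Turn flashcards into (question, answer) pairs.

    Deriving the quiz from flashcards is an illustrative assumption.
    """
    return [(card.front, card.back) for card in result.flashcards]
```

Keeping every list field defaulted to empty means a partially filled result can still be rendered by the dashboard.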

How we built it

Frontend: React + TypeScript + Vite, routed parse flows, WebSpatial initScene + window.open for satellite scenes, session/BroadcastChannel sync, and a proxy to the API under /api.

Backend: FastAPI with Google Gemini for /scan, /scan-url, /explain, and /ask, plus httpx for safe image fetch; ElevenLabs behind POST /tts so keys stay server-side.
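
One way to keep the image fetch safe is to check the URL against a host allowlist before any request is made. The sketch below assumes an https-only policy and uses placeholder hosts; ALLOWED_HOSTS and is_allowed_image_url are hypothetical names, and the real allowlist is not shown in this write-up.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the real hosts are not listed in this write-up.
ALLOWED_HOSTS = {"images.example.com", "cdn.example.org"}


def is_allowed_image_url(url: str) -> bool:
    """Return True only for https URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    # hostname is already lowercased by urlparse and excludes any
    # userinfo or port, so "allowed.host@evil.net" tricks do not match.
    host = parsed.hostname or ""
    return host in ALLOWED_HOSTS
```

A check like this would run before the httpx request, so a rejected URL never triggers a network call.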

Learn more here: https://opendeep.wiki/amanibobo/parse/overview

Challenges we ran into

WebSpatial today mainly exposes size hints, not full 3D placement, so we leaned on UI layout and relative window sizes to suggest left / center / right. Setting up the PICO emulator, PWA sites, and Android Studio was also a challenge.

Accomplishments that we're proud of

Shipping an end-to-end loop: scan → structured data → quiz + XR panels → voice tutor with TTS. Keeping the architecture clear: thin SPA, one API surface, explicit allowlist for URL scan. Making spatial multi-window feel intentional rather than three random tabs.

What we learned

Generative models are great for structuring messy inputs into flashcards and summaries, but you still need validation, fallbacks, and guardrails.
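
As an illustration of that point, here is a minimal sketch of validating model output before it reaches the UI. It assumes the model is asked to return JSON with title, summary, and flashcards keys; those key names, the fallback values, and the parse_model_output function are hypothetical, not the project's actual code.

```python
import json


def _fallback() -> dict:
    # Fresh dict each call so callers can mutate the result safely.
    return {"title": "Untitled scan", "summary": "", "flashcards": []}


def parse_model_output(raw: str) -> dict:
    """Parse model text as JSON and keep only well-formed fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return _fallback()  # fallback: the model returned non-JSON text
    if not isinstance(data, dict):
        return _fallback()

    result = _fallback()
    title = data.get("title")
    if isinstance(title, str) and title.strip():
        result["title"] = title.strip()
    summary = data.get("summary")
    if isinstance(summary, str):
        result["summary"] = summary.strip()

    cards = data.get("flashcards")
    if isinstance(cards, list):
        for card in cards:
            if not isinstance(card, dict):
                continue  # guardrail: skip entries that are not objects
            front, back = card.get("front"), card.get("back")
            # guardrail: drop cards with a missing or empty side
            if isinstance(front, str) and isinstance(back, str):
                if front.strip() and back.strip():
                    result["flashcards"].append(
                        {"front": front.strip(), "back": back.strip()}
                    )
    return result
```

With this shape, a malformed response degrades to an empty but renderable result instead of breaking the dashboard.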

What's next for parse

Next up: real camera and file upload polish, streaming tutor responses, better offline/PWA caching, and user accounts with saved sessions.

Built With
