# Anee Explainee
Turn any coding task into a narrated, animated walkthrough in seconds — powered by Gemini 3.
Every coding tutorial has the same problem: a screen recording you can't pause to copy the code, crafted over hours of recording and editing. Anee Explainee eliminates all of that. Describe any task. Get a fully narrated, animated code walkthrough. Share the link. Done.
## Who is this for?
- Developers who want to document or share a solution without recording a video
- Teachers and bootcamp instructors turning coding exercises into interactive lessons
- Tech content creators who produce walkthroughs but hate the editing grind
- Teams onboarding new engineers with living, shareable code explanations
## The magic moment
Type a coding task. Within seconds, a multi-agent AI pipeline:
- Decomposes your task into a structured spec
- Generates themed, syntax-highlighted code — split into logical segments
- Writes and voices a narration for each segment, synchronized to the typing animation
- Produces a YouTube-style thumbnail and a technical diagram
- Writes a full story article with an embedded interactive player
- Publishes a shareable permalink — no login required to watch
The code in the player is real, selectable text — not a video frame. Viewers can pause, copy, and replay any segment.
## Why it wins
- Fast: the full pipeline — spec, code, audio, visuals, story — streams in real time
- Multimodal output: code + CSS theme + PCM audio + PNG visuals + HTML article, all in one run
- Embeddable: one `<script>` tag drops the interactive player into any page
- Multi-language: narration in English, German, Spanish, Italian, Chinese
- Agentic: seven specialized agents run autonomously; the orchestrator adapts the execution path based on user intent (task prompt vs. paste-your-own-code mode)
## Live examples
These are real walkthroughs generated by the app — each in a different language:
| Task | Narration language | Story & player |
|---|---|---|
| Nginx-like reverse proxy native to Kubernetes with annotations | English | View story |
| User events DB churn analysis with JSON output | German | View story |
| Users database scaffold and simple admin panel | Italian | View story |
| Image and video carousel with vertical and horizontal modes | Chinese | View story |
## Architecture
The backend is a multi-agent orchestration pipeline (Go, Hexagonal / Ports & Adapters) where seven specialized agents execute autonomously, each backed by a purpose-selected Gemini model:
| Agent | Model | Role |
|---|---|---|
| Spec Agent | Gemini 3 Flash + Google Search | Task decomposition — grounded in up-to-date docs and best practices |
| Style Agent | Gemini 3 Flash | Dynamic CSS theming for the code viewer |
| Code Agent | Gemini 3 Flash | Schema-constrained segmented code generation |
| Narrator Agent | Gemini 3 Flash | Per-segment voiceover scripts in the selected language |
| Voice Agent | Gemini TTS | Batched TTS + LLM-driven audio timestamp detection for sync |
| Visual Agent | Gemini 3.1 Flash Image | Thumbnail + technical diagram (parallel goroutines) |
| Story Agent | Gemini 3 Flash + Google Search | Blog-style article — grounded in current documentation |
Agents chain through an event-driven EventSink (typed events: `stage`, `spec`, `css`, `segment`, `audio`, `story`, `visuals`, `code_done`), streaming each result to the browser over WebSocket as it arrives. Image generation runs concurrently with `sync.WaitGroup`; batched TTS uses rate-limited concurrency for segment alignment.
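In the hexagonal layout, the EventSink is the port every agent writes to. A minimal sketch of what that port and a WebSocket adapter could look like, assuming `gorilla/websocket` on the adapter side (all type and constant names here are illustrative, derived only from the event list above, not the project's actual code):

```go
package pipeline

import "github.com/gorilla/websocket"

// EventKind mirrors the typed events the pipeline emits.
type EventKind string

const (
	EvStage    EventKind = "stage"
	EvSpec     EventKind = "spec"
	EvCSS      EventKind = "css"
	EvSegment  EventKind = "segment"
	EvAudio    EventKind = "audio"
	EvStory    EventKind = "story"
	EvVisuals  EventKind = "visuals"
	EvCodeDone EventKind = "code_done"
)

type Event struct {
	Kind    EventKind   `json:"kind"`
	Payload interface{} `json:"payload"`
}

// EventSink is the port each agent writes to.
type EventSink interface {
	Emit(e Event) error
}

// wsSink is one possible adapter: each event is pushed to the
// browser as soon as the agent produces it.
type wsSink struct{ conn *websocket.Conn }

func (s *wsSink) Emit(e Event) error {
	return s.conn.WriteJSON(e)
}
```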
## Why Gemini 3?
A single Gemini model family covers every stage of the pipeline: reasoning over the task, generating code, producing images, and voicing narration.
Gemini 3.1 Flash Image generates both visual assets (thumbnail + diagram) in a single parallel call from the narration summary — no human prompt engineering needed per job.
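A sketch of how the two image calls might fan out, matching the "parallel goroutines" note in the agent table; `generateImage` is a hypothetical stand-in for the actual Gemini image call, and the prompts are illustrative:

```go
package visuals

import (
	"context"
	"errors"
	"sync"
)

// generateImage stands in for the real Gemini image request.
func generateImage(ctx context.Context, prompt string) ([]byte, error) {
	// ... call the image model here ...
	return nil, errors.New("not implemented in this sketch")
}

// GenerateVisuals produces the thumbnail and the diagram concurrently
// from the narration summary and waits for both before returning.
func GenerateVisuals(ctx context.Context, summary string) (thumb, diagram []byte, err error) {
	var wg sync.WaitGroup
	var thumbErr, diagErr error

	wg.Add(2)
	go func() {
		defer wg.Done()
		thumb, thumbErr = generateImage(ctx, "YouTube-style thumbnail for: "+summary)
	}()
	go func() {
		defer wg.Done()
		diagram, diagErr = generateImage(ctx, "technical architecture diagram for: "+summary)
	}()
	wg.Wait()

	return thumb, diagram, errors.Join(thumbErr, diagErr)
}
```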
Gemini TTS produces the narration, and a second Gemini 2.5 Flash call detects audio timestamps to align each audio chunk with its code segment — keeping voice and typing animation in sync.
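Once those timestamps are detected, the sync itself reduces to simple arithmetic. A hypothetical sketch (the `SegmentTiming` type and `TypingSpeed` helper are invented for illustration; the project's real data model isn't shown here):

```go
package timing

// SegmentTiming pairs a code segment with the narration window (in
// seconds) reported by the timestamp-detection call.
type SegmentTiming struct {
	SegmentID  string
	AudioStart float64
	AudioEnd   float64
}

// TypingSpeed returns the characters-per-second rate at which the
// player should type a segment so the animation ends together with
// the segment's narration chunk.
func TypingSpeed(codeLen int, t SegmentTiming) float64 {
	dur := t.AudioEnd - t.AudioStart
	if dur <= 0 {
		return float64(codeLen) // degenerate chunk: type it out in one second
	}
	return float64(codeLen) / dur
}
```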
Gemini 3 Flash's structured JSON output (`genai.Schema` + `ResponseMIMEType: "application/json"`) is what makes reliable agentic data flow between pipeline stages possible: each agent's output is a schema-validated contract for the next. Without deterministic structured output, it would be impossible to chain seven agents without hallucinations leaking between steps.
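A minimal sketch of that pattern with the `google.golang.org/genai` Go SDK, using a toy spec schema and a placeholder model ID (the project's real agent contracts are not part of this write-up):

```go
package main

import (
	"context"
	"fmt"
	"log"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	// An empty config picks up the API key from the environment.
	client, err := genai.NewClient(ctx, &genai.ClientConfig{})
	if err != nil {
		log.Fatal(err)
	}

	// Toy spec schema: a title plus a list of segment descriptions.
	schema := &genai.Schema{
		Type: genai.TypeObject,
		Properties: map[string]*genai.Schema{
			"title":    {Type: genai.TypeString},
			"segments": {Type: genai.TypeArray, Items: &genai.Schema{Type: genai.TypeString}},
		},
		Required: []string{"title", "segments"},
	}

	resp, err := client.Models.GenerateContent(ctx,
		"gemini-2.0-flash", // placeholder model ID
		genai.Text("Decompose this task into a spec: build a REST rate limiter in Go"),
		&genai.GenerateContentConfig{
			ResponseMIMEType: "application/json",
			ResponseSchema:   schema,
		},
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Text()) // schema-conforming JSON
}
```

Because the response is constrained to the schema, the next agent can unmarshal it directly into a typed struct instead of sanitizing free-form model text.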
## Responsible AI
Every generation request across all seven agents runs with explicit `SafetySettings` (Harassment, HateSpeech, SexuallyExplicit, DangerousContent, all at `BlockMediumAndAbove`). Unsafe content is filtered at the model level before it can enter the pipeline or reach the user.
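In the Go SDK this amounts to a shared slice of safety settings passed into each request's `GenerateContentConfig`. A sketch, assuming `google.golang.org/genai`:

```go
package safety

import "google.golang.org/genai"

// Settings applies BlockMediumAndAbove across all four harm
// categories named above; constant names follow the
// google.golang.org/genai SDK.
func Settings() []*genai.SafetySetting {
	threshold := genai.HarmBlockThresholdBlockMediumAndAbove
	return []*genai.SafetySetting{
		{Category: genai.HarmCategoryHarassment, Threshold: threshold},
		{Category: genai.HarmCategoryHateSpeech, Threshold: threshold},
		{Category: genai.HarmCategorySexuallyExplicit, Threshold: threshold},
		{Category: genai.HarmCategoryDangerousContent, Threshold: threshold},
	}
}
```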
The Spec Agent and Story Agent additionally use the Gemini Google Search grounding tool, so task specifications and story articles can reference current library documentation and best practices rather than only the model's training cut-off knowledge.
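Attaching the grounding tool is a small addition to the request config. A sketch of how the Spec Agent might do it, again assuming the `google.golang.org/genai` SDK, with an illustrative prompt and placeholder model ID:

```go
package grounding

import (
	"context"

	"google.golang.org/genai"
)

// GroundedSpec asks the model for a spec with the Google Search tool
// enabled, so the response can draw on current documentation.
func GroundedSpec(ctx context.Context, client *genai.Client, task string) (string, error) {
	resp, err := client.Models.GenerateContent(ctx,
		"gemini-2.0-flash", // placeholder model ID
		genai.Text("Write a spec grounded in current docs for: "+task),
		&genai.GenerateContentConfig{
			Tools: []*genai.Tool{{GoogleSearch: &genai.GoogleSearch{}}},
		},
	)
	if err != nil {
		return "", err
	}
	return resp.Text(), nil
}
```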
## Built With
- cloud-run
- firestore
- gcr
- gemini
- gemini-sdk
- golang
- next.js
- s3
- typescript
- websockets
