Anee Explainee

Turn any coding task into a narrated, animated walkthrough in seconds — powered by Gemini 3.

Every coding tutorial has the same problem: a screen recording you can't pause to copy code from, produced only after hours of recording and editing. Anee Explainee eliminates all of that. Describe any task. Get a fully narrated, animated code walkthrough. Share the link. Done.


Who is this for?

  • Developers who want to document or share a solution without recording a video
  • Teachers and bootcamp instructors turning coding exercises into interactive lessons
  • Tech content creators who produce walkthroughs but hate the editing grind
  • Teams onboarding new engineers with living, shareable code explanations

The magic moment

Type a coding task. Within seconds, a multi-agent AI pipeline:

  1. Decomposes your task into a structured spec
  2. Generates themed, syntax-highlighted code — split into logical segments
  3. Writes and voices a narration for each segment, synchronized to the typing animation
  4. Produces a YouTube-style thumbnail and a technical diagram
  5. Writes a full story article with an embedded interactive player
  6. Publishes a shareable permalink — no login required to watch

The code in the player is real, selectable text — not a video frame. Viewers can pause, copy, and replay any segment.
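
Conceptually, the pipeline above is a chain of stages, each consuming the previous stage's structured output. A minimal sketch in Go, with illustrative stage types and stub logic (these are not the app's actual structs or agents):

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative stage types -- not the app's real structs.
type Spec struct{ Steps []string }
type Segment struct{ Code, Narration string }

// specAgent decomposes a free-form task into a structured spec.
func specAgent(task string) Spec {
	return Spec{Steps: strings.Split(task, " and ")}
}

// codeAgent turns each spec step into a code segment.
func codeAgent(spec Spec) []Segment {
	segs := make([]Segment, 0, len(spec.Steps))
	for _, s := range spec.Steps {
		segs = append(segs, Segment{Code: "// TODO: " + s})
	}
	return segs
}

// narratorAgent attaches a voiceover script to each segment.
func narratorAgent(segs []Segment) []Segment {
	for i := range segs {
		segs[i].Narration = "Next, we " + strings.TrimPrefix(segs[i].Code, "// TODO: ")
	}
	return segs
}

func main() {
	segs := narratorAgent(codeAgent(specAgent("parse flags and start the server")))
	for _, s := range segs {
		fmt.Println(s.Code, "|", s.Narration)
	}
}
```

Each stage's output is the next stage's input, which is what lets the real pipeline stream partial results as soon as a stage finishes.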


Why it wins

  • Fast: the full pipeline — spec, code, audio, visuals, story — streams in real time
  • Multimodal output: code + CSS theme + PCM audio + PNG visuals + HTML article, all in one run
  • Embeddable: one <script> tag drops the interactive player into any page
  • Multi-language: narration in English, German, Spanish, Italian, Chinese
  • Agentic: seven specialized agents run autonomously; the orchestrator adapts the execution path based on user intent (task prompt vs. paste-your-own-code mode)

Live examples

These are real walkthroughs generated by the app — each in a different language:

| Task | Narration | Story & player |
| --- | --- | --- |
| Nginx-like reverse proxy native to Kubernetes with annotations | English | View story |
| User events DB churn analysis with JSON output | German | View story |
| Users database scaffold and simple admin panel | Italian | View story |
| Image and video carousel with vertical and horizontal modes | Chinese | View story |

Try it yourself →


Architecture

The backend is a multi-agent orchestration pipeline (Go, Hexagonal / Ports & Adapters) where seven specialized agents execute autonomously, each backed by a purpose-selected Gemini model:

| Agent | Model | Role |
| --- | --- | --- |
| Spec Agent | Gemini 3 Flash + Google Search | Task decomposition, grounded in up-to-date docs and best practices |
| Style Agent | Gemini 3 Flash | Dynamic CSS theming for the code viewer |
| Code Agent | Gemini 3 Flash | Schema-constrained segmented code generation |
| Narrator Agent | Gemini 3 Flash | Per-segment voiceover scripts in the selected language |
| Voice Agent | Gemini TTS | Batched TTS + LLM-driven audio timestamp detection for sync |
| Visual Agent | Gemini 3.1 Flash Image | Thumbnail + technical diagram (parallel goroutines) |
| Story Agent | Gemini 3 Flash + Google Search | Blog-style article, grounded in current documentation |

Agents chain through an event-driven EventSink (typed events: stage, spec, css, segment, audio, story, visuals, code_done) — streaming each result to the browser over WebSocket as it arrives. Image generation runs concurrently with sync.WaitGroup; batched TTS uses rate-limited concurrency for segment alignment.
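
The typed-event pattern can be sketched with a Go channel standing in for the real WebSocket transport. The event names below match the list above; everything else (types, payloads) is illustrative:

```go
package main

import "fmt"

// EventType mirrors the pipeline's typed event names.
type EventType string

const (
	EvStage    EventType = "stage"
	EvSpec     EventType = "spec"
	EvSegment  EventType = "segment"
	EvAudio    EventType = "audio"
	EvCodeDone EventType = "code_done"
)

type Event struct {
	Type    EventType
	Payload string
}

// EventSink decouples agents from the transport; the real app
// forwards each event to the browser over WebSocket.
type EventSink chan Event

// runPipeline emits events in stage order as results become available.
func runPipeline(sink EventSink) {
	defer close(sink)
	sink <- Event{EvStage, "spec"}
	sink <- Event{EvSpec, `{"steps": 3}`}
	sink <- Event{EvSegment, "func main() { ... }"}
	sink <- Event{EvAudio, "segment-0.pcm"}
	sink <- Event{EvCodeDone, ""}
}

func main() {
	sink := make(EventSink, 8)
	go runPipeline(sink)
	// Stand-in for the WebSocket writer: stream events as they arrive.
	for ev := range sink {
		fmt.Printf("%s: %s\n", ev.Type, ev.Payload)
	}
}
```

Because the sink is a channel, the browser-facing writer can start streaming the first segment while later agents are still running.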


Why Gemini 3?

Gemini's strong reasoning and native multimodal generation let one model family power every stage of the pipeline: code, audio, and images alike.

Gemini 3.1 Flash Image generates both visual assets (thumbnail + diagram) in a single parallel call from the narration summary — no human prompt engineering needed per job.

Gemini TTS produces the narration, and a second Gemini 2.5 Flash call detects audio timestamps to align each audio chunk with its code segment — keeping voice and typing animation in sync.
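
Rate-limited batched concurrency of this kind is commonly implemented with a semaphore channel. A sketch with a stubbed TTS call (the function names are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// synthesize stands in for the real Gemini TTS call.
func synthesize(script string) string { return "pcm:" + script }

// synthesizeAll voices every segment concurrently, but never runs
// more than maxInFlight requests at once (the rate limit).
func synthesizeAll(scripts []string, maxInFlight int) []string {
	out := make([]string, len(scripts))
	sem := make(chan struct{}, maxInFlight)
	var wg sync.WaitGroup
	for i, s := range scripts {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot
		go func(i int, s string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			out[i] = synthesize(s)   // indexed writes keep audio aligned with segments
		}(i, s)
	}
	wg.Wait()
	return out
}

func main() {
	audio := synthesizeAll([]string{"intro", "loop", "outro"}, 2)
	fmt.Println(audio)
}
```

Writing by index rather than appending to a shared slice is what keeps each audio chunk paired with its code segment regardless of completion order.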

Gemini 3 Flash's structured JSON output (genai.Schema + ResponseMIMEType: "application/json") is what makes reliable agentic data flow possible between pipeline stages: each agent's output is a schema-validated contract for the next. Without deterministic structured output, hallucinations would leak between the seven chained agents.
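
The contract side of this can be sketched with standard-library JSON validation. The field names below are hypothetical, not the app's real schema; in the actual SDK, passing a matching ResponseSchema with ResponseMIMEType "application/json" constrains the model to emit this shape:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SpecOutput is an illustrative contract between the Spec Agent and
// the Code Agent -- the real field names are the app's, not these.
type SpecOutput struct {
	Title    string   `json:"title"`
	Language string   `json:"language"`
	Steps    []string `json:"steps"`
}

// parseSpec checks that a model response honors the contract
// before it is handed to the next agent.
func parseSpec(raw []byte) (SpecOutput, error) {
	var spec SpecOutput
	if err := json.Unmarshal(raw, &spec); err != nil {
		return spec, fmt.Errorf("spec is not valid JSON: %w", err)
	}
	if spec.Title == "" || len(spec.Steps) == 0 {
		return spec, fmt.Errorf("spec is missing required fields")
	}
	return spec, nil
}

func main() {
	raw := []byte(`{"title":"Reverse proxy","language":"go","steps":["listen","forward"]}`)
	spec, err := parseSpec(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %d steps\n", spec.Title, len(spec.Steps))
}
```

Validating at the boundary means a malformed model response fails the stage immediately instead of propagating garbage downstream.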

Responsible AI

Every generation request across all seven agents runs with explicit SafetySettings (Harassment, HateSpeech, SexuallyExplicit, DangerousContent — BlockMediumAndAbove). Unsafe content is filtered at the model level before it can enter the pipeline or reach the user. The Spec Agent and Story Agent additionally use the Gemini Google Search grounding tool, so task specifications and story articles can reference current library documentation and best practices — not just the model's training cut-off knowledge.
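
A sketch of how one shared safety configuration can be attached to every agent's request. The struct here is an illustrative stand-in for the SDK's safety-setting type; the category and threshold strings are the Gemini API's enum names for the four categories listed above:

```go
package main

import "fmt"

// SafetySetting is an illustrative stand-in for the SDK's type.
type SafetySetting struct {
	Category  string
	Threshold string
}

// defaultSafety builds the one list every agent attaches to its
// requests, covering the four categories named above.
func defaultSafety() []SafetySetting {
	categories := []string{
		"HARM_CATEGORY_HARASSMENT",
		"HARM_CATEGORY_HATE_SPEECH",
		"HARM_CATEGORY_SEXUALLY_EXPLICIT",
		"HARM_CATEGORY_DANGEROUS_CONTENT",
	}
	settings := make([]SafetySetting, 0, len(categories))
	for _, c := range categories {
		settings = append(settings, SafetySetting{Category: c, Threshold: "BLOCK_MEDIUM_AND_ABOVE"})
	}
	return settings
}

func main() {
	for _, s := range defaultSafety() {
		fmt.Println(s.Category, "->", s.Threshold)
	}
}
```

Centralizing the list means no agent can accidentally ship a request with weaker filtering than the rest of the pipeline.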
