Anee Explainee

Turn any coding task into a narrated, animated walkthrough in seconds — powered by Gemini 3.

Every coding tutorial has the same problem: a screen recording you can't pause to copy code from, produced only after hours of recording and editing. Anee Explainee eliminates all of that. Describe any task. Get a fully narrated, animated code walkthrough. Share the link. Done.


Who is this for?

  • Developers who want to document or share a solution without recording a video
  • Teachers and bootcamp instructors turning coding exercises into interactive lessons
  • Tech content creators who produce walkthroughs but hate the editing grind
  • Teams onboarding new engineers with living, shareable code explanations

The magic moment

Type a coding task. Within seconds, a multi-agent AI pipeline:

  1. Decomposes your task into a structured spec
  2. Generates themed, syntax-highlighted code — split into logical segments
  3. Writes and voices a narration for each segment, synchronized to the typing animation
  4. Produces a YouTube-style thumbnail and a technical diagram
  5. Writes a full story article with an embedded interactive player
  6. Publishes a shareable permalink — no login required to watch

The code in the player is real, selectable text — not a video frame. Viewers can pause, copy, and replay any segment.
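
Conceptually, the pipeline above is a chain of stages, each consuming the previous stage's structured output. A minimal sketch in Go, with illustrative stage types and stub logic (these are not the app's actual structs or agents):

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative stage types -- not the app's real structs.
type Spec struct{ Steps []string }
type Segment struct{ Code, Narration string }

// specAgent decomposes a free-form task into a structured spec.
func specAgent(task string) Spec {
	return Spec{Steps: strings.Split(task, " and ")}
}

// codeAgent turns each spec step into a code segment.
func codeAgent(spec Spec) []Segment {
	segs := make([]Segment, 0, len(spec.Steps))
	for _, s := range spec.Steps {
		segs = append(segs, Segment{Code: "// TODO: " + s})
	}
	return segs
}

// narratorAgent attaches a voiceover script to each segment.
func narratorAgent(segs []Segment) []Segment {
	for i := range segs {
		segs[i].Narration = "Next, we " + strings.TrimPrefix(segs[i].Code, "// TODO: ")
	}
	return segs
}

func main() {
	segs := narratorAgent(codeAgent(specAgent("parse flags and start the server")))
	for _, s := range segs {
		fmt.Println(s.Code, "|", s.Narration)
	}
}
```

Each stage's output is the next stage's input, which is what lets the real pipeline stream partial results as soon as a stage finishes.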


Why it wins

  • Fast: the full pipeline — spec, code, audio, visuals, story — streams in real time
  • Multimodal output: code + CSS theme + PCM audio + PNG visuals + HTML article, all in one run
  • Embeddable: one <script> tag drops the interactive player into any page
  • Multi-language: narration in English, German, Spanish, Italian, Chinese
  • Agentic: seven specialized agents run autonomously; the orchestrator adapts the execution path based on user intent (task prompt vs. paste-your-own-code mode)

Live examples

These are real walkthroughs generated by the app — each in a different language:

| Task | Narration | Story & player |
| --- | --- | --- |
| Nginx-like reverse proxy native to Kubernetes with annotations | English | View story |
| User events DB churn analysis with JSON output | German | View story |
| Users database scaffold and simple admin panel | Italian | View story |
| Image and video carousel with vertical and horizontal modes | Chinese | View story |

Try it yourself →


Architecture

The backend is a multi-agent orchestration pipeline (Go, Hexagonal / Ports & Adapters) where seven specialized agents execute autonomously, each backed by a purpose-selected Gemini model:

| Agent | Model | Role |
| --- | --- | --- |
| Spec Agent | Gemini 3 Flash + Google Search | Task decomposition, grounded in up-to-date docs and best practices |
| Style Agent | Gemini 3 Flash | Dynamic CSS theming for the code viewer |
| Code Agent | Gemini 3 Flash | Schema-constrained segmented code generation |
| Narrator Agent | Gemini 3 Flash | Per-segment voiceover scripts in the selected language |
| Voice Agent | Gemini TTS | Batched TTS + LLM-driven audio timestamp detection for sync |
| Visual Agent | Gemini 3.1 Flash Image | Thumbnail + technical diagram (parallel goroutines) |
| Story Agent | Gemini 3 Flash + Google Search | Blog-style article, grounded in current documentation |

Agents chain through an event-driven EventSink (typed events: stage, spec, css, segment, audio, story, visuals, code_done) — streaming each result to the browser over WebSocket as it arrives. Image generation runs concurrently with sync.WaitGroup; batched TTS uses rate-limited concurrency for segment alignment.
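
The typed-event pattern can be sketched with a Go channel standing in for the real WebSocket transport. The event names below match the list above; everything else (types, payloads) is illustrative:

```go
package main

import "fmt"

// EventType mirrors the pipeline's typed event names.
type EventType string

const (
	EvStage    EventType = "stage"
	EvSpec     EventType = "spec"
	EvSegment  EventType = "segment"
	EvAudio    EventType = "audio"
	EvCodeDone EventType = "code_done"
)

type Event struct {
	Type    EventType
	Payload string
}

// EventSink decouples agents from the transport; the real app
// forwards each event to the browser over WebSocket.
type EventSink chan Event

// runPipeline emits events in stage order as results become available.
func runPipeline(sink EventSink) {
	defer close(sink)
	sink <- Event{EvStage, "spec"}
	sink <- Event{EvSpec, `{"steps": 3}`}
	sink <- Event{EvSegment, "func main() { ... }"}
	sink <- Event{EvAudio, "segment-0.pcm"}
	sink <- Event{EvCodeDone, ""}
}

func main() {
	sink := make(EventSink, 8)
	go runPipeline(sink)
	// Stand-in for the WebSocket writer: stream events as they arrive.
	for ev := range sink {
		fmt.Printf("%s: %s\n", ev.Type, ev.Payload)
	}
}
```

Because the sink is a channel, the browser-facing writer can start streaming the first segment while later agents are still running.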


Why Gemini 3?

Gemini's strong reasoning and native multimodal generation let one model family power every stage of the pipeline: code, audio, and images alike.

Gemini 3.1 Flash Image generates both visual assets (thumbnail + diagram) in a single parallel call from the narration summary — no human prompt engineering needed per job.

Gemini TTS produces the narration, and a second Gemini 2.5 Flash call detects audio timestamps to align each audio chunk with its code segment — keeping voice and typing animation in sync.
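
Rate-limited batched concurrency of this kind is commonly implemented with a semaphore channel. A sketch with a stubbed TTS call (the function names are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// synthesize stands in for the real Gemini TTS call.
func synthesize(script string) string { return "pcm:" + script }

// synthesizeAll voices every segment concurrently, but never runs
// more than maxInFlight requests at once (the rate limit).
func synthesizeAll(scripts []string, maxInFlight int) []string {
	out := make([]string, len(scripts))
	sem := make(chan struct{}, maxInFlight)
	var wg sync.WaitGroup
	for i, s := range scripts {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot
		go func(i int, s string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			out[i] = synthesize(s)   // indexed writes keep audio aligned with segments
		}(i, s)
	}
	wg.Wait()
	return out
}

func main() {
	audio := synthesizeAll([]string{"intro", "loop", "outro"}, 2)
	fmt.Println(audio)
}
```

Writing by index rather than appending to a shared slice is what keeps each audio chunk paired with its code segment regardless of completion order.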

Gemini 3 Flash's structured JSON output (genai.Schema + ResponseMIMEType: "application/json") is what makes reliable agentic data flow possible between pipeline stages: each agent's output is a schema-validated contract for the next. Without deterministic structured output, hallucinations would leak between the seven chained agents.
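
The contract side of this can be sketched with standard-library JSON validation. The field names below are hypothetical, not the app's real schema; in the actual SDK, passing a matching ResponseSchema with ResponseMIMEType "application/json" constrains the model to emit this shape:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SpecOutput is an illustrative contract between the Spec Agent and
// the Code Agent -- the real field names are the app's, not these.
type SpecOutput struct {
	Title    string   `json:"title"`
	Language string   `json:"language"`
	Steps    []string `json:"steps"`
}

// parseSpec checks that a model response honors the contract
// before it is handed to the next agent.
func parseSpec(raw []byte) (SpecOutput, error) {
	var spec SpecOutput
	if err := json.Unmarshal(raw, &spec); err != nil {
		return spec, fmt.Errorf("spec is not valid JSON: %w", err)
	}
	if spec.Title == "" || len(spec.Steps) == 0 {
		return spec, fmt.Errorf("spec is missing required fields")
	}
	return spec, nil
}

func main() {
	raw := []byte(`{"title":"Reverse proxy","language":"go","steps":["listen","forward"]}`)
	spec, err := parseSpec(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %d steps\n", spec.Title, len(spec.Steps))
}
```

Validating at the boundary means a malformed model response fails the stage immediately instead of propagating garbage downstream.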

Responsible AI

Every generation request across all seven agents runs with explicit SafetySettings (Harassment, HateSpeech, SexuallyExplicit, DangerousContent — BlockMediumAndAbove). Unsafe content is filtered at the model level before it can enter the pipeline or reach the user. The Spec Agent and Story Agent additionally use the Gemini Google Search grounding tool, so task specifications and story articles can reference current library documentation and best practices — not just the model's training cut-off knowledge.
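
A sketch of how one shared safety configuration can be attached to every agent's request. The struct here is an illustrative stand-in for the SDK's safety-setting type; the category and threshold strings are the Gemini API's enum names for the four categories listed above:

```go
package main

import "fmt"

// SafetySetting is an illustrative stand-in for the SDK's type.
type SafetySetting struct {
	Category  string
	Threshold string
}

// defaultSafety builds the one list every agent attaches to its
// requests, covering the four categories named above.
func defaultSafety() []SafetySetting {
	categories := []string{
		"HARM_CATEGORY_HARASSMENT",
		"HARM_CATEGORY_HATE_SPEECH",
		"HARM_CATEGORY_SEXUALLY_EXPLICIT",
		"HARM_CATEGORY_DANGEROUS_CONTENT",
	}
	settings := make([]SafetySetting, 0, len(categories))
	for _, c := range categories {
		settings = append(settings, SafetySetting{Category: c, Threshold: "BLOCK_MEDIUM_AND_ABOVE"})
	}
	return settings
}

func main() {
	for _, s := range defaultSafety() {
		fmt.Println(s.Category, "->", s.Threshold)
	}
}
```

Centralizing the list means no agent can accidentally ship a request with weaker filtering than the rest of the pipeline.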
