Inspiration

Algorithms are fundamental to computer science, yet many learners struggle to understand how they actually execute. Platforms like LeetCode provide excellent problems, but most explanations rely on static code and textual walkthroughs.

The biggest challenge for beginners is visualizing how data structures evolve step by step during execution, especially for harder topics such as dynamic programming and graph algorithms.

This inspired us to build Algo Vision, an AI-powered platform that converts algorithmic code into dynamic visual explanations.

To power reasoning and explanations, Algo Vision leverages Google’s Gemini model via the Google GenAI SDK, enabling the system to understand code, generate structured reasoning, and guide users through algorithm execution.


What it does

Algo Vision transforms algorithm problems into interactive visual learning experiences.

Key features include:

  • Gemini-powered reasoning to analyze algorithm logic
  • Step-by-step visualization of algorithm execution
  • Dynamic visualization of arrays, stacks, trees, graphs, and DP tables
  • A Live AI Tutor powered by Gemini that explains algorithm behavior
  • Animated visual explanations generated from real code execution
  • Interactive exploration of algorithm states

For example, consider the Fibonacci recurrence, a classic introduction to Dynamic Programming:

\(F(n) = F(n-1) + F(n-2)\)

Written as a DP table update, it becomes:

$$ DP[i] = DP[i-1] + DP[i-2] $$

Instead of manually tracing recursive calls, Algo Vision visually shows how the DP table evolves step-by-step, helping learners understand overlapping subproblems and optimal substructure.

Gemini analyzes the code and generates explanations describing how each state is computed.
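
To make the idea of a table that "evolves step-by-step" concrete, here is a minimal Python sketch that records a snapshot of the Fibonacci DP table after each update. The function name `fib_table_states` is hypothetical and chosen for illustration; it is not taken from the Algo Vision codebase.

```python
from typing import Iterator, List


def fib_table_states(n: int) -> Iterator[List[int]]:
    """Yield a copy of the DP table after the base cases and after each fill."""
    if n < 0:
        raise ValueError("n must be non-negative")
    dp = [0] * (n + 1)
    if n >= 1:
        dp[1] = 1
    yield dp.copy()  # snapshot containing only the base cases
    for i in range(2, n + 1):
        # DP[i] = DP[i-1] + DP[i-2]
        dp[i] = dp[i - 1] + dp[i - 2]
        yield dp.copy()  # one snapshot per filled cell
```

Each yielded list corresponds to one frame a visualizer could draw. For `n = 6` the first snapshot is `[0, 1, 0, 0, 0, 0, 0]` and the last is `[0, 1, 1, 2, 3, 5, 8]`.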


How we built it

Algo Vision uses a modern AI-powered full-stack architecture deployed on Google Cloud.

Frontend

  • Next.js
  • TypeScript
  • Interactive UI for visualization playback

Backend

  • Python FastAPI
  • Code execution tracing
  • Visualization state generation

AI System

  • Gemini models via Google GenAI SDK
  • Agent orchestration using Agent Development Kit (ADK)
  • Multi-agent reasoning pipeline for code analysis
  • Retrieval pipelines for algorithm documentation and context

Visualization Engine

  • Manim for algorithm animation generation
  • FFmpeg for rendering animation videos

Google Cloud Infrastructure

  • Vertex AI for Gemini model inference
  • Cloud Run for scalable backend deployment
  • Cloud Storage for storing generated visualization videos

When a user submits code, Gemini analyzes the algorithm, extracts logical steps, and generates structured execution traces that are later converted into animations.
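
As an illustration of what a structured execution trace might look like, the sketch below uses Python's standard `sys.settrace` hook to record local variables at each executed line of a function. This is one possible approach under our own assumptions, not a description of the actual Algo Vision tracer; `trace_locals` and `fib_dp` are hypothetical names.

```python
import copy
import sys
from typing import Any, Callable, Dict, List


def trace_locals(func: Callable[..., Any], *args: Any) -> List[Dict[str, Any]]:
    """Run func(*args) and record its locals just before each line executes."""
    steps: List[Dict[str, Any]] = []
    target = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None  # ignore frames of any other function
        if event == "line":
            # Deep-copy so later mutations do not alter earlier snapshots.
            snapshot = copy.deepcopy(dict(frame.f_locals))
            steps.append({"line": frame.f_lineno, "locals": snapshot})
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return steps


def fib_dp(n: int) -> int:
    """Sample function to trace (hypothetical user submission, n >= 1)."""
    dp = [0, 1] + [0] * (n - 1)
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]
```

Calling `trace_locals(fib_dp, 4)` returns a list of dictionaries, one per executed line, which could then be mapped to animation frames. Running untrusted user code this way would also need sandboxing, which this sketch does not attempt.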

Many algorithms visualized in Algo Vision follow recurrences such as:

$$ DP[i] = \max(DP[i-1], DP[i-2] + \text{value}_i) $$

which appears in optimization problems like House Robber.
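
The recurrence above can be sketched in Python as follows. The function name `house_robber_states` is hypothetical, and the table uses a 1-based index with `dp[0] = 0` as the base case, which is an assumption made here to match the formula.

```python
from typing import List, Tuple


def house_robber_states(values: List[int]) -> Tuple[int, List[int]]:
    """Return the best total and the full DP table.

    dp[i] is the best total obtainable from the first i houses.
    """
    dp = [0] * (len(values) + 1)
    for i in range(1, len(values) + 1):
        skip = dp[i - 1]  # do not rob house i
        take = (dp[i - 2] if i >= 2 else 0) + values[i - 1]  # rob house i
        dp[i] = max(skip, take)
    return dp[-1], dp
```

For `values = [2, 7, 9, 3, 1]` the table is `[0, 2, 7, 11, 11, 12]`, so the best total is 12.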


Challenges we ran into

Building Algo Vision involved several technical challenges:

  • Extracting deterministic execution traces from arbitrary user code
  • Designing a flexible system capable of visualizing multiple data structures
  • Ensuring explanations generated by Gemini accurately match program execution
  • Optimizing rendering performance for algorithm animations
  • Managing distributed workloads for animation generation

Accomplishments that we're proud of

  • Built a complete AI-powered algorithm visualization platform
  • Successfully integrated Gemini with agent-based orchestration
  • Generated animated explanations directly from user code
  • Created a Live AI Tutor capable of explaining algorithm behavior
  • Deployed the system on Google Cloud infrastructure

What we learned

During development we learned:

  • How Gemini models can assist algorithm education
  • Techniques for converting execution traces into visual representations
  • How agent-based systems improve reasoning workflows
  • Challenges involved in building multimodal AI systems combining code, text, and visualization

For example, dynamic programming problems often involve computing optimal states, as in the 0/1 knapsack recurrence:

$$ DP[i][j] = \max(DP[i-1][j],\; DP[i-1][j-w_i] + v_i) $$

Visualizing these transitions makes it easier to understand how optimal solutions are built.
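
A minimal Python sketch of these transitions, assuming integer weights and capacity, builds the full two-dimensional table so that every row can be inspected. The function name `knapsack_table` is hypothetical.

```python
from typing import List


def knapsack_table(weights: List[int], values: List[int], capacity: int) -> List[List[int]]:
    """Build the 0/1 knapsack table; dp[i][j] is the best value using
    the first i items with total weight at most j."""
    n = len(weights)
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for j in range(capacity + 1):
            dp[i][j] = dp[i - 1][j]  # skip item i
            if j >= w:
                # take item i if it fits: DP[i-1][j-w_i] + v_i
                dp[i][j] = max(dp[i][j], dp[i - 1][j - w] + v)
    return dp
```

With `weights = [1, 3, 4]`, `values = [15, 20, 30]`, and `capacity = 4`, the last row is `[0, 15, 15, 20, 35]`: the optimum of 35 comes from taking the first two items.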


What's next for Algo Vision

We plan to expand Algo Vision into a full AI-powered algorithm learning platform.

Future improvements include:

  • Support for multiple programming languages
  • Visualization of advanced algorithms and graph problems
  • More advanced reasoning powered by Gemini multimodal models
  • Personalized AI learning paths
  • Real-time visualization while coding
  • Interactive algorithm simulations

Our long-term vision is to build a platform where any algorithm can be instantly analyzed, explained, and visualized using AI.
