Inspiration
Gallium was born from my observation that LLM agent development is tracing a path similar to video game AI. In the early days, game AI was often hardcoded and static, much like early prompt-engineered LLM responses. Today, we're seeing a push toward more sophisticated external logic loops and state tracking. I think that bringing modern game AI architectures (Goal-Oriented Action Planning, Utility AI, and Behavior Trees) to the LLM space can create more robust, predictable, and manageable agents. Gallium is my prototype for a visual scripting harness layer that makes one of the oldest of these techniques, the behavior state machine, accessible to LLM agent workflows.
What it does
Gallium is a visual agent editor and runtime that allows you to:
- Design Visual State Machines: Define agent behavior using states (like "Planning" or "Executing") and transitions; a sketch of what such a definition might look like follows this list.
- Visual Programming for Logic: Use a node-based editor to build complex logic flows, function calls, and LLM interactions without writing code.
- Workflow Orchestration: Connect multiple agents together to collaborate on complex tasks.
- Real-time Simulation: Chat with your workflows in real-time and watch the state machines and logic graphs execute live.
- Multi-Model Support: Seamlessly connect to OpenAI, Anthropic, Gemini, or local models (via llama.cpp).
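To make the state machine idea concrete, here is a rough sketch of what a state-and-transition definition could look like in Python. The `State`/`Transition` classes and their field names are illustrative assumptions, not Gallium's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Transition:
    when: str     # tag matched against the runtime's evaluation of the LLM output
    target: str   # name of the state to move to

@dataclass
class State:
    name: str
    prompt: str   # instruction given to the LLM while in this state
    transitions: list[Transition] = field(default_factory=list)

# A two-state "plan, then execute" machine, mirroring the example states above.
MACHINE = {
    "Planning": State(
        name="Planning",
        prompt="Break the user's request into concrete steps.",
        transitions=[Transition(when="plan_ready", target="Executing")],
    ),
    "Executing": State(
        name="Executing",
        prompt="Carry out the next step of the plan.",
        transitions=[Transition(when="needs_replan", target="Planning")],
    ),
}
```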
How we built it
The project features a modular architecture:
- Backend: A Python-based engine that manages simulation state, executes node graphs, and handles LLM communications.
- Frontend: A responsive web application built with vanilla HTML/JS and CSS, featuring custom-built node and agent editors.
- Communication: Real-time synchronization between the frontend and backend is handled via WebSockets (a minimal sketch follows this list).
- AI-Native Development: In the spirit of a hackathon, Gallium was built almost entirely using Google Gemini (3 Flash and Pro).
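To illustrate, here's a minimal sketch of the kind of WebSocket push loop this implies, using the `websockets` library. The event queue and message shape are hypothetical; Gallium's actual protocol may differ.

```python
import asyncio
import json

import websockets  # pip install websockets

# Hypothetical queue the engine fills with simulation events
# (state entered, node executed, LLM response received, ...).
simulation_events: asyncio.Queue = asyncio.Queue()

async def push_events(ws):
    # Stream each engine event to the editor so state and node
    # highlights update live while the simulation runs.
    while True:
        event = await simulation_events.get()
        await ws.send(json.dumps(event))

async def main():
    async with websockets.serve(push_events, "localhost", 8765):
        await asyncio.Future()  # serve forever

if __name__ == "__main__":
    asyncio.run(main())
```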
Challenges we ran into
Building a flexible, real-time visual programming environment presented several hurdles:
- Architectural Pivots: I restarted the project twice in the first week as I iterated on the core concept of how visual state machines should interact with LLM evaluations.
- Model Agnosticism: Ensuring that tool-calling and message formatting worked reliably across diverse providers like Gemini and local Llama APIs took a few tries to get right; see the sketch after this list.
- Free API Quotas: I ran into free-tier rate limits constantly while building the tool, so I had to set up a local model served with llama.cpp. I ended up using Gemma 3 for all of the prototyping until the end, when I needed to test with a smarter Gemini 3 model.
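One common way to get both of these at once is to route everything through an OpenAI-compatible chat API, which llama.cpp's `llama-server` exposes out of the box. The sketch below shows that approach; it's an assumption about how you could wire it, not necessarily how Gallium does it, and the keys and model names are placeholders.

```python
from openai import OpenAI  # pip install openai

# `llama-server -m gemma-3.gguf --port 8080` serves an OpenAI-compatible
# /v1 endpoint, so the same client code covers local prototyping
# and hosted providers.
PROVIDERS = {
    "local": dict(base_url="http://localhost:8080/v1", api_key="sk-no-key"),
    "openai": dict(api_key="YOUR_OPENAI_KEY"),
}

def chat(provider: str, model: str, messages: list[dict]) -> str:
    client = OpenAI(**PROVIDERS[provider])
    resp = client.chat.completions.create(model=model, messages=messages)
    return resp.choices[0].message.content

# Same call shape whether the model is local Gemma or a hosted one.
print(chat("local", "gemma-3", [{"role": "user", "content": "Hello"}]))
```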
Accomplishments that we're proud of
- It works: My core goal was to make Gallium work end-to-end for one simple, well-known "agentic" workflow: Ralph loops. This gave me a base to build from (a minimal sketch of the pattern follows this list).
- Visual node graph language: The node graph interpreter and editor turned out extremely well for how little I guided the LLM.
- Rapid Prototyping: Leveraging Gemini allowed me to move from an idea to a multi-featured prototype in roughly one week of work, spread across three fresh starts of the project.
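For anyone unfamiliar with the pattern: a Ralph loop just re-runs an agent against the same task until it signals completion, which maps naturally onto a single state with a self-transition. A minimal sketch, where `run_agent` and the "DONE" convention are stand-ins for Gallium's real state machine pass:

```python
def run_agent(task: str, history: list[str]) -> str:
    # Stand-in for one pass through the state machine; in Gallium this
    # would be an LLM call driven by the current state's prompt.
    return "DONE: " + task  # trivially converges, for the sketch

def ralph_loop(task: str, max_iters: int = 20) -> str:
    """Re-run the agent on the same task until it reports completion."""
    history: list[str] = []
    for _ in range(max_iters):
        output = run_agent(task, history)
        history.append(output)
        if output.startswith("DONE"):  # acceptance check; otherwise loop again
            return output
    raise RuntimeError("Ralph loop hit the iteration cap without finishing")

print(ralph_loop("refactor the parser"))
```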
What we learned
- Visual Representation Matters: Seeing an agent's logic as a state machine, with a graph for each state, makes agentic editors feel less like black boxes where the only lever you have is the context fed into a single-shot thread, with no control over the planning stages or over what acceptance criteria count as "done".
What's next for Gallium
Gallium is a "working idea" designed for experimentation. Roadmap:
- Sub-agent spawning: I intended to add the ability for an agent to spawn a sub-agent that runs the entire state machine graph itself and eventually feeds a result back up to the top-level graph/state that spawned it, but I ran out of time (a rough sketch follows this list).
- Visual scripting breakpoints: I would really love the ability to step through the graphs and break on a node that throws an error. It wasn't critical to making it work, so it would be nice to revisit and add.
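If I do revisit sub-agent spawning, the shape I have in mind is roughly the following; it's entirely hypothetical, since this part was never built.

```python
import concurrent.futures

def run_machine(machine: dict, task: str) -> str:
    # Placeholder: in Gallium this would drive a full state machine
    # graph to a terminal state and return its final output.
    return f"[sub-result for: {task}]"

def spawn_subagents(machine: dict, subtasks: list[str]) -> list[str]:
    # Each sub-agent runs the entire graph independently; the parent
    # state blocks here until every result is fed back up to it.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        return list(pool.map(lambda t: run_machine(machine, t), subtasks))

print(spawn_subagents({}, ["parse input", "summarize output"]))
```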