Inspiration

The ideation for BonsAI came from a few core ideas: "Git for LLMs" enabling context engineering, data transformation (PII Scrubbing, Meta Agents) & privacy, and "25 Hours," an app idea for scheduling LLM interactions to work while you sleep.

What it does

BonsAI provides businesses with a living laboratory to test AI projects and regular LLM use. It addresses the problem of AI pilot projects failing to deliver ROI by offering an efficient, low-cost way to test, compare, and validate different AI approaches before full-scale builds. It allows for rapid, on-the-fly prototyping of complex workflows directly within a chat interface, enabling data-driven decisions by comparing competing workflows (e.g., linear vs. agentic) based on performance, cost, and accuracy. Proven workflows can be commoditized and shared as reusable assets across an organization.
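As a rough illustration of comparing competing workflows on performance, cost, and accuracy, here is a minimal Python sketch. It is not BonsAI's actual code: `WorkflowResult`, `rank_workflows`, the thresholds, and the sample numbers are all hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorkflowResult:
    """One measured run of a workflow (hypothetical record shape)."""
    name: str
    accuracy: float   # fraction of correct outputs, 0.0 to 1.0
    cost_usd: float   # cost per run in US dollars
    latency_s: float  # wall-clock seconds per run


def rank_workflows(
    results: List[WorkflowResult],
    max_cost_usd: float,
    min_accuracy: float,
) -> List[WorkflowResult]:
    """Drop runs that miss the cost or accuracy thresholds, then sort the
    rest by accuracy (highest first), then cost and latency (lowest first)."""
    eligible = [
        r for r in results
        if r.cost_usd <= max_cost_usd and r.accuracy >= min_accuracy
    ]
    return sorted(eligible, key=lambda r: (-r.accuracy, r.cost_usd, r.latency_s))


if __name__ == "__main__":
    # Made-up numbers for illustration only, not measured results.
    runs = [
        WorkflowResult("linear", accuracy=0.80, cost_usd=0.02, latency_s=1.5),
        WorkflowResult("agentic", accuracy=0.90, cost_usd=0.15, latency_s=12.0),
    ]
    for r in rank_workflows(runs, max_cost_usd=0.20, min_accuracy=0.75):
        print(r.name, r.accuracy, r.cost_usd, r.latency_s)
```

Filtering on hard thresholds before ranking keeps the decision easy to explain: a workflow that is too expensive or too inaccurate is excluded outright instead of being traded off inside a single blended score.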

BonsAI supports various workflow types:

Linear (Non-generative): Simple, rules-based sequences (e.g., filtering emails, adding to CRM).

Linear (Generative): Incorporates AI-powered content creation and analysis (e.g., hyper-personalizing draft emails).

Circular (Iterative & Self-Improving): Creates feedback loops for continuous refinement (e.g., AI drafting knowledge base articles based on support chats).

Branching (Conditional & Divergent): Follows different paths based on AI-driven analysis and decision-making (e.g., routing support tickets based on intent).

Agentic (Goal-Driven & Autonomous): Non-deterministic workflows where an AI agent plans and executes steps to achieve a high-level goal (e.g., securing meetings with clients).
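To make the difference between a linear and a branching workflow concrete, the sketch below models a workflow step as a plain function and routes a support ticket by intent. Everything here is an assumption for illustration: `run_linear`, `run_branching`, `classify_intent`, `tag`, and `mark_priority` are hypothetical names, and the keyword-based classifier merely stands in for the AI-driven analysis the platform would use.

```python
from typing import Callable, Dict, List

# A step takes a work item and returns an updated copy of it.
Step = Callable[[dict], dict]


def run_linear(steps: List[Step], item: dict) -> dict:
    """Linear workflow: apply each step in order."""
    for step in steps:
        item = step(item)
    return item


def run_branching(
    classify: Callable[[dict], str],
    branches: Dict[str, List[Step]],
    default: List[Step],
    item: dict,
) -> dict:
    """Branching workflow: pick a linear path based on the classifier's label."""
    label = classify(item)
    return run_linear(branches.get(label, default), item)


def classify_intent(ticket: dict) -> str:
    """Keyword stand-in for an AI intent classifier (assumption, not an LLM call)."""
    text = ticket["text"].lower()
    if "refund" in text:
        return "billing"
    if "error" in text or "crash" in text:
        return "technical"
    return "other"


def tag(queue: str) -> Step:
    """Build a step that assigns the ticket to a queue."""
    def step(item: dict) -> dict:
        return {**item, "queue": queue}
    return step


def mark_priority(item: dict) -> dict:
    """Step that flags the ticket as high priority."""
    return {**item, "priority": "high"}


if __name__ == "__main__":
    branches = {
        "billing": [tag("billing")],
        "technical": [tag("engineering"), mark_priority],
    }
    ticket = {"text": "App crash on login"}
    print(run_branching(classify_intent, branches, [tag("general")], ticket))
```

Because each branch is itself a linear list of steps, the same step functions can be reused across workflow types, and swapping the classifier for a model-backed one would not change the routing code.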

In future iterations we plan to include ML & Fine-Tuning: support for ML and fine-tuning approaches.

How we built it

This project was built with a number of AI coding platforms and agents, including Firebase Studio, Bolt.new, GitHub Copilot, Gemini CLI, Codex and Kiro. The architecture, classes and adherence to OOP were baked into the prompts throughout, and the feature sets were ideated with ChatGPT, Gemini and Anthropic models, often cross-referenced and recorded in GitHub as issues and features. Kiro's spec-driven development additionally scaffolded the work. The Playwright and chrome-devtools MCP servers were often used to verify the UI functionality.

Challenges we ran into

I ran into a number of features that went in an unintended direction and needed to be reverted and approached differently. In many cases the AI agents would test and verify via the console or codebase but miss the connection to the frontend UI. For this, as mentioned, the Playwright MCP server was invaluable, as was sharing images in multimodal chats, often with the problem areas and successful items within implementations highlighted.

Accomplishments that we're proud of

The foundational differentiating features are working, and the advanced multi-participant flow was a particularly challenging but rewarding feature to build out.

What we learned

I learned more, and reinforced what I already knew, about AI interactions, API calls, and how the API shape can allow for collaboration, debate and more. This excites us because the agnostic, flexible approach we took empowers non-technical users to craft different prompts that coordinate agents in different ways.

What's next for BonsAI

The next step will be to get the application in front of customers and build out the analytics and A/B testing features that can prove out the performance gains and quantify the value delivered through the platform.
