Approximate agentic flow
Approximate deployment schema
Frontend

🚨🚨🚨USAGE WARNINGS!🚨🚨🚨

🚨🚨🚨
IF YOU WERE GIVEN ACCESS TO HIGHER MODELS PLEASE DO NOT OVERUSE THE TOOL
🚨🚨🚨

Understand that this tool can make a huge number of LLM calls: if you were given access to higher-end models (e.g. Gemini 2.5 Pro), a single call could get extremely expensive. I beg you to use it wisely!

This tool is not a chat.

This tool is not a chat. It will not provide any AI-guided assistance on how to use it.
This tool is a thin UI shim used to initiate a call to a fairly large agentic system in the backend.
The agentic system requires a set of technical/business requirements as input, even for simple software. To get a better idea of what that looks like, please see the examples at https://github.com/albertomarabini/Kahuna-Kid/tree/main/Test%20Prompts
As output, it returns a zip file with the developed software, a complete breakdown of the system objects it contains, and a report of the generation process.
This tool was made to be extremely resilient. If you see a hanging spinner, it doesn't necessarily mean that the process has failed: most likely the backend is fighting LLM timeouts or failures. If the process does fail, you will be notified in the UI.
Once the process has started do not refresh the UI, let

About the project

Inspiration

LLMs are the reason I decided to return to this job — this truly feels like the next frontier after years of stagnation in this professional line. Among all the challenges in the realm of prompt engineering, Software Development stands out as one of the most canonical.

What it does

Kahuna Kid isn’t just a coding machine that spits out snippets one after another. Instead, by parallelizing the development process, it’s capable of architecting fully integrated systems across multiple technologies — potentially saving developers weeks or even months of boilerplate work.

It’s built as a Multi-Agent System that executes sequences of agents in sequential, parallel, or even graph-like structures, using knowledge synthesized from previous stages.
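To make the idea of sequential, parallel, and graph-like execution more concrete, here is a minimal sketch. It is not Kahuna Kid's actual code: the stage names, the `run_agent` stub, and the `PIPELINE` graph are all hypothetical, and the LLM call is replaced by a no-op. Each agent only sees the outputs of stages that completed before it, and agents with no dependency on each other run concurrently.

```python
import asyncio
from graphlib import TopologicalSorter


async def run_agent(name, context):
    """Hypothetical agent: a real one would prompt an LLM with the context."""
    await asyncio.sleep(0)  # stand-in for the network call
    upstream = sorted(context)  # stages whose output is already available
    return f"{name} saw {upstream}"


async def run_graph(graph):
    """Run agents in dependency order; independent agents run in parallel."""
    context = {}
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    while sorter.is_active():
        ready = list(sorter.get_ready())
        # Everything in `ready` has all its dependencies satisfied.
        results = await asyncio.gather(*(run_agent(n, context) for n in ready))
        for name, result in zip(ready, results):
            context[name] = result
            sorter.done(name)
    return context


# Hypothetical pipeline: each stage maps to the stages it depends on.
PIPELINE = {
    "requirements": set(),
    "architecture": {"requirements"},
    "backend": {"architecture"},
    "frontend": {"architecture"},
    "integration": {"backend", "frontend"},
}

if __name__ == "__main__":
    for stage, output in asyncio.run(run_graph(PIPELINE)).items():
        print(stage, "->", output)
```

In this sketch `backend` and `frontend` run side by side once `architecture` is done, and `integration` waits for both. The loop advances in waves, which keeps the code short; a production scheduler would probably start each stage as soon as its own dependencies finish.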

This approach can be applied to generate everything from simple video games, utilities, and mobile apps to end-to-end Web3 applications and full-stack enterprise systems.

The algorithm performs best when paired with high-end models. In our preliminary tests, it produced up to 17,000 lines of coherent code in one go.

I chose the name Kahuna Kid because, to me, the most compelling proof of concept for this agentic system lies in its coordinated use with seemingly unpretentious models like Gemini Flash or Gemini Flash Lite. The final result produced by this tiny bundle is quite remarkable.

How I built it

The Prompt Orchestrator backbone coordinates multiple AI agents, routes work intelligently, and guarantees structured, ready-to-use outputs — all with enterprise-grade observability. It spins up specialized LLM agents and workflow steps on demand, then stitches their results into one clean, consistent answer. It enforces JSON/typed schemas and auto-corrects malformed model output to prevent downstream system failures. Prompts are run concurrently, with per-call sessions and timeouts for high throughput. Context is shared across steps via session state, enabling conditional logic and smart retries.
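As an illustration of what enforcing a schema and auto-correcting malformed model output can look like, here is a small sketch. It assumes a made-up two-field schema (`file_path`, `content`) and only two repairs (stripping surrounding chatter and removing trailing commas); the real orchestrator's schemas and repair rules are not described in this write-up.

```python
import json
import re

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"file_path": str, "content": str}


def extract_json(raw):
    """Pull the outermost {...} span out of a model reply and parse it."""
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    candidate = match.group(0)
    # Common model slip: a trailing comma before a closing brace or bracket.
    # Caveat: this naive regex would also touch such a pattern inside a string.
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)
    return json.loads(candidate)


def validate(payload):
    """Check that every required field is present with the expected type."""
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(payload.get(field), expected):
            raise ValueError(f"field {field!r} missing or not {expected.__name__}")
    return payload


def parse_model_output(raw):
    """Repair, parse, and validate; raises ValueError so a caller can retry."""
    return validate(extract_json(raw))


if __name__ == "__main__":
    reply = 'Sure, here it is:\n{"file_path": "app.py", "content": "print(1)",}\nDone.'
    print(parse_model_output(reply))
```

Raising `ValueError` on anything that cannot be repaired gives the caller a single signal to re-prompt the model, instead of letting a bad payload travel downstream.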

Each agent operates within an orchestrated end-to-end pipeline that reuses a shared session/context, allowing every agent to access the same evolving state. The system self-heals with controlled retries, logs all errors, summarizes them, and pins them to the session state for instant post-mortems. It also offers live status tracking and a full audit trail, including percentage updates, structured logs, and a per-step execution table that stakeholders can easily share.
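The retry and post-mortem behaviour described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the session is a plain dict with `results`, `errors`, and `progress` keys, and the function names, retry count, and timeout value are invented rather than taken from the project.

```python
import asyncio
import logging

logger = logging.getLogger("pipeline")


async def call_with_retries(step, fn, session, retries=3, timeout=30.0):
    """Run one step with a per-call timeout; pin every failure to the session."""
    for attempt in range(1, retries + 1):
        try:
            result = await asyncio.wait_for(fn(session), timeout=timeout)
            session["results"][step] = result
            return result
        except Exception as exc:  # also catches asyncio.TimeoutError
            logger.warning("step %s attempt %d failed: %r", step, attempt, exc)
            session["errors"].append(
                {"step": step, "attempt": attempt, "error": repr(exc)}
            )
            await asyncio.sleep(0)  # placeholder; a real system would likely back off
    raise RuntimeError(f"step {step!r} failed after {retries} attempts")


async def run_pipeline(steps, session):
    """Run steps in order, updating a percentage the UI could poll."""
    for index, (step, fn) in enumerate(steps, start=1):
        await call_with_retries(step, fn, session)
        session["progress"] = round(100 * index / len(steps))
    return session


if __name__ == "__main__":
    attempts = {"n": 0}

    async def flaky(session):
        # Fails once, then succeeds, to exercise the retry path.
        attempts["n"] += 1
        if attempts["n"] < 2:
            raise ValueError("malformed output")
        return "ok"

    state = {"results": {}, "errors": [], "progress": 0}
    asyncio.run(run_pipeline([("generate", flaky)], state))
    print(state)
```

Because failures are appended to the session rather than only logged, a later step (or a human reading the final report) can see exactly which attempts went wrong without digging through log files.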

The Google Cloud backend ties everything together:

Cloud Run hosts the frontend.
Compute Engine powers the backend heavy lifting.
Both connect securely to Cloud SQL (PostgreSQL) for persistent data and Secret Manager for credential safety.
Assets and data are stored in Cloud Storage, while all AI-driven features are managed through Vertex AI.

Challenges I ran into

Definitely the setup on Google Cloud 😅 — not exactly my strongest area!

On the prompting side… well, there are still no canonical ways to build this kind of software. I had to improvise a lot. Plus, every prompt must include strong boundaries that LLMs shouldn’t cross (for example, avoiding overengineering, underengineering, and a number of LLM-specific biases).

Accomplishments I'm proud of

It works really well — even with smaller models! And... I saw things that marketing doesn't say :)

What’s next for Kahuna Kid

A deep integration with a RAG framework.
Have the whole development pipeline gated by a system capable of determining how even the most complex (or simplest) set of requirements should be handled — whether in one go or broken down.
Additionally, there should be human-in-the-loop touchpoints, allowing users to refine project details through direct interaction.

Built With

adk
google-cloud
langchain

Updates

Alberto Marabini started this project — Nov 10, 2025 05:16 PM EST
