Inspiration

Across multiple hackathons, we noticed a consistent trend: the best results don’t come from raw model output alone. They come from structured planning, modular prompt chains, and the right tools at the right time. After building several apps, we realized most solutions eventually boil down to an agent (or a group of agents) with a purpose-specific toolset. That insight led to one big idea: build the agent builder that can reliably assemble those agents for others.

What it does

Agent Forge is a comprehensive platform for building, testing, and deploying AI agents with support for multiple LLM providers, deployment tiers, and deep MCP integration.

Users select a model, tools, and prompt, and the platform helps generate and configure purpose-built agents using our modular @agent decorator approach. Our automated builder assistant uses Amazon Bedrock’s Claude Haiku 4.5 for structured agent assembly, while the core agents themselves are selected from a broader model registry that includes Bedrock models such as Claude, Titan, and other foundation models.

Agents can be validated in-platform, tested through a dedicated test flow, and routed to the appropriate deployment path depending on tier and environment.

How we built it

We combined Strands-Agents, Amazon Bedrock, and Kiro to build a guided and automated agent-creation pipeline. Our @agent decorator enables template-based construction with structured pre- and post-processing patterns to support observability and repeatable builds.

The application implements:

  • A Three Chat System that separates agent building, automated agent construction, and testing.
  • A Model Registry with 49 models across AWS Bedrock, plus early-stage validated local execution options.
  • A Tool Registry with 50+ tools and MCP-backed discovery across 11+ configured MCP servers.
  • A clear separation between testing and deployment, with a dedicated AgentCore testing/sandbox path and separate user-AWS deployment logic.
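The registry-and-routing idea above can be sketched as follows; the specific model IDs, tool entries, tiers, and path names are placeholders assumed for illustration.

```python
# Hypothetical sketch of registry-backed, tier-aware deployment routing.
# All model IDs, tool names, tiers, and path names are illustrative.

MODEL_REGISTRY = {
    "claude-haiku": {"provider": "bedrock", "tier": "standard"},
    "titan-text":   {"provider": "bedrock", "tier": "standard"},
    "local-llama":  {"provider": "local",   "tier": "experimental"},
}

TOOL_REGISTRY = {
    "web_search": {"source": "builtin"},
    "calendar":   {"source": "mcp", "server": "calendar-mcp"},
}

def resolve_deployment(model_id: str, environment: str) -> str:
    """Route a build to the sandbox or the user-AWS path by tier and env."""
    entry = MODEL_REGISTRY[model_id]
    if entry["tier"] == "experimental" or environment == "test":
        return "agentcore-sandbox"   # dedicated testing/sandbox deployment
    return "user-aws"                # deployment into the user's own account
```

For example, `resolve_deployment("claude-haiku", "prod")` would take the user-AWS path, while any experimental model stays in the sandbox.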

For security, we implemented Web Identity Federation using STS AssumeRoleWithWebIdentity and avoided static AWS keys, supported by Cognito + OAuth integrations (GitHub, Google).
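A minimal sketch of that federation flow with boto3 is shown below: the OIDC token from the identity provider is exchanged for temporary credentials, so no static keys are ever stored. The role ARN, session name, and helper names are placeholders assumed for this example.

```python
def build_assume_role_params(role_arn: str, id_token: str,
                             session_name: str = "agent-forge-web",
                             duration: int = 3600) -> dict:
    """Pure helper: request parameters for STS AssumeRoleWithWebIdentity."""
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": id_token,   # OIDC JWT from Cognito/GitHub/Google
        "DurationSeconds": duration,
    }

def federated_session(role_arn: str, id_token: str):
    """Exchange an OIDC token for a temporary-credential boto3 session."""
    import boto3  # imported lazily so the pure helper works without AWS deps

    sts = boto3.client("sts")
    creds = sts.assume_role_with_web_identity(
        **build_assume_role_params(role_arn, id_token)
    )["Credentials"]
    return boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```

Because the credentials STS returns expire after `DurationSeconds`, a leaked token is far less damaging than a leaked long-lived access key.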

Challenges we faced

Designing deterministic behavior for dynamically generated agents was a major challenge. Aligning consistent results across Bedrock-first automation, optional local execution validation, and container-based deployment required careful standardization of configuration, tooling, and lifecycle workflows.

Accomplishments

We built a unified agent-creation pipeline with:

  • Modular agent construction using the @agent decorator.
  • Meta-tooling foundations that support dynamic tool creation and reuse.
  • A tiered build/test/deploy experience with AgentCore testing and ECS Fargate execution paths.
  • Deep MCP integration for scalable real-world agent capability.

What we learned

We learned that AI systems improve dramatically when you apply software engineering principles: single-purpose modules, structured orchestration, and reproducible build steps create more stable agent outcomes than monolithic “do everything” designs. We also deepened our understanding of OAuth 2.0, JWTs, temporary role security, and real-world federated identity deployment patterns.

What’s next

We’re evolving Agent Forge into a more capable meta-orchestrator that can:

  • Generate specialized sub-agents when tasks exceed a single agent’s capacity.
  • Dynamically create and link MCP servers and toolchains.
  • Deliver dashboards for prompt flows, performance metrics, and self-optimization.
  • Expand production-hardening for multi-environment execution with clearer stability tiers.
