ZEN - Intelligent Planner Agent

🎯 Inspiration

As developers and product managers, we've all been there—staring at a 50-page requirements document, manually extracting user stories, estimating story points, mapping dependencies, and creating Jira tickets one by one. This tedious process can take weeks and is prone to human error, missed dependencies, and inconsistent story quality.

We asked ourselves: What if AI could do this in minutes with better accuracy?

That question sparked ZEN - an intelligent planning agent that transforms the chaos of requirements planning into a zen-like automated experience.

💡 What It Does

ZEN is an AI-powered planning agent that:

  • Analyzes BRD/FRD documents using RAG (Retrieval-Augmented Generation)
  • Generates production-ready user stories with acceptance criteria
  • Validates dependencies and identifies circular references
  • Estimates story points using the Fibonacci sequence
  • Creates PI (Program Increment) plans with sprint timelines
  • Syncs everything directly to Jira automatically
  • Audits the entire process with real-time metrics and reporting

The result? Weeks of manual planning work reduced to 3-5 minutes of automated execution.

🏗️ How We Built It

Architecture

We built ZEN on AWS using a multi-agent orchestration pattern:

┌─────────────────────────────────────────────────┐
│           Supervisor Agent (Orchestrator)        │
└──────────────┬──────────────────────────────────┘
               │
       ┌───────┴───────┬──────────┬──────────┐
       │               │          │          │
   ┌───▼────┐    ┌────▼───┐  ┌───▼────┐ ┌──▼─────┐
   │Document│    │  User  │  │   PI   │ │  Audit │
   │Analysis│    │ Story  │  │Planning│ │ Agent  │
   │ Agent  │    │  Gen   │  │ Agent  │ │        │
   └────────┘    └────────┘  └────────┘ └────────┘

Tech Stack

  • AWS Bedrock Agents: Multi-agent orchestration with supervisor/sub-agent pattern
  • Claude 3.5 Sonnet: Primary LLM for intelligent reasoning
  • AWS Bedrock Knowledge Bases: RAG implementation for document context
  • OpenSearch Serverless: Vector database for semantic search
  • AWS Lambda: Serverless compute for action groups
  • Amazon S3: Document storage and processing
  • DynamoDB: Results and audit data storage
  • Jira API: Automated backlog creation
  • Langfuse: LLM observability and tracing
  • Serverless Framework: Infrastructure as Code

Key Innovations

1. Parallel Batched Processing

For large documents with 20+ requirements, we implemented parallel sub-agent spawning:

# Process requirements in batches with parallel execution
from concurrent.futures import ThreadPoolExecutor

batches = chunk_requirements(requirements, batch_size=2)
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(process_batch, batch) for batch in batches]
    results = [f.result() for f in futures]

This reduced processing time from 15+ minutes to 3-5 minutes for 26 requirements.

2. Intelligent Dependency Validation

We built a dedicated validation agent that:

  • Identifies circular dependencies using graph algorithms
  • Validates cross-story references
  • Suggests dependency corrections
  • Classifies dependency types (technical, functional, external)

3. Audit Agent with Real-Time Metrics

Every agent execution is tracked with:

  • Performance metrics (latency, token usage)
  • Cost analysis per document
  • Quality scores (story completeness, dependency accuracy)
  • Bottleneck identification

The audit data is stored in DynamoDB and generates comprehensive reports in S3, ready for dashboard integration.
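The shape of an audit record can be sketched roughly as follows. The field names and per-token prices here are illustrative assumptions, not ZEN's actual schema or current Bedrock pricing:

```python
from dataclasses import dataclass

# Illustrative per-1K-token prices (assumptions; actual Bedrock rates vary by model/region)
INPUT_COST_PER_1K = 0.003
OUTPUT_COST_PER_1K = 0.015

@dataclass
class AgentRunMetrics:
    """One agent invocation's performance record, as stored for auditing."""
    agent: str
    latency_ms: float
    input_tokens: int
    output_tokens: int

    @property
    def cost_usd(self) -> float:
        # Token-based cost estimate for this single invocation
        return (self.input_tokens / 1000 * INPUT_COST_PER_1K
                + self.output_tokens / 1000 * OUTPUT_COST_PER_1K)

def document_cost(runs) -> float:
    """Total LLM cost across all agent invocations for one document."""
    return sum(r.cost_usd for r in runs)
```

Aggregating these records per document is what makes per-document cost analysis and bottleneck identification (e.g., the slowest agent by `latency_ms`) straightforward.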

4. RAG-Powered Context Awareness

Using Bedrock Knowledge Bases, ZEN compares new requirements against existing documentation to:

  • Identify similar past implementations
  • Suggest reusable components
  • Flag potential conflicts
  • Maintain consistency across projects
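Under the hood, "identify similar past implementations" reduces to nearest-neighbor search over embeddings, which OpenSearch Serverless handles at scale. As a minimal self-contained sketch of the idea (toy vectors and function names, not ZEN's actual retrieval code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vec, corpus, threshold=0.8):
    """Return (doc_id, score) pairs above threshold, best match first.

    corpus: {doc_id: embedding} built from past project documentation.
    """
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in corpus.items()]
    return sorted((s for s in scored if s[1] >= threshold),
                  key=lambda s: s[1], reverse=True)
```

In production the embeddings come from the Knowledge Base's embedding model and the search runs in the vector index, but the matching logic is the same.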

🚧 Challenges We Faced

Challenge 1: Agent Orchestration Complexity

Problem: Coordinating multiple specialized agents while maintaining context and handling failures.

Solution: Implemented a robust supervisor pattern with:

  • Retry logic with exponential backoff
  • State management across agent invocations
  • Error recovery and graceful degradation
  • 90%+ success rate even with complex documents
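The retry-with-exponential-backoff piece can be sketched like this (a generic pattern with illustrative names, not our exact supervisor code):

```python
import random
import time

def with_retries(fn, max_attempts=4, base_delay=1.0):
    """Call fn, retrying on failure with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error for graceful degradation
            # Backoff: base_delay, 2x, 4x, ... with random jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

Wrapping each sub-agent invocation this way is what lets transient Bedrock or network failures resolve themselves without failing the whole document run.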

Challenge 2: Story Point Estimation Accuracy

Problem: LLMs tend to overestimate, or to generate stories that are too large (>8 points).

Solution:

  • Fine-tuned prompts with explicit Fibonacci constraints
  • Implemented automatic story splitting for large estimates
  • Added validation layer to ensure stories ≤ 8 points
  • Achieved 95% accuracy in story sizing
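The validation layer's core logic is simple to sketch. This is an illustrative version (function names are ours for this writeup, not ZEN's API): snap each estimate to the nearest allowed Fibonacci value, and split anything over 8 points into smaller pieces:

```python
FIBONACCI_POINTS = [1, 2, 3, 5, 8]  # allowed estimates; anything larger gets split

def snap_to_fibonacci(points):
    """Round an arbitrary estimate to the nearest allowed Fibonacci value."""
    return min(FIBONACCI_POINTS, key=lambda f: abs(f - points))

def split_large_story(points, max_points=8):
    """Split an oversized estimate into parts of at most max_points,
    each snapped to an allowed Fibonacci value."""
    if points <= max_points:
        return [snap_to_fibonacci(points)]
    parts = []
    while points > max_points:
        parts.append(max_points)
        points -= max_points
    if points > 0:
        parts.append(snap_to_fibonacci(points))
    return parts
```

In practice the splitting also asks the LLM to rewrite the story text for each part; the arithmetic above just guarantees no story exceeds 8 points.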

Challenge 3: Dependency Graph Complexity

Problem: Large documents create complex dependency graphs that are hard to validate and visualize.

Solution:

  • Built graph-based validation using topological sorting
  • Implemented cycle detection algorithms
  • Created dependency classification system
  • Generated visual dependency maps for PI planning
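The cycle-detection step can be sketched with Kahn's topological sort: process stories with no unmet dependencies first, and whatever never gets processed must sit on (or behind) a cycle. Names here are illustrative, not ZEN's actual validation API:

```python
from collections import defaultdict, deque

def find_cycle_members(dependencies):
    """Kahn's algorithm over a story-dependency graph.

    dependencies: {story_id: [story_ids it depends on]}
    Returns the set of stories involved in (or blocked behind) a cycle;
    an empty set means the graph is a valid DAG.
    """
    indegree = defaultdict(int)
    dependents = defaultdict(list)
    nodes = set(dependencies)
    for story, deps in dependencies.items():
        nodes.update(deps)
        for dep in deps:
            dependents[dep].append(story)  # dep must finish before story
            indegree[story] += 1

    # Start from stories with no dependencies
    queue = deque(n for n in nodes if indegree[n] == 0)
    emitted = set()
    while queue:
        n = queue.popleft()
        emitted.add(n)
        for d in dependents[n]:
            indegree[d] -= 1
            if indegree[d] == 0:
                queue.append(d)
    return nodes - emitted
```

The same traversal order doubles as a valid sprint sequencing for PI planning once the graph is cycle-free.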

Challenge 4: Cost Optimization

Problem: LLM API calls can get expensive with large documents and multiple agent invocations.

Solution:

  • Implemented intelligent batching to reduce API calls
  • Used prompt caching where possible
  • Optimized context window usage
  • Achieved ~$0.29 per document processing cost

Challenge 5: Jira Integration Reliability

Problem: Jira API rate limits and network issues causing failures.

Solution:

  • Implemented exponential backoff retry logic
  • Added request queuing and throttling
  • Built idempotent operations to handle duplicates
  • Achieved 99% success rate in Jira sync
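The idempotency piece can be sketched as deriving a deterministic external key for each story, so a retried create after a timeout is recognized as a duplicate instead of filing a second issue. `create_issue` below is a stand-in for the real Jira API call, and the key scheme is illustrative:

```python
import hashlib

class IdempotentJiraSync:
    """Deduplicate Jira issue creation by a deterministic per-story key."""

    def __init__(self, create_issue):
        self._create_issue = create_issue  # callable: story dict -> issue id
        self._seen = {}                    # external key -> issue id

    @staticmethod
    def external_key(story):
        # Same document + title always hashes to the same key
        raw = f"{story['document_id']}:{story['title']}"
        return hashlib.sha256(raw.encode()).hexdigest()[:16]

    def sync(self, story):
        key = self.external_key(story)
        if key in self._seen:  # already created (e.g., retry after a timeout)
            return self._seen[key]
        issue_id = self._create_issue(story)
        self._seen[key] = issue_id
        return issue_id
```

In production the key-to-issue mapping lives in DynamoDB rather than memory, so duplicates are caught across Lambda invocations too.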

📚 What We Learned

Technical Learnings

  1. Multi-agent systems require careful orchestration and state management
  2. Prompt engineering is critical—small changes can dramatically improve output quality
  3. RAG implementation needs proper chunking strategy and embedding model selection
  4. Serverless architecture scales beautifully but requires careful cold start optimization
  5. LLM observability (via Langfuse) is essential for debugging and optimization

Product Learnings

  1. Automation doesn't mean zero human input—the best results come from AI + human validation
  2. Dependency validation is more valuable than we initially thought
  3. Audit trails are critical for enterprise adoption and trust
  4. Integration with existing tools (Jira) is non-negotiable for real-world usage

AWS Bedrock Insights

  1. Bedrock Agents provide powerful orchestration capabilities out of the box
  2. Knowledge Bases make RAG implementation significantly easier
  3. Claude 3.5 Sonnet excels at structured output and reasoning tasks
  4. Action Groups enable seamless integration with external APIs

🎓 Key Takeaways

Building ZEN taught us that AI agents are most powerful when they augment human expertise rather than replace it. The goal isn't to eliminate product managers—it's to free them from tedious manual work so they can focus on strategy, stakeholder management, and creative problem-solving.

We also learned that multi-agent systems are the future of complex AI workflows. By breaking down the planning process into specialized agents (analysis, generation, validation, planning), we achieved better results than any single monolithic model could provide.

Finally, we discovered that observability and auditability are just as important as functionality. Enterprise teams need to trust and understand AI decisions, which is why we built comprehensive tracking and reporting from day one.

🚀 What's Next

We're excited to expand ZEN with:

  • Visual dependency graphs and interactive PI planning boards
  • Multi-project portfolio planning across teams
  • Historical learning from past sprint performance
  • Integration with GitHub for automatic technical task creation
  • Slack/Teams notifications for real-time updates
  • Custom agent training on company-specific planning patterns

ZEN represents our vision for the future of agile planning—intelligent, automated, and effortless. From requirements to backlog in minutes, not weeks.
