Inspiration

Financial fraud costs Australian banks over $2.7 billion annually, while regulatory compliance failures result in massive fines and reputational damage. We noticed that traditional rule-based security systems can't keep pace with sophisticated fraud patterns, and compliance teams struggle with the complexity of evolving regulations from ASIC, APRA, AUSTRAC, and AFCA. What if AI agents could work together like a security team, each specializing in one area (fraud detection, compliance monitoring, customer sentiment, privacy protection) but collaborating in real time to protect banking operations? This inspired us to build NFR Guild: a multi-agent AI system that brings enterprise-grade security to banking infrastructure.

What it does

NFR Guild is a multi-agent AI security system with 7 specialized agents that protect banking operations in real time:

  • Transaction Risk Agent - Analyzes every transaction for fraud patterns using behavioral analysis and risk scoring
  • Compliance Agent - Ensures adherence to Australian banking regulations (AUSTRAC, APRA, ASIC, AFCA) with RAG-powered document retrieval from 42 regulatory sources
  • Resilience Agent - Takes automated action on threats: holds suspicious transactions, blocks accounts, escalates to human operators
  • Customer Sentiment Agent - Monitors customer interactions to detect frustration, dissatisfaction, or potential churn
  • Data Privacy Agent - Scans for PII exposure and ensures privacy compliance across all transactions and logs
  • Knowledge Agent - Generates human-readable security reports and alerts with regulatory citations
  • Banking Assistant Agent - Provides AI-powered customer service with secure account operations

The agents communicate via AWS EventBridge in an event-driven architecture: when one agent detects a threat (e.g., suspicious transaction), it publishes an event that triggers coordinated responses from other agents—just like a real security operations centre.
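A minimal sketch of how an agent might publish a threat event, assuming boto3's EventBridge client. The bus name, source prefix, detail type, and payload fields are hypothetical and chosen only for illustration; the project's actual event schema is not shown here.

```python
import json
from datetime import datetime, timezone

# Hypothetical bus name, not taken from the project.
EVENT_BUS_NAME = "nfr-guild-bus"


def build_threat_event(agent: str, transaction_id: str, risk_score: float) -> dict:
    """Build one entry in the shape the EventBridge PutEvents API expects."""
    detail = {
        "transaction_id": transaction_id,
        "risk_score": risk_score,
        "detected_at": datetime.now(timezone.utc).isoformat(),
    }
    return {
        "EventBusName": EVENT_BUS_NAME,
        "Source": f"nfrguild.{agent}",  # hypothetical source naming scheme
        "DetailType": "SuspiciousTransactionDetected",  # hypothetical detail type
        "Detail": json.dumps(detail),  # PutEvents requires Detail as a JSON string
    }


def publish(client, entry: dict) -> bool:
    """Send one entry; `client` is expected to be boto3.client("events")."""
    response = client.put_events(Entries=[entry])
    # PutEvents reports per-entry failures instead of raising for them.
    return response.get("FailedEntryCount", 0) == 0
```

Other agents would then subscribe through EventBridge rules that match on `Source` and `DetailType`. Passing the client in as an argument keeps the function easy to test without AWS credentials.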

How we built it

Architecture:

  • AI Engine: AWS Bedrock with Claude 3.5 Sonnet for multi-agent reasoning
  • Embeddings: Titan Embeddings V2 (768-dimensional vectors) for RAG system
  • Infrastructure: Amazon EKS cluster with spot instances for cost optimization
  • Communication: AWS EventBridge for event-driven agent coordination
  • Knowledge Base: 42 Australian regulatory documents (ASIC, APRA, AUSTRAC, AFCA) indexed with RAG
  • Base Application: Extended Bank of Anthos (Google's demo banking app) with AI agent integration

Development Process:

  • Agent Design: Researched banking security workflows and designed 7 specialized agents with clear responsibilities and communication protocols
  • RAG System: Curated 42 authentic Australian banking regulatory documents, implemented chunking strategy (1000 chars, 200 overlap), and built OpenSearch Serverless integration for vector similarity search
  • AWS Integration: Configured IAM roles with IRSA (IAM Roles for Service Accounts), set up an EventBridge event bus for agent communication, and integrated the Bedrock API with retry logic and error handling
  • Agent Implementation: Developed Python-based agents with FastAPI endpoints, each with specialized prompts and decision-making logic
  • Deployment: Built a CI/CD pipeline with automated Docker builds, deployed to EKS with Kubernetes manifests, and created pause/resume scripts for cost management (~$2.40/day when running)
  • Testing & Refinement: Load testing with simulated transactions, prompt engineering for accuracy, and documentation
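The chunking strategy mentioned in the RAG step (1000 characters with a 200-character overlap) can be sketched as a sliding window. This is an illustrative version, not the project's code; the function name is an assumption, and it splits on raw character offsets without regard to sentence boundaries.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap their neighbours."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # each window starts 800 chars after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, the last 200 characters of each chunk reappear at the start of the next one, so a clause that straddles a boundary still appears whole in at least one chunk.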

Challenges we ran into

  1. Event-Driven Coordination Complexity: Getting 7 agents to work together without creating infinite loops or message storms was challenging. Solution: Implemented event filtering, TTL (time-to-live) on messages, and circuit breakers to prevent cascading failures.
  2. RAG Accuracy vs. Speed Trade-off: Initial RAG queries took 8-12 seconds due to embedding generation and vector search. Solution: Implemented caching for common queries, optimized chunk sizes, and switched to k-NN indexing (sufficient for 42 documents).
  3. Bedrock Rate Limits: During load testing, we hit Bedrock throttling limits (20 requests/second). Solution: Implemented exponential backoff with jitter, request queuing, and graceful degradation to cached responses.
  4. Cost Management: Running 7 agents 24/7 initially cost ~$150/day. Solution: Implemented cluster pause/resume (2-minute startup), used spot instances (60% savings), and created a Mock RAG mode ($0 cost) for development.
  5. Prompt Engineering for Compliance: Getting accurate regulatory citations required extensive prompt tuning, because Claude sometimes hallucinated document names. Solution: Strict RAG retrieval with citation validation and confidence scoring.
  6. IAM Permission Complexity: IRSA (IAM Roles for Service Accounts) required precise permission scoping across Bedrock, EventBridge, S3, and OpenSearch. Solution: Automated IAM policy generation following the principle of least privilege.
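The exponential backoff with jitter from the Bedrock rate-limit challenge can be sketched as follows. This is a generic "full jitter" variant under stated assumptions: `ThrottledError` is a hypothetical stand-in for whatever throttling exception the real client raises, and the delay values are illustrative rather than the ones the project used.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a provider throttling error."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on ThrottledError with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller degrade gracefully
            # The ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random fraction of the ceiling so that
            # concurrent agents do not all retry at the same instant.
            sleep(rng() * ceiling)
```

A caller could catch the final `ThrottledError` and fall back to a cached response, which matches the graceful degradation described above. The `sleep` and `rng` parameters exist only to make the function testable without real waiting.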

Accomplishments that we're proud of

  • Production-Ready Multi-Agent System: 7 agents working in harmony with 99.7% uptime during testing
  • Comprehensive RAG System: Indexed 42 real Australian banking regulations with accurate citations and confidence scores
  • 2-Minute Cold Start: Pause/resume functionality saves costs without sacrificing developer experience
  • Zero PII Leaks: Data Privacy Agent successfully detected and sanitized 100% of PII in test scenarios
  • Real-Time Fraud Detection: Transaction Risk Agent achieved 94% accuracy in detecting suspicious patterns (tested with 10,000+ simulated transactions)
  • Complete Documentation: 17 numbered documents covering architecture, deployment, technical implementation, and troubleshooting
  • Cost Optimization: Reduced operational costs from $150/day to $2.40/day with spot instances and pause functionality
  • Australian Compliance Focus: To our knowledge, the first open-source multi-agent AI system specifically designed for AUSTRAC/APRA/ASIC regulations

What we learned

Technical Insights:

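  • Multi-Agent Design Patterns: Loose coupling via event-driven architecture is essential. Direct agent-to-agent calls create brittle systems; EventBridge provides built-in delivery retries and a central point for recording events as an audit trail. It does not guarantee event ordering, so agents should not rely on the order in which events arrive.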
  • LLM Orchestration: Claude 3.5 Sonnet excels at reasoning tasks but needs structured prompts with examples. We learned to provide regulatory context, risk frameworks, and decision trees in system prompts.
  • RAG Engineering: Chunk size matters more than we expected. 1000 characters with 200 overlap preserved regulatory context while maintaining retrieval precision. Titan Embeddings V2 outperformed text-similarity-* models for legal text.
  • Cost vs. Accuracy: Not every query needs Claude 3.5 Sonnet. We implemented a "routing layer" that uses Claude Instant for simple queries and Sonnet for complex compliance decisions, cutting costs by 40%.
  • Kubernetes on AWS EKS: EKS with spot instances is production-ready but requires careful pod disruption budgets and graceful shutdown handling. We lost data twice before implementing proper signal handling.
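The "routing layer" idea from the Cost vs. Accuracy point can be sketched as a small function that picks a model per query. The heuristic here (keyword match plus a word-count threshold) is an assumption made for illustration, not the project's actual routing rule, and the model identifiers should be checked against the Bedrock model catalogue before use.

```python
import re

# Assumed Bedrock model identifiers; verify before relying on them.
SIMPLE_MODEL = "anthropic.claude-instant-v1"
COMPLEX_MODEL = "anthropic.claude-3-5-sonnet-20240620-v1:0"

# Hypothetical keyword list signalling a compliance-related question.
COMPLIANCE_KEYWORDS = {"austrac", "apra", "asic", "afca", "aml",
                       "compliance", "regulation", "regulatory"}


def choose_model(query: str, max_simple_words: int = 30) -> str:
    """Route short, non-compliance queries to the cheaper model."""
    # Tokenize into whole words so "aml" does not match inside "seamless".
    words = re.findall(r"[a-z]+", query.lower())
    mentions_compliance = any(word in COMPLIANCE_KEYWORDS for word in words)
    if mentions_compliance or len(words) > max_simple_words:
        return COMPLEX_MODEL
    return SIMPLE_MODEL
```

For example, `choose_model("What is my account balance?")` returns the cheaper model, while a question that mentions AUSTRAC is routed to Sonnet. A production router would likely need a more robust classifier than keyword matching.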

Domain Insights:

  • Banking Regulations are Interconnected: A single transaction may trigger AUSTRAC (AML), ASIC (consumer protection), and APRA (operational risk) rules simultaneously. Multi-agent systems naturally model this complexity.
  • False Positives Kill Trust: We initially tuned for high sensitivity (catching every possible fraud), but a 40% false positive rate frustrated users. Sweet spot: 94% detection with 8% false positives.
  • Explainability is Non-Negotiable: Banking compliance requires audit trails. Every agent decision includes a reasoning chain, regulatory citations, a confidence score, and an override mechanism.
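One way to picture the explainability requirement is a decision record that carries the reasoning chain, citations, confidence score, and override state together. The sketch below is an assumption about shape only: the class name, field names, and JSON audit format are hypothetical and not taken from the project.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class AgentDecision:
    """Hypothetical audit record for a single agent decision."""
    agent: str
    action: str
    reasoning: list[str]   # ordered reasoning steps
    citations: list[str]   # identifiers of retrieved regulatory documents
    confidence: float      # expected to lie between 0.0 and 1.0
    overridden_by: Optional[str] = None

    def override(self, operator: str, new_action: str) -> None:
        """Record a human override while keeping the original action in the trail."""
        self.reasoning.append(
            f"Overridden by {operator}: {self.action} -> {new_action}"
        )
        self.action = new_action
        self.overridden_by = operator

    def to_audit_line(self) -> str:
        """Serialize as one JSON line suitable for an append-only log."""
        return json.dumps(asdict(self), sort_keys=True)
```

Appending the override to the reasoning chain, instead of replacing the record, means the log still shows what the agent originally decided and who changed it.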

What's next for NFR Guild

  • Advanced Analytics Dashboard: Build CloudWatch dashboard with agent performance metrics, fraud trends, and compliance heatmaps
  • Internal Adoption: Propose the solution to the bank where I work and encourage them to build it out for their own operations
