Inspiration 💡
Customer churn costs telecom companies billions every year. The worst part? Most companies only react when customers are already trying to cancel. We wanted to flip this around - what if an AI agent could identify at-risk customers early and proactively offer them personalized deals?
With Amazon Bedrock AgentCore launching, we saw the perfect opportunity to build a truly autonomous agent that reasons, makes decisions, and takes actions across multiple tools. This hackathon pushed us to explore what's possible when you combine Runtime, Gateway, and Memory.
What It Does 🎯
The Customer Retention Agent analyzes customer churn risk in real time and generates personalized retention offers. When a customer asks for a discount, the agent autonomously:
- Queries their data from Amazon Athena (using the Kaggle Telco dataset with 7,000+ customers)
- Analyzes their churn risk score and usage patterns
- Generates a personalized discount code based on their risk level
- Remembers the conversation for future interactions using AgentCore Memory
The agent uses Claude 3.7 Sonnet for reasoning and makes multi-step decisions without explicit instructions - true autonomous behavior.
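The multi-step flow above can be sketched in plain Python. All function names, field names, and discount codes here are illustrative stand-ins for the real AgentCore tool calls, not our actual implementation:

```python
# Illustrative sketch of the agent's decision sequence; the real agent
# performs each step as a separate AgentCore tool call.

def query_churn_data(customer_id: str) -> dict:
    """Stand-in for the Athena-backed churn-data tool (fake records here)."""
    fake_dataset = {
        "CUST-001": {"cancel_intent": 0.82, "tenure_months": 3},
        "CUST-002": {"cancel_intent": 0.15, "tenure_months": 48},
    }
    return fake_dataset[customer_id]

def risk_tier(record: dict) -> str:
    """Map the synthetic churn score to a coarse risk tier."""
    score = record["cancel_intent"]
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"

def retention_offer(tier: str) -> str:
    """Stand-in for the retention-offer tool: tier-based discount codes."""
    discounts = {"high": "SAVE30", "medium": "SAVE15", "low": "SAVE5"}
    return discounts[tier]

record = query_churn_data("CUST-001")
print(retention_offer(risk_tier(record)))  # high-risk customer -> SAVE30
```

The point of the sketch is the chaining: the output of one tool (the churn record) feeds the next (the offer), which is exactly the sequencing the agent has to decide on by itself.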
How We Built It 🛠️
Architecture: Five layers working together
- Frontend: Next.js chat interface deployed on Vercel
- Authentication: Dual Cognito setup - one for users, one for agent-to-Gateway communication
- AI Agent: AgentCore Runtime + Memory (with 3 strategies: preferences, semantic, summarization)
- Tools: Three Lambda functions via AgentCore Gateway (MCP protocol):
  - Churn Data Query (queries Athena)
  - Retention Offer (generates discount codes)
  - Web Search (DuckDuckGo API)
- Data: Athena + S3 for customer data, Bedrock Knowledge Base for policies
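To make the tool layer concrete, here is a hedged sketch of what the retention-offer Lambda behind Gateway might look like. The event shape is an assumption (we assume Gateway delivers the tool arguments as a flat dict); the tiers and codes are placeholders:

```python
import json

# Hedged sketch of a Gateway-fronted tool Lambda. The exact event shape
# AgentCore Gateway delivers is assumed here, not taken from our code.

DISCOUNTS = {"high": "SAVE30", "medium": "SAVE15", "low": "SAVE5"}

def lambda_handler(event, context):
    # Assumed: the agent's tool arguments arrive as top-level event keys.
    tier = event.get("risk_tier", "low")
    code = DISCOUNTS.get(tier, "SAVE5")
    return {
        "statusCode": 200,
        "body": json.dumps({"discount_code": code, "risk_tier": tier}),
    }
```

Because Gateway exposes the function over MCP, the agent just sees a named tool with a schema; the Lambda itself stays a plain handler.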
Key Implementation Details:
- Synthetic Churn Model: We use `cancel_intent` from the dataset as our "pretend ML model" - it gives us realistic churn predictions without training a separate model
- JWT Mapping: Cognito user IDs get mapped to customer IDs in the dataset for queries
- RAG Implementation: User Query → Agent → Knowledge Base → Retrieved Context → Enhanced Response
- Local Development: Used `agentcore invoke --local` for testing before deploying
- Serverless Runtime: AgentCore Runtime handles auto-scaling and deployment
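The JWT-mapping step can be sketched like this. Everything here is illustrative: the mapping table, the customer IDs, and the Athena table/column names are made up, and the payload is decoded without signature verification purely for demonstration:

```python
import base64
import json

# Sketch: read the Cognito `sub` claim from an ID token, map it to a
# customer ID, then build the Athena query. Names are illustrative.

SUB_TO_CUSTOMER = {"a1b2c3d4-0000-0000-0000-000000000000": "7590-VHVEG"}

def sub_from_jwt(token: str) -> str:
    """Decode the (unverified) JWT payload segment and read the sub claim."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))["sub"]

def build_athena_query(customer_id: str) -> str:
    # Strip quotes before interpolating into the SQL string.
    safe_id = customer_id.replace("'", "")
    return f"SELECT * FROM telco_churn WHERE customerID = '{safe_id}'"
```

In production the token signature must of course be verified against the Cognito JWKS before the claim is trusted; this sketch only shows the mapping.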
Challenges We Faced 🚧
1. Dual Authentication Complexity Setting up two Cognito clients (web + M2M) was tricky. The web client authenticates users, while the M2M client lets the agent call Gateway tools. Getting the OAuth flows and token scopes right took several iterations.
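The M2M leg of that dual setup is a standard OAuth client-credentials exchange against Cognito's token endpoint. A minimal sketch, with placeholder domain, client ID, and scope (it only builds the request; nothing is sent):

```python
from urllib.parse import urlencode

# Sketch of the M2M token request: the agent trades its Cognito app-client
# credentials for an access token scoped to Gateway. Values are placeholders.

def build_token_request(domain: str, client_id: str, client_secret: str,
                        scope: str) -> tuple[str, str]:
    """Return (url, form_body) for Cognito's client_credentials grant."""
    url = f"https://{domain}/oauth2/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
    return url, body
```

The body is POSTed with `Content-Type: application/x-www-form-urlencoded`; Gateway then validates the scopes on the returned JWT, which is where most of our iteration went.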
2. Cold Start Issues The first request to our chat app often failed with timeouts. Classic cold start problem with serverless - the Runtime takes time to spin up. We learned to handle this gracefully with better error messages and retry logic.
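The retry logic boils down to exponential backoff around the first request. A simplified sketch (the retried exception and delays are illustrative):

```python
import time

# Sketch of the cold-start mitigation: retry the request with exponential
# backoff instead of surfacing the first timeout to the user.

def call_with_retry(fn, attempts: int = 3, base_delay: float = 0.5,
                    sleep=time.sleep):
    """Call fn(), retrying on TimeoutError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller show a real error
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

Injecting `sleep` keeps the helper testable; in the app, the final failure path is where the friendlier error message goes.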
3. Multi-Step Tool Calling
Getting the agent to call churn_data_query first, then pass that data to retention_offer required very explicit prompt engineering. LLMs need clear instructions about sequential workflows.
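The "explicit prompt engineering" amounted to naming the tool order in the system prompt. The wording below is an illustrative reconstruction, not our exact shipped prompt:

```python
# Illustrative system-prompt fragment that makes the tool sequence explicit.

SYSTEM_PROMPT = """You are a customer retention agent.
When a customer asks about discounts, ALWAYS follow this order:
1. Call churn_data_query with the customer's ID.
2. Pass the returned churn risk into retention_offer.
Never call retention_offer before churn_data_query.
"""

def tool_order_ok(prompt: str) -> bool:
    """Sanity check: churn_data_query is mentioned before retention_offer."""
    return prompt.index("churn_data_query") < prompt.index("retention_offer")
```

Spelling out the dependency ("pass the returned churn risk into...") was what finally made the model chain the calls reliably.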
4. Bedrock Model Throughput We initially wanted on-demand throughput but discovered it isn't supported for every model, so we had to adjust our model selection and provisioning strategy.
5. SSM Permissions The auto-created Runtime execution role didn't have SSM Parameter Store access. Quick fix with an inline IAM policy, but it caught us off guard during deployment.
What We Learned 📚
Technical:
- AgentCore primitives (Runtime, Gateway, Memory) work incredibly well together
- MCP protocol makes Lambda functions easily accessible as agent tools
- Memory strategies matter - use USER_PREFERENCE for explicit data, SEMANTIC for conversation context
- Always use boto3 sessions properly in Lambda functions for AWS SDK calls
- Local testing with `agentcore invoke --local` is a game-changer for development
Architectural:
- Serverless doesn't mean instant - plan for cold starts
- Dual authentication is complex but necessary for production-grade agents
- RAG + tools + memory = powerful combination for real-world applications
- Synthetic data works great for demos when you don't have real ML models
Business:
- AI agents can solve concrete problems, not just answer questions
- Proactive customer engagement beats reactive support every time
What's Next 🚀
If we continue this project:
- Security: Make Lambdas private, set up VPC and subnets for better isolation
- Human Oversight: Add content moderation, safety filters, and policy checks for responsible AI
- Real-Time Alerts: Notify customer service teams when high-risk customers are detected
- Analytics Dashboard: Track churn prevention effectiveness and ROI
- Confluence Integration: Use Bedrock Knowledge Base's Confluence connector for live policy updates
Conclusion 🎉
Building this agent taught us that autonomous AI is production-ready today. With AgentCore, we went from idea to working demo in record time. The hardest parts weren't the AI - they were the authentication, cold starts, and getting all the AWS services to play nicely together.
This agent demonstrates real business value and shows what's possible when you combine reasoning LLMs with proper infrastructure. We're excited about the future of autonomous agents!
Built With
- amazon-athena
- amazon-cognito
- amazon-web-services
- aws-bedrock-agentcore
- aws-lambda
- bedrock-knowledge-base
- claude-3.7-sonnet
- duckduckgo-api
- kaggle
- mcp-protocol
- next.js
- python
- react
- strands-framework
- tailwind-css
- typescript
- vercel