Inspiration

This project was born from a deeply personal place. Friends and family — especially older relatives — have fallen victim to phone scams. Watching someone you love lose their savings to a stranger pretending to be from the bank is devastating. The worst part? By the time they realize it's a scam, the money is gone.

Every 11 seconds, someone in the US falls victim to a phone scam. Seniors lose an average of $34,200 per incident. Current defenses (call blockers, carrier filters) only catch known numbers; they can't understand what's being said. We built Guardiano (Italian for "guardian") to listen in real time and catch scams as they happen, before the victim sends a single dollar.

What it does

Guardiano is a real-time AI guardian that listens to live phone calls, transcribes them instantly, and detects fraud patterns as they happen.

  • Live Audio Mode: Stream a phone call through your browser microphone → get instant transcription + scam alerts with risk levels, detected patterns, and recommended actions.
  • Text Analysis Mode: Paste any transcript → get an instant multilingual fraud assessment.
  • Multilingual: Automatically detects the language being spoken and responds in the same language (English, Spanish, Portuguese, and more).
  • Alert System: When the threat level exceeds medium, Guardiano barks (literally, a synthesized guard dog bark) to alert the user. Because every family deserves a digital guard dog watching over them.
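
The alert rule above ("exceeds medium") can be sketched as an ordered comparison. The level names and the should_bark helper are hypothetical, chosen only for illustration; the write-up does not list the actual risk levels.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Ordered risk levels (hypothetical names, not taken from the project)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def should_bark(level: RiskLevel) -> bool:
    """Return True only when the level is strictly above MEDIUM."""
    return level > RiskLevel.MEDIUM
```

With this ordering, a MEDIUM result stays silent and anything higher triggers the bark.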

How we built it

Amazon Nova Sonic handles real-time speech-to-text via bidirectional streaming. Audio is captured from the browser microphone, downsampled from the capture rate (44.1 or 48 kHz, depending on the device) to 16 kHz PCM16, and streamed over WebSocket to our FastAPI backend. The backend opens a bidirectional stream with Nova Sonic, sending audio chunks and receiving transcription events in real time.
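
In the project, downsampling runs client-side in JavaScript. The Python sketch below only illustrates the arithmetic of that step: resampling float samples to 16 kHz and packing them as 16-bit PCM. The function names are hypothetical, and linear interpolation is an assumption made for brevity; a production pipeline would normally low-pass filter before reducing the rate to limit aliasing.

```python
import struct

TARGET_RATE = 16_000  # output rate named in the write-up


def downsample(samples: list[float], source_rate: int,
               target_rate: int = TARGET_RATE) -> list[float]:
    """Resample by linear interpolation between neighbouring input samples."""
    if source_rate == target_rate or not samples:
        return list(samples)
    # Integer arithmetic avoids float rounding in the output length.
    out_len = len(samples) * target_rate // source_rate
    out = []
    for i in range(out_len):
        pos = i * source_rate / target_rate      # fractional index into the input
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the final sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def to_pcm16(samples: list[float]) -> bytes:
    """Clamp floats to [-1.0, 1.0] and pack as little-endian signed 16-bit ints."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)
```

One second of 44.1 kHz or 48 kHz input yields 16,000 output samples, or 32,000 bytes once packed.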

Amazon Nova Lite powers the fraud reasoning engine. Each transcription chunk is analyzed against 7+ scam pattern categories (urgency pressure, impersonation, gift card requests, threats, personal data requests, too-good-to-be-true offers, isolation tactics). It returns structured JSON with risk level, confidence score, detected patterns, and actionable recommendations — all in the transcript's detected language.
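
As a rough illustration of handling that structured JSON, the sketch below validates a model reply before it reaches the browser. The field names (risk_level, confidence, detected_patterns, recommendations) and the set of risk levels are assumptions, not the project's actual schema. The project lists pydantic in its stack; plain dataclasses are used here only to keep the example dependency-free.

```python
import json
from dataclasses import dataclass, field

# Hypothetical set of allowed levels; the write-up does not enumerate them.
RISK_LEVELS = ("low", "medium", "high", "critical")


@dataclass
class FraudAssessment:
    risk_level: str
    confidence: float
    detected_patterns: list[str] = field(default_factory=list)
    recommendations: list[str] = field(default_factory=list)


def parse_assessment(raw: str) -> FraudAssessment:
    """Parse and validate a JSON reply; raise ValueError on malformed content."""
    data = json.loads(raw)
    risk = str(data.get("risk_level", "")).lower()
    if risk not in RISK_LEVELS:
        raise ValueError(f"unexpected risk level: {risk!r}")
    confidence = float(data.get("confidence", 0.0))
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return FraudAssessment(
        risk_level=risk,
        confidence=confidence,
        detected_patterns=[str(p) for p in data.get("detected_patterns", [])],
        recommendations=[str(r) for r in data.get("recommendations", [])],
    )
```

Rejecting replies with an unknown level or an out-of-range score keeps a malformed model response from triggering a false alert.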

The fraud analysis runs as a fire-and-forget async task so it never blocks the real-time transcription flow.
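
A minimal sketch of that fire-and-forget pattern with asyncio.create_task follows. The analyze_fraud coroutine is a hypothetical stand-in for the Nova Lite call, and the events list stands in for messages sent to the browser. Holding task references in a set follows the Python documentation's advice, since the event loop keeps only weak references to tasks.

```python
import asyncio

# Strong references so pending tasks are not garbage-collected mid-flight.
background_tasks: set[asyncio.Task] = set()


async def analyze_fraud(chunk: str, events: list[str]) -> None:
    """Stand-in for the slow model call (hypothetical)."""
    await asyncio.sleep(0.05)
    events.append(f"analysis:{chunk}")


def schedule_analysis(chunk: str, events: list[str]) -> None:
    """Start analysis in the background and return immediately."""
    task = asyncio.create_task(analyze_fraud(chunk, events))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)


async def handle_transcripts(chunks: list[str], events: list[str]) -> None:
    for chunk in chunks:
        schedule_analysis(chunk, events)
        events.append(f"sent:{chunk}")  # delivery is not delayed by analysis
    # Demo only: wait for outstanding analyses before the loop shuts down.
    await asyncio.gather(*background_tasks)
```

Running asyncio.run(handle_transcripts(["a", "b"], events)) records both "sent" entries before any "analysis" entry, which is the non-blocking behavior described above.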

Infrastructure: Deployed on AWS ECS Fargate with CloudFront for HTTPS. CI/CD via GitHub Actions builds a Docker image, pushes to ECR, and deploys automatically on every push.

Tech stack: Python 3.13, FastAPI, WebSocket, Vanilla JS, Web Audio API, Amazon Bedrock Runtime (Smithy SDK for Nova Sonic, boto3 for Nova Lite), Docker, Terraform, GitHub Actions.

Challenges we ran into

  • Nova Sonic SDK: It uses a Smithy-based SDK (aws-sdk-bedrock-runtime) instead of boto3, requiring Python 3.12+. Getting bidirectional streaming working with proper SigV4 authentication and the correct audio format took significant debugging.
  • Audio pipeline: Browser microphones capture at 44100/48000 Hz, but Nova Sonic expects 16kHz. We implemented client-side downsampling in JavaScript using the Web Audio API.
  • Async orchestration: Running Nova Sonic's bidirectional stream, audio capture from the browser WebSocket, transcription delivery, and fraud analysis concurrently without blocking required careful asyncio task management with create_task for fire-and-forget patterns.
  • Fraud analysis timing: Nova Lite takes 2-3 seconds per analysis. Initially this blocked transcription delivery. We solved it by running analysis as background tasks that send results to the browser independently.

Accomplishments that we're proud of

  • Transcription appears on screen less than 3 seconds after the words are spoken
  • Fraud analysis runs in parallel without ever blocking transcriptions
  • Multilingual detection — speak in Spanish, get the full explanation in Spanish
  • The guard dog bark alert is surprisingly effective at getting attention
  • Clean, decoupled architecture: STT module, fraud detector, and WebSocket handler are independent components
  • Full CI/CD pipeline: push to GitHub → Docker build → ECR → ECS deploy

What we learned

  • Nova Sonic's bidirectional streaming is incredibly powerful for real-time voice applications — the latency is impressive
  • Nova Lite's reasoning capabilities are strong enough to detect nuanced scam patterns with high confidence (0.95+ on obvious scams)
  • The combination of Nova Sonic + Nova Lite creates a compelling real-time AI pipeline that could genuinely protect people
  • Building production-grade async Python with WebSocket, streaming APIs, and concurrent tasks requires careful orchestration, but Python's asyncio is up to the task

What's next for Guardiano

  • Mobile app: Android/iOS call listener that runs Guardiano in the background during every phone call
  • Carrier integration: Deploy at the carrier level for automatic protection — no app needed
  • Elder-care mode: Automatic alerts sent to family members when scams are detected, so loved ones can intervene
  • Multi-language expansion: Full support for Mandarin, Hindi, Arabic, French
  • Call recording: Save flagged calls as evidence for reporting to authorities
  • Community scam database: Aggregate detected patterns to warn others about emerging scam tactics

Built With

  • amazon-bedrock
  • amazon-nova-lite
  • amazon-nova-sonic
  • aws-cloudfront
  • aws-ecr
  • aws-ecs-fargate
  • docker
  • fastapi
  • github-actions
  • javascript
  • pydantic
  • python
  • terraform
  • web-audio-api
  • websocket