Inspiration

During my psychology studies, I noticed a critical bottleneck in how clinical micro-skills (like active listening, Socratic questioning, and managing resistance) are taught. Trainees rely heavily on peer role-play, which is predictable and safe but entirely lacks the genuine emotional volatility of a real clinical session. Conversely, practicing on real patients while inexperienced risks damaging the therapeutic alliance.

I wanted to bridge this gap. I was inspired to create a "flight simulator" for therapists—a low-stakes, highly realistic environment where trainees can face genuine clinical resistance and practice their Cognitive Behavioral Therapy (CBT) skills before they ever sit across from a real patient.

What it does

Resonance CBT is a real-time, voice-to-voice clinical training simulator.

Instead of a standard text chatbot, users interact via voice with a hyper-realistic AI patient persona (e.g., "John," a 35-year-old struggling with severe catastrophizing and work anxiety). The AI is explicitly engineered with specific cognitive distortions and defense mechanisms. It reacts dynamically to the trainee's vocal pacing and choice of interventions—softening when the trainee uses a good "reflection of feeling," and becoming defensive if the trainee asks blunt, judgmental questions.

How I built it

I designed Resonance CBT with a focus on ultra-low latency and clinical accuracy, utilizing a modern Python/AWS stack:

  • The Interface: A minimalist, vanilla JavaScript and HTML frontend featuring a push-to-talk interface that captures the user's microphone input as raw audio bytes.
  • The Backend Pipeline: A Python/FastAPI server handles the audio routing. It uses the boto3 SDK to pass the audio data straight to AWS Bedrock.
  • The AI Engine:
    • Amazon Nova Sonic: Powers the voice-to-voice interaction. By leveraging Sonic's native multimodal capabilities, the system bypasses traditional, slow speech-to-text-to-speech pipelines, allowing the "patient" to respond with natural pacing, sighs, and defensive tones.
    • Amazon Nova Lite: Operates silently in the background as the "Clinical Supervisor." It analyzes the session transcript against a strict CBT evaluation rubric, scoring the trainee's application of micro-skills and outputting structured JSON feedback.
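To make the "Clinical Supervisor" step concrete, here is a minimal sketch of how the Nova Lite analysis pass might look. It assumes the Bedrock Converse API via boto3; the rubric categories, prompt wording, and helper names are illustrative, not the project's actual code, and the model ID is an assumption.

```python
import json

# Hypothetical CBT rubric categories; the real evaluation rubric is richer.
RUBRIC = ["reflection_of_feeling", "socratic_questioning", "managing_resistance"]

def build_supervisor_prompt(transcript: str) -> str:
    """Wrap the session transcript in scoring instructions for the model."""
    skills = ", ".join(RUBRIC)
    return (
        "You are a clinical supervisor. Score the trainee's use of these "
        f"CBT micro-skills ({skills}) from 1-5 and respond with JSON only, "
        'shaped like {"scores": {...}, "comments": [...]}.\n\n'
        f"Transcript:\n{transcript}"
    )

def parse_feedback(model_text: str) -> dict:
    """Parse the model's JSON reply, tolerating stray text around the object."""
    start, end = model_text.find("{"), model_text.rfind("}")
    return json.loads(model_text[start : end + 1])

def score_session(transcript: str, client) -> dict:
    """Send the transcript to Nova Lite through the Bedrock Converse API."""
    response = client.converse(
        modelId="amazon.nova-lite-v1:0",  # assumed model ID
        messages=[{"role": "user",
                   "content": [{"text": build_supervisor_prompt(transcript)}]}],
    )
    return parse_feedback(response["output"]["message"]["content"][0]["text"])
```

In practice `client` would be `boto3.client("bedrock-runtime")`, and the parsed dict is what the planned Supervisor Dashboard would render.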

Challenges I ran into

  1. Audio Byte Routing: Passing raw .wav audio chunks directly from a browser microphone through a FastAPI backend and into the AWS Bedrock API required precise handling of multipart form data and base64 encoding to prevent corruption.
  2. Minimizing Latency: Maintaining the illusion of a real human conversation meant fighting latency at every step. The total latency constraint had to account for network transmission, model inference, and audio synthesis: $$L_{total} = t_{network} + t_{inference} + t_{synthesis}$$ Using Nova Sonic was the breakthrough here, as its low Time-to-First-Token (TTFT) kept $L_{total}$ low enough to prevent awkward, immersion-breaking pauses.
  3. Reverse Prompt Engineering: Standard LLMs are trained to be helpful and compliant assistants. Engineering the system prompt to force the model to be resistant, evasive, and clinically challenging required deep psychological profiling within the system instructions.
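The byte-routing fix in challenge 1 boils down to making the encode/decode round-trip lossless: base64 lets raw audio bytes survive JSON and form-data transport without corruption. A minimal sketch (function names are illustrative):

```python
import base64

def encode_chunk(raw_audio: bytes) -> str:
    """Base64-encode a raw audio chunk so it survives text-based transport."""
    return base64.b64encode(raw_audio).decode("ascii")

def decode_chunk(payload: str) -> bytes:
    """Recover the original bytes on the server before handing them to Bedrock."""
    return base64.b64decode(payload)

# Round-trip check: encoding then decoding must be lossless, otherwise
# the audio reaches the model corrupted and transcription degrades.
chunk = bytes(range(256))  # stand-in for a .wav chunk from the browser
assert decode_chunk(encode_chunk(chunk)) == chunk
```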

Accomplishments that I'm proud of

I am incredibly proud of successfully translating abstract psychological concepts—like cognitive distortions and defense mechanisms—into strict machine-readable parameters. I didn't just build a fast voice bot; I built an AI that genuinely feels like a frustrated, catastrophizing patient. Getting the model to actively "punish" bad therapeutic questions with defensive audio responses while rewarding good Socratic questioning was a massive win for the project's real-world utility.

What I learned

  • Cloud Architecture: I gained hands-on experience configuring AWS Identity and Access Management (IAM) and securely interacting with the AWS Bedrock runtime via boto3.
  • Agentic AI: I learned how to orchestrate multiple foundation models within a single application—using a high-speed audio model for the frontend experience and a cost-efficient text model for backend analytical processing.
  • Prompting for Persona: I learned how to translate clinical heuristics into highly steerable system prompts that override a foundation model's default "helpful assistant" behavior.
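The persona-prompting lesson above can be sketched as a small builder that turns clinical parameters into system instructions. The field names and wording here are hypothetical stand-ins, not the project's actual schema; only the persona details ("John," 35, catastrophizing) come from the writeup.

```python
# Hypothetical persona spec; field names are illustrative.
PERSONA = {
    "name": "John",
    "age": 35,
    "distortions": ["catastrophizing"],
    "defenses": ["deflecting with work talk", "defensive one-word answers"],
}

def build_persona_prompt(p: dict) -> str:
    """Translate clinical parameters into system instructions that override
    the model's default helpful-assistant behavior."""
    lines = [
        f"You are {p['name']}, a {p['age']}-year-old therapy patient.",
        "You are NOT an assistant. Never offer help, advice, or summaries.",
        "Exhibit these cognitive distortions in your speech: "
        + ", ".join(p["distortions"]) + ".",
        "When questions feel blunt or judgmental, fall back on these "
        "defense mechanisms: " + ", ".join(p["defenses"]) + ".",
        "Soften only when the trainee accurately reflects your feelings.",
    ]
    return "\n".join(lines)
```

The explicit "you are NOT an assistant" framing is one way to counteract the compliance training described in challenge 3.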

What's next for Resonance CBT

The immediate next step is building out the Supervisor Dashboard to visually render the JSON feedback from Nova Lite, giving students timestamped insights into their performance. From there, I plan to expand the persona library to include patients presenting with personality disorders, trauma-informed care needs, and complex behavioral resistance, eventually pitching the platform to university counselor training programs.

Built With

  • python
  • fastapi
  • javascript
  • html
  • boto3
  • amazon-bedrock
  • amazon-nova-sonic
  • amazon-nova-lite
