EchoMind
What I built
EchoMind is a real-time AI speaking coach for English learners that tolerates Vietnamese/English code-switching.
Users speak in a live room, get streaming transcript updates, instant coaching hints, and a finalized session report.
Core experience
- Live microphone capture in the frontend, streamed to the backend over WebSocket.
- Segment-based speech-to-text using VALSEA transcription.
- Real-time coaching signals (vocabulary, rewrites, hints, sentiment snapshots).
- Session finalize flow that returns structured feedback for the summary view.
Technical architecture
- Frontend: React + Vite (MVP/frontend).
- Backend: Django + DRF (MVP/backend).
- Realtime transport: WebSocket session channel (/ws/sessions/{id}/transcript/).
- AI integrations:
  - VALSEA for transcription and optional text enrichments.
  - AWS Bedrock for live coaching + deeper final analysis.
Important implementation details
- Fast-speech robustness tuning:
  - Configurable segment timing/size thresholds.
  - Optional overlap stitching between audio segments.
  - Retry/cooldown behavior for low-yield chunks.
- No-DB runtime mode:
  - Replaced runtime session persistence with in-memory TTL store.
  - Keeps API and WS contracts stable for quick local/hackathon operation.
  - Data resets on backend restart (intentional tradeoff for speed).
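To make the no-DB mode concrete, here is a minimal sketch of an in-memory store with per-entry expiry. It is not EchoMind's actual code: the class name `TTLSessionStore`, its methods, and the injectable `clock` argument are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLSessionStore:
    """Hypothetical in-memory session store; entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}

    def set(self, session_id: str, data: Any) -> None:
        # Store the payload together with its absolute expiry time.
        self._items[session_id] = (self._clock() + self._ttl, data)

    def get(self, session_id: str) -> Optional[Any]:
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            # Lazily evict expired entries on read.
            del self._items[session_id]
            return None
        return data

    def purge_expired(self) -> int:
        # Remove every expired entry and report how many were dropped.
        now = self._clock()
        expired = [key for key, (exp, _) in self._items.items() if now >= exp]
        for key in expired:
            del self._items[key]
        return len(expired)
```

Because everything lives in a plain dictionary, the contents vanish on restart, which matches the tradeoff described above. This sketch is not thread-safe; a real deployment would likely need a lock around the dictionary.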
What is configurable
- STT + coaching behavior through .env flags (models, language, retries, overlap, min chunk thresholds).
- In-memory session retention with SESSION_STORE_TTL_SECONDS.
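As an illustration of how such flags might be read at startup, the sketch below parses a handful of settings from an environment mapping. Only `SESSION_STORE_TTL_SECONDS` comes from the project description. The other variable names (`STT_LANGUAGE`, `STT_MAX_RETRIES`, `STT_OVERLAP_ENABLED`, `STT_MIN_CHUNK_MS`) and all default values are hypothetical placeholders.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeConfig:
    session_store_ttl_seconds: int
    stt_language: str
    stt_max_retries: int
    overlap_enabled: bool
    min_chunk_ms: int


def _as_bool(raw: str) -> bool:
    # Accept a few common truthy spellings; everything else is False.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env: Optional[Mapping[str, str]] = None) -> RuntimeConfig:
    """Build a config from environment variables (names mostly assumed)."""
    source = os.environ if env is None else env
    return RuntimeConfig(
        # The only variable name taken from the project description.
        session_store_ttl_seconds=int(source.get("SESSION_STORE_TTL_SECONDS", "3600")),
        # Hypothetical names and defaults below.
        stt_language=source.get("STT_LANGUAGE", "en"),
        stt_max_retries=int(source.get("STT_MAX_RETRIES", "2")),
        overlap_enabled=_as_bool(source.get("STT_OVERLAP_ENABLED", "false")),
        min_chunk_ms=int(source.get("STT_MIN_CHUNK_MS", "500")),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test, for example `load_config({"SESSION_STORE_TTL_SECONDS": "120"})`.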
Known constraints
- In-memory session store is not durable and not shared across multiple backend processes.
- Realtime quality depends on microphone quality, speaking pace, and segment tuning values.
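One place where segment tuning shows up is the optional overlap stitching mentioned under implementation details. The description does not say whether stitching happens on audio or on text, so the sketch below assumes a text-level approach: when consecutive segments overlap, the words repeated at the boundary are dropped. The function name `stitch_transcripts` and the `max_overlap_words` parameter are hypothetical.

```python
def stitch_transcripts(previous: str, current: str, max_overlap_words: int = 8) -> str:
    """Join two segment transcripts, removing words duplicated at the boundary."""

    def normalize(word: str) -> str:
        # Compare case-insensitively and ignore trailing/leading punctuation.
        return word.lower().strip(".,!?")

    prev_words = previous.split()
    cur_words = current.split()
    limit = min(max_overlap_words, len(prev_words), len(cur_words))

    # Try the longest candidate overlap first, then shrink.
    for size in range(limit, 0, -1):
        tail = [normalize(w) for w in prev_words[-size:]]
        head = [normalize(w) for w in cur_words[:size]]
        if tail == head:
            return " ".join(prev_words + cur_words[size:])

    # No shared boundary words: plain concatenation.
    return " ".join(prev_words + cur_words)
```

For example, stitching "I went to the market" with "the market to buy fruit" drops the repeated "the market". A larger `max_overlap_words` tolerates longer audio overlaps at the cost of more comparisons per segment.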
Why this build is practical
We focused on delivering a working end-to-end coaching loop with low setup friction:
- quick startup,
- real-time feedback,
- and useful final coaching artifacts from the same conversation session.
Built With
- agora
- amazon-web-services
- python
- react
- valsea