This project pushed me to bring together multiple advanced AI components and make them work in harmony, all within a clean, testable cloud-native setup.
Full Breakdown:
• LLM Feedback: GPT-4o on Azure OpenAI for resume and interview feedback
• RAG Search: Azure Cognitive Search + FastAPI backend for contextual resume Q&A (see the sketch after this list)
• Speech-to-Text: OpenAI Whisper for real-time audio/video transcription
• Emotion Detection: pyannote.audio (speaker embedding pipeline) for tone analysis
• Infra: PostgreSQL Flexible Server, Blob Storage, GitHub Actions CI/CD, Azure CLI
• Frontend: React + Vite with a dark/light mode toggle
• Testing: unit + integration tests, GitHub workflows, Git LFS-managed audio test fixtures
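To give a feel for how the RAG piece hangs together, here is a minimal sketch of a resume Q&A endpoint: FastAPI takes a question, pulls the top few matching chunks from Azure Cognitive Search, and hands them to GPT-4o on Azure OpenAI. The index name, field names, and environment variables are illustrative placeholders, not the project's actual configuration.

```python
# Minimal RAG Q&A sketch. "resumes", the "content" field, and the env
# var names are hypothetical; swap in your own configuration.
import os

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from fastapi import FastAPI
from openai import AzureOpenAI
from pydantic import BaseModel

app = FastAPI()

search = SearchClient(
    endpoint=os.environ["SEARCH_ENDPOINT"],
    index_name="resumes",  # hypothetical index name
    credential=AzureKeyCredential(os.environ["SEARCH_KEY"]),
)
llm = AzureOpenAI(
    azure_endpoint=os.environ["AOAI_ENDPOINT"],
    api_key=os.environ["AOAI_KEY"],
    api_version="2024-02-01",
)

class Question(BaseModel):
    question: str

@app.post("/ask")
def ask(q: Question) -> dict:
    # Retrieve only the top few matching resume chunks and truncate
    # them, to keep the prompt (and token spend) small.
    hits = search.search(search_text=q.question, top=3)
    context = "\n---\n".join(doc["content"][:1000] for doc in hits)

    resp = llm.chat.completions.create(
        model="gpt-4o",  # name of the Azure OpenAI deployment
        messages=[
            {"role": "system",
             "content": "Answer using only the resume excerpts provided."},
            {"role": "user",
             "content": f"Excerpts:\n{context}\n\nQuestion: {q.question}"},
        ],
        max_tokens=300,  # cap the reply length
    )
    return {"answer": resp.choices[0].message.content}
```

Capping retrieval at three chunks and the reply at 300 tokens is one simple way to keep latency and token usage inside free-tier budgets.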
Challenges Faced:
• Whisper and the pyannote emotion pipeline required optimization to run acceptably on CPU (sketched below)
• RAG integration had to be fast, accurate, and light on token usage
• Getting CI/CD and testing to work with real audio input took iteration and care
• All of it had to fit inside the free Azure tier limits, with no compromises
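For the CPU challenge, here is a minimal sketch of the kind of setup that works: a small Whisper checkpoint with half-precision disabled, plus pyannote.audio's pretrained speaker-embedding model for the tone-analysis side. The model sizes, file path, and token handling are assumptions, not the project's exact settings.

```python
# CPU-friendly transcription + speaker embedding sketch.
import whisper
from pyannote.audio import Inference, Model

AUDIO = "interview_clip.wav"  # placeholder path

# Smaller Whisper checkpoints ("tiny"/"base") keep CPU latency tolerable.
# fp16=False skips the half-precision path, which Whisper only supports
# on GPU (on CPU it would warn and fall back to fp32 anyway).
asr = whisper.load_model("base")
transcript = asr.transcribe(AUDIO, fp16=False)["text"]

# pyannote's pretrained embedding model runs on CPU; window="whole"
# yields one fixed-size embedding for the clip, which a downstream
# tone/emotion classifier can consume.
emb_model = Model.from_pretrained("pyannote/embedding",
                                  use_auth_token="hf_...")  # your HF token
embedder = Inference(emb_model, window="whole")
embedding = embedder(AUDIO)  # 1-D numpy embedding vector

print(transcript)
print(embedding.shape)
```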
Next Steps:
• Open-sourcing the emotion classification extension for interviews
• Adding support for multi-language transcripts
• Launching a minimal hosted version for public trial
If you’re interested in collaborating, improving emotion feedback models, or just curious how it all fits together under the hood, drop me a message or comment below.
Let’s connect and build cool things with AI.