What it does
Our solution listens to live calls, converts speech to text in real time, and uses generative AI models to produce actionable insights and suggestions based on the ongoing conversation. It enables agents to respond faster and more accurately, improving customer experience and operational efficiency.
How we built it
We used AWS services such as Amazon Transcribe for real-time speech-to-text conversion and passed the resulting text to a generative AI model that produces contextual recommendations. The system uses a microservices architecture for scalability and resilience, and includes a front-end dashboard where agents view suggestions in real time.
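The flow described above (speech-to-text stage feeding a suggestion stage) can be sketched as two asynchronous workers joined by a queue. This is a minimal offline sketch, not the project's actual code: `transcribe` and `suggest` are hypothetical stubs standing in for the Amazon Transcribe stream and the generative model call, and the `Segment` type, the `is_partial` flag, and the queue size are assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    is_partial: bool  # True while the recognizer may still revise this text


async def transcribe(audio_chunks, out_queue):
    """Hypothetical stub for the streaming speech-to-text stage.

    Each "chunk" is already a (text, is_partial) pair so the sketch runs offline.
    """
    for text, is_partial in audio_chunks:
        await out_queue.put(Segment(text, is_partial))
    await out_queue.put(None)  # sentinel: the call has ended


def suggest(transcript):
    """Hypothetical stub for the generative model call."""
    return f"Suggestion based on {len(transcript)} finalized segment(s): {transcript[-1]}"


async def recommend(in_queue, suggestions):
    """Consume segments and produce one suggestion per finalized segment."""
    transcript = []
    while True:
        segment = await in_queue.get()
        if segment is None:
            break
        if segment.is_partial:
            continue  # skip text that may still change
        transcript.append(segment.text)
        suggestions.append(suggest(transcript))


async def run_pipeline(audio_chunks):
    queue = asyncio.Queue(maxsize=100)  # bounded so a slow consumer applies backpressure
    suggestions = []
    await asyncio.gather(
        transcribe(audio_chunks, queue),
        recommend(queue, suggestions),
    )
    return suggestions


def demo():
    chunks = [
        ("I want to", True),
        ("I want to cancel my order", False),
        ("It arrived", True),
        ("It arrived damaged", False),
    ]
    return asyncio.run(run_pipeline(chunks))
```

Decoupling the two stages with a queue is one way to let each stage scale independently; in a microservices deployment the in-process queue would likely be replaced by a message broker or stream.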
Challenges we ran into
Ensuring low latency during real-time transcription and AI processing
Maintaining context and relevance in AI-generated suggestions
Integrating multiple AWS services seamlessly
Handling different accents and noisy environments during speech recognition
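One common way to approach the context and latency challenges listed above is to keep only the most recent turns of the conversation in the prompt, so the model sees relevant context without the prompt growing for the whole call. The sketch below illustrates that idea under stated assumptions; the `RollingContext` class, the word budget, and the prompt wording are hypothetical and are not taken from the project.

```python
from collections import deque


class RollingContext:
    """Keeps the most recent conversation turns within a word budget."""

    def __init__(self, max_words=50):
        self.max_words = max_words
        self.turns = deque()  # items are (speaker, text, word_count)
        self.word_count = 0

    def add(self, speaker, text):
        words = len(text.split())
        self.turns.append((speaker, text, words))
        self.word_count += words
        # Drop the oldest turns until the budget fits, always keeping the newest turn.
        while self.word_count > self.max_words and len(self.turns) > 1:
            _, _, dropped = self.turns.popleft()
            self.word_count -= dropped

    def prompt(self):
        lines = [f"{speaker}: {text}" for speaker, text, _ in self.turns]
        return (
            "Conversation so far:\n"
            + "\n".join(lines)
            + "\nSuggest the agent's next reply."
        )
```

A usage example: with `RollingContext(max_words=8)`, adding three turns of five or six words each leaves only the latest turn in the prompt, because each new turn pushes the total over the budget and evicts the older ones.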
Accomplishments that we're proud of
Successfully built a pipeline capable of handling live audio streams
Generated meaningful and relevant insights within seconds of speech input
Created a modular and scalable system using cloud-native architecture
Delivered a prototype with a user interface ready for a live demo
What we learned
How to integrate and fine-tune generative AI models with real-time data streams
Advanced usage of AWS services for speech and AI processing
Best practices for building low-latency, event-driven architectures
The importance of UX when surfacing AI suggestions during live interactions
What's next for shiTey
Expand multilingual support and improve speech recognition accuracy
Integrate with popular CRM systems for seamless workflow adoption
Train the AI models with industry-specific datasets (e.g., healthcare, finance)
Enhance the UI with visual cues and live sentiment analysis
Prepare for full production deployment with robust security and compliance features