What it does

Our solution listens to live calls, converts speech to text in real time, and uses generative AI models to produce actionable insights and suggestions from the ongoing conversation. This lets agents respond faster and more accurately, improving customer experience and operational efficiency.

How we built it

We used AWS services such as Amazon Transcribe for real-time speech-to-text conversion and passed the transcribed text to a generative AI model, which processes it and generates contextual recommendations. The system uses a microservices architecture for scalability and resilience, and includes a front-end dashboard where agents view suggestions in real time.
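The flow described above (speech-to-text, then suggestion generation, then the agent dashboard) can be sketched as three stages connected by queues. This is only an illustration under stated assumptions: the names `Segment`, `transcribe`, `suggest`, `generate_suggestion`, `dashboard`, and `run_pipeline` are hypothetical, the transcription stage is a stand-in that receives text instead of audio, and the model call is a placeholder. It is not the Amazon Transcribe API or the project's actual code.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    is_final: bool  # streaming recognizers usually emit partial results before a final one


async def transcribe(audio_chunks, out_queue):
    """Stand-in for the speech-to-text stage (hypothetical, not a real AWS call)."""
    for chunk in audio_chunks:
        # Each "audio chunk" is already text here so the sketch runs offline.
        await out_queue.put(Segment(text=chunk, is_final=True))
    await out_queue.put(None)  # sentinel: end of call


def generate_suggestion(history):
    """Placeholder for the generative model call (assumption, no real model involved)."""
    return f"Suggestion based on {len(history)} utterance(s): respond to '{history[-1]}'"


async def suggest(in_queue, out_queue):
    """Consume transcript segments and emit one suggestion per final segment."""
    history = []
    while True:
        segment = await in_queue.get()
        if segment is None:
            await out_queue.put(None)  # forward the end-of-call sentinel
            break
        if not segment.is_final:
            continue  # skip partial results to avoid suggestions on unstable text
        history.append(segment.text)
        await out_queue.put(generate_suggestion(history))


async def dashboard(in_queue, shown):
    """Stand-in for the agent dashboard: collects suggestions in arrival order."""
    while True:
        suggestion = await in_queue.get()
        if suggestion is None:
            break
        shown.append(suggestion)


async def run_pipeline(audio_chunks):
    transcripts = asyncio.Queue()
    suggestions = asyncio.Queue()
    shown = []
    # All three stages run concurrently, so a suggestion can be shown
    # while later audio is still being transcribed.
    await asyncio.gather(
        transcribe(audio_chunks, transcripts),
        suggest(transcripts, suggestions),
        dashboard(suggestions, shown),
    )
    return shown
```

Calling `asyncio.run(run_pipeline(["my card was declined", "I need a refund"]))` returns one suggestion per utterance. In a real deployment each stage would likely be a separate service, with the queues replaced by a managed stream or message bus.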

Challenges we ran into

Ensuring low latency during real-time transcription and AI processing

Maintaining context and relevance in AI-generated suggestions

Integrating multiple AWS services seamlessly

Handling different accents and noisy environments during speech recognition
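One common way to balance context against latency is a rolling window over recent conversation turns, so the prompt sent to the model stays bounded. The sketch below assumes a simple character budget. The `ContextWindow` class and its `max_chars` parameter are hypothetical names used for illustration, not part of the project.

```python
from collections import deque


class ContextWindow:
    """Keeps the most recent conversation turns within a character budget."""

    def __init__(self, max_chars=200):
        self.max_chars = max_chars
        self._turns = deque()
        self._size = 0

    def add(self, speaker, text):
        turn = f"{speaker}: {text}"
        self._turns.append(turn)
        self._size += len(turn)
        # Drop the oldest turns until the budget is met,
        # but always keep the latest turn even if it alone exceeds the budget.
        while self._size > self.max_chars and len(self._turns) > 1:
            removed = self._turns.popleft()
            self._size -= len(removed)

    def as_prompt(self):
        # One turn per line, oldest first.
        return "\n".join(self._turns)
```

A smaller budget keeps prompts short and model calls fast, while a larger one preserves more of the conversation. A production system might budget by tokens instead of characters, or summarize dropped turns instead of discarding them.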

Accomplishments that we're proud of

Built a pipeline capable of handling live audio streams

Generated meaningful and relevant insights within seconds of speech input

Created a modular and scalable system using cloud-native architecture

Delivered a prototype with a live demo-ready user interface

What we learned

How to integrate and fine-tune generative AI models with real-time data streams

Advanced usage of AWS services for speech and AI processing

Best practices for building low-latency, event-driven architectures

The importance of UX when surfacing AI suggestions during live interactions

What's next for shiTey

Expand multilingual support and improve speech recognition accuracy

Integrate with popular CRM systems for seamless workflow adoption

Train the AI models with industry-specific datasets (e.g., healthcare, finance)

Enhance the UI with visual cues and live sentiment analysis

Prepare for full production deployment with robust security and compliance features
