Inspiration In today’s world, many people struggle to express their ideas confidently, even though they are great at coding or possess deep knowledge. At the hackathon, we noticed that while over 1500 participants attended, many felt hesitant to speak up due to the fear of judgment. This inspired us to create "AI Talks"—a solution that enables users to practice speaking without the anxiety of being evaluated. We wanted to build a friend, mentor, and mock interview partner powered by AI to help anyone improve their speaking skills.
What it does AI Talks allows users to select a context they want to talk about and practice conversations. The user inputs their voice, which is converted into text by Deepgram. This text is processed using Llama LLM on Groq, leveraging its LPU for faster and more efficient AI inferences. The output is then passed through Cartesia, which uses FFmpeg to provide a response in real-time. This creates an interactive, supportive environment where users can practice talking, whether it’s for an interview, a pitch, or general communication skills.
How we built it We built AI Talks using several key tools:
Deepgram for speech-to-text conversion. Groq and its Llama LLM to provide fast, accurate text processing, supported by Groq's LPU for better AI inferences. Cartesia for processing the final output response with FFmpeg. We designed the app to be easy-to-use, allowing users to choose different conversation contexts and practice in a judgment-free space.
Challenges we ran into One of the key challenges was integrating multiple tools seamlessly. We tried to integrate fetch ai into our project for the claude ai agent. We wanted multiple AI agents to form a multi-panel interview or a group discussion, but dropped the idea as it was a difficult to understand how to integrate fetch ai . Handling speech-to-text conversion and ensuring smooth AI inference with minimal latency required optimizing the flow between Deepgram, Groq, and Cartesia. We also faced difficulties ensuring the responses felt natural and relatable, which was crucial for users to feel comfortable practicing with the AI.
Accomplishments that we're proud of We are proud of creating a tool that can genuinely help people overcome the fear of speaking up. The integration of cutting-edge tools like Deepgram, Groq, and Cartesia made it possible to build a system that responds quickly and accurately, providing meaningful practice for users. We also felt a sense of achievement knowing our solution could help others who, like us, sometimes struggle to showcase their knowledge because of communication barriers.
What we learned Throughout the process, we learned the importance of combining multiple technologies to achieve a seamless user experience. We gained deeper insights into handling speech data, improving AI inference speed, and managing real-time outputs. Moreover, understanding the nuances of communication barriers reinforced our belief in the importance of empathy-driven AI solutions.
What's next for AI Talks: Practice how to build conversations Moving forward, we plan to expand AI Talks with more conversation contexts, enhanced voice modulation feedback, and multilingual support to reach a broader audience. We also aim to integrate personalized coaching features, where users can receive tips based on their speaking patterns. Additionally, we want to offer more use cases like mock interviews, presentations, and group discussion practice, making AI Talks a versatile tool for improving communication skills across various scenarios.
Built With
- cartesia
- deepgram
- groq
- javascript
- llama
- python
Log in or sign up for Devpost to join the conversation.