Parley

par·ley verb /ˈpärlē/ to speak with another

Inspiration

I was inspired by the experiences of people who struggle to be understood when they speak, whether because of conditions like ALS, recovery from a stroke, or language barriers. Communication is fundamental to human connection, yet for many people, speaking presents significant obstacles. I wanted to create a tool that bridges this gap and helps everyone express themselves clearly.

What It Does

Parley is an AI-powered speech assistant that helps people with speech difficulties communicate more effectively. Users can speak into their device, and Parley will transcribe their speech, clarify any unclear portions, and provide an audio response using text-to-speech technology. The app also identifies medical terminology and offers translation capabilities, making it particularly useful in healthcare settings and multilingual environments.

How I Built It

I built Parley using Next.js and React for the frontend, with TypeScript to ensure code reliability. The application uses several APIs working in concert:

The Web Speech API for real-time speech recognition
Claude API for text clarification and medical terminology extraction
ElevenLabs API for natural-sounding text-to-speech
Whisper API for advanced transcription capabilities
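
As a rough illustration of how these services might be chained, here is a minimal sketch of a transcribe, clarify, and speak pipeline. The names ParleyServices, ParleyResult, and runParleyPipeline are hypothetical and not taken from the actual codebase. The three service functions are injected so that real API clients (Whisper or the Web Speech API, Claude, ElevenLabs) or test fakes can be swapped in.

```typescript
// Hypothetical service interface: each method stands in for one external API.
// The real request/response shapes of those APIs are not modeled here.
interface ParleyServices {
  // Speech-to-text (e.g. Whisper, or text captured via the Web Speech API).
  transcribe(audio: Uint8Array): Promise<string>;
  // Text clarification and medical term extraction (e.g. Claude).
  clarify(rawText: string): Promise<{ text: string; medicalTerms: string[] }>;
  // Text-to-speech (e.g. ElevenLabs); returns encoded audio bytes.
  synthesize(text: string): Promise<Uint8Array>;
}

interface ParleyResult {
  transcript: string;
  clarified: string;
  medicalTerms: string[];
  speech: Uint8Array;
}

// Runs the three stages in order; each stage depends on the previous one.
async function runParleyPipeline(
  audio: Uint8Array,
  services: ParleyServices,
): Promise<ParleyResult> {
  const transcript = await services.transcribe(audio);
  if (transcript.trim() === "") {
    // Nothing to clarify or speak, so stop before calling the other services.
    throw new Error("No speech detected");
  }
  const { text, medicalTerms } = await services.clarify(transcript);
  const speech = await services.synthesize(text);
  return { transcript, clarified: text, medicalTerms, speech };
}
```

Injecting the services keeps the orchestration logic independent of any one provider, which also makes it easy to exercise without network access.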

Challenges I Ran Into

One significant challenge was coordinating the multiple APIs. Each has its own requirements, rate limits, and failure modes, so every integration needed its own careful error handling.
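
One common way to cope with rate limits and transient failures is to retry with exponential backoff. The sketch below is an assumption about how that could look, not the project's actual code. The helper names backoffDelayMs and withRetry and the default values (250 ms base, 4000 ms cap, 3 attempts) are illustrative.

```typescript
// Delay before the next retry: doubles each attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 4000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Calls `call` up to maxAttempts times, sleeping between failed attempts.
// `sleep` is injectable so tests can skip real waiting.
async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown = new Error("withRetry: no attempts were made");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      // Only wait if another attempt will follow.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

This sketch retries on every error for simplicity. A production version would likely retry only on rate-limit or server errors and fail fast on everything else.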

Audio handling was another complex area, particularly making sure playback worked correctly across different devices and browsers and working around the autoplay restrictions in modern browsers.
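
For the autoplay problem, browsers reject the promise returned by play() with a NotAllowedError when playback is attempted without a user gesture. A minimal sketch of one workaround, assuming a fallback to the next click, is shown below. The function name playOrWaitForGesture and the small Playable and GestureTarget interfaces are hypothetical; they are narrowed-down stand-ins for HTMLAudioElement and document so the logic can run outside a browser.

```typescript
// Minimal structural stand-ins for HTMLAudioElement and an event target.
interface Playable {
  play(): Promise<void>;
}
interface GestureTarget {
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean },
  ): void;
}

// Tries to play immediately. If autoplay is blocked, defers playback
// until the user's next click on gestureTarget.
async function playOrWaitForGesture(
  audio: Playable,
  gestureTarget: GestureTarget,
): Promise<"played" | "deferred"> {
  try {
    await audio.play();
    return "played";
  } catch (err) {
    const name =
      typeof err === "object" && err !== null && "name" in err
        ? String((err as { name: unknown }).name)
        : "";
    // Anything other than an autoplay block is a real failure; rethrow it.
    if (name !== "NotAllowedError") throw err;
    gestureTarget.addEventListener(
      "click",
      () => {
        // Retry inside the gesture handler; swallow a second failure in this sketch.
        audio.play().catch(() => {});
      },
      { once: true },
    );
    return "deferred";
  }
}
```

In a browser, this could be called with an audio element and document as the gesture target. A "deferred" result is a natural point to show a "tap to play" prompt.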

Accomplishments That I'm Proud Of

I'm particularly proud of creating a genuinely useful tool that can make a real difference in people's lives. The integration of multiple AI services into a coherent, user-friendly experience was technically challenging.

The app's ability to identify and explain medical terminology is especially valuable for patients trying to understand their diagnoses. The simple, straightforward UI makes the technology accessible and intuitive to more users.

What I Learned

I gained expertise in working with modern AI APIs and learned how to combine them effectively to create new capabilities. I also deepened my understanding of accessibility needs and how to design interfaces that accommodate various users. Lastly, I learned about the challenges of real-time audio processing.