Inspiration

Parents and doctors routinely track a baby's height and weight over time, so why not speech development? Realistic speech AI and emotional AI reduce costs and make learning more personalized.

There are over 3.5M babies born each year in the US and nearly 1 in 10 are affected by some speech, voice, or language disorder. The US market sizes for baby products and tracking apps are over $33B and $11B, respectively.

As a new mother, I already use apps to track my baby's sleep and foods/feeding daily.

What it does

  • The app starts with a one-time onboarding in which a parent enters information about their baby.
  • The main interface lets a parent run voice-AI-based learning activities: lullabies and stories built around specific words, flashcards, and interactive real-time conversations.
  • The speech is then transcribed to text and parsed for utterances, word count, and total talking time.
  • This data is mapped against a Stanford study of word usage over time during babies’ development.
  • To benefit most from the app, parents return daily to record their babies and can track progress over time. These powerful insights quantify babies’ speech rather than relying on anecdotes.
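The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not our production code: it assumes the transcription service returns each utterance with its text and start/end timestamps in seconds, and the `Utterance` class and `summarize_session` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """One transcribed utterance (hypothetical shape, for illustration only)."""
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def summarize_session(utterances):
    """Compute utterance count, word count, unique words, and talking time."""
    words = []
    for u in utterances:
        for raw in u.text.split():
            # Strip surrounding punctuation and lowercase so "Mama!" == "mama".
            cleaned = raw.strip(".,!?;:\"'").lower()
            if cleaned:
                words.append(cleaned)

    # Talking time is the summed duration of utterances, not the recording length.
    talking_time = sum(max(0.0, u.end - u.start) for u in utterances)

    return {
        "utterances": len(utterances),
        "word_count": len(words),
        "unique_words": len(set(words)),
        "talking_time_s": round(talking_time, 2),
    }


if __name__ == "__main__":
    session = [
        Utterance("mama", 0.0, 0.5),
        Utterance("more milk", 2.0, 3.0),
        Utterance("Mama!", 4.0, 4.4),
    ]
    print(summarize_session(session))
```

A daily summary like this is what would then be compared against age-based word-usage norms to track progress over time.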

How we built it

Four AI agents help parents, with optional visuals (fal) or audio only (ElevenLabs, Suno) for those who want to avoid screen time:

  1. Lullabies: Songs are a great medium for word exposure. We prompt an LLM to generate the lyrics, with guidelines on which letters to include, and use Suno to create the lullaby.
  2. Flashcards: Research has shown that repetition of words and sounds that the baby has not mastered is helpful for speech development. We use fal to generate images that supplement repetitive word pronunciation (ElevenLabs).
  3. Stories: We used text-to-speech (ElevenLabs) to read short LLM-generated stories tailored to the infant.
  4. Real-Time Chat: We used LiveKit to power a pipeline-based real-time agent, with ElevenLabs as the TTS provider.
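As a rough illustration of the lullaby step, the sketch below shows how a lyrics prompt with word and letter guidelines might be assembled before it is sent to an LLM. The function name `build_lullaby_prompt`, its parameters, and the prompt wording are all assumptions made for this example; the actual prompts and the calls to the LLM and Suno are not shown.

```python
def build_lullaby_prompt(baby_name, age_months, target_words, target_letters, max_lines=8):
    """Assemble a lyrics prompt (hypothetical wording, for illustration only)."""
    if not target_words:
        raise ValueError("at least one target word is required")

    # De-duplicate and sort so the same inputs always give the same prompt.
    words = ", ".join(sorted({w.strip().lower() for w in target_words if w.strip()}))
    letters = ", ".join(sorted({c.strip().upper() for c in target_letters if c.strip()}))

    lines = [
        f"Write a gentle lullaby for {baby_name}, who is {age_months} months old.",
        f"Use each of these words at least twice: {words}.",
    ]
    if letters:
        lines.append(f"Favor words that start with these letters: {letters}.")
    lines.append(f"Keep it to at most {max_lines} short lines with simple, repetitive phrasing.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_lullaby_prompt("Ava", 14, ["milk", "ball", "Milk"], ["b", "m"]))
```

Keeping the prompt construction in a plain function makes it easy to reuse the same word list for the flashcard and story agents.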

Challenges we ran into

It was hard to execute everything we had in mind with enough sophistication.

Accomplishments that we're proud of

We went from strangers to teammates and built a minimum viable product in a short amount of time!

What we learned

It was super cool to try ElevenLabs, fal, and some other tools for the first time.

What's next for F1RST WORDS

For this hackathon, our scope was limited to the US and English, but the project could be extended to other major markets and languages such as Spanish and Chinese. And although the target consumers here are parents, teachers, daycare operators, pediatricians, and speech therapists could find this app useful as well.

Built With

  • cursor
  • deepgram
  • elevenlabs
  • fal
  • suno