Inspiration

What it does

How we built it

Python 3 (FastAPI + PyQt), React/Next.js, Google Gemini, Deepgram SDK, Hume AI SDK

Challenges we ran into

  1. Finalizing an initial roadmap
  2. An unstable internet connection disrupted our workflow
  3. Next.js integrations and endpoints

Accomplishments that we're proud of

  1. We turned reading an ebook file (EPUB) into an experience with engaging, live-speaking characters, each with their own traits (gender, tone, etc.), with whom the listener can interact by voice.
  2. We wrote a cross-platform app (desktop and web) that uses multiple services: Google Gemini for the intelligence, the Deepgram Voice API for text-to-speech, and Hume AI for live interaction with the characters.
  3. We combined text and speech (with tone and expression) to create an engaging multimodal experience.
  4. We implemented live highlighting of the text as it is spoken, using our own code.
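
As an illustration of the live highlighting in item 4, here is a minimal sketch of one way it could work. It is not the project's actual code. It assumes the speech service (or a separate alignment step) provides a start and end time for each spoken word; the `WordTiming` structure and both function names are hypothetical and exist only for this example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class WordTiming:
    # Hypothetical structure: one spoken word with start/end offsets in seconds.
    text: str
    start: float
    end: float


def active_word_index(timings, playback_seconds):
    """Return the index of the word being spoken at playback_seconds, or None."""
    # Binary search over start times; timings are assumed sorted by start.
    starts = [w.start for w in timings]
    i = bisect_right(starts, playback_seconds) - 1
    if i < 0:
        return None  # playback has not reached the first word yet
    if playback_seconds >= timings[i].end:
        return None  # in a pause between words, or past the last word
    return i


def render_highlight(timings, playback_seconds):
    """Return the sentence with the active word wrapped in square brackets."""
    idx = active_word_index(timings, playback_seconds)
    return " ".join(
        f"[{w.text}]" if i == idx else w.text for i, w in enumerate(timings)
    )
```

A UI would call `render_highlight` (or just `active_word_index`) on every playback tick and restyle the returned word. The binary search keeps each lookup cheap even for long passages.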

What we learned

  1. We got familiar with the potential of modern voice agents.
  2. We understood how to combine the intelligence of LLMs with the expressiveness of the various voice APIs.
  3. We learned a ton about web application architectures, UI frameworks, and integrating speech APIs with modern apps.
  4. We learned about the future of entertainment and education technologies.

What's next for wonderlands.

The goal for wonderlands is to become a multimodal, cross-platform, and engaging service that leverages generative AI for entertainment, knowledge, and human prosperity, bringing you art from the abstract.
