wonderlands.

Live speech highlighting
EPUB file upload

Inspiration

What it does

How we built it

Python 3 with FastAPI and PyQt, React/Next.js, Google Gemini, the Deepgram SDK, and the Hume AI SDK

Challenges we ran into

Finalizing an initial roadmap
Internet connection issues disrupted our flow
Next.js integrations and endpoints

Accomplishments that we're proud of

We turned an ebook file (EPUB) into engaging, live-speaking characters with their own traits (gender, tone, etc.) that the listener can interact with by voice.
We wrote a cross-platform app (desktop and web) using multiple services: Google Gemini for the intelligence, the Deepgram Voice API for text-to-speech, and Hume AI for live interaction with the characters.
We combined text and speech (with tone and expression) to create an engaging multimodal experience.
We implemented live highlighting of the text as it is spoken, using our own code.
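
One way live highlighting of spoken text could work is sketched below. This is not the project's actual code: it assumes word-level timestamps are available for the generated audio, and the names `WordTiming`, `active_word_index`, and `render` are hypothetical, introduced only for illustration. Given the current playback time, a binary search finds the word being spoken so the UI can mark it.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WordTiming:
    """Hypothetical record: one spoken word and when it is heard."""
    word: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def active_word_index(timings: List[WordTiming], t: float) -> Optional[int]:
    """Return the index of the word spoken at time t, or None during silence.

    Assumes timings are sorted by start time and do not overlap.
    """
    starts = [w.start for w in timings]
    # Index of the last word whose start time is <= t.
    i = bisect_right(starts, t) - 1
    if i < 0:
        return None  # playback has not reached the first word yet
    return i if t < timings[i].end else None


def render(timings: List[WordTiming], t: float) -> str:
    """Render the sentence with the active word wrapped in brackets."""
    idx = active_word_index(timings, t)
    return " ".join(
        f"[{w.word}]" if i == idx else w.word for i, w in enumerate(timings)
    )


# Made-up timings, for illustration only.
timings = [
    WordTiming("Alice", 0.0, 0.4),
    WordTiming("was", 0.5, 0.7),
    WordTiming("beginning", 0.7, 1.3),
]
print(render(timings, 0.6))
```

In a real app the playback time would come from the audio player on each frame or timer tick, and the bracket rendering would be replaced by a style change on the matching text element.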

What we learned

We got familiar with the potential of modern voice agents.
We learned how to combine the intelligence of LLMs with the expressiveness of the various voice APIs.
We learned a ton about web application architectures, UI frameworks, and integrating speech APIs with modern apps.
We got a glimpse of the future of entertainment and education technologies.

What's next for wonderlands.

The goal for wonderlands is to become a multimodal, cross-platform, and engaging service that leverages generative AI for entertainment, knowledge, and human prosperity, bringing you art from the abstract.

Built With

deepgram
epub
fastapi
gemini
javascript
llm
next
pyqt
python

Updates

Aryamun Narayan Das started this project — Oct 20, 2024 01:25 PM EDT
