Inspiration
What it does
How we built it
Python 3 with FastAPI and PyQt, React/Next.js, Google Gemini, the Deepgram SDK, and the Hume AI SDK.
Challenges we ran into
- Finalizing an initial roadmap
- An unreliable internet connection disrupted our flow
- Next.js integrations and endpoints
Accomplishments that we're proud of
- We turned reading an ebook file (EPUB) into an experience with engaging, live-speaking characters, each with their own traits (gender, tone, etc.), with whom the listener can interact by voice.
- We wrote a cross-platform app (desktop and web) that uses multiple services: Google Gemini for the intelligence, the Deepgram Voice API for text-to-speech, and Hume AI for live interaction with the characters.
- We combined text and speech (with tone and expression) to create an engaging multimodal experience.
- We implemented live highlighting of the text as it is spoken, using our own code.
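The live highlighting mentioned above can be illustrated with a minimal sketch. It assumes that each spoken word comes with a start and end time in seconds; the write-up does not say how the timings were obtained, so the `WordTiming` type, the sample data, and the function names below are hypothetical and not the project's actual code.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class WordTiming:
    """Hypothetical record: one word and when it is spoken (seconds)."""
    word: str
    start: float
    end: float


def active_word_index(timings, t):
    """Return the index of the word being spoken at time t, or None.

    Assumes timings are sorted by start time and do not overlap.
    """
    starts = [w.start for w in timings]
    i = bisect_right(starts, t) - 1  # last word starting at or before t
    if i < 0 or t >= timings[i].end:
        return None  # before the first word, in a pause, or after the end
    return i


def render(timings, t):
    """Return the sentence with the active word wrapped in brackets."""
    idx = active_word_index(timings, t)
    return " ".join(
        f"[{w.word}]" if i == idx else w.word for i, w in enumerate(timings)
    )


# Made-up timings for illustration only.
SAMPLE = [
    WordTiming("Alice", 0.0, 0.4),
    WordTiming("was", 0.45, 0.6),
    WordTiming("beginning", 0.6, 1.1),
]

if __name__ == "__main__":
    print(render(SAMPLE, 0.5))  # Alice [was] beginning
```

In a UI, a function like `render` would be called on each playback tick with the current audio position, and the bracketed word would be styled instead of wrapped in brackets.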
What we learned
- We got familiar with the potential of modern voice agents.
- We learned how to combine the intelligence of LLMs with the expressiveness of the various voice APIs.
- We learned a ton about web application architectures, UI frameworks, and integrating speech APIs with modern apps.
- We got a glimpse of the future of entertainment and education technologies.
What's next for wonderlands.
The goal for wonderlands is to become a multimodal, cross-platform, and engaging service that leverages generative AI for entertainment, knowledge, and human prosperity, bringing you art from the abstract.
Built With
- deepgram
- epub
- fastapi
- gemini
- javascript
- llm
- next
- pyqt
- python