🌍 AI Global Voyager: The Story
💡 Inspiration
Travel is beautiful, but language barriers and boring, text-heavy guides often ruin the experience. I wanted to create something that feels like a personal human guide in your pocket: someone who speaks your language and shows you the beauty of a place through sound and visuals.
🛠️ How I Built It
I used a modern tech stack to ensure speed and immersion:
Brain: I integrated Google Gemini 1.5 Flash for its lightning-fast response time to generate destination insights.
Voice: I used ElevenLabs Multilingual v2 to turn text into high-quality, emotional speech.
Frontend: Built with Streamlit, using custom CSS for a "Glassmorphism" effect and a cinematic video background.
Logic: Python was the backbone, connecting the APIs and handling the multilingual toggle.
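The flow described above (text from the LLM, then speech from the voice model, selected by a language toggle) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `build_guide`, `GuideResult`, `LANGUAGES`, the English option, and the prompt wording are all hypothetical. The two callables stand in for whatever wrappers call Gemini 1.5 Flash and ElevenLabs Multilingual v2, so the sketch runs without any API keys.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed toggle options; the write-up only confirms Hindi narration.
LANGUAGES = {"English": "en", "Hindi": "hi"}


@dataclass
class GuideResult:
    language: str
    script: str   # narration text returned by the LLM
    audio: bytes  # audio returned by the voice model


def build_guide(
    destination: str,
    language: str,
    generate_text: Callable[[str], str],
    synthesize_speech: Callable[[str], bytes],
) -> GuideResult:
    """Run the text step, then the voice step, for one destination."""
    if language not in LANGUAGES:
        raise ValueError(f"Unsupported language: {language}")

    # Hypothetical prompt; the real prompt is not shown in the write-up.
    prompt = (
        f"You are a friendly local guide. In {language}, describe "
        f"{destination} in under 120 words, including one cultural detail."
    )

    # Step 1: the LLM produces the narration script.
    script = generate_text(prompt).strip()
    # Step 2: the voice model turns that script into audio.
    audio = synthesize_speech(script)
    return GuideResult(language=language, script=script, audio=audio)
```

Passing the two API calls in as functions keeps the orchestration logic separate from the vendors' client libraries, which also makes it easy to swap in stubs while testing the UI.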
🧠 What I Learned
During this hackathon, I mastered:
API Orchestration: How to sync Large Language Models (LLMs) with Voice AI seamlessly.
User Experience (UX): The importance of a "Cinematic UI" in keeping users engaged.
Prompt Engineering: Refining prompts to get concise and culturally rich travel data.
⚠️ Challenges I Faced
Latency: Initially, generating text and then voice took too long. I optimized this by using the Gemini 1.5 Flash model, which reduced the wait time significantly.
UI Constraints: Streamlit is great for data but tricky for custom designs. I overcame this by injecting custom HTML/CSS to add the video background and transparent cards.
Multilingual Accuracy: Making the Hindi narration sound natural rather than robotic was a challenge, which I solved by adjusting the ElevenLabs voice settings.
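The HTML/CSS injection mentioned under UI Constraints might look something like the sketch below. It is an assumption-laden illustration rather than the project's real stylesheet: the function name `cinematic_background`, the `glass-card` class, the `bg-video` id, and the default blur and opacity values are all made up for this example, and the `.stApp` selector is assumed to match Streamlit's app container.

```python
from html import escape


def cinematic_background(
    video_url: str, blur_px: int = 12, card_alpha: float = 0.15
) -> str:
    """Return an HTML/CSS snippet: a full-screen looping video plus a glass card style."""
    if not 0.0 <= card_alpha <= 1.0:
        raise ValueError("card_alpha must be between 0 and 1")

    # Escape the URL so it is safe inside the src attribute.
    safe_url = escape(video_url, quote=True)

    return f"""
<style>
/* Assumed selector for the Streamlit app container. */
.stApp {{ background: transparent; }}
#bg-video {{
    position: fixed; inset: 0;
    width: 100vw; height: 100vh;
    object-fit: cover; z-index: -1;
}}
/* Glassmorphism: translucent fill plus a blurred backdrop. */
.glass-card {{
    background: rgba(255, 255, 255, {card_alpha});
    backdrop-filter: blur({blur_px}px);
    -webkit-backdrop-filter: blur({blur_px}px);
    border: 1px solid rgba(255, 255, 255, 0.3);
    border-radius: 16px;
    padding: 1.5rem;
}}
</style>
<video id="bg-video" autoplay muted loop playsinline>
  <source src="{safe_url}" type="video/mp4">
</video>
"""
```

In a Streamlit script, the returned string would be passed to `st.markdown(snippet, unsafe_allow_html=True)`, and content wrapped in `<div class="glass-card">` would pick up the transparent card style.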