- Phase 1: Upload any travel photo. Gemini detects the location while we auto-detect your origin for a seamless start.
- Phase 2: Don't like the results? Just say it. Refine destinations by voice to get more accurate results.
- Phase 3: Explore live flights on an interactive map. The side panel includes Wikipedia info, hotel prices, and local weather.
Inspiration
We’ve all been there: scrolling through social media, seeing a stunning sunset over a Mediterranean cliff or a hidden neon street in Tokyo, and thinking, "I want to be there."
But then, reality hits. Planning a trip is often synonymous with friction: identifying the landmark, finding the nearest airport, comparing dates, and hunting for flights. That "planning paralysis" usually kills the magic of the moment.
View it, Visit it was born from a simple desire: to eliminate the distance between visual inspiration and reality. We wanted to build a tool that allows you to take any image, say what you want, and have the cheapest flights ready before the excitement fades.
What it does
Our project is a multi-modal travel engine that guides the user through three seamless phases:
- View it (Vision): You upload an image. Using Gemini 2.5 Flash, the app identifies the location and provides context (weather, landmarks, history).
- Refine it (Voice): AI isn't perfect, and travel is personal. You can talk to the app to correct or confirm destinations using ElevenLabs Scribe v1, creating a "Human-in-the-Loop" experience where your voice directs the AI.
- Visit it (Flights): The app automatically detects your origin via IP and searches the Skyscanner API for real-time deals, showing them on an interactive map with flight paths and price markers.
How we built it
The project is built on a high-performance FastAPI backend that orchestrates the entire intelligence layer:
- AI Core: Gemini 2.5 Flash for image/video analysis and Gemini 1.5 Pro for intent-based text refinement.
- Voice Interface: Integrated ElevenLabs for industry-leading voice-to-text accuracy.
- Data layer: Real-time flight data via Skyscanner and destination enrichments from the Wikipedia API.
- Frontend: A premium Single Page Application (SPA) built with Vanilla JS and modern CSS features like Glassmorphism and CSS variables. We used Leaflet.js for the interactive map visualization.
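To illustrate how a backend like this might pull from several data sources at once, here is a minimal sketch using `asyncio.gather`. It is not the project's actual code: `fetch_flights`, `fetch_wikipedia_summary`, and `fetch_weather` are hypothetical stand-ins that simulate network latency instead of calling the real Skyscanner, Wikipedia, or weather services, and the returned fields are placeholders.

```python
import asyncio
from typing import Any

# Hypothetical stand-ins for the real API clients (assumptions, not the
# project's code). Each one simulates a network round trip.
async def fetch_flights(origin: str, dest: str) -> dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"route": f"{origin}-{dest}"}

async def fetch_wikipedia_summary(dest: str) -> dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"title": dest}

async def fetch_weather(dest: str) -> dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"city": dest}

async def enrich_destination(origin: str, dest: str) -> dict[str, Any]:
    # Run the three lookups concurrently; return_exceptions=True keeps one
    # failing source from discarding the results of the others.
    results = await asyncio.gather(
        fetch_flights(origin, dest),
        fetch_wikipedia_summary(dest),
        fetch_weather(dest),
        return_exceptions=True,
    )
    keys = ("flights", "summary", "weather")
    return {
        key: (None if isinstance(value, Exception) else value)
        for key, value in zip(keys, results)
    }

async def enrich_all(origin: str, dests: list[str]) -> dict[str, Any]:
    # Fan out across destinations as well, so total latency stays close to
    # the slowest single call rather than the sum of all calls.
    panels = await asyncio.gather(
        *(enrich_destination(origin, dest) for dest in dests)
    )
    return dict(zip(dests, panels))

if __name__ == "__main__":
    print(asyncio.run(enrich_all("BCN", ["NRT", "ATH"])))
```

In a FastAPI app, a coroutine like `enrich_all` could be awaited directly inside an `async def` route handler, since the framework already runs an event loop.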
Challenges we ran into
- Bridging the Multimodal Gap: Connecting unstructured data (a photo of a beach) to structured API requirements (IATA airport codes). We solved it by building a translation layer with Gemini that maps landmarks to the nearest commercial airports.
- Real-Time State Management: Handling a multi-step process in a SPA without page reloads, ensuring the map, sidebar, and indicators stayed in sync.
- API Concurrency: Fetching flights, hotel estimates, and weather data simultaneously for multiple destinations while maintaining low latency.
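The landmark-to-airport step above is handled by Gemini in our project. As a rough illustration of the underlying idea, the sketch below swaps in a plain deterministic technique: a haversine distance lookup over a small airport table. This kind of check could serve as a sanity test for a model's answer. The `AIRPORTS` table, its approximate coordinates, and the function names are assumptions made for this example.

```python
from math import asin, cos, radians, sin, sqrt

# Illustrative table only: IATA code -> approximate (latitude, longitude).
# A real service would load a full airport dataset instead.
AIRPORTS = {
    "ATH": (37.94, 23.94),   # Athens
    "JTR": (36.40, 25.48),   # Santorini
    "HND": (35.55, 139.78),  # Tokyo Haneda
    "NRT": (35.77, 140.39),  # Tokyo Narita
}

EARTH_RADIUS_KM = 6371.0

def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def nearest_airport(lat: float, lon: float) -> str:
    """Return the IATA code of the closest airport in AIRPORTS."""
    return min(AIRPORTS, key=lambda code: haversine_km((lat, lon), AIRPORTS[code]))

if __name__ == "__main__":
    # Approximate coordinates for Oia (Santorini) and Shibuya (Tokyo).
    print(nearest_airport(36.46, 25.38))
    print(nearest_airport(35.66, 139.70))
```

Pure distance ignores practical factors such as which airports actually have commercial routes from the user's origin, which is one reason a model-based translation layer can be the better fit.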
Accomplishments that we're proud of
- Hybrid AI Orchestration: Successfully combining ElevenLabs' industry-leading transcription with Gemini's visual intelligence to create a cohesive and fast experience.
- Zero-Friction UI: We are proud of the "Aha!" moment when a user uploads a photo and, 5 seconds later, sees a flight route crossing the globe on a beautiful dark-mode map.
- Voice-to-Global-Search: Implementing a system where you can verbally correct an AI and see the results update in real time.
What we learned
This hackathon taught us that the future of travel isn't a search bar—it's intent. We learned how to:
- Orchestrate multiple AI models to perform specialized tasks.
- Implement geofencing and IP-based logic to personalize the experience.
- Design interfaces that feel "alive" through micro-animations and dynamic mapping.
What's next for View it, Visit it
- Multi-City "Vibe" Planning: Uploading multiple photos to generate an optimized itinerary.
- Social AI Integration: Identifying locations from social media reels and immediately finding the best prices.
- Direct Booking: Closing the loop by allowing users to purchase deals directly from the dashboard.
Built With
- api
- css-vanilla
- css3
- dotenv
- fastapi
- gemini
- git
- google-maps
- google-vision
- html5
- ipapi.co
- javascript
- leaflet.js
- lucide-icons
- python
- python-multipart
- render
- skyscanner-api
- uvicorn
- wikipedia-api