Project Story: POV — An AI-Powered Landmark Discovery App

Inspiration

The idea for POV (Point of View) came from our shared experiences while traveling. We often found ourselves standing in front of buildings or landmarks without fully understanding their history, cultural meaning, or relevance. Traditional travel tools required manual searching and disrupted the moment.

We wanted to create an app that starts from a human point of view:

What are we looking at right now, and why does it matter?

That question became the foundation of POV.


What It Does

POV follows a streamlined and intuitive workflow.

User Authentication

  • Account creation and login to personalize the experience
  • Tracks user activity and progress

Landmark Detection

Users can:

  • Take photos directly within the app
  • Upload existing images

These images are processed using the Gemini API to identify landmarks.

Contextual Information Display

Once a landmark is detected, POV presents:

  • Historical background
  • Cultural and architectural significance
  • Stories and meaningful labels related to the location

Conversational Exploration

POV suggests follow-up questions so users can:

  • Dive deeper into history
  • Explore cultural context
  • Learn about current events happening near the landmark

This creates a natural, ongoing chat experience.

Travel Wrap (Progress Tracking)

Every landmark a user visits is recorded. POV generates a personalized wrap showing how many landmarks have been explored, encouraging continued discovery.


How We Built It

Building POV taught us how artificial intelligence can connect the physical world to digital knowledge in real time. Through this project, we learned:

  • Integrating computer vision with real-world image input
  • Using the Gemini API for landmark recognition, contextual understanding, and multi-turn conversations
  • Designing conversational AI as interactive support rather than a static information source
  • Prioritizing user experience design in AI-powered applications

We also learned that AI is most effective when it enhances curiosity instead of replacing it.


Challenges We Ran Into

  • Ensuring accurate landmark recognition despite lighting, angles, and partial views
  • Maintaining conversation context across multiple user questions
  • Balancing technical ambition with usability and focusing on a clean, intuitive core experience

Accomplishments We're Proud Of

POV transforms travel from passive observation into active exploration. By enabling users to scan landmarks, ask questions, and learn about both historical and real-time context, we encourage deeper engagement with the world from our own point of view.

Ultimately, POV is about turning what we see into what we understand.


What We Learned

Through building POV, we gained experience in:

  • Computer vision integration
  • Gemini API usage for recognition and context
  • Multi-turn conversational design
  • Crafting user experiences that make AI feel like a companion

And again, we learned that AI works best when it sparks curiosity.


What's Next

POV will continue evolving to deepen the connection between people and the places they encounter. By helping users scan landmarks, ask questions, and learn about both historical and real-time context, we aim to turn everyday travel into meaningful discovery.

POV is ultimately about transforming seeing into understanding.

Built With

Share this project:

Updates