About the Project: TerraFind

Inspiration

This project started with a simple question: why is it so hard to search for things in satellite imagery?

I’ve always been fascinated by how much we can see from above — farmland patterns, forest changes, environmental shifts. But making sense of it all usually requires a background in remote sensing or GIS. I wanted to change that. I wanted to make it possible for anyone to ask a question like, "Where’s the best place to grow crops?" and get meaningful answers powered by real satellite data.

That’s what inspired TerraFind, a natural language search engine for the Earth.

The Journey

I didn’t come into this as a geospatial expert or an AI researcher. I had to figure it out step by step, and it was a lot more difficult than I expected. I spent hours trying to understand how vegetation indices work, how satellite imagery is interpreted, and how to balance the backend logic with a clean image display in the frontend.

One of the biggest learning curves was embeddings. How do you take a description of land and turn it into something a computer can search through? I learned to use OpenAI’s models to turn image descriptions into vectors, and FAISS to build a fast index that could return the most relevant results for any query.
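
The core idea can be sketched without any libraries: represent each description as a vector, then rank by cosine similarity against the query vector. FAISS replaces the brute-force loop below with a fast index over real high-dimensional embeddings; the 3-dimensional vectors and filenames here are made up purely for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(index, query_vec, k=2):
    """Return the ids of the k vectors most similar to the query."""
    # index: list of (image_id, vector) pairs
    ranked = sorted(index, key=lambda item: cosine_similarity(item[1], query_vec),
                    reverse=True)
    return [image_id for image_id, _ in ranked[:k]]

# Toy 3-dim "embeddings" standing in for real model output
index = [
    ("desert.png",  [0.9, 0.1, 0.0]),
    ("wetland.png", [0.1, 0.9, 0.2]),
    ("city.png",    [0.0, 0.2, 0.9]),
]
query = [0.2, 0.8, 0.1]  # imagine this encodes "high moisture, low development"
print(search(index, query, k=1))  # ['wetland.png']
```

With real embeddings the vectors have hundreds or thousands of dimensions, which is where an indexed structure like FAISS becomes necessary.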

I also had to dive deep into prompt engineering — figuring out exactly how to ask the model to describe what’s in an image in a way that’s both structured and flexible. It was a lot of trial and error.
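
A prompt of the kind described above might look like the sketch below. This is a hypothetical template, not the project’s actual wording; the message shape follows the OpenAI-style multimodal chat format, with `build_messages` and the example URL invented for illustration.

```python
# Hypothetical prompt template: asks for structure (fixed headings)
# while leaving room for varied, image-specific detail.
PROMPT = """You are analyzing a satellite image.
Describe it under these headings, one sentence each:
- Vegetation: type and density
- Water: rivers, lakes, coastline, moisture
- Terrain: flat, hilly, or mountainous
- Human activity: roads, buildings, farmland
Be specific, and avoid reusing the same phrases across images."""

def build_messages(image_url):
    """Assemble a single user turn pairing the prompt with an image."""
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": PROMPT},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }]

msgs = build_messages("https://example.com/tile_38.89N_77.01W.png")
```

The fixed headings keep descriptions comparable across images, while the closing instruction pushes back against the repetitiveness mentioned above.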

How It Works

I gathered satellite images of various locations using the USGS M2M API (this provides satellite imagery of the entire planet and is probably one of the US government's greatest contributions). Each filename contained coordinates. For each image, I used GPT-4o to generate a detailed description covering vegetation, water, terrain, human activity, and more. Each description was embedded using a language model and stored in a FAISS index. When a user types a natural language query, like "areas with high moisture and low development," the system finds the most relevant images.
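
Pulling coordinates out of filenames can be done with a small regex. The naming scheme below is hypothetical (the writeup doesn’t give the exact format), but the pattern of parsing latitude/longitude plus a hemisphere letter is the same for any scheme:

```python
import re

# Hypothetical filename scheme, e.g. "tile_38.89N_77.01W.tif"
FILENAME_RE = re.compile(r"tile_(\d+\.\d+)([NS])_(\d+\.\d+)([EW])\.")

def coords_from_filename(name):
    """Extract (lat, lon) in signed decimal degrees from a filename."""
    m = FILENAME_RE.search(name)
    if not m:
        raise ValueError(f"no coordinates found in {name!r}")
    lat = float(m.group(1)) * (1 if m.group(2) == "N" else -1)
    lon = float(m.group(3)) * (1 if m.group(4) == "E" else -1)
    return lat, lon

print(coords_from_filename("tile_38.89N_77.01W.tif"))  # (38.89, -77.01)
```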

Challenges

I struggled a lot with the details. Converting coordinates from degrees-minutes-seconds to decimal form was surprisingly tricky. Getting consistent image analysis out of the model required lots of prompt tweaking. Sometimes the descriptions were too vague; other times they were too repetitive. I also had to learn how to store and query high-dimensional embeddings — which was a new world to me.
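
The degrees-minutes-seconds conversion mentioned above is short but easy to get wrong, mostly because of the hemisphere sign. A minimal version (the coordinate values below are just sample inputs):

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees-minutes-seconds to signed decimal degrees."""
    dec = degrees + minutes / 60 + seconds / 3600
    # South latitudes and West longitudes are negative
    return -dec if hemisphere in ("S", "W") else dec

lat = dms_to_decimal(38, 53, 23, "N")  # ~38.8897
lon = dms_to_decimal(77, 0, 32, "W")   # ~-77.0089
```

The common pitfall is applying the sign to the degrees first and then adding positive minutes and seconds, which silently shifts southern and western coordinates toward zero.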

There were many points where I felt stuck, and honestly, I questioned if it would work at all. But I kept going, solving one piece at a time, and eventually it came together.

What I Learned

This project taught me not just about AI or geospatial data, but about problem-solving — how to work through ambiguity, how to learn things on the fly, and how to keep building even when the outcome isn’t clear yet. It’s one of the most challenging things I’ve worked on, and one of the most rewarding.

Built With

GPT-4o, OpenAI embeddings, FAISS, USGS M2M API
