Image to Image

Main Function Page
Main Function Page: Get History Record
Home Page: comment on and browse images
MySQL Database Description

Inspiration

TOtoTO began with a simple observation: when exploring a large campus, people often take photos of buildings or scenes but have no idea what they’re looking at or what’s nearby. Existing navigation tools don’t help much with “visual discovery.” I wanted to build something that lets users upload a photo and instantly know where they are, find similar scenes, and even get an AI-generated explanation or “tour guide” about the location.

What I Built

TOtoTO is a lightweight system that combines:

Image embedding for scene similarity search
Vector indexing to find the closest visual matches
FastAPI backend for simple deployment and integration
A clean interface that returns top-k similar campus scenes along with optional AI commentary

The core idea is: $$\text{query\_img} \xrightarrow{\text{encoder}} \mathbf{v} \xrightarrow{\text{index}} \{\mathbf{v}_1, \mathbf{v}_2, \dots\}$$ …then return the closest matches and generate descriptive context.
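A minimal sketch of that retrieval step, assuming embeddings are plain lists of floats and using brute-force cosine similarity in place of a FAISS index. The function names `cosine_similarity` and `top_k` and the sample vectors are illustrative, not taken from the project.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(
    query: List[float], index: Dict[str, List[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k index entries most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional embeddings; a real encoder produces far longer vectors.
campus_index = {
    "library": [0.9, 0.1, 0.0],
    "gym": [0.0, 1.0, 0.2],
    "clock_tower": [0.8, 0.3, 0.1],
}

matches = top_k([1.0, 0.2, 0.0], campus_index, k=2)
print([name for name, _ in matches])
```

A linear scan like this is only practical for small collections; a dedicated vector index serves the same role at larger scale.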

What I Learned

How to build a minimal but efficient image-retrieval pipeline
Managing embeddings and vector indexes for real-time search
Handling image uploads, preprocessing, and inference in a clean backend
Integrating LLM-based descriptions in a controlled and lightweight way
Keeping the entire project small, understandable, and easy to extend

How I Built It

Collected campus images and processed them into embeddings
Built a vector index that supports fast similarity search
Implemented the FastAPI backend (upload → encode → search → respond)
Added optional LLM output to generate user-friendly explanations
Packaged everything into a simple repo that anyone can run locally
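The upload → encode → search → respond flow can be sketched independently of any web framework, as below. Everything here is an assumption for illustration: `toy_encode` stands in for the real image encoder, `handle_upload` and the response fields are hypothetical, and the `describe` callback stands in for the optional LLM step.

```python
from typing import Callable, Dict, List, Optional


def toy_encode(image_bytes: bytes, dim: int = 4) -> List[float]:
    """Placeholder encoder: fold byte values into `dim` buckets, then L2-normalize."""
    buckets = [0.0] * dim
    for b in image_bytes:
        buckets[b % dim] += 1.0
    norm = sum(x * x for x in buckets) ** 0.5
    return [x / norm for x in buckets] if norm else buckets


def handle_upload(
    image_bytes: bytes,
    index: Dict[str, List[float]],
    k: int = 3,
    describe: Optional[Callable[[str], str]] = None,
) -> dict:
    """Encode the upload, rank index entries, and build a response dict."""
    query = toy_encode(image_bytes)
    # On L2-normalized vectors the inner product equals cosine similarity.
    scored = sorted(
        ((name, sum(q * v for q, v in zip(query, vec))) for name, vec in index.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    matches = [{"scene": name, "score": round(score, 4)} for name, score in scored[:k]]
    response = {"matches": matches}
    # The description step is optional and only runs when a callback is supplied.
    if describe is not None and matches:
        response["description"] = describe(matches[0]["scene"])
    return response


# Hypothetical index built from stand-in byte strings rather than real photos.
index = {
    "library": toy_encode(b"library photo bytes"),
    "gym": toy_encode(b"gym photo"),
}
result = handle_upload(
    b"library photo bytes",
    index,
    k=1,
    describe=lambda scene: f"You appear to be near the {scene}.",
)
print(result)
```

Passing the description step as an optional callback keeps the search path usable when no LLM is configured.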

Challenges

Balancing speed vs. accuracy of the image encoder
Keeping dependencies slim so deployment wouldn’t become a mess
Handling noisy or low-quality photos while still returning reasonable matches
Integrating LLM outputs without making the pipeline slow
Making the system robust enough to handle different lighting conditions and angles

What’s Next

Add more datasets beyond campus scenes
Improve the UI so the experience is smoother
Support map-based visualizations and location refinement
Build a demo site so users can try it without running anything locally
Add 3D campus models to support visualization

Built With

ai
api
css
database
faiss
flask
html
javascript
llm
main
mysql
node.js
npm
python
typescript
vite
vue

Updates

new-tonAA Zhang posted an update — Dec 09, 2025 09:35 PM EST

“Added FAISS-based vector search for image similarity.”
“Integrated AI-generated campus tour guide descriptions.”
“Released full frontend–backend separation with Vue 3 + Flask.”


new-tonAA Zhang started this project — Dec 09, 2025 09:31 PM EST

Leave feedback in the comments!
