## Inspiration

Anyone who has searched for an apartment in New York City knows the pain: six tabs open across Zillow, StreetEasy, Craigslist, and Facebook Marketplace, manually cross-referencing prices and neighborhoods, clicking through dead links. The search experience hasn't kept up with the way people actually think about housing. We wanted to replace the filter dropdowns with a conversation. You shouldn't have to translate "close to a park, dog-friendly, not too far from the L train" into a grid of checkboxes; you should be able to just say it.

## What It Does

House Finder is a natural-language apartment search tool. Instead of juggling a dozen real-estate sites with rigid filters, you just describe what you want — "sunny 1-bed in Brooklyn under $2,500 with laundry" — and the app finds real, live listings, scrapes them, extracts structured data with AI, and renders them on an interactive map.

## How We Built It

The project is a three-stage pipeline sitting beneath a React frontend:

Stage 1: Web Search. A natural-language query is passed directly to the Linkup API, which performs a deep web search and returns a ranked list of real listing URLs. Results are written to results.json.

Stage 2: Scrape & Extract. For each URL, a BrowserUse Agent loads the page, strips boilerplate (scripts, navbars, footers), and captures up to 8,000 characters of visible text. That raw text is sent to GPT-4o-mini with a structured prompt, which extracts:
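The strip-and-truncate step can be sketched roughly as follows. This is not the BrowserUse-based code the project runs; it is a minimal standard-library stand-in that assumes the page HTML is already in hand. The names `visible_text` and `SKIPPED_TAGS`, and the exact set of tags treated as boilerplate, are illustrative assumptions.

```python
from html.parser import HTMLParser

MAX_CHARS = 8_000  # character cap mentioned in the write-up
# Assumed set of boilerplate tags; the real pipeline may strip a different set.
SKIPPED_TAGS = {"script", "style", "noscript", "nav", "footer"}


class VisibleTextParser(HTMLParser):
    """Collects text that is not nested inside one of the skipped tags."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped regions.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def visible_text(html: str, limit: int = MAX_CHARS) -> str:
    """Return the page's visible text, truncated to `limit` characters."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()[:limit]
```

Capping the text keeps the prompt size predictable, at the cost of possibly cutting off details that appear late on a long page.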

$$\text{listing} = \{\ \text{title},\ \text{address},\ \text{price},\ \text{beds},\ \text{baths},\ \text{sqft},\ (\text{lat},\ \text{lng}),\ \text{features}\ \}$$

The results are serialized into a Listings.ts file consumed directly by the frontend.
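One way to picture the record and the serialization step is the sketch below. The field names come from the listing definition above; everything else is an assumption made for illustration: the field types, the flattening of the (lat, lng) pair into two fields, the export name `listings`, and the helper `to_typescript_module`. The project's actual Listings.ts layout may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Listing:
    # Types are assumed; anything the model cannot find stays None.
    title: str
    address: Optional[str] = None
    price: Optional[int] = None
    beds: Optional[float] = None
    baths: Optional[float] = None
    sqft: Optional[int] = None
    lat: Optional[float] = None
    lng: Optional[float] = None
    features: list[str] = field(default_factory=list)


def to_typescript_module(listings: list[Listing]) -> str:
    """Render listings as the source text of a TypeScript module.

    A JSON array is also a valid TypeScript expression, so json.dumps
    is enough to produce the array literal (None becomes null).
    """
    body = json.dumps([asdict(item) for item in listings], indent=2)
    return f"export const listings = {body};\n"
```

Writing the returned string to Listings.ts would give the frontend a plain array it can import without a runtime fetch.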

Stage 3: Frontend. A React + TypeScript app (Vite, Tailwind, shadcn/ui) with three panes:

  • Chat panel — a conversational input that fires new searches
  • Listings panel — scrollable cards that highlight on map hover
  • Map view — Google Maps with markers that pan and zoom to active listings

The two-runtime design (Node.js for search, Python for scraping + AI) lets each stage use the best tool for the job. A single command, node find.js "" (with the search query between the quotes), orchestrates the full pipeline.

## Challenges

Structured extraction from noisy text. Real estate pages are full of boilerplate, ads, and inconsistent formatting. Getting GPT-4o-mini to reliably output clean JSON — especially numeric fields like lat/lng and sqft — required careful prompt engineering and a regex-based JSON extraction fallback for when the model wrapped its response in markdown code fences.
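A fallback of this kind might look like the sketch below. It is an illustration of the idea, not the project's code: the function name `parse_model_json` and the order of the attempts are assumptions. The fence marker is built with string repetition only so that this example does not contain a literal code fence.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # the three-backtick markdown fence marker
# Captures the body of a fenced block, with or without a "json" language tag.
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def parse_model_json(reply: str) -> Optional[dict[str, Any]]:
    """Try progressively looser ways of finding a JSON object in a model reply."""
    candidates = [reply]  # 1. the reply is already clean JSON

    fenced = FENCE_RE.search(reply)
    if fenced:
        candidates.append(fenced.group(1))  # 2. JSON wrapped in a code fence

    # 3. last resort: everything between the first "{" and the last "}"
    start, end = reply.find("{"), reply.rfind("}")
    if 0 <= start < end:
        candidates.append(reply[start : end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate.strip())
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None  # caller decides how to handle an unparseable reply
```

Returning None instead of raising lets the caller skip a bad listing without stopping the whole run.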

## What We Learned

  • LLMs are surprisingly good at schema extraction from messy HTML-stripped text, as long as you're explicit about types and nullability in the prompt.
  • Linkup's search API dramatically reduces the need to maintain a curated list of sources — it handles source discovery, leaving us to focus on the extraction layer.
  • The hardest part of an AI pipeline isn't the AI — it's the glue: error handling across process boundaries, graceful degradation when scrapes fail, and keeping the frontend reactive to an async backend that writes files instead of serving JSON.
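The graceful-degradation point in the last bullet can be illustrated with a small sketch. It assumes a per-URL extraction callable, here the hypothetical `extract_one`, that either returns a listing dict, returns None, or raises; `extract_all` is also an illustrative name, not a function from the project.

```python
from typing import Callable, Iterable, Optional


def extract_all(
    urls: Iterable[str],
    extract_one: Callable[[str], Optional[dict]],
) -> tuple[list[dict], dict[str, str]]:
    """Run extraction per URL, recording failures instead of aborting the batch."""
    listings: list[dict] = []
    failures: dict[str, str] = {}
    for url in urls:
        try:
            listing = extract_one(url)
        except Exception as exc:  # deliberately broad: one bad page must not stop the rest
            failures[url] = f"{type(exc).__name__}: {exc}"
            continue
        if listing is None:
            failures[url] = "no listing data extracted"
        else:
            listings.append(listing)
    return listings, failures
```

Keeping the failure reasons next to the successful listings makes it possible to report partial results to the user rather than an empty map.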

## What's Next

Planned features include roommate matching, adding leases directly to the website, and a subletting option. We would also like to improve the chatbot's search capabilities and add a rating system based on previous reviews and tenant experiences.
