A-list Housings: The AI Real Estate Analyst

Inspiration

The Canadian housing market is overwhelming. First-time buyers are juggling volatile interest rates, complex budgets, and messy listing data. We realized that existing tools are just "calculators"—they tell you the monthly payment, but they don't tell you the risk.

We wanted to answer the hard questions: "Is this house actually worth the asking price?" and "Will it hold its value in 5 years?"

What it does

A-list Housings is an intelligent housing analytics engine that combines Generative AI with Statistical Forecasting.

  • The "Crystal Ball" (ML Valuation Engine): Unlike simple trend lines, our system predicts a property's value 5 years into the future by separating "Market Hype" from "Intrinsic Value."
  • The AI Advisor: We integrated Gemini 2.5 Flash to act as a financial guardian. It analyzes the property against your stored financial profile (Income, Savings, Risk Tolerance) to give personalized warnings.
  • Live Listing Scraper: Users can drop in a URL from a real estate site, and our custom scraper converts the messy HTML into clean structured data.

How we built it

We built a "Hybrid Intelligence" pipeline:

1. The Machine Learning Core (The "Brain")

This is the most technically complex part of A-list Housings. We rejected standard linear regression. Instead, we built a Statistically Decomposed Prediction System:

  • Macro-Economic Layer (SARIMA): We trained a Time-Series model on more than 40 years of Statistics Canada New Housing Price Index (HPI) data (1981–2025) to forecast the national market baseline.
$$ \text{Growth}_{\text{market}} = \text{SARIMA}(\text{HPI}_{\text{history}}) $$


  • Micro-Economic Layer (XGBoost): We trained a Gradient Boosting Regressor on 44,000+ active listings to determine the "Hedonic Value" of specific features (e.g., bedrooms, finished basements, location).

$$ \text{Value}_{\text{features}} = \text{XGBoost}(x_{\text{bedrooms}}, x_{\text{sqft}}, x_{\text{location}}, \dots) $$

  • The Fusion Algorithm: We combine these using a Gradual Correction Formula. If a house is overpriced based on its features, our model assumes the market will "correct" this inefficiency over 3 years rather than instantly.

$$ P_{\text{future}} = P_{\text{current}} \times (1 + r_{\text{market}})^t \times (1 + \delta_{\text{correction}}) $$
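
To make the formula concrete, here is a minimal Python sketch. The writeup does not define $\delta_{\text{correction}}$, so the version below is an assumption: it takes the relative gap between the feature-based value and the asking price and phases it in linearly over the 3-year correction window. The function names, the linear ramp, and the sample numbers are illustrative, not the project's actual code.

```python
def correction_factor(current_price: float, feature_value: float,
                      years: float, correction_years: float = 3.0) -> float:
    """Share of the price gap assumed to have closed after `years` (assumed form)."""
    # Relative gap: negative when the listing is priced above its feature-based value.
    gap = (feature_value - current_price) / current_price
    # Linear ramp from 0 to 1 over the correction window, then held at 1.
    progress = min(max(years / correction_years, 0.0), 1.0)
    return gap * progress


def future_price(current_price: float, feature_value: float,
                 market_rate: float, years: float,
                 correction_years: float = 3.0) -> float:
    """P_future = P_current * (1 + r_market)^t * (1 + delta_correction)."""
    delta = correction_factor(current_price, feature_value, years, correction_years)
    return current_price * (1.0 + market_rate) ** years * (1.0 + delta)


# Hypothetical listing: asking 500,000, features suggest 450,000, 3% annual market growth.
for t in (0, 1, 3, 5):
    print(t, round(future_price(500_000, 450_000, 0.03, t)))
```

In this sketch, `market_rate` would come from the SARIMA forecast and `feature_value` from the XGBoost model. An overpriced listing keeps its asking price at t = 0 and only absorbs the full discount once the correction window has passed.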

2. The Agentic Backend

  • Context-Aware AI: Our Flask backend builds a system prompt that injects the user's Financial Risk Profile directly into Gemini's context window. This steers the AI toward math-based advice rather than generic real estate fluff.
  • Dynamic Scraping: We built a robust scraper that parses unstructured HTML from listing sites into the JSON format our ML models require.
  • State Management: We used Auth0 for identity management, syncing users to a local SQLite database to persist their saved listings and risk preferences.
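
The state-management step could look roughly like the sketch below. It assumes the user is keyed by the `sub` claim from the Auth0 ID token; the table names, column names, and helper functions are hypothetical, since the writeup does not describe the actual schema.

```python
import sqlite3

# Hypothetical schema: one row per Auth0 user, plus their saved listings.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    auth0_sub      TEXT PRIMARY KEY,
    email          TEXT,
    risk_tolerance TEXT
);
CREATE TABLE IF NOT EXISTS saved_listings (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    auth0_sub   TEXT NOT NULL REFERENCES users(auth0_sub),
    listing_url TEXT NOT NULL,
    UNIQUE (auth0_sub, listing_url)
);
"""


def init_db(conn: sqlite3.Connection) -> None:
    conn.executescript(SCHEMA)


def sync_user(conn: sqlite3.Connection, claims: dict) -> None:
    """Insert the user on first login; refresh the email on later logins."""
    conn.execute(
        "INSERT INTO users (auth0_sub, email) VALUES (?, ?) "
        "ON CONFLICT(auth0_sub) DO UPDATE SET email = excluded.email",
        (claims["sub"], claims.get("email")),
    )
    conn.commit()


def save_listing(conn: sqlite3.Connection, auth0_sub: str, listing_url: str) -> None:
    """Store a listing once per user; repeated saves are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO saved_listings (auth0_sub, listing_url) VALUES (?, ?)",
        (auth0_sub, listing_url),
    )
    conn.commit()
```

Because the upsert only touches `email`, a stored `risk_tolerance` survives repeated logins.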

Challenges we faced

  • The "Crystal Ball" Fallacy: Predicting 2031 prices is inherently noisy. Training one huge model on all data failed. We solved this by decomposing the problem: one model for Time (SARIMA) and one model for Space (XGBoost).
  • Data Scarcity: Real-time Canadian housing data is locked down. Building a scraper that could handle dynamic CSS classes and missing data fields without crashing the app was a significant engineering hurdle.
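
One way to tolerate both problems is sketched below using only the Python standard library. It matches on a stable class-name prefix rather than the full generated class, and it reports missing fields as `None` instead of raising. The class prefixes and field names are made up for illustration; real listing sites differ, and this is not the project's actual scraper.

```python
import json
import re
from html.parser import HTMLParser

# Hypothetical mapping from a stable class-name prefix to an output field.
FIELD_PREFIXES = {
    "listing-price": "price",
    "listing-beds": "bedrooms",
    "listing-sqft": "sqft",
}


class ListingParser(HTMLParser):
    def __init__(self):
        super().__init__()
        # Every field starts as None so missing data never raises a KeyError.
        self.fields = {name: None for name in FIELD_PREFIXES.values()}
        self._current = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            for prefix, field in FIELD_PREFIXES.items():
                # Prefix match survives generated suffixes such as "-x9f2".
                if cls.startswith(prefix):
                    self._current = field
                    return

    def handle_data(self, data):
        text = data.strip()
        if self._current and text:
            if self.fields[self._current] is None:
                self.fields[self._current] = text
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def to_number(text):
    """Pull the first number out of strings like '$749,900' or '3 beds'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text or "")
    return float(match.group().replace(",", "")) if match else None


def parse_listing(html: str) -> dict:
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return {name: to_number(raw) for name, raw in parser.fields.items()}


sample = '<div class="listing-price-x9f2"><span>$749,900</span></div>'
print(json.dumps(parse_listing(sample)))
```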

What we learned

  • Price ≠ Value: A cheap house in a dying market is a bad asset; an expensive house in a growing market is a good one. Our ML model learned to distinguish between "Market Momentum" and "Feature Quality."
  • Prompt Engineering is Logic: We learned that we could replace complex Python logic with well-structured system prompts. By explicitly telling Gemini how to weigh "User Priorities," we got highly personalized analysis.
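
As a rough illustration of the idea, the sketch below assembles a system prompt from a stored profile and a set of weighted priorities. The `FinancialProfile` class, the 1 to 5 weight scale, and the prompt wording are all assumptions for the example; the call that sends the prompt to Gemini is left out.

```python
from dataclasses import dataclass, field


@dataclass
class FinancialProfile:
    annual_income: float
    savings: float
    risk_tolerance: str  # e.g. "low", "medium", "high"
    # Hypothetical priority weights: label -> importance from 1 to 5.
    priorities: dict = field(default_factory=dict)


def build_system_prompt(profile: FinancialProfile) -> str:
    lines = [
        "You are a cautious financial advisor for a first-time home buyer.",
        "Base every warning on the numbers below and show the arithmetic.",
        f"Annual income: ${profile.annual_income:,.0f}",
        f"Savings: ${profile.savings:,.0f}",
        f"Risk tolerance: {profile.risk_tolerance}",
    ]
    if profile.priorities:
        lines.append("Weigh the user's priorities as follows (5 = most important):")
        # List the most important priorities first.
        ranked = sorted(profile.priorities.items(), key=lambda kv: -kv[1])
        for name, weight in ranked:
            lines.append(f"- {name}: {weight}")
    return "\n".join(lines)


profile = FinancialProfile(95_000, 60_000, "low",
                           {"commute": 2, "school district": 5})
print(build_system_prompt(profile))
```

Keeping the weights in the prompt rather than in scoring code is what lets the model do the trade-off reasoning, at the cost of less predictable output than explicit Python logic.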

What's next for A-list Housings

  • Visualizing the Forecast: Currently, our ML model outputs raw numbers. The next step is to render interactive charts showing the "Confidence Interval" of our 5-year predictions.
  • Bank API Integration: Replacing manual salary input with Plaid to pull real-time budget constraints directly from bank accounts.
