Inspiration
As students looking for housing in Toronto, we built this project around a problem we face ourselves. Working with the data showed us that some of our assumptions about listing prices were biased: we had been generalizing from the outliers we happened to notice. That underlined how much accurate interpretation depends on data. Other groups, such as young professionals and first-time buyers, also struggle to find affordable housing in major cities like Toronto. This project aims to provide a real estate price prediction model that helps buyers, investors, and renters make informed decisions.
What it does
The model takes relevant real estate features (number of bedrooms, number of bathrooms, monthly maintenance fees, and property size) and returns an estimated price. We built the project as a Streamlit web application, which makes our T-PPP app user-friendly and accessible.
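To make the input-to-estimate step concrete, here is a minimal sketch, not our actual app code. The column names, the helper functions `build_feature_row` and `estimate_price`, and the tiny linear stand-in model with made-up prices are all assumptions for illustration; the real app uses the trained model described in the next section.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical column names; the real dataset's schema is not shown here.
FEATURES = ["bedrooms", "bathrooms", "maintenance_fee", "size_sqft"]


def build_feature_row(bedrooms, bathrooms, maintenance_fee, size_sqft):
    """Pack one listing into a single-row DataFrame in the model's column order."""
    if min(bedrooms, bathrooms, maintenance_fee, size_sqft) < 0:
        raise ValueError("listing features must be non-negative")
    return pd.DataFrame(
        [[bedrooms, bathrooms, maintenance_fee, size_sqft]], columns=FEATURES
    )


def estimate_price(model, **listing):
    """Return the model's price estimate for one listing as a float."""
    return float(model.predict(build_feature_row(**listing))[0])


# Stand-in model fitted on made-up rows, only so the example runs end to end.
train = pd.DataFrame(
    [
        [1, 1, 400, 550],
        [2, 1, 550, 750],
        [2, 2, 650, 900],
        [3, 2, 800, 1200],
        [4, 3, 950, 1600],
    ],
    columns=FEATURES,
)
prices = [520_000, 650_000, 760_000, 940_000, 1_250_000]
model = LinearRegression().fit(train, prices)

estimate = estimate_price(
    model, bedrooms=2, bathrooms=1, maintenance_fee=600, size_sqft=800
)
```

In a Streamlit app, the four values would typically come from input widgets such as `st.number_input`, and the estimate would be written back to the page.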
How we built it
The project consists of four main components:
- Data Collection & Cleaning: In addition to the real estate data, we collected TTC subway data to use in our analysis. Using pandas, we cleaned and processed both datasets, removing duplicate and missing values.
- EDA Findings: We analyzed the correlation between different factors and the real estate listing price. Based on this, we selected four features to include in our model: number of bedrooms, number of bathrooms, property size, and maintenance fee.
- Model Training & Evaluation: We trained multiple machine learning models, including Gradient Boosting, XGBoost, and LightGBM, and compared their performance to select the best one. Gradient Boosting performed best.
- Web App Development: We created a web application using Streamlit to make the model easy to use, allowing users to input property details and receive real-time price predictions.
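The cleaning, correlation check, and model comparison described above can be sketched as follows. This is an illustration under stated assumptions, not our project code: the listings are synthetic, the column names are hypothetical, and a linear regression baseline stands in for the XGBoost and LightGBM candidates so the example depends only on scikit-learn. Mean absolute error is used here as one reasonable comparison metric.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical column names; the real dataset's schema is not shown here.
FEATURES = ["bedrooms", "bathrooms", "size_sqft", "maintenance_fee"]
TARGET = "price"


def make_synthetic_listings(n=400, seed=0):
    """Generate made-up listings so the sketch runs without the real data."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "bedrooms": rng.integers(1, 5, n),
        "bathrooms": rng.integers(1, 4, n),
        "size_sqft": rng.uniform(400, 2000, n),
        "maintenance_fee": rng.uniform(200, 1200, n),
    })
    noise = rng.normal(0, 20_000, n)
    df[TARGET] = 150_000 + 500 * df["size_sqft"] + 40_000 * df["bedrooms"] + noise
    # Inject duplicates and missing values to exercise the cleaning step.
    df = pd.concat([df, df.head(10)], ignore_index=True)
    df.loc[[3, 7], "size_sqft"] = np.nan
    return df


def clean(df):
    """Drop exact duplicate rows, then rows missing any modeled column."""
    deduped = df.drop_duplicates()
    return deduped.dropna(subset=FEATURES + [TARGET]).reset_index(drop=True)


def compare_models(df, seed=0):
    """Fit each candidate on a train split and return test-set MAE per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[FEATURES], df[TARGET], test_size=0.2, random_state=seed
    )
    candidates = {
        "linear_baseline": LinearRegression(),
        "gradient_boosting": GradientBoostingRegressor(random_state=seed),
    }
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = mean_absolute_error(y_test, model.predict(X_test))
    return scores


raw = make_synthetic_listings()
listings = clean(raw)
# Correlation of each feature with price, as a quick EDA check.
correlations = listings[FEATURES].corrwith(listings[TARGET])
scores = compare_models(listings)
```

Because XGBoost and LightGBM both provide scikit-learn-style regressors, they could be added to the `candidates` dictionary and scored by the same loop.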
Challenges we ran into
We are a team of three economics students and one computer science student, and none of us had prior experience in machine learning. Learning the concepts, tools, and methodologies from scratch was a challenge, but through intensive self-learning and collaboration we were able to build a working model. Other challenges included handling highly correlated variables in the dataset, such as the relationship between property size and price, and optimizing hyperparameters to balance accuracy and computational efficiency. Integrating real-world real estate data and ensuring that it was clean and structured for machine learning also posed some initial difficulties.
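The trade-off between accuracy and compute in hyperparameter tuning can be illustrated with a small grid search. This is a sketch under assumptions rather than the search we actually ran: the data comes from scikit-learn's `make_regression`, and the grid values, fold count, and scoring metric are illustrative choices.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the four listing features; not the project's data.
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

# A deliberately small grid: each extra value multiplies the number of fits.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)

best_params = search.best_params_
best_mae = -search.best_score_  # flip the sign back to a plain MAE
# 8 parameter combinations x 3 folds, not counting the final refit.
n_fits = len(search.cv_results_["params"]) * 3
```

Keeping the grid small bounds the cost: the number of model fits grows with the product of the option counts times the number of folds, so widening any one parameter list has a multiplicative effect on runtime.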
Accomplishments that we're proud of
- Learning and implementing machine learning in a short time frame despite having no prior experience.
- Developing a web app that provides users with easy access to price predictions.
- Achieving high model accuracy, with Gradient Boosting outperforming other models.
- Optimizing feature selection and hyperparameters to improve prediction reliability.
- Creating a scalable framework that can be expanded with additional real estate data in the future.
What we learned
Through this project, we deepened our understanding of how different property attributes affect real estate pricing. We also gained hands-on experience with machine learning model evaluation, hyperparameter tuning, and web app deployment. Additionally, we learned how to clean and structure real-world datasets to improve model performance.
What's next for Toronto Property Price Prediction
Moving forward, we aim to:
- Enhance model generalization by expanding training data beyond Toronto’s downtown core.
- Improve the web app UI for a more seamless user experience.
- Incorporate real-time market data to adjust predictions based on changing housing trends.
- Explore additional location-based factors such as neighborhood amenities and walkability scores.
Built With
- folium
- geopandas
- joblib
- lightgbm
- matplotlib
- numpy
- pandas
- python
- scikit-learn
- scipy
- seaborn
- shapely
- streamlit
- xgboost