Backstory
Following the sustained decline of the stock market, people's investing patterns have gradually shifted toward the real estate market, and this shift has certainly boosted it. Even so, most of the real estate industry still runs on trust and has not yet been automated. House price prediction is important for driving real estate efficiency. Previously, prices were estimated from the experience of brokers, so there is clearly large scope for modernization. Hence the inspiration!
Problem Statement
House price prediction is important for driving real estate efficiency. Previously, house prices were determined by looking at the buying and selling prices in a locality. A house price prediction model is therefore essential for filling the information gap and improving real estate efficiency. House prices generally rise from year to year, so there is a need for a system that predicts future prices. Such predictions can help a developer set the selling price of a house and help a customer choose the right time to buy. The price of a house is strongly affected by factors such as its physical condition, area, concept, and location. With this model, we would be able to predict prices better.
Technologies Required
This project falls under the category of Machine Learning. This project requires Python and the following Python libraries installed:
- NumPy
- Pandas
- matplotlib
- Scikit-learn
- Seaborn
- Streamlit
You will also need software installed to run a Jupyter Notebook. In addition, you will need Power BI installed to view the visualizations described below.
Code
The code for the machine learning model is in the Jupyter notebook file HTF House Price Prediction.ipynb. You will also need to include the TransformedHousePrice.csv dataset file, which is fed to the machine learning model. While some code has already been implemented to get you started, you will need to execute all the code blocks to successfully complete the project. The dataset contains 21609 rows and 31 columns. The raw house price data contains unclean (i.e. dirty) data.
HTF.pbix is a Power BI file intended to give users a rich, out-of-the-box visualization experience. If you are interested in how the visualizations are created, please feel free to explore this Power BI file.
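As a rough illustration of the loading step, the sketch below reads the dataset with pandas and reports its shape. The helper name load_house_prices is hypothetical and not taken from the notebook, and the file location is an assumption (the CSV is assumed to sit next to the notebook).

```python
from pathlib import Path

import pandas as pd


def load_house_prices(csv_path):
    """Read the dataset into a DataFrame and report its shape.

    Hypothetical helper for illustration; the notebook may load the file differently.
    """
    path = Path(csv_path)
    if not path.exists():
        raise FileNotFoundError(f"Dataset not found: {path}")
    df = pd.read_csv(path)
    print(f"Loaded {df.shape[0]} rows and {df.shape[1]} columns")
    return df


# Usage (assumes the CSV sits in the working directory):
# df = load_house_prices("TransformedHousePrice.csv")
# df.head()
```

If the file matches the description above, the reported shape should agree with the stated row and column counts.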
Project Modules
Exploratory Data Analysis
The main purpose of EDA is to help look at data before making any assumptions. It can help identify obvious errors, as well as better understand patterns within the data, detect outliers or anomalous events, and find interesting relations among the variables.
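The checks described above (obvious errors, outliers, relations among variables) might be sketched as follows. This is a minimal example, not the notebook's actual EDA: the column names price, area, and bedrooms are assumptions, and the 1.5 * IQR rule is just one common way to flag outliers.

```python
import pandas as pd


def quick_eda(df, target="price"):
    """Return a few basic EDA summaries; `target` is an assumed column name."""
    numeric = df.select_dtypes(include="number")
    # Count of missing values per column
    missing = df.isna().sum()
    # Flag outliers in the target with the 1.5 * IQR rule
    q1, q3 = numeric[target].quantile([0.25, 0.75])
    iqr = q3 - q1
    is_outlier = (numeric[target] < q1 - 1.5 * iqr) | (numeric[target] > q3 + 1.5 * iqr)
    # Correlation of every other numeric column with the target
    correlations = numeric.corr()[target].drop(target).sort_values(ascending=False)
    return {
        "summary": numeric.describe(),
        "missing": missing,
        "outliers": df[is_outlier],
        "correlations": correlations,
    }


# Tiny made-up sample, for illustration only
demo = pd.DataFrame({
    "price": [100, 120, 110, 130, 1000],
    "area": [50, 60, 55, 65, 70],
    "bedrooms": [2, 3, None, 3, 4],
})
report = quick_eda(demo)
print(report["missing"])
print(report["outliers"])
```

On the real dataset, the same function could be called on the loaded DataFrame, with target set to whatever the price column is actually named.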
Data Preprocessing
Real-world data generally contains noise and missing values, and may be in a format that cannot be used directly by machine learning models. Data preprocessing is a required task that cleans the data and makes it suitable for a machine learning model, which also increases the model's accuracy and efficiency. Here, the data was cleaned and refined.
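One possible shape for such a cleaning step is sketched below. The source does not say which cleaning steps were applied, so the choices here (dropping duplicates, median imputation, standard scaling) and the column names are assumptions for illustration only.

```python
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler


def preprocess(df, target="price"):
    """Clean a raw DataFrame and return scaled numeric features plus the target."""
    # Drop exact duplicate rows and rows that have no target value
    clean = df.drop_duplicates().dropna(subset=[target])
    y = clean[target].reset_index(drop=True)
    features = clean.drop(columns=[target]).select_dtypes(include="number")
    # Fill missing numeric values with the column median
    imputed = SimpleImputer(strategy="median").fit_transform(features)
    # Rescale each feature to zero mean and unit variance
    scaled = StandardScaler().fit_transform(imputed)
    X = pd.DataFrame(scaled, columns=features.columns)
    return X, y


# Made-up raw sample: one duplicate row, one missing price, one missing area
raw = pd.DataFrame({
    "price": [100, 100, 200, None, 300],
    "area": [10, 10, None, 40, 30],
    "bedrooms": [1, 1, 2, 3, 3],
})
X, y = preprocess(raw)
print(X)
```

In a real training workflow, the imputer and scaler would normally be fitted on the training split only and then applied to the test split, so that no information leaks from the test data.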
Data Analysis
Data analysis is the process of cleaning, changing, and processing raw data and extracting actionable, relevant information that helps businesses make informed decisions. The procedure helps reduce the risks inherent in decision-making by providing useful insights and statistics, often presented in charts, images, tables, and graphs.
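As a small example of extracting a summary statistic of this kind, the sketch below aggregates prices per group with pandas. The column names location and price are hypothetical, and the sample values are made up.

```python
import pandas as pd


def price_by_group(df, group_col="location", target="price"):
    """Summarize the target per group, highest mean price first."""
    return (
        df.groupby(group_col)[target]
        .agg(["count", "mean", "median"])
        .sort_values("mean", ascending=False)
    )


# Made-up sample, for illustration only
sales = pd.DataFrame({
    "location": ["A", "A", "B", "B", "B"],
    "price": [100, 300, 100, 150, 200],
})
stats = price_by_group(sales)
print(stats)
```

A table like this can be exported or passed to a charting tool to compare localities side by side.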
Data Visualization
Data visualization is the representation of data through the use of common graphics, such as charts, plots, infographics, and even animations. These visual displays of information communicate complex data relationships and data-driven insights in a way that is easy to understand. Here, we have used Power BI for visualization.
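The project's charts were built in Power BI. For readers who prefer to stay inside the notebook, a comparable quick look could be sketched with matplotlib, as below. This is an alternative under assumed column names (area, price), not a reproduction of the Power BI report.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_price_overview(df, feature="area", target="price"):
    """Draw a price histogram next to a feature-vs-price scatter plot."""
    fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))
    # Left panel: distribution of the target
    ax_hist.hist(df[target].dropna(), bins=20)
    ax_hist.set_xlabel(target)
    ax_hist.set_ylabel("number of houses")
    ax_hist.set_title("Price distribution")
    # Right panel: relation between one feature and the target
    ax_scatter.scatter(df[feature], df[target], alpha=0.6)
    ax_scatter.set_xlabel(feature)
    ax_scatter.set_ylabel(target)
    ax_scatter.set_title("Price vs. feature")
    fig.tight_layout()
    return fig


# Made-up sample, for illustration only
houses = pd.DataFrame({
    "area": [50, 60, 55, 65, 70],
    "price": [100, 120, 110, 130, 150],
})
fig = plot_price_overview(houses)
# In a notebook the figure renders inline; elsewhere, save it with fig.savefig("overview.png")
```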
Built With
- jupyternotebook
- matplotlib
- numpy
- pandas
- scikit-learn
- seaborn
- streamlit
- systemmodellear