Inspiration

This project was inspired by a fundamental question: what truly drives household financial health in Canada? With rising housing prices, post-pandemic recovery, growing inequality, and increasing public discussion around affordability, I wanted to move beyond headlines and analyze real household-level data.

As a data science student interested in finance and economic systems, I was particularly motivated to understand whether wealth is primarily driven by income, education, homeownership, or financial behavior. I also wanted to explore how structural factors—like access to housing and post-secondary education—shape long-term financial outcomes.


What it does

The Canadian Household Finance Analysis examines the financial health of 16,208 Canadian households using 2022 survey data. It analyzes:

  • Net worth distribution and lifecycle wealth accumulation
  • Income patterns and skewness
  • Debt composition (mortgage, credit card, student loans, lines of credit)
  • Homeownership and its impact on wealth
  • Education premium and financial resilience
  • Credit card payment behavior as a financial health indicator
  • Correlation between income, assets, and debt
  • COVID-19 financial impact across households

The project identifies key structural drivers of wealth and highlights patterns of inequality across age groups, education levels, and ownership status.


How we built it

1. Data Cleaning and Preparation

  • Started with 16,241 households and 19 variables
  • Removed 33 extreme outliers (net worth > $20M, income > $1M)
  • Final dataset: 16,208 households
  • Verified that none of the 19 variables contains missing values

We ensured data integrity before any analysis to avoid misleading conclusions.
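The cleaning step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field names `net_worth` and `income`, the helper names, and the sample rows are hypothetical, and it assumes a household was dropped if it exceeded either threshold.

```python
# Thresholds from the write-up; everything else here is an illustrative assumption.
NET_WORTH_CAP = 20_000_000
INCOME_CAP = 1_000_000


def count_missing(rows):
    """Count missing (None) entries per column across all rows."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (1 if value is None else 0)
    return counts


def drop_outliers(rows):
    """Keep households at or below both caps (assumes an either/or rule)."""
    return [
        row for row in rows
        if row["net_worth"] <= NET_WORTH_CAP and row["income"] <= INCOME_CAP
    ]


# Made-up rows for demonstration only, not survey records.
households = [
    {"net_worth": 350_000, "income": 82_000},
    {"net_worth": 25_000_000, "income": 400_000},  # exceeds net worth cap
    {"net_worth": 900_000, "income": 1_500_000},   # exceeds income cap
    {"net_worth": -12_000, "income": 31_000},
]

missing = count_missing(households)
cleaned = drop_outliers(households)
print(missing, len(households) - len(cleaned), "rows removed")
```

Checking for missing values before filtering keeps the two concerns separate, so the count of removed rows reflects outliers only.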

2. Exploratory Data Analysis

We conducted:

  • Distribution analysis (mean, median, percentiles)
  • Skewness assessment for income and net worth
  • Group-based aggregation by age, education, and ownership
  • Debt composition breakdown
  • Credit behavior segmentation

Because income and net worth are right-skewed, we based interpretation on medians and percentiles rather than means alone.
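As an illustration of the group-based aggregation described above, the sketch below computes the median of a value within each group. The function name `median_by_group`, the column names, and the sample values are assumptions made for the example; they are not taken from the survey.

```python
from collections import defaultdict
from statistics import median


def median_by_group(rows, group_key, value_key):
    """Return {group: median of value_key} for the given rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: median(values) for group, values in groups.items()}


# Made-up rows for demonstration only.
sample = [
    {"age_group": "25-34", "net_worth": 40_000},
    {"age_group": "25-34", "net_worth": 120_000},
    {"age_group": "25-34", "net_worth": 60_000},
    {"age_group": "55-64", "net_worth": 700_000},
    {"age_group": "55-64", "net_worth": 900_000},
    {"age_group": "55-64", "net_worth": 5_000_000},
]

medians = median_by_group(sample, "age_group", "net_worth")
print(medians)
```

The same helper works for education level or ownership status by changing `group_key`. Using the median means the single 5,000,000 value in the older group does not dominate that group's summary.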

3. Correlation and Structural Analysis

We built correlation matrices to identify key drivers of net worth:

  • Strong correlation with home value
  • Strong correlation with liquid assets (bank deposits, TFSA balances)
  • Moderate correlation with income

This suggests that net worth tracks long-term asset ownership more closely than it tracks current income.
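A correlation matrix of this kind could be built along these lines. The sketch uses a hand-written Pearson coefficient so it runs with the standard library alone. The column names and the toy values (in thousands of dollars) are assumptions for illustration and do not reproduce the project's results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] = correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def rank_drivers(matrix, target):
    """Sort the other columns by absolute correlation with the target."""
    others = [(name, r) for name, r in matrix[target].items() if name != target]
    return sorted(others, key=lambda pair: abs(pair[1]), reverse=True)


# Toy values in thousands of dollars, made up for demonstration only.
columns = {
    "net_worth": [100, 400, 650, 900, 1500],
    "home_value": [0, 300, 500, 700, 1000],
    "income": [60, 45, 90, 70, 110],
}

matrix = correlation_matrix(columns)
ranking = rank_drivers(matrix, "net_worth")
for name, r in ranking:
    print(f"{name}: {r:.2f}")
```

Pearson correlation measures linear association only, so a ranking like this points to candidate drivers rather than establishing causation.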

4. Policy-Oriented Framing

Instead of stopping at descriptive statistics, we translated findings into:

  • Homeownership policy implications
  • Financial literacy recommendations
  • Debt management interventions
  • COVID-19 recovery insights


Challenges we ran into

1. Highly Skewed Distributions

Both income and net worth were heavily right-skewed. A small number of high-wealth households significantly inflated the mean.

We addressed this by focusing on medians, interquartile ranges, and distribution patterns rather than relying solely on averages.
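To make the effect concrete, here is a small sketch comparing the mean with the median and interquartile range on a right-skewed sample. The helper name `robust_summary` and the five net worth values are invented for illustration.

```python
from statistics import mean, median, quantiles


def robust_summary(values):
    """Summarize a sample with the mean alongside outlier-resistant statistics."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean(values),
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }


# Made-up values: four typical households and one very wealthy one.
net_worths = [50_000, 120_000, 200_000, 350_000, 9_000_000]

summary = robust_summary(net_worths)
for name, value in summary.items():
    print(f"{name}: {value:,.0f}")
```

In this toy sample the mean lands above the third quartile, so it describes none of the typical households, while the median and IQR stay within the bulk of the data.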

2. Income-Wealth Decoupling

We initially expected income to strongly predict net worth. However, the correlation was only moderate.

This forced deeper analysis into asset ownership and lifecycle effects, shifting the focus from earnings to long-term capital accumulation.

3. Negative Income Values

Some households reported income as low as –$2.7M, likely due to business losses.

Interpreting these without distorting overall conclusions required careful statistical framing.
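One possible way to frame such values, shown here only as an assumption about how it could be done, is to report how many households have negative income and to compare the median with and without them. The function name and all sample incomes except the –$2.7M figure are hypothetical.

```python
from statistics import median


def negative_income_report(incomes):
    """Report the share of negative incomes and medians with and without them."""
    non_negative = [x for x in incomes if x >= 0]
    return {
        "share_negative": (len(incomes) - len(non_negative)) / len(incomes),
        "median_all": median(incomes),
        "median_non_negative": median(non_negative),
    }


# Illustrative sample: one large business loss plus made-up typical incomes.
incomes = [-2_700_000, 28_000, 54_000, 71_000, 96_000]

report = negative_income_report(incomes)
print(report)
```

Keeping the negative values in the data and reporting both medians avoids silently discarding legitimate records while showing how much they move the headline figure.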

4. Avoiding Oversimplification

It would have been easy to conclude that “higher income equals higher wealth.” The data showed a more complex system involving education, housing access, and financial behavior.

Maintaining analytical nuance was essential.


Accomplishments that we're proud of

  • Cleaned and structured a large real-world dataset with full integrity validation
  • Identified homeownership as the strongest structural driver of wealth
  • Quantified the education premium in household net worth
  • Linked credit card payment behavior to broader financial health indicators
  • Translated data findings into actionable policy recommendations
  • Built a comprehensive, end-to-end financial analysis framework

Most importantly, we moved beyond surface-level averages to uncover structural wealth dynamics.


What we learned

  1. Wealth accumulation is lifecycle-driven and peaks at ages 55–64.
  2. Homeownership is the dominant wealth multiplier in Canada.
  3. Education compounds long-term financial advantage.
  4. Financial behavior (such as paying credit cards in full) correlates strongly with overall financial health.
  5. Wealth inequality is primarily asset-driven rather than income-driven.
  6. COVID-19 widened resilience gaps among households.

This project reinforced the importance of systems thinking when analyzing economic data.


What's next for Canadian Household Finance Analysis

Future extensions could include:

  • Longitudinal tracking of wealth accumulation over time
  • Intergenerational wealth transfer analysis
  • Interest rate sensitivity modeling for mortgage sustainability
  • Urban versus rural segmentation
  • Machine learning models predicting financial stress risk
  • Deeper provincial comparisons with regional housing market controls

The next step is to transition from descriptive analysis to predictive modeling and scenario simulation, enabling stronger insights for policy design and financial planning.

This project lays the foundation for a scalable, data-driven framework to better understand household financial health in Canada.
