Inspiration

Cardiovascular diseases are a global challenge, often diagnosed too late for effective intervention. I wanted to see whether machine learning could identify subtle patterns in routine health data, such as age, weight, and blood pressure, and act as an early warning system. The goal was a tool that could potentially save lives by flagging risk before it becomes critical.

What it does

The project is an AI-powered diagnostic assistant. It is built on clinical and lifestyle data from 70,000 anonymized patient records and predicts whether a patient is likely to have cardiovascular disease, with an accuracy of 71.60%.

How we built it

I used Python and Google Colab for the entire development process.

  1. Exploratory Data Analysis (EDA): I identified and removed physiological outliers (e.g., impossible blood pressure readings) to ensure data integrity.
  2. Modeling: I implemented a Random Forest Classifier, an ensemble learning method known for its robustness and ability to handle non-linear biological data.
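The modeling step can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the project's actual notebook: the data is synthetic, and the feature set, label rule, split ratio, and hyperparameters are all assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned patient table (NOT the real records).
rng = np.random.default_rng(42)
n = 2000
age_years = rng.uniform(30, 65, n)
weight_kg = rng.normal(75, 14, n)
ap_hi = rng.normal(128, 17, n)  # systolic blood pressure, mmHg
X = np.column_stack([age_years, weight_kg, ap_hi])

# Hypothetical noisy label rule, only so the example contains some signal.
risk = 0.04 * (age_years - 50) + 0.03 * (ap_hi - 128) + rng.normal(0, 1, n)
y = (risk > 0).astype(int)

# Hold out 20% of the rows for evaluation (the 80/20 split is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Assumed hyperparameters; the project's settings are not stated.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Hold-out accuracy on synthetic data: {accuracy:.2%}")
```

The printed figure describes the synthetic data only and is not comparable to the 71.60% reported for the real dataset.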

Challenges we ran into

The biggest challenge was "noisy" data. Some patient records contained erroneous blood pressure values (like 16,000 mmHg). Cleaning this data without losing valuable information required careful statistical filtering.
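One simple way to catch entries like this is a plausibility range check. The sketch below is an assumption about how such a filter could look, not the filtering actually used in the project: the bounds are hypothetical, and `ap_lo` is an assumed name for the diastolic column.

```python
# Hypothetical plausibility bounds in mmHg; illustrative assumptions,
# not the thresholds used in the project.
AP_HI_RANGE = (70, 250)  # systolic
AP_LO_RANGE = (40, 150)  # diastolic


def is_plausible(record):
    """Return True if both readings are in range and systolic > diastolic."""
    ap_hi, ap_lo = record["ap_hi"], record["ap_lo"]
    return (
        AP_HI_RANGE[0] <= ap_hi <= AP_HI_RANGE[1]
        and AP_LO_RANGE[0] <= ap_lo <= AP_LO_RANGE[1]
        and ap_hi > ap_lo
    )


# Made-up records for illustration.
records = [
    {"id": 1, "ap_hi": 120, "ap_lo": 80},
    {"id": 2, "ap_hi": 16000, "ap_lo": 90},  # erroneous systolic value
    {"id": 3, "ap_hi": 110, "ap_lo": 130},   # systolic below diastolic
    {"id": 4, "ap_hi": 140, "ap_lo": 90},
]

cleaned = [r for r in records if is_plausible(r)]
dropped = len(records) - len(cleaned)
print(f"Kept {len(cleaned)} of {len(records)} records, dropped {dropped}")
```

Fixed bounds are easy to audit, but they discard every row outside the range. Where the bounds sit decides how much valid data is lost along with the errors.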

Accomplishments that we're proud of

I am proud of achieving a solid 71.60% accuracy on a large-scale dataset. More importantly, I added an "Explainable AI" layer: visualizations that show which features drive the model's risk predictions.

Key Risk Drivers Identified:

  • Age: The most significant predictor.
  • Weight & Height: Crucial physiological indicators.
  • Systolic Blood Pressure (ap_hi): Leading clinical marker.
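A ranking like this could be produced from a fitted Random Forest's impurity-based feature importances. The sketch below assumes that approach, which the write-up does not confirm. It uses synthetic data, and apart from `ap_hi` the column names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# "ap_hi" comes from the write-up; the other names are assumed.
feature_names = ["age", "weight", "height", "ap_hi"]

# Synthetic stand-in data (NOT the real patient records).
rng = np.random.default_rng(0)
n = 1500
X = np.column_stack([
    rng.uniform(30, 65, n),   # age in years
    rng.normal(75, 14, n),    # weight in kg
    rng.normal(165, 8, n),    # height in cm
    rng.normal(128, 17, n),   # systolic blood pressure in mmHg
])
# Hypothetical noisy label rule so the forest has something to learn.
signal = 0.05 * (X[:, 0] - 47) + 0.03 * (X[:, 3] - 128)
y = (signal + rng.normal(0, 1, n) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ holds one normalized score per input column.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:>8}: {score:.3f}")
```

Impurity-based importances tend to favor features with many distinct values, so the ranking is a rough guide to what the model relies on and not a measure of clinical causation.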

What we learned

I learned the critical importance of data preprocessing in medical AI. I also deepened my understanding of how ensemble models like Random Forest can be used to solve complex, real-world health problems.

What's next for AI-Driven Early Cardiovascular Risk Detection

The next step is to integrate real-time data from wearable devices to move from static predictions to continuous health monitoring. I also aim to test the model on more diverse global datasets.
