
Project Update: Heart Attack Risk Prediction Using Machine Learning

Week 1: Project Kickoff & Data Exploration

What I accomplished this week:

Joined Discord & Reviewed Guidelines

  • Confirmed project requirements
  • Understood submission deadlines and judging criteria

Dataset Analysis

  • Explored the Heart Attack dataset thoroughly
  • Performed initial data profiling using df.info(), df.describe(), and df.isnull().sum()
  • Identified key features: Age, Cholesterol, Blood Pressure, Maximum Heart Rate, etc.
  • Verified target variable distribution (0 = No risk, 1 = At risk)
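The profiling calls named above can be sketched roughly as follows. This is only an illustration: the DataFrame is synthetic, and the column names (age, cholesterol, resting_bp, max_heart_rate, target) are assumptions standing in for the real Heart Attack dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset; all column names are assumptions.
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "age": rng.integers(29, 78, size=n),
    "cholesterol": rng.integers(120, 400, size=n),
    "resting_bp": rng.integers(90, 200, size=n),
    "max_heart_rate": rng.integers(70, 200, size=n),
    "target": rng.integers(0, 2, size=n),  # 0 = No risk, 1 = At risk
})

df.info()                                  # column dtypes and non-null counts
summary = df.describe()                    # count, mean, std, min, quartiles, max
missing_per_column = df.isnull().sum()     # missing values per column
target_share = df["target"].value_counts(normalize=True)  # class balance

print(summary)
print(missing_per_column)
print(target_share)
```

Checking the class balance with value_counts(normalize=True) is one simple way to verify the target distribution before choosing evaluation metrics.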

Environment Setup

  • Created Google Colab notebook
  • Imported essential libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
  • Established a reproducible workflow by fixing the random seed
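One common way to fix the random seed in a notebook looks something like this. The seed value 42 and the helper name set_seed are assumptions for illustration, not details from the project.

```python
import random

import numpy as np

SEED = 42  # hypothetical value; any fixed integer works


def set_seed(seed: int = SEED) -> None:
    """Seed Python's and NumPy's global random number generators."""
    random.seed(seed)
    np.random.seed(seed)


# Re-seeding before each draw reproduces the same numbers.
set_seed()
first_draw = np.random.rand(3)
set_seed()
second_draw = np.random.rand(3)
print(np.allclose(first_draw, second_draw))
```

Scikit-learn estimators and splitters take their own random_state argument, so the same seed would typically be passed there as well.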

Initial Findings

The dataset contains multiple clinical features with no major missing values. Early visualizations suggest that age and cholesterol level are strongly correlated with heart attack risk.

Next Week's Goals

  • Complete Exploratory Data Analysis (EDA) with visualizations
  • Build baseline Logistic Regression model
  • Start implementing Random Forest classifier

Lessons Learned So Far

Understanding medical data requires domain knowledge. I spent time researching what each clinical feature means to ensure proper interpretation.


Follow along for more updates as I build toward the March 29th deadline!

#Hack4Health #Byte2Beat #MachineLearning #HealthcareAI #HeartDiseasePrediction

Week 2 Update: Models Complete!

What I built this week:

Completed Exploratory Data Analysis

  • Created correlation heatmaps
  • Visualized feature distributions
  • Identified key patterns in the data
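A correlation matrix like the one behind a heatmap can be computed roughly as below. The data here is synthetic, with the target deliberately tied to age, and the column names are assumptions rather than the real dataset's.

```python
import numpy as np
import pandas as pd

# Synthetic data: older patients get a higher probability of target = 1.
rng = np.random.default_rng(42)
n = 300
age = rng.integers(29, 78, size=n)
cholesterol = rng.normal(240, 45, size=n)
risk_prob = 1 / (1 + np.exp(-(age - 54) / 8))
target = (rng.random(n) < risk_prob).astype(int)

df = pd.DataFrame({"age": age, "cholesterol": cholesterol, "target": target})

corr = df.corr()  # pairwise Pearson correlation between numeric columns
target_corr = corr["target"].drop("target").sort_values(ascending=False)
print(target_corr)
```

In a notebook, passing corr to seaborn's heatmap function (for example with annot=True) would render it as the heatmap itself.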

Implemented Two ML Models

  • Logistic Regression (~85% accuracy)
  • Random Forest Classifier (~87.5% accuracy)
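Training the two models side by side might look like the sketch below. It runs on synthetic data from make_classification, so the printed accuracies will not match the figures above, and the hyperparameters (n_estimators=200, max_iter=1000, an 80/20 split) are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

SEED = 42  # hypothetical seed

# Synthetic stand-in for the clinical features and binary target.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=SEED
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED
)

models = {
    # Scaling helps Logistic Regression converge; trees do not need it.
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=SEED),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {accuracies[name]:.3f}")
```

Stratifying the split keeps the class ratio the same in the train and test sets, which makes the two accuracy numbers easier to compare.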

Evaluation Metrics

  • Accuracy, Precision, Recall, F1-score
  • ROC Curve and AUC score
  • Confusion Matrix analysis
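The metrics listed above are all available in sklearn.metrics. Here is a small sketch on ten hand-made labels and scores (invented for illustration, not project results), with a 0.5 decision threshold assumed.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Toy labels and predicted probabilities, made up for this example.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_score = [0.1, 0.2, 0.6, 0.3, 0.9, 0.8, 0.4, 0.7, 0.95, 0.05]
y_pred = [1 if score >= 0.5 else 0 for score in y_score]

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)  # uses scores, not thresholded labels
cm = confusion_matrix(y_true, y_pred)  # rows = actual, columns = predicted

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f} auc={auc:.2f}")
print(cm)
```

Note that ROC-AUC is computed from the probabilities, so it does not depend on the chosen threshold, while the other four metrics do.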

Key Finding

Random Forest outperforms Logistic Regression, especially in recall (0.89), which is crucial for medical applications because a missed at-risk patient is the costliest error.

Next: Explainability & Risk Categorization

Week 3 Update: Making the Model Interpretable

Added this week:

Feature Importance Analysis

  • Identified top predictors: Age, Cholesterol, Max Heart Rate
  • Created visualization showing feature contributions
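Ranking predictors with a Random Forest's built-in importances could be sketched as follows. The data is synthetic and the feature names are placeholders, so the resulting order says nothing about the real dataset.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder names attached to synthetic columns (assumptions).
feature_names = [
    "age", "cholesterol", "max_heart_rate", "resting_bp", "fasting_blood_sugar",
]
X, y = make_classification(
    n_samples=400, n_features=5, n_informative=3, n_redundant=0, random_state=42
)

forest = RandomForestClassifier(n_estimators=200, random_state=42).fit(X, y)

# feature_importances_ holds impurity-based importances that sum to 1.
importances = pd.Series(
    forest.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances)
```

Calling importances.plot.barh() in a notebook would give a bar plot of the same ranking. Impurity-based importances can favor high-cardinality features, so permutation importance is a common cross-check.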

Risk Categorization

  • Added Low/Medium/High risk levels
  • Based on probability thresholds
  • Makes output more practical for healthcare use
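Mapping a predicted probability to a risk level can be as simple as the function below. The cutoffs 0.33 and 0.66 and the name categorize_risk are assumptions; the thresholds actually used in the project are not stated here.

```python
def categorize_risk(probability: float, low: float = 0.33, high: float = 0.66) -> str:
    """Map a predicted probability of class 1 to Low, Medium, or High.

    The default cutoffs are illustrative, not clinically validated.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < low:
        return "Low"
    if probability < high:
        return "Medium"
    return "High"


for p in (0.10, 0.50, 0.90):
    print(f"{p:.2f} -> {categorize_risk(p)}")
```

With a fitted scikit-learn classifier, the input would typically come from the second column of predict_proba, which holds the probability of the positive class.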

Error Analysis

  • Examined False Negatives and False Positives
  • Focused on minimizing FN (critical in medical diagnosis)
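One way to reduce false negatives is to lower the decision threshold and accept more false positives. The sketch below counts both error types at two thresholds on toy data; the labels, scores, and threshold values are invented for illustration and are not the project's approach or results.

```python
def count_errors(y_true, y_score, threshold):
    """Return (false_negatives, false_positives) at the given threshold."""
    fn = fp = 0
    for truth, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if truth == 1 and predicted == 0:
            fn += 1  # at-risk patient missed
        elif truth == 0 and predicted == 1:
            fp += 1  # healthy patient flagged
    return fn, fp


# Toy labels and predicted probabilities, made up for this example.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_score = [0.1, 0.2, 0.6, 0.3, 0.9, 0.8, 0.4, 0.7, 0.95, 0.05]

errors_default = count_errors(y_true, y_score, threshold=0.5)
errors_lowered = count_errors(y_true, y_score, threshold=0.25)
print("threshold 0.50 -> FN, FP =", errors_default)
print("threshold 0.25 -> FN, FP =", errors_lowered)
```

On this toy data, lowering the threshold removes the one false negative at the cost of one extra false positive, which is the trade-off to weigh in a screening setting.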

New Visualizations Added

  • Feature importance bar plot
  • Risk category distribution
  • Enhanced confusion matrix

Next: Documentation & Submission Prep

Final Week: Ready for Submission!

Completed this week:

Professional Report (PDF)

  • 4-page comprehensive documentation
  • Includes methodology, results, and insights
  • Professional cover page and formatting

GitHub Repository

  • Clean, well-structured repo
  • README with complete project documentation
  • requirements.txt for reproducibility

Final Notebook Polish

  • All cells run without errors
  • Markdown sections with clear explanations
  • All visualizations properly labeled

Final Model Performance

  • Random Forest Accuracy: 87.5%
  • Recall (Heart Attack Class): 0.89
  • ROC-AUC Score: Strong discriminatory power

Project Status: COMPLETE

Ready for March 29th submission deadline!

Thank you to the Hack4Health team and mentors for this amazing opportunity!

#Hack4Health #Byte2Beat #AIforHealth #MachineLearning #HeartDiseasePrediction
