# Cardio-Lens: Democratizing Heart Health – Project Story

## Inspiration
Cardiovascular disease (CVD) kills 17.9 million people every year – more than cancer, diabetes, and respiratory diseases combined. Yet early detection remains locked behind expensive clinical tests that most of the world can't access.
We asked ourselves one question:

> What if a smartwatch could be the first line of defence?
The idea was simple but ambitious – build a Two-Tier AI System that mirrors how healthcare actually works:
- A mass screening model using data that wearable devices already collect (blood pressure, weight, activity level)
- A clinical model that kicks in only when the screening flags a risk, using data from actual medical tests (ECG readings, chest pain type, exercise angina)
This tiered approach means that billions of people with a wearable device could get an initial risk assessment – and only those flagged would need the clinical follow-up. That's the democratization we were after.
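The gating logic can be sketched in a few lines. `predict_tier1` and `predict_tier2` mirror the backend function names, but the bodies below are illustrative stubs, not the real models:

```python
# Sketch of the two-tier gate. predict_tier1 / predict_tier2 mirror the
# backend.py function names; the bodies below are illustrative stubs only.
def predict_tier1(wearable):
    """Tier 1 risk from wearable-level features (stub)."""
    return 0.8 if wearable.get("ap_hi", 120) > 140 else 0.1

def predict_tier2(clinical):
    """Tier 2 risk from clinical features (stub)."""
    return 0.9 if clinical.get("st_slope") == "Flat" else 0.2

def assess(wearable, clinical=None):
    risk = predict_tier1(wearable)
    if risk < 0.5:                 # screening gate: most users stop here
        return {"tier": 1, "risk": risk, "action": "no follow-up needed"}
    if clinical is None:           # flagged, but no clinical data available yet
        return {"tier": 1, "risk": risk, "action": "clinical tests recommended"}
    return {"tier": 2, "risk": predict_tier2(clinical), "action": "doctor review"}
```

The key design point is that Tier 2 is only ever reached when Tier 1 flags a risk and clinical data exists, which is exactly the "gate" behaviour described above.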
## What We Learned

### The Math Behind the Models
Both tiers use a Random Forest Classifier, an ensemble of decision trees that votes on the final prediction. The predicted probability for class $c$ is:
$$ P(c \mid \mathbf{x}) = \frac{1}{T} \sum_{t=1}^{T} P_t(c \mid \mathbf{x}) $$
where $T$ is the number of trees (150 for Tier 1, 200 for Tier 2) and $P_t$ is the probability estimate from tree $t$.
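In scikit-learn, this averaging is exactly what `RandomForestClassifier.predict_proba` computes. A quick check on synthetic data (a stand-in for our datasets) confirms the forest's output equals the mean of the per-tree probabilities:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; Tier 1 hyperparameters for illustration
X, y = make_classification(n_samples=200, random_state=0)
clf = RandomForestClassifier(n_estimators=150, max_depth=12, random_state=0).fit(X, y)

# predict_proba is the mean of the individual trees' probability estimates
per_tree = np.stack([tree.predict_proba(X[:5]) for tree in clf.estimators_])
manual_avg = per_tree.mean(axis=0)  # shape (5, 2): 5 samples, 2 classes
```
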
### Feature Engineering Matters More Than Fancy Algorithms

One of our biggest "aha" moments was realizing that BMI – a derived feature – significantly boosted Tier 1 accuracy. We computed it as:
$$ \text{BMI} = \frac{\text{weight (kg)}}{\left(\frac{\text{height (cm)}}{100}\right)^2} $$
The raw dataset stored age in days, so we also had to convert:
$$ \text{age\_years} = \left\lfloor \frac{\text{age\_days}}{365.25} \right\rfloor $$
These small transformations taught us that domain knowledge + simple math often outperforms blindly throwing data at a neural network.
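In pandas, both transformations are one-liners. The column names below assume the cardio_base.csv schema (age in days, height in cm, weight in kg); the rows are made up for illustration:

```python
import pandas as pd

# Two illustrative rows; columns assume the cardio_base.csv schema
# (age in days, height in cm, weight in kg)
df = pd.DataFrame({"age": [18250, 21900], "height": [170, 160], "weight": [70.0, 82.0]})

df["age_years"] = (df["age"] // 365.25).astype(int)   # floor(age_days / 365.25)
df["bmi"] = df["weight"] / (df["height"] / 100) ** 2  # kg / m^2
```
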
### Data Cleaning is Half the Battle

The Tier 1 dataset (`cardio_base.csv`) contained 70,000 records, but many had physiologically impossible blood pressure values (e.g., systolic BP of 16000!). We applied clinically informed filters:
$$ 90 \leq \text{ap\_hi} \leq 200 \quad \text{and} \quad 50 \leq \text{ap\_lo} \leq 140 $$
After cleaning, we retained ~68,551 records – a roughly 2% loss that dramatically improved model reliability.
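Expressed with pandas, the filter is a single boolean mask (a minimal sketch; the real filters live in the preprocessing code):

```python
import pandas as pd

# Three toy rows standing in for cardio_base.csv
df = pd.DataFrame({"ap_hi": [120, 16000, 95], "ap_lo": [80, 90, 30]})

# Clinically informed bounds: 90 <= ap_hi <= 200 and 50 <= ap_lo <= 140
mask = df["ap_hi"].between(90, 200) & df["ap_lo"].between(50, 140)
clean = df[mask]  # drops the impossible systolic 16000 and the diastolic 30
```
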
### Explainability is Non-Negotiable in Healthcare AI

We learned that in medical AI, showing why a prediction was made is just as important as the prediction itself. Our Tier 2 model exposes its feature importance scores, letting users (and doctors) see exactly which clinical factors drove the result – making it an Explainable AI (XAI) system, not a black box.
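With scikit-learn the importance scores come for free after fitting. A sketch on a bundled dataset (standing in for our clinical data; hyperparameters match Tier 2):

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Bundled dataset used as a stand-in for heart_processed.csv
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
clf = RandomForestClassifier(n_estimators=200, max_depth=10, random_state=0).fit(X, y)

# Impurity-based importances: one score per feature, summing to 1,
# ready to feed into a bar chart for the explainability view
importances = pd.Series(clf.feature_importances_, index=X.columns).sort_values(ascending=False)
```
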
## How We Built It

### Architecture
```
┌────────────────────────────────────────────────────────────┐
│                     Streamlit Frontend                     │
│  ┌───────────┐  ┌────────┐  ┌────────┐  ┌─────────────┐    │
│  │ The Pitch │  │ Tier 1 │  │ Tier 2 │  │ Health Twin │    │
│  └───────────┘  └────────┘  └────────┘  └─────────────┘    │
├────────────────────────────────────────────────────────────┤
│                         backend.py                         │
│  ┌─────────────────────┐  ┌─────────────────────┐          │
│  │ load_and_preprocess │  │ predict_tier1()     │          │
│  │ _tier1() / _tier2() │  │ predict_tier2()     │          │
│  │ train_tier1_model   │  │ simulate_bp_reduc() │          │
│  │ train_tier2_model   │  │                     │          │
│  └─────────────────────┘  └─────────────────────┘          │
├────────────────────────────────────────────────────────────┤
│  dataset/cardio_base.csv    dataset/heart_processed.csv    │
│  (70k records)              (918 records)                  │
└────────────────────────────────────────────────────────────┘
```
### Tech Stack
| Layer | Technology | Why We Chose It |
|---|---|---|
| Frontend | Streamlit | Rapid prototyping with gorgeous custom CSS – no React/JS needed |
| ML Models | scikit-learn (Random Forest) | Interpretable, fast to train, works great on tabular data |
| Data | Pandas + NumPy | Industry-standard for data wrangling and numerical ops |
| Visualizations | Altair | Declarative, interactive charts that respond to user input |
### The Two-Tier Pipeline

#### Tier 1 – The "Watch" Model (Mass Screening)
- Trained on 70k+ records from a population cardiovascular dataset
- Uses 12 features: age, gender, height, weight, BMI, systolic/diastolic BP, cholesterol, glucose, smoking, alcohol, physical activity
- Hyperparameters: `n_estimators=150, max_depth=12, min_samples_leaf=10`
- Includes a BP Simulator – an interactive slider that shows in real time how reducing your systolic BP changes your risk, powered by running the model across a range:

$$ \text{risk}(bp) = P(\text{CVD} = 1 \mid \text{ap\_hi} = bp, \ldots) \quad \forall \, bp \in [\text{target}, \text{current}] $$
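The sweep amounts to one batched `predict_proba` call over copies of the user's profile with `ap_hi` varied. A sketch with a toy model standing in for the trained Tier 1 forest (feature set reduced to two illustrative columns):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy model in place of the trained Tier 1 forest; feature names illustrative
rng = np.random.default_rng(0)
X = pd.DataFrame({"ap_hi": rng.integers(90, 200, 500), "bmi": rng.uniform(18, 40, 500)})
y = (X["ap_hi"] > 140).astype(int)  # fake label: risk driven by systolic BP
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def bp_risk_curve(profile, target, current):
    """Vectorised sweep: one predict_proba call over the whole BP range."""
    bps = np.arange(target, current + 1)
    grid = pd.DataFrame([profile] * len(bps))
    grid["ap_hi"] = bps
    grid = grid[X.columns]  # keep the training column order
    return pd.Series(model.predict_proba(grid)[:, 1], index=bps)

curve = bp_risk_curve({"ap_hi": 160, "bmi": 29.0}, target=120, current=160)
```

Building the whole grid up front and predicting once is what keeps the slider responsive, versus calling the model once per BP value.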
#### Tier 2 – The Clinical Model (Doctor's Diagnosis)
- Trained on 918 clinical records with 15 one-hot encoded features
- Includes ECG readings, chest pain type, exercise-induced angina, ST slope
- Hyperparameters: `n_estimators=200, max_depth=10, min_samples_leaf=5`
- Outputs a feature importance chart for full explainability
## The Health Twin Simulator – Our Unique Feature
This is the feature we're most proud of – we haven't seen another heart disease app do it.
The simulator lets a user create a "Future Healthy Self" by setting health goals (lower BP, lose weight, quit smoking, exercise more). The AI then runs 22 parallel predictions – 11 years × 2 scenarios (current trajectory vs. healthy trajectory) – to build a side-by-side 10-Year Risk Trajectory.
The risk at each future year $y$ is modeled as:
$$ R_{\text{current}}(y) = P\!\left(\text{CVD} \mid \text{age} + y,\; \text{current features}\right) $$
$$ R_{\text{healthy}}(y) = P\!\left(\text{CVD} \mid \text{age} + y,\; \text{goal features}\right) $$
The "Years of Aging Reversed" metric converts the risk reduction into something intuitive – telling a user, for example, "Your heart will be 7 years younger after these changes."
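Concretely, each trajectory is just the model evaluated at incremented ages. A sketch with a toy model in place of the real Tier 1 forest (feature names and the fake label rule are illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy stand-in: risk rises with age and systolic BP in this fake data
rng = np.random.default_rng(1)
X = pd.DataFrame({"age_years": rng.integers(30, 80, 800),
                  "ap_hi": rng.integers(90, 200, 800)})
y = ((X["age_years"] + X["ap_hi"] / 3) > 100).astype(int)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def trajectory(profile, years=10):
    """Risk for now plus each of the next `years` years (years + 1 predictions)."""
    rows = [{**profile, "age_years": profile["age_years"] + dy}
            for dy in range(years + 1)]
    grid = pd.DataFrame(rows)[X.columns]
    return pd.Series(model.predict_proba(grid)[:, 1], index=range(years + 1))

current = trajectory({"age_years": 50, "ap_hi": 165})  # current habits
healthy = trajectory({"age_years": 50, "ap_hi": 125})  # goal: lower BP
# 2 scenarios x 11 yearly predictions = the 22 parallel predictions above
```
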
## Challenges We Faced

### 1. Noisy, Real-World Data
The Tier 1 dataset had extreme outliers – blood pressure values of 16,000 mmHg, negative ages, and heights of 250+ cm. We spent significant time designing robust filters that removed noise without accidentally discarding edge-case patients.

### 2. Class Imbalance

Both datasets had near-balanced classes (~50/50), which is unusual in medical datasets. While this simplified training, it made us question the data collection methodology and iterate carefully with `stratify=y` in our train-test splits to preserve the ratio.
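A stratified split preserves the class ratio in both partitions; a minimal sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Near-balanced synthetic data, mirroring our ~50/50 datasets
X, y = make_classification(n_samples=1000, weights=[0.5, 0.5], random_state=0)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
# stratify=y keeps the class ratio (almost) identical in train and test
```
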
### 3. The "Two Dataset" Problem

Our two tiers used completely different datasets with zero overlapping features. This meant we couldn't build a single unified model – we had to design a system where the two models work sequentially but independently: the screening model acts as a gate, and the clinical model provides the definitive diagnosis.
### 4. Making AI Output Feel Human

Raw probability scores like 0.7341 mean nothing to a non-technical user. We spent a lot of time on UX – converting probabilities to percentage-based risk scores, color-coded cards (🟢 Low / 🟡 Moderate / 🔴 High), and the Health Twin's "Years of Aging Reversed" metaphor.

### 5. Performance at Scale

The BP Simulator runs the model across potentially 100+ blood pressure values in real time as the user moves a slider. We used `@st.cache_resource` to avoid retraining models on every page load and vectorized predictions where possible to keep the experience snappy.
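`@st.cache_resource` memoizes the trained model objects across Streamlit reruns. The effect is ordinary memoization, sketched here with the standard library so it runs outside Streamlit (the real decorator also handles cache invalidation and cross-session sharing):

```python
from functools import lru_cache

train_calls = {"n": 0}

@lru_cache(maxsize=None)   # stands in for @st.cache_resource in this sketch
def get_model(tier):
    train_calls["n"] += 1  # the expensive train_tierN_model() call would go here
    return f"model-tier-{tier}"

get_model(1)
get_model(1)  # cache hit: no retraining on this "rerun"
get_model(2)
```
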
## Results
| Metric | Tier 1 (Screening) | Tier 2 (Clinical) |
|---|---|---|
| Dataset Size | ~68,551 records | 918 records |
| Features | 12 (wearable-level) | 15 (clinical-level) |
| Accuracy | ~72% | ~87% |
| Use Case | Mass population screening | Hospital-grade diagnosis |
While 72% accuracy for Tier 1 may seem modest, it's important to note that this model uses only wearable-level data – no blood tests, no ECGs, no clinical visits. For a tool that could run on a smartwatch and screen millions of people, a 72% early-warning signal is a powerful first step.

## What's Next
- Deep Learning Upgrade: Replace Random Forest with a gradient-boosted ensemble (XGBoost/LightGBM) for potential accuracy gains
- Real Wearable Integration: Connect with Apple HealthKit / Google Fit APIs for live data
- Federated Learning: Train across hospital datasets without sharing patient data
- Longitudinal Tracking: Let users log predictions over time and track their health journey
> "The best time to check your heart health was 10 years ago. The second best time is now."