Inspiration
Cardiovascular disease is the leading cause of death globally, but it rarely kills in isolation. When the heart fails, the brain immediately suffers from hypoxia (oxygen deprivation). Yet most AI diagnostic tools are "black boxes" that look only at the heart, ignoring the critical neurological fallout.
We wanted to build GlassHeart not just as a classifier, but as a Holistic Digital Twin. We wanted to prove that by combining Supervised Clinical Prediction (Heart) with Unsupervised Anomaly Detection (Brain), we could create a "Total Health Monitor" that tracks patient survival from the chest to the cortex.
What it does
GlassHeart is a Dual-System ICU Monitor that connects cardiac risk to neural health in real time.
- The "Nexus" (Cardiac Core): We input raw patient data (Age, BP, Cholesterol) and our CatBoost engine estimates the probability of heart disease (Accuracy: 81.6%).
- The "Sentinel" (Neural Core): Simultaneously, the system monitors a live feed of EEG Brain Waves (from our 4th dataset). Using unsupervised learning, it detects chaotic signal variance associated with seizures and hypoxia.
- The "Neuro-Cardiac Bridge": This is our innovation. The system links the two cores. If the Heart Model detects "Critical Failure," the Neural Model automatically simulates the corresponding neurological distress, visualizing the invisible link between cardiac arrest and brain damage.
- The "Glass-Box" View: We use SHAP (SHapley Additive exPlanations) to explain why a patient is at risk, quantifying how much each feature (e.g., "Pulse Pressure > 60 mmHg") contributed to the alarm.
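To make the "Glass-Box" idea concrete, here is a minimal sketch of the Shapley attribution that SHAP approximates, computed by brute force over feature subsets. It is not the GlassHeart code: `toy_risk`, the patient values, and the reference values are hypothetical, and the real project would call the `shap` library on the CatBoost model instead.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attributions of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's values; the rest stay at baseline.
        mixed = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical toy risk score (assumption for illustration only).
def toy_risk(features):
    age, pulse_pressure, cholesterol = features
    score = 0.01 * age + 0.002 * cholesterol
    if pulse_pressure > 60:
        score += 0.3
    return score

patient = [67, 72, 240]      # age, pulse pressure (mmHg), cholesterol
reference = [50, 40, 200]    # hypothetical "typical" patient
contributions = shapley_values(toy_risk, patient, reference)
```

The attributions always sum to the gap between the patient's score and the reference score, which is what lets each alarm be broken down feature by feature. Brute force is only feasible for a handful of features; SHAP's tree explainer exists to avoid this exponential loop.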
How we built it (The Journey)
This was a war against data noise. We iterated through Five Distinct Generations of AI architecture to solve the hackathon's hidden "Technical IQ Test":
Gen 1: The Baseline (Random Forest) We started simple with Scikit-Learn. It failed hard (63% accuracy) because the provided datasets had conflicting units (Age in days vs. years).
Gen 2: The "Council" (Voting Ensembles) We combined HistGradientBoosting, ExtraTrees, and Random Forest using Z-Score Normalization to force the datasets to speak the same math language. Result: 72%. Good, but not winning.
Gen 3: The "Cyborg" (Deep Learning + Trees) We stacked a Keras Neural Network with our Tree models using a Logistic Regression meta-learner. Result: 76%. We hit a wall due to "Noisy Labels" (sick patients labeled healthy).
Gen 4: The "Nexus" (CatBoost + Sniper Protocol) We wrote a custom "Confident Learning" algorithm (The Sniper). It scanned 70,000 rows, identifying and purging 40,000+ rows of bad data. We fed this purified data into CatBoost. Result: 81.6% Accuracy. We broke the ceiling.
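The idea behind the Sniper can be illustrated with a simplified confident-learning filter. This is a sketch under stated assumptions, not the actual Sniper code: it uses logistic regression as the scoring model, per-class thresholds equal to the mean self-confidence of each class, and synthetic data with deliberately flipped labels. The function name `flag_suspect_labels` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def flag_suspect_labels(X, y, cv=5):
    """Return a boolean mask of rows whose given label looks inconsistent."""
    model = LogisticRegression(max_iter=1000)
    # Out-of-fold probabilities: no row is scored by a model that saw its label.
    proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")
    classes = np.unique(y)
    # Per-class threshold: mean confidence among rows carrying that label.
    thresholds = np.array([proba[y == c, k].mean() for k, c in enumerate(classes)])
    confident = proba >= thresholds          # shape (n_rows, n_classes)
    given = np.searchsorted(classes, y)      # column index of each row's label
    rows = np.arange(len(y))
    given_is_confident = confident[rows, given]
    n_other_confident = confident.sum(axis=1) - given_is_confident.astype(int)
    # Suspect: the model is confident in another class but not in the given one.
    return (n_other_confident > 0) & ~given_is_confident

# Synthetic stand-in: two separable groups with 10% of labels flipped.
rng = np.random.default_rng(0)
n = 1000
true_y = rng.integers(0, 2, size=n)
X = rng.normal(0, 1, size=(n, 2)) + 3.0 * true_y[:, None]
noisy_y = true_y.copy()
flipped = rng.choice(n, size=100, replace=False)
noisy_y[flipped] = 1 - noisy_y[flipped]

suspect = flag_suspect_labels(X, noisy_y)
X_clean, y_clean = X[~suspect], noisy_y[~suspect]
```

The cleaned rows would then go to the final classifier (CatBoost in our case). Because rows are removed based on model disagreement, accuracy measured on the cleaned data is not directly comparable to accuracy on the original rows.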
Gen 5: The "Neural Extension" (Unsupervised Bridge) For the "Advanced Level," we integrated the 4th dataset (eeg_timeseries.csv). Since this data had no labels, we couldn't use classifiers. Instead, we built an Unsupervised Anomaly Detector that clusters patients based on signal volatility, successfully creating a "Seizure Sentinel" that runs parallel to the heart model.
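One way to sketch a volatility-based detector of this kind: cut the signal into windows, take the log-variance of each window, and let 2-means clustering separate calm windows from volatile ones. The window length, the use of KMeans, and the synthetic signal are assumptions for illustration; the layout of eeg_timeseries.csv is not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans

def volatility_features(signal, window):
    """Log-variance of each non-overlapping window of a 1-D signal."""
    signal = np.asarray(signal, dtype=float)
    n_windows = len(signal) // window
    chunks = signal[: n_windows * window].reshape(n_windows, window)
    return np.log(chunks.var(axis=1) + 1e-12)

def flag_volatile_windows(signal, window=256):
    """Cluster windows by volatility and flag the higher-variance cluster."""
    feats = volatility_features(signal, window).reshape(-1, 1)
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(feats)
    volatile_cluster = int(np.argmax(km.cluster_centers_.ravel()))
    return km.labels_ == volatile_cluster

# Synthetic stand-in: a calm signal with a three-window high-variance burst.
rng = np.random.default_rng(0)
calm = rng.normal(0, 1.0, size=256 * 20)
burst = rng.normal(0, 8.0, size=256 * 3)
eeg = np.concatenate([calm[: 256 * 10], burst, calm[256 * 10:]])
alerts = flag_volatile_windows(eeg)
```

Note that 2-means always returns two clusters, so a recording with no abnormal activity would still have some windows flagged. A production version would need a fixed or calibrated variance threshold on top of the clustering.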
Challenges we ran into
- The "Unsupervised Trap": The 4th dataset (EEG) was an unlabeled time series. Forcing it into a supervised model was a dead end, so we had to pivot to Unsupervised Anomaly Detection to unlock its value.
- The "Accuracy Wall": Getting from 76% to 81% felt impossible. Traditional models couldn't handle the noise in the largest dataset. We had to invent our own cleaning pipeline (The Sniper) to solve it.
- Conflicting Medical Units: One hospital measured Age in days, another in years. One measured BP as "120", another as "1.2". We wrote complex Pandas scripts to harmonize these without losing data.
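The unit harmonization described above can be sketched with simple range heuristics in Pandas. The column names (`age`, `systolic_bp`) and both cutoffs are assumptions made for this example, not the exact rules from our scripts.

```python
import pandas as pd

def harmonize_units(df):
    """Convert age to years and systolic BP to mmHg using range heuristics."""
    out = df.copy()
    out["age"] = out["age"].astype(float)
    out["systolic_bp"] = out["systolic_bp"].astype(float)
    # Assumption: nobody is older than 130 years, so larger values are in days.
    in_days = out["age"] > 130
    out.loc[in_days, "age"] = out.loc[in_days, "age"] / 365.25
    # Assumption: readings below 10 were stored at 1/100 scale (1.2 -> 120).
    scaled_down = out["systolic_bp"] < 10
    out.loc[scaled_down, "systolic_bp"] = out.loc[scaled_down, "systolic_bp"] * 100
    return out

raw = pd.DataFrame({"age": [54, 19723, 61], "systolic_bp": [120, 1.2, 1.35]})
clean = harmonize_units(raw)
```

Converting in place rather than dropping out-of-range rows is what keeps every patient in the dataset. The cutoffs should be checked against each source's documentation, since a heuristic like this would silently mis-convert a genuinely extreme reading.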
Accomplishments that we're proud of
- Cracking the Unsupervised Code: We successfully integrated an unlabeled time-series dataset (EEG) into a primarily supervised project, building a hybrid engine.
- From 63% to 81%: We didn't settle for mediocre results. We rebuilt our entire architecture five times until we got it right.
- The "Sniper" Algorithm: We are proud that we wrote our own semi-supervised learning loop to clean the data, rather than just manually deleting rows.
What's next for GlassHeart
- Real-time Sensor Integration: Replacing the CSV feed with live Bluetooth inputs from wearable EEG/ECG sensors.
- Clinical Validation: Testing the model on a completely new dataset (e.g., from an Asian hospital) to check for bias.
Built With
- catboost
- fastapi
- javascript
- keras
- machine-learning
- numpy
- pandas
- python
- react
- scikit-learn
- shap
- tensorflow
