Explainable Survival: From Prediction to Trust in Hex
Inspiration
The Titanic dataset is one of the best-known machine learning benchmarks. After extensive experimentation and a top-quartile Kaggle score, I realized that improving accuracy was not the hardest part of the problem: understanding and trusting the model’s decisions was.
In real-world settings, machine learning models are rarely judged on metrics alone. Stakeholders need to know why a prediction was made, where the model fails, and whether its outputs can be trusted. This project was inspired by that gap between Kaggle-style optimization and decision-ready machine learning.
What I Built
This project is an end-to-end, explainable machine learning application, built entirely in Hex.
Rather than focusing on leaderboard performance, it reframes Titanic survival prediction as a trust and interpretability problem, delivered through an interactive Hex application and an AI-powered exploration layer:
- Interactive prediction: Explore survival probabilities for hypothetical passengers through “what-if” analysis.
- Local explainability: Understand why a specific prediction was made by surfacing the strongest positive and negative drivers.
- Global model evaluation: Examine where the model performs well, and where it struggles, across demographic and socioeconomic groups.
- AI-driven exploration (Hex Threads): Ask natural-language questions about model behavior, error patterns, and subgroup performance, enabling self-serve analysis without writing code.
Together, these views turn a familiar dataset into a decision-support experience rather than a competition notebook.
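As a rough illustration of the “what-if” and subgroup views, the sketch below trains a small Random Forest on synthetic, Titanic-like data and then (1) compares a baseline passenger with an edited copy and (2) reports out-of-fold accuracy per group. The data, the column names (`pclass`, `is_female`, `age`, `fare`), the hyperparameters, and the `what_if` helper are all assumptions made for this example; they are not taken from the project.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in for the Titanic training data (NOT the real dataset).
rng = np.random.default_rng(0)
n = 600
passengers = pd.DataFrame({
    "pclass": rng.integers(1, 4, n),      # 1 = first class, 3 = third class
    "is_female": rng.integers(0, 2, n),
    "age": rng.uniform(1, 70, n).round(),
    "fare": rng.uniform(5, 120, n).round(2),
})
# Made-up survival rule, used only so the model has a signal to learn.
logit = (2.2 * passengers["is_female"]
         - 0.9 * (passengers["pclass"] - 2)
         - 0.02 * (passengers["age"] - 30))
survived = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = RandomForestClassifier(n_estimators=200, min_samples_leaf=5, random_state=0)
model.fit(passengers, survived)


def what_if(base: dict, **changes) -> pd.DataFrame:
    """Survival probability for a baseline passenger and an edited copy of it."""
    rows = pd.DataFrame(
        [base, {**base, **changes}],
        index=["baseline", "what_if"],
        columns=passengers.columns,       # keep the training column order
    )
    return rows.assign(p_survive=model.predict_proba(rows)[:, 1])


# Hypothetical passenger: third class vs. the same person in first class.
comparison = what_if(
    {"pclass": 3, "is_female": 0, "age": 30.0, "fare": 8.05},
    pclass=1, fare=80.0,
)
print(comparison)

# Subgroup view: out-of-fold predictions, so no row is scored by a model
# that saw it during training.
oof_pred = cross_val_predict(model, passengers, survived, cv=5)
subgroup_report = (
    passengers.assign(correct=(oof_pred == survived))
    .groupby(["pclass", "is_female"])["correct"]
    .agg(accuracy="mean", n="size")
)
print(subgroup_report)
```

In an interactive app, the arguments to `what_if` would typically come from input widgets, and the subgroup table would feed a chart rather than a `print` call.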
How It Was Built
Several common tabular models were evaluated, including logistic regression and gradient-boosted approaches. A Random Forest trained with cross-validation provided the best balance of performance, stability, and interpretability, achieving a public leaderboard score of approximately 0.78.
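A comparison of this kind could be set up roughly as follows. This is a minimal sketch on synthetic data: the feature set, hyperparameters, and fold count are assumptions, since the write-up does not specify them, and the scores it prints say nothing about the project's actual results.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, Titanic-like features (NOT the real dataset).
rng = np.random.default_rng(42)
n = 500
pclass = rng.integers(1, 4, n)
is_female = rng.integers(0, 2, n)
age = rng.uniform(1, 70, n)
fare = rng.uniform(5, 120, n)
X = np.column_stack([pclass, is_female, age, fare]).astype(float)
logit = 2.2 * is_female - 0.9 * (pclass - 2) - 0.02 * (age - 30)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Candidate models; hyperparameters are illustrative defaults.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "random_forest": RandomForestClassifier(
        n_estimators=200, min_samples_leaf=3, random_state=0
    ),
}

# Use the same stratified folds for every model so the scores are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X, y, cv=cv, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())
    print(f"{name:>20}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the fold-to-fold standard deviation alongside the mean is one simple way to judge the "stability" part of the trade-off, not just raw accuracy.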
To make the model’s reasoning transparent, I integrated SHAP (SHapley Additive exPlanations) to support both:
- Local explanations (for individual predictions)
- Global insights (feature importance and cohort-level error patterns)
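To make the local/global distinction concrete, the sketch below computes exact Shapley values by brute force for a four-feature model. It deliberately does not call the `shap` library; it is a small, self-contained illustration of the quantity SHAP estimates, feasible here only because the feature count is tiny. The synthetic data, feature names, and the `shapley_values` helper are assumptions for this example.

```python
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic, Titanic-like data (NOT the real dataset).
rng = np.random.default_rng(7)
n = 400
feature_names = ["pclass", "is_female", "age", "fare"]
X = np.column_stack([
    rng.integers(1, 4, n),
    rng.integers(0, 2, n),
    rng.uniform(1, 70, n),
    rng.uniform(5, 120, n),
]).astype(float)
logit = 2.2 * X[:, 1] - 0.9 * (X[:, 0] - 2) - 0.02 * (X[:, 2] - 30)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = RandomForestClassifier(n_estimators=50, min_samples_leaf=5, random_state=0)
model.fit(X, y)


def predict(data):
    """Predicted survival probability for each row."""
    return model.predict_proba(data)[:, 1]


def shapley_values(x, background):
    """Exact Shapley values for one row; 'absent' features take background values."""
    n_features = x.shape[0]
    cache = {}

    def value(subset):
        # Mean prediction when only the features in `subset` are fixed to x.
        key = frozenset(subset)
        if key not in cache:
            data = background.copy()
            cols = sorted(key)
            if cols:
                data[:, cols] = x[cols]
            cache[key] = predict(data).mean()
        return cache[key]

    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


background = X[:50]  # reference sample defining the "average" passenger

# Local explanation: signed contributions for one passenger, strongest first.
x = X[100]
phi = shapley_values(x, background)
for idx in np.argsort(-np.abs(phi)):
    print(f"{feature_names[idx]:>10}: {phi[idx]:+.3f}")

# Global insight: mean absolute contribution over a small sample of passengers.
sample = X[100:110]
global_importance = np.mean(
    [np.abs(shapley_values(row, background)) for row in sample], axis=0
)
print(dict(zip(feature_names, global_importance.round(3))))
```

The contributions for one passenger sum to that passenger's prediction minus the average prediction over the background sample, which is what lets the positive and negative drivers be read as an additive breakdown. Brute-force enumeration grows exponentially with the number of features, which is why tree-specific and sampling-based estimators are used in practice.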
Hex enabled notebooks, interactive apps, and AI-driven Threads to coexist in a single environment, making it possible to move from raw data to a polished, decision-support experience without fragmenting the workflow.
Challenges & Learnings
- Optimizing predictive performance was far easier than making model behavior interpretable and communicable.
- Explainability is most effective when paired with interactive, user-controlled exploration rather than static plots.
- Designing an interface that supports trust requires exposing both model strengths and failure modes.
- Hex excels at turning analysis into a decision-ready product, not just a technical artifact.
Why This Matters
The Titanic dataset is merely a vehicle. The real focus of this project is how modern analytics platforms like Hex can help data scientists build machine learning systems that people can actually understand and trust.
Built With
- hex
- numpy
- pandas
- python
- scikit-learn
- shap
- threads