Explainable Survival: From Prediction to Trust in Hex

Inspiration

The Titanic dataset is one of the best-known machine learning benchmarks. After extensive experimentation and a top-quartile Kaggle score, I realized that improving accuracy was not the hardest part of the problem; understanding and trusting the model’s decisions was.

In real-world settings, machine learning models are rarely judged on metrics alone. Stakeholders need to know why a prediction was made, where the model fails, and whether its outputs can be trusted. This project was inspired by that gap between Kaggle-style optimization and decision-ready machine learning.


What I Built

This project is an end-to-end, explainable machine learning application, built entirely in Hex.

Rather than focusing on leaderboard performance, it reframes Titanic survival prediction as a trust and interpretability problem, delivered through an interactive Hex application and an AI-powered exploration layer:

  • Interactive prediction
    Explore survival probabilities for hypothetical passengers through “what-if” analysis.

  • Local explainability
    Understand why a specific prediction was made by surfacing the strongest positive and negative drivers.

  • Global model evaluation
    Examine where the model performs well, and where it struggles, across demographic and socioeconomic groups.

  • AI-driven exploration (Hex Threads)
    Ask natural-language questions about model behavior, error patterns, and subgroup performance, enabling self-serve analysis without writing code.

Together, these views turn a familiar dataset into a decision-support experience rather than a competition notebook.
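
As a rough illustration of the global evaluation view, the sketch below breaks accuracy down by group. It is not the project's code: the function name accuracy_by_group, the field names (sex, pclass, actual, predicted), and the handful of rows are all made up for the example.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group: (accuracy, row_count)} for rows with 'actual' and 'predicted'."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for row in records:
        group = row[group_key]
        counts[group] += 1
        hits[group] += int(row["actual"] == row["predicted"])
    return {group: (hits[group] / counts[group], counts[group]) for group in counts}


# Made-up rows for illustration only; these are not results from the project.
records = [
    {"sex": "female", "pclass": 1, "actual": 1, "predicted": 1},
    {"sex": "female", "pclass": 3, "actual": 0, "predicted": 1},
    {"sex": "male", "pclass": 1, "actual": 1, "predicted": 1},
    {"sex": "male", "pclass": 3, "actual": 0, "predicted": 0},
    {"sex": "male", "pclass": 3, "actual": 0, "predicted": 0},
    {"sex": "female", "pclass": 1, "actual": 1, "predicted": 1},
]

by_sex = accuracy_by_group(records, "sex")
by_class = accuracy_by_group(records, "pclass")

for label, table in (("sex", by_sex), ("pclass", by_class)):
    for group, (accuracy, count) in sorted(table.items(), key=lambda kv: str(kv[0])):
        print(f"{label}={group}: accuracy={accuracy:.2f} over {count} rows")
```

Reporting the row count next to each accuracy matters because small groups produce noisy numbers, which is exactly where a reader is most likely to over-trust or under-trust the model.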


How It Was Built

Several common tabular models were evaluated, including logistic regression and gradient-boosted approaches. A Random Forest trained with cross-validation provided the best balance of performance, stability, and interpretability, achieving a public leaderboard score of approximately 0.78.
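
A comparison like this could look roughly like the following scikit-learn sketch. The data is a synthetic stand-in generated with make_classification (only the 891-row count mirrors the Kaggle training set), and the hyperparameters and the helper name compare_models are assumptions, so the printed scores say nothing about the actual project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the engineered Titanic features (assumption).
X, y = make_classification(
    n_samples=891, n_features=8, n_informative=5, random_state=0
)

# Candidate models; the hyperparameters here are illustrative, not tuned.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# The same stratified folds are reused for every model so scores are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def compare_models(models, X, y, cv):
    """Return {name: (mean_accuracy, std_accuracy)} across the CV folds."""
    results = {}
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
        results[name] = (scores.mean(), scores.std())
    return results


results = compare_models(candidates, X, y, cv)
for name, (mean, std) in results.items():
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

Looking at the standard deviation across folds alongside the mean is one way to judge the "stability" part of the trade-off, not just raw accuracy.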

To make the model’s reasoning transparent, I integrated SHAP (SHapley Additive exPlanations) to support both:

  • Local explanations (for individual predictions)
  • Global insights (feature importance and cohort-level error patterns)
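
To make the idea behind a local explanation concrete, here is a minimal brute-force sketch of exact Shapley values. It deliberately does not use the shap library (which relies on much faster algorithms for tree models); the scoring function toy_survival_score, its coefficients, and the feature names are hypothetical stand-ins for a trained model.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance, enumerating every feature coalition.

    Features outside a coalition are replaced by their baseline value.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return predict(x)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_survival_score(x):
    """Hypothetical scoring function standing in for the trained model."""
    return (
        0.2
        + 0.4 * x["is_female"]
        + 0.2 * x["first_class"]
        - 0.1 * x["large_family"]
        + 0.1 * x["is_female"] * x["first_class"]  # interaction term
    )


passenger = {"is_female": 1, "first_class": 1, "large_family": 1}
baseline = {"is_female": 0, "first_class": 0, "large_family": 0}

phi = shapley_values(toy_survival_score, passenger, baseline)

# Strongest positive drivers first, strongest negative drivers last.
for name, contribution in sorted(phi.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>13}: {contribution:+.2f}")
```

The contributions sum to the difference between the passenger's score and the baseline score, which is the additivity property that makes these values readable as "drivers" of a single prediction. Averaging the absolute contributions over many passengers gives one common form of global feature importance.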

Hex let notebooks, interactive apps, and AI-driven Threads coexist in a single environment, so the project could move from raw data to a polished decision-support experience without fragmenting the workflow.


Challenges & Learnings

  • Optimizing predictive performance was far easier than making model behavior interpretable and communicable.
  • Explainability is most effective when paired with interactive, user-controlled exploration rather than static plots.
  • Designing an interface that supports trust requires exposing both model strengths and failure modes.
  • Hex excels at turning analysis into a decision-ready product, not just a technical artifact.

Why This Matters

The Titanic dataset is merely a vehicle. The real focus of this project is how modern analytics platforms like Hex can help data scientists build machine learning systems that people can actually understand and trust.
