HABI-Scope

Habi-scope: an interpretable, uncertainty-aware habitability engine

Ranks exoplanets by true surface habitability: transparent sub-scores + confidence, then Monte Carlo → probability. A telescope-ready shortlist, not just “inside the HZ”.

Why I built it (inspiration)

Zero Gravity made me drop rigid checklists. Classic Habitable Zone cuts use average flux, but oceans need more: seasons, a surface, and an atmosphere. I wanted a tool that tells the whole story and is honest about uncertainty, useful to scientists deciding where to point a telescope.

What I built (overview)

Habi-scope turns the NASA Exoplanet Archive CSV into an interpretable score (HABI) with probability and confidence.

7 sub-scores (0 to 1): Energy (flux and seasons), Surface (rockiness), Atmosphere (escape and retention), Stellar (host Teff window), Orbital (sanity and tides), System (multiplicity), Feasibility (can we observe it).
HABI: weighted average of available factors, with hard vetoes for obvious no-gos and a confidence term that down-weights missing data.
Probability: Monte Carlo (MC) propagates measurement errors and nuisance physics to estimate \( p(\text{viable surface conditions}) \). A calibrated fallback maps HABI to \( p \) when MC is impossible.
Outputs: ranked tables, stacked contributions, radar for top candidates, ablation (what matters most), and a small model card.

How I built it (methods)

Data cleaning and dictionary
Parsed the raw CSV (including # COLUMN lines) into a clean table and an auto data dictionary.
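
A minimal sketch of this parsing step, using only the standard library. It assumes the header lines follow the pattern `# COLUMN name: description`; the function name `load_archive_csv` and the toy rows are illustrative, not taken from the notebook.

```python
import csv
import re

# Assumed header layout: "# COLUMN pl_name:        Planet Name"
COLUMN_RE = re.compile(r"^#\s*COLUMN\s+(\w+):\s*(.*)$")

def load_archive_csv(text):
    """Split raw archive text into (rows, data_dictionary)."""
    dictionary, data_lines = {}, []
    for line in text.splitlines():
        if line.startswith("#"):
            m = COLUMN_RE.match(line)
            if m:  # other '#' lines are plain comments and are dropped
                dictionary[m.group(1)] = m.group(2).strip()
        elif line.strip():
            data_lines.append(line)
    # Empty cells become None so missing values are explicit downstream
    rows = [{k: (v if v != "" else None) for k, v in r.items()}
            for r in csv.DictReader(data_lines)]
    return rows, dictionary

# Toy input with made-up planets
sample = """# Example export
# COLUMN pl_name:        Planet Name
# COLUMN pl_rade:        Planet Radius [Earth Radius]
#
pl_name,pl_rade
Toy-1 b,1.2
Toy-2 c,
"""
rows, dictionary = load_archive_csv(sample)
```

The `rows` list can be passed to `pandas.DataFrame(rows)` for the later steps, and `dictionary` doubles as the auto data dictionary.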

Derived physics

Stellar luminosity: \( L_\star \propto R_\star^{2}\,(T_\star/T_\odot)^{4} \)
Insolation (Earth = 1): \( S = \dfrac{L_\star}{a^{2}} \)
Seasonal extremes from eccentricity:
\( S_{\min}=\dfrac{L_\star}{(a(1+e))^2} \), \( S_{\max}=\dfrac{L_\star}{(a(1-e))^2} \)
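
The three relations above can be sketched as follows. This assumes solar units for \( R_\star \) and \( L_\star \), AU for \( a \), and a solar effective temperature of 5780 K; the function names are illustrative.

```python
def luminosity(r_star, t_star, t_sun=5780.0):
    """Stellar luminosity in solar units from radius (solar radii) and Teff (K)."""
    return r_star ** 2 * (t_star / t_sun) ** 4

def insolation_range(r_star, t_star, a_au, ecc):
    """Return (S_min, S, S_max) in Earth units for an orbit with eccentricity ecc."""
    lum = luminosity(r_star, t_star)
    s = lum / a_au ** 2                      # flux at the semi-major axis
    s_min = lum / (a_au * (1 + ecc)) ** 2    # apastron, weakest flux
    s_max = lum / (a_au * (1 - ecc)) ** 2    # periastron, strongest flux
    return s_min, s, s_max

# A Sun-like star with a planet at 1 AU and e = 0.2
s_min, s, s_max = insolation_range(1.0, 5780.0, 1.0, 0.2)
```

Even at the same mean flux, an eccentricity of 0.2 swings the insolation from about 0.69 to about 1.56 times Earth's, which is why the seasonal extremes are scored separately from \( S \).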

Teff-dependent HZ edges (Kopparapu)
\( S_{\rm eff}(T_\star)=S_\odot + aT' + bT'^2 + cT'^3 + dT'^4 \), with \( T'=\dfrac{T_\star-5780}{1000} \)
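
A hedged sketch of evaluating this polynomial and testing whether a planet's flux falls between two edges. The coefficients are passed in by the caller. The numbers in the example are placeholders, not the published Kopparapu values, which should be copied from the paper's table. If that table defines the temperature offset in kelvin without the 1/1000 scaling, the coefficients need rescaling by the matching powers of 1000.

```python
def s_eff(t_star, s_sun, coeffs):
    """S_eff = s_sun + a*T' + b*T'^2 + c*T'^3 + d*T'^4, with T' = (t_star - 5780) / 1000."""
    tp = (t_star - 5780.0) / 1000.0
    return s_sun + sum(c * tp ** (i + 1) for i, c in enumerate(coeffs))

def in_hz(s, t_star, inner, outer):
    """inner and outer are (s_sun, coeffs) pairs for the two HZ edges."""
    return s_eff(t_star, *outer) <= s <= s_eff(t_star, *inner)

# PLACEHOLDER edge parameters for illustration only (not published values)
inner_edge = (1.1, (0.1, 0.0, 0.0, 0.0))
outer_edge = (0.35, (0.05, 0.0, 0.0, 0.0))

print(in_hz(1.0, 5780.0, inner_edge, outer_edge))
print(in_hz(2.0, 5780.0, inner_edge, outer_edge))
```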

Atmosphere retention proxy

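Equilibrium temperature: \( T_{\rm eq}\approx 278\,\text{K}\,S^{1/4}\!\left(\dfrac{1-A}{\varepsilon}\right)^{1/4} \) (the 278 K prefactor already includes the factor of 4 from spreading the absorbed flux over the whole sphere)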
Jeans-like escape check via \( v_{\rm esc}(M_p,R_p) \) vs. \( T_{\rm eq} \) to score whether air is retainable.
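
One way to sketch this proxy is to compute \( T_{\rm eq} \) from the Stefan-Boltzmann balance, which gives about 278 K for \( S = 1 \), \( A = 0 \), \( \varepsilon = 1 \), and compare the escape velocity with the thermal speed of a gas molecule. The solar constant of 1361 W/m², the default albedo, and the choice of N2 (mean molecular mass 28) are assumptions of this sketch. How the ratio maps onto the Atmosphere sub-score is not shown here.

```python
import math

SIGMA = 5.670374419e-8    # Stefan-Boltzmann constant, W m^-2 K^-4
S_EARTH = 1361.0          # flux at 1 AU from the Sun, W m^-2 (assumed value)
K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg
V_ESC_EARTH = 11186.0     # Earth's escape velocity, m/s

def t_eq(s, albedo=0.3, emissivity=1.0):
    """Equilibrium temperature (K) for insolation s in Earth units."""
    return (s * S_EARTH * (1 - albedo) / (4 * SIGMA * emissivity)) ** 0.25

def v_esc(m_p, r_p):
    """Escape velocity (m/s) for mass and radius in Earth units."""
    return V_ESC_EARTH * math.sqrt(m_p / r_p)

def jeans_parameter(m_p, r_p, temp, mu=28.0):
    """(v_esc / v_thermal)^2 with v_thermal = sqrt(2 k T / m); larger means easier to retain."""
    v_th = math.sqrt(2 * K_B * temp / (mu * AMU))
    return (v_esc(m_p, r_p) / v_th) ** 2

# Earth-like case: N2 is far more tightly bound than atomic hydrogen
lam_n2 = jeans_parameter(1.0, 1.0, t_eq(1.0))
lam_h = jeans_parameter(1.0, 1.0, t_eq(1.0), mu=1.0)
```

A true Jeans calculation uses the exobase temperature rather than \( T_{\rm eq} \), so the result is best read as a relative ranking signal, not an escape rate.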

Scoring and probability

HABI is a weighted mean of the seven sub-scores that exist for a planet. Confidence is the fraction of total weight supported by data. Vetoes guard against non-physical regimes.
MC perturbs \( T_\star, R_\star, a, e, R_p \) within their measurement errors and samples priors for albedo and greenhouse warming to estimate \( p \). If essential values (usually \( R_p \) for RV planets) are missing, the pipeline switches to a conservative, calibrated fallback.
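
A toy version of the Monte Carlo step, to show the shape of the calculation, not the project's actual priors or viability test. Everything numeric here is an assumption of the sketch: Gaussian measurement errors, a flat albedo prior on [0.1, 0.5], a flat greenhouse warming prior of 0 to 60 K, a 1.6 Earth-radius rocky cutoff, and a 273 to 373 K liquid-water window checked at both orbital extremes.

```python
import random

T_EQ_EARTH = 278.3       # K, equilibrium temperature for S = 1, A = 0, unit emissivity
ROCKY_MAX_RADIUS = 1.6   # Earth radii, assumed rocky cutoff

def mc_probability(mean, sigma, n=4000, seed=0):
    """Fraction of draws that are rocky and stay in the viable window all orbit long."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        d = {k: rng.gauss(mean[k], sigma[k]) for k in mean}
        ecc = min(max(d["ecc"], 0.0), 0.95)      # keep eccentricity physical
        if min(d["t_star"], d["r_star"], d["a_au"], d["r_p"]) <= 0:
            continue                              # non-physical draw counts as a miss
        albedo = rng.uniform(0.1, 0.5)            # assumed prior
        greenhouse = rng.uniform(0.0, 60.0)       # assumed warming in K
        lum = d["r_star"] ** 2 * (d["t_star"] / 5780.0) ** 4
        temps = []
        for sign in (+1, -1):                     # apastron, then periastron
            s = lum / (d["a_au"] * (1 + sign * ecc)) ** 2
            temps.append(T_EQ_EARTH * (s * (1 - albedo)) ** 0.25 + greenhouse)
        if d["r_p"] <= ROCKY_MAX_RADIUS and min(temps) >= 273.0 and max(temps) <= 373.0:
            hits += 1
    return hits / n

sigma = dict(t_star=50.0, r_star=0.02, a_au=0.01, ecc=0.005, r_p=0.05)
earth_like = dict(t_star=5780.0, r_star=1.0, a_au=1.0, ecc=0.017, r_p=1.0)
hot = dict(earth_like, a_au=0.05)
p_earth = mc_probability(earth_like, sigma)
p_hot = mc_probability(hot, dict(sigma, a_au=0.001))
```

Fixing the seed keeps the estimate reproducible between runs, and `n` is the knob to cap when memory or time is tight.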

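# confidence-penalized HABI (illustrative snippet)
import pandas as pd

# sub-score weights (sum to 1)
W = dict(energy=.25, surface=.20, atmosphere=.20, stellar=.15, orbital=.10, system=.05, feasibility=.05)

def HABI(row):
    """Weighted mean over the sub-scores that exist for this planet."""
    num = den = 0.0
    for k, w in W.items():
        v = row.get(f"sub_{k}")
        if v is not None and not pd.isna(v):
            num += w * v
            den += w
    return (num / den) if den > 0 else float("nan")

# df: the cleaned planet table with one sub_<name> column per factor
H = df.apply(HABI, axis=1)

# confidence = fraction of total weight backed by data;
# the weights are keyed by column name so that dot() aligns with df's columns
cols = [f"sub_{k}" for k in W]
w_by_col = pd.Series({f"sub_{k}": w for k, w in W.items()})
conf = df[cols].notna().astype(float).dot(w_by_col) / sum(W.values())

# soft penalty: a planet with no supporting data keeps at most half its score
df["HABI_penalized_soft"] = H * (0.5 + 0.5 * conf)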

What I learned (insights)

HZ is not the same as habitable. Accounting for seasonal extremes \( (S_{\min}, S_{\max}) \) changes borderline rankings.
Atmosphere and energy dominate. Ablation shows these factors drive most score movement. Seasonal stability often flips near-threshold cases.
Interpretability helps. Sub-scores make it easy to justify or challenge a planet’s rank.
Uncertainty matters. MC vs. fallback and confidence penalties prevent over-claiming when key physics are missing.
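
The ablation idea can be sketched as a leave-one-factor-out loop: recompute HABI without each sub-score and record how far the scores move on average. The weights are the ones from the snippet in the methods section; the helper names and the toy planets are illustrative, and the notebook's actual ablation metric may differ.

```python
W = dict(energy=.25, surface=.20, atmosphere=.20, stellar=.15,
         orbital=.10, system=.05, feasibility=.05)

def habi(scores, weights):
    """Weighted mean over the sub-scores that are present (not None)."""
    present = [k for k in weights if scores.get(k) is not None]
    den = sum(weights[k] for k in present)
    if den == 0:
        return float("nan")
    return sum(weights[k] * scores[k] for k in present) / den

def ablation(planets, weights):
    """Mean absolute HABI change when each factor is removed, largest first."""
    base = [habi(p, weights) for p in planets]
    impact = {}
    for k in weights:
        reduced = {j: w for j, w in weights.items() if j != k}
        shifted = [habi(p, reduced) for p in planets]
        impact[k] = sum(abs(b - s) for b, s in zip(base, shifted)) / len(planets)
    return dict(sorted(impact.items(), key=lambda kv: -kv[1]))

# Toy planet: perfect Energy score, zero everywhere else
toy = [dict(energy=1.0, surface=0.0, atmosphere=0.0, stellar=0.0,
            orbital=0.0, system=0.0, feasibility=0.0)]
result = ablation(toy, W)
```

A factor only ranks high if it carries weight and also deviates from the planet's other sub-scores, so a uniformly scored planet shows zero impact for every factor.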

Results (what the model surfaces)

A shortlist of likely-rocky, MC-backed candidates with high HABI and high \( p \).
A shortlist of HZ systems needing radius. These have strong energy and stellar context but missing \( R_p \). They are prime for follow-up or habitable moon searches.
Clear reasons for each planet via stacked contributions. For example: wins on Energy and Atmosphere, tradeoff on Orbital.

Challenges and how I solved them

Sparse RV planets with no radius. MC cannot run, so I mark insufficient_data, reduce rank via confidence, and use the conservative fallback \( p \).
Memory limits. Replaced merge-heavy steps with idempotent in-place arrays, down-cast types, and capped MC samples. The notebook stays fast on low-RAM hardware.
Scope control. Dropped the optional stellar activity penalties to keep the results reproducible under time pressure. Noted as future work.
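
A minimal sketch of the down-casting and sample-capping ideas from the memory fix, assuming a pandas table. The helper names and the cap value are hypothetical. float32 keeps roughly seven significant digits, so this fits scores and fluxes but may not fit high-precision ephemerides.

```python
import numpy as np
import pandas as pd

def downcast_numeric(df):
    """Shrink float64/int64 columns of df to the smallest dtype that holds their values."""
    for col in df.select_dtypes(include=["float64"]).columns:
        df[col] = pd.to_numeric(df[col], downcast="float")
    for col in df.select_dtypes(include=["int64"]).columns:
        df[col] = pd.to_numeric(df[col], downcast="integer")
    return df

MAX_MC_SAMPLES = 2000  # hypothetical cap, tune to the available RAM

def capped_samples(requested):
    """Never draw more MC samples than the cap allows."""
    return min(requested, MAX_MC_SAMPLES)

# Toy table: one float column with a gap, one small-integer column
toy = pd.DataFrame({"flux": np.array([1.5, 2.5, np.nan]),
                    "n_planets": np.array([1, 2, 3], dtype="int64")})
before = toy.memory_usage(deep=True).sum()
toy = downcast_numeric(toy)
after = toy.memory_usage(deep=True).sum()
```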

Why this is valuable (judge lens)

Impact: A rigorous, uncertainty-aware telescope triage. Spend precious photons on the right worlds.
Innovation: Adds seasons and atmosphere retention to HZ logic, plus an interpretable score and a probability.
Completeness: Guardrails, confidence penalties, MC uncertainty, clean exports and visuals.
Communication: Every ranking is explainable. Two lists guide immediate follow-up vs. data-collection needs.

Limitations and next steps

No stellar activity penalties in this edition. Add GALEX UV and TESS flares to refine stellar environment.
Incorporate new masses and radii as they publish to strengthen Atmosphere and Surface scoring.
Explore moon habitability proxies around HZ giants.

TL;DR: Habi-scope goes beyond “in the HZ”. It combines physics, interpretability, and uncertainty to produce a ranked, defensible, probability-aware shortlist: a practical tool for choosing the next worlds to study.

Built With

python

Updates

Naman Saboo started this project — Sep 21, 2025 10:54 AM EDT
