Subboom

GIF
Substack Galaxy
Statistically significant differences, on a quick overview
GIF
Engagement increases after work, zigzags in the morning as people have different schedules
GIF
Longer Content gets more engagement, until it's too long
GIF
Finding connected Topics
Substack Recommender Leaderboard
Numbers decrease engagement, people read _and_ comment on weekends
Details matter, images + questions do worse than either alone

Inspiration for subboom!

I love data. I love Substack. Subboom started as a playful Substack “galaxy” exploration, a galaxy-like map I built because I was curious about how newsletters connect. I did my thesis at the Social Networks lab at ETH Zurich, so networks and graphs are often on my mind.

I decided to go bigger when the hackathon got extended: I would transform that galaxy into a full analytics dashboard, helping readers discover new writers through recommendation networks and helping writers learn what drives engagement.

One-liner: Subboom turns the Substack universe into a fun, interactive research exploration for writers and readers—powered by analytics, networks, and Kiro-accelerated development.

What subboom! does

Subboom is a comprehensive analytics and visualization suite for the Substack ecosystem.

Home / Leaderboard (/) – Ranked publications with filtering (category, free/paid) and responsive tables.
Engagement Analysis (/engagement) – Feature impact, time patterns (hourly/weekend effects), content-length correlations, and top-post exemplars.
Topic Explorer (/topic-explorer) – Sort/search topics; scatter plots, bar charts, sparklines, treemap, engagement rates per topic and more.
Topics Overview (/topics) – topic galaxy (force graph), with search, node sizing, and network toggles.
Categories (/cats) – Category stats (mean/median/min/max/std dev), recommendation in/out counts.
Categories Graph (/graph/cats) – category network graph, to be extended with more data.
Substack Galaxy (/graph/recommendation-map) – High-performance 2D map with fuzzy search, 10k+ newsletters, smooth focus animations, and recommendation edges. The original idea. The seed, also why I chose the name, subboom, because the animation feels like a big bang of publications coming to life.

Who it helps

Writers: pick collaborators that could connect to bigger audiences via their recommendation network, learn when to post, what topics are trending/competitive, and which title features hurt/help engagement and what type of content works in which topic, also, details matter!
Readers: traverse “mini-clusters” of connected newsletters; discover hidden gems by network proximity.
Everyone: it’s just genuinely fun to fly around the data galaxy and explore data.

How I built it (with Kiro)

Almost exclusively built in Kiro.

Stack: SvelteKit + Svelte 5, Tailwind v4, Skeleton UI, D3, Three.js (3D). Deployed on Cloudflare Pages/Workers (client-rendered, heavy caching).
Kiro-first workflow
- Specs + Steering: A large /.kiro directory with specs for the TopicGalaxy, Leaderboard, and Accessibility (color-blind-safe palettes, ARIA patterns). I keep a custom rules.md steering file that tells Kiro what to watch for (e.g., “don’t catch-and-hide errors in if blocks”), which dramatically improved code quality.
- Hooks (game-changer early on): After any .svelte edit, a hook reviewed Skeleton dev color pairings, built-in directives, Svelte 5 runes, and Tailwind usage, then auto-fixing inconsistencies where Claude usually gets stuck, was a huge timesaver.
- Testing discipline: Kiro reliably generated and ran UI/UX tests and data-handling tests (null safety, edge cases). It repeatedly surfaced ARIA gaps and fixed them in context.
- Vibe → Spec → Code: I started with heavier spec usage, then mixed in faster “vibe requests” once the architecture was stable.
Impact: I estimate ~100 hours saved vs. building this solo from scratch; the whole project came together in ~30–50 hours thanks to Kiro.

Challenges I ran into

Data scale vs. edge runtime: CloudFlare CPU limits made SSR expensive, so I disabled SSR, pushed rendering client-side, and leaned on aggressive caching.
Graph size & meaning: The recommendation network was originally ~30k edges; after cleaning and trimming non-active/low-signal nodes, I settled at ~12k nodes / 17k edges for both data quality and frame-rate.
Perf & usability: Level-of-detail (LOD) rendering, debounced search/filters, virtualized lists, and canvas-based drawing keep interactions smooth.

Accomplishments I am proud of

Scale: ~50k posts, several thousand newsletters, hundreds of thousands of comments analyzed.
UX: Keyboard navigation, ARIA labeling, and accessible color choices were accounted for. Many via Kiro’s auto-fix and check UX tasks.
Discoveries (from my dataset):
- Mini-clusters: some newsletters recommend niche sets that nobody else does.
- Feature insights: images and reader questions correlate with higher engagement; technical/data posts draw the most comments.
- Numbers and exclamation marks in the titles correlate with lower engagement (numbers ≈–50% in my models).

What I learned

Spec-driven dev with Kiro: writing good specs and a strong steering file compounds; hooks catch stylistic/library issues instantly; letting tests drive fixes leads to a consistently accessible UI.
Big-data perf on the web: How to balance pre-computation, LOD, and client rendering for fluid graph exploration.

What’s next

Engagement prediction API: I’ve trained predictors/inference models; I’ll expose endpoints so writers can estimate engagement before publishing.
Richer topic exemplars: More sample posts per topic; deeper cross-topic trends.
Insights for action: Collaboration suggestions (“who to partner with next?”), cohort growth curves, posting-time recommendations, and alerts.

Methods & formulas (selected)

Let $RR$ = reactions, $CC$ = comments, $SS$ = restacks, and $N_\text{subs}$ = subscribers.

Engagement (per 1k subscribers)

$$ER_{1k}=\frac{R+\;w_c\,C+\;w_s\,S}{N_\text{subs}}\times 1000,\quad \text{with } w_c, w_s > 1 \text{ to reflect higher effort.}$$

Network basics

Degree $k_i=\sum_j A_{ij}$; PageRank $PR$ with damping $d\in(0,1)$. I use degree/PageRank for salience, plus edge-weighted LOD for rendering.

Temporal patterns

I test hourly/day-of-week effects and title-feature impacts via feature importance (tree models) and statistical tests (normalized by $N_\text{subs}$).

Data notes

Data were cleaned/deduped, and non-active nodes pruned for clarity/performance. I report correlations/associations from this dataset—not universal causal claims.

Built With

cloudflare
html5
svelte
sveltekit
tailwind
typescript

Updates

Oliver Holl started this project — Sep 14, 2025 02:51 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.