Inspiration
Hiring is shifting fast. Many companies already use automated screens, structured assessments, and increasingly AI-mediated stages—resume parsing, ranking, chat screens, and interview-style evaluations. That change is not just a product story; it is a systems story: who advances, who drops out before they even apply, and how small policy changes compound across thousands of candidates.
We were motivated by a simple question: if “the next generation of interviews” is partly algorithmic, what outcomes does that produce before it ever touches a real applicant? We did not want to argue from vibes. We wanted a repeatable sandbox—grounded in real resume text where possible—where we could run scenarios, log every decision, and inspect fairness and funnel effects with the same seriousness as an experiment, not a demo.
What it does
We set out to build a hybrid simulation: many synthetic students, a constrained job market, company-specific funnels (screening, assessments, interviews, offers), and selective use of LLMs where judgment matters—not at every tick, because cost, latency, and interpretability matter.
Alongside the sim, we wanted visibility: a web surface to start runs, stream live events, and dig into traces so “why did this candidate fail stage k?” is answerable from data, not folklore.
How we built it
Ground truth from resumes: We built an ingestion path from raw PDFs/resumes into structured records, producing a JSONL corpus plus per-resume artifacts, with optional upload paths for larger experiments. That gives the simulation something to anchor to (skills, history, and text) instead of purely random agents.
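To make that step concrete, here is a minimal sketch of turning a folder of resume PDFs into a JSONL corpus. It assumes pypdf for text extraction; the field names and the toy skill lexicon are illustrative, not our exact schema.

```python
# Minimal ingestion sketch: one JSON record per resume, written as JSONL.
# Assumes pypdf for PDF text extraction; schema and skill list are illustrative.
import json
import re
from pathlib import Path

from pypdf import PdfReader

KNOWN_SKILLS = {"python", "typescript", "react", "sql"}  # toy skill lexicon

def resume_to_record(pdf_path: Path) -> dict:
    """Extract raw text from one resume PDF and pull out a few coarse fields."""
    text = "\n".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
    tokens = set(re.findall(r"[a-zA-Z+#]+", text.lower()))
    return {
        "resume_id": pdf_path.stem,
        "raw_text": text,
        "skills": sorted(tokens & KNOWN_SKILLS),
    }

def build_corpus(pdf_dir: Path, out_path: Path) -> None:
    """Write one JSON object per resume into a JSONL corpus."""
    with out_path.open("w") as f:
        for pdf in sorted(pdf_dir.glob("*.pdf")):
            f.write(json.dumps(resume_to_record(pdf)) + "\n")
```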
Personas and behavior: We generate richer persona profiles from parsed resume content so agents have interview-relevant traits (communication style, knowledge calibration, assessment tendencies, and so on).
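A rough sketch of that persona layer, assuming a simple dataclass and a heuristic stand-in for the richer LLM-assisted trait derivation; the trait names and scoring are illustrative.

```python
# Persona sketch derived from a structured resume record; traits and the
# heuristic below are stand-ins for the fuller profiling step.
from dataclasses import dataclass

@dataclass
class Persona:
    resume_id: str
    skills: list[str]
    communication_style: str = "concise"   # e.g. "concise", "verbose", "hedging"
    knowledge_calibration: float = 0.5     # 0 = wildly overconfident, 1 = well calibrated
    assessment_tendency: float = 0.5       # propensity to attempt vs skip assessments

def persona_from_record(record: dict) -> Persona:
    """Derive interview-relevant traits from a structured resume record."""
    skills = record.get("skills", [])
    return Persona(
        resume_id=record["resume_id"],
        skills=skills,
        # toy heuristic: broader skill coverage nudges calibration upward
        knowledge_calibration=min(1.0, 0.4 + 0.1 * len(skills)),
    )
```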
Simulation core: The engine centers on explicit entities, hiring funnel stages, and orchestration (including parallel paths where we needed throughput). Policies decide apply/skip, progression, and outcomes; the important design choice was to keep the loop inspectable, so every transition is representable as an event.
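A minimal sketch of that design: each stage transition is appended to a JSONL log so any run can be replayed later. Stage names and the event schema here are illustrative, not our exact ones.

```python
# "Every transition is an event": append each funnel decision to a JSONL log.
import json
import time
from enum import Enum

class Stage(Enum):
    APPLIED = "applied"
    SCREEN = "screen"
    ASSESSMENT = "assessment"
    INTERVIEW = "interview"
    OFFER = "offer"
    REJECTED = "rejected"

def log_event(log_path: str, candidate_id: str, company_id: str,
              stage: Stage, outcome: str, reason: str) -> None:
    """Append one transition event so any funnel outcome can be replayed later."""
    event = {
        "ts": time.time(),
        "candidate_id": candidate_id,
        "company_id": company_id,
        "stage": stage.value,
        "outcome": outcome,   # e.g. "advance" or "drop"
        "reason": reason,     # policy- or model-supplied explanation
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(event) + "\n")
```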
LLM layer: Where the problem is genuinely linguistic or judgment-like, we route to models; where it is mechanical, we keep it deterministic. That mirrors how real orgs will deploy AI in hiring: not everywhere, but at high-leverage nodes.
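Roughly, the split looks like this. The names are illustrative, and the scorer is a pluggable stand-in for whichever model client is configured rather than a specific vendor API.

```python
# Routing sketch: mechanical checks stay deterministic; only judgment-like
# stages pay for a model call, via a pluggable scorer.
from typing import Callable

def passes_screen(persona, job: dict) -> bool:
    """Deterministic node: a simple skill-overlap threshold, no model call."""
    required = set(job["required_skills"])
    return len(required & set(persona.skills)) >= max(1, len(required) // 2)

def interview_outcome(transcript: str, rubric: str,
                      score_fn: Callable[[str, str], float]) -> str:
    """Judgment-like node: defer to the model only where language actually matters."""
    return "advance" if score_fn(transcript, rubric) >= 0.7 else "drop"
```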
Web UI and observability: We shipped a Vite + React front end with a dev server integration that can stream simulation events into the browser. The UI is as much the product as the sim: without replay and cohort views, the metrics are not actionable.
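Server-side, the stream can be as simple as replaying the JSONL event log as server-sent events for the React UI to consume. The sketch below assumes a FastAPI backend and a fixed log path, which may differ from our actual dev-server integration.

```python
# SSE sketch: replay a run's JSONL event log to the browser.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
EVENT_LOG = "events.jsonl"  # hypothetical path to the run's JSONL event log

def replay_events(path: str):
    """Yield each logged event as an SSE frame the front end can consume."""
    with open(path) as f:
        for line in f:
            yield f"data: {line.strip()}\n\n"

@app.get("/runs/{run_id}/events")
def stream_events(run_id: str):
    # A live run would follow the file or an in-memory queue; this replays
    # whatever has been written so far.
    return StreamingResponse(replay_events(EVENT_LOG), media_type="text/event-stream")
```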
Challenges we ran into
- Cost and throughput: LLM-backed stages scale with event volume; we had to design throttling, batching/parallelism, and "cheap vs strong" call sites so experiments did not become a billing incident.
- Persona design: personas must be expressive enough to matter, but not so free-form that runs become irreproducible theater.
- Observability at scale: high-volume JSONL event logs are powerful and painful; querying, summarizing, and surfacing the right slice in the UI without drowning the user took iteration (a small aggregation sketch follows this list).
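As an example of the kind of slicing involved, a small aggregation pass is enough to show where candidates fall out of each funnel. Field names follow the event sketch above and are illustrative.

```python
# Collapse a large JSONL event log into per-stage outcome counts for the UI.
import json
from collections import Counter

def funnel_summary(log_path: str) -> dict[str, Counter]:
    """Aggregate advance/drop counts per stage from the JSONL event log."""
    summary: dict[str, Counter] = {}
    with open(log_path) as f:
        for line in f:
            event = json.loads(line)
            summary.setdefault(event["stage"], Counter())[event["outcome"]] += 1
    return summary
```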
Accomplishments that we're proud of
We built a pipeline that ingests resumes into structured records and turns them into richer persona profiles for simulated candidates. We shipped a multi-stage hiring simulation with detailed event logging and a web UI that runs scenarios and streams traces so we can inspect funnel outcomes end to end.
What we learned
Routing models only where needed is how you keep runs repeatable, affordable, and debuggable.
What's next for Hirelock
The natural extensions are richer labor-market dynamics (network effects, rumors, discouragement), stronger analytics, and tighter coupling between resume upload, persona building, and competition within a shared candidate pool, so that people can reason about their own distribution of outcomes.
Built With
- grit
- hard work
- perseverance
- python
- talent
- typescript
