Inspiration

Most job-matching tools optimize for placement speed, not for the person being placed. For a trafficking survivor rebuilding their life, that is the wrong target. A match that ignores a criminal record from exploitation, a missing driver's license, or a need to avoid isolated or male-dominated workplaces is not a good match. It is a liability. The Polaris National Survivor Study found that criminal records from coerced activity are one of the largest employment barriers for this population. Few job-matching tools model that barrier at all. Even fewer point to a real legal remedy, even though several states let survivors petition to vacate convictions tied to their trafficking. Brief 5 asked for decision support for caseworkers, not surveillance and not automated decisions for individuals. That gap and that framing are what we built toward.

What it does

A caseworker enters a survivor's profile. This includes free-text skills, work authorization, documentation held versus required, criminal record history, wage floor, and six trauma-informed preferences such as tolerance for isolated workplaces, night shifts, and uniformed roles. The system anonymizes the profile right away. The matching engine and the LLM never see a name. The profile then moves through five stages. Skills are mapped to O*NET categories using embedding similarity. Occupations that violate a hard constraint are filtered out, each tagged with the exact rule that excluded it. The occupations that remain are ranked on ten weighted fit criteria with a fuzzy TOPSIS model. Claude explains the top candidates in plain language, with resume-safe framing and risk flags. A parallel analysis looks at everything that was filtered out to find which single real-world intervention, such as a DMV pathway, a vacatur filing under a named statute, or a wage floor conversation, would unlock the most additional options. The caseworker sees four views: ranked Candidates, an honest Excluded list, ranked Interventions, and a History tab that tracks outcomes for candidates they already acted on. The system never contacts the survivor and never makes a placement decision. It ranks, explains, and shows its work.
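The hard-constraint stage described above can be sketched as a list of named rules applied in order. This is a minimal illustration, not the project's rule engine: the `Profile` and `Occupation` fields, the three rule names, and the wage figures are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Profile:
    # Hypothetical subset of an anonymized profile (no name, no identifiers).
    has_drivers_license: bool
    wage_floor: float  # minimum acceptable hourly wage
    tolerates_night_shifts: bool


@dataclass(frozen=True)
class Occupation:
    # Hypothetical subset of an occupation row.
    title: str
    requires_drivers_license: bool
    median_hourly_wage: float
    night_shifts: bool


# A rule is (name, predicate). The predicate returns True on a violation.
Rule = Tuple[str, Callable[[Profile, Occupation], bool]]

RULES: List[Rule] = [
    ("license_required",
     lambda p, o: o.requires_drivers_license and not p.has_drivers_license),
    ("below_wage_floor",
     lambda p, o: o.median_hourly_wage < p.wage_floor),
    ("night_shift_conflict",
     lambda p, o: o.night_shifts and not p.tolerates_night_shifts),
]


def first_violation(profile: Profile, occ: Occupation) -> Optional[str]:
    """Return the name of the first rule the occupation violates, if any."""
    for name, violates in RULES:
        if violates(profile, occ):
            return name
    return None


def filter_occupations(
    profile: Profile, occupations: List[Occupation]
) -> Tuple[List[Occupation], Dict[str, str]]:
    """Split occupations into those kept and those excluded, with a reason each."""
    kept: List[Occupation] = []
    excluded: Dict[str, str] = {}
    for occ in occupations:
        reason = first_violation(profile, occ)
        if reason is None:
            kept.append(occ)
        else:
            excluded[occ.title] = reason
    return kept, excluded
```

Because every exclusion is stored with the rule name that triggered it, an Excluded view can show a concrete reason per occupation instead of a bare omission.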

How we built it

We built a five-layer pipeline. Each layer is a pure function over a shared, frozen Pydantic contract. A skill mapping layer uses sentence-transformers with the BAAI/bge-small-en-v1.5 model, run locally so it stays deterministic and free of per-call API cost. A deterministic rule engine handles hard constraints. A fuzzy TOPSIS model scores the occupations that pass that filter. An LLM reasoning layer on Claude Sonnet 4.6, run at temperature 0 with JSON-schema-enforced output, explains results instead of deciding them. A sensitivity analysis layer simulates relaxing each constraint to measure real intervention impact. Three workstreams built against that shared contract at the same time. One owned data and encrypted persistence, using AES-256-GCM field-level encryption and HMAC-SHA-256 keyed identifiers. One owned the pipeline internals. One owned the Streamlit UI. This only worked because the contract stayed fixed while all three were building. We used Claude as an AI coding assistant throughout the build, with a human reviewing and directing every change before it landed. We disclose that here as required by the submission rules.
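To make the scoring layer concrete, here is a minimal sketch of classic (crisp) TOPSIS. It is a simplification swapped in for illustration: the project describes a fuzzy TOPSIS variant over ten criteria, while this version uses plain numbers and three made-up criteria. The function name, weights, and sample values are assumptions.

```python
import math
from typing import List, Sequence


def topsis(
    matrix: Sequence[Sequence[float]],
    weights: Sequence[float],
    benefit: Sequence[bool],
) -> List[float]:
    """Return a closeness score in [0, 1] per row; higher is a better fit.

    matrix[i][j] is the value of criterion j for alternative i.
    benefit[j] is True when larger values are better, False for cost criteria.
    """
    n = len(weights)
    # Vector-normalize each column, guarding against an all-zero column.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)
    ]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix
    ]
    columns = list(zip(*weighted))
    # Ideal point takes the best value per criterion, anti-ideal the worst.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti)
        total = d_best + d_worst
        # If all alternatives are identical, neither distance is informative.
        scores.append(d_worst / total if total else 0.5)
    return scores


# Illustrative criteria: hourly wage (benefit), skill fit (benefit),
# isolation rating (cost). Values are invented for the example.
example_scores = topsis(
    matrix=[[25.0, 0.9, 0.2], [15.0, 0.4, 0.8], [20.0, 0.6, 0.5]],
    weights=[0.4, 0.4, 0.2],
    benefit=[True, True, False],
)
```

An alternative that is best on every criterion coincides with the ideal point and scores 1, and one that is worst on every criterion scores 0. A fuzzy variant would replace each crisp value with a fuzzy number, commonly a triangular one, and use a fuzzy distance measure in place of `math.dist`.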

Challenges we ran into

Our most useful bug came from our own testing. We gave the system a list of construction and trade skills: carpentry, drywall, plumbing, and electrical work. The top result it returned was Public Relations Manager. We traced this to missing reference data. About ten percent of our occupation rows, concentrated in the highest-paid specialist and management titles, have no skills list and no work context ratings at all. Every scoring function we had written treated that missing data as neutral or favorable instead of as a penalty. A title the data said nothing about was beating titles with real, honestly scored data. We fixed this by adding a named exclusion rule for insufficient data, so missing information is never read as a quiet recommendation. We also found that our intervention feature was overstating its numbers. The underlying rule engine is first-match-wins, so it records only one reason per excluded occupation. Fixing a license requirement would not actually unlock a job that also failed the wage floor, but our first version did not check for that.
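The missing-data failure is easy to reproduce in a few lines. The sketch below is illustrative only: `OccupationRow`, its fields, the rating scale, and the reason strings are assumptions, not the project's schema.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OccupationRow:
    # Hypothetical reference row; None means the dataset has no value.
    title: str
    skills: Optional[Sequence[str]]
    isolation_rating: Optional[float]  # assumed scale: 0 (never) to 1 (always)


def naive_isolation_penalty(row: OccupationRow) -> float:
    """The failure mode: a missing rating silently becomes 'no risk'."""
    return row.isolation_rating or 0.0


def insufficient_data_reason(row: OccupationRow) -> Optional[str]:
    """Named exclusion: return a reason when the row cannot be scored honestly."""
    if not row.skills:
        return "insufficient_data: no skills list"
    if row.isolation_rating is None:
        return "insufficient_data: no work context ratings"
    return None


# A row with no data versus a row with real, imperfect data.
empty_row = OccupationRow("Public Relations Manager", None, None)
scored_row = OccupationRow("Carpenter", ["carpentry", "drywall"], 0.3)
```

With the naive penalty, the empty row looks safer than the carpenter row simply because nothing is known about it. Running the insufficient-data check before any scoring removes that row from the ranking and records why.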

Accomplishments that we're proud of

Every excluded occupation carries a specific, named reason. This includes insufficient data, which we added the same day we found the gap instead of leaving it undocumented. The Interventions panel now re-simulates each constraint relaxation instead of trusting the first recorded reason, so a caseworker can trust the number behind each suggestion. The legal pathway feature is grounded in real statute citations from Florida, New York, and California, instead of generic advice to consult a lawyer. Its language is deliberately careful. It names a pathway and a referral. It never names a filing strategy or predicts an outcome. The whole system also holds a firm line on language. No string in the UI says "recommend" or "best match", because the caseworker is the decision maker and the copy needs to reflect that, not just the architecture behind it.
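The difference between trusting the first recorded reason and re-simulating can be shown with a small sketch. The rule names, the dictionary-based occupation rows, and the 18.0 wage floor are assumptions made for the example, not the project's data or API.

```python
from typing import Callable, Dict, List, Optional

Occupation = Dict[str, object]
Rule = Callable[[Occupation], bool]  # returns True on a violation


def first_violation(occ: Occupation, rules: Dict[str, Rule]) -> Optional[str]:
    """First-match-wins: only the first violated rule is reported."""
    for name, violates in rules.items():
        if violates(occ):
            return name
    return None


def naive_unlock_counts(
    occupations: List[Occupation], rules: Dict[str, Rule]
) -> Dict[str, int]:
    """Overstated: credits each exclusion to its first recorded reason."""
    counts = {name: 0 for name in rules}
    for occ in occupations:
        reason = first_violation(occ, rules)
        if reason is not None:
            counts[reason] += 1
    return counts


def resimulated_unlock_counts(
    occupations: List[Occupation], rules: Dict[str, Rule]
) -> Dict[str, int]:
    """Relax one rule at a time and count occupations that then pass every rule."""
    counts = {}
    for relaxed in rules:
        remaining = {n: r for n, r in rules.items() if n != relaxed}
        counts[relaxed] = sum(
            1
            for occ in occupations
            if first_violation(occ, rules) is not None
            and first_violation(occ, remaining) is None
        )
    return counts


# Hypothetical rules for one profile: no license, wage floor of 18.0.
rules: Dict[str, Rule] = {
    "license_required": lambda o: bool(o["needs_license"]),
    "below_wage_floor": lambda o: float(o["wage"]) < 18.0,
}
occupations: List[Occupation] = [
    {"title": "Delivery Driver", "needs_license": True, "wage": 16.0},
    {"title": "Courier", "needs_license": True, "wage": 21.0},
    {"title": "Cashier", "needs_license": False, "wage": 15.0},
    {"title": "Carpenter", "needs_license": False, "wage": 25.0},
]
```

In this example the naive count credits the license intervention with two occupations, but Delivery Driver also fails the wage floor. Re-simulation credits it with one, which is the number a caseworker can actually act on.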

What we learned

Missing data in a real reference dataset does not fail loudly. It fails by quietly defaulting to "fine", and a system that does not notice this will end up rewarding the things it knows the least about. A rule engine built for one purpose, like hard filtering, can give a subtly wrong answer when reused for a second purpose, like a what-if analysis, unless you explicitly recompute that second answer instead of reading it off the first one. In a domain this sensitive, deciding what the system will not do carries as much weight as deciding what it will do. No behavioral tracking. No hard exclusion based on past history. No legal advice. No placement decision.

What's next for Constraint-Aware Survivor Career Pathway Planner

Geographic exclusion zones are collected during intake but are not yet enforced, since our occupation data has no location field to check against. Adding real location data at the posting level is the natural next step. We also want scheduled checks for drift in the BLS and O*NET reference data, vacatur statute coverage beyond the three states we have verified, and weight tuning driven by the outcome history we are now collecting instead of by hand-set defaults. The most important next step is putting this in front of a real survivor support organization for a pilot and listening closely to what their caseworkers tell us we got wrong.

Built With
