PathMapper

The interface
The pre-friend stage, Cora checking in until the messages are clear enough.
Test scenario
Research base
Our research base
PIN input every time the user refreshes or enters again.
Account PIN to ensure user safety.
Account theme
Changing PINs

Inspiration

When someone faces a major life decision, they are simultaneously overwhelmed by information, uncertain about their own values, and emotionally too activated to think clearly. They usually have enough information; what they lack is a structured way to reason through it without judgment.

We realized that existing decision support tools fail because they assume the user is calm and ready to decide. Pros and cons lists force complex, emotionally loaded trade-offs into flat binary comparisons. Standard chatbots hallucinate confidence and fail to detect logical contradictions. The result is cognitive overload and a tendency to make major choices under acute stress, which reliably produces worse outcomes. These chatbots also reason like a machine: give them a prompt and they return a huge output that doesn't really help; ask for a simpler approach and they oversimplify the matter. We were inspired to build a system that acts like an emotionally aware friend, one that thinks with you rather than deciding for you.

What It Does

PathMapper is an AI-powered life decision simulator that helps users unpack complex career and life dilemmas through a structured, conversational reasoning pipeline.

Input & Emotional Sensing: The user provides a free-text description of their dilemma. While they type, an Emotional Sensing Layer monitors hesitation and write/delete cycles. If elevated stress is detected, it gently suggests pausing and reassures the user that they can write freely, in whatever way makes them comfortable, without blocking or interrupting the conversation.
The Pre-Friend Gate: Before any analysis begins, Cora (the Coordinator) checks whether the input has enough substance to reason about — does it name real options, a stated factor, a timeframe, an emotional signal, personal stakes? If the input is too thin, Cora asks a few light follow-up questions rather than letting the pipeline run on a one-line input and silently produce hallucinated checkpoints.
Cognitive Checkpoints: Once the input passes Cora's gate, the system detects patterns like contradictions, repetition, bundling, hedging, or omission. It deploys one of five specialized AI Friends — Felix, Paige, Carter, Connie, and Blair — each with a distinct personality suited to the emotional weight of their checkpoint, to ask targeted questions and resolve logical friction before any analysis begins.
Narrative & Scoring: The system generates two contrasting, first-person future narratives grounded in the resolved premises. These paths are scored deterministically across five dimensions — Financial Trajectory, Growth Rate, Values Alignment, Social Capital, and Stability — producing a side-by-side comparison and a "lean" with an explicit "flip condition" (the specific fact that would make the lean wrong).
Human Handback: The session concludes by explicitly returning agency to the user, ensuring they remain in control of the final decision.
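
The Emotional Sensing step above could be approximated on the client with a small evaluator like the one below. This is a minimal sketch, not PathMapper's actual implementation: the names `KeyEvent`, `evaluateTyping`, and `ASSUMED_THRESHOLDS`, the event shape, and every threshold value are assumptions made for illustration.

```typescript
// Hypothetical sketch: all names and threshold values are illustrative
// assumptions, not taken from the PathMapper codebase.

type KeyEvent = { kind: "insert" | "delete"; timestampMs: number };

interface Thresholds {
  pauseMs: number;   // gap between keystrokes that counts as hesitation
  maxPauses: number; // hesitations tolerated before flagging stress
  maxCycles: number; // write/delete cycles tolerated before flagging stress
}

interface StressSignal {
  longPauses: number;
  writeDeleteCycles: number;
  elevated: boolean;
}

// Assumed defaults; real values would need calibration.
const ASSUMED_THRESHOLDS: Thresholds = { pauseMs: 5000, maxPauses: 3, maxCycles: 4 };

function evaluateTyping(
  events: KeyEvent[],
  thresholds: Thresholds = ASSUMED_THRESHOLDS,
): StressSignal {
  let longPauses = 0;
  let writeDeleteCycles = 0;
  for (let i = 1; i < events.length; i++) {
    const prev = events[i - 1];
    const curr = events[i];
    // Hesitation: a long gap between two consecutive keystrokes.
    if (curr.timestampMs - prev.timestampMs > thresholds.pauseMs) longPauses++;
    // One cycle is counted each time typing turns into deleting.
    if (prev.kind === "insert" && curr.kind === "delete") writeDeleteCycles++;
  }
  return {
    longPauses,
    writeDeleteCycles,
    elevated:
      longPauses >= thresholds.maxPauses ||
      writeDeleteCycles >= thresholds.maxCycles,
  };
}
```

Because the function only reads an in-memory event list, a check of this kind can run entirely in the browser, and the UI can show a gentle prompt when `elevated` is true without pausing the conversation.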

How We Built It

We engineered PathMapper in 7 days using a high-velocity, dual-model architecture built on Next.js 14, React, and TypeScript.

The AI Reasoning Pipeline: We use Groq (llama-3.3-70b-versatile) as our primary Actor for low-latency conversational reasoning, while Gemini (gemini-2.0-flash) functions as our Critic and safeguard validation engine. If Groq hits a rate limit, the pipeline automatically falls back to Gemini — transparent to the user.
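
A rate-limit fallback of this kind can be sketched as a thin wrapper around two model calls. The sketch below is illustrative only: `LlmCall`, `isRateLimit`, and `withFallback` are hypothetical names, and it assumes the primary client surfaces a rate limit as an error carrying HTTP status 429, which may not match how the real SDKs report it.

```typescript
// Hypothetical sketch: the function names and the error shape are assumptions.

type LlmCall = (prompt: string) => Promise<string>;

// Assumption: a rate limit shows up as an error object with status 429.
function isRateLimit(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: number }).status === 429
  );
}

async function withFallback(
  primary: LlmCall,
  fallback: LlmCall,
  prompt: string,
): Promise<string> {
  try {
    return await primary(prompt);
  } catch (err) {
    // Only rate limits trigger the fallback; every other error propagates.
    if (isRateLimit(err)) return fallback(prompt);
    throw err;
  }
}
```

Restricting the fallback to rate limits keeps genuine failures (bad credentials, malformed requests) visible instead of masking them behind the second model.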

Actor-Critic Loop: To ensure our AI personas never sound clinical or robotic, Groq drafts a response and Gemini evaluates it against strict tone and logic constraints. If Gemini flags an issue, Groq revises and resubmits. This is capped at two iterations, after which Groq's latest draft ships regardless, preventing the system from stalling in an infinite disagreement loop.
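
The capped loop can be sketched as follows. This is one possible reading, not the project's code: `Actor`, `Critic`, `Verdict`, and `reviseWithCritic` are hypothetical names, and the sketch assumes "two iterations" means two critique-and-revise rounds, after which the latest draft is returned without another check.

```typescript
// Hypothetical sketch: type and function names are illustrative assumptions.

type Verdict = { ok: boolean; feedback?: string };
type Actor = (prompt: string, feedback?: string) => Promise<string>;
type Critic = (draft: string) => Promise<Verdict>;

async function reviseWithCritic(
  actor: Actor,
  critic: Critic,
  prompt: string,
  maxRevisions = 2, // assumed meaning of "capped at two iterations"
): Promise<string> {
  let draft = await actor(prompt);
  for (let round = 0; round < maxRevisions; round++) {
    const verdict = await critic(draft);
    if (verdict.ok) return draft; // critic approved: ship immediately
    draft = await actor(prompt, verdict.feedback); // revise using the feedback
  }
  // Cap reached: ship the latest draft so the loop always terminates.
  return draft;
}
```

With a critic that always rejects, this makes exactly three actor calls and two critic calls, so the worst-case latency is bounded and known in advance.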

Schema Validation: Every LLM output is validated against a strict Zod schema before the pipeline proceeds. If the model returns malformed JSON, the pipeline throws a typed error rather than silently breaking downstream.
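
The fail-fast behavior described above can be illustrated without any dependencies. Note that this sketch swaps in a hand-written check where the project uses a Zod schema, purely so the example stands alone; the `Checkpoint` shape, its field names, and `SchemaValidationError` are assumptions, with the pattern names borrowed from the checkpoint types listed earlier.

```typescript
// Hypothetical sketch: a hand-written validator standing in for a Zod schema.
// The Checkpoint shape and the error class are illustrative assumptions.

interface Checkpoint {
  pattern: string;
  question: string;
}

const PATTERNS = ["contradiction", "repetition", "bundling", "hedging", "omission"];

class SchemaValidationError extends Error {
  issues: string[];
  constructor(issues: string[]) {
    super("LLM output failed validation: " + issues.join("; "));
    this.name = "SchemaValidationError";
    this.issues = issues;
  }
}

function parseCheckpoint(raw: string): Checkpoint {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    // Malformed JSON becomes a typed error instead of leaking downstream.
    throw new SchemaValidationError(["output is not valid JSON"]);
  }
  const obj = (typeof data === "object" && data !== null ? data : {}) as Record<string, unknown>;
  const issues: string[] = [];
  if (typeof obj.pattern !== "string" || PATTERNS.indexOf(obj.pattern) === -1) {
    issues.push("pattern must be one of: " + PATTERNS.join(", "));
  }
  if (typeof obj.question !== "string" || obj.question.length === 0) {
    issues.push("question must be a non-empty string");
  }
  if (issues.length > 0) throw new SchemaValidationError(issues);
  return { pattern: obj.pattern as string, question: obj.question as string };
}
```

Collecting every issue before throwing gives the caller one error that describes all the problems with a response, which is also how schema libraries typically report failures.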

Deterministic Scoring: To keep LLM arithmetic hallucinations out of the final numbers, the scoring phase uses pure TypeScript. The engine parses structured JSON tags extracted from the narratives to evaluate five dimensions grounded in SCCT (Social Cognitive Career Theory). The total confidence score is computed as:

$$S_{total} = \sum_{i=1}^{5} d_i \cdot w$$

Where $d_i$ is the discrete score for dimension $i$, $d_i \in \{1, 2, 3, 4, 5\}$, and $w = 4$ normalizes the output to a 100-point scale for internal LLM context. Users see scores displayed as X/25.
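
As a worked illustration of the formula, the sketch below computes both the 100-point internal score and the X/25 display value. The `totalScore` name comes from this write-up, but its signature here, the `Scores` type, the dimension keys, and `displayScore` are assumptions rather than the project's actual code.

```typescript
// Hypothetical sketch: the signature and type names are illustrative assumptions.

const DIMENSIONS = [
  "financialTrajectory",
  "growthRate",
  "valuesAlignment",
  "socialCapital",
  "stability",
] as const;

type Dimension = (typeof DIMENSIONS)[number];
type Scores = Record<Dimension, number>;

const UNIFORM_WEIGHT = 4; // w = 4 maps five 1..5 scores onto a 100-point scale

function assertValid(d: number, dim: string): void {
  if (Math.floor(d) !== d || d < 1 || d > 5) {
    throw new RangeError(dim + " must be an integer from 1 to 5, got " + d);
  }
}

// S_total = sum over the five dimensions of d_i * w.
// Optional per-dimension weights override the uniform w.
function totalScore(
  scores: Scores,
  weights?: Partial<Record<Dimension, number>>,
): number {
  let total = 0;
  for (const dim of DIMENSIONS) {
    assertValid(scores[dim], dim);
    total += scores[dim] * (weights?.[dim] ?? UNIFORM_WEIGHT);
  }
  return total;
}

// The user-facing value is the unweighted sum, shown as X/25.
function displayScore(scores: Scores): string {
  let sum = 0;
  for (const dim of DIMENSIONS) {
    assertValid(scores[dim], dim);
    sum += scores[dim];
  }
  return sum + "/25";
}
```

For example, scores of 4, 3, 5, 2, and 4 sum to 18, so `totalScore` returns 72 and `displayScore` returns "18/25". Because the result depends only on the five integers, the same inputs always give the same score.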

Security: We implemented AES-256-GCM encrypted localStorage (key derived from the user's Clerk ID via SHA-256, with a unique random IV per operation), a PIN-protected inactivity lock, automatic session wipe on logout, and server-side API key isolation — all verified via network audit.

Challenges We Ran Into

Our biggest challenge was ensuring the system did not oversimplify the user's intent or build narratives on unchecked assumptions. Initially, a single overloaded LLM call produced flawed narratives — an early version assumed our demo persona Alex prioritized salary simply because he mentioned it first, even though he admitted three paragraphs later that flexibility actually mattered more. We solved this by architecting the Checkpoint Loop, forcing the AI to pause and resolve contradictions in a conversational turn before proceeding to narrative generation.

Maintaining distinct persona voices — ensuring Paige stays honest but soft while Felix stays a straight shooter, without either drifting into the other's tone — required extensive prompt engineering and the Gemini critic loop. Each friend's system prompt includes an explicit voice card that bans AI vocabulary entirely and instructs lowercase, casual texting. The critic validates tone compliance on every response before the user sees it.

Another problem was calibration: our synthetic test cases were poor and didn't give reliable results to measure against. We therefore decided we needed thorough research to understand the problem properly, establish checkpoints that matter, and build test cases that actually measure the system.

Accomplishments We're Proud Of

We are proud of our Responsible AI Guardrails, specifically the flip-condition, which forces the system to display its own most vulnerable point and structurally discourages user over-reliance. We also successfully engineered an Emotional Sensing Layer that tracks behavioral stress entirely on the client, preserving strict privacy without requiring backend database storage or API tokens. On the security side, AES-256-GCM encryption with per-operation IVs, a PIN-protected inactivity lock, and server-side API key isolation make PathMapper's security posture meaningfully stronger than most hackathon projects handling sensitive personal data. On a side note, we are also proud of the watch idea. Stress detection is an important feature of our project, and we didn't want it to be exclusive to people who own expensive smartwatches; our cheap alternative lets all users access it.

What We Learned

We learned how deeply behavioral decision science intersects with prompt engineering. Translating abstract psychological concepts like Bounded Rationality, Epistemic Uncertainty (Hedging), and System 1 vs. System 2 thinking into deterministic code states and structured JSON schemas taught us how to bridge the gap between human empathy and computational logic.

We also learned from the feedback on our qualifier submission. Our evaluators noted that strong responsible AI sections name concrete, scenario-specific mitigations — not generic risks. That feedback directly shaped how we designed PathMapper's guardrails: rather than listing "AI could be biased," we identified the specific failure mode, named who gets harmed, and built four structural design choices to reduce it. The qualifier feedback also pushed us to justify architectural choices, not just describe them — which is why every component in PathMapper has an explicit answer to "why this, not something simpler."

What's Next for PathMapper

Full Hardware Wearable Integration: Building the companion app and finalizing the custom ESP32/GSR wristband to push live biometric stress data into the sensing layer via WebSockets, enabling a real connection between watch and web in place of the dummy one currently implemented.
User-Set Weights: UI slider controls allowing users to override the uniform scoring weights so the algorithm dynamically reflects their personal values. The totalScore() function already accepts a weights parameter — the UI is what's missing.
Therapist Mode: Letting users share a structured PathMapper session summary with their therapist, replacing "I have a decision to make" with a precise brief of what the AI detected and what the current lean is.
Cross-Device Sync: Secure, row-level encrypted server-side storage to maintain sessions across devices.
Longitudinal Tracking: Following up 3 months later to see if the decision worked out — creating a feedback loop that validates and improves the scoring model over time.