Subtext: AI That Knows When Not To Draft
Where This Started
A PM sends you a Slack message:
"Hey quick one, rough estimate for export? No commitment, just trying to get a feel before Friday planning."
You answer with a number because you're trying to be helpful.
By Friday, that number is in the roadmap. Six weeks later, you're working weekends to hit a deadline you never really agreed to.
I tested that exact thread in ChatGPT, Claude, and Gemini. All three wrote polished, reasonable-sounding replies. The problem was that the replies still gave an estimate.
The wording was fine. The move was wrong.
That is the gap Subtext is built for. Most AI writing tools help you say something better. Subtext helps you decide whether saying it is the right move at all.
What Subtext Does
You paste a sensitive workplace thread. Subtext returns a short decision brief:
- The Read: what seems to be happening, with evidence from the thread.
- The Move: the next action to take, not just tone advice.
- The Reply: three calibrated drafts: Cooperative, Boundary-setting, and Escalation-ready.
- The Watch: what to look for in the next message.
If the thread looks high-risk, Subtext changes behavior. Threads that mention HR, retaliation, performance plans, role fit, or judgment concerns do not get the normal three drafts. Instead, Subtext gives one factual holding reply and points the user toward an appropriate support channel.
Sometimes the safest draft is no draft.
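As a rough illustration, the brief and the high-risk switch could be modeled like this. The type names, the keyword patterns, and the holding-reply wording are all assumptions made for this sketch, not Subtext's actual code, and a real classifier would likely need more than keyword matching.

```typescript
// Hypothetical shapes; the fields mirror the four parts of the brief.
type DraftKind = "cooperative" | "boundary-setting" | "escalation-ready" | "holding";

interface Draft {
  kind: DraftKind;
  text: string;
}

interface DecisionBrief {
  read: string;     // The Read
  move: string;     // The Move
  replies: Draft[]; // three drafts normally, one holding reply when high-risk
  watch: string;    // The Watch
}

// Assumed patterns for the high-risk categories named above.
const HIGH_RISK_SIGNALS: RegExp[] = [
  /\bHR\b/, // case-sensitive on purpose, to avoid matching ordinary words
  /\bhuman resources\b/i,
  /\bretaliat/i,
  /\bperformance (improvement )?plan/i,
  /\brole fit\b/i,
  /\bjudge?ment\b/i,
];

function isHighRisk(thread: string): boolean {
  return HIGH_RISK_SIGNALS.some((re) => re.test(thread));
}

// The normal path is passed in as a callback so it never runs for high-risk threads.
function buildBrief(thread: string, normalPath: () => DecisionBrief): DecisionBrief {
  if (!isHighRisk(thread)) {
    return normalPath();
  }
  return {
    read: "The thread touches a high-risk topic.",
    move: "Send one factual holding reply and consult an appropriate support channel.",
    replies: [
      { kind: "holding", text: "Thanks for letting me know. I will follow up in writing." },
    ],
    watch: "Whether the next message adds specifics or a deadline.",
  };
}
```

The point of the callback is structural: the three-draft path is not merely discouraged for high-risk threads, it is never invoked.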
The Pivot
The first version was basically one LLM call with a long system prompt.
It worked until it didn't.
Replies to hidden-commitment threads still sometimes included estimates. Replies to manager-and-HR threads sometimes suggested "talk to your manager," even when the manager was the person who sent the concerning message. Every time I tightened the prompt, a new failure mode appeared.
So I changed the architecture.
The LLM does not get to choose the move anymore. A deterministic workplace playbook does that. The LLM can enrich the read and improve wording, but it cannot override the safety policy.
The current system has three layers:
- A deterministic playbook names the pattern and chooses the move: hidden commitment, paper-trail avoidance, deadline compression, scope creep, or high-risk.
- The LLM helps where it is useful: evidence, alternate reads, and wording.
- A local critic checks the result: no placeholders, no accidental estimates, no legal conclusions, no unsafe routing back to the sender.
If the critic finds a problem, the LLM output is discarded and Subtext falls back to the playbook.
For high-risk threads, the controller skips the LLM entirely.
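The control flow described above can be sketched in a few lines. Everything here is hypothetical: the `Plan`, `Enrich`, and `Critic` names are invented for illustration, and the LLM step is modeled as a plain synchronous function to keep the sketch short (a real call would be asynchronous).

```typescript
// Hypothetical types for illustration; not Subtext's real code.
type PlaybookPattern =
  | "hidden-commitment"
  | "paper-trail-avoidance"
  | "deadline-compression"
  | "scope-creep"
  | "high-risk";

interface Plan {
  pattern: PlaybookPattern;
  move: string;          // chosen by the playbook, never by the LLM
  playbookReply: string; // deterministic wording used as the fallback
}

interface Outcome {
  move: string;
  reply: string;
  source: "playbook" | "llm";
}

// The LLM only returns wording; it has no way to return a different move.
type Enrich = (thread: string, plan: Plan) => string;
// The critic returns a list of violations; an empty list means the reply passes.
type Critic = (reply: string, plan: Plan) => string[];

function decide(thread: string, plan: Plan, enrich: Enrich, critic: Critic): Outcome {
  const fallback: Outcome = { move: plan.move, reply: plan.playbookReply, source: "playbook" };

  // High-risk threads skip the LLM entirely.
  if (plan.pattern === "high-risk") {
    return fallback;
  }

  let candidate: string | null = null;
  try {
    candidate = enrich(thread, plan);
  } catch {
    candidate = null; // an LLM failure is treated like a rejected reply
  }

  // Any critic violation discards the LLM output.
  if (candidate === null || critic(candidate, plan).length > 0) {
    return fallback;
  }
  return { move: plan.move, reply: candidate, source: "llm" };
}
```

Note that `move` is copied from the plan on every path. The model's output can replace the wording of the reply, but there is no code path through which it can change the move.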
How I Built It
Subtext was built with MeDo using React, TypeScript, Tailwind/shadcn, Supabase Edge Functions, and Gemini 2.5 Flash through MeDo's gateway.
The important part was not just generating screens. It was using MeDo's iteration loop to turn a fragile prompt into an actual advisor pipeline:
Signal Extractor -> Risk Classifier -> Strategist -> Drafter -> Critic
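The first stages of that chain could look roughly like the following. This is only a sketch under stated assumptions: the signal phrases and the move wording are invented examples, and the real playbook is not shown in this writeup. Only the four pattern names come from the project itself.

```typescript
type Pattern =
  | "hidden-commitment"
  | "paper-trail-avoidance"
  | "deadline-compression"
  | "scope-creep";

interface PlaybookEntry {
  pattern: Pattern;
  signals: RegExp[]; // assumed trigger phrases, for illustration only
  move: string;      // assumed wording of the move
}

const PLAYBOOK: PlaybookEntry[] = [
  {
    pattern: "hidden-commitment",
    signals: [/\b(rough|quick|ballpark) (estimate|number)\b/i, /\bno commitment\b/i],
    move: "Do not give a number yet; offer to come back with a scoped estimate.",
  },
  {
    pattern: "paper-trail-avoidance",
    signals: [/\btake this offline\b/i, /\b(don't|do not) put (this|that|it) in writing\b/i],
    move: "Summarize the conversation in writing afterwards.",
  },
  {
    pattern: "deadline-compression",
    signals: [/\b(move|pull) (it|this|the date) (up|in)\b/i, /\bby (EOD|end of day|tomorrow)\b/i],
    move: "Name what would have to be dropped to hit the new date.",
  },
  {
    pattern: "scope-creep",
    signals: [/\bwhile you're at it\b/i, /\bone more (thing|small thing)\b/i],
    move: "Ask which existing item the new work replaces.",
  },
];

interface Signal {
  pattern: Pattern;
  evidence: string; // the exact phrase from the thread, usable in The Read
}

// Signal Extractor: collect every playbook phrase found in the thread.
function extractSignals(thread: string): Signal[] {
  const found: Signal[] = [];
  for (const entry of PLAYBOOK) {
    for (const re of entry.signals) {
      const m = thread.match(re);
      if (m) {
        found.push({ pattern: entry.pattern, evidence: m[0] });
      }
    }
  }
  return found;
}

// Strategist: pick the entry with the most signals; ties go to playbook order.
function chooseEntry(signals: Signal[]): PlaybookEntry | null {
  let best: PlaybookEntry | null = null;
  let bestCount = 0;
  for (const entry of PLAYBOOK) {
    const count = signals.filter((s) => s.pattern === entry.pattern).length;
    if (count > bestCount) {
      best = entry;
      bestCount = count;
    }
  }
  return best;
}
```

Because both stages are plain table lookups, the same thread always yields the same pattern and move, which is what makes the behavior testable in a way a prompt is not.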
I grounded the prompt and playbook in a few frameworks:
- SBI (Situation-Behavior-Impact) for evidence quality.
- Psychological safety for tone.
- Harvard negotiation principles for avoiding accidental commitments.
- EEOC-style boundaries for calibrated language and support-channel routing.
What Was Hard
The hardest part was keeping the LLM in its lane.
LLMs are very good at sounding helpful. In this product, that can be dangerous. A beautifully written reply that gives the wrong commitment is still the wrong answer.
Power imbalance was another hard case. If the manager is the source of the concern, "ask your manager" is not safe guidance. Subtext now blocks that path in both the prompt and the critic.
The language also has to stay calibrated. Subtext should not say "this is illegal." It should say "this may warrant caution" or "consider consulting an appropriate support channel." That difference matters.
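A local critic for these rules might be little more than a list of patterns. The sketch below is an assumption about how such a check could work, not the project's actual critic; the rule names and regular expressions are invented for illustration and would miss many phrasings in practice.

```typescript
interface CriticContext {
  senderIsManager: boolean; // true when the concerning message came from the manager
  forbidEstimates: boolean; // true when the chosen move is to avoid giving a number
}

interface Rule {
  name: string;
  applies: (ctx: CriticContext) => boolean;
  pattern: RegExp;
}

const always = (): boolean => true;

// Assumed rules, one per check the critic is described as making.
const RULES: Rule[] = [
  {
    name: "placeholder",
    applies: always,
    pattern: /\[[^\]]*\]|\{\{[^}]*\}\}|\bTBD\b/,
  },
  {
    name: "accidental-estimate",
    applies: (ctx) => ctx.forbidEstimates,
    pattern: /\b\d+(\s*(-|to)\s*\d+)?\s*(hours?|days?|weeks?|sprints?|months?)\b/i,
  },
  {
    name: "legal-conclusion",
    applies: always,
    pattern: /\b(this is|that is|that's|it is|it's) (illegal|unlawful|discrimination|harassment)\b/i,
  },
  {
    name: "routes-to-sender",
    applies: (ctx) => ctx.senderIsManager,
    pattern: /\b(talk|speak|go|reach out) (to|with) your manager\b|\bask your manager\b/i,
  },
];

// Returns the names of all violated rules; an empty array means the reply passes.
function critique(reply: string, ctx: CriticContext): string[] {
  return RULES
    .filter((rule) => rule.applies(ctx) && rule.pattern.test(reply))
    .map((rule) => rule.name);
}
```

For example, with `senderIsManager` set, a reply containing "This is illegal. Talk to your manager." fails two rules, while "This may warrant caution; consider consulting an appropriate support channel." passes.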
What I Learned
For sensitive decisions, safety is not just a prompt problem. It is a controller problem.
If the model can override the policy, the policy is not real.
That lesson applies beyond workplace communication: legal intake, medical triage, financial tools, customer safety workflows. Anywhere the cost of the wrong answer is high, the model should not be the only guardrail.
The biggest feature in Subtext is the refusal. When the app decides not to draft a normal reply, that is not a missing feature. That is the product doing its job.
The Principle
The move is the product.
The LLM serves the move.
The critic guards the boundary.
Built With
- baidu
- claude
- ernie
- gemini
- medo
- react
- shadcn-ui
- tailwind-css
- typescript
- vite