HarborAI

Crisis PR Agent — AI-Powered Crisis Communications

Inspiration

Crisis communications is one of the last bastions of expensive, slow, human-only consulting. When a scandal breaks, companies scramble — hiring PR agencies at $500/hour, waiting days for a "strategic assessment" that's often just a junior analyst Googling precedents. We asked: what if an AI swarm could do this in 30 seconds?

The idea crystallized around a simple observation: every PR crisis follows a pattern. There are historical precedents, quantifiable financial risks, and a finite set of strategic responses. The real value isn't the information — it's the speed of assembling it into a coherent action plan. That's exactly what multi-agent orchestration excels at.

We wanted to build something that felt like a premium SaaS product, not a hackathon prototype — a tool where you type a company name, and within seconds you're looking at a full crisis dossier: the articles, the financial exposure, three competing response strategies with ROI projections, AI-drafted press releases, and an invoice showing you just saved thousands in consulting fees.

What It Does

Crisis PR Agent is a 5-agent AI pipeline that takes a company name and delivers a complete crisis response package:

The Watcher (Agent 1) scans the web for negative press, scores each article's exposure risk, and groups the articles by crisis topic
The Historical Strategist (Agent 2) finds real-world precedents of similar crises and how companies handled them
The Scorer (Agent 3) translates media coverage into financial metrics: estimated reach, customer churn probability, and Value at Risk
The Strategist (Agent 4) synthesizes everything into three named response strategies (Offensive, Diplomate, Silence) with full communication drafts
The CFO (Agent 5) generates a transparent invoice comparing AI cost vs. human consulting equivalent, showing ROI

The user flow is: enter a company name → browse discovered crisis topics → select one → get three strategies with drafts, historical precedents, and a cost breakdown — all from a single pipeline call.

How We Built It

The Agent Pipeline

The backend is a LangGraph orchestration of 5 specialized agents running on FastAPI. The key architectural insight was the execution graph:

Agent 1 → (Agent 2 ‖ Agent 3) → Agent 4 → Agent 5

Agent 1 (The Watcher) runs first, then Agents 2 and 3 execute in parallel (they're independent — precedent research and financial scoring don't depend on each other), and Agent 4 waits for both before generating strategies. Agent 5 is purely computational (no LLM calls).

Each agent has two entry points: one for the LangGraph pipeline (*_node(state)) and a standalone function for the REST API. This let us develop agents independently and compose them flexibly.
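The fan-out/fan-in shape and the two entry points per agent can be sketched without LangGraph. This is a minimal illustration using only the standard library, not our actual graph definition: every function name and state key below is a hypothetical stand-in, and the agents return canned data so the sketch runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

# All functions below are hypothetical stand-ins for the real agents.
def watcher_node(state):
    return {"articles": [f"article about {state['company']}"]}

def find_precedents(articles):
    """Standalone entry point, callable directly from a REST handler."""
    return [f"precedent for {a}" for a in articles]

def historian_node(state):
    """Pipeline entry point: unpack the state, delegate, wrap the result."""
    return {"precedents": find_precedents(state["articles"])}

def scorer_node(state):
    return {"metrics": {"article_count": len(state["articles"])}}

def strategist_node(state):
    # Runs only after both parallel branches have written their results.
    seen = sorted(k for k in ("precedents", "metrics") if k in state)
    return {"strategies": ["Offensive", "Diplomate", "Silence"], "inputs_seen": seen}

def cfo_node(state):
    return {"invoice": {"strategy_count": len(state["strategies"])}}

def run_pipeline(company):
    state = {"company": company}
    state.update(watcher_node(state))                 # Agent 1
    with ThreadPoolExecutor(max_workers=2) as pool:   # Agents 2 and 3 in parallel
        futures = [pool.submit(node, dict(state))
                   for node in (historian_node, scorer_node)]
        for future in futures:                        # fan-in: wait for both
            state.update(future.result())
    state.update(strategist_node(state))              # Agent 4
    state.update(cfo_node(state))                     # Agent 5
    return state
```

Each node returns only the keys it owns, so the two parallel branches never write to the same field and merging their results is a plain dictionary update.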

Agent 1: Web Intelligence

The Watcher chains three services:

Tavily for news search (up to 10 articles per company)
Jina Reader for full-text extraction (bypasses paywalled/JS-heavy sites)
Gemini 2.5 Flash for structured analysis — each article gets a summary, subject classification, authority score (1–5), severity score (1–5), and sentiment

Articles are processed in parallel (ThreadPoolExecutor, 5 workers). Gemini also filters out navigation/footer/ad content from raw HTML before analysis.
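A minimal sketch of the five-worker fan-out, assuming a per-article function that does the fetch and the LLM call. Here `analyze_article` is a hypothetical placeholder that returns fixed scores so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_article(url):
    """Hypothetical per-article step: extract full text, then score it."""
    # Placeholder result; a real version would call the reader and the LLM.
    return {"url": url, "authority": 3, "severity": 2, "sentiment": "negative"}

def analyze_all(urls, max_workers=5):
    # pool.map keeps results in input order even though calls overlap in time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(analyze_article, urls))
```

Threads fit here because the work is I/O-bound: each worker spends most of its time waiting on HTTP responses.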

The exposure score combines everything into a single risk metric:

$$\text{Exposure} = (\text{Authority} \times \text{Severity}) \times R_{\text{risk}} \times R_{\text{recency}} \times W_{\text{sentiment}}$$

where risk multipliers vary by subject (security fraud at 1.8x vs. customer service at 1.0x), recency decays from 3.0x (breaking news < 2 hours) down to 0.7x (> 30 days old), and sentiment weights range from 1.0 (negative) to 0.1 (positive).
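As a worked illustration of the formula, here is a sketch that uses only the endpoint values quoted above. The in-between recency tier, the default for unlisted subjects, and all identifier names are assumptions made for the example, not the production tables.

```python
# Values quoted in the text; the real tables contain more entries.
RISK_MULTIPLIER = {"security_fraud": 1.8, "customer_service": 1.0}
SENTIMENT_WEIGHT = {"negative": 1.0, "positive": 0.1}

def recency_multiplier(age_hours):
    if age_hours < 2:            # breaking news
        return 3.0
    if age_hours > 30 * 24:      # older than 30 days
        return 0.7
    return 1.0                   # ASSUMPTION: placeholder for the middle tiers

def exposure_score(authority, severity, subject, age_hours, sentiment):
    risk = RISK_MULTIPLIER.get(subject, 1.0)  # ASSUMPTION: 1.0 for unlisted subjects
    return (authority * severity) * risk * recency_multiplier(age_hours) \
        * SENTIMENT_WEIGHT[sentiment]
```

A one-hour-old negative fraud story from a top-authority outlet (authority 5, severity 4) scores 20 × 1.8 × 3.0 × 1.0 = 108, while a 40-day-old positive customer-service piece (authority 2, severity 1) scores 2 × 1.0 × 0.7 × 0.1 = 0.14.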

Agent 3: Financial Risk Modeling

The Scorer translates media signals into business impact using a Monte Carlo-inspired simulation:

$$\text{Reach} = \min\!\Big(5000 \times \text{Authority} \times \frac{\text{Severity}}{2} \times V_{\text{viral}},\; 10^6\Big)$$

where $V_{\text{viral}}$ is a Gemini-classified coefficient (0.8 for boring news up to 2.5 for scandal-level virality).
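The reach formula translates directly into a few lines. This is a sketch with hypothetical function and parameter names.

```python
def estimated_reach(authority, severity, viral_coefficient, cap=1_000_000):
    """Reach = min(5000 * Authority * Severity / 2 * V_viral, 10^6)."""
    raw = 5000 * authority * (severity / 2) * viral_coefficient
    return min(raw, cap)
```

For authority 3, severity 4, and a neutral coefficient of 1.0, this gives 5000 × 3 × 2 × 1.0 = 30,000. With scores limited to 1–5 and a coefficient of at most 2.5, the raw value tops out at 156,250, so the cap acts as a safety net for out-of-range inputs.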

The Value at Risk combines acquisition loss and churn loss:

$$\text{VaR} = \underbrace{\text{Reach} \times 0.005 \times \text{CAC}}_{\text{lost prospects}} + \underbrace{\text{Exposed} \times \frac{\text{Severity}}{5} \times \frac{W_{\text{topic}}}{10} \times 0.1 \times \text{ARR}}_{\text{churned revenue}}$$

For multi-article crises, we apply deduplication weights — the first article carries 100% of its VaR, the second only 20%, and subsequent articles 10%. This prevents double-counting when multiple outlets cover the same story.
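A sketch of the per-article VaR and the deduplication weights. It assumes that `exposed_customers` is a count of existing customers who saw the coverage, that `arr` is annual recurring revenue per customer, and that `topic_weight` sits on a 0–10 scale. The text does not pin these down, and the function names are hypothetical.

```python
def article_var(reach, severity, topic_weight, exposed_customers, cac, arr):
    # Acquisition loss: 0.5% of reached people counted as lost prospects.
    lost_prospects = reach * 0.005 * cac
    # Churn loss, dampened by severity (out of 5) and topic weight (out of 10).
    churned_revenue = (exposed_customers * (severity / 5)
                       * (topic_weight / 10) * 0.1 * arr)
    return lost_prospects + churned_revenue

def crisis_var(article_vars):
    """Combine per-article VaR values in the order given: 100%, 20%, then 10%."""
    total = 0.0
    for index, value in enumerate(article_vars):
        weight = 1.0 if index == 0 else 0.2 if index == 1 else 0.1
        total += weight * value
    return total
```

With a reach of 30,000, CAC of 100, 200 exposed customers, severity 4, topic weight 5, and ARR of 1,200, the two terms are 15,000 and 9,600, for a VaR of 24,600. Four articles valued at 24,600, 10,000, 5,000, and 5,000 combine to 24,600 + 2,000 + 500 + 500 = 27,600. How the articles are ordered before weighting is left to the caller in this sketch.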

Agent 2: Precedent Research with Grounding

The Historical Strategist uses Gemini Pro with Google Search Grounding in a 3-phase pipeline:

Search for similar historical crises
Search for strategies those companies adopted
Search for measurable outcomes

Each phase builds on the previous one's context. A final verification step with Gemini Flash cross-checks extracted cases against their source URLs to filter hallucinated precedents.

Agent 4: Strategy Generation

The Strategist receives enriched articles (from Agent 3), historical precedents (from Agent 2), and financial metrics to produce three distinct strategies. Each strategy includes a tone, recommended channels, key actions, estimated cost, and projected impact with ROI scores. It also generates four communication drafts: press release, internal email, social media post, and legal notice.

Dual API Key Strategy

We hit Gemini rate limits immediately when running Agent 2 and Agent 3 in parallel. The fix: two separate Google API keys (GOOGLE_API_KEY for Agents 1–2, GOOGLE_API_KEY1 for Agents 3–4), giving each parallel branch its own quota.
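The key routing can be as small as a lookup table. This sketch uses the two environment variable names above; the helper name and error handling are assumptions.

```python
import os

# Agents 1-2 share one key, Agents 3-4 share the other (Agent 5 makes no LLM calls).
AGENT_KEY_ENV = {
    1: "GOOGLE_API_KEY",
    2: "GOOGLE_API_KEY",
    3: "GOOGLE_API_KEY1",
    4: "GOOGLE_API_KEY1",
}

def api_key_for(agent_number, env=None):
    """Return the API key for an agent; `env` is injectable to ease testing."""
    env = os.environ if env is None else env
    variable = AGENT_KEY_ENV[agent_number]
    key = env.get(variable)
    if not key:
        raise RuntimeError(f"{variable} is not set (needed by Agent {agent_number})")
    return key
```

Failing fast on a missing variable surfaces a misconfigured deployment at startup rather than halfway through a pipeline run.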

The Frontend

The SPA is built with React 18 + Vite + TypeScript + Tailwind CSS v3. We aimed for a design language that feels like a premium consulting tool:

Instrument Serif for display headings, Outfit for body text
A muted blue-gray palette (royal #2b3a8f, steel #5a7d95, mist #e8eaf0)
FLIP-style bubble transitions between views
3D perspective tilt-on-hover cards (inline perspective(800px) to avoid stacking context bugs)
Staggered timeline animations that sync with real agent progress

The frontend transforms raw backend data through dedicated functions. For example, the unbounded exposure scores get log-normalized to a 1–10 criticality scale:

$$\text{Criticality} = \text{clamp}\!\Big(1,\;\text{round}(2.2 \times \ln(\text{exposure\_score})),\;10\Big)$$
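The real transform lives in the TypeScript frontend; the same arithmetic is shown here in Python to keep the examples in one language. The guard for non-positive scores is an assumption, since the logarithm is undefined there.

```python
import math

def criticality(exposure_score):
    """Map an unbounded exposure score onto a 1-10 badge."""
    if exposure_score <= 0:
        return 1  # ASSUMPTION: treat non-positive scores as the minimum
    scaled = round(2.2 * math.log(exposure_score))
    return max(1, min(scaled, 10))
```

An exposure score of 45 gives 2.2 × ln(45) ≈ 8.37, which rounds to 8. A score of 1 gives 0 and is clamped up to 1, and a score of 1,000 gives about 15.2 and is clamped down to 10. Python's `round` resolves exact ties to the nearest even integer, which can differ from JavaScript's `Math.round` on values ending in .5.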

We also built a demo mode — pre-cached API responses for three showcase companies (OpenAI, Tesla, Apple) captured by a Python script and imported at build time. Adding ?debug to the URL bypasses the backend entirely with simulated delays.

Outcome-Based Billing (Paid.ai)

Agent 5 (The CFO) doesn't use an LLM. It aggregates actual API costs from Agents 2–4, calculates what a human consulting equivalent would cost, and builds a structured invoice:

Agent	Human-Equivalent Cost	Formula
Historical Strategist	Variable	cases × 3h × €150/h
Risk Analyst	Variable	€500 + 0.01% of total VaR
Executive Strategist	Fixed	€2,500

When the crisis is trivial (alert_level = IGNORE), Agent 5 refuses to bill — returning an action refusal with a reason, because running the full pipeline isn't worth the client's money. The API cost is still tracked for transparency.
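The table and the refusal rule can be sketched as one pure function. The return shape, field names, and the `savings_eur` figure (human-equivalent total minus API cost) are assumptions for illustration. The three formulas come from the table above.

```python
def build_invoice(alert_level, precedent_cases, total_var, api_cost_eur):
    if alert_level == "IGNORE":
        # Refuse to bill, but still report what the run cost.
        return {
            "billed": False,
            "reason": "Crisis judged trivial; full pipeline not worth billing.",
            "api_cost_eur": api_cost_eur,
        }
    lines = {
        "Historical Strategist": precedent_cases * 3 * 150,  # cases x 3h x 150/h
        "Risk Analyst": 500 + 0.0001 * total_var,            # 500 + 0.01% of VaR
        "Executive Strategist": 2500,                        # fixed fee
    }
    human_total = sum(lines.values())
    return {
        "billed": True,
        "lines": lines,
        "human_equivalent_eur": human_total,
        "api_cost_eur": api_cost_eur,
        "savings_eur": human_total - api_cost_eur,  # ASSUMPTION: simple difference
    }
```

With 4 precedent cases and a total VaR of €2,000,000, the lines come to €1,800, €700, and €2,500, for a human-equivalent total of €5,000.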

Challenges We Faced

Content extraction quality. Raw web pages are noisy — navigation menus, cookie banners, related article links, ad copy. We added a Gemini-powered paragraph filter before analysis: the LLM reads chunks and selects only actual article content. Even then, paywall detection required pattern matching for common paywall phrases.

Financial model calibration. Early iterations of Agent 3 produced absurd VaR numbers — a single blog post about a minor bug would estimate millions in losses. We iterated heavily: reducing the reach multiplier from 20,000 to 5,000, adding topic-weight dampening, introducing the deduplication weights, and capping reach at 1M. The formulas are deliberately conservative — better to underestimate than to cry wolf.

Parallel agent execution and rate limits. LangGraph makes the graph easy, but sharing API quotas across parallel branches was a real pain point. Our dual-API-key solution is pragmatic but inelegant — a proper solution would be token-bucket rate limiting at the client level.
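For reference, a token bucket of the kind mentioned above might look like the following. This is a generic sketch we did not ship. The class name, parameters, and injectable clock are illustrative choices.

```python
import threading
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.updated = clock()
        self.lock = threading.Lock()

    def try_acquire(self, tokens=1):
        with self.lock:
            now = self.clock()
            # Refill in proportion to elapsed time, never above capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.updated) * self.rate)
            self.updated = now
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            return False

    def acquire(self, tokens=1):
        # Block until enough tokens are available, polling at the refill interval.
        while not self.try_acquire(tokens):
            time.sleep(1.0 / self.rate)
```

Sharing one bucket per API key across all threads would let parallel branches draw from the same quota without exceeding it, which removes the need for a second key.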

FLIP animations with scrollable containers. The 3D tilt cards on the strategy page initially broke scroll hit-testing. Setting transformStyle: preserve-3d creates a new stacking context that interferes with overflow-y: auto. The fix was using inline perspective(800px) directly in the transform string instead.

Structured LLM output reliability. Even with Pydantic schemas and explicit format instructions, Gemini occasionally returns malformed JSON or skips required fields. Every agent has defensive guards and fallback values. The precedent verification step (Gemini Flash checking Gemini Pro's citations) was added after we caught fabricated source URLs in early runs.
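A defensive guard of this kind can be sketched with the standard library alone. Our agents use Pydantic schemas, so this is an analogous illustration rather than the actual code. The field names and fallback values are assumptions.

```python
import json

# ASSUMPTION: conservative defaults used when the model output is unusable.
FALLBACK = {"summary": "", "authority": 1, "severity": 1, "sentiment": "neutral"}

def parse_analysis(raw_text):
    """Parse an LLM response, filling gaps instead of raising."""
    try:
        data = json.loads(raw_text)
    except (json.JSONDecodeError, TypeError):
        return dict(FALLBACK)                 # malformed JSON or non-string input
    if not isinstance(data, dict):
        return dict(FALLBACK)                 # valid JSON, wrong shape
    result = dict(FALLBACK)
    result.update({k: v for k, v in data.items()
                   if k in FALLBACK and v is not None})
    for field in ("authority", "severity"):
        try:
            value = int(result[field])
        except (TypeError, ValueError):
            value = FALLBACK[field]
        result[field] = max(1, min(value, 5))  # keep scores on the 1-5 scale
    return result
```

Unknown keys are dropped, missing or null fields fall back to defaults, and out-of-range scores are clamped, so downstream agents always receive the same shape.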

What We Learned

Multi-agent systems are about data contracts, not prompts. The hardest part wasn't writing prompts — it was defining clean Pydantic schemas that every agent agreed on. Type safety across the pipeline caught more bugs than testing did.
Parallel execution requires thinking about resources, not just dependencies. The execution graph is simple, but sharing API quotas, managing thread pools, and handling partial failures added real complexity.
Log-normalization is your friend. When mapping unbounded scores to bounded UI elements (like a 1–10 criticality badge), logarithmic scaling preserves meaningful differences without outliers dominating.
Demo data is a first-class feature. Pre-caching real API responses and replaying them through the same transformation pipeline gave us confidence the UI handles real-world data shapes, not just our mock data assumptions.
Design is a multiplier. Investing in typography, color system, and animations early made the tool feel trustworthy — critical for a product that's asking you to trust AI with your crisis communications.