About the Project
Inspiration
Cold outreach is broken. Sales teams spend hours manually searching Google Maps, copying business names into spreadsheets, visiting websites one by one, and writing the same research notes over and over. So we asked a simple question: what if you could type a plain-English brief like "independent coffee roasters in Portland, OR" and get a fully enriched lead list in under a minute?
That question became Outreach OS.
How We Built It
The system is a multi-layer pipeline. A user submits a query through the React frontend, which hits a FastAPI backend. The backend provisions a fresh, isolated Postgres database for that job using Ghost's MCP server, then hands off to a research agent.
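In code, the handoff looks roughly like this -- `provision_job_db` and `run_research_agent` are illustrative stand-ins for the Ghost MCP provisioning call and the agent entry point, not the actual function names:

```python
import uuid

from fastapi import BackgroundTasks, FastAPI
from pydantic import BaseModel

app = FastAPI()

class JobRequest(BaseModel):
    query: str  # plain-English brief, e.g. "independent coffee roasters in Portland, OR"

async def provision_job_db(job_id: str) -> str:
    """Illustrative stub: provision a fresh, isolated Postgres for this job via Ghost MCP."""
    raise NotImplementedError

async def run_research_agent(job_id: str, query: str, dsn: str) -> None:
    """Illustrative stub: the Search -> Enrich -> Write pipeline described below."""
    raise NotImplementedError

@app.post("/jobs")
async def create_job(req: JobRequest, background: BackgroundTasks) -> dict:
    job_id = str(uuid.uuid4())
    dsn = await provision_job_db(job_id)  # fresh database per job
    # Kick off the agent in the background so the request returns immediately.
    background.add_task(run_research_agent, job_id, req.query, dsn)
    return {"job_id": job_id, "status": "running"}
```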
The agent runs three sequential phases:
- Search -- HasData's Google Maps search API returns businesses matching the query, including name, phone, address, website, and rating.
- Enrich -- For each lead, the agent scrapes the business website with BeautifulSoup and passes the content to GPT-4o-mini, which generates a two- to three-sentence research summary highlighting services, business signals, and potential pain points. Email addresses are extracted via regex (see the sketch after this list).
- Write -- Enriched leads are persisted to the per-job Ghost database via asyncpg.
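A rough sketch of the Enrich step, assuming the current `openai` async client -- the prompt wording, truncation limit, and field names are illustrative choices, not the exact ones in the codebase:

```python
import re

import httpx
from bs4 import BeautifulSoup
from openai import AsyncOpenAI

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def enrich_lead(website_url: str) -> dict:
    # Scrape the business website and reduce it to visible text.
    async with httpx.AsyncClient(follow_redirects=True, timeout=15) as http:
        resp = await http.get(website_url)
    text = BeautifulSoup(resp.text, "html.parser").get_text(" ", strip=True)

    # Ask GPT-4o-mini for the short research summary.
    completion = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Write a 2-3 sentence research summary of this business, "
                       "noting services, business signals, and potential pain points:\n\n"
                       + text[:8000],  # truncate to keep the prompt cheap
        }],
    )

    return {
        "summary": completion.choices[0].message.content,
        "emails": sorted(set(EMAIL_RE.findall(text))),  # regex email extraction
    }
```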
The frontend polls for status updates and surfaces leads in a clean table the moment the pipeline completes. Each job gets its own isolated database, and completed jobs can be forked -- Ghost snapshots the database instantly so you can re-run research on a fresh copy without touching the original.
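On the backend, the status endpoint the frontend polls can be a straight lookup against the master database. A minimal sketch with asyncpg -- the `jobs` table columns and DSN are assumed for illustration:

```python
import asyncpg
from fastapi import FastAPI, HTTPException

app = FastAPI()

@app.get("/jobs/{job_id}")
async def job_status(job_id: str) -> dict:
    conn = await asyncpg.connect("postgresql://localhost/master")  # assumed DSN; use a pool in practice
    try:
        row = await conn.fetchrow(
            "SELECT status, lead_count FROM jobs WHERE id = $1", job_id
        )
    finally:
        await conn.close()
    if row is None:
        raise HTTPException(status_code=404, detail="unknown job")
    return {"status": row["status"], "leads": row["lead_count"]}
```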
The per-lead cost follows a simple model. If $n$ leads are processed, with an average enrichment cost of $c_e$ and a search cost of $c_s$ per lead, the total cost is approximately:
$$C(n) = n \cdot (c_e + c_s)$$
At current pricing, $c_e \approx \$0.001$ (OpenAI) and $c_s \approx \$0.002$ (HasData), giving roughly $C(10) \approx \$0.03$ per run.
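As a sanity check in code:

```python
def run_cost(n: int, c_e: float = 0.001, c_s: float = 0.002) -> float:
    """Approximate run cost: n leads, each costing c_e (enrichment) plus c_s (search)."""
    return n * (c_e + c_s)

print(run_cost(10))  # ~0.03, i.e. about $0.03 for a 10-lead run
```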
What We Learned
Database-per-job isolation is a genuinely powerful pattern. Ghost made it trivial to spin up a fresh Postgres instance per query and fork it later, which gave us clean data boundaries without any multi-tenant complexity. We also learned that website scraping is the dominant source of latency -- the $O(n)$ sequential enrichment loop is the obvious next optimization target, since parallelizing it would bring 10-lead runs from ~45 seconds down to ~10 seconds.
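A sketch of that parallelization, reusing the hypothetical `enrich_lead` coroutine from the Enrich sketch above; the semaphore bound is an assumed value to stay polite to target sites and inside API rate limits:

```python
import asyncio

async def enrich_all(websites: list[str], max_concurrency: int = 5) -> list[dict]:
    # Bound concurrency rather than firing all n requests at once.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> dict:
        async with sem:
            return await enrich_lead(url)  # coroutine from the Enrich sketch

    # Replaces the O(n) sequential loop with n concurrent tasks.
    return await asyncio.gather(*(bounded(url) for url in websites))
```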
Challenges
The biggest challenge was wiring the Ghost MCP server into an async FastAPI process correctly. MCP stdio sessions are stateful and cannot be shared across concurrent requests, so we had to serialize all Ghost calls behind an asyncio lock. Getting the fork workflow right -- provisioning, schema migration, research re-run, and status tracking -- required careful coordination across the orchestrator, the Ghost MCP client, and the master database.
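The serialization itself reduces to one `asyncio.Lock` around the shared session, along these lines (`call_tool` is the MCP Python SDK's session method; the wrapper class is illustrative):

```python
import asyncio
from typing import Any

class SerializedGhostClient:
    """Wraps the single stateful MCP stdio session so concurrent
    FastAPI requests cannot interleave calls on it."""

    def __init__(self, session: Any) -> None:  # session: the shared mcp.ClientSession
        self._session = session
        self._lock = asyncio.Lock()

    async def call_tool(self, name: str, arguments: dict) -> Any:
        # One Ghost call at a time; other requests queue on the lock.
        async with self._lock:
            return await self._session.call_tool(name, arguments)
```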
Auth0 JWT validation in an async context and keeping the frontend polling loop snappy without hammering the API were smaller but real friction points.
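For the JWT piece, the shape of RS256 validation with python-jose looks like this; the Auth0 domain and audience are placeholders, and in practice the JWKS response should be cached:

```python
import httpx
from fastapi import HTTPException
from jose import JWTError, jwt

AUTH0_DOMAIN = "example.us.auth0.com"      # placeholder
API_AUDIENCE = "https://api.example.com"   # placeholder

async def verify_token(token: str) -> dict:
    # Fetch the tenant's public signing keys.
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"https://{AUTH0_DOMAIN}/.well-known/jwks.json")
    jwks = resp.json()

    # Match the token's key id against the JWKS.
    kid = jwt.get_unverified_header(token).get("kid")
    key = next((k for k in jwks["keys"] if k["kid"] == kid), None)
    if key is None:
        raise HTTPException(status_code=401, detail="unknown signing key")

    try:
        return jwt.decode(
            token, key, algorithms=["RS256"],
            audience=API_AUDIENCE, issuer=f"https://{AUTH0_DOMAIN}/",
        )
    except JWTError:
        raise HTTPException(status_code=401, detail="invalid token")
```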
Built With
- anthropic-mcp-python-sdk
- asyncpg
- auth0
- auth0-react-sdk
- beautifulsoup4
- docker
- fastapi
- ghost
- ghost-database-provisioning
- ghost-mcp-server
- hasdata-google-maps-search-api
- httpx
- jwt
- mcp-server
- openai-gpt-4o-mini
- postgresql
- pydantic
- python-3.11
- python-dotenv
- python-jose
- react-19
- react-router
- rs256
- truefoundry
- uvicorn
- vite