Inspiration

The rise of AI-powered search (ChatGPT, Perplexity, Gemini) is rewriting the rules of online visibility. Traditional SEO optimizes for Google's crawler, but what about being cited by an LLM? Most businesses have zero awareness of their "AI discoverability," and few tools exist to measure or fix it. We built Steyer Aura to change that.

What it does

Steyer Aura takes any website URL and delivers a full AI-readiness report: a GEO score ∈ [0, 100], plus a coherence score and a comparison score, both computed as the cosine similarity between sentence embeddings:

$$\text{similarity} = \frac{\vec{v}_A \cdot \vec{v}_B}{|\vec{v}_A| \cdot |\vec{v}_B|}$$

The coherence score compares the site's own content against its web reputation (retrieved via Tavily); the comparison score benchmarks it against the sector leader. Steyer Aura then auto-generates an llms.txt file optimized for AI crawlers, rewrites the site's metadata and structured data using Gemini, and exports the full audit as a visual mind map on Miro.
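The similarity formula above reduces to a few lines of NumPy. A minimal sketch, using toy vectors in place of the real sentence embeddings (which in the product come from Sentence Transformers):

```python
import numpy as np

def cosine_similarity(v_a: np.ndarray, v_b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors: dot product
    divided by the product of their magnitudes, as in the formula above."""
    return float(np.dot(v_a, v_b) / (np.linalg.norm(v_a) * np.linalg.norm(v_b)))

# Toy 3-dimensional vectors standing in for real sentence embeddings
site_embedding = np.array([0.2, 0.7, 0.1])
reputation_embedding = np.array([0.25, 0.6, 0.2])

score = cosine_similarity(site_embedding, reputation_embedding)
```

The same function serves both metrics: feed it the site's own embedding and its web-reputation embedding for coherence, or the site's and the sector leader's for comparison.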

How we built it

Four-stage pipeline:

  1. Crawl — Crawl4AI + BeautifulSoup extracts clean Markdown and JSON-LD structured data
  2. Compress — Compresr (YC 2026) semantically distills the content into a token-minimal summary safe to feed to an LLM
  3. Audit — Gemini 2.5 Flash via LangChain scores the site; Sentence Transformers compute the similarity metrics
  4. Improve — Gemini rewrites metadata, structured data, and content for AI engines

Wired together through a FastAPI backend, Supabase webhooks, and ngrok. We used Streamlit as a rapid sanity-check UI during development, then Lovable to ship a polished production React frontend connected to the backend in under two hours.
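The four stages compose into a single linear flow. A minimal sketch of that orchestration, with hypothetical stand-in functions in place of the real Crawl4AI, Compresr, and Gemini calls:

```python
def crawl(url: str) -> str:
    # Stand-in: the real stage uses Crawl4AI + BeautifulSoup to
    # extract clean Markdown and JSON-LD from the live site.
    return f"# Extracted content for {url}"

def compress(markdown: str) -> str:
    # Stand-in: the real stage calls Compresr's API to semantically
    # distill the content into a token-minimal summary.
    return markdown[:500]

def audit(summary: str) -> dict:
    # Stand-in: the real stage scores the site with Gemini 2.5 Flash
    # via LangChain and computes similarity with Sentence Transformers.
    return {"geo_score": 72, "summary": summary}

def improve(report: dict) -> dict:
    # Stand-in: the real stage has Gemini rewrite metadata,
    # structured data, and content for AI engines.
    report["llms_txt"] = "# llms.txt (auto-generated)"
    return report

def run_pipeline(url: str) -> dict:
    """Crawl -> Compress -> Audit -> Improve, as one composed call."""
    return improve(audit(compress(crawl(url))))
```

In production this chain sits behind a FastAPI endpoint, with Supabase webhooks and ngrok handling the eventing around it.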

Challenges we ran into

Raw website content easily overflows an LLM's context window, producing hallucinated scores and invented recommendations; this was our biggest correctness bug. Integrating Compresr's compression API fixed it entirely. We also had to surgically isolate Crawl4AI's async event loop from synchronous LangChain calls using concurrent.futures, and spent significant time mapping nested audit JSON onto a meaningful spatial layout in the Miro MCP server.
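The event-loop isolation pattern is worth showing. A sketch of the approach, with a hypothetical coroutine standing in for Crawl4AI's async crawler: the coroutine runs on a fresh event loop inside a worker thread, so it never collides with whatever loop (or lack of one) the synchronous LangChain caller is sitting in.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def crawl_async(url: str) -> str:
    # Stand-in for Crawl4AI's async crawler.
    await asyncio.sleep(0)
    return f"markdown for {url}"

def crawl_sync(url: str) -> str:
    """Run the async crawler on its own event loop in a worker thread,
    so synchronous callers can invoke it without loop conflicts."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(asyncio.run, crawl_async(url))
        return future.result()

result = crawl_sync("https://example.com")
```

`asyncio.run` creates and tears down a dedicated loop per call, which is exactly the isolation a sync-to-async bridge needs; the thread pool keeps that loop off the caller's thread.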

Accomplishments that we're proud of

End-to-end working pipeline in 24 hours: crawl any site, score it, compress it with a YC-backed tool, rewrite it with Gemini, and visualize the full audit as a Miro mind map — all from a single API call. The Streamlit → Lovable frontend strategy let us validate fast and ship clean without sacrificing either speed or quality.

What we learned

Context window management is a correctness problem, not just a performance one. Combining multiple specialized models (Gemini for analysis, Sentence Transformers for similarity) yields far richer outputs than any single-model approach. And GEO is a real, measurable, underserved problem that every business with a web presence will need to solve.

What's next for Steyer Aura

Continuous monitoring (track your GEO score over time), sector benchmarking across industries, a one-click llms.txt deployment to production sites.

Built With

  • beautiful-soup
  • claude
  • compresr
  • crawl4ai
  • fastapi
  • gemini
  • lovable
  • miro
  • ngrok
  • streamlit
  • supabase
  • tavily
  • transformers
  • uvicorn