Scoutlytics

Gallery: Landing Page; Dashboard; Competitor Citation Heatmap; Competitor Benchmarking; Content Freshness; Download the report in PDF; Generated PDF

About the project

Scoutlytics is an Answer Engine Optimization (AEO) tool that moves beyond diagnosis into execution. Every AEO tool we researched tells you what to fix. Scoutlytics fixes it: it generates deployment-ready assets in minutes, not weeks, and packages them into a professional PDF implementation brief.

Inspiration

AI search engines — ChatGPT, Perplexity, Google AI Overviews, Claude, You.com — now answer queries directly, citing specific web pages. Traditional SEO no longer guarantees visibility. Brands need to appear in the AI-generated answer, not below it.

The tools that exist today (Profound, Scrunch AI, Peec AI, Omnia) all stop at dashboards and recommendations. Recommendations require a strategist to interpret, a developer to implement, and a content writer to execute. That cycle typically takes 3–6 weeks.

We asked: what if a single tool could collapse that entire pipeline — from live citation intelligence to deployment-ready fixes — into one automated run? That's Scoutlytics.

What it does

Input a domain and topic. Receive everything you need to get cited by AI search engines:

Citation Discovery: queries the You.com Search API with 5 query variants to find which URLs AI engines currently cite for your topic
Live Content Extraction: fetches full page content from every cited URL via You.com's livecrawl mode (real-time, not cached)
Pattern Analysis: a custom TypeScript engine clusters cited pages into structural archetypes, identifies gaps, and calculates a Citation Probability Score (0–100)
Asset Generation: the You.com Express Agent produces deployment-ready fixes (rewritten page copy, valid JSON-LD schema, FAQ sections, and content blocks) tailored to the specific gaps identified
PDF Implementation Brief: all assets are rendered into a professional multi-page PDF via Foxit PDF Services, ready to hand off to a developer or content team

Total time: under 90 seconds.

How we built it

Architecture: Next.js 16 App Router with a 5-stage asynchronous pipeline. Each stage is a discrete API route designed to complete within Vercel's 10-second serverless function timeout. The client orchestrates stages sequentially from a loading page with live progress visualization.
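The sequential client-side orchestration described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the stage names, the `StageRunner` signature, and the retry budget are all assumptions. In the real app each runner would presumably wrap a request to one of the five API routes.

```typescript
// Hypothetical stage names; the write-up only says there are five stages.
type StageName = "discover" | "extract" | "analyze" | "generate" | "render";

// A runner resolves when its server-side stage has finished.
// Assumption: in practice this would wrap a request to the stage's API route.
type StageRunner = (analysisId: string) => Promise<void>;

interface PipelineResult {
  completed: StageName[];
  failedAt?: StageName;
}

const STAGE_ORDER: StageName[] = ["discover", "extract", "analyze", "generate", "render"];

async function runPipeline(
  analysisId: string,
  runners: Record<StageName, StageRunner>,
  onProgress: (stage: StageName, index: number) => void,
  maxAttempts = 2, // assumed retry budget per stage
): Promise<PipelineResult> {
  const completed: StageName[] = [];
  let index = 0;
  for (const stage of STAGE_ORDER) {
    let ok = false;
    for (let attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
      try {
        await runners[stage](analysisId);
        ok = true;
      } catch {
        // Retry the same stage; earlier stages are not re-run.
      }
    }
    // Stop at the first stage that exhausts its retries and report partial progress.
    if (!ok) return { completed, failedAt: stage };
    completed.push(stage);
    onProgress(stage, index);
    index++;
  }
  return { completed };
}
```

Because each stage is retried on its own, a transient failure in one route does not force the earlier stages to run again, which is the property the short per-route timeout makes valuable.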

You.com APIs (4 capabilities):

Search API — called with 5 query variants per analysis for citation discovery. This is the ground-truth layer; without it, we have no data on what AI engines actually cite.
Search API (livecrawl) — extracts full live page content for every cited URL plus the user's domain. The structural signals parsed from these pages (headings, schema, FAQ, entities, word count) feed the pattern engine.
Express Agent API — generates deployment-ready assets. Prompts are constructed dynamically from the gap analysis, so output is tailored to each run.
Advanced Agent API — runs deep iterative research with streaming for complex topics, surfacing subtopic coverage and knowledge gaps.
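To make the "structural signals" idea concrete, here is a minimal sketch of pulling a few of them out of crawled markdown. It covers only headings, an FAQ heuristic, and word count; schema and entity detection are omitted. The `StructuralSignals` shape and the "three question headings" threshold are assumptions for illustration, not the project's actual rules.

```typescript
interface StructuralSignals {
  headingCount: number;
  questionHeadings: number; // headings ending in "?", a rough FAQ signal
  hasFaqSection: boolean;
  wordCount: number;
}

function extractSignals(markdown: string): StructuralSignals {
  const lines = markdown.split(/\r?\n/);

  // ATX headings: one to six '#' characters followed by whitespace.
  const headings = lines
    .map((line) => /^#{1,6}\s+(.*)$/.exec(line.trim()))
    .filter((m): m is RegExpExecArray => m !== null)
    .map((m) => m[1].trim());

  const questionHeadings = headings.filter((h) => /\?$/.test(h)).length;
  const namedFaq = headings.some((h) => /\b(faq|frequently asked questions)\b/i.test(h));

  // Replace common markdown punctuation with spaces before counting words.
  const text = markdown.replace(/[#*_`>\[\]()]/g, " ");
  const wordCount = text.split(/\s+/).filter((w) => w.length > 0).length;

  return {
    headingCount: headings.length,
    questionHeadings,
    // Assumed threshold: an explicit FAQ heading, or three or more question headings.
    hasFaqSection: namedFaq || questionHeadings >= 3,
    wordCount,
  };
}
```

Running the same extractor over the user's page and over every cited page would give comparable records that a gap analysis can diff.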

Foxit APIs (2 services + fallback):

PDF Services API — primary output pipeline. The brief is rendered as self-contained HTML → uploaded to Foxit → converted to PDF → polled → downloaded. The template includes structured sections, code blocks, before/after comparisons, and a branded cover page.
Document Generation API — template-driven fallback if PDF Services is unavailable.
DOCX fallback — if both Foxit services fail, generates a DOCX locally via the docx library. The user always gets a downloadable deliverable.
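The three-tier fallback can be expressed as an ordered list of strategies tried in turn. The sketch below is an assumption about the shape of that logic; the `Strategy` and `Deliverable` types are hypothetical, and the real strategies (the two Foxit services and the local docx generator) are left as caller-supplied functions rather than guessed at.

```typescript
interface Deliverable {
  format: "pdf" | "docx";
  strategy: string;
  bytes: Uint8Array;
}

// Each strategy turns the rendered HTML brief into a downloadable file, or throws.
interface Strategy {
  name: string;
  run: (html: string) => Promise<Deliverable>;
}

async function renderWithFallback(html: string, strategies: Strategy[]): Promise<Deliverable> {
  const errors: string[] = [];
  for (const strategy of strategies) {
    try {
      // The first strategy that succeeds wins; later ones are never invoked.
      return await strategy.run(html);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${strategy.name}: ${message}`);
    }
  }
  // Only reached when every strategy failed; surface all reasons together.
  throw new Error(`All strategies failed: ${errors.join("; ")}`);
}
```

Collecting the per-strategy errors, rather than discarding them, keeps the failure of the primary path visible even when a fallback ultimately succeeds or the whole chain fails.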

Data layer: Hybrid persistence — in-memory Map for fast access during analysis, Supabase as the durable backing store for dashboard history.
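One way to picture the hybrid persistence is a write-through cache: the Map serves reads within a warm instance, and the durable store covers cold starts, since serverless instances do not share memory. The sketch below is illustrative only. `DurableStore` is a hypothetical interface standing in for whatever Supabase adapter the project uses, and no Supabase calls are shown.

```typescript
// Hypothetical adapter interface; a Supabase-backed implementation would satisfy it.
interface DurableStore<T> {
  load(id: string): Promise<T | undefined>;
  save(id: string, value: T): Promise<void>;
}

class HybridStore<T> {
  private cache = new Map<string, T>();
  private durable: DurableStore<T>;

  constructor(durable: DurableStore<T>) {
    this.durable = durable;
  }

  async get(id: string): Promise<T | undefined> {
    const hit = this.cache.get(id);
    if (hit !== undefined) return hit; // fast path: same warm instance
    const loaded = await this.durable.load(id); // cold path: another instance wrote it
    if (loaded !== undefined) this.cache.set(id, loaded);
    return loaded;
  }

  async set(id: string, value: T): Promise<void> {
    this.cache.set(id, value);
    await this.durable.save(id, value); // write-through so later stages can resume
  }
}
```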

Scoring: Citation Probability Score is calculated from weighted structural signals:

$$S = S_{\text{base}} + S_{\text{citation}} + S_{\text{schema}} + S_{\text{faq}} + S_{\text{depth}} + S_{\text{headings}} + S_{\text{entities}}$$

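where $S_{\text{base}} = 10$, $S_{\text{citation}} \leq 30$, $S_{\text{schema}} \leq 15$, $S_{\text{faq}} \leq 12$, $S_{\text{depth}} \leq 15$, $S_{\text{headings}} \leq 12$, $S_{\text{entities}} \leq 6$, capped at 100. The component maxima sum to exactly 100 ($10 + 30 + 15 + 12 + 15 + 12 + 6$), so the cap acts as a safeguard rather than an active limit.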
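As a sketch of how the formula combines, the function below adds the base to each component after clamping it to its stated maximum. The caps and base come from the formula above; how each raw component is derived from page signals is not described in this write-up, so the function simply takes the raw values as input. The names `CAPS` and `citationProbabilityScore` are hypothetical.

```typescript
const BASE = 10;

// Per-component maxima, taken from the formula's stated bounds.
const CAPS = {
  citation: 30,
  schema: 15,
  faq: 12,
  depth: 15,
  headings: 12,
  entities: 6,
} as const;

type Component = keyof typeof CAPS;

function citationProbabilityScore(raw: Record<Component, number>): number {
  let total = BASE;
  for (const key of Object.keys(CAPS) as Component[]) {
    // Clamp each component into [0, cap] so one strong signal cannot dominate.
    total += Math.min(Math.max(raw[key], 0), CAPS[key]);
  }
  // Final cap at 100, as in the formula.
  return Math.min(Math.round(total), 100);
}
```

For example, a page with raw components citation 18, schema 15, faq 0, depth 9, headings 8, and entities 3 scores 10 + 18 + 15 + 0 + 9 + 8 + 3 = 63.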

Challenges we ran into

Vercel's 10-second timeout forced us to split what is logically one pipeline into 5 independent API routes, each with its own error handling and state persistence. The client-side orchestration had to be resilient to partial failures.
You.com livecrawl returns markdown, not HTML, which means we couldn't extract existing schema markup directly from the crawled content. We had to build heuristic detection from the markdown structure instead.
JSON-LD escaping — the Express Agent returns JSON with literal escape sequences (\n, \"). We wrote a multi-layer parsing pipeline: JSON.parse → double-encoded string detection → manual unescape fallback → clean re-stringify.
@graph schema validation — standard JSON-LD validators choke on @graph wrapper structures. We built custom logic to detect @graph arrays, validate @context at the parent level, and iterate items individually.
Foxit PDF rendering required engineering a self-contained HTML template with inline CSS that renders correctly across PDF conversion — no external stylesheets, no asset references, print-optimized page breaks.
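The JSON-LD parsing and @graph checks described above might look roughly like this. It is a simplified sketch under stated assumptions: `parseAgentJson` and `validateJsonLd` are hypothetical names, the manual unescape handles only `\n` and `\"`, and the validation checks only for a top-level `@context` and a per-item `@type`, which is far less than a full JSON-LD validator does.

```typescript
type JsonObject = { [key: string]: unknown };

function isObject(value: unknown): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Layered parse: try the raw text, then a manually unescaped copy.
// Within each attempt, a string result is treated as double-encoded JSON.
function parseAgentJson(raw: string): JsonObject | null {
  const trimmed = raw.trim();
  const unescaped = trimmed.replace(/\\n/g, "\n").replace(/\\"/g, '"');
  for (const candidate of [trimmed, unescaped]) {
    try {
      let value: unknown = JSON.parse(candidate);
      if (typeof value === "string") value = JSON.parse(value);
      if (isObject(value)) return value;
    } catch {
      // Fall through to the next, more aggressive attempt.
    }
  }
  return null;
}

// Returns a list of problems; an empty list means the basic checks passed.
function validateJsonLd(doc: JsonObject): string[] {
  const problems: string[] = [];
  // With @graph, @context lives on the wrapper, not on each item.
  if (doc["@context"] === undefined) problems.push("missing @context at top level");
  const graph = doc["@graph"];
  const items: unknown[] = Array.isArray(graph) ? graph : [doc];
  items.forEach((item, i) => {
    if (!isObject(item)) problems.push(`item ${i} is not an object`);
    else if (item["@type"] === undefined) problems.push(`item ${i} is missing @type`);
  });
  return problems;
}
```

The manual unescape is deliberately tried last: applied to already-valid JSON it would corrupt strings that legitimately contain escaped quotes.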

Accomplishments that we're proud of

Zero-to-deliverable in under 90 seconds. A complete competitive analysis, gap identification, asset generation, and professional PDF brief, fully automated.
Three-strategy document resilience. Foxit PDF Services → Foxit Document Generation → local DOCX. The user always gets their document.
The pipeline is not synthetic. Every You.com API call serves a distinct, necessary function. Remove any one and the pipeline breaks. The same is true for Foxit — the PDF brief is the product, not a demo feature.
Citation Probability Score provides a quantified, reproducible metric where the industry currently relies on qualitative guesses.
Live, not cached. All content extraction uses You.com's livecrawl mode — judges can verify the data is real-time.

What we learned

AEO is a real gap in the market. Every tool we researched stops at recommendations. The execution gap is where all the value is.
You.com's API surface is surprisingly deep. The combination of Search + livecrawl + Express Agent + Advanced Agent covers the full spectrum from data retrieval to content generation. We didn't need any other AI provider.
PDF generation is harder than it looks. Browser-rendered HTML and PDF-rendered HTML behave differently. Inline CSS, careful section breaking, and self-contained templates are non-negotiable for consistent output.
Serverless constraints shape architecture. The 10-second timeout is a hard wall that forced better design: each stage is independently retryable, and the state model had to support partial progress.

What's next for Scoutlytics

Scheduled monitoring — automated recurring analyses that track citation status over time and alert when a competitor gains or loses a citation
CMS integrations — one-click deployment of generated assets directly to WordPress, Webflow, and Shopify
Multi-language AEO — citation patterns differ across languages and regions; expanding query variants and content generation to non-English markets
Browser extension — a lightweight overlay that shows citation probability scores while browsing any page
Team collaboration — shared workspaces, role-based access, and approval workflows for enterprise AEO teams

Built With

foxit
next.js
node.js
react
supabase
tailwind
typescript
vercel
you.com

Updates

Private user started this project — Feb 20, 2026 10:38 AM EST
