Inspiration

Landing pages fail quietly.
Founders, marketers, and builders spend hours tweaking copy, colors, and layouts—but still don’t know why users bounce or conversions stall. Most tools show metrics, not reasons.

We wanted to build something that behaves like a senior SEO and CRO expert sitting beside you, opening your landing page, seeing what users see, and telling you—clearly and honestly—what’s going wrong.

With the rise of multimodal AI, we realized Gemini could do more than generate text. It could look at a page, reason about it, and explain its impact on user behavior. That became the core idea behind LandingPageRoasterAI.


What it does

LandingPageRoasterAI takes a landing page URL and:

  • Opens the page in a real browser
  • Captures the visible content and layout
  • Analyzes the page using Gemini’s multimodal reasoning
  • Produces an expert-level audit with:
    • A bold, attention-grabbing roast
    • A realistic conversion score
    • Clear diagnosis of what’s hurting SEO and conversions
    • Actionable guidance on how to fix it

Instead of generic advice, the feedback is grounded in what’s actually visible on the page—copy, hierarchy, CTAs, trust signals, and UX clarity.
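The audit described above can be pictured as a small structured object. This is only an illustrative sketch; the field names (`roast`, `conversion_score`, `issues`, `fixes`) are our shorthand here, not the project's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class LandingPageAudit:
    """Illustrative shape of one audit result (names are hypothetical)."""
    roast: str                  # bold, attention-grabbing roast
    conversion_score: int       # realistic score, e.g. on a 0-100 scale
    issues: list[str] = field(default_factory=list)  # what's hurting SEO/conversions
    fixes: list[str] = field(default_factory=list)   # actionable guidance

audit = LandingPageAudit(
    roast="Your hero copy says everything and therefore nothing.",
    conversion_score=42,
    issues=["No visible CTA above the fold"],
    fixes=["Move the primary CTA into the hero section"],
)
```

Keeping the output this structured is what lets the frontend render the roast and the diagnosis as separate, predictable sections.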


How we built it

We designed the system as a grounded, agent-like pipeline:

  1. The frontend collects a landing page URL
  2. The backend uses Playwright to load the page like a real user
  3. We extract:
    • Visible page text
    • A full-page screenshot
  4. This real data is passed to Gemini for multimodal analysis
  5. Gemini reasons over the content and returns a structured audit and roast
  6. The frontend renders the results in a clear, readable format

By separating observation (what exists on the page) from judgment (why it fails or succeeds), we ensured the AI’s output is grounded, explainable, and consistent.


Challenges we ran into

One of the biggest challenges was realizing that AI cannot meaningfully judge a website from a URL alone. Without real page data, the model naturally converges to average, generic outputs.

We also faced challenges enforcing strict output formats from a large language model while still allowing expressive, human-like feedback. Solving this required carefully designing prompt contracts and output schemas so Gemini could reason freely without breaking the system.
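One way to picture such a contract: the prompt instructs the model to reply with JSON only, and the backend validates the reply before anything reaches the frontend. A minimal sketch, with hypothetical field names:

```python
import json

# Hypothetical contract: field name -> required Python type.
REQUIRED_FIELDS = {"roast": str, "conversion_score": int, "issues": list, "fixes": list}

def parse_audit(raw: str) -> dict:
    """Parse the model's JSON reply and reject contract violations."""
    data = json.loads(raw)
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected):
            raise ValueError(f"contract violation: field '{name}' missing or wrong type")
    return data

reply = '{"roast": "Bold claim, zero proof.", "conversion_score": 55, "issues": [], "fixes": []}'
audit = parse_audit(reply)
```

The roast text itself stays free-form, so the model can be expressive inside fields while the envelope stays machine-checkable.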

Balancing multimodal input size, response latency, and reliability within a hackathon timeframe was another key challenge.


Accomplishments that we're proud of

  • Built a fully grounded, multimodal Gemini application
  • Used real browser rendering instead of static assumptions
  • Created an AI system that explains why pages fail, not just what to change
  • Designed a stable AI-to-backend contract suitable for real products
  • Delivered an end-to-end working system within a short hackathon window

What we learned

  • Multimodal AI is most powerful when grounded in real data
  • Clear input–output contracts are essential when building with LLMs
  • “Reasoning” is not magic—it must be carefully constrained and validated
  • AI products succeed when they explain decisions, not just generate output

Most importantly, we learned that AI can act like a true expert—when given the right context and boundaries.


What's next for LandingPageRoasterAI

  • Add mobile-specific audits and device previews
  • Introduce section-level scoring and prioritization
  • Allow comparison between multiple landing pages
  • Expand beyond landing pages to onboarding flows and product pages
  • Enable continuous monitoring as pages evolve

Our long-term vision is to make expert-level UX and conversion insights accessible to anyone building on the web.
