Sourcerer

Inspiration

AI can already teach almost anything, but it can also hallucinate, and beginners often don’t know when an answer is wrong. We noticed that people trust learning on platforms like YouTube, blogs, or forums not because they are perfect, but because knowledge there is challenged, corrected, and discussed.

AI chats today are isolated. There’s no visible debate, no peer review, and no sourcing.

We built Sourcerer to add that missing trust layer to AI learning by combining critique, sourcing, and transparency into one system.

What it does

Sourcerer is an AI learning platform that transforms tutoring conversations into sourced, reviewable study posts.

Users learn through an AI tutor chat.
The conversation is turned into a structured study post.
AI reviewer agents critique the content:
- Skeptic AI challenges weak reasoning
- Fact-Checker AI flags hallucinations
- Beginner AI asks clarifying questions
- Explainer AI improves clarity
- Consensus AI summarizes trust and uncertainty
A browser-grounded verifier checks important claims and attaches sources.
A trust score shows which parts are reliable or uncertain.
Users can view everything in:
- Thread View (Reddit-style comments)
- Visual Review View (AI reviewers attached to exact paragraphs)

Instead of trusting one answer, learners can see knowledge being challenged, sourced, and improved in real time.
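
To make the trust score idea concrete, here is a minimal Python sketch of one way a score could be aggregated from checked claims. The `Verdict` labels, the weights, the 0.75 threshold, and the function names are all illustrative assumptions, not Sourcerer's actual scoring logic.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    """Hypothetical outcome labels for a claim checked by the verifier."""
    SUPPORTED = "supported"
    UNCERTAIN = "uncertain"
    CONTRADICTED = "contradicted"


# Assumed weights: a supported claim counts fully, an uncertain one half.
WEIGHTS = {
    Verdict.SUPPORTED: 1.0,
    Verdict.UNCERTAIN: 0.5,
    Verdict.CONTRADICTED: 0.0,
}


@dataclass
class CheckedClaim:
    text: str
    verdict: Verdict
    sources: list[str]  # URLs the verifier attached, possibly empty


def trust_score(claims: list[CheckedClaim]) -> float:
    """Average the verdict weights; with no checked claims, return 0.5 (unknown)."""
    if not claims:
        return 0.5
    return sum(WEIGHTS[c.verdict] for c in claims) / len(claims)


def label(score: float) -> str:
    """Map a score to the reliable/uncertain wording shown to learners."""
    return "reliable" if score >= 0.75 else "uncertain"
```

A simple average like this is easy to explain to learners, which may matter more than precision when the goal is to show uncertainty rather than hide it.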

How we built it

We built Sourcerer as a multi-agent AI system with browser-grounded verification and an interactive frontend.

Backend (Python + FastAPI)

FastAPI + Uvicorn for a clean REST API (/ask, /chat, /convert, /reply)
A central run_pipeline() function orchestrates all agents:
- Generator (initial answer)
- Critic agents (multi-perspective review using Claude Haiku)
- Verifier agent (browser grounding with Stagehand + Browserbase)
- Teacher (final synthesis using Claude Sonnet)
- Consensus (trust summary and scoring)
Anthropic SDK (Claude) powers all reasoning:
- Haiku for lightweight critique and evaluation
- Sonnet for generation, verification reasoning, and final teaching output
Stagehand + Browserbase enable live web browsing so the verifier can fetch and attach real evidence to claims.
Arize Phoenix + OpenTelemetry provide traceability across the pipeline, letting us see where claims were introduced, challenged, or corrected.
Pydantic + dotenv manage data models and environment configuration.
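
The stage order described above can be sketched as follows. This is a simplified illustration, not the real `run_pipeline()`: the signature, the `PipelineResult` fields, and the injected callables are assumptions, the Claude and Stagehand calls are replaced by plain functions, and dataclasses stand in for the Pydantic models so the sketch has no dependencies.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineResult:
    draft: str
    critiques: dict[str, str] = field(default_factory=dict)
    evidence: dict[str, list[str]] = field(default_factory=dict)
    final_answer: str = ""
    trust_summary: str = ""


def run_pipeline(
    question: str,
    generate: Callable[[str], str],
    critics: dict[str, Callable[[str], str]],
    extract_claims: Callable[[str], list[str]],
    verify: Callable[[str], list[str]],
    teach: Callable[[str, dict[str, str], dict[str, list[str]]], str],
    summarize: Callable[[dict[str, str], dict[str, list[str]]], str],
    max_verified_claims: int = 3,
) -> PipelineResult:
    """Run the stages in order: generator, critics, verifier, teacher, consensus."""
    result = PipelineResult(draft=generate(question))
    # Each critic reviews the same draft independently.
    for name, critic in critics.items():
        result.critiques[name] = critic(result.draft)
    # Browser verification is slow, so only a capped number of claims is checked.
    # Taking the first N is an assumed stand-in for picking the "key" claims.
    for claim in extract_claims(result.draft)[:max_verified_claims]:
        result.evidence[claim] = verify(claim)
    result.final_answer = teach(result.draft, result.critiques, result.evidence)
    result.trust_summary = summarize(result.critiques, result.evidence)
    return result


# Usage with stub agents in place of real model and browser calls.
result = run_pipeline(
    "What is photosynthesis?",
    generate=lambda q: "Plants convert light into chemical energy. Chlorophyll absorbs light.",
    critics={
        "skeptic": lambda draft: "Which wavelengths are absorbed?",
        "beginner": lambda draft: "What is chlorophyll?",
    },
    extract_claims=lambda draft: [s.strip() for s in draft.split(".") if s.strip()],
    verify=lambda claim: ["https://example.com/source"],  # placeholder URL
    teach=lambda draft, critiques, evidence: draft + " (revised)",
    summarize=lambda critiques, evidence: f"{len(evidence)} claims checked, {len(critiques)} reviews",
    max_verified_claims=1,
)
print(result.trust_summary)  # prints "1 claims checked, 2 reviews"
```

Passing the agents in as callables keeps the orchestration testable without API keys, and the `max_verified_claims` cap reflects the grounding versus speed trade-off.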

Frontend (React + TypeScript)

React + Vite + TypeScript for fast, modular UI development
Tailwind CSS for clean, responsive styling
Lucide React for icons
Built two key interfaces:
- Thread View (Reddit-style discussion)
- Visual Review View (AI reviewers attached to paragraphs)

Integrations

Anthropic (Claude) for all core intelligence
Browserbase + Stagehand for browser-grounded verification
Arize Phoenix for observability and evaluation
Fetch.ai (uAgents / ASI:One) to optionally expose Sourcerer as a discoverable AI tutor agent

We focused on a polished vertical slice that clearly demonstrates the product experience rather than building a full production system.

Deployment

Frontend: Deployed on Vercel for fast, globally distributed hosting of our React application.
Backend: Deployed on Render, running our FastAPI-based multi-agent pipeline.

This combination allowed us to move quickly during the hackathon while maintaining a responsive UI and a reliable backend service for AI orchestration.

Challenges we ran into

Designing useful AI critique: Multiple agents can easily overwhelm users. We had to carefully define roles so each comment added distinct value.
Linking comments to exact content: Attaching feedback to specific paragraphs required structuring outputs beyond typical LLM responses.
Balancing grounding vs speed: Browser-based verification is powerful but slow, so we limited it to key claims for the demo.
Keeping the demo polished: We had to balance backend complexity with a UI that judges could immediately understand.
Time constraints: Building a multi-agent system, frontend experience, and integrations in 24 hours required aggressive prioritization and fallback strategies.
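
As an illustration of the paragraph-linking challenge, here is a hedged Python sketch of one way to anchor reviewer comments to paragraphs. It assumes each reviewer returns a JSON array of objects with `reviewer`, `paragraph_index`, and `body` fields. That schema and all function names are hypothetical, not the format Sourcerer actually uses.

```python
import json
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ReviewComment:
    reviewer: str         # e.g. "skeptic" or "fact_checker"
    paragraph_index: int  # which paragraph of the study post this targets
    body: str


def split_paragraphs(post: str) -> list[str]:
    """Split a study post on blank lines so each paragraph gets a stable index."""
    return [p.strip() for p in post.split("\n\n") if p.strip()]


def parse_comments(raw_json: str, paragraph_count: int) -> list[ReviewComment]:
    """Parse reviewer JSON, dropping comments that point at a nonexistent paragraph."""
    comments = []
    for item in json.loads(raw_json):
        index = item.get("paragraph_index")
        if isinstance(index, int) and 0 <= index < paragraph_count:
            comments.append(ReviewComment(item["reviewer"], index, item["body"]))
    return comments


def group_by_paragraph(comments: list[ReviewComment]) -> dict[int, list[ReviewComment]]:
    """Bucket comments by paragraph, the shape a paragraph-level review view needs."""
    grouped = defaultdict(list)
    for comment in comments:
        grouped[comment.paragraph_index].append(comment)
    return dict(grouped)
```

Validating the index before accepting a comment guards against a model pointing at a paragraph that does not exist, which would otherwise leave feedback with nowhere to attach.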

Accomplishments that we're proud of

Built a system where AI answers are challenged, sourced, and defended, not just generated
Created a visual review experience where AI agents interact with exact parts of content
Integrated browser-grounded verification into the learning flow
Designed a flexible architecture that supports multiple AI providers
Delivered a compelling demo showing how a hallucination or overstatement is flagged and improved
Made AI learning feel interactive, transparent, and collaborative

What we learned

AI is incredibly helpful for learning, but trust and transparency are just as important as accuracy
Multi-agent systems are only effective when outputs are structured and interpretable
Grounding answers with external sources significantly increases user confidence
UI/UX matters deeply: how you show uncertainty can be as important as detecting it
The best demos are not just technically complex; they are intuitive and tell a clear story