## Inspiration
Knowledge takes a decade to reach people. Research happens, then years of validation, practitioner adoption, textbooks, courses — and by the time it's accessible, the frontier has moved. The breakthroughs that would unlock someone's potential never reach them in time.
I've felt this personally. By the time my mental model catches up to a field, the ideas I'm excited about are already mature somewhere else — people have built products, published papers, moved on. It's not that talent is missing. It's that ideas have a transmission delay, and that delay is brutal. The same labs and the same places produce most breakthroughs not because talent only exists there, but because cutting-edge knowledge reaches them first.
Ideas are software for humans. If yours is outdated, your intelligence and talent are wasted — you're solving problems that have already been solved, missing connections that already exist, building things that are already obsolete.
I wanted to remove that friction entirely. Not with better search, not with summaries, but by synthesizing the exact book each person needs — from wherever they are to the cutting edge of any field. A book that doesn't exist yet, written for one reader, grounded in the latest research, in hours instead of years.
## What it does
Polaris synthesizes personalized, research-backed books from scratch. You provide your topic, professional domain, background, and goal — and it generates a 100-800 page book tailored to exactly one reader.
This isn't summarization or adaptation. Every book is synthesized from the ground up: structure follows research, examples speak your professional language, and content bridges the gap between what you know and the cutting edge.
The pipeline orchestrates four Gemini 3 capabilities across 16 stages:
- Gemini Deep Research discovers cutting-edge papers and frameworks before the outline is finalized — structure follows research, not the other way around
- Gemini 3 Flash powers 90+ type-safe DataModels through the Synalinks neuro-symbolic framework — vision, planning, self-critique, and bottom-up content assembly
- Gemini 3 Pro Image generates book covers (15 artistic styles) and in-chapter illustrations with dynamic prompts tailored to each book's content
- Gemini Search Grounding powers a claim-first citation pipeline — claims are planned before writing, verified against primary sources, and rejected if unverifiable
Beyond surface-level generation, Polaris fetches full-text papers from arXiv, extracts their content, and feeds them into a knowledge graph powered by Graphiti for advanced graph reasoning — connecting concepts, methods, and findings across dozens of papers. Every chapter is grounded in actual research, not LLM memory.
The result is a book with verified citations, domain-specific examples, and knowledge graph-grounded reasoning — delivered through a web platform where you fill out a 5-step form and download your PDF.
## How we built it
Polaris has two components: a Python generation engine and a Next.js web platform.
Generation Engine — Built on Synalinks, a neuro-symbolic framework providing type-safe structured outputs via DataModels (90+), intelligent control flow via Branch, and LLM-based routing via Decision. The 16-stage pipeline:
- Deep Research — Gemini Deep Research API discovers the latest papers and frameworks
- Book Vision — Synalinks Branch decides reader mode (practitioner/academic/hybrid), reshaping the entire book
- Outline Generation — Multi-angle concept extraction with coverage verification loops
- Research-Informed Restructuring — Discovered papers reshape the outline before finalization
- Chapter Prioritization — Role-tagged chapters (IMPLEMENTATION, FRONTIERS, etc.) selected by reader's goal
- Hierarchical Planning — Book → chapter → section plans, each with self-critique loops
- Two-Level Research Distribution — Each paper → one chapter, each subsection → unique concepts and example domain
- Stage 2 Research — arXiv full-text fetching, PDF extraction, and knowledge graph integration via Graphiti MCP
- Citation Pipeline — Claims planned per subsection → verified with Gemini Search Grounding → rejected if unverifiable
- Content Generation — Bottom-up assembly with full book context at every level
- Illustrations — Mermaid diagrams + Gemini 3 Pro Image
- Cover Generation — Dynamic prompts rendered by Gemini 3 Pro Image (15 styles)
- PDF Assembly — Markdown → HTML → PDF via WeasyPrint
All intermediates are cached, enabling resume from any stage.
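The caching idea reduces to a small pattern. The sketch below is illustrative, not Polaris's actual code: `cached_stage` and the JSON-on-disk layout are hypothetical names, but the mechanism (serialize each stage's output keyed by stage name, reuse it on rerun) is what makes resume-from-any-stage work.

```python
import json
from pathlib import Path

CACHE_DIR = Path("cache")

def cached_stage(name):
    """Skip a pipeline stage if its output is already cached on disk."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            CACHE_DIR.mkdir(exist_ok=True)
            path = CACHE_DIR / f"{name}.json"
            if path.exists():                    # resume: reuse prior result
                return json.loads(path.read_text())
            result = fn(*args, **kwargs)
            path.write_text(json.dumps(result))  # persist for future resumes
            return result
        return wrapper
    return decorator

@cached_stage("outline")
def generate_outline(topic):
    # Stands in for an expensive LLM call.
    return {"topic": topic, "chapters": ["Intro", "Methods"]}
```

A second call to `generate_outline` with the same stage name returns the cached result without re-running the body, which is exactly the property a 16-stage pipeline needs to restart cheaply.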
Web Platform — Next.js 14 with a 5-step book builder wizard, real-time progress tracking across 9 pipeline stages, and PDF/Markdown download. Modular frontend (12 components, 2 custom hooks) served by a FastAPI backend with background job tracking.
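The backend's background-job tracking boils down to a pattern like the following plain-Python sketch (names such as `start_job` and the `JOBS` dict are hypothetical, and the real backend uses FastAPI rather than bare threads): each request gets a job ID, the pipeline runs in a worker, and the frontend polls the job's status and current stage.

```python
import threading
import uuid

JOBS = {}  # job_id -> {"status": ..., "stage": ..., "result": ...}

def start_job(pipeline, *args):
    """Launch a pipeline run in the background; return a pollable job ID."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "running", "stage": 0, "result": None}

    def run():
        try:
            JOBS[job_id]["result"] = pipeline(job_id, *args)
            JOBS[job_id]["status"] = "done"
        except Exception:
            JOBS[job_id]["status"] = "failed"

    threading.Thread(target=run, daemon=True).start()
    return job_id

def report_stage(job_id, stage):
    """Called from inside the pipeline so the UI can show live progress."""
    JOBS[job_id]["stage"] = stage
```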
## Challenges we ran into
Repetition at scale — LLMs naturally repeat themselves across chapters. We solved this with two-level research distribution: each paper is assigned to exactly one chapter via `synalinks.Decision` (LLM-based matching, not keywords), and each subsection gets only its unique concepts with a unique example domain.
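The exclusivity idea can be shown without an LLM in the loop. In this sketch, a plain `score` function stands in for `synalinks.Decision`, and round-robin distribution stands in for the concept-assignment step; the invariants (one chapter per paper, no concept repeated across subsections) are the point.

```python
def assign_papers(papers, chapters, score):
    """Give each paper to exactly one chapter: its best-scoring match."""
    assignment = {ch: [] for ch in chapters}
    for paper in papers:
        best = max(chapters, key=lambda ch: score(paper, ch))
        assignment[best].append(paper)  # exclusivity: one chapter per paper
    return assignment

def distribute_concepts(subsections, concepts):
    """Spread concepts so no two subsections repeat one."""
    plan = {s: [] for s in subsections}
    for i, concept in enumerate(concepts):
        plan[subsections[i % len(subsections)]].append(concept)
    return plan
```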
Citation hallucination — LLMs confidently cite papers that don't exist. Our claim-first pipeline inverts the typical approach: plan what needs citing before writing, verify each claim against primary sources using Gemini Search Grounding, and reject anything unverifiable. Only original papers and documentation qualify.
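In outline form, the claim-first loop looks like the sketch below. The `verify` callable is a placeholder for the Gemini Search Grounding check; the key behavior is that unverifiable claims are dropped rather than cited on faith.

```python
def cite_subsection(planned_claims, verify):
    """Keep only claims that verification ties to a primary source."""
    verified, rejected = [], []
    for claim in planned_claims:
        source = verify(claim)       # returns a primary source or None
        if source is not None:
            verified.append({"claim": claim, "source": source})
        else:
            rejected.append(claim)   # conservative: drop rather than guess
    return verified, rejected
```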
Structural coherence — A 200-page book generated subsection-by-subsection loses the thread. Hierarchical planning (book → chapter → section plans with mode-aware critique loops) and full-context generation — every subsection sees the complete book plan — solve this.
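The self-critique loop at each planning level is conceptually simple. In this sketch, `draft`, `critique`, and `revise` stand in for LLM calls; the loop exits when the critic returns no feedback or the round budget runs out.

```python
def plan_with_critique(draft, critique, revise, max_rounds=3):
    """Draft a plan, then revise until the critic accepts or rounds run out."""
    plan = draft()
    for _ in range(max_rounds):
        feedback = critique(plan)
        if feedback is None:          # critic is satisfied
            break
        plan = revise(plan, feedback)
    return plan
```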
Deep Research integration — The Gemini Deep Research API is asynchronous with variable completion times. We built a polling wrapper with caching, parallel execution with semaphores, and graceful fallback when research is unavailable.
Full-text paper processing — Abstracts aren't enough for deep reasoning. We built an arXiv pipeline that resolves paper IDs via Gemini Search Grounding, downloads full PDFs, extracts text via arxiv2text, and feeds complete papers into the knowledge graph for entity and relationship extraction.
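The shape of that pipeline, with the network and extraction steps stubbed out: `resolve_id`, `download_pdf`, and `extract_text` are placeholders for Search Grounding, the arXiv download, and arxiv2text respectively, and `graph` stands in for the Graphiti knowledge graph.

```python
def ingest_paper(title, resolve_id, download_pdf, extract_text, graph):
    """Resolve title -> arXiv ID -> full text -> knowledge-graph entry."""
    arxiv_id = resolve_id(title)
    if arxiv_id is None:
        return False                 # skip papers we cannot resolve
    pdf_bytes = download_pdf(arxiv_id)
    text = extract_text(pdf_bytes)
    graph.append({"id": arxiv_id, "title": title, "text": text})
    return True
```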
## Accomplishments that we're proud of
- Research-first architecture — Papers are discovered before the outline exists. This single decision fundamentally changes output quality
- Zero hallucinated citations — Every citation is verified against a primary source. If it can't be verified, it's rejected
- Two-level distribution pattern — Paper exclusivity + concept exclusivity + domain exclusivity per subsection solves repetition in long-form LLM generation. The pattern generalizes beyond books
- 90+ type-safe DataModels — Every piece of data flowing through 16 stages has a schema enforced by Synalinks, preventing hallucination cascades
- Knowledge graph-grounded content — Papers aren't just cited — their concepts, methods, and findings are connected through Graphiti's graph reasoning
- It actually works — The pipeline generates real books with real citations from real papers, and you can download the PDF
## What we learned
- Research-first architecture produces a fundamentally different (and better) book structure than generate-then-research
- Type-safe structured outputs via Synalinks DataModels are non-negotiable for multi-stage pipelines — one malformed output cascades across 16 stages
- Conservative citation (reject when uncertain) produces far more trustworthy results than aggressive citation
- LLM-based matching (`synalinks.Decision`) crushes keyword matching for paper-to-chapter assignment
- Gemini Search Grounding is the most reliable method for resolving arXiv paper IDs — more reliable than fuzzy title matching
- The knowledge pipeline problem is real — the gap between frontier research and accessibility is about transmission speed, not talent
## What's next for Polaris - the book YOU need
- Multi-language synthesis — True synthesis in any language with culturally relevant examples, not translation
- Interactive learning paths — Adaptive curricula with exercises tailored to your level
- Collaborative editing — Human-in-the-loop refinement where domain experts review and improve chapters
- Real-time knowledge updates — Detect new papers and offer chapter updates automatically
- Team knowledge bases — Shared reference books for teams entering a new domain, with role-specific chapters
- API for education platforms — Let any learning platform generate personalized textbooks on demand