## Inspiration
Knowledge takes a decade to reach people. Research happens, then years of validation, practitioner adoption, textbooks, courses — and by the time it's accessible, the frontier has moved. The breakthroughs that would unlock someone's potential never reach them in time.
I've felt this personally. By the time my mental model catches up to a field, the ideas I'm excited about are already mature somewhere else — people have built products, published papers, moved on. It's not that talent is missing. It's that ideas have a transmission delay, and that delay is brutal. The same labs and the same places produce most breakthroughs not because talent only exists there, but because cutting-edge knowledge reaches them first.
Ideas are software for humans. If yours is outdated, your intelligence and talent are wasted — you're solving problems that have already been solved, missing connections that already exist, building things that are already obsolete.
I wanted to remove that friction entirely. Not with better search, not with summaries, but by synthesizing the exact book each person needs — from wherever they are to the cutting edge of any field. A book that doesn't exist yet, written for one reader, grounded in the latest research, in hours instead of years.
## What it does
Polaris synthesizes personalized, research-backed books from scratch. You provide your topic, professional domain, background, and goal — and it generates a 100-800 page book tailored to exactly one reader.
This isn't summarization or adaptation. Every book is synthesized from the ground up: structure follows research, examples speak your professional language, and content bridges the gap between what you know and the cutting edge.
The pipeline orchestrates four Gemini 3 capabilities across 16 stages:
- Gemini Deep Research discovers cutting-edge papers and frameworks before the outline is finalized — structure follows research, not the other way around
- Gemini 3 Flash powers 90+ type-safe DataModels through the Synalinks neuro-symbolic framework — vision, planning, self-critique, and bottom-up content assembly
- Gemini 3 Pro Image generates book covers (15 artistic styles) and in-chapter illustrations with dynamic prompts tailored to each book's content
- Gemini Search Grounding powers a claim-first citation pipeline — claims are planned before writing, verified against primary sources, and rejected if unverifiable
Beyond surface-level generation, Polaris fetches full-text papers from arXiv, extracts their content, and feeds them into a knowledge graph powered by Graphiti for advanced graph reasoning — connecting concepts, methods, and findings across dozens of papers. Every chapter is grounded in actual research, not LLM memory.
The result is a book with verified citations, domain-specific examples, and knowledge graph-grounded reasoning — delivered through a web platform where you fill out a 5-step form and download your PDF.
## How we built it
Polaris has two components: a Python generation engine and a Next.js web platform.
Generation Engine — Built on Synalinks, a neuro-symbolic framework providing type-safe structured outputs via DataModels (90+), intelligent control flow via Branch, and LLM-based routing via Decision. The 16-stage pipeline:
- Deep Research — Gemini Deep Research API discovers the latest papers and frameworks
- Book Vision — Synalinks Branch decides reader mode (practitioner/academic/hybrid), reshaping the entire book
- Outline Generation — Multi-angle concept extraction with coverage verification loops
- Research-Informed Restructuring — Discovered papers reshape the outline before finalization
- Chapter Prioritization — Role-tagged chapters (IMPLEMENTATION, FRONTIERS, etc.) selected by reader's goal
- Hierarchical Planning — Book → chapter → section plans, each with self-critique loops
- Two-Level Research Distribution — Each paper → one chapter, each subsection → unique concepts and example domain
- Stage 2 Research — arXiv full-text fetching, PDF extraction, and knowledge graph integration via Graphiti MCP
- Citation Pipeline — Claims planned per subsection → verified with Gemini Search Grounding → rejected if unverifiable
- Content Generation — Bottom-up assembly with full book context at every level
- Illustrations — Mermaid diagrams + Gemini 3 Pro Image
- Cover Generation — Dynamic prompts rendered by Gemini 3 Pro Image (15 styles)
- PDF Assembly — Markdown → HTML → PDF via WeasyPrint
All intermediates are cached, enabling resume from any stage.
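The caching idea reduces to a small pattern. The sketch below is illustrative, not Polaris's actual code: `cached_stage` and the JSON-on-disk layout are hypothetical names, but the mechanism (serialize each stage's output keyed by stage name, reuse it on rerun) is what makes resume-from-any-stage work.

```python
import json
from pathlib import Path

CACHE_DIR = Path("cache")

def cached_stage(name):
    """Skip a pipeline stage if its output is already cached on disk."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            CACHE_DIR.mkdir(exist_ok=True)
            path = CACHE_DIR / f"{name}.json"
            if path.exists():                    # resume: reuse prior result
                return json.loads(path.read_text())
            result = fn(*args, **kwargs)
            path.write_text(json.dumps(result))  # persist for future resumes
            return result
        return wrapper
    return decorator

@cached_stage("outline")
def generate_outline(topic):
    # Stands in for an expensive LLM call.
    return {"topic": topic, "chapters": ["Intro", "Methods"]}
```

A second call to `generate_outline` with the same stage name returns the cached result without re-running the body, which is exactly the property a 16-stage pipeline needs to restart cheaply.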
Web Platform — Next.js 14 with a 5-step book builder wizard, real-time progress tracking across 9 pipeline stages, and PDF/Markdown download. Modular frontend (12 components, 2 custom hooks) served by a FastAPI backend with background job tracking.
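The backend's background-job tracking boils down to a pattern like the following plain-Python sketch (names such as `start_job` and the `JOBS` dict are hypothetical, and the real backend uses FastAPI rather than bare threads): each request gets a job ID, the pipeline runs in a worker, and the frontend polls the job's status and current stage.

```python
import threading
import uuid

JOBS = {}  # job_id -> {"status": ..., "stage": ..., "result": ...}

def start_job(pipeline, *args):
    """Launch a pipeline run in the background; return a pollable job ID."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "running", "stage": 0, "result": None}

    def run():
        try:
            JOBS[job_id]["result"] = pipeline(job_id, *args)
            JOBS[job_id]["status"] = "done"
        except Exception:
            JOBS[job_id]["status"] = "failed"

    threading.Thread(target=run, daemon=True).start()
    return job_id

def report_stage(job_id, stage):
    """Called from inside the pipeline so the UI can show live progress."""
    JOBS[job_id]["stage"] = stage
```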
## Challenges we ran into
Repetition at scale — LLMs naturally repeat themselves across chapters. We solved this with two-level research distribution: each paper is assigned to exactly one chapter via `synalinks.Decision` (LLM-based matching, not keywords), and each subsection gets only its unique concepts with a unique example domain.
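The exclusivity idea can be shown without an LLM in the loop. In this sketch, a plain `score` function stands in for `synalinks.Decision`, and round-robin distribution stands in for the concept-assignment step; the invariants (one chapter per paper, no concept repeated across subsections) are the point.

```python
def assign_papers(papers, chapters, score):
    """Give each paper to exactly one chapter: its best-scoring match."""
    assignment = {ch: [] for ch in chapters}
    for paper in papers:
        best = max(chapters, key=lambda ch: score(paper, ch))
        assignment[best].append(paper)  # exclusivity: one chapter per paper
    return assignment

def distribute_concepts(subsections, concepts):
    """Spread concepts so no two subsections repeat one."""
    plan = {s: [] for s in subsections}
    for i, concept in enumerate(concepts):
        plan[subsections[i % len(subsections)]].append(concept)
    return plan
```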
Citation hallucination — LLMs confidently cite papers that don't exist. Our claim-first pipeline inverts the typical approach: plan what needs citing before writing, verify each claim against primary sources using Gemini Search Grounding, and reject anything unverifiable. Only original papers and documentation qualify.
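In outline form, the claim-first loop looks like the sketch below. The `verify` callable is a placeholder for the Gemini Search Grounding check; the key behavior is that unverifiable claims are dropped rather than cited on faith.

```python
def cite_subsection(planned_claims, verify):
    """Keep only claims that verification ties to a primary source."""
    verified, rejected = [], []
    for claim in planned_claims:
        source = verify(claim)       # returns a primary source or None
        if source is not None:
            verified.append({"claim": claim, "source": source})
        else:
            rejected.append(claim)   # conservative: drop rather than guess
    return verified, rejected
```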
Structural coherence — A 200-page book generated subsection-by-subsection loses the thread. Hierarchical planning (book → chapter → section plans with mode-aware critique loops) and full-context generation — every subsection sees the complete book plan — solve this.
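The self-critique loop at each planning level is conceptually simple. In this sketch, `draft`, `critique`, and `revise` stand in for LLM calls; the loop exits when the critic returns no feedback or the round budget runs out.

```python
def plan_with_critique(draft, critique, revise, max_rounds=3):
    """Draft a plan, then revise until the critic accepts or rounds run out."""
    plan = draft()
    for _ in range(max_rounds):
        feedback = critique(plan)
        if feedback is None:          # critic is satisfied
            break
        plan = revise(plan, feedback)
    return plan
```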
Deep Research integration — The Gemini Deep Research API is asynchronous with variable completion times. We built a polling wrapper with caching, parallel execution with semaphores, and graceful fallback when research is unavailable.
Full-text paper processing — Abstracts aren't enough for deep reasoning. We built an arXiv pipeline that resolves paper IDs via Gemini Search Grounding, downloads full PDFs, extracts text via arxiv2text, and feeds complete papers into the knowledge graph for entity and relationship extraction.
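The shape of that pipeline, with the network and extraction steps stubbed out: `resolve_id`, `download_pdf`, and `extract_text` are placeholders for Search Grounding, the arXiv download, and arxiv2text respectively, and `graph` stands in for the Graphiti knowledge graph.

```python
def ingest_paper(title, resolve_id, download_pdf, extract_text, graph):
    """Resolve title -> arXiv ID -> full text -> knowledge-graph entry."""
    arxiv_id = resolve_id(title)
    if arxiv_id is None:
        return False                 # skip papers we cannot resolve
    pdf_bytes = download_pdf(arxiv_id)
    text = extract_text(pdf_bytes)
    graph.append({"id": arxiv_id, "title": title, "text": text})
    return True
```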
## Accomplishments that we're proud of
- Research-first architecture — Papers are discovered before the outline exists. This single decision fundamentally changes output quality
- Zero hallucinated citations — Every citation is verified against a primary source. If it can't be verified, it's rejected
- Two-level distribution pattern — Paper exclusivity + concept exclusivity + domain exclusivity per subsection solves repetition in long-form LLM generation. The pattern generalizes beyond books
- 90+ type-safe DataModels — Every piece of data flowing through 16 stages has a schema enforced by Synalinks, preventing hallucination cascades
- Knowledge graph-grounded content — Papers aren't just cited — their concepts, methods, and findings are connected through Graphiti's graph reasoning
- It actually works — The pipeline generates real books with real citations from real papers, and you can download the PDF
## What we learned
- Research-first architecture produces a fundamentally different (and better) book structure than generate-then-research
- Type-safe structured outputs via Synalinks DataModels are non-negotiable for multi-stage pipelines — one malformed output cascades across 16 stages
- Conservative citation (reject when uncertain) produces far more trustworthy results than aggressive citation
- LLM-based matching (`synalinks.Decision`) crushes keyword matching for paper-to-chapter assignment
- Gemini Search Grounding is the most reliable method for resolving arXiv paper IDs — more reliable than fuzzy title matching
- The knowledge pipeline problem is real — the gap between frontier research and accessibility is about transmission speed, not talent
## What's next for Polaris - the book YOU need
- Multi-language synthesis — True synthesis in any language with culturally relevant examples, not translation
- Interactive learning paths — Adaptive curricula with exercises tailored to your level
- Collaborative editing — Human-in-the-loop refinement where domain experts review and improve chapters
- Real-time knowledge updates — Detect new papers and offer chapter updates automatically
- Team knowledge bases — Shared reference books for teams entering a new domain, with role-specific chapters
- API for education platforms — Let any learning platform generate personalized textbooks on demand