Inspiration

Reading is a cognitively demanding activity that requires continuous attention allocation, lexical access, and comprehension monitoring. However, most digital reading tools present text as static content and offer little support for reading as an active cognitive process.

Research in cognitive science and human–computer interaction suggests that well-timed, contextual interventions can improve comprehension while reducing cognitive load. This project explores how Gemini 3’s fast, context-aware language understanding can support readers in real time, helping them resolve uncertainty without breaking focus.

What it does

The Cognitive Reading Companion is an AI-assisted reading environment designed to support comprehension during active reading.

Key features include:

  • PDF upload and automatic chapter or section segmentation
  • Cursor-based reading tracking using word-level hovering as a proxy for reading pace and attention
  • Word-level interaction, where hovering highlights a word and clicking opens an explanation panel
  • Contextual explanations generated by Gemini 3 based on the surrounding passage
  • A side chat interface for follow-up questions without leaving the reading flow
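
As a rough illustration of the cursor-tracking idea, the sketch below estimates reading pace from a stream of hover events. The `HoverEvent` type, its field names, and the furthest-word heuristic are assumptions made for this example, not a description of the project's actual frontend code.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class HoverEvent:
    """One hover over a word (hypothetical shape, for illustration only)."""
    word_index: int      # position of the hovered word in the passage
    timestamp_ms: float  # time at which the cursor entered the word


def estimate_wpm(events: list[HoverEvent]) -> Optional[float]:
    """Estimate words per minute from hover events.

    Progress is measured as the furthest word reached relative to the
    first hovered word, so time spent re-reading earlier words lowers
    the estimate instead of inflating it.
    """
    if len(events) < 2:
        return None
    ordered = sorted(events, key=lambda e: e.timestamp_ms)
    elapsed_ms = ordered[-1].timestamp_ms - ordered[0].timestamp_ms
    if elapsed_ms <= 0:
        return None
    furthest = max(e.word_index for e in ordered)
    words_advanced = furthest - ordered[0].word_index
    return words_advanced * 60_000 / elapsed_ms
```

A pace estimate like this is only a proxy: the cursor does not always follow the eyes, so any threshold built on it would need tuning against real reading sessions.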

How we built it

The system consists of three main components:

  • Frontend: An interactive PDF reader with cursor tracking and word-level event handling
  • Backend: Text extraction, chapter segmentation, and request orchestration
  • AI layer: Gemini 3 API calls for contextual explanations and clarification

Only localized text segments are passed to Gemini 3, enabling precise reasoning while maintaining low latency and preserving reading flow.
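
One way to select a localized segment is to take the sentence containing the clicked word plus a few neighboring sentences. The sketch below assumes the click is reported as a character offset into the extracted text; the function name, the `radius` parameter, and the naive punctuation-based sentence splitting are all illustrative choices rather than the project's confirmed approach.

```python
import re

# Naive sentence pattern: a run of non-terminator characters, any trailing
# terminators, then trailing whitespace. Abbreviations such as "e.g." will
# be split incorrectly; a real system would need a better segmenter.
_SENTENCE = re.compile(r"[^.!?]+[.!?]*\s*")


def local_context(text: str, char_offset: int, radius: int = 1) -> str:
    """Return the sentence at char_offset plus `radius` sentences per side."""
    spans = [m.span() for m in _SENTENCE.finditer(text)]
    if not spans:
        return ""
    # Find the sentence whose span covers the offset; fall back to the last.
    hit = next(
        (i for i, (start, end) in enumerate(spans) if start <= char_offset < end),
        len(spans) - 1,
    )
    first = max(0, hit - radius)
    last = min(len(spans) - 1, hit + radius)
    return text[spans[first][0]:spans[last][1]].strip()
```

Keeping `radius` small bounds the amount of text sent per request, which is the property the low-latency goal depends on.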

Use of Gemini 3

Gemini 3 is central to the project’s functionality.

  • Contextual language understanding allows explanations to reflect how a word or phrase is used within the passage
  • Low-latency responses ensure that interaction does not disrupt reading
  • Flexible prompting adapts responses based on whether the user requests a definition, clarification, or conceptual explanation

Rather than acting as a standalone chatbot, Gemini 3 is embedded directly into the reading experience as a cognitive assistant.

Cognitive Science Perspective

From a cognitive science perspective, the system targets three major challenges in reading:

  1. Lexical access, or resolving unfamiliar words
  2. Attention regulation, including maintaining reading pace and place
  3. Context switching costs caused by leaving the text to search for explanations

By integrating explanations directly into the reading interface, the system reduces extraneous cognitive load and preserves working memory resources, aligning with principles from cognitive load theory.

What's Next

With more time, we would explore:

  • Adaptive assistance based on reading behavior and interaction patterns
  • Automated comprehension checks generated by Gemini 3
  • Accessibility features for neurodivergent readers
  • Empirical validation through small-scale user studies
