OneTap: Your AI-Powered Guide to Reading

Inspiration

As a 16-year-old high school student in Hualien, I face difficult classical Chinese texts like Du Fu's "A Night's Journey" every day.

Whenever I get stuck and can't understand something, the process is always incredibly painful:

  • Take a picture.

  • Open Gemini.

  • Copy and paste the text (impossible when the page contains mathematical formulas or images).

  • Wait.

  • Switch back to the textbook.

Those few seconds break my concentration every time, and after a few days Gemini may "forget" earlier material because the conversation has grown too long.

I asked myself: "Why can't AI just live inside my textbook?"

I developed OneTap not to replace reading, but to eliminate friction. I want an e-reader that "understands" which page I'm reading, like a smart bookmark clipped to the book, ready to explain the current page without my having to describe it first.

What it does

OneTap is a smart reader with a built-in Google Gemini 1.5 Flash brain.

  • Context-Aware: The app knows which page you're reading. You don't need to copy and paste context, because the AI automatically analyzes the current viewport.

  • One-Click Explanation: With a single click (or by asking a question), OneTap uses Gemini to analyze the text, charts, or formulas on the screen and provides a concise explanation.

  • Seamless Overlay: The explanation appears in the dialogue area without obscuring the original text, allowing students to read the original book while referring to the AI's explanation.

How we built it

As a student with limited time, I focused on speed, integration, and a "Serverless First" architecture:

  • Frontend: Built with React and deployed on Vercel. I spent a lot of time polishing the UI so that reading feels as smooth and uninterrupted as in a native app.

  • AI Brain: At its core is Google Gemini 1.5 Flash. I leveraged its multimodal capabilities to handle images and diagrams in the textbook.

  • Operating Logic:

$$\text{Context}_{\text{viewport}} + \text{User}_{\text{query}} \xrightarrow{\text{Gemini 1.5 Flash}} \text{Explanation}$$

  • I injected the current page's content as a "System Context," making the AI feel like a tutor reading the same book.

  • Backend: Google Firebase handles user authentication and real-time synchronization of reading progress.
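
The operating logic above (viewport context plus user query in, explanation out) can be sketched as a small request builder. This is a minimal illustration, not OneTap's actual code: `ViewportContext`, `buildExplainRequest`, and the prompt wording are hypothetical names, and the payload shape is only loosely modeled on the Gemini `generateContent` request (a system instruction plus user parts), so the field names should be checked against the official API reference before use.

```typescript
// Hypothetical types: a snapshot of what the reader currently shows.
interface ViewportContext {
  pageNumber: number;
  pageText: string;          // text extracted from the visible page
  pageImageBase64?: string;  // optional PNG snapshot for charts and formulas
}

// A request part is either plain text or inline image data.
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

// Assumed payload shape: system context plus one user turn.
interface ExplainRequest {
  systemInstruction: { parts: Part[] };
  contents: { role: "user"; parts: Part[] }[];
}

function buildExplainRequest(ctx: ViewportContext, userQuery: string): ExplainRequest {
  // The current page is injected as system context, so the student
  // never has to paste it into the conversation.
  const systemText =
    "You are a tutor reading the same book as the student. " +
    `The student is on page ${ctx.pageNumber}. Page content:\n${ctx.pageText}`;

  // A bare tap with no typed question falls back to a default prompt.
  const userParts: Part[] = [{ text: userQuery.trim() || "Explain this page." }];

  // Attach the page image only when one was captured.
  if (ctx.pageImageBase64) {
    userParts.push({ inlineData: { mimeType: "image/png", data: ctx.pageImageBase64 } });
  }

  return {
    systemInstruction: { parts: [{ text: systemText }] },
    contents: [{ role: "user", parts: userParts }],
  };
}
```

Keeping the builder a pure function means the same context assembly can serve both the one-click path (empty query) and a typed question, and it can be tested without calling the model.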

Challenges we ran into

  • Context vs. Cost:

Sending the entire textbook to the AI on every request would be too slow and too expensive. I had to design logic that captures the content of the current viewport and sends it to Gemini only when a student actively asks for help.
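
One way to picture this trade-off is a small controller that does nothing on page turns and only touches the model when the student asks. The sketch below is an assumption-laden illustration, not the app's real implementation: `OnDemandExplainer` and `ExplainFn` are hypothetical names, the model call is injected as a plain function, and the per-page answer cache is an extra idea that the write-up itself does not mention.

```typescript
// Hypothetical signature for whatever function actually calls the model.
type ExplainFn = (pageText: string, query: string) => Promise<string>;

class OnDemandExplainer {
  private pages: string[];
  private explain: ExplainFn;
  private currentPage = 0;
  // Assumption: repeated identical questions on the same page reuse the answer.
  private cache = new Map<string, string>();
  modelCalls = 0;

  constructor(pages: string[], explain: ExplainFn) {
    this.pages = pages;
    this.explain = explain;
  }

  // Called on every page turn; deliberately does no AI work.
  setPage(index: number): void {
    if (index < 0 || index >= this.pages.length) {
      throw new RangeError("page out of range");
    }
    this.currentPage = index;
  }

  // Called only when the student actively asks for help.
  async ask(query: string): Promise<string> {
    const key = `${this.currentPage}:${query}`;
    const cached = this.cache.get(key);
    if (cached !== undefined) return cached;

    this.modelCalls += 1;
    // Only the current page is sent, never the whole book.
    const answer = await this.explain(this.pages[this.currentPage], query);
    this.cache.set(key, answer);
    return answer;
  }
}
```

With this shape, flipping through fifty pages costs nothing, and each request carries one page of context instead of the full textbook.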

Accomplishments that we're proud of

  • Unified Experience: I successfully integrated the PDF reader and the AI chatroom into a single interface, rather than two separate tools.

  • 16-Year-Old Perseverance: Creating a full-fledged AI application at a hackathon while preparing for midterms was a huge challenge, but seeing it running on my phone made it all worthwhile.

What we learned

  • Simplicity is the hardest thing: Removing unnecessary buttons and keeping the UI clean is harder than adding new features. I learned to prioritize the purest reading experience.

What's next for OneTap

  • Driving the digital publishing transformation: I plan to participate in a government innovation competition this year. If I win, I will strive to collaborate with booksellers to integrate this technology into traditional textbooks, helping the publishing industry transform through AI.

  • Voice Mode: Using Gemini's multimodal capabilities, OneTap will let students ask questions aloud (e.g., "Hey, explain this picture to me").

  • Automatically generated quizzes: After reading a chapter, the system automatically generates a review quiz based on the content to aid memorization.
