OneTap: Your AI-Powered Guide to Reading
Inspiration
As a 16-year-old high school student in Hualien, I face difficult classical Chinese texts like Du Fu's "A Night's Journey" every day.
Whenever I get stuck and can't understand something, the process is always incredibly painful:
Take a picture.
Open Gemini.
Copy and paste (it's impossible to copy when there are mathematical formulas or images).
Wait.
Switch back to the textbook.
These few seconds completely disrupt my flow of learning, and after a few days, Gemini might "forget" earlier context because the conversation has grown too long.
I asked myself: "Why can't AI just live inside my textbook?"
I developed OneTap not to replace reading, but to eliminate friction. I want an e-reader that "understands" which page I'm reading, like a smart bookmark clipped to a book, ready to explain the current page without my having to describe it first.
What it does
OneTap is a smart reader with a built-in Google Gemini 1.5 Flash brain.
Context-Aware: The app knows which page you're reading. You don't need to copy and paste context, because the AI automatically analyzes the current viewport.
One-Click Explanation: With a single click (or by asking a question), OneTap uses Gemini to analyze the text, charts, or formulas on the screen and provides a concise explanation.
Seamless Overlay: The explanation appears in the dialogue area without obscuring the original text, allowing students to read the original book while referring to the AI's explanation.
How we built it
As a student with limited time, I focused on speed, integration, and a "Serverless First" architecture:
Frontend: Built with React and deployed on Vercel. I spent a lot of time polishing the UI so that reading feels as smooth and uninterrupted as in a native app.
AI Brain: At its core is Google Gemini 1.5 Flash.
I leveraged its Multimodal capabilities to handle images and diagrams in the textbook.
Operating Logic:
$$\text{Context}_{\text{viewport}} + \text{User}_{\text{query}} \xrightarrow{\text{Gemini 1.5 Flash}} \text{Explanation}$$
I inject the current page's content as a "System Context," so the AI feels like a tutor reading the same book.
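The operating logic above can be sketched as a small function that combines the viewport context and the user's question into one request body. This is a minimal sketch, not OneTap's actual code: the names `ViewportContext`, `GeminiRequest`, and `buildExplainRequest` are hypothetical, and the field names (`systemInstruction`, `contents`, `inlineData`) follow my understanding of the Gemini `generateContent` request shape, which should be checked against the official API reference. The sketch only builds the payload and makes no network call.

```typescript
// Hypothetical types for illustration; not taken from the OneTap codebase.
interface ViewportContext {
  pageNumber: number;
  pageText: string;
  // Optional base64-encoded PNG of the visible page, for diagrams and formulas.
  pageImageBase64?: string;
}

type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

// Assumed request shape for a generateContent call (verify against the docs).
interface GeminiRequest {
  systemInstruction: { parts: { text: string }[] };
  contents: { role: "user"; parts: Part[] }[];
}

// Context_viewport + User_query -> request body for the model.
function buildExplainRequest(
  viewport: ViewportContext,
  userQuery: string
): GeminiRequest {
  // The student's question always goes first in the user turn.
  const parts: Part[] = [{ text: userQuery }];

  // Attach the page image only when one was captured.
  if (viewport.pageImageBase64) {
    parts.push({
      inlineData: { mimeType: "image/png", data: viewport.pageImageBase64 },
    });
  }

  return {
    // The current page is injected as system context, so the student
    // never has to paste it into the question.
    systemInstruction: {
      parts: [
        {
          text:
            `You are a tutor reading page ${viewport.pageNumber} with the student. ` +
            `Page content:\n${viewport.pageText}`,
        },
      ],
    },
    contents: [{ role: "user", parts }],
  };
}
```

Keeping the page content in the system instruction and the question in the user turn means the student's message stays short, while the model still sees the page being read.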
Backend: Used Google Firebase for user authentication and real-time synchronization of reading progress.
Challenges we ran into
- Context vs. Cost: Sending the entire textbook to the AI for every question would be too slow and expensive. I had to design logic that captures the content of the current viewport and sends it to Gemini only when a student actively asks for help.
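One way to picture this trade-off is the sketch below, which captures only the visible page and does so only when the student asks. It is an illustration under assumptions, not the app's real logic: `Page`, `captureViewport`, `createHelpHandler`, and the 4000-character cap are all hypothetical, and `send` stands in for whatever function actually calls Gemini.

```typescript
// Hypothetical page model for illustration.
interface Page {
  pageNumber: number;
  text: string;
}

// Return only the text of the page currently on screen, capped at maxChars
// (the cap is an assumed safeguard, not a documented OneTap setting).
function captureViewport(
  pages: Page[],
  currentPage: number,
  maxChars: number = 4000
): string {
  const match = pages.filter((p) => p.pageNumber === currentPage)[0];
  return match ? match.text.slice(0, maxChars) : "";
}

// Build a handler that does nothing until the student asks a question.
// Nothing is captured or sent while the student is simply reading.
function createHelpHandler(
  getPages: () => Page[],
  getCurrentPage: () => number,
  send: (context: string, question: string) => void
): (question: string) => void {
  return (question: string): void => {
    const context = captureViewport(getPages(), getCurrentPage());
    send(context, question);
  };
}
```

Because the capture happens inside the handler, turning pages costs nothing; a request is made only when the student taps for help, and it carries one page instead of the whole book.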
Accomplishments that we're proud of
Unified Experience: I successfully integrated the PDF reader and the AI chatroom into a single interface, rather than two separate tools.
16-Year-Old Perseverance: Creating a full-fledged AI application at a hackathon while preparing for midterms was a huge challenge, but seeing it running on my phone made it all worthwhile.
What we learned
- Simplicity is the hardest thing: Removing unnecessary buttons and keeping the UI clean is harder than adding new features. I learned to prioritize the purest reading experience.
What's next for OneTap
Driving the digital publishing transformation: I plan to participate in a government innovation competition this year. If I win, I will strive to collaborate with booksellers to integrate this technology into traditional textbooks, helping the publishing industry transform through AI.
Voice Mode: Utilizing Gemini's multimodal capabilities, students can ask questions verbally (e.g., "Hey, explain this picture to me").
Automatically generated quizzes: After reading a chapter, the system automatically generates a review quiz based on the content to aid memorization.
Built With
- firebase
- framer-motion
- google-gemini
- next.js
- react
- shadcn-ui
- tailwind-css
- typescript
- vercel