Scribe AI
An interactive AI tutor that visually solves complex problems on a digital whiteboard. Scribe AI transforms static equations and text into dynamic, animated explanations with a fluid, handwritten effect, complete with spoken guidance.
Core Concept
The traditional way of learning complex subjects often involves staring at static problems in a textbook. It can be difficult to follow the logic and understand the flow of a solution.
Scribe AI solves this by acting as a personal AI tutor on a digital whiteboard. You can type a problem or simply snap a picture of one from your textbook. The AI doesn't just return an answer: it generates a step-by-step visual script. Our custom rendering engine then animates this script, drawing out each equation and diagram while a clear voice explains each step, making the solution intuitive and easy to follow.
Features
- AI-Powered Solutions: Leverages the Google Gemini API to understand problems and generate step-by-step logical solutions.
- Camera & Image Input: Snap a photo of a math problem from a textbook or whiteboard, and the AI's vision capabilities will analyze it directly.
- Text-to-Speech Explanations: As each step of the solution is drawn, a clear voice reads the explanation aloud, creating a multi-sensory learning experience.
- Step-by-Step Animated Drawing: Renders solutions with a dynamic, handwritten effect, drawing each character and symbol sequentially.
- Handles Complex Content: Capable of drawing equations, explanatory text, diagrams, and schematics.
- Extensible Architecture: The AI communicates with the app via a simple JSON "scripting language," making it easy to add new drawing and speech capabilities.
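To make the extensibility point concrete, here is a minimal sketch of how a command registry could work. It is not the project's actual code: the `Instruction` shape, `registerCommand`, `expandScript`, and the `drawCircle` command are hypothetical names chosen for illustration. Only the `command`/`payload` envelope and the names `drawEquation`, `drawText`, and `pause` come from the sample script in this README.

```typescript
// Hypothetical types: only the command/payload envelope is taken from the README.
type Payload = Record<string, unknown>;
interface ScriptCommand { command: string; payload: Payload }

// A low-level instruction that a renderer or speech engine could consume.
type Instruction =
  | { kind: "draw"; content: string }
  | { kind: "speak"; text: string }
  | { kind: "wait"; ms: number };

type Handler = (payload: Payload) => Instruction[];

// Handlers are looked up by command name, so new commands need no parser changes.
const handlers = new Map<string, Handler>();

function registerCommand(name: string, handler: Handler): void {
  handlers.set(name, handler);
}

registerCommand("drawEquation", (p) => [{ kind: "draw", content: String(p.equation) }]);
registerCommand("drawText", (p) => [
  { kind: "draw", content: String(p.text) },
  { kind: "speak", text: String(p.text) }, // text is both drawn and spoken
]);
registerCommand("pause", (p) => [{ kind: "wait", ms: Number(p.duration) }]);

// Expand a script into instructions; commands without a handler are skipped.
function expandScript(script: ScriptCommand[]): Instruction[] {
  const out: Instruction[] = [];
  for (const { command, payload } of script) {
    const handler = handlers.get(command);
    if (handler) out.push(...handler(payload));
  }
  return out;
}

// Adding a capability is one registration. "drawCircle" is a made-up example.
registerCommand("drawCircle", (p) => [
  { kind: "draw", content: `circle r=${Number(p.radius)}` },
]);
```

Skipping unknown commands is one possible design choice: it lets an older app build tolerate a script that uses newer commands, at the cost of silently dropping steps.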
Tech Stack
| Technology | Purpose |
|---|---|
| React Native (Expo) | Cross-platform mobile app development |
| TypeScript | For robust, type-safe code |
| Google Gemini API | AI-powered reasoning and vision analysis |
| React Native SVG | Core library for rendering vector graphics |
| React Native Reanimated | For smooth, performant, 60 FPS animations |
| Expo Router | File-based routing for app navigation |
| Expo Camera / Image Picker | Image and camera handling |
| Expo Speech | Text-to-speech synthesis |
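A common way to get a handwritten reveal from SVG paths is to set the stroke dash length to the full path length and animate the dash offset from that length down to zero. This README does not say that Scribe AI's engine works this way, so treat the following as a sketch of the general technique. The function names and the assumption that each path's length is already known are illustrative.

```typescript
// Assumption: each stroke is an SVG path whose total length has been measured.
interface DashProps {
  strokeDasharray: number;
  strokeDashoffset: number;
}

// progress runs from 0 (nothing drawn) to 1 (fully drawn); values outside are clamped.
function dashPropsForProgress(pathLength: number, progress: number): DashProps {
  const clamped = Math.min(1, Math.max(0, progress));
  return {
    strokeDasharray: pathLength,                  // one dash as long as the path
    strokeDashoffset: pathLength * (1 - clamped), // shift the dash to hide the undrawn part
  };
}

// Draw paths one after another: given the elapsed time, return each path's progress.
function sequentialProgress(durationsMs: number[], elapsedMs: number): number[] {
  let start = 0;
  return durationsMs.map((d) => {
    // A zero-length stroke appears as soon as its turn starts.
    const local = d > 0 ? (elapsedMs - start) / d : elapsedMs >= start ? 1 : 0;
    start += d;
    return Math.min(1, Math.max(0, local));
  });
}
```

In a React Native app, values like these would typically be fed to the `strokeDasharray` and `strokeDashoffset` props of a React Native SVG `Path`, driven by a Reanimated shared value rather than recomputed in plain JavaScript on every frame.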
How It Works
The magic of Scribe AI is in its simple yet powerful five-step pipeline:
- User Input: The user either types a problem or uses the camera to capture an image of one.
- AI Vision & Generation: If an image is provided, it's sent to the Gemini API for vision analysis to extract the text and context. This extracted problem is then used in a carefully engineered prompt that instructs the AI to generate a step-by-step solution as a JSON array of commands.
- JSON Scripting: The app receives a script like this:

  ```json
  [
    {"command": "drawEquation", "payload": {"equation": "2x+5=11", ...}},
    {"command": "pause", "payload": {"duration": 500}},
    {"command": "drawText", "payload": {"text": "Subtract 5 from both sides", ...}}
  ]
  ```

- Parsing: A parser converts these high-level commands into low-level SVG path strings and speech instructions.
- Animation & Speech Synthesis: The `DrawingOrchestrator` component takes this list of instructions and animates each path sequentially on the screen. As each `drawText` command is processed, its content is also sent to the device's text-to-speech engine to be spoken aloud, perfectly synced with the animation.
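The middle of the pipeline above (script in, timed instructions out) can be sketched as follows. This is an illustration under stated assumptions, not the project's parser: the payload fields beyond `equation`, `text`, and `duration` are unknown, the 80 ms per character drawing speed is invented, and `parseScript`, `buildTimeline`, and `TimelineEntry` are hypothetical names. The sketch also assumes the model may wrap its JSON in a Markdown code fence, which is worth handling defensively.

```typescript
// Only these three commands and payload fields appear in the README's sample script.
type Command =
  | { command: "drawEquation"; payload: { equation: string } }
  | { command: "drawText"; payload: { text: string } }
  | { command: "pause"; payload: { duration: number } };

interface TimelineEntry {
  startMs: number;
  durationMs: number;
  draw: string;   // content to animate
  speak?: string; // text to hand to the text-to-speech engine, if any
}

const FENCE = "`".repeat(3); // a Markdown code fence marker
const MS_PER_CHAR = 80;      // assumed drawing speed, purely illustrative

function isCommand(value: unknown): value is Command {
  if (typeof value !== "object" || value === null) return false;
  const { command, payload } = value as { command?: unknown; payload?: unknown };
  if (typeof payload !== "object" || payload === null) return false;
  const p = payload as Record<string, unknown>;
  switch (command) {
    case "drawEquation": return typeof p.equation === "string";
    case "drawText": return typeof p.text === "string";
    case "pause": return typeof p.duration === "number";
    default: return false;
  }
}

// Strip an optional surrounding code fence, parse, and keep only well-formed commands.
function parseScript(modelText: string): Command[] {
  const body = modelText
    .trim()
    .replace(new RegExp("^" + FENCE + "(?:json)?\\s*", "i"), "")
    .replace(new RegExp("\\s*" + FENCE + "$"), "");
  const data: unknown = JSON.parse(body);
  if (!Array.isArray(data)) throw new Error("Script must be a JSON array");
  return data.filter(isCommand);
}

// Lay the commands out on a timeline; speech starts together with its drawText step.
function buildTimeline(script: Command[]): TimelineEntry[] {
  const timeline: TimelineEntry[] = [];
  let cursor = 0;
  for (const step of script) {
    if (step.command === "pause") {
      cursor += step.payload.duration;
      continue;
    }
    const content = step.command === "drawText" ? step.payload.text : step.payload.equation;
    const entry: TimelineEntry = {
      startMs: cursor,
      durationMs: content.length * MS_PER_CHAR,
      draw: content,
    };
    if (step.command === "drawText") entry.speak = content;
    timeline.push(entry);
    cursor += entry.durationMs;
  }
  return timeline;
}
```

Starting the speech at the same timestamp as the drawing is the simplest way to keep the two aligned. A real implementation would likely also need to wait for the speech to finish before moving on when the narration outlasts the animation.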
Getting Started
Follow these instructions to get a copy of the project up and running on your local machine for development and testing purposes.
Prerequisites
Installation
Clone the repository:

```
git clone https://github.com/your-username/scribe-ai.git
cd scribe-ai
```

Install NPM packages:

```
npm install
```

Set up your environment variables:
- You will need a Google Gemini API Key. You can get one from Google AI Studio.
- For this hackathon project, we are inputting it directly in the UI.

Run the application:

```
npx expo start
```

If you encounter any caching issues, especially after installing new packages, run with the `--clear` flag:

```
npx expo start -c
```

Scan the QR code with the Expo Go app on your phone. You may need to grant camera permissions for the image upload feature to work.
Roadmap & Future Work
Scribe AI is currently a proof-of-concept with a solid foundation. Here are some features we're excited to build next:
- [ ] Advanced Diagramming: Add support for more complex shapes for physics (free-body diagrams) and calculus (graphs, integrals).
- [ ] Interactive Steps: Allow users to tap on a specific step to get a more detailed explanation from the AI.
- [ ] Save & Export: Save sessions or export the final drawing as an image or PDF.
- [ ] Color & Highlighting: Add commands for changing colors to highlight specific parts of a diagram or equation.
- [ ] Multi-language Support: Add support for different languages for both the AI and the text-to-speech engine.
License
This project is licensed under the MIT License - see the LICENSE.md file for details.
Built With
- ai
- libraries
- react-native