Inspiration

The global e-waste crisis is escalating, with over 50 million metric tons of electronic waste generated annually. A significant portion of this waste consists of smartphones that are discarded simply because repair is considered "too difficult" or "too expensive" for the average person. We were inspired by the Right to Repair movement and the potential of Generative AI to bridge the skills gap. We asked ourselves: What if we could give every local repair shop, or even a hobbyist at home, the precision of a factory robot and the diagnostic eyes of an expert engineer? The aesthetic and functional inspiration comes from the concept of a "Cyberpunk Repair Terminal": a futuristic interface where human intuition meets machine precision to keep technology alive and out of landfills.

What it does

SRR-CI (Smartphone Repair Robotic Control Interface) is a web-based command center that democratizes high-precision repair tasks.

- **AI Vision Analysis:** The app uses the device's webcam to stream a live feed of the workspace. Users can trigger the Google Gemini 1.5 Pro model to analyze the scene in real time. The AI identifies key components (e.g., "Pentalobe Screws," "Battery Connector," "Display Cables") and returns their bounding box coordinates.
- **Robotic Control Bridge:** The browser connects directly to a robotic arm (via Arduino/ESP32) using the Web Serial API.
- **Manual & Autonomous Modes:** In manual mode, users control 5-finger robotic grippers via sliders with < 50 ms latency. In autonomous mode, users execute complex macros like "Unscrew Pentalobe" or "Lift Battery" with a single click, guided by the AI's coordinate data (see the sketch after this list).
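To make the macro idea concrete, here is a minimal sketch of how AI coordinates could drive a servo sequence. Everything here is illustrative: the `sendCommand` helper and the `X:`/`Y:`/`R:` commands are hypothetical names we invented for this example; only the `T:<angle>` framing reflects our actual command protocol.

```typescript
// Hypothetical macro: aim the gripper at an AI-reported bounding box,
// then run a simple unscrew sequence over the serial bridge.
interface BoundingBox {
  label: string;
  x_min: number;
  y_min: number;
  x_max: number;
  y_max: number;
}

async function unscrewPentalobe(
  target: BoundingBox,
  sendCommand: (cmd: string) => Promise<void> // assumed serial writer
): Promise<void> {
  // Aim at the center of the detected screw.
  const cx = Math.round((target.x_min + target.x_max) / 2);
  const cy = Math.round((target.y_min + target.y_max) / 2);
  await sendCommand(`X:${cx}`);  // hypothetical pan command
  await sendCommand(`Y:${cy}`);  // hypothetical tilt command
  await sendCommand('T:90');     // thumb actuation, per our T:<angle> protocol
  await sendCommand('R:-360');   // hypothetical counter-clockwise rotation
}
```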

How we built it

We built SRR-CI as a client-side-first application to ensure accessibility (no heavy backend servers required).

- **Frontend:** React (Vite) for a blazing-fast UI, styled with Tailwind CSS to create a high-contrast "Cyberpunk" aesthetic optimized for low-light repair environments.
- **The Brain (AI):** We integrated the Google AI SDK (@google/generative-ai) and prompt the gemini-1.5-pro model to act as a computer vision expert, processing raw base64 image data from the HTML5 Canvas and returning structured JSON: $$\text{BoundingBox} = \{\, \text{label},\ x_{min},\ y_{min},\ x_{max},\ y_{max} \,\}$$
- **The Body (Hardware):** We use the browser's Web Serial API (navigator.serial), which lets the React app open a direct, bidirectional UART channel with a microcontroller (Arduino/ESP32) over USB, sending servo angle commands as real-time strings (e.g., `T:90\n` for thumb actuation). Both pieces are sketched below.
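A condensed sketch of the vision request. The SDK calls are the standard @google/generative-ai surface; the prompt wording and the JPEG capture path are illustrative, and we assume the model honors the JSON-only instruction:

```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

// Grab the current webcam frame from the canvas as raw base64
// (toDataURL returns "data:image/jpeg;base64,<data>"; we strip the prefix).
function captureFrame(canvas: HTMLCanvasElement): string {
  return canvas.toDataURL('image/jpeg').split(',')[1];
}

async function analyzeScene(apiKey: string, canvas: HTMLCanvasElement) {
  const genAI = new GoogleGenerativeAI(apiKey);
  const model = genAI.getGenerativeModel({ model: 'gemini-1.5-pro' });
  const result = await model.generateContent([
    'You are a computer vision expert for smartphone repair. Return ONLY a ' +
      'JSON array of { label, x_min, y_min, x_max, y_max } for each component.',
    { inlineData: { data: captureFrame(canvas), mimeType: 'image/jpeg' } },
  ]);
  return JSON.parse(result.response.text()); // BoundingBox[]
}
```

And the hardware bridge in miniature. These are the standard Web Serial calls (TypeScript needs the w3c-web-serial type definitions); the 115200 baud rate and the newline framing are our conventions, not requirements of the API:

```typescript
// Open a port (requires a user gesture, e.g. a "Connect" button click).
async function connectArm(): Promise<SerialPort> {
  const port = await navigator.serial.requestPort();
  await port.open({ baudRate: 115200 }); // must match the Arduino/ESP32 sketch
  return port;
}

// Send one newline-terminated command string, e.g. "T:90".
async function sendCommand(port: SerialPort, cmd: string): Promise<void> {
  const writer = port.writable.getWriter();
  await writer.write(new TextEncoder().encode(cmd + '\n'));
  writer.releaseLock(); // free the stream for the next write
}
```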

Challenges we ran into

The "Hallucination" Loop: During development, we faced issues where the AI code generation suggested deprecated or non-existent model names (like gemini-3-flash) or confused the experimental @google/genai SDK with the stable @google/generative-ai library. We had to manually debug and enforce strict version control in package.json to stabilize the build. Browser Security Sandboxes: The Web Serial API is powerful but strictly sandboxed. We struggled with "Access Denied" errors when running the app inside iframe-based preview environments (like GitHub Codespaces). We learned to handle Security Error exceptions gracefully and enforce "Open in New Tab" workflows for hardware access. Latency vs. Accuracy: Balancing the heavy lifting of an AI request (approx. 1-2 seconds) with the need for real-time motor control (ms latency). We implemented a throttling algorithm for the serial writer to ensure the Arduino buffer doesn't overflow while the AI thinks in the background.

Accomplishments that we're proud of

- **Deployment Success:** We successfully deployed a hardware-capable web app to Vercel that can control physical motors from a phone or laptop.
- **Seamless AI Integration:** Seeing the "Optical Sensor" identify a battery on a live video feed for the first time was a magic moment. It proved that complex computer vision no longer needs a dedicated GPU server, just a browser and an API key.
- **The UI/UX:** We built an interface that doesn't look like a boring admin panel. It feels like a tool from the future, which makes the repair process engaging rather than tedious.

What we learned

- **The Power of Web Serial:** The modern web is capable of talking directly to hardware. This opens a massive door for "web-to-physical" applications without needing to install drivers or Python scripts.
- **Spatial Reasoning with LLMs:** Large Language Models (LLMs) like Gemini are surprisingly good at 2D spatial reasoning when given the right prompt structure (an example follows this list), effectively replacing traditional YOLO/OpenCV pipelines for this specific use case.
- **Project Resilience:** When AI tools give conflicting advice, manual verification and understanding the core documentation are key to unblocking the project.
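For reference, this is the kind of prompt structure we mean. The exact wording below is illustrative rather than our production prompt; the point is to pin down the coordinate system, the output schema, and a "JSON only" constraint so the model's 2D reasoning stays parseable.

```typescript
// Illustrative prompt skeleton for LLM-based bounding box detection.
const VISION_PROMPT = `
You are a smartphone repair vision system.
The image is W x H pixels; the origin (0,0) is the top-left corner.
Identify every visible component (screws, connectors, cables, battery).
Respond with ONLY a JSON array, no prose, where each element is:
{ "label": string, "x_min": number, "y_min": number, "x_max": number, "y_max": number }
`;
```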

What's next for SRR-CI

- **Closed-Loop Feedback:** Feed the visual data back into the robot's logic so it can "self-correct" if it misses a screw.
- **AR Overlay:** Implement an Augmented Reality layer that uses the bounding box data to project repair guides (e.g., "Unscrew Here First") directly onto the video feed.
- **Community Library:** Create a database of repair macros for different phone models (e.g., iPhone 15 vs. Samsung S24), allowing the community to share "repair scripts" for the robot.

Built With

arduino, esp32, google-gemini, react, tailwind, vercel, vite, web-serial-api