Inspiration

Electronics debugging is often a tedious process of staring at tiny wire connections and double-checking pinouts against datasheets. We wanted to build Circuit-Chan Vision, an AR Lead Engineer that acts as a "second pair of eyes." Inspired by the need for faster prototyping, Circuit-Chan transforms a simple smartphone camera into a powerful diagnostic tool that can "see" live logic states and overall circuit health.

What it does

Circuit-Chan Vision uses the Gemini multimodal engine to:

  • Analyze breadboards: instantly identifies ICs, resistors, LEDs, and capacitors.
  • Logic state recognition: detects logic gate configurations and predicts output states using a standardized label pattern (e.g., AND_01_Out0).
  • Fault detection: compares the physical wiring against a digital schematic to find loose wires, polarity errors, or incorrect placements.
  • AR intelligence: provides [x, y] coordinates for faults and overlays deep component intelligence, including pinouts and datasheet links, directly onto the physical world.

How we built it

The project is built on a modern React + Vite stack using TypeScript. The core "brain" is powered by the Google Gen AI SDK, specifically Gemini 3 Flash Preview for low-latency, high-accuracy visual reasoning. We integrated a custom dataset of circuit configurations to define a standardized logic-state labeling system.
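To make the overlay data machine-consumable, the analysis call requests structured JSON rather than free text. Below is a minimal sketch of what a single-frame call can look like with the Google Gen AI SDK (`@google/genai`); the model id, prompt wording, and `Diagnosis` shape are illustrative assumptions, not our exact code:

```typescript
// Minimal single-frame analysis sketch using @google/genai.
// Assumes a GEMINI_API_KEY environment variable and a base64-encoded JPEG frame.
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Hypothetical shape of the structured diagnosis the AR layer consumes.
interface Diagnosis {
  components: { type: string; label: string }[]; // e.g. { type: "IC", label: "74HC08" }
  logicStates: string[];                         // e.g. "AND_01_Out0"
  faults: { description: string; xy: [number, number] }[]; // overlay anchor points
}

export async function analyzeFrame(base64Jpeg: string): Promise<Diagnosis> {
  const response = await ai.models.generateContent({
    model: "gemini-3-flash-preview", // placeholder id; substitute the Gemini 3 Flash Preview id from your console
    contents: [{
      role: "user",
      parts: [
        { inlineData: { mimeType: "image/jpeg", data: base64Jpeg } },
        { text: "Identify components, logic states, and wiring faults with [x, y] coordinates." },
      ],
    }],
    config: {
      systemInstruction: "You are Circuit-Chan, an AR circuit-debugging assistant. Respond with JSON only.",
      responseMimeType: "application/json", // ask the model for parseable JSON
    },
  });
  return JSON.parse(response.text ?? "{}") as Diagnosis;
}
```

Requesting `application/json` keeps the AR layer simple: the [x, y] fault coordinates map straight to overlay anchors without scraping prose.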

Challenges we ran into

One of the biggest hurdles was attempting to fine-tune Gemini on our multimodal circuit data. We hit SDK schema-validation constraints for multimodal tuning jobs. That setback led to a breakthrough: a highly optimized Dataset-Informed Prompt Template, combined with Gemini's raw multimodal capabilities, turned out to be more efficient and flexible than a statically tuned model.
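As a sketch of what that pivot looks like in practice (the dataset row shape and wording here are stand-ins, not our actual template), a Dataset-Informed Prompt Template simply serializes labeled rows from the circuit dataset into the system instruction instead of feeding them to a tuning job:

```typescript
// Sketch of a Dataset-Informed Prompt Template: labeled rows from the circuit
// dataset become few-shot examples inside the system instruction.
// The CircuitExample fields are illustrative assumptions.
interface CircuitExample {
  observation: string; // e.g. "74HC08, pin 1 HIGH, pin 2 LOW"
  label: string;       // e.g. "AND_10_Out0"
}

export function buildSystemInstruction(examples: CircuitExample[]): string {
  const shots = examples
    .map((ex) => `Observation: ${ex.observation}\nLabel: ${ex.label}`)
    .join("\n\n");
  return [
    "You are Circuit-Chan, an AR circuit-debugging assistant.",
    "Label every logic gate as <GATE>_<input bits>_Out<output bit>.",
    "Report each fault with [x, y] image coordinates.",
    "Examples:",
    shots,
  ].join("\n\n");
}
```

Because the examples live in the prompt, supporting a new component family is a data change rather than a new tuning job, which is exactly the flexibility we gained.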

Accomplishments that we're proud of

We are particularly proud of the Logic State Recognition engine. It doesn't just see a black box; it understands that the chip is an AND gate, notes which pins are high or low, and correctly identifies the expected output state, all from a single image.
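The AND_01_Out0 pattern packs the gate type, the observed input bits, and the predicted output bit into one token. Here is a small sketch of how such a label can be decoded and sanity-checked on the client; the grammar is inferred from that single example, so treat it as an assumption:

```typescript
interface LogicState { gate: string; inputs: number[]; output: number; }

// Decode labels of the assumed form <GATE>_<input bits>_Out<output bit>,
// e.g. "AND_01_Out0" -> gate AND, inputs [0, 1], predicted output 0.
export function parseLogicLabel(label: string): LogicState | null {
  const m = /^([A-Z]+)_([01]+)_Out([01])$/.exec(label);
  if (!m) return null;
  return { gate: m[1], inputs: [...m[2]].map(Number), output: Number(m[3]) };
}

// Cross-check the model's predicted output against the gate's truth table.
export function outputMatchesTruthTable(s: LogicState): boolean {
  const allHigh = s.inputs.every((i) => i === 1);
  const anyHigh = s.inputs.some((i) => i === 1);
  switch (s.gate) {
    case "AND":  return s.output === (allHigh ? 1 : 0);
    case "NAND": return s.output === (allHigh ? 0 : 1);
    case "OR":   return s.output === (anyHigh ? 1 : 0);
    case "NOR":  return s.output === (anyHigh ? 0 : 1);
    case "XOR":  return s.output === s.inputs.reduce((a, b) => a ^ b, 0);
    default:     return true; // unknown gate: nothing to verify
  }
}
```

For example, `parseLogicLabel("AND_01_Out0")` yields `{ gate: "AND", inputs: [0, 1], output: 0 }`, and `outputMatchesTruthTable` confirms the prediction, since an AND gate with one low input must output 0.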

What we learned

We learned that the current frontier of AI development is shifting from "training" to "prompt engineering." Our pivot from fine-tuning to advanced multimodal prompting taught us how to embed structured intelligence directly into the AI's "System Instruction" for real-time performance.

What's next for Circuit-Chan Vision

Next, we want to integrate real-time video stream analysis and expand the component library to include complex microcontroller platforms like the ESP32 and Arduino, providing live code-stepping overlays in AR.

Built With

  • Languages: TypeScript, JavaScript
  • Frameworks: React, lucide-react
  • Tools: Node.js, npm, Vite
  • APIs: Google Gemini (Google Gen AI SDK)
  • Styling: Vanilla CSS