Inspiration
Industrial downtime is a silent killer of productivity: every hour a production line sits idle can cost a factory heavily. While modern machines have sensors, they lack the "intuition" of a veteran engineer who can look at a component and know it's about to fail. I wanted to build a "Ghost Engineer": a digital twin expert that stays on-site 24/7 and uses an advanced reasoning model to predict failures before they stop production.
What it does
The Ghost Engineer is an autonomous mechanical auditor. It takes a visual feed (photos or video) of industrial hardware and cross-references it against thousands of pages of technical manuals. Using Gemini 3's reasoning, it identifies cracks, leaks, and wear and tear. Most importantly, it is agentic: it doesn't just report problems, it uses function calling to automatically log maintenance tickets and order replacement parts in real time.
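To make the agentic step concrete, here is a minimal sketch of how a function call requested by the model could be routed to Python handlers. It does not use the Gemini SDK; the tool names (`log_maintenance_ticket`, `order_replacement_part`), their parameters, and the in-memory lists standing in for a maintenance database are all hypothetical and chosen for illustration only.

```python
from typing import Any, Callable, Dict, List

# Hypothetical in-memory stand-ins for a maintenance database and a parts system.
TICKETS: List[dict] = []
ORDERS: List[dict] = []


def log_maintenance_ticket(component: str, issue: str, severity: str) -> dict:
    """Record a maintenance ticket and return it (hypothetical tool)."""
    ticket = {
        "id": len(TICKETS) + 1,
        "component": component,
        "issue": issue,
        "severity": severity,
    }
    TICKETS.append(ticket)
    return ticket


def order_replacement_part(part_number: str, quantity: int = 1) -> dict:
    """Record a parts order and return it (hypothetical tool)."""
    order = {"id": len(ORDERS) + 1, "part_number": part_number, "quantity": quantity}
    ORDERS.append(order)
    return order


# Registry mapping the tool name the model asks for to the Python handler.
TOOLS: Dict[str, Callable[..., dict]] = {
    "log_maintenance_ticket": log_maintenance_ticket,
    "order_replacement_part": order_replacement_part,
}


def dispatch(name: str, args: Dict[str, Any]) -> dict:
    """Run the requested tool and wrap the outcome so it can be sent back to the model."""
    handler = TOOLS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return {"result": handler(**args)}
    except TypeError as exc:
        # Raised when the model supplies missing or unexpected arguments.
        return {"error": str(exc)}
```

In a real session, `name` and `args` would come from the function call returned by the model, and the dictionary returned by `dispatch` would be passed back as the function response. Returning errors as data rather than raising keeps a malformed call from crashing the audit.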
How we built it
I built this project using Python and the Streamlit framework for the industrial UI. The "brain" of the system is Gemini 3 Pro Preview.
- Reasoning: I used the thinking_level="HIGH" configuration to ensure the model performs deep mechanical analysis.
- Context: I leveraged the 1.2M token window to allow the AI to "read" entire service manuals in a single session.
- Agency: I implemented Function Calling to connect the AI's thoughts to a maintenance database.
- Logic: I use a reliability check where the AI calculates the probability of failure $P(f)$ from the severity of the visual anomaly $A_s$ and the age of the part $t$: $P(f) = 1 - e^{-\lambda \, t \, A_s}$
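The reliability check above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's actual code: the write-up does not define λ, so it is treated here as a non-negative rate constant, and the function and parameter names, the units of `age`, and the 0-to-1 severity scale are all assumed for the example.

```python
import math


def failure_probability(anomaly_severity: float, age: float, rate: float) -> float:
    """Return P(f) = 1 - exp(-rate * age * anomaly_severity).

    anomaly_severity: A_s, assumed here to be a non-negative score (e.g. 0 to 1).
    age:              t, the part's age in whatever unit `rate` is expressed per.
    rate:             lambda, assumed to be a non-negative rate constant.
    """
    if anomaly_severity < 0 or age < 0 or rate < 0:
        raise ValueError("anomaly_severity, age, and rate must be non-negative")
    return 1.0 - math.exp(-rate * age * anomaly_severity)


# Example: severity 0.8, part age 5 (e.g. years), assumed rate 0.1 per year.
# The exponent is -0.1 * 5 * 0.8 = -0.4, so P(f) = 1 - e^{-0.4}, roughly 0.33.
p = failure_probability(anomaly_severity=0.8, age=5.0, rate=0.1)
```

With this form, a part showing no anomaly (A_s = 0) or a brand-new part (t = 0) gets P(f) = 0, and the probability approaches 1 as either severity or age grows.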
Challenges we ran into
The most significant challenge was handling 429 quota-exhaustion errors. Because Gemini 3 Pro is a high-compute reasoning model, the free-tier limits are strict. I had to optimize the way PDF data is sent by converting files into specific byte-parts, and I implemented a "Thought Signature" display to ensure the model's internal reasoning was preserved without crashing the session.
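A common complement to trimming request size is retrying with exponential backoff when a 429 arrives. The sketch below is not what the project does (the write-up only describes optimizing how PDF data is sent); it is a generic pattern, and `QuotaExhausted` is a hypothetical stand-in for whatever exception the SDK raises on HTTP 429.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExhausted(Exception):
    """Hypothetical stand-in for the SDK error raised on HTTP 429."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on QuotaExhausted with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except QuotaExhausted:
            if attempt == max_attempts - 1:
                raise  # Out of retries: surface the error to the caller.
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% random jitter so concurrent sessions do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting. In a Streamlit app, the wrapped `fn` would be the closure that sends the request to the model.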
Accomplishments that we're proud of
I am incredibly proud of successfully surfacing the Thought Signatures. In a safety-critical field like engineering, "Black Box" AI is dangerous. Seeing the Ghost Engineer’s "Internal Thoughts" as it reasons through a mechanical fault provides a level of transparency that could actually be used in real-world factories.
What we learned
I learned that the future of AI isn't just about "chatting"—it's about Agency. Building this project taught me how to bridge the gap between a digital Large Language Model and physical-world mechanical hardware. I also gained deep experience in handling multimodal inputs (text + image) simultaneously within a single reasoning chain.
What's next for The Ghost Engineer
The next step is to integrate Gemini 3’s Multimodal Live API for continuous video auditing. I want the Ghost Engineer to move from a "photo-uploader" to a "live-camera" agent that can walk through a factory floor on a mobile device or AR glasses, providing real-time safety overlays for technicians.
Built With
- gemini3preview
- googleaistudio
- googlegenaisdk
- pillow
- pypdf2
- python
- streamlit