Inspiration

Industrial downtime isn't just expensive; it's a massive information bottleneck. In a crisis, maintenance teams are forced to manually cross-reference high-speed video, 500-page technical manuals, and thousands of rows of sensor data. We were inspired to build Sentinel-3 to act as an "Autonomous Engineer," capable of seeing, reading, and calculating solutions in real time to prevent catastrophic failure before it happens.

What it does

Sentinel-3 is a multimodal AI agent designed to act as a 24/7 "Digital Engineer" for industrial facilities. It doesn't just process text; it fuses three distinct data streams to solve complex maintenance problems:

Visual Inspection: It "watches" equipment video feeds to detect physical anomalies like excessive vibration or smoke.

Deep Technical Knowledge: It "reads" and retains information from 500-page PDF manuals, instantly retrieving specific torque specs, safety clearances, and part numbers.

Data Analytics: It autonomously writes and executes Python code to analyze high-frequency sensor logs (CSV), calculating real-time risk scores based on historical failure thresholds.
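To make the Data Analytics step concrete, here is a minimal sketch of the kind of script the agent might generate for a risk score. The column name `vibration_mm_s`, the 8.0 threshold, the 5-sample window, and the scoring formula (recent mean divided by the failure threshold, capped at 1.0) are all illustrative assumptions, not the scoring logic Sentinel-3 actually uses.

```python
import csv
import io
from statistics import mean


def load_column(csv_text, column):
    """Parse CSV text and return one column as a list of floats."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[column]) for row in reader]


def risk_score(readings, failure_threshold, window=5):
    """Return a 0-1 score: mean of the last `window` readings over the threshold.

    Hypothetical formula for illustration only.
    """
    if not readings:
        raise ValueError("readings must not be empty")
    if failure_threshold <= 0:
        raise ValueError("failure_threshold must be positive")
    recent = readings[-window:]
    return min(mean(recent) / failure_threshold, 1.0)


# Tiny stand-in for a sensor log (assumed schema).
SAMPLE_LOG = """timestamp,vibration_mm_s
0,2.0
1,2.0
2,4.0
3,6.0
4,6.0
"""

readings = load_column(SAMPLE_LOG, "vibration_mm_s")
score = risk_score(readings, failure_threshold=8.0)
print(f"risk score: {score:.2f}")
```

A score near 1.0 would mean recent readings sit at or above the assumed failure threshold; a real deployment would derive that threshold from historical failure data.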

How we built it

We engineered an agentic pipeline using the Gemini 3 Pro and Flash models via the google-genai Python SDK. The architecture focuses on Multimodal Data Fusion:

Vision: Using Gemini's video ingestion to identify mechanical anomalies (e.g., centrifugal pump cavitation).

Long-Context Retrieval: Leveraging the 1M token window to "read" entire PDF maintenance manuals (like the Goulds Pump 3196 Series).

Native Code Execution: Instead of simple pattern matching, the agent writes and executes Python scripts to analyze massive CSV sensor logs (10,000+ rows) directly.

Challenges we ran into

The road to a stable build was paved with technical hurdles:

UI Resilience: We faced persistent "Uncaught Errors" in the AI Studio frontend. We pivoted to a Google Colab-based API architecture, which offered more stability and allowed for deeper debugging.

The Token Wall: Large industrial datasets often triggered 429 RESOURCE_EXHAUSTED errors. We solved this by implementing a "Snippet-to-Full-Scan" strategy: the agent first reads a short snippet of the data (enough to see the schema) to write accurate Python logic, then uses Code Execution to process the full file locally on disk.

Security: We implemented Colab Secrets to ensure API credentials remained protected while keeping the notebook open-source for the judges.
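The "Snippet-to-Full-Scan" idea from the Token Wall item above can be sketched roughly as follows. This is an illustration under assumptions, not the notebook's actual code: the helper names, the prompt wording, the column names, and the threshold are hypothetical, and `full_scan` is a hand-written stand-in for the script the model would generate from the snippet. No API call is made here.

```python
import csv
import os
import tempfile
from itertools import islice


def schema_snippet(path, n_rows=3):
    """Return the header plus the first n_rows data lines as prompt text."""
    with open(path) as f:
        return "".join(islice(f, n_rows + 1))


def build_prompt(snippet):
    """Only the small snippet goes to the model, never the full file."""
    return (
        "Here are the first rows of a sensor log CSV:\n"
        f"{snippet}\n"
        "Write Python that reads the full file from disk and flags anomalies."
    )


def full_scan(path, column, threshold):
    """Stand-in for model-written code: stream the full file row by row."""
    flagged = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if float(row[column]) > threshold:
                flagged.append(row["timestamp"])
    return flagged


# Build a synthetic 10,000-row log with a single spike at t=7777.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "sensor_log.csv")
    with open(log_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "vibration_mm_s"])
        for t in range(10_000):
            writer.writerow([t, 9.5 if t == 7_777 else 2.0])

    snippet = schema_snippet(log_path)
    prompt = build_prompt(snippet)
    flagged = full_scan(log_path, "vibration_mm_s", threshold=8.0)

print(f"prompt size: {len(prompt)} characters")
print(f"flagged timestamps: {flagged}")
```

The prompt stays a few hundred characters regardless of how many rows the log has, which is the point of the strategy: token usage depends on the schema, not the file size.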

Accomplishments that we're proud of

The "Agency" Pivot: When the standard AI Studio UI faced backend outages, we didn't stop. We successfully transitioned the entire project to a custom API-driven architecture in Google Colab using the latest google-genai SDK.

Token Optimization: We developed a "Snippet-to-Full-Scan" strategy. It lets the model work with massive datasets (10,000+ rows) without hitting API rate limits, because the model reasons from the schema first and only the generated code touches the full file.

Thinking in Depth: We successfully implemented Thinking Level: HIGH, enabling the agent to provide a "Chain of Thought" reasoning log. This transparency is crucial for high-stakes industrial environments where engineers need to see why a decision was made.

Code-on-the-Fly: We achieved a 100% success rate in having the AI write and execute its own diagnostic scripts, finding anomalies that a human analyst might take hours to spot.

What we learned

This project pushed our understanding of agentic AI. We learned that the power of Gemini 3 isn't just in answering questions, but in its ability to reason about which tool to use. We discovered that "Thinking Mode" is the bridge between raw data and actionable engineering decisions.

What's next for Sentinel-3

Edge Integration: Porting the Sentinel-3 backend to NVIDIA Jetson or similar edge devices for real-time, on-site vibration analysis without needing a constant cloud connection.

Live Sensor Streaming: Transitioning from static CSV uploads to a live MQTT/WebSocket stream, allowing the agent to "monitor the pulse" of a factory floor in real time.

Augmented Reality (AR) HUD: Developing a prototype interface for Google Glass or mobile devices where technicians can point their camera at a machine and see Gemini’s maintenance "Thoughts" overlaid on the hardware.

Multi-Agent Coordination: Expanding to a swarm of agents where one monitors safety, another manages spare part inventory, and a third coordinates the repair schedule based on manual specs.
