RigEye Project Submission

Inspiration

The project addresses the critical needs of oil rig workers, who operate in one of the most hostile environments on Earth. These workers are often disconnected from the cloud, surrounded by deafening noise, and responsible for high-pressure machinery where a single wrong decision can be fatal. Currently, they rely on manual gauge checks and massive 4,000-page PDF manuals that are impossible to search quickly during an emergency.

What it does

RigEye is an "Offline Safety Oracle" for the industrial edge that uses a Multimodal Small Language Model (SLM) to give workers super-human awareness.

  • Visual Diagnosis: A worker points their tablet at a scene, and RigEye identifies the equipment, diagnoses the problem (e.g., pressure spike, corrosion, missing safety gear), and retrieves the exact protocol from a local vector database.
  • Zero-Latency Alerts: It returns a "Safe/Unsafe" verdict entirely on-device, without sending a single byte of data to the cloud. This removes network round-trip latency and keeps all data private.
  • Key Modules:
    • The Gauge Reader: Identifies valve types, reads pressure (e.g., 850 PSI), checks local safety limits, and alerts the worker if limits are exceeded.
    • The Corrosion Hunter: Classifies rust severity and auto-drafts maintenance tickets.
    • The PPE Enforcer: Detects missing gear (like H2S monitors) and locks work permits until the gear is visible.
    • The RAG Agent: Answers questions about equipment manuals using retrieval-augmented generation.
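
As an illustration of the Gauge Reader's limit check, here is a minimal sketch in Python. It assumes the vision model has already produced a valve type and a pressure value. The `SafetyLimit` class, the `LOCAL_LIMITS` table, and its numbers are hypothetical placeholders, not RigEye's actual data or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyLimit:
    """Allowed operating range for one valve type, in PSI (illustrative values)."""
    min_psi: float
    max_psi: float


# Hypothetical on-device limits table; real limits would come from site documentation.
LOCAL_LIMITS = {
    "gate_valve": SafetyLimit(min_psi=100.0, max_psi=800.0),
    "ball_valve": SafetyLimit(min_psi=50.0, max_psi=1200.0),
}


def check_reading(valve_type: str, pressure_psi: float) -> str:
    """Return "SAFE", "UNSAFE", or "UNKNOWN" for a single gauge reading."""
    limit = LOCAL_LIMITS.get(valve_type)
    if limit is None:
        # No limit on file: report that explicitly rather than guessing.
        return "UNKNOWN"
    if limit.min_psi <= pressure_psi <= limit.max_psi:
        return "SAFE"
    return "UNSAFE"
```

With these placeholder limits, `check_reading("gate_valve", 850.0)` returns "UNSAFE", which is the case where the worker would be alerted.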

How we built it

  • Architecture: We built a pipeline that processes the Camera Feed and Worker Voice through the RunAnywhere SDK, feeding into a Local VLM and Vector Search (Local Vector Store) before outputting via Local TTS.
  • Model Selection: We chose Moondream2 (approx. 1.9B parameters) because it is small enough to run on consumer mobile hardware via the SDK while remaining accurate enough for OCR and scene understanding.
  • Optimization: We used the RunAnywhere SDK to initialize the inference engine, automatically adapting model execution to each device's capabilities, including mobile NPUs and GPUs.
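
The vector search step in the pipeline above can be sketched as a cosine-similarity lookup over locally stored embeddings. This is only an illustration in plain Python: the two-dimensional vectors, the `SAMPLE_STORE` contents, and the function names are made up for the example, and a real deployment would use embeddings from a local model and whatever vector store the app ships with.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=1):
    """Return the texts of the k stored chunks most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy store of (protocol text, embedding) pairs; both are hypothetical.
SAMPLE_STORE = [
    ("Close the upstream valve before venting.", [1.0, 0.0]),
    ("Replace the H2S monitor battery.", [0.0, 1.0]),
]
```

Calling `top_k([0.9, 0.1], SAMPLE_STORE)` selects the first protocol, because the query vector points almost entirely along that chunk's embedding.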

Challenges we ran into

  • Memory Management: Running complex models on-device risks crashes. We had to implement careful memory management and automatic optimization to avoid out-of-memory errors and keep the app responsive.
  • Structured Output: To ensure the local model provided usable data for the app, we had to pass in a JSON schema to force the model to output only valid JSON.
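
Even when the model is constrained by a schema, an app-side check on the parsed result can be a useful safety net. The sketch below uses only Python's standard `json` module. The field names (`equipment`, `reading_psi`, `verdict`) are hypothetical and not taken from RigEye's actual schema.

```python
import json

# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS = {
    "equipment": str,
    "reading_psi": (int, float),
    "verdict": str,
}
ALLOWED_VERDICTS = {"SAFE", "UNSAFE"}


def parse_model_output(raw: str):
    """Return the parsed dict if it matches the expected shape, otherwise None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for name, expected_type in REQUIRED_FIELDS.items():
        value = data.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            return None
    if data["verdict"] not in ALLOWED_VERDICTS:
        return None
    return data
```

A caller can treat a `None` result as a signal to retry the inference or to fall back to a manual check.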

Accomplishments that we're proud of

  • Total Privacy (Air-Gapped): We created a fully air-gapped environment where proprietary imagery and voice logs never leave the device, supporting compliance with strict industrial security standards and data residency laws.
  • Single Model Efficiency: Instead of training separate models for gauges, rust, etc., we successfully utilized a single "General Purpose Vision" VLM to solve multiple distinct problems simply by changing the system prompt.
  • Disaster Prevention: We built a system that can help avert disaster using only the compute in a worker's pocket.
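
The single-model approach described above, where the task changes only through the system prompt, might look roughly like the following sketch. The prompt texts, task keys, and request shape are all hypothetical. The actual call into the inference engine is deliberately left out, because the SDK's API is not shown in this write-up.

```python
# Hypothetical system prompts, one per module; the same model would serve all of them.
SYSTEM_PROMPTS = {
    "gauge": "Identify the valve type and read the pressure gauge. Reply in JSON.",
    "corrosion": "Classify the rust severity visible in the image. Reply in JSON.",
    "ppe": "List any required safety gear that is not visible. Reply in JSON.",
}


def build_request(task: str, image_ref: str) -> dict:
    """Pair one image with the system prompt for the chosen task."""
    if task not in SYSTEM_PROMPTS:
        raise ValueError(f"unknown task: {task}")
    return {"system": SYSTEM_PROMPTS[task], "image": image_ref}
```

Adding a new module under this design means adding one prompt entry rather than training and shipping another model.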

What we learned

  • VLM Versatility: We learned that a Multimodal SLM can replace multiple specialized models, solving distinct problems like corrosion detection and PPE enforcement with one architecture.
  • Security Value: The project reinforced our view that uploading rig infrastructure photos to cloud-based AI is a serious security risk, which makes on-device inference essential for high-value industrial targets.

What's next for RigEye

  • Expanding Use Cases: We plan to further develop the "Corrosion Hunter" and "PPE Enforcer" modules to cover more complex safety scenarios.
  • Full Deployment: We aim to realize the vision of a completely offline "Safety Oracle" that can be deployed to the most remote industrial sites without compromising data security.

Built With

  • moondream2
  • runanywhere
  • tts
  • vectordb