💡 Inspiration
Industrial accidents and manufacturing errors cost the global economy billions of dollars every year. In high-stakes environments like pharmaceutical manufacturing, a single broken capsule or a worker missing a glove can ruin an entire batch or, worse, endanger patient safety.
Many factories still rely on standard CCTV cameras, but these are passive systems that need a human guard watching monitors around the clock. This approach is expensive and fundamentally flawed: no one can stay fully focused for an entire shift. Fatigue leads to missed errors, and missed errors lead to accidents.
We wanted to solve this by building a system that doesn't just record video, but actually understands it.
🤖 What it does
VisionOps is an intelligent safety sentinel that turns standard security cameras into autonomous guardians. Powered by Google's Gemini 3 Pro, it analyzes live video feeds in real time to detect hazards, safety violations, and quality control issues.
Key capabilities include:
- Real-Time Hazard Detection: Instantly identifies workers without PPE (helmets, masks), chemical spills, fires, or machinery malfunctions.
- Instant Multi-Channel Alerts: When a hazard is detected, the dashboard flashes a "Red Alert" and immediately sends an SMS via Twilio to the safety manager.
- Automated Compliance: The system logs every incident and generates downloadable PDF Incident Reports for legal and safety audits.
- Interactive 3D Dashboard: A futuristic, responsive command center built with Streamlit and Vanta.js for real-time monitoring.
⚙️ How we built it
The project is built on a robust Python backend with a modern Streamlit frontend.
- The Brain (Gemini 3 Pro): We stream video frames from the camera directly to the Gemini 3 Multimodal API. We engineered a specialized system prompt that forces the model to act as a "Factory Safety Officer" and output strict JSON data containing the safety status, confidence score, and specific violation details.
- The Eyes (OpenCV): We use OpenCV to capture and preprocess high-fidelity video feeds before sending them for inference.
- The Voice (Twilio): We integrated the Twilio API to bridge the digital and physical worlds. The moment Gemini detects a high-confidence threat, a server-side trigger dispatches an SMS alert.
- The UI (Streamlit + Vanta.js): We pushed the limits of Streamlit by injecting custom JavaScript to create a reactive "Neural Network" 3D background that visually represents the AI thinking in real time.
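To make the hand-off from "The Brain" to "The Voice" concrete, here is a minimal sketch of validating the model's JSON reply and deciding whether to alert. The field names (`status`, `confidence`, `violations`), the allowed status values, and the 0.8 threshold are assumptions for illustration; the write-up only says the JSON carries a safety status, a confidence score, and violation details. The Gemini and Twilio calls themselves are left out.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed cut-off for what counts as a "high-confidence" threat.
ALERT_THRESHOLD = 0.8


@dataclass
class Verdict:
    """Hypothetical shape of one model reply (field names are assumptions)."""
    status: str                      # assumed values: "SAFE" or "HAZARD"
    confidence: float                # assumed range: 0.0 to 1.0
    violations: List[str] = field(default_factory=list)


def parse_verdict(raw: str) -> Optional[Verdict]:
    """Return a Verdict, or None if the reply does not match the schema."""
    try:
        data = json.loads(raw)
        status = str(data["status"]).upper()
        confidence = float(data["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return None
    # Reject unknown statuses and out-of-range (or NaN) confidences.
    if status not in {"SAFE", "HAZARD"} or not 0.0 <= confidence <= 1.0:
        return None
    violations = data.get("violations", [])
    if not isinstance(violations, list):
        return None
    return Verdict(status, confidence, [str(v) for v in violations])


def should_alert(verdict: Optional[Verdict]) -> bool:
    """Alert only on a well-formed, high-confidence hazard."""
    return (
        verdict is not None
        and verdict.status == "HAZARD"
        and verdict.confidence >= ALERT_THRESHOLD
    )
```

Treating a malformed reply as "no alert" keeps a bad response from sending a false SMS. A real deployment might instead retry the frame or log the failure.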
🧠 Challenges we ran into
- Prompt Engineering for Consistency: Getting a Large Language Model to output structured JSON 100% of the time was difficult. We spent hours refining the system prompt to cut down on hallucinated detections so that alerts fire only on genuine hazards.
- Real-Time Latency: Sending video frames to the cloud takes time. We tuned our frame-skipping logic and image compression to balance detection speed against API bandwidth.
- Streamlit Customization: Streamlit is great for data apps but limited for custom UI. We had to learn how to inject raw HTML and CSS to get the "Cyberpunk" aesthetic and the 3D background working correctly.
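The frame-skipping idea from the latency challenge can be sketched as a time-based throttle: forward a frame for inference only if enough time has passed since the last one sent. This is an illustrative sketch, not the project's actual logic. The function name `sample_frames` and the `min_interval_s` parameter are hypothetical, and frames are modeled as `(timestamp, frame)` pairs so the example runs without a camera.

```python
from typing import Iterable, Iterator, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_frames(
    frames: Iterable[Tuple[float, Frame]],
    min_interval_s: float,
) -> Iterator[Tuple[float, Frame]]:
    """Yield only frames at least `min_interval_s` seconds apart.

    `frames` is any iterable of (timestamp_in_seconds, frame) pairs.
    Frames arriving sooner than the interval are skipped, which caps
    the number of API calls per second.
    """
    last_sent: Optional[float] = None
    for timestamp, frame in frames:
        if last_sent is None or timestamp - last_sent >= min_interval_s:
            last_sent = timestamp
            yield timestamp, frame


# Usage: a 10 fps feed throttled to one frame per second.
feed = [(i / 10, f"frame-{i}") for i in range(30)]
kept = [name for _, name in sample_frames(feed, min_interval_s=1.0)]
```

A larger interval lowers bandwidth and cost but lengthens the worst-case time before a hazard is seen, so the value is a trade-off rather than a fixed constant.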
🏆 Accomplishments that we're proud of
- End-to-End Automation: We successfully built a pipeline that goes from Visual Input → AI Understanding → Physical Alert (SMS) → Digital Record (PDF) without any human intervention.
- The "Wow" Factor: We are particularly proud of the UI. It looks and feels like a professional enterprise software product.
- Practical Utility: This isn't just a toy; it solves a real, expensive problem that exists in thousands of factories today.
🚀 What's next for VisionOps
While we optimized VisionOps for industrial safety, the underlying architecture is universal.
- Scalability: By simply updating the system prompt, this same code can be adapted to detect shoplifters in retail stores, monitor playground safety in schools, or secure bank vaults.
- Edge Deployment: We plan to explore distilling the model to run on edge devices (like Raspberry Pi) for environments with poor internet connectivity.
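One way to picture the prompt-swapping idea from the Scalability point is a small table of profiles that feeds a single prompt template. Everything here is hypothetical: the profile names, the hazard lists, and the prompt wording are placeholders for illustration, not the system prompt used in VisionOps.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Profile:
    """Hypothetical per-domain settings that shape the system prompt."""
    role: str
    hazards: Tuple[str, ...]


# Placeholder profiles; a real deployment would define its own.
PROFILES: Dict[str, Profile] = {
    "factory": Profile(
        role="Factory Safety Officer",
        hazards=("missing PPE", "chemical spill", "fire", "machinery malfunction"),
    ),
    "retail": Profile(
        role="Retail Loss Prevention Officer",
        hazards=("shoplifting",),
    ),
}


def build_system_prompt(profile_name: str) -> str:
    """Fill one shared template with the chosen profile's role and hazards."""
    profile = PROFILES[profile_name]
    hazard_list = ", ".join(profile.hazards)
    return (
        f"You are a {profile.role}. "
        f"Inspect each image for: {hazard_list}. "
        "Reply with JSON only."
    )
```

With this layout, switching domains means adding a profile entry while the capture, parsing, and alerting code stays unchanged.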
VisionOps isn't just a project; it's the future of autonomous industrial safety.