Argus: Real-Time Multimodal Vision for Lab Safety Compliance

System diagram

Inspiration

As scientists who work in wet labs every day, we realized that while AI is transforming fields like finance and software engineering, the physical laboratory remains a "dark zone." We use cutting-edge tools such as CRISPR and mass spectrometry, yet our safety compliance still relies on human memory and manual checklists. In a high-pressure lab environment, it is remarkably easy to leave a tube unlabeled or forget to close a hazardous waste lid. We built Argus to be the "all-seeing" eye: the first AI-driven safety guardian designed by scientists, for scientists, to ensure that no health and safety protocol is overlooked.

What it does

Argus is a real-time monitoring system that turns any camera, from a laptop webcam to XREAL AR glasses, into a proactive safety officer. The user enters specific regulatory concerns (e.g., "detect unlabeled Eppendorf tubes" or "detect open chemical containers"). The app then captures images at a set interval, analyzes each one with Gemini 3, and pushes any resulting alerts to a message queue. Lab managers and users receive instant, time-stamped notifications of safety breaches, allowing for immediate corrective action before accidents happen.
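The capture, analyze, and alert loop described above can be sketched roughly as follows. This is a minimal illustration, not our production code: the `Alert` fields, the `monitor` function, and the `fake_capture` / `fake_analyze` stand-ins are all hypothetical names, and the analyzer is a stub where a real deployment would call the vision model. An in-process `queue.Queue` stands in for the message queue.

```python
import queue
import time
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Tuple


@dataclass
class Alert:
    timestamp: str  # ISO 8601 time in UTC
    concern: str    # the user-supplied concern that was flagged
    detail: str     # free-text explanation returned by the analyzer


def monitor(capture: Callable[[], bytes],
            analyze: Callable[[bytes, List[str]], List[Tuple[str, str]]],
            concerns: List[str],
            alerts: "queue.Queue[Alert]",
            interval_s: float,
            max_frames: int) -> None:
    """Capture a frame every interval_s seconds and enqueue any findings."""
    for i in range(max_frames):
        frame = capture()
        for concern, detail in analyze(frame, concerns):
            stamp = datetime.now(timezone.utc).isoformat()
            alerts.put(Alert(stamp, concern, detail))
        if i < max_frames - 1:
            time.sleep(interval_s)  # wait before the next capture


# Stand-ins so the sketch runs without a camera or an API key.
def fake_capture() -> bytes:
    return b"frame"


def fake_analyze(frame: bytes, concerns: List[str]) -> List[Tuple[str, str]]:
    # Assumption for illustration: the first concern is always flagged.
    return [(concerns[0], "example finding")] if concerns else []
```

Passing the capture and analysis steps in as callables keeps the loop independent of the camera source, which is roughly how the same flow can serve both a webcam and AR glasses.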

How we built it

Our team of three integrated a full-stack vision pipeline:

Front End: Shu built the UI using Flutter, enabling a cross-platform experience that handles high-frequency image capture and live feedback.

Backend: Keith architected a Dockerized backend that orchestrates calls to the Gemini 3 API. We implemented a message queue to handle the flow of safety alerts from the model back to the user interface.

Hardware: Fei successfully ported the visual feed and alert system to XREAL glasses, allowing for a hands-free "heads-up display" (HUD) that keeps scientists' hands in the biosafety hood and their eyes on the experiment.

Challenges we ran into

The optical complexity of a wet-lab environment presents a unique computer vision problem. Unlike typical indoor scenes, laboratories are filled with transparent plastics, glassware, and liquid menisci that refract and distort light. Because many containers are clear and labels are small, the visual signal that separates a compliant item from a violation is weak relative to the background clutter. We plan to capture more lab images and videos to improve Gemini 3's few-shot performance for real-time experimental guidance.

Accomplishments that we're proud of

We successfully bridged the gap between the physical lab bench and AI, creating a functional end-to-end pipeline that translates visual hazards into real-time digital alerts. A major highlight was porting the Argus interface to XREAL AR glasses, proving that we can deliver critical safety notifications through a hands-free heads-up display.

What we learned

We gained significant experience in dynamic photo and video processing, learning when to scale resolution up for fine details—like reading tiny labels—or down for general safety oversight to minimize latency. By experimenting with thinking levels and sampling rates, we moved beyond simple detection into visual reasoning, discovering how to configure the model to handle the unique transparency and complexity of a wet-lab environment.
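The resolution trade-off can be illustrated with a small sketch. The keyword list and pixel budgets below are hypothetical values chosen only for illustration, not the settings we tuned; the idea is simply to send a larger image when the concern involves fine detail and a smaller one otherwise.

```python
from typing import Tuple

# Hypothetical keyword hints and pixel budgets, for illustration only.
FINE_DETAIL_HINTS = ("label", "text", "barcode", "expiry")
LOW_RES_LONG_EDGE = 512
HIGH_RES_LONG_EDGE = 2048


def target_long_edge(concern: str) -> int:
    """Pick a long-edge pixel budget from the wording of the concern."""
    needs_detail = any(hint in concern.lower() for hint in FINE_DETAIL_HINTS)
    return HIGH_RES_LONG_EDGE if needs_detail else LOW_RES_LONG_EDGE


def scaled_size(width: int, height: int, long_edge: int) -> Tuple[int, int]:
    """Shrink (never enlarge) to fit long_edge, preserving aspect ratio."""
    current = max(width, height)
    if current <= long_edge:
        return width, height
    scale = long_edge / current
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 4032 x 3024 capture checked for "open chemical containers" would be reduced to 512 x 384 under these assumed budgets, while a label-related concern would keep more pixels at the cost of latency.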

What's next for Argus: Real-Time Multimodal Vision for Lab Safety Compliance

Our next step is transitioning from interval-based photos to continuous video-stream ingestion, leveraging Gemini 3’s high frame-rate sampling to detect dynamic hazards like spills. We plan to enable complex natural language directives, allowing users to define sophisticated spatial rules—such as monitoring the distance between flammables and heat sources—while implementing dynamic resolution scaling for high-fidelity "deep dives" into small, transparent labware. Ultimately, by grounding the model in specific lab SOPs, Argus will evolve from a simple monitor into a protocol-aware assistant that understands the unique nuances of every scientific bench.
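As a rough illustration of what a spatial rule might look like once detections are available, the sketch below flags flammables that sit too close to a heat source. Everything here is an assumption: the `Detection` type, the label strings, and the premise that object positions have already been mapped to centimetres on the bench plane (camera calibration is out of scope for this sketch).

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    label: str  # hypothetical labels: "flammable" or "heat_source"
    x: float    # object centre on the bench plane, in centimetres
    y: float


def proximity_violations(
    detections: List[Detection], min_distance_cm: float
) -> List[Tuple[Detection, Detection, float]]:
    """Return (flammable, heat_source, distance) for each pair that is too close."""
    flammables = [d for d in detections if d.label == "flammable"]
    heat_sources = [d for d in detections if d.label == "heat_source"]
    violations = []
    for f in flammables:
        for h in heat_sources:
            dist = math.hypot(f.x - h.x, f.y - h.y)  # Euclidean distance
            if dist < min_distance_cm:
                violations.append((f, h, dist))
    return violations
```

A natural language directive such as "keep flammables at least 30 cm from heat sources" would then need to be translated into a call like `proximity_violations(detections, 30.0)`; how reliably a model can produce both the detections and the threshold is still an open question for us.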