Inspiration

We wanted to build a system that makes environmental action immediate.

Many sustainability tools explain pollution, waste, or climate issues at a high level, but they do not help in the exact moment when a person is standing in front of litter or recyclable waste and wondering what to do. We were especially inspired by the idea that the same object can matter differently depending on the environmental conditions around it.

That led us to Aeris: a real-time environmental intelligence system that sees waste through live computer vision, combines it with air-quality context, and gives people a clear action they can take immediately.

What it does

Aeris detects waste objects such as cans, paper, and bottles in real time through a live camera feed. Once an object is detected, Aeris combines that detection with environmental context, including CASTNET air-quality data and live contextual signals, and generates a short recommendation about what the user should do next.

The system is designed to answer a practical question:

“What is this object, and what should I do with it right now?”

In the live app, users can:

  • view real-time object detection with bounding boxes
  • upload short video clips for detection
  • receive environmental recommendations beside the live video
  • use a custom fine-tuned waste model instead of relying only on generic classes

How we built it

We built Aeris as a Streamlit-first system so the main experience could stay fast, visual, and easy to demo live.

Our stack included:

  • Python
  • Streamlit for the live camera interface
  • YOLO / Ultralytics for object detection
  • FastAPI for backend endpoints and data plumbing
  • Gemini API for environmental reasoning and recommendation generation
  • CASTNET dataset for air-quality context and environmental signal grounding
  • OpenCV / streamlit-webrtc / PyAV for real-time and uploaded-video processing

On the vision side, we started with generic YOLO detection, then moved to a custom fine-tuned model for our target waste classes: can, paper, and bottle. We converted annotated COCO-style data into YOLO format, cleaned corrupted images, organized train/validation splits, and trained custom checkpoints remotely.
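
The COCO-to-YOLO conversion step can be sketched roughly as follows. This is a minimal illustration, not our exact script: the function names and the `class_ids` mapping are hypothetical, and it assumes the standard COCO box layout of `[x_min, y_min, width, height]` in pixels and the YOLO label layout of a class index followed by a normalized center point, width, and height.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to
    YOLO (x_center, y_center, width, height), normalized to [0, 1]."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,
        (y_min + h / 2) / img_h,
        w / img_w,
        h / img_h,
    )


def coco_to_yolo_lines(coco, class_ids):
    """Emit one YOLO label line per annotation, grouped by image file.

    `coco` is a dict in COCO layout with "images" and "annotations" keys.
    `class_ids` (hypothetical) maps COCO category_id -> zero-based YOLO index.
    Returns {file_name: [label_line, ...]}.
    """
    images = {img["id"]: img for img in coco["images"]}
    labels = {img["file_name"]: [] for img in coco["images"]}
    for ann in coco["annotations"]:
        img = images[ann["image_id"]]
        cls = class_ids[ann["category_id"]]
        xc, yc, w, h = coco_bbox_to_yolo(ann["bbox"], img["width"], img["height"])
        labels[img["file_name"]].append(f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return labels
```

Each list of lines would then be written to a `.txt` file that shares its image's base name, which is the layout YOLO training expects.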

On the intelligence side, we built a recommendation layer that uses Gemini as the primary LLM path. Instead of only naming an object, Aeris explains why that object matters environmentally and suggests a specific action.
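
As a rough sketch of how a detection and its air-quality context might be combined before being sent to the LLM, the snippet below only assembles the prompt text. The `Detection` class, the `build_prompt` helper, and the reading keys are all hypothetical names chosen for illustration; the actual Gemini call is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # e.g. "can", "paper", "bottle"
    confidence: float  # detector score in [0, 1]


def build_prompt(detection, air_quality):
    """Assemble a short, grounded prompt from one detection and a dict of
    air-quality readings (hypothetical keys; values are passed through as-is)."""
    readings = ", ".join(f"{k}={v}" for k, v in sorted(air_quality.items()))
    return (
        f"Detected object: {detection.label} "
        f"(confidence {detection.confidence:.2f}).\n"
        f"Local air-quality context: {readings or 'unavailable'}.\n"
        "In two sentences or fewer, explain why this object matters "
        "environmentally here and give one specific action to take now."
    )
```

Capping the answer length in the prompt itself is one way to keep the side-panel recommendation short enough to read at a glance.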

We also optimized the real-time experience by reducing inference size, tuning frame skipping, using tracking for smoother boxes, and cleaning up the UI so the video only shows the object label and bounding box while the reasoning stays in the side panel.
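
Frame skipping, one of the optimizations above, can be illustrated with a small wrapper that runs the detector on every Nth frame and reuses the latest result in between. This is a simplified sketch under our own assumptions: `FrameSkipper`, `detect`, and `stride` are hypothetical names, and `detect` stands in for whatever callable wraps the model.

```python
class FrameSkipper:
    """Run an expensive detector on every `stride`-th frame and reuse the
    most recent result for the frames in between."""

    def __init__(self, detect, stride=3):
        if stride < 1:
            raise ValueError("stride must be >= 1")
        self.detect = detect  # callable: frame -> list of detections
        self.stride = stride
        self._count = 0       # number of frames seen so far
        self._last = []       # detections from the last inferred frame

    def process(self, frame):
        # Frames 0, stride, 2*stride, ... trigger a fresh inference.
        if self._count % self.stride == 0:
            self._last = self.detect(frame)
        self._count += 1
        return self._last
```

A larger `stride` raises the displayed frame rate at the cost of boxes that lag behind fast-moving objects, which is part of why tracking helps smooth the result.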

Challenges we ran into

One of our biggest challenges was that generic object detection was not enough for our use case. Off-the-shelf models often missed crumpled paper, confused bottles and cans, or failed on cluttered scenes.

We also ran into several engineering issues while training and integrating the custom model:

  • dataset formatting and conversion issues
  • corrupted JPEGs inside training data
  • dependency issues during remote training
  • model path mismatches
  • slow live inference and low FPS
  • duplicate UI information between the live video and the app panel
  • provider import issues while wiring Gemini into the recommendation layer
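
For the corrupted-JPEG issue in the list above, a cheap first-pass filter is to check that each file starts and ends with the JPEG start-of-image and end-of-image markers. The sketch below is a heuristic rather than the exact check we ran, and the helper names are hypothetical. It catches truncated files but may also flag valid files that carry trailing bytes, so a full decode with an image library would be a more thorough follow-up.

```python
from pathlib import Path


def looks_like_complete_jpeg(data: bytes) -> bool:
    """Cheap structural check: a JPEG stream starts with the SOI marker
    (FF D8) and ends with the EOI marker (FF D9)."""
    return len(data) >= 4 and data[:2] == b"\xff\xd8" and data[-2:] == b"\xff\xd9"


def find_suspect_jpegs(folder):
    """Return paths of .jpg/.jpeg files under `folder` that fail the check."""
    suspects = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".jpg", ".jpeg"}:
            continue
        if not looks_like_complete_jpeg(path.read_bytes()):
            suspects.append(path)
    return suspects
```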

Another challenge was balancing speed and clarity. We wanted the system to feel real-time, but also wanted the recommendations to be meaningful and grounded in environmental context rather than feeling like a generic chatbot response.

Accomplishments that we're proud of

We are proud that Aeris became a working end-to-end environmental intelligence system instead of just a detector demo.

Some highlights:

  • fine-tuned a custom waste detection model for can, paper, and bottle
  • integrated live computer vision with environmental context
  • made Gemini part of the core product behavior, not just a one-off demo call
  • built a real-time Streamlit interface with live detection and uploaded-video support
  • improved model quality significantly through annotation cleanup and retraining
  • created a system that connects physical objects to immediate environmental action

We are especially proud that the project combines three things meaningfully:

  1. live vision
  2. real environmental data
  3. actionable AI reasoning

What we learned

We learned that the hardest part of an AI project is not calling a model; it is making multiple systems work together reliably in real time.

We learned a lot about:

  • preparing and cleaning detection datasets
  • fine-tuning YOLO for a narrow domain
  • designing a fast real-time vision loop
  • keeping AI output useful instead of overly verbose
  • using environmental data to support product decisions and recommendations
  • making a live demo feel focused and believable

We also learned that good AI UX often means showing less on screen, not more. Cleaning up the live display made the system easier to trust and easier to understand.

What's next for Aeris - The Environmental Intelligence System

Our next step is expanding Aeris from a small custom waste detector into a broader environmental response assistant.

Planned next steps:

  • add more waste and pollution-related classes such as wrappers, bags, bins, and contaminated materials
  • improve long-range and multi-object detection performance
  • build a stronger environmental insights dashboard on top of CASTNET trends
  • personalize recommendations by location and municipal recycling rules
  • support historical reporting and impact tracking
  • deploy Aeris publicly so it can be used in schools, campuses, and public spaces

Long term, we see Aeris as an environmental intelligence layer that helps people make better decisions in real time, not just after the damage is already done.

Built With

  • castnet-dataset
  • coco-annotations
  • fastapi
  • gemini-api
  • modal
  • opencv
  • pyav
  • python
  • pytorch
  • streamlit
  • streamlit-webrtc
  • ultralytics-yolo