Why We Built This

Urban waste management is a growing challenge in cities worldwide, and a glance around downtown Toronto makes its relevance clear. At the current disposal rate of more than 400,000 tonnes per year, the city projects that the Green Lane Landfill has roughly 10 years of capacity remaining. Beyond that urgency, traditional cleanup efforts are reactive and inefficient: by the time trash is reported and collected, it has already caused environmental damage and posed community health risks. Improper sorting compounds the problem, contaminating recycling streams at rates as high as 27-29%. Inspired by the beloved Pixar movie WALL-E, we built a robot that shifts waste management from reactive to proactive: an autonomous system that detects, classifies, and maps trash in real time, enabling immediate community response while ensuring proper waste categorization.

What Wall-E Does

Wall-E is an end-to-end trash detection and management platform consisting of four integrated components:

  1. Physical Robot Component
  2. Vision System (YOLOv8): A real-time computer vision system that combines YOLO object detection with Google Gemini AI for intelligent trash identification.
  3. Cloud Backend (aiBackend.py): A Flask-based API server that orchestrates AI analysis and data management.
  4. Interactive Dashboard (webapp.html): A responsive web interface for visualization and action, including a live map with trash detection markers powered by Leaflet.js, AI-generated insights panel showing patterns and recommendations, real-time statistics (total items found, recyclable count, locations), and more.
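The cloud backend's role can be sketched as a minimal Flask endpoint. The route name, field names, and file path below are illustrative stand-ins, not the actual aiBackend.py API:

```python
import json
import time

from flask import Flask, request, jsonify

app = Flask(__name__)
DB_PATH = "detections.json"  # hypothetical path for the JSON file database


@app.route("/api/detections", methods=["POST"])  # illustrative route name
def add_detection():
    # Metadata arrives as form fields; in the real flow the image bytes
    # would come via request.files and be passed to Gemini for analysis.
    record = {
        "timestamp": time.time(),
        "lat": float(request.form.get("lat", 0)),
        "lon": float(request.form.get("lon", 0)),
        "label": request.form.get("label", "unknown"),
    }
    try:
        with open(DB_PATH) as f:
            detections = json.load(f)
    except FileNotFoundError:
        detections = []
    detections.append(record)
    with open(DB_PATH, "w") as f:
        json.dump(detections, f, indent=2)
    return jsonify({"status": "ok", "count": len(detections)}), 201
```

The dashboard then reads the same JSON store through a companion GET route to populate the Leaflet map.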

Technical Stack

Computer Vision & AI:

  • YOLOv8 - Real-time object detection
  • Google Gemini 2.0 Flash - Vision analysis and natural language insights
  • OpenCV - Image processing and manipulation
  • NumPy - Numerical operations

Backend:

  • Python 3.x
  • Flask - Web server framework
  • Flask-CORS - Cross-origin resource sharing
  • Pillow (PIL) - Image handling
  • Requests - HTTP client

Frontend:

  • HTML5/CSS3/JavaScript
  • Leaflet.js - Interactive mapping
  • Leaflet.heat - Heatmap visualization
  • Modern CSS (gradients, glassmorphism, animations)
  • Responsive design for mobile and desktop

Data & Storage:

  • JSON file-based database for detections
  • Local image storage with organized directory structure
  • RESTful API architecture

How It Works

  1. Detection: Camera/robot captures images of the environment
  2. Local Processing: YOLO quickly identifies if objects are present
  3. Cloud Analysis: Promising frames are uploaded to the backend
  4. AI Classification: Gemini Vision analyzes the image and identifies specific trash items, materials, and disposal categories
  5. Storage: Detection data (with GPS, timestamp, image, AI results) is saved
  6. Visualization: Dashboard displays new detections on the map with detailed information
  7. Insights: Gemini analyzes patterns across all detections to provide strategic cleanup recommendations
  8. Action: Users can navigate to locations and mark them as cleaned
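The gating logic in steps 2-3 can be sketched as follows. The confidence threshold and the `upload` callable are assumptions; in the real system, YOLOv8 produces the local detections and the upload is an HTTP POST of the frame (with GPS and timestamp) to the Flask backend, which runs Gemini classification:

```python
from typing import Callable, Optional

CONFIDENCE_THRESHOLD = 0.5  # assumed gate for "promising" frames


def process_frame(detections: list[dict],
                  frame_id: str,
                  upload: Callable[[str], dict]) -> Optional[dict]:
    """Gate a frame on local YOLO results; upload only promising ones.

    `detections` is the local YOLO output for one frame, e.g.
    [{"label": "bottle", "confidence": 0.82}]. `upload` stands in for
    the cloud round-trip that returns the AI classification.
    """
    best = max((d["confidence"] for d in detections), default=0.0)
    if best < CONFIDENCE_THRESHOLD:
        return None  # not worth the cost of a multi-second cloud call
    return upload(frame_id)
```

Running cheap local detection first means the expensive Gemini analysis is reserved for frames that are actually likely to contain trash.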

Challenges We Faced

  1. Integrating the Gemini API for the First Time: None of us had worked with Google’s Gemini Vision API before. Getting it to process image data correctly was harder than expected, especially when dealing with file size limits, authentication issues, and JSON parsing.
  2. Connecting the Camera Feed to the Cloud: Creating a seamless pipeline from local video feeds to cloud processing required balancing speed, accuracy, and reliability.
  3. Managing Latency and Real-Time Constraints: Gemini’s analysis can take several seconds, which doesn’t pair well with a 30+ FPS video stream. Early tests quickly overloaded our backend.
  4. Environment Variables and API Keys: A surprisingly tricky issue came from environment variable persistence. Our API keys worked in one terminal but disappeared when launching scripts elsewhere. We eventually hardcoded them for testing (with a big note to externalize them for production). It’s not perfect, but for a hackathon prototype, it was a practical solution that let us keep moving.
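One way to keep a multi-second Gemini call from being hit 30 times a second is a minimum-interval gate between uploads. This is a sketch of the idea, not our exact implementation, and the five-second interval is an assumed value:

```python
import time


class UploadThrottle:
    """Allow at most one cloud upload every `min_interval` seconds."""

    def __init__(self, min_interval: float = 5.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable clock, useful for testing
        self._last = float("-inf")  # no upload has happened yet

    def allow(self) -> bool:
        """Return True (and record the time) if an upload may proceed now."""
        now = self.clock()
        if now - self._last >= self.min_interval:
            self._last = now
            return True
        return False
```

Frames that arrive while the gate is closed are simply dropped, which is acceptable because consecutive video frames are nearly identical.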

Real-World Impact

Wall-E helps cities and organizations move from cleanup to prevention. By automating trash detection and classification, we can accelerate cleanup response times, reduce recycling contamination, empower data-driven waste policies, and engage communities through transparency and technology.

Future Enhancements

  • Building and deploying robots for real-world testing
  • Database migration from JSON to scalable cloud storage (e.g., MongoDB, PostgreSQL)
  • More extensive predictive modeling to forecast where trash is likely to appear
  • Community engagement tools (leaderboards, verified cleanups, volunteer impact tracking)
  • Municipal integration with local waste management APIs for automated coordination
