Inspiration

Most homeowners don't find out their solar panels are underperforming until they open their electricity bill — months after the problem started. A single degraded 8.4 kW system can silently lose $870 per year, and across San Diego County, this invisible problem costs millions. I wanted to build something that catches faults in seconds, not months.

What I Learned

I learned how to build a fully serverless real-time ML pipeline on AWS — from raw sensor data to a live WebSocket dashboard. I also learned the physics of photovoltaic output — specifically how to model expected energy production using Scripps irradiance, panel temperature, system size, and roof tilt/azimuth. Understanding this let me build a weather-corrected baseline, so I only alert when output is low given actual sunlight, not just low in absolute terms.
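The weather-corrected baseline described above can be illustrated with a simple closed-form model. This is a minimal sketch, not the project's trained model: the function names, the temperature coefficient, the derate factor, and the 0.7 alert ratio are all assumptions chosen for illustration. It also assumes the irradiance reading is already in the plane of the panels, so the tilt/azimuth correction is left out.

```python
# Illustrative constants; none of these values come from the project itself.
STC_IRRADIANCE_W_M2 = 1000.0  # irradiance at standard test conditions
STC_CELL_TEMP_C = 25.0        # cell temperature at standard test conditions
TEMP_COEFF_PER_C = -0.004     # assumed power loss per degree C above 25 C
SYSTEM_DERATE = 0.85          # assumed inverter, wiring, and soiling losses


def expected_kwh(irradiance_w_m2, panel_temp_c, system_kw, interval_minutes=5.0):
    """Estimate energy (kWh) a healthy system should produce in one interval."""
    if irradiance_w_m2 <= 0:
        return 0.0
    irradiance_fraction = irradiance_w_m2 / STC_IRRADIANCE_W_M2
    # Output drops as panels heat up past 25 C and rises slightly below it.
    temp_factor = 1.0 + TEMP_COEFF_PER_C * (panel_temp_c - STC_CELL_TEMP_C)
    power_kw = system_kw * irradiance_fraction * max(temp_factor, 0.0) * SYSTEM_DERATE
    return power_kw * interval_minutes / 60.0


def is_underperforming(actual_kwh, expected, min_ratio=0.7, min_expected_kwh=0.01):
    """Flag output that is low relative to what the available sunlight allows."""
    if expected < min_expected_kwh:
        return False  # too dark to judge (night, heavy overcast)
    return actual_kwh / expected < min_ratio
```

Because the comparison is a ratio against the expected value, a cloudy interval lowers both the numerator and the denominator, so it does not trigger an alert on its own.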

How I Built It

- Parsed real Scripps AWN sensor readings (solar radiation, temperature, humidity, UV index) across multiple San Diego station files
- Crossed each weather snapshot with 50 ZenPower permit specs (system size, tilt, azimuth) to generate training rows
- Trained an XGBoost regression model on SageMaker to predict expected kWh per 5-minute interval
- Built a serverless pipeline: S3 → scorer Lambda → SageMaker endpoint → DynamoDB → DynamoDB Streams → broadcaster Lambda → API Gateway WebSocket → React dashboard
- Deployed everything with AWS CDK (64 cloud resources in a single Python file)

Challenges

- Weather-correcting the anomaly threshold: raw output thresholds fire on every cloudy afternoon. Getting the model to learn the difference between "low output because clouds" and "low output because hardware fault" required careful feature engineering using Scripps irradiance as the primary signal.
- Real-time WebSocket fanout: broadcasting scored readings to all connected dashboard clients through API Gateway required careful connection management in DynamoDB and graceful handling of stale connections.
- Demo reliability: making a live ML pipeline work reliably under hackathon conditions meant building a full replay system so judges can see anomalies fire on demand, not just hope the timing works out.
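The stale-connection handling in the WebSocket fanout can be sketched roughly as follows. This is an assumption about the shape of the broadcaster, not the project's actual Lambda code: `broadcast` and `forget_connection` are hypothetical names, and `client` is assumed to be a boto3 `apigatewaymanagementapi` client, whose `post_to_connection` call raises `GoneException` when a client has disconnected.

```python
import json


def broadcast(client, connection_ids, reading, forget_connection):
    """Send one scored reading to every connection and prune the dead ones.

    client: object exposing post_to_connection and exceptions.GoneException
            (assumed to be a boto3 apigatewaymanagementapi client).
    forget_connection: hypothetical callback that removes a connection ID
            from the connections table.
    Returns the number of connections that received the reading.
    """
    payload = json.dumps(reading).encode("utf-8")
    delivered = 0
    for connection_id in connection_ids:
        try:
            client.post_to_connection(ConnectionId=connection_id, Data=payload)
            delivered += 1
        except client.exceptions.GoneException:
            # The client went away without a clean disconnect; drop its record
            # so later broadcasts do not keep retrying it.
            forget_connection(connection_id)
    return delivered
```

Passing the client and the cleanup callback in as arguments keeps the function testable with a stub, which matters when the real endpoint only exists after deployment.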
