SolarWarden: AI-Powered Solar Farm Operations Assistant

The Challenge: Scaling Solar Farm Operations

System Complexity Issues:

  • Interconnected components (String panels → Inverters → Transformers → Grid) creating single points of failure
  • Overwhelming log volume for each component; pinpointing the failure point is tedious, even with charting tools
  • Average 12-18 alarm events per hour during fault conditions
  • 72% of maintenance teams report "alarm fatigue" from false positives

Financial Impacts:

  • Each hour of downtime costs $2,500-$7,000 for a 50MW facility
  • Manual diagnostics require 3-5 technician hours per incident, excluding the time needed to arrange work orders, collect on-site evidence, and write reports
  • 28% of energy losses are due to delayed fault detection

Pain Points for Solar Farm Owners:

  • Alarm flooding in control centers
  • Manual root cause analysis delays
  • Significant operational downtime costs

Key Challenge: Optimizing the human-AI handoff process for maintenance verification consumed significant development cycles, requiring iterative testing of confirmation protocols and ADK state tracking mechanisms.

Our Solution: SolarWarden

SolarWarden is an agentic AI platform built on Google's Agent Development Kit (ADK) with Gemini integration, delivering:

  1. Automated Anomaly Detection - Continuous monitoring of device logs
  2. Intelligent Diagnostics - Deep analysis of performance metrics
  3. Context-Aware Alarms - Error code interpretation using technical manuals
  4. Executive Reporting - Actionable insights and recommendations

Technical Architecture

[Figure: ADK architecture]

SolarWarden's multi-agent system features:

Core Orchestrator Agent

  • Processes user queries and intents
  • Validates input data quality
  • Coordinates analysis workflows
  • Generates final consolidated reports

Specialized Sub-Agents

  1. daily_pr_agent - Plant-wide efficiency monitoring
  2. detailed_inverter_performance_agent - Inverter-level diagnostics
  3. detailed_plant_timeseries_agent - Temporal pattern analysis
  4. alarm_research_agent - Root cause investigation grounded in device manufacturers' maintenance manuals (using Vertex AI RAG)

Solution Architecture

[Figure: Tech stack]

Implementation Challenges

During development, we ran into challenges in the following areas:

Investigation Planning

  • The first problem we encountered was slow data analysis. Each data scope we wanted to investigate needed its own back-and-forth between the agent and the data source, which can be simplified as this flow:
      Agent -> fetch_data (MCP) -> Agent -> Analysis
  • We tackled this by using a sequential_agent and parallel_agent architecture, so that independent data scopes are fetched and analyzed concurrently instead of one at a time.
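The sequential-plus-parallel idea can be illustrated with a minimal sketch. It uses plain asyncio rather than ADK itself, and the scope names, the fetch_data stub, and the analyze stub are assumptions made for illustration, not the project's actual code. Within a scope, analysis waits for its own fetch (sequential); across scopes, the pipelines run at the same time (parallel).

```python
import asyncio

# Hypothetical data scopes, loosely named after the sub-agents listed earlier.
SCOPES = ["daily_pr", "inverter_performance", "plant_timeseries"]


async def fetch_data(scope: str) -> dict:
    """Stand-in for an MCP fetch_data call (simulated with a short sleep)."""
    await asyncio.sleep(0.01)
    return {"scope": scope, "rows": [1, 2, 3]}


async def analyze(data: dict) -> str:
    """Stand-in for one sub-agent's analysis of its own data scope."""
    await asyncio.sleep(0.01)
    return f"{data['scope']}: {len(data['rows'])} rows analyzed"


async def investigate_scope(scope: str) -> str:
    # Sequential part: the analysis needs this scope's fetch to finish first.
    data = await fetch_data(scope)
    return await analyze(data)


async def run_investigation(scopes: list) -> list:
    # Parallel part: independent scopes run concurrently; results keep input order.
    return await asyncio.gather(*(investigate_scope(s) for s in scopes))


def investigate(scopes=SCOPES) -> list:
    return asyncio.run(run_investigation(scopes))
```

With this shape, total wall-clock time is roughly that of the slowest scope rather than the sum of all scopes, which is the effect the architecture change was aiming for.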

RAG Engine Implementation

  • In the early stages of development, we used a single knowledge store for all real-world operation manuals. Without proper filtering, the Alarm Trace Assistant was confused by the mixed contexts, which made the RAG results less accurate.
  • We resolved this by splitting the knowledge base into several distinct knowledge bases and listing all of them for the agent before it queries. The agent can then compare them and pick the most relevant knowledge base first, which produces more accurate results.
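The "list first, then query" pattern might look like the sketch below. The catalogue entries and their descriptions are hypothetical, and the keyword-overlap scoring is only a stand-in for the comparison that, in SolarWarden, the agent itself performs; no Vertex AI RAG calls are shown.

```python
import re
from typing import Optional

# Hypothetical catalogue: one knowledge base per manual, with a short description.
KNOWLEDGE_BASES = {
    "inverter_manual": "inverter fault codes grid dc overvoltage insulation",
    "transformer_manual": "transformer oil temperature buchholz relay winding",
    "tracker_manual": "tracker motor stow wind position",
}


def tokenize(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def list_knowledge_bases() -> dict:
    """Step 1: show the agent every available knowledge base before any query."""
    return dict(KNOWLEDGE_BASES)


def select_knowledge_base(question: str, catalogue: dict) -> Optional[str]:
    """Step 2: pick the best-matching knowledge base, or None if nothing matches."""
    words = tokenize(question)
    scores = {name: len(words & tokenize(desc)) for name, desc in catalogue.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Only the selected knowledge base is then queried, so retrieved passages come from a single manual instead of a mix of unrelated ones.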

Agent Report Findings

  • Because the data consists of several days of 5-minute interval readings, the agent's responses were lengthy. When we capped the number of output characters, however, the agent tended to oversimplify, and the analysis report ended up less useful to the next agent.
  • We tackled this by passing each agent's output through an after_agent_callback that uses another LLM to produce structured JSON output (which is also very helpful for debugging).
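A rough sketch of that post-processing step is shown here. It is not the ADK callback signature: the AgentFinding schema, the to_structured_report helper, and the fake_summarizer stand-in for the second LLM are all assumptions for illustration, and the sample finding is made up.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable


@dataclass
class AgentFinding:
    """Hypothetical schema for what one agent hands to the next."""
    agent: str
    summary: str
    anomalies: list = field(default_factory=list)


def to_structured_report(agent_name: str, raw_output: str,
                         summarizer: Callable[[str], str]) -> str:
    """Condense a lengthy agent response into JSON, in the role the callback plays."""
    reply = summarizer(raw_output)  # in practice, a call to a second LLM
    try:
        payload = json.loads(reply)
        finding = AgentFinding(
            agent=agent_name,
            summary=str(payload["summary"]),
            anomalies=list(payload.get("anomalies", [])),
        )
    except (json.JSONDecodeError, KeyError, TypeError):
        # If the reply is not valid JSON, keep the raw text rather than lose it.
        finding = AgentFinding(agent=agent_name, summary=raw_output)
    return json.dumps(asdict(finding))


def fake_summarizer(text: str) -> str:
    """Stand-in for the second LLM; returns a fixed, made-up finding."""
    return json.dumps({
        "summary": "Inverter 7 underperformed",
        "anomalies": ["INV-07 low output"],
    })
```

Because every agent emits the same fields, the next agent receives a compact, predictable payload, and a malformed response is easy to spot while debugging.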

Future Plans for SolarWarden

Planned Work Order Integration:

  1. Develop a human-in-the-loop architecture for maintenance workflows, where a workorder_agent creates and assigns work orders based on the investigation result.

  2. Integrate with Google Tasks MCP servers for contractor assignment and progress tracking.

Multimodal Analysis:

  1. Integrate Gemini VLLM for diagnostics on equipment images and videos returned by on-site contractors, further enriching the investigation report.
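For the planned human-in-the-loop work order flow, one possible shape is a small state machine in which a work order cannot be assigned until a person has approved it. This is only a sketch under assumptions: the Status values, the WorkOrder fields, and the function names are hypothetical, and no Google Tasks or ADK calls are shown.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    REJECTED = "rejected"
    ASSIGNED = "assigned"


@dataclass
class WorkOrder:
    """Hypothetical work order drafted from an investigation result."""
    title: str
    finding: str
    status: Status = Status.DRAFT
    contractor: str = ""


def draft_work_order(finding: str) -> WorkOrder:
    # The agent's part: turn an investigation finding into a draft.
    return WorkOrder(title=f"Investigate: {finding}", finding=finding)


def review(order: WorkOrder, approved: bool) -> WorkOrder:
    # The human's part: only drafts can be reviewed.
    if order.status is not Status.DRAFT:
        raise ValueError("only draft work orders can be reviewed")
    order.status = Status.APPROVED if approved else Status.REJECTED
    return order


def assign(order: WorkOrder, contractor: str) -> WorkOrder:
    # Assignment is blocked until a human has approved the draft.
    if order.status is not Status.APPROVED:
        raise ValueError("work order needs human approval before assignment")
    order.contractor = contractor
    order.status = Status.ASSIGNED
    return order
```

Keeping the approval check inside assign means the agent cannot skip the human confirmation step, whichever tool ends up creating the task.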
