Building a "Green FinOps" Gatekeeper with Gemini 3
The Spark

We often talk about AI's potential to solve climate change, but we rarely talk about the climate cost of AI itself. I was shocked to learn that training a single large language model can emit as much carbon as five cars over their entire lifetimes. As developers, we want to build powerful models, but we lack the tools to see the invisible energy cost of our code. I realized that if we want "Green AI" to be more than a buzzword, we need to make energy auditing as automatic as linting. That's why I built EcoCompute AI.
What it does

EcoCompute AI is an intelligent infrastructure agent—a virtual "Senior Performance Engineer"—that lives in your CI/CD pipeline. It doesn't just estimate carbon; it actively optimizes your code.
See: It scans PyTorch code and even hand-drawn architecture sketches.

Search: It uses Google Search to find real-time 2026 hardware specs (e.g., NVIDIA B200 TDP) and MLPerf benchmarks, so the data is never stale.

Solve: It uses a Python sandbox to mathematically verify bottlenecks (such as arithmetic intensity) and automatically refactors code to reduce energy consumption by 30–50%.

How I built it ⚙️

The core is built on the Gemini 3 Pro model using the google/genai SDK. I implemented a "Hybrid Grounding" architecture:
Deep Reasoning: I allocated a thinkingBudget of 1024 tokens. This lets the agent plan its audit strategy and "think" through complex physics calculations before generating a response.

Tool Use: I integrated the googleSearch tool for live grounding and a custom codeExecution tool. The agent writes Python scripts to calculate FLOPs/Byte ratios, ensuring the math is accurate rather than hallucinated.

Self-Correction: I built a robust error-handling loop. If the Python sandbox throws an error (e.g., a precision mismatch), Gemini 3 catches it, analyzes the traceback, adjusts its assumptions (e.g., switching from FP32 to FP16), and retries automatically.

Challenges I ran into

The biggest challenge was the "Hallucination vs. Physics" problem. Early versions of the agent would confidently invent hardware specs or get the math wrong. I solved this by implementing Strict Tool Enforcement: forcing the model to cite sources for every number and to use the sandbox for every calculation. If it can't prove a claim with code or a citation, it doesn't make it.
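To make the Solve step and the self-correction loop concrete, here is a minimal sketch of the kind of script the sandbox runs. This is an illustrative stand-in, not the actual agent: the hardware numbers are placeholders rather than real B200 specs, and the FP32 → FP16 fallback is hard-coded where the real agent reasons over the traceback.

```python
# Sketch: arithmetic-intensity audit with a self-correcting retry loop.
# Hardware numbers below are illustrative placeholders, not real specs.

SUPPORTED_DTYPES = {"fp16": 2, "bf16": 2}  # bytes per element the "kernel" accepts

def arithmetic_intensity(m: int, n: int, k: int, dtype: str) -> float:
    """FLOPs per byte of memory traffic for an (m, k) @ (k, n) matmul."""
    if dtype not in SUPPORTED_DTYPES:
        raise ValueError(f"precision mismatch: kernel does not support {dtype}")
    bytes_per_elem = SUPPORTED_DTYPES[dtype]
    flops = 2 * m * n * k                                # multiply-accumulate = 2 FLOPs
    traffic = bytes_per_elem * (m * k + k * n + m * n)   # read A and B, write C
    return flops / traffic

def audit_with_retries(m: int, n: int, k: int, max_attempts: int = 3) -> float:
    dtype = "fp32"  # the agent's initial (wrong) assumption
    for _ in range(max_attempts):
        try:
            return arithmetic_intensity(m, n, k, dtype)
        except ValueError as err:
            # The real agent analyzes the traceback; here the fix is hard-coded.
            if "precision mismatch" in str(err):
                dtype = "fp16"
            else:
                raise
    raise RuntimeError("audit failed after retries")

ai = audit_with_retries(4096, 4096, 4096)
ridge = 1000e12 / 8e12  # placeholder: peak FLOP/s over memory bandwidth (FLOPs/byte)
print(f"AI = {ai:.1f} FLOPs/byte, compute-bound: {ai > ridge}")
```

In the production agent, the sandbox's stdout and any traceback are returned to Gemini 3 as a tool response, so the model itself decides how to revise its assumptions instead of following a hard-coded rule.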
Accomplishments that I'm proud of
The "Cinematic" Demo Flow: I created a simulation engine that visualizes the agent's thought process—from searching Google to fixing its own code errors—making the "black box" transparent to the user.

Infrastructure-First Design: Following the "Cal.com model," I made the reports embeddable. You can drop a live energy-audit widget directly into a HuggingFace Model Card or internal documentation.

What I learned

Working with Gemini 3 taught me that agentic workflows > RAG. For technical domains, giving the model tools to discover truth is far more powerful than feeding it static documents. I also learned that "sustainability" is really a data problem—once developers see the "Carbon Debt" in their PRs, they naturally want to fix it.
What's next for EcoCompute AI

I plan to implement a true backend parser for .nsys profiling logs to replace the current simulation, making the calibration engine enterprise-ready. I'm also looking to integrate it directly as a GitHub Action for automated PR reviews.
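As a rough sketch of where that parser could start, the snippet below turns per-kernel GPU timings into a crude energy upper bound (duration × board power). The CSV column names and the 700 W figure are assumptions for illustration, not the actual `nsys` export schema or a specific GPU's TDP.

```python
import csv
import io

# Hypothetical per-kernel timing export; column names are assumptions,
# not the exact schema produced by Nsight Systems.
SAMPLE = """name,total_ns
gemm_kernel_128x64,1200000
elementwise_add,300000
"""

def estimate_energy_joules(csv_text: str, tdp_watts: float) -> dict:
    """Crude per-kernel energy upper bound: kernel duration x board power."""
    energy = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        seconds = int(row["total_ns"]) / 1e9
        energy[row["name"]] = seconds * tdp_watts
    return energy

# 700 W is an illustrative board-power figure, not a claim about any real GPU.
print(estimate_energy_joules(SAMPLE, tdp_watts=700))
```

A real version would read the SQLite database or stats export that Nsight Systems produces and account for actual power draw rather than a flat TDP, but the aggregation shape stays the same.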