🌍 Project Story: Data-Driven Rooftop Harvesting Intelligence System

💡 Inspiration

This project started as a completely different idea, but it shifted after I reflected on a real-world issue closer to home. The hackathon prompt pushed me to think about infrastructure and environmental optimization, and that immediately made me think about the ongoing drought and water stress in places like Corpus Christi.

Seeing how unpredictable water availability can directly impact cities, industries, and long-term planning made me rethink what “resource optimization” really means in a modern context. Instead of building something abstract, I wanted to design a system that connects physical infrastructure (buildings), environmental constraints (water stress), and corporate behavior (ESG and sustainability commitments) into a single decision-making score.

That idea became the foundation for my project:

a unified intelligence system that evaluates rooftop harvesting viability and corporate environmental alignment using real-world data sources.

🧠 What I Learned

This project forced me to connect multiple domains that normally don’t interact:

1. Environmental + Financial Data Are Deeply Coupled

I learned that water cost, drought severity, and regulatory pressure can actually be modeled as a structured signal:

\[ S_{financial} = f(\text{water cost}, \text{water stress}, \text{rebates}, \text{regulation}) \]

Even small changes in regional water stress drastically affect feasibility scores.

2. Corporate Sustainability Data Is Extremely Fragmented

One of the biggest surprises was that ESG information is not centralized. For example:

10-K filings contain mostly legal risk language
ESG commitments are often in separate sustainability reports
Net-zero goals may only appear in press releases or PDFs

This taught me that “clean datasets” almost never exist in real-world systems.

3. AI Is Most Useful as a Bridge, Not a Replacement

I used AI (Gemini) not to “solve everything,” but to:

Normalize messy OpenStreetMap building names
Infer corporate entities from noisy text
Bridge ambiguous cases where deterministic parsing failed

This made me realize that AI is best used as a semantic fallback layer, not the primary logic.

🏗️ How I Built It

The system is structured as three scoring layers, each handling a different dimension of the final score, plus an aggregator that combines them:

🧱 1. Physical Intelligence Layer

This service evaluates rooftop potential using:

Roof area estimation
Building classification (commercial, industrial, residential)
Viability tiering

It produces a score based on:

\[ S_{physical} = A_{roof} \cdot w_{type} + \text{bonus}_{scale} \]

Where:

\(A_{roof}\) = normalized roof area
\(w_{type}\) = building type multiplier
\(\text{bonus}_{scale}\) = large-scale infrastructure adjustment
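
A minimal sketch of how this formula could be evaluated. The multipliers, the normalization ceiling, the large-roof threshold, and the bonus size are all assumptions for illustration, since this write-up does not list the actual values, and `physical_score` is a hypothetical name.

```python
# Hypothetical building-type multipliers; the real values are not given here.
TYPE_WEIGHTS = {"industrial": 1.0, "commercial": 0.8, "residential": 0.5}


def physical_score(roof_area_m2, building_type,
                   max_area_m2=10_000.0, large_roof_m2=5_000.0, scale_bonus=10.0):
    """S_physical = A_roof * w_type + bonus_scale, capped at 100."""
    if roof_area_m2 < 0:
        raise ValueError("roof area must be non-negative")
    # A_roof: roof area normalized to 0..100 against an assumed maximum.
    a_roof = min(100.0 * roof_area_m2 / max_area_m2, 100.0)
    # w_type: unknown building types fall back to the lowest multiplier.
    w_type = TYPE_WEIGHTS.get(building_type, min(TYPE_WEIGHTS.values()))
    # bonus_scale: flat adjustment for large roofs (assumed threshold).
    bonus = scale_bonus if roof_area_m2 >= large_roof_m2 else 0.0
    return min(a_roof * w_type + bonus, 100.0)


print(physical_score(5_000, "industrial"))  # 60.0
```

Capping at 100 keeps the result on the same scale as the other layers, which matters once the scores are blended.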

💰 2. Financial / Environmental Pressure Layer

This module models regional water economics:

Water pricing
Sewer costs
Water stress index
Local rebates and incentives

This produces a normalized score:

\[ S_{financial} \in [0, 100] \]

It captures how “expensive or urgent” water scarcity is in a region.
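
One way such a normalized score could be assembled is sketched below. The input units, the normalization ceilings, and the 0.4 / 0.4 / 0.2 weights are assumptions chosen for illustration, not the values used in the project, and the regulation term is left out for brevity.

```python
def clamp01(x):
    """Clamp a value into the 0..1 range."""
    return max(0.0, min(1.0, x))


def financial_score(water_price, sewer_price, stress_index, rebate_fraction,
                    max_price=10.0, max_stress=5.0):
    """Blend water economics into a 0..100 pressure score.

    Assumed inputs: prices in currency per cubic meter, a stress index on a
    0..5 scale, and rebates as a fraction (0..1) of installation cost.
    """
    cost = clamp01((water_price + sewer_price) / max_price)
    stress = clamp01(stress_index / max_stress)
    rebate = clamp01(rebate_fraction)
    # Hypothetical weights; they sum to 1 so the result stays within 0..100.
    return 100.0 * (0.4 * cost + 0.4 * stress + 0.2 * rebate)


print(round(financial_score(3.0, 2.0, 2.5, 0.5), 2))
```

Clamping each input before weighting means a single extreme value (for example, an unusually high water tariff) cannot push the score outside the stated range.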

🏢 3. Corporate ESG Intelligence Layer

This was the most complex part.

I built a pipeline that:

Pulls SEC EDGAR 10-K filings
Extracts relevant sections (Risk Factors, MD&A)
Runs keyword-based ESG signal extraction
Converts signals into structured scores

Key extracted signals include:

Climate risk mentions
Carbon emissions disclosures
Net-zero commitments
ESG framework adoption (GRI, SASB, TCFD)
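
A minimal sketch of keyword-based signal extraction for the four signals listed above. The regular expressions, the `extract_signals` and `signal_score` names, and the cap of five mentions are illustrative assumptions, not the project's actual patterns.

```python
import re

# Hypothetical patterns, one per signal. Framework acronyms are matched
# case-sensitively so ordinary lowercase words are not counted.
SIGNAL_PATTERNS = {
    "climate_risk": re.compile(r"\bclimate(?:-related)?\s+(?:risk|change)s?\b", re.I),
    "emissions": re.compile(
        r"\b(?:greenhouse gas|ghg|carbon)\s+emissions?\b|\bscope\s+[123]\b", re.I),
    "net_zero": re.compile(r"\bnet[-\s]zero\b|\bcarbon[-\s]neutral", re.I),
    "frameworks": re.compile(r"\b(?:GRI|SASB|TCFD)\b"),
}


def extract_signals(text):
    """Count the matches of each signal pattern in the filing text."""
    return {name: len(pattern.findall(text))
            for name, pattern in SIGNAL_PATTERNS.items()}


def signal_score(count, cap=5):
    """Map a mention count to 0..100, saturating at an assumed cap."""
    return 100.0 * min(count, cap) / cap


sample = ("We face climate-related risks. We disclose Scope 1 and carbon "
          "emissions under TCFD and SASB. We target net-zero by 2040.")
print(extract_signals(sample))
```

Saturating the count keeps a filing that repeats one phrase dozens of times from dominating the score.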

The final corporate score is computed as:

\[ S_{corporate} = 0.40S_{ESG} + 0.35S_{climate} + 0.25S_{regulatory} \]
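
As a worked example with made-up sub-scores (not values from the project), taking \(S_{ESG} = 80\), \(S_{climate} = 60\), and \(S_{regulatory} = 40\) gives:

\[ S_{corporate} = 0.40(80) + 0.35(60) + 0.25(40) = 32 + 21 + 10 = 63 \]

Because the three weights sum to 1, the result stays on the same 0 to 100 scale as its inputs.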

🔗 4. Aggregator Layer

Finally, everything is combined:

\[ S_{final} = 0.34S_{physical} + 0.33S_{financial} + 0.33S_{corporate} \]

This creates a unified “harvest viability score” that blends infrastructure, environment, and corporate behavior.
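
The aggregation step can be sketched as a small weighted sum. The weights come from the formula above; the function name, the input validation, and the sample sub-scores are assumptions for illustration.

```python
# Weights taken from the aggregator formula; they sum to 1.
WEIGHTS = {"physical": 0.34, "financial": 0.33, "corporate": 0.33}


def final_score(scores):
    """Combine the three layer scores (each 0..100) into one viability score."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing layer scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 100.0:
            raise ValueError(f"{name} score must be within 0..100")
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Made-up sub-scores, only to show the arithmetic.
print(round(final_score({"physical": 60, "financial": 80, "corporate": 63}), 2))
```

Failing loudly on a missing or out-of-range layer score is a deliberate choice in this sketch: a silently defaulted layer would still produce a plausible-looking final number.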

🎨 Frontend

The frontend was built using v0, which allowed rapid prototyping of:

Score dashboards
Component breakdown views
Data transparency panels

This let me focus more on backend intelligence rather than UI engineering.

⚠️ Challenges Faced

1. EDGAR / SEC Data Complexity

One of the hardest parts was working with the SEC EDGAR system.

Challenges included:

Inconsistent company naming conventions (e.g., Alphabet vs Google)
Accession number–based file paths
Strict requirement for correct CIK (Central Index Key) resolution
HTML-heavy 10-K documents requiring robust parsing

For example, retrieving a filing required a multi-step pipeline:

Company name → CIK
CIK → latest 10-K metadata
Accession number → document index
Index → actual filing file
HTML → cleaned text extraction

A single mismatch would break the entire pipeline.
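
The steps above can be sketched as pure functions over data that has already been downloaded. This is an assumption-laden outline, not the project's code: it expects input shaped like SEC's `company_tickers.json` and the `data.sec.gov` submissions JSON, it shortcuts the document-index step by reading the `primaryDocument` field, and the company, CIK, and accession numbers in the sample are invented. Actual requests to SEC endpoints also need a descriptive `User-Agent` header.

```python
from html.parser import HTMLParser


def resolve_cik(company_title, tickers):
    """Company name -> CIK, by exact case-insensitive title match."""
    wanted = company_title.strip().lower()
    for entry in tickers.values():
        if entry["title"].strip().lower() == wanted:
            return int(entry["cik_str"])
    return None


def submissions_url(cik):
    """CIK -> URL of the filing metadata (CIK zero-padded to 10 digits)."""
    return f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"


def latest_10k(submissions):
    """Pick the most recently filed 10-K from the parallel 'recent' arrays."""
    recent = submissions["filings"]["recent"]
    rows = zip(recent["form"], recent["filingDate"],
               recent["accessionNumber"], recent["primaryDocument"])
    ten_ks = [row for row in rows if row[0] == "10-K"]
    if not ten_ks:
        return None
    # ISO dates sort correctly as strings.
    _, filed, accession, document = max(ten_ks, key=lambda row: row[1])
    return {"filed": filed, "accession": accession, "document": document}


def filing_url(cik, accession, document):
    """Accession number -> archive path (dashes removed from the folder name)."""
    folder = accession.replace("-", "")
    return f"https://www.sec.gov/Archives/edgar/data/{int(cik)}/{folder}/{document}"


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """HTML -> whitespace-normalized plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


# Invented sample data, only to show how the pieces chain together.
tickers = {"0": {"cik_str": 1234567, "ticker": "EXMP", "title": "Example Corp"}}
cik = resolve_cik("example corp", tickers)
print(submissions_url(cik))
```

Returning `None` from `resolve_cik` and `latest_10k` gives the caller one explicit place to handle each of the failure points described above.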

2. Data Noise and Ambiguity

OpenStreetMap building names like:

“Google Building 41”
“Amazon Fulfillment Center DFW7”

are not clean corporate identifiers.

This required:

Heuristic-based parsing
AI fallback inference
Alias mapping systems
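
A minimal sketch of the heuristic and alias steps, assuming a hand-maintained alias table. The facility keywords, the `ALIASES` entries, and the `normalize_owner` name are illustrative, not the project's actual rules.

```python
import re

# Hypothetical alias table: brand name as seen in OSM -> registrant name.
ALIASES = {
    "google": "Alphabet Inc.",
    "amazon": "Amazon.com, Inc.",
}

# Facility words after which the rest of the name is treated as a site label.
FACILITY_SUFFIX = re.compile(
    r"\b(?:building|fulfillment center|warehouse|campus|office)\b.*$",
    re.IGNORECASE,
)


def normalize_owner(osm_name):
    """Strip the facility suffix, then look the remaining brand up in ALIASES.

    Returns None when the heuristic cannot resolve the name, which is the
    signal to hand the case to a fallback.
    """
    brand = FACILITY_SUFFIX.sub("", osm_name).strip(" -,")
    return ALIASES.get(brand.lower())


print(normalize_owner("Google Building 41"))
print(normalize_owner("Amazon Fulfillment Center DFW7"))
```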

3. Balancing Deterministic Logic vs AI

A key design challenge was deciding:

what should be rule-based
what should be AI-assisted

Too much AI → inconsistent results
Too many rules → brittle system

The final design uses AI only when deterministic extraction fails.
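
That rule can be sketched as a resolver that tries deterministic functions first and calls an injected fallback only when they all fail. The names here are hypothetical, and the fallback is a plain callable standing in for a model call; no specific Gemini API is assumed.

```python
KNOWN_OWNERS = {"google": "Alphabet Inc."}  # hypothetical lookup table


def rule_lookup(name):
    """Deterministic rule: exact match against a known-owner table."""
    return KNOWN_OWNERS.get(name.strip().lower())


def resolve_entity(name, rules, ai_fallback=None):
    """Return (entity, source), where source is 'rule', 'ai', or 'unresolved'."""
    for rule in rules:
        hit = rule(name)
        if hit is not None:
            return hit, "rule"
    # The AI layer runs only after every deterministic rule has failed.
    if ai_fallback is not None:
        guess = ai_fallback(name)
        if guess:
            return guess, "ai"
    return None, "unresolved"


def fake_ai(name):
    """Stand-in for a model call; always answers with a placeholder."""
    return "Placeholder Corp"


print(resolve_entity("Google", [rule_lookup], fake_ai))
print(resolve_entity("Mystery Depot 9", [rule_lookup], fake_ai))
```

Returning the source alongside the entity makes it possible to see which results came from rules and which came from the fallback, so inconsistent AI answers are easy to audit.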

📊 Final Reflection

This project taught me that real-world systems are rarely about a single model or dataset. Instead, they are about orchestrating imperfect data sources into a coherent decision framework.

Even though each subsystem has limitations, combining them produces something meaningful:

A unified score that reflects physical feasibility, environmental pressure, and corporate responsibility.

🚀 Closing Thought

If I had to summarize the project in one line:

“I built a multi-source intelligence system that turns messy real-world infrastructure and corporate data into a single actionable sustainability signal.”

Updates

Noel Varghese started this project — Apr 12, 2026 03:19 AM EDT