🌍 Project Story: Data-Driven Rooftop Harvesting Intelligence System
💡 Inspiration
This project originally started as a completely different idea, but it shifted after I reflected on a real-world issue happening closer to home. The hackathon prompt pushed me to think about infrastructure and environmental optimization, and that immediately made me think about the ongoing drought and water stress concerns in regions like Corpus Christi.
Seeing how unpredictable water availability can directly impact cities, industries, and long-term planning made me rethink what “resource optimization” really means in a modern context. Instead of building something abstract, I wanted to design a system that connects physical infrastructure (buildings), environmental constraints (water stress), and corporate behavior (ESG and sustainability commitments) into a single decision-making score.
That idea became the foundation for my project:
a unified intelligence system that evaluates rooftop harvesting viability and corporate environmental alignment using real-world data sources.
🧠 What I Learned
This project forced me to connect multiple domains that normally don’t interact:
1. Environmental + Financial Data Are Deeply Coupled
I learned that water cost, drought severity, and regulatory pressure can actually be modeled as a structured signal:
\[ S_{financial} = f(\text{water cost}, \text{water stress}, \text{rebates}, \text{regulation}) \]
Even small changes in regional water stress drastically affect feasibility scores.
2. Corporate Sustainability Data Is Extremely Fragmented
One of the biggest surprises was that ESG information is not centralized. For example:
- 10-K filings contain mostly legal risk language
- ESG commitments are often in separate sustainability reports
- Net-zero goals may only appear in press releases or PDFs
This taught me that “clean datasets” almost never exist in real-world systems.
3. AI Is Most Useful as a Bridge, Not a Replacement
I used AI (Gemini) not to “solve everything,” but to:
- Normalize messy OpenStreetMap building names
- Infer corporate entities from noisy text
- Bridge ambiguous cases where deterministic parsing failed
This made me realize that AI is best used as a semantic fallback layer, not the primary logic.
🏗️ How I Built It
The system is structured as three scoring services, each handling a different dimension of the final score, plus an aggregator that combines them:
🧱 1. Physical Intelligence Layer
This service evaluates rooftop potential using:
- Roof area estimation
- Building classification (commercial, industrial, residential)
- Viability tiering
It produces a score based on:
\[ S_{physical} = A_{roof} \cdot w_{type} + bonus_{scale} \]
Where:
- \(A_{roof}\) = normalized roof area
- \(w_{type}\) = building type multiplier
- \(bonus_{scale}\) = adjustment for large-scale infrastructure
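As a rough illustration of the formula above, here is a minimal Python sketch. The multiplier table, the area cap, the bonus size, and the threshold are all hypothetical placeholders (the write-up does not list the real values), and I am assuming the score is kept on a 0 to 100 scale like the financial layer.

```python
# Hypothetical building-type multipliers; the real values are not given in the write-up.
TYPE_WEIGHTS = {"industrial": 1.0, "commercial": 0.8, "residential": 0.5}


def physical_score(
    roof_area_m2: float,
    building_type: str,
    max_area_m2: float = 10_000.0,        # assumed cap used to normalize roof area
    scale_threshold_m2: float = 5_000.0,  # assumed cutoff for "large-scale" roofs
    scale_bonus: float = 5.0,             # assumed flat bonus for large roofs
) -> float:
    """Compute S_physical = A_roof * w_type + bonus_scale, clamped to [0, 100]."""
    # A_roof: roof area normalized to 0..100 against the assumed cap.
    a_roof = min(max(roof_area_m2, 0.0) / max_area_m2, 1.0) * 100.0
    # w_type: fall back to the lowest multiplier for unknown building types.
    w_type = TYPE_WEIGHTS.get(building_type, 0.5)
    bonus = scale_bonus if roof_area_m2 >= scale_threshold_m2 else 0.0
    return min(a_roof * w_type + bonus, 100.0)
```

For example, a 5,000 m² commercial roof under these assumed parameters scores 50 × 0.8 + 5 = 45.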
💰 2. Financial / Environmental Pressure Layer
This module models regional water economics:
- Water pricing
- Sewer costs
- Water stress index
- Local rebates and incentives
This produces a normalized score:
\[ S_{financial} \in [0, 100] \]
It captures how “expensive or urgent” water scarcity is in a region.
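One way to turn the four inputs listed above into a 0 to 100 score is to clamp each to a unit range and blend them. The sketch below is only an assumption about how that could look: the caps, the 0 to 5 stress scale, the per-square-meter rebate unit, and the 0.4 / 0.4 / 0.2 blend weights are all illustrative and not taken from the project.

```python
def clamp01(x: float) -> float:
    """Clamp a value into the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def financial_score(
    water_price: float,       # price per unit of water (currency units assumed)
    sewer_price: float,       # sewer charge per unit of water
    stress_index: float,      # water stress index, assumed to run 0..max_stress
    rebate_per_m2: float,     # local incentive per m2 of roof (assumed unit)
    *,
    max_price: float = 10.0,  # assumed cap for combined water + sewer price
    max_stress: float = 5.0,  # assumed top of the stress scale
    max_rebate: float = 20.0, # assumed cap for rebates
) -> float:
    """Blend cost, stress, and rebates into a score in [0, 100]."""
    cost = clamp01((water_price + sewer_price) / max_price)
    stress = clamp01(stress_index / max_stress)
    rebate = clamp01(rebate_per_m2 / max_rebate)
    # Hypothetical weights: cost and stress dominate, rebates add a smaller push.
    return 100.0 * (0.4 * cost + 0.4 * stress + 0.2 * rebate)
```

Clamping each input first keeps one extreme value (for example an unusually high rebate) from pushing the score outside the stated range.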
🏢 3. Corporate ESG Intelligence Layer
This was the most complex part.
I built a pipeline that:
- Pulls SEC EDGAR 10-K filings
- Extracts relevant sections (Risk Factors, MD&A)
- Runs keyword-based ESG signal extraction
- Converts signals into structured scores
Key extracted signals include:
- Climate risk mentions
- Carbon emissions disclosures
- Net-zero commitments
- ESG framework adoption (GRI, SASB, TCFD)
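The keyword-based extraction step can be sketched with plain regular expressions. The patterns below are hypothetical examples for the four signal families listed above, not the project's actual keyword lists.

```python
import re

# Hypothetical patterns, one per signal family named in the write-up.
SIGNAL_PATTERNS = {
    "climate_risk": re.compile(
        r"\bclimate(?:-related)?\s+(?:risk|change)s?\b", re.IGNORECASE
    ),
    "emissions": re.compile(
        r"\b(?:carbon|greenhouse gas|ghg)\s+emissions?\b", re.IGNORECASE
    ),
    "net_zero": re.compile(
        r"\bnet[\s-]zero\b|\bcarbon[\s-]neutral(?:ity)?\b", re.IGNORECASE
    ),
    # Framework acronyms are matched case-sensitively to avoid stray hits.
    "frameworks": re.compile(r"\b(?:GRI|SASB|TCFD)\b"),
}


def extract_signals(text: str) -> dict[str, int]:
    """Count how many times each signal family is mentioned in the text."""
    return {
        name: len(pattern.findall(text))
        for name, pattern in SIGNAL_PATTERNS.items()
    }
```

Raw counts like these are only a first pass. A 10-K can mention "climate change" purely as legal risk language, which is one reason the counts are later converted into bounded scores rather than used directly.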
The final corporate score is computed as:
\[ S_{corporate} = 0.40 S_{ESG} + 0.35 S_{climate} + 0.25 S_{regulatory} \]
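A minimal sketch of how mention counts could become sub-scores and then feed the weighted formula above. The three weights come from the formula. The saturating count-to-score mapping and its cap of 5 mentions are assumptions for illustration only.

```python
def saturating_score(count: int, cap: int = 5) -> float:
    """Map a mention count to 0..100, saturating at an assumed cap."""
    return 100.0 * min(max(count, 0), cap) / cap


def corporate_score(s_esg: float, s_climate: float, s_regulatory: float) -> float:
    """Weighted blend using the 0.40 / 0.35 / 0.25 weights from the write-up."""
    for value in (s_esg, s_climate, s_regulatory):
        if not 0.0 <= value <= 100.0:
            raise ValueError("sub-scores are expected to lie in [0, 100]")
    return 0.40 * s_esg + 0.35 * s_climate + 0.25 * s_regulatory
```

With sub-scores of 80, 60, and 40, the blend works out to 32 + 21 + 10 = 63.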
🔗 4. Aggregator Layer
Finally, everything is combined:
\[ S_{final} = 0.34 S_{physical} + 0.33 S_{financial} + 0.33 S_{corporate} \]
This creates a unified “harvest viability score” that blends infrastructure, environment, and corporate behavior.
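The aggregation step can be sketched as a weighted sum that also returns each layer's contribution, which is the kind of data a component breakdown view would need. The weights are the ones in the formula above. The function name and the output shape are hypothetical.

```python
# Weights taken from the S_final formula in the write-up.
WEIGHTS = {"physical": 0.34, "financial": 0.33, "corporate": 0.33}


def aggregate(scores: dict[str, float]) -> dict:
    """Combine the three layer scores into a final score plus a breakdown."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing layer scores: {sorted(missing)}")
    # Per-layer weighted contributions, kept so the result can be explained.
    contributions = {name: WEIGHTS[name] * scores[name] for name in WEIGHTS}
    return {
        "final": round(sum(contributions.values()), 2),
        "breakdown": {name: round(v, 2) for name, v in contributions.items()},
    }
```

Because the weights sum to 1.0, the final score stays in the same 0 to 100 range as the inputs.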
🎨 Frontend
The frontend was built using v0, which allowed rapid prototyping of:
- Score dashboards
- Component breakdown views
- Data transparency panels
This let me focus more on backend intelligence rather than UI engineering.
⚠️ Challenges Faced
1. EDGAR / SEC Data Complexity
One of the hardest parts was working with the SEC EDGAR system.
Challenges included:
- Inconsistent company naming conventions (e.g., Alphabet vs Google)
- Accession number–based file paths
- Strict requirement for correct CIK resolution
- HTML-heavy 10-K documents requiring robust parsing
For example, retrieving a filing required a multi-step pipeline:
- Company name → CIK
- CIK → latest 10-K metadata
- Accession number → document index
- Index → actual filing file
- HTML → cleaned text extraction
A single mismatch would break the entire pipeline.
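The later steps of that chain can be sketched as pure functions, which makes each hop testable without touching the network. This is a sketch under assumptions: the URL layouts and JSON field names reflect my understanding of EDGAR's public endpoints and should be checked against the SEC documentation, the `recent` arrays are assumed to be ordered newest first, and all function names are hypothetical. Fetching is left out. As far as I know, the SEC also expects a descriptive `User-Agent` header on requests.

```python
from html.parser import HTMLParser


def pad_cik(cik) -> str:
    """EDGAR's JSON endpoints use the CIK zero-padded to 10 digits."""
    return f"{int(cik):010d}"


def submissions_url(cik) -> str:
    """URL of the filing-metadata JSON for one company (assumed layout)."""
    return f"https://data.sec.gov/submissions/CIK{pad_cik(cik)}.json"


def latest_10k(submissions: dict):
    """Return the first 10-K in the 'recent' arrays, or None if there is none."""
    recent = submissions["filings"]["recent"]
    rows = zip(recent["form"], recent["accessionNumber"], recent["primaryDocument"])
    for form, accession, document in rows:
        if form == "10-K":
            return {"accession": accession, "document": document}
    return None


def filing_url(cik, accession: str, document: str) -> str:
    """Archive path: unpadded CIK, accession number without dashes, file name."""
    return (
        "https://www.sec.gov/Archives/edgar/data/"
        f"{int(cik)}/{accession.replace('-', '')}/{document}"
    )


class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Strip tags and collapse whitespace into single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())
```

Keeping each hop as its own small function also makes it easier to see which step failed when a mismatch occurs.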
2. Data Noise and Ambiguity
OpenStreetMap building names like:
- “Google Building 41”
- “Amazon Fulfillment Center DFW7”
are not clean corporate identifiers.
This required:
- Heuristic-based parsing
- AI fallback inference
- Alias mapping systems
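A minimal sketch of the heuristic parsing and alias mapping steps, using the two building names above as test cases. The descriptor list and the alias table are hypothetical and far smaller than anything a real system would need.

```python
import re
from typing import Optional

# Hypothetical alias table: lowercase brand stem -> registrant name.
ALIASES = {
    "google": "Alphabet Inc.",
    "amazon": "Amazon.com, Inc.",
}

# Assumed facility descriptors; everything from the first one onward is dropped.
_DESCRIPTOR = re.compile(
    r"\b(?:building|bldg|fulfillment center|warehouse|campus|office|hq)\b.*$",
    re.IGNORECASE,
)


def infer_company(osm_name: str) -> Optional[str]:
    """Strip facility descriptors, then look the remaining stem up in ALIASES."""
    stem = _DESCRIPTOR.sub("", osm_name).strip().lower()
    if not stem:
        return None
    return ALIASES.get(stem)
```

Under these assumptions, "Google Building 41" resolves to "Alphabet Inc." and "Amazon Fulfillment Center DFW7" resolves to "Amazon.com, Inc.", while a name with no known stem returns `None` and can be handed to the AI fallback.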
3. Balancing Deterministic Logic vs AI
A key design challenge was deciding:
- what should be rule-based
- what should be AI-assisted
Too much AI → inconsistent results
Too many rules → brittle system
The final design uses AI only when deterministic extraction fails.
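That ordering can be sketched as a small dispatcher that only calls the model when the rule-based path returns nothing. Both callables here are hypothetical stand-ins: `rules` would be the deterministic parser and `ai_fallback` would wrap whatever model call the system uses. No real Gemini API is shown.

```python
from typing import Callable, Optional, Tuple

Resolver = Callable[[str], Optional[str]]


def resolve_entity(
    name: str, rules: Resolver, ai_fallback: Resolver
) -> Tuple[Optional[str], str]:
    """Try deterministic rules first and call the AI fallback only on a miss.

    Returns (entity, source), where source is "rules", "ai", or "unresolved",
    so downstream code can tell which path produced the answer.
    """
    entity = rules(name)
    if entity is not None:
        return entity, "rules"
    guess = ai_fallback(name)
    if guess:
        return guess, "ai"
    return None, "unresolved"
```

Returning the source alongside the answer makes it possible to audit how often the fallback fires, which is one way to keep an eye on the consistency trade-off described above.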
📊 Final Reflection
This project taught me how real-world systems are rarely about a single model or dataset. Instead, they are about orchestrating imperfect data sources into a coherent decision framework.
Even though each subsystem has limitations, combining them produces something meaningful:
A unified score that reflects physical feasibility, environmental pressure, and corporate responsibility.
🚀 Closing Thought
If I had to summarize the project in one line:
“I built a multi-source intelligence system that turns messy real-world infrastructure and corporate data into a single actionable sustainability signal.”