Inspiration
Web3 ecosystems face a critical problem: how do you fairly recognize and reward developers who contribute across fragmented platforms? While working with communities like OpenBuild, I witnessed teams spending hundreds of hours manually tracking GitHub contributions, unable to connect them with on-chain activities. Developers with significant cross-ecosystem contributions remained invisible, and reward distribution felt arbitrary rather than merit-based.
The inspiration struck at the ETHShenzhen Hackathon (August 2024): what if we could build the first unified dataset linking GitHub identities to blockchain wallets, giving Web3 ecosystems a single view of their builders' off-chain and on-chain work? This wasn't just about analytics; it was about changing how Web3 communities identify, evaluate, and reward their builders.
What We Learned
Building Web3Insight taught us that identity correlation is far harder than pure data aggregation. GitHub APIs don't expose user-level relationships, and blockchain addresses are pseudonymous by design. We had to develop a multi-layered verification system combining signed wallet connections, behavioral analysis, and LLM-powered detection.
We also learned that explainability matters more than accuracy in Web3. Early versions used black-box scoring that communities didn't trust. Once we added transparent evidence cards showing every commit, PR, and on-chain interaction behind each score, adoption picked up sharply. Communities want to understand why someone earned rewards, not just that they did.
Finally, scaling data infrastructure to process millions of events daily while maintaining sub-second query speeds required rethinking our architecture. Moving from traditional databases to Apache Iceberg with DuckDB gave us the performance needed for real-time ecosystem analytics.
How We Built It
Tech Stack:
- Rust indexer processing 1M+ events/hour from GHArchive and blockchain explorers
- Apache Iceberg + DuckDB for a scalable data lake supporting 5+ years of historical data
- LLM-powered semantic analysis quantifying contribution quality beyond simple metrics
- Smart contracts for automated, transparent on-chain payouts with vesting mechanisms
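To illustrate the vesting idea in the last item, here is a minimal Python sketch of the arithmetic a payout contract could apply. It is not the deployed contract: the linear schedule, the cliff, and the names `vested_amount` and `claimable` are all assumptions made for illustration.

```python
def vested_amount(total: int, start: int, cliff: int, duration: int, now: int) -> int:
    """Amount unlocked at time `now` under linear vesting with a cliff.

    Times are Unix seconds. `total` is in the token's smallest unit so that
    integer arithmetic mirrors what an on-chain implementation would do.
    (Hypothetical schedule: nothing before the cliff, linear afterwards.)
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    if now < start + cliff:
        return 0                      # still inside the cliff period
    if now >= start + duration:
        return total                  # fully vested
    return total * (now - start) // duration


def claimable(total: int, start: int, cliff: int, duration: int,
              now: int, already_claimed: int) -> int:
    """Vested amount that has not been withdrawn yet."""
    return max(vested_amount(total, start, cliff, duration, now) - already_claimed, 0)
```

Integer division rounds down, so a contributor can never claim more than has vested at any given moment.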
Three-Layer Evaluation System:
- Ecosystem Activity Score: graph-based metrics tracking commit velocity, contributor churn, and governance activity
- Contribution Attribution: ContributionRank algorithm measuring collaboration influence, plus semantic depth analysis
- Dependency Incentives: recursive value flow rewarding maintainers of critical libraries
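The recursive value flow in the third layer can be sketched as follows. This is one plausible reading, not the production algorithm: the fixed pass-through `share`, the equal split among dependencies, and the `max_depth` cap (which also guards against cycles in the dependency graph) are assumptions, as is the function name `distribute`.

```python
from collections import defaultdict


def distribute(reward: float, project: str, deps: dict[str, list[str]],
               share: float = 0.2, max_depth: int = 5) -> dict[str, float]:
    """Split `reward` between `project` and its transitive dependencies.

    Each project keeps (1 - share) of what it receives and forwards `share`,
    divided equally, to its direct dependencies. Projects without
    dependencies, or at the depth cap, keep everything they receive.
    """
    payouts: dict[str, float] = defaultdict(float)

    def flow(node: str, amount: float, depth: int) -> None:
        children = deps.get(node, [])
        if not children or depth >= max_depth:
            payouts[node] += amount
            return
        passed = amount * share
        payouts[node] += amount - passed
        for child in children:
            flow(child, passed / len(children), depth + 1)

    flow(project, reward, 0)
    return dict(payouts)


# Hypothetical dependency graph: app -> lib_a, lib_b; lib_a -> core
example = distribute(100.0, "app", {"app": ["lib_a", "lib_b"], "lib_a": ["core"]})
```

Because every unit of reward is either kept or forwarded, the payouts always sum to the original reward, which keeps the distribution easy to audit.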
The platform architecture separates data collection (Rust workers), processing (Python + LLMs), storage (Iceberg), and distribution (smart contracts), allowing each component to scale independently.
Challenges We Faced
Identity Verification: Creating a sybil-resistant verification system without compromising privacy was our biggest challenge. We solved this through multi-signal validation: GitHub account age, contribution consistency, social graph analysis, and on-chain activity correlation. Accounts must pass multiple independent checks before receiving rewards.
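A minimal sketch of what "multiple independent checks" can look like in code. The signal fields, the thresholds, and the rule that a verified wallet signature is mandatory are illustrative assumptions, not the platform's actual policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountSignals:
    github_account_age_days: int
    active_weeks_last_year: int     # contribution consistency
    distinct_collaborators: int     # crude stand-in for social graph analysis
    wallet_signature_verified: bool
    onchain_tx_count: int


# Each check looks at one signal only; all thresholds are hypothetical.
CHECKS = {
    "account_age": lambda s: s.github_account_age_days >= 180,
    "consistency": lambda s: s.active_weeks_last_year >= 8,
    "social_graph": lambda s: s.distinct_collaborators >= 3,
    "wallet_link": lambda s: s.wallet_signature_verified,
    "onchain_activity": lambda s: s.onchain_tx_count >= 5,
}


def evaluate(signals: AccountSignals, min_passed: int = 4) -> tuple[bool, dict[str, bool]]:
    """Return (eligible, per-check results).

    An account is eligible only if its wallet link is verified AND at least
    `min_passed` checks succeed overall.
    """
    results = {name: check(signals) for name, check in CHECKS.items()}
    eligible = results["wallet_link"] and sum(results.values()) >= min_passed
    return eligible, results
```

Returning the per-check results alongside the verdict lets a rejected account see which signals fell short.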
Cross-Chain Data Correlation: Each blockchain has different data formats and APIs. Building unified schemas that work across Ethereum, Starknet, Mantle, and 200+ other chains required extensive data normalization and constant maintenance as chains evolve.
Scaling Without Sacrificing Speed: Processing billions of historical events while maintaining real-time updates seemed impossible initially. The breakthrough came from separating hot (recent) and cold (historical) data paths, using DuckDB for fast analytical queries over Iceberg tables backed by columnar data files.
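The hot/cold separation comes down to splitting each query's time range at a cutoff and sending each part to the matching store. The sketch below shows only that routing step; the 7-day `HOT_WINDOW` and the function name `split_range` are assumptions, and the actual queries against each store are left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

Range = Tuple[datetime, datetime]

HOT_WINDOW = timedelta(days=7)  # assumed boundary between recent and historical data


def split_range(start: datetime, end: datetime,
                now: datetime) -> Tuple[Optional[Range], Optional[Range]]:
    """Split the half-open range [start, end) into (cold, hot) sub-ranges.

    Events older than `now - HOT_WINDOW` belong to the cold (historical)
    path; newer events belong to the hot (recent) path. Either side is
    None when the query does not touch it.
    """
    if start >= end:
        raise ValueError("start must be before end")
    cutoff = now - HOT_WINDOW
    cold = (start, min(end, cutoff)) if start < cutoff else None
    hot = (max(start, cutoff), end) if end > cutoff else None
    return cold, hot


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
cold, hot = split_range(datetime(2025, 1, 1, tzinfo=timezone.utc), now, now)
```

A query that spans the cutoff is answered by running both halves and merging the results, so callers never need to know which store holds which events.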
Ecosystem Adoption: Convincing communities to trust an automated system for reward distribution required building credibility through transparency. Every metric is versioned, every score includes exportable evidence, and all payouts are auditable on-chain. This radical transparency turned skeptics into advocates.
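As a rough picture of "versioned metrics with exportable evidence", a score can be modeled as the sum of its evidence items, tagged with the metric version that produced it. The class names, fields, and JSON layout below are hypothetical, not the platform's real schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Evidence:
    kind: str        # e.g. "commit", "pull_request", "onchain_tx"
    reference: str   # URL or transaction hash that anyone can inspect
    points: float    # contribution of this item to the total score


@dataclass(frozen=True)
class ScoreCard:
    contributor: str
    metric_version: str              # which version of the metric was applied
    evidence: Tuple[Evidence, ...]

    @property
    def score(self) -> float:
        # The score is fully derived from evidence, so it can be re-checked.
        return sum(item.points for item in self.evidence)

    def export(self) -> str:
        """Serialize the card, including the derived score, as JSON."""
        payload = asdict(self)
        payload["score"] = self.score
        return json.dumps(payload, sort_keys=True, indent=2)


# Placeholder values for illustration only.
card = ScoreCard("alice", "v1.2.0", (
    Evidence("commit", "abc123", 2.0),
    Evidence("onchain_tx", "0xdeadbeef", 3.5),
))
```

Deriving the score from the evidence list, rather than storing it separately, means an exported card can always be recomputed and compared by a third party.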
From MVP to Production: What started as a hackathon prototype now serves 200+ ecosystems with production-grade reliability. The journey from proof-of-concept to handling real money in reward distributions meant obsessing over security, building comprehensive testing, and establishing clear governance processes for metric adjustments.
Current Status: Live at web3insights.app with partnerships across major ecosystems.
Built With
- duckdb
- nextjs
- postgresql
- rust