Inspiration

The challenge of autonomous AI agents in crypto is fundamentally a trust problem. How do you give an AI agent enough power to be useful while ensuring it can't expose your private keys? Most solutions either compromise on security or limit functionality to the point of being impractical.

We saw an opportunity to solve this by combining three key technologies from the sponsor list: IronClaw's Trusted Execution Environment for credential isolation, Neo4j's graph database for transparent governance, and Senso's context verification for preventing AI hallucinations. Each addresses a critical piece of the puzzle.

Our goal was to build an AI agent that could autonomously scout deals, verify information, check compliance policies, and execute transactions—while cryptographically proving that credentials never leave an encrypted enclave. We wanted to demonstrate that autonomous agents can be both powerful and trustworthy.

FORGE represents our vision for how production AI agents should work: secure by design, transparent by default, and verifiable at every step.


What it does

FORGE is an autonomous AI agent platform for crypto commerce that prioritizes security and governance. The system enables users to delegate trading decisions to an AI agent while maintaining complete control through policy-based governance.

Note: The backend supports real voice command processing via the Modulate API. The current demo simulates voice input with pre-defined text commands, so the workflow can be shown without audio recording.

The workflow:

Users issue natural language commands (simulated as text in the demo), such as "Scout NEAR NFTs under 50 NEAR and buy the best one if risk is low." The system then:

  1. Processes the command - Transcribes audio (or parses text), analyzes intent and emotion, and performs fraud detection to ensure the command is legitimate.

  2. Scouts the market - Uses Tavily for real-time web search and Yutori Scouts for ongoing market monitoring to identify deals matching the user's criteria.

  3. Verifies context - Queries Senso Context Hub to validate collection legitimacy, floor prices, and seller reputation against trusted sources, preventing hallucinations and scams.

  4. Checks policies - Retrieves user-defined risk policies from Neo4j (e.g., "maximum 50 NEAR per transaction") and validates the transaction against all applicable rules.

  5. Calculates impact - Uses Numeric to generate a "Flux Report" showing portfolio changes, variance analysis, and anomaly detection with AI-generated explanations.

  6. Executes securely - Leverages IronClaw's TEE to execute the transaction with credentials stored in an encrypted vault, ensuring the LLM never accesses private keys.

  7. Logs everything - Creates an immutable audit trail in Neo4j showing which policies were checked, what context was verified, and what actions were taken.
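To make step 4 concrete, here is a minimal sketch of a policy check. It is an illustration only, not the project's actual code: the `Policy` shape, the `blockedSeller` rule kind, and the function name `checkPolicies` are assumptions. Only the "maximum 50 NEAR per transaction" rule comes from the description above.

```typescript
// Hypothetical policy shapes; the real Neo4j policy schema is not shown in this write-up.
type Policy =
  | { id: string; kind: "maxPerTransaction"; limitNear: number }
  | { id: string; kind: "blockedSeller"; sellerId: string };

interface ProposedTransaction {
  priceNear: number;
  sellerId: string;
}

interface PolicyResult {
  approved: boolean;
  checked: string[];    // every policy id that was evaluated, for the audit trail
  violations: string[]; // ids of the policies that failed
}

// Evaluate the transaction against all applicable rules, not just the first failure,
// so the audit trail can list everything that was checked.
function checkPolicies(tx: ProposedTransaction, policies: Policy[]): PolicyResult {
  const violations: string[] = [];
  for (const policy of policies) {
    let violated: boolean;
    switch (policy.kind) {
      case "maxPerTransaction":
        violated = tx.priceNear > policy.limitNear;
        break;
      case "blockedSeller":
        violated = tx.sellerId === policy.sellerId;
        break;
    }
    if (violated) violations.push(policy.id);
  }
  return {
    approved: violations.length === 0,
    checked: policies.map((p) => p.id),
    violations,
  };
}
```

With a single `maxPerTransaction` policy at 50 NEAR, a 28.79 NEAR purchase is approved and a 150 NEAR purchase is returned with that policy listed under `violations`.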

Three Demo Scenarios:

FORGE includes three scenarios demonstrating different outcomes:

  1. 🎨 Buy NFT (Success) - Agent scouts, verifies, and purchases a 28.79 NEAR NFT. All checks pass, transaction executes successfully, and Flux Report shows portfolio impact.

  2. 🚫 Policy Reject - Agent attempts to buy a 150 NEAR NFT, but Neo4j governance blocks it for exceeding the 50 NEAR policy limit. Demonstrates policy enforcement preventing overspending.

  3. ⚠️ Risk Reject - Agent finds a 5 NEAR NFT from a suspicious seller. Senso verification detects high fraud risk (0.85) and blocks the transaction before it reaches governance or execution. Demonstrates fraud protection.
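The three outcomes above follow from the order of the gates: risk verification runs first, then policy governance, then execution. The sketch below shows that ordering under stated assumptions. The 50 NEAR limit and the 0.85 risk score come from the scenarios; the 0.7 risk threshold, the type names, and the function `decide` are hypothetical.

```typescript
type Outcome =
  | { status: "executed" }
  | { status: "rejected"; stage: "risk" | "policy"; reason: string };

interface Deal {
  priceNear: number;
  fraudRisk: number; // score in [0, 1], as reported by context verification
}

const RISK_THRESHOLD = 0.7; // assumed cut-off; the write-up only says 0.85 counts as high
const MAX_NEAR_PER_TX = 50; // the policy limit used in the demo

function decide(deal: Deal): Outcome {
  // Gate 1: fraud risk is checked before governance or execution is reached.
  if (deal.fraudRisk >= RISK_THRESHOLD) {
    return { status: "rejected", stage: "risk", reason: `fraud risk ${deal.fraudRisk}` };
  }
  // Gate 2: governance policy.
  if (deal.priceNear > MAX_NEAR_PER_TX) {
    return { status: "rejected", stage: "policy", reason: `exceeds ${MAX_NEAR_PER_TX} NEAR limit` };
  }
  // Both gates passed: the transaction may execute.
  return { status: "executed" };
}
```

Under these assumptions, a 28.79 NEAR deal with low risk executes, a 150 NEAR deal is rejected at the policy stage, and a 5 NEAR deal with risk 0.85 is rejected at the risk stage without ever reaching the policy check.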

The complete flow executes in under 15 seconds with real-time visualization on a live dashboard. Every decision is explainable through the governance graph, providing complete transparency and accountability.


How we built it

Architecture:

FORGE uses a microservices architecture with three primary layers:

  • Agent Runtime Layer - IronClaw TEE-secured runtime hosting an OpenAI GPT-4 powered agent with tool orchestration capabilities
  • Service Layer - Express.js API server integrating 9 sponsor technologies via REST APIs and SDKs
  • Presentation Layer - Next.js 14 dashboard with D3.js graph visualization and real-time updates via Server-Sent Events

Technology Stack:

Core Infrastructure:

  • IronClaw - TEE-secured agent runtime with encrypted credential vault and WASM sandbox for secure execution
  • OpenAI GPT-4 - LLM reasoning engine for autonomous decision-making
  • Neo4j - Graph database for policy storage, relationship modeling, and immutable audit trails
  • Render - Cloud hosting platform for dashboard and API deployment

Intelligence & Discovery:

  • Tavily - Real-time web search for market data and price discovery
  • Yutori - Continuous monitoring scouts for ongoing deal tracking
  • Senso Context Hub - Ground-truth verification to prevent AI hallucinations

Governance & Finance:

  • Modulate Velma - Voice transcription with intent analysis and fraud detection
  • Numeric - Financial reconciliation and portfolio variance analysis
  • Airbyte - ETL pipeline for data synchronization and initial seeding

Implementation Details:

The backend is built with TypeScript and Node.js, using Express for the API layer and the Neo4j driver for graph operations. We implemented circuit breaker patterns for API resilience and automatic fallback to mock responses when external services are unavailable.
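As a rough illustration of the circuit breaker with mock fallback, here is a minimal sketch. It is not the project's code: the class name, the default threshold of 3 failures, and the 30-second cool-down are assumptions chosen for the example.

```typescript
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly coolDownMs: number;

  constructor(failureThreshold = 3, coolDownMs = 30_000) {
    this.failureThreshold = failureThreshold;
    this.coolDownMs = coolDownMs;
  }

  // While open, callers should skip the live service entirely.
  isOpen(now: number = Date.now()): boolean {
    if (this.openedAt === null) return false;
    if (now - this.openedAt >= this.coolDownMs) {
      // Cool-down elapsed: allow a trial call. The failure count is kept,
      // so a single further failure re-opens the breaker immediately.
      this.openedAt = null;
      return false;
    }
    return true;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(now: number = Date.now()): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = now;
  }
}

// Try the live API unless the breaker is open; fall back to a mock response on failure.
async function callWithFallback<T>(
  breaker: CircuitBreaker,
  live: () => Promise<T>,
  mock: () => T,
): Promise<T> {
  if (breaker.isOpen()) return mock();
  try {
    const result = await live();
    breaker.recordSuccess();
    return result;
  } catch {
    breaker.recordFailure();
    return mock();
  }
}
```

One breaker instance per external service keeps a failing API from slowing down calls to the others.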

The frontend uses Next.js 14 with Tailwind CSS for styling and D3.js for interactive graph visualization. Real-time updates are delivered via Server-Sent Events, allowing the dashboard to reflect agent activity within 2 seconds.
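For readers unfamiliar with Server-Sent Events, the sketch below shows how one agent update could be serialized into the SSE wire format. The `AgentEvent` shape and the function name `toSseFrame` are assumptions for illustration; the actual event schema is not part of this write-up.

```typescript
// Hypothetical event shape for dashboard updates.
interface AgentEvent {
  type: string;     // e.g. "status" or "graph-update"; must not contain newlines
  payload: unknown; // any JSON-serializable value
}

// Serialize one event as an SSE frame: optional id, event name, data, then a blank line.
function toSseFrame(event: AgentEvent, id?: number): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${event.type}`);
  // JSON.stringify without indentation never emits raw newlines,
  // so the payload always fits on a single "data:" line.
  lines.push(`data: ${JSON.stringify(event.payload ?? null)}`);
  return lines.join("\n") + "\n\n";
}
```

In an Express handler, a frame like this would be written with `res.write(...)` after setting the `Content-Type: text/event-stream` header, and the browser would receive it through an `EventSource` listener for the matching event name.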

Security is enforced through IronClaw's TEE, where we built custom Rust WASM modules that retrieve credentials from the encrypted vault only during transaction execution. The LLM never receives credentials in its context, and every transaction includes credentialsExposed: false in the audit trail.
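The boundary described above can be sketched in TypeScript as a closure that loads the key only while executing and returns nothing but a signature and the audit flag. This is a conceptual stand-in, not IronClaw's API and not the Rust WASM modules themselves: `createExecutor`, `loadKey`, and `Signer` are hypothetical names, and a closure offers none of the hardware isolation a TEE provides.

```typescript
interface ExecutionReceipt {
  signature: string;
  credentialsExposed: false; // the receipt type cannot carry any other value
}

// Hypothetical signing function; in the real system signing happens inside the enclave.
type Signer = (privateKey: string, payload: string) => string;

function createExecutor(loadKey: () => string, sign: Signer) {
  return function execute(payload: string): ExecutionReceipt {
    // The key is fetched only at execution time and lives only in this scope.
    const privateKey = loadKey();
    const signature = sign(privateKey, payload);
    // Only the signature and the audit flag leave the boundary, never the key.
    return { signature, credentialsExposed: false };
  };
}
```

The agent side would only ever hold the `execute` function and the receipts it returns, so nothing it can put into an LLM prompt contains the key.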

We run the system with Docker Compose for local development and deploy to Render for production hosting, which gives a one-command setup for easy reproducibility.


Challenges we ran into

TEE Integration Complexity: Integrating IronClaw's Trusted Execution Environment required careful design to ensure credentials never leaked into LLM prompts. We built custom WASM tools that act as a secure bridge between the agent and the credential vault, retrieving keys only at execution time and logging access without exposing values.

API Orchestration: Coordinating 9 different sponsor APIs with varying authentication methods, rate limits, and response formats proved challenging. We implemented a unified tool interface with circuit breakers and automatic fallbacks to ensure the system remains functional even when external services are unavailable.

Performance Optimization: Meeting the sub-15-second target for the complete workflow required optimization at every level. We implemented connection pooling for Neo4j, aggressive caching of search results with a 5-minute TTL, and parallel tool execution where dependencies allowed.
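A minimal sketch of a cache with a 5-minute TTL is shown below. The class name `TtlCache` and its interface are assumptions for illustration; the clock is passed in as an optional argument so expiry can be tested without waiting.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;

  // Default TTL of 5 minutes, matching the figure quoted for search results.
  constructor(ttlMs: number = 5 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (now >= entry.expiresAt) {
      // Expired: drop the entry so the caller refetches.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```

Parallel tool execution pairs naturally with this: lookups that do not depend on each other can be started together and awaited with `Promise.all`, each consulting the cache before calling out.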

Graph Visualization: Rendering Neo4j's governance graph in real-time with D3.js while maintaining performance required careful optimization. We implemented incremental updates, limited node counts to 50, and used force-directed layouts that remain responsive under load.

Audit Trail Completeness: Ensuring every decision had complete provenance required thoughtful graph modeling. We designed a relationship schema that links Decision nodes to Policy, Context, and Transaction nodes, making the entire decision chain queryable and verifiable.
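A provenance lookup over such a schema might look like the sketch below, which builds a parameterized Cypher query for one decision. The node labels come from the description above. The relationship names `VERIFIED_CONTEXT` and `RESULTED_IN`, the `id` property, and the function name are assumptions, and `CHECKED_POLICY` is used here as one plausible name for the policy link.

```typescript
interface CypherQuery {
  text: string;
  parameters: Record<string, string>;
}

// Build a query that collects everything linked to one Decision node.
function buildProvenanceQuery(decisionId: string): CypherQuery {
  const text = [
    "MATCH (d:Decision {id: $decisionId})",
    "OPTIONAL MATCH (d)-[:CHECKED_POLICY]->(p:Policy)",
    "OPTIONAL MATCH (d)-[:VERIFIED_CONTEXT]->(c:Context)", // hypothetical relationship name
    "OPTIONAL MATCH (d)-[:RESULTED_IN]->(t:Transaction)",  // hypothetical relationship name
    "RETURN d,",
    "       collect(DISTINCT p) AS policies,",
    "       collect(DISTINCT c) AS contexts,",
    "       collect(DISTINCT t) AS transactions",
  ].join("\n");
  // The id is passed as a parameter rather than interpolated into the query text.
  return { text, parameters: { decisionId } };
}
```

With the Neo4j JavaScript driver, the result would be passed to `session.run(query.text, query.parameters)`. `OPTIONAL MATCH` keeps rejected decisions queryable even when no Transaction node exists.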


Accomplishments that we're proud of

Provable Security with IronClaw: We achieved genuine credential security with cryptographic proof through IronClaw's TEE. Private keys are retrieved from an encrypted vault only during transaction execution inside a secure enclave and are never exposed to the LLM or application code. Every transaction in the audit trail records credentialsExposed: false, which can be verified through the governance graph. This is not just a claim; it follows from the system's architecture.

Three Scenario Demonstrations: We implemented three distinct demo scenarios that showcase different aspects of the governance and verification layers: a successful purchase (Buy NFT), policy enforcement (Policy Reject blocking a 150 NEAR transaction), and fraud detection (Risk Reject catching a suspicious seller). Together they show that the system handles both success and failure cases gracefully.

Complete Integration: We integrated all 9 sponsor technologies into one cohesive platform. Each technology serves a specific purpose, and together they cover the full workflow from command intake to audited execution.

Performance: The complete workflow consistently executes in 10-12 seconds, well under our 15-second target. This makes the demo engaging and proves the architecture is performant enough for real-world use.

Transparency: Every decision is logged in Neo4j with complete provenance. Users can query "why did the agent make this decision?" and receive a full answer showing which policies were checked, what context was verified, and what calculations were performed.

Resilience: The system handles multiple consecutive demo runs without errors and automatically falls back to mock responses when APIs are unavailable. This graceful degradation keeps the workflow running even when an external service is down.

Testing: We implemented 50+ property-based tests validating correctness properties including idempotence, invariants, and round-trip preservation. This comprehensive test coverage provides confidence in system behavior under varied inputs.
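To show what an idempotence property looks like in practice, here is a small hand-rolled sketch. It does not reproduce the project's test suite or its tooling: `forAll`, `makeRng`, `randomCommand`, and the function under test, `normalizeCommand`, are all hypothetical.

```typescript
// Deterministic pseudo-random generator (a simple LCG) so failures are reproducible.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) >>> 0;
    return state / 0x100000000; // value in [0, 1)
  };
}

// Run the property on generated inputs; return the first counterexample, or null if none.
function forAll<T>(
  generate: (rand: () => number) => T,
  property: (value: T) => boolean,
  runs = 200,
  seed = 42,
): T | null {
  const rand = makeRng(seed);
  for (let i = 0; i < runs; i++) {
    const value = generate(rand);
    if (!property(value)) return value;
  }
  return null;
}

// Hypothetical function under test: canonicalize a text command.
function normalizeCommand(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

// Generate short strings mixing letters, digits, spaces, and tabs.
function randomCommand(rand: () => number): string {
  const alphabet = "abcXYZ 019\t";
  const length = Math.floor(rand() * 20);
  let out = "";
  for (let i = 0; i < length; i++) {
    out += alphabet.charAt(Math.floor(rand() * alphabet.length));
  }
  return out;
}

// Idempotence: normalizing twice gives the same result as normalizing once.
const counterexample = forAll(
  randomCommand,
  (s) => normalizeCommand(normalizeCommand(s)) === normalizeCommand(s),
);
```

A `null` result means no generated input violated the property. A dedicated property-testing library would add input shrinking and richer generators on top of the same idea.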


What we learned

TEE Architecture: Working with IronClaw's Trusted Execution Environment demonstrated that TEEs are essential for autonomous AI agents handling sensitive operations. The ability to cryptographically prove credentials were never exposed fundamentally changes the security model.

Graph Databases for Governance: Neo4j's graph model proved ideal for governance and audit trails. Relationships like (Decision)-[:CHECKED_POLICY]->(Policy) make provenance queries natural to write as path traversals, where a relational schema would typically need a chain of joins across several tables.

Context Verification: Senso's ground-truth verification showed the importance of external validation for AI agents. Verifying information against trusted sources with confidence scores prevents costly mistakes from hallucinations.

Resilience Patterns: Implementing circuit breakers and automatic fallbacks taught us that production systems must gracefully degrade. The ability to continue functioning when external APIs fail is a feature, not a workaround.

Voice Interface Safety: Modulate's fraud detection highlighted the need for safety layers in voice interfaces for financial transactions. Analyzing emotion, urgency, and fraud risk before executing commands helps prevent social engineering attacks.

Integration Complexity: Coordinating multiple APIs requires unified interfaces, consistent error handling, and comprehensive logging. Each API has unique characteristics that must be abstracted away for maintainable code.

Real-Time Visualization: The live dashboard transforms the demo from abstract to tangible. Watching the governance graph grow in real-time and seeing agent status updates makes the system's operation visible and trustworthy.


What's next for FORGE

Immediate Next Steps:

Replace the mock NEAR wallet with real blockchain integration to enable actual on-chain transactions. This requires production-grade error handling and transaction monitoring.

Add multi-user support with authentication, user management, and isolated policy spaces. Each user would have their own governance graph and risk policies.

Develop a mobile application for on-the-go commerce with voice commands, push notifications for deal alerts, and a mobile-optimized dashboard.

Medium-Term Goals:

Expand beyond NEAR to support multiple blockchains (Ethereum, Solana, Polygon) with a unified interface for cross-chain commerce.

Integrate DeFi protocols for yield optimization, liquidity provision, and automated portfolio rebalancing based on user-defined strategies.

Add social features allowing users to share policies, follow successful traders, and access a marketplace for pre-configured policy templates.

Implement advanced analytics with portfolio tracking, performance metrics, and AI-generated insights comparing agent decisions against market benchmarks.

Long-Term Vision:

Build regulatory compliance modules for different jurisdictions with automatic reporting, tax calculation, and audit trail export for regulatory review.

Develop enterprise features including multi-signature governance, approval workflows, and institutional-grade security for DAOs and investment funds.

Explore zero-knowledge proofs for privacy-preserving audit trails that prove compliance without revealing transaction details.

Create a developer SDK and plugin marketplace enabling third-party developers to extend FORGE with custom tools and integrations.

FORGE aims to become the standard platform for autonomous AI agents in crypto commerce—secure, transparent, and trustworthy by design.

