Anvil SDK

Tagline

The last tool library you'll ever write. Anvil generates, heals, and evolves your AI agent tools—automatically.


Inspiration

Every AI developer has lived this nightmare:

You spend weeks building the perfect agent. It searches Notion, queries databases, fetches weather data. You ship it. You sleep well.

Then Notion updates their API.

Your tool breaks. Your agent breaks. Your users break. You're back to writing code at 2 AM, reading documentation, fixing edge cases, and praying nothing else changes.

We call this "Tool Rot"—the inevitable decay of hard-coded integrations.

After rewriting our 47th API integration for the 3rd time, we asked ourselves: What if tools could write and repair themselves?

That question became Anvil.


What it does

Anvil is a Just-In-Time (JIT) Infrastructure SDK for AI agents. Instead of hard-coding tool implementations, you define intents—what you want the tool to do—and Anvil generates production-ready code on the fly.

The Magic in 3 Lines

from anvil import Anvil

anvil = Anvil()
tool = anvil.use_tool("search_notion", intent="Search my Notion workspace")

# That's it. Anvil reads the live docs, writes the code, and handles auth.
result = tool.run(query="Q4 Planning")

Core Capabilities

🔧 JIT Code Generation

  • Reads live API documentation at runtime
  • Generates complete, type-hinted Python functions
  • Handles authentication, error handling, and rate limits automatically

🩹 Self-Healing

  • Detects when tools fail (404s, schema changes, deprecations)
  • Automatically fetches updated documentation
  • Regenerates and patches the code
  • Retries without human intervention

🔐 Interactive Credential Resolution

  • Detects missing API keys from tool responses
  • Prompts users with helpful context ("Get your Notion API key at...")
  • Securely saves to .env and retries automatically

📦 Glass-Box Architecture

  • All generated code is saved to ./anvil_tools/
  • Fully readable, editable, and version-controlled
  • "Eject" any tool to take manual control

🔌 Framework Adapters

  • Native integration with LangChain, CrewAI, AutoGen, OpenAI Agents SDK
  • Convert any Anvil tool with .to_langchain(), .to_crewai(), etc.
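Conceptually, an adapter just repackages a tool's metadata and run() into the target framework's tool shape. A minimal sketch of the pattern (the `FrameworkTool` type and `to_framework` helper are illustrative stand-ins, not the SDK's actual adapter code):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class FrameworkTool:
    """Stand-in for a target framework's tool type (e.g. a LangChain Tool)."""
    name: str
    description: str
    func: Callable[..., object]

def to_framework(anvil_tool) -> FrameworkTool:
    """Wrap anything with .name, .intent, and .run into the
    callable-plus-metadata shape most agent frameworks expect."""
    return FrameworkTool(
        name=anvil_tool.name,
        description=anvil_tool.intent,
        func=anvil_tool.run,
    )
```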

🤖 Multi-Provider

  • Works with Claude, GPT-4, and Grok
  • Bring your own API keys, use your preferred model

How we built it

Architecture

┌─────────────────────────────────────────────────────────────┐
│                         Anvil Core                          │
│                    (Orchestration Layer)                    │
└──────────────┬──────────────────────────────────────────────┘
               │
    ┌──────────┴──────────────────────────┐
    │                                     │
┌───▼─────────────┐           ┌───────────▼────────────┐
│  Tool Manager   │           │     Tool Loader        │
│  (File System)  │           │  (Dynamic Import)      │
└─────────────────┘           └────────────────────────┘
         │                              │
         │ manages                      │ executes
         ▼                              ▼
┌──────────────────────────────────────────────────────┐
│           Generated Tool Files (.py)                 │
│         /anvil_tools/search_notion.py                │
└──────────────────────────────────────────────────────┘

The JIT Pipeline

  1. Intent Analysis - Parse user's natural language intent
  2. Documentation Scraping - Fetch live API docs via FireCrawl
  3. Code Generation - LLM generates complete Python function
  4. Dependency Detection - Scan imports, auto-install packages
  5. Verification - Optional sandbox execution for security
  6. Caching - Save to disk with header protocol for versioning
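The six steps above can be sketched as one orchestration function. The scraping, LLM, and dependency steps are injected stubs here (not Anvil's real implementations), which keeps the flow itself easy to follow:

```python
import hashlib

def intent_hash(intent: str) -> str:
    # Stable 8-hex-digit fingerprint of the intent (step 6's cache key)
    return hashlib.sha256(intent.encode()).hexdigest()[:8]

def jit_generate(name, intent, fetch_docs, generate_code, detect_deps):
    """Hypothetical pipeline skeleton for steps 1-6."""
    docs = fetch_docs(intent)           # 2. live documentation scrape
    code = generate_code(intent, docs)  # 3. LLM writes the function
    deps = detect_deps(code)            # 4. packages to auto-install
    header = (                          # 6. header protocol for versioning
        "# ANVIL-MANAGED: true\n"
        "# version: 1.0\n"
        f"# hash: {intent_hash(intent)}\n"
    )
    return {
        "file": f"anvil_tools/{name}.py",
        "source": header + code,
        "dependencies": deps,
    }
```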

Header Protocol

Every generated file includes metadata:

# ANVIL-MANAGED: true
# version: 1.0
# hash: 8f3a2b1c

  • ANVIL-MANAGED: true - Anvil can regenerate this file
  • ANVIL-MANAGED: false - User has "ejected", don't touch
  • hash - Intent hash to detect when regeneration is needed
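Reading the header back is a few lines of comment parsing. A sketch of the protocol described above (not the SDK's actual parser):

```python
def parse_header(source: str) -> dict:
    """Extract Anvil's metadata comments from a generated file."""
    meta = {}
    for line in source.splitlines():
        if not line.startswith("# "):
            break  # header ends at the first non-comment line
        key, _, value = line[2:].partition(": ")
        meta[key] = value
    return meta
```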

Self-Healing Flow

tool.run() fails
    ↓
Detect error type (404, schema, timeout)
    ↓
Fetch updated documentation
    ↓
Send error + old code + new docs to LLM
    ↓
Generate fixed code
    ↓
Increment version (1.0 → 1.1)
    ↓
Retry automatically
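The outer loop of that flow can be sketched as a retry wrapper: on failure, a `heal` callback (standing in for the fetch-docs-and-regenerate steps) returns a repaired callable, and the loop retries without human intervention:

```python
def run_with_healing(run, heal, max_attempts=3):
    """Hypothetical self-healing loop; `heal` is assumed to regenerate
    the tool from the error plus fresh documentation."""
    last_err = None
    for _ in range(max_attempts):
        try:
            return run()
        except Exception as err:   # 404s, schema drift, timeouts
            last_err = err
            run = heal(run, err)   # regenerate a fixed callable
    raise last_err
```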

Tech Stack

  • Python 3.10+ - Core SDK
  • Anthropic Claude / OpenAI GPT-4 / xAI Grok - Code generation
  • FireCrawl - Live documentation scraping
  • Rich - Beautiful CLI experience
  • Click - Command-line interface
  • importlib - Dynamic module loading
  • Docker - Optional sandboxed verification

Challenges we ran into

1. The "Hallucination Problem"

LLMs sometimes generate plausible-looking but incorrect API calls. We solved this by:

  • Always fetching live documentation, never relying on training data
  • Structured prompts that enforce specific output formats
  • Verification mode that runs generated code in a sandbox

2. Import Hell

Generated code might import packages the user doesn't have. We built:

  • Automatic dependency detection from import statements
  • Smart detection of common packages (requests, httpx, pandas, etc.)
  • Clear error messages guiding users to install missing deps

3. The Credential Dance

Every API needs different auth. Some need headers, some need OAuth, some need query params. Our solution:

  • Standardized missing_credential response pattern
  • Knowledge base of 50+ common API keys with help URLs
  • Interactive prompting that saves to .env
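The standardized response pattern makes the interactive loop simple. A sketch with the prompt and .env writer injected as callbacks (the field names here are illustrative, not the SDK's exact schema):

```python
def handle_missing_credential(response: dict, ask, save) -> bool:
    """If a tool reports a missing credential, prompt the user with the
    help URL from the knowledge base and persist the value (e.g. to .env).
    Returns True when a credential was collected, so the caller can retry."""
    if response.get("status") != "missing_credential":
        return False
    key = response["key"]                 # e.g. "NOTION_API_KEY"
    hint = response.get("help_url", "")   # "Get your Notion API key at..."
    value = ask(f"Enter {key} ({hint}): ")
    save(key, value)                      # write KEY=value to .env
    return True
```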

4. Regeneration vs. User Edits

What if a user manually edits generated code? We created the Header Protocol:

  • Files marked ANVIL-MANAGED: true can be regenerated
  • Users can "eject" by setting to false
  • Intent hashing prevents unnecessary regeneration
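The regeneration decision reduces to two checks: respect ejected files, and compare intent fingerprints. A sketch under those rules (helper names are illustrative):

```python
import hashlib

def intent_fingerprint(intent: str) -> str:
    # Same idea as the header's hash field: a short, stable digest
    return hashlib.sha256(intent.encode()).hexdigest()[:8]

def should_regenerate(meta: dict, intent: str) -> bool:
    """Never touch ejected files; otherwise regenerate only when the
    stored hash no longer matches the current intent."""
    if meta.get("ANVIL-MANAGED") != "true":
        return False  # user has ejected this tool
    return meta.get("hash") != intent_fingerprint(intent)
```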

Accomplishments that we're proud of

  • Zero to working tool in 3 lines of code - The developer experience we always wanted
  • Self-healing actually works - Watch a tool break, heal, and succeed without touching code
  • Framework-agnostic - One tool works with LangChain, CrewAI, AutoGen, and OpenAI Agents
  • Published on PyPI - pip install anvil-agent just works
  • Comprehensive docs - Full documentation site with examples and guides
  • Production-ready - Type hints, error handling, logging, and testing built-in

What we learned

  1. LLMs are incredible code generators when given the right context (live docs > training data)
  2. Developer experience is everything - A beautiful CLI makes people actually want to use your tool
  3. Glass-box beats black-box - Developers trust tools they can inspect and modify
  4. Self-healing is possible - With the right architecture, code can fix itself

What's next for Anvil

Anvil Cloud (In Development)

The next evolution: a global cache of pre-generated, verified tools.

anvil = Anvil(mode="cloud")  # Instant retrieval, no LLM latency

How it works:

  • First user generates a "Search Notion" tool → cached globally
  • Second user requests same intent → instant retrieval
  • Tools are verified, tested, and version-locked
  • Fallback to local generation if cache miss
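The cache-or-generate flow above can be sketched in a few lines (the cache is modeled as a plain dict here; the real service's storage and verification layers are out of scope):

```python
import hashlib

def get_tool(intent, cloud_cache, generate_locally):
    """Hypothetical cloud lookup: a hit returns instantly with no LLM
    call; a miss falls back to local generation and publishes the result."""
    key = hashlib.sha256(intent.encode()).hexdigest()
    if key in cloud_cache:
        return cloud_cache[key]      # instant retrieval
    tool = generate_locally(intent)  # local JIT fallback
    cloud_cache[key] = tool          # shared for the next user
    return tool
```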

Benefits:

  • Instant - No waiting for LLM generation
  • Verified - All cached tools are tested and validated
  • Cost-effective - Share generation costs across users
  • Reliable - Cached tools don't depend on LLM availability

Roadmap

  • [ ] Anvil Cloud beta - Global tool cache
  • [ ] Visual tool builder - GUI for non-developers
  • [ ] Tool marketplace - Share and discover community tools
  • [ ] Enterprise features - SSO, audit logs, private caches
  • [ ] More frameworks - Haystack, Semantic Kernel, AutoGPT

Links

  • Documentation: anvil-docs-theta.vercel.app
  • PyPI Package: pypi.org/project/anvil-agent
  • GitHub Repository: github.com/Kart-ing/anvil-sdk
  • Getting Started Guide (Docs: Introduction)
  • How It Works (Docs: Concepts)
  • Self-Healing Deep Dive (Docs: Self-Healing)
  • LangChain Integration (Docs: LangChain)
  • CrewAI Integration (Docs: CrewAI)
  • API Reference (Docs: Reference)

Try it now

# Install
pip install "anvil-agent[anthropic]"

# Initialize (interactive setup)
anvil init

# Start building
python -c "
from anvil import Anvil
anvil = Anvil()
tool = anvil.use_tool('hello_world', intent='Say hello to someone by name')
print(tool.run(name='World'))
"

Built With

  • Python
  • Anthropic Claude API
  • OpenAI API
  • FireCrawl
  • Rich (Terminal UI)
  • Click (CLI)
  • Docker (Sandbox)
  • Vercel (Documentation)

The Vision

We believe the future of AI agents is tools that write themselves.

Not because developers are lazy—but because they shouldn't have to choose between building features and maintaining integrations.

Anvil is the foundation. Write your intent once, and never worry about tool rot again.

It reads the docs. It writes the code. It heals itself.

Ship faster. Break less. Build the future.


Built with ❤️ and too much coffee
