Anvil SDK

Tagline

The last tool library you'll ever write. Anvil generates, heals, and evolves your AI agent tools—automatically.


Inspiration

Every AI developer has lived this nightmare:

You spend weeks building the perfect agent. It searches Notion, queries databases, fetches weather data. You ship it. You sleep well.

Then Notion updates their API.

Your tool breaks. Your agent breaks. Your users break. You're back to writing code at 2 AM, reading documentation, fixing edge cases, and praying nothing else changes.

We call this "Tool Rot"—the inevitable decay of hard-coded integrations.

After rewriting our 47th API integration for the 3rd time, we asked ourselves: What if tools could write and repair themselves?

That question became Anvil.


What it does

Anvil is a Just-In-Time (JIT) Infrastructure SDK for AI agents. Instead of hard-coding tool implementations, you define intents—what you want the tool to do—and Anvil generates production-ready code on the fly.

The Magic in 3 Lines

from anvil import Anvil

anvil = Anvil()
tool = anvil.use_tool("search_notion", intent="Search my Notion workspace")

# That's it. Anvil reads the live docs, writes the code, and handles auth.
result = tool.run(query="Q4 Planning")

Core Capabilities

🔧 JIT Code Generation

  • Reads live API documentation at runtime
  • Generates complete, type-hinted Python functions
  • Handles authentication, error handling, and rate limits automatically

🩹 Self-Healing

  • Detects when tools fail (404s, schema changes, deprecations)
  • Automatically fetches updated documentation
  • Regenerates and patches the code
  • Retries without human intervention

🔐 Interactive Credential Resolution

  • Detects missing API keys from tool responses
  • Prompts users with helpful context ("Get your Notion API key at...")
  • Securely saves to .env and retries automatically

📦 Glass-Box Architecture

  • All generated code is saved to ./anvil_tools/
  • Fully readable, editable, and version-controlled
  • "Eject" any tool to take manual control

🔌 Framework Adapters

  • Native integration with LangChain, CrewAI, AutoGen, OpenAI Agents SDK
  • Convert any Anvil tool with .to_langchain(), .to_crewai(), etc.
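Conceptually, an adapter just repackages a tool's metadata and run() into the target framework's tool shape. A minimal sketch of the pattern (the `FrameworkTool` type and `to_framework` helper are illustrative stand-ins, not the SDK's actual adapter code):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class FrameworkTool:
    """Stand-in for a target framework's tool type (e.g. a LangChain Tool)."""
    name: str
    description: str
    func: Callable[..., object]

def to_framework(anvil_tool) -> FrameworkTool:
    """Wrap anything with .name, .intent, and .run into the
    callable-plus-metadata shape most agent frameworks expect."""
    return FrameworkTool(
        name=anvil_tool.name,
        description=anvil_tool.intent,
        func=anvil_tool.run,
    )
```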

🤖 Multi-Provider

  • Works with Claude, GPT-4, and Grok
  • Bring your own API keys, use your preferred model

How we built it

Architecture

┌─────────────────────────────────────────────────────────────┐
│                         Anvil Core                          │
│                    (Orchestration Layer)                    │
└──────────────┬──────────────────────────────────────────────┘
               │
    ┌──────────┴──────────────────────────┐
    │                                     │
┌───▼─────────────┐           ┌───────────▼────────────┐
│  Tool Manager   │           │     Tool Loader        │
│  (File System)  │           │  (Dynamic Import)      │
└─────────────────┘           └────────────────────────┘
         │                              │
         │ manages                      │ executes
         ▼                              ▼
┌──────────────────────────────────────────────────────┐
│           Generated Tool Files (.py)                 │
│         /anvil_tools/search_notion.py                │
└──────────────────────────────────────────────────────┘

The JIT Pipeline

  1. Intent Analysis - Parse user's natural language intent
  2. Documentation Scraping - Fetch live API docs via FireCrawl
  3. Code Generation - LLM generates complete Python function
  4. Dependency Detection - Scan imports, auto-install packages
  5. Verification - Optional sandbox execution for security
  6. Caching - Save to disk with header protocol for versioning
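The six steps above can be sketched as one orchestration function. The scraping, LLM, and dependency steps are injected stubs here (not Anvil's real implementations), which keeps the flow itself easy to follow:

```python
import hashlib

def intent_hash(intent: str) -> str:
    # Stable 8-hex-digit fingerprint of the intent (step 6's cache key)
    return hashlib.sha256(intent.encode()).hexdigest()[:8]

def jit_generate(name, intent, fetch_docs, generate_code, detect_deps):
    """Hypothetical pipeline skeleton for steps 1-6."""
    docs = fetch_docs(intent)           # 2. live documentation scrape
    code = generate_code(intent, docs)  # 3. LLM writes the function
    deps = detect_deps(code)            # 4. packages to auto-install
    header = (                          # 6. header protocol for versioning
        "# ANVIL-MANAGED: true\n"
        "# version: 1.0\n"
        f"# hash: {intent_hash(intent)}\n"
    )
    return {
        "file": f"anvil_tools/{name}.py",
        "source": header + code,
        "dependencies": deps,
    }
```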

Header Protocol

Every generated file includes metadata:

# ANVIL-MANAGED: true
# version: 1.0
# hash: 8f3a2b1c

  • ANVIL-MANAGED: true - Anvil can regenerate this file
  • ANVIL-MANAGED: false - User has "ejected", don't touch
  • hash - Intent hash to detect when regeneration is needed
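Reading the header back is a few lines of comment parsing. A sketch of the protocol described above (not the SDK's actual parser):

```python
def parse_header(source: str) -> dict:
    """Extract Anvil's metadata comments from a generated file."""
    meta = {}
    for line in source.splitlines():
        if not line.startswith("# "):
            break  # header ends at the first non-comment line
        key, _, value = line[2:].partition(": ")
        meta[key] = value
    return meta
```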

Self-Healing Flow

tool.run() fails
    ↓
Detect error type (404, schema, timeout)
    ↓
Fetch updated documentation
    ↓
Send error + old code + new docs to LLM
    ↓
Generate fixed code
    ↓
Increment version (1.0 → 1.1)
    ↓
Retry automatically
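The outer loop of that flow can be sketched as a retry wrapper: on failure, a `heal` callback (standing in for the fetch-docs-and-regenerate steps) returns a repaired callable, and the loop retries without human intervention:

```python
def run_with_healing(run, heal, max_attempts=3):
    """Hypothetical self-healing loop; `heal` is assumed to regenerate
    the tool from the error plus fresh documentation."""
    last_err = None
    for _ in range(max_attempts):
        try:
            return run()
        except Exception as err:   # 404s, schema drift, timeouts
            last_err = err
            run = heal(run, err)   # regenerate a fixed callable
    raise last_err
```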

Tech Stack

  • Python 3.10+ - Core SDK
  • Anthropic Claude / OpenAI GPT-4 / xAI Grok - Code generation
  • FireCrawl - Live documentation scraping
  • Rich - Beautiful CLI experience
  • Click - Command-line interface
  • importlib - Dynamic module loading
  • Docker - Optional sandboxed verification

Challenges we ran into

1. The "Hallucination Problem"

LLMs sometimes generate plausible-looking but incorrect API calls. We solved this by:

  • Always fetching live documentation, never relying on training data
  • Structured prompts that enforce specific output formats
  • Verification mode that runs generated code in a sandbox

2. Import Hell

Generated code might import packages the user doesn't have. We built:

  • Automatic dependency detection from import statements
  • Smart detection of common packages (requests, httpx, pandas, etc.)
  • Clear error messages guiding users to install missing deps

3. The Credential Dance

Every API needs different auth. Some need headers, some need OAuth, some need query params. Our solution:

  • Standardized missing_credential response pattern
  • Knowledge base of 50+ common API keys with help URLs
  • Interactive prompting that saves to .env
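The standardized response pattern makes the interactive loop simple. A sketch with the prompt and .env writer injected as callbacks (the field names here are illustrative, not the SDK's exact schema):

```python
def handle_missing_credential(response: dict, ask, save) -> bool:
    """If a tool reports a missing credential, prompt the user with the
    help URL from the knowledge base and persist the value (e.g. to .env).
    Returns True when a credential was collected, so the caller can retry."""
    if response.get("status") != "missing_credential":
        return False
    key = response["key"]                 # e.g. "NOTION_API_KEY"
    hint = response.get("help_url", "")   # "Get your Notion API key at..."
    value = ask(f"Enter {key} ({hint}): ")
    save(key, value)                      # write KEY=value to .env
    return True
```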

4. Regeneration vs. User Edits

What if a user manually edits generated code? We created the Header Protocol:

  • Files marked ANVIL-MANAGED: true can be regenerated
  • Users can "eject" by setting to false
  • Intent hashing prevents unnecessary regeneration
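The regeneration decision reduces to two checks: respect ejected files, and compare intent fingerprints. A sketch under those rules (helper names are illustrative):

```python
import hashlib

def intent_fingerprint(intent: str) -> str:
    # Same idea as the header's hash field: a short, stable digest
    return hashlib.sha256(intent.encode()).hexdigest()[:8]

def should_regenerate(meta: dict, intent: str) -> bool:
    """Never touch ejected files; otherwise regenerate only when the
    stored hash no longer matches the current intent."""
    if meta.get("ANVIL-MANAGED") != "true":
        return False  # user has ejected this tool
    return meta.get("hash") != intent_fingerprint(intent)
```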

Accomplishments that we're proud of

  • Zero to working tool in 3 lines of code - The developer experience we always wanted
  • Self-healing actually works - Watch a tool break, heal, and succeed without touching code
  • Framework-agnostic - One tool works with LangChain, CrewAI, AutoGen, and OpenAI Agents
  • Published on PyPI - pip install anvil-agent just works
  • Comprehensive docs - Full documentation site with examples and guides
  • Production-ready - Type hints, error handling, logging, and testing built-in

What we learned

  1. LLMs are incredible code generators when given the right context (live docs > training data)
  2. Developer experience is everything - A beautiful CLI makes people actually want to use your tool
  3. Glass-box beats black-box - Developers trust tools they can inspect and modify
  4. Self-healing is possible - With the right architecture, code can fix itself

What's next for Anvil

Anvil Cloud (In Development)

The next evolution: a global cache of pre-generated, verified tools.

anvil = Anvil(mode="cloud")  # Instant retrieval, no LLM latency

How it works:

  • First user generates a "Search Notion" tool → cached globally
  • Second user requests same intent → instant retrieval
  • Tools are verified, tested, and version-locked
  • Fallback to local generation if cache miss
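The cache-or-generate flow above can be sketched in a few lines (the cache is modeled as a plain dict here; the real service's storage and verification layers are out of scope):

```python
import hashlib

def get_tool(intent, cloud_cache, generate_locally):
    """Hypothetical cloud lookup: a hit returns instantly with no LLM
    call; a miss falls back to local generation and publishes the result."""
    key = hashlib.sha256(intent.encode()).hexdigest()
    if key in cloud_cache:
        return cloud_cache[key]      # instant retrieval
    tool = generate_locally(intent)  # local JIT fallback
    cloud_cache[key] = tool          # shared for the next user
    return tool
```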

Benefits:

  • Instant - No waiting for LLM generation
  • Verified - All cached tools are tested and validated
  • Cost-effective - Share generation costs across users
  • Reliable - Cached tools don't depend on LLM availability

Roadmap

  • [ ] Anvil Cloud beta - Global tool cache
  • [ ] Visual tool builder - GUI for non-developers
  • [ ] Tool marketplace - Share and discover community tools
  • [ ] Enterprise features - SSO, audit logs, private caches
  • [ ] More frameworks - Haystack, Semantic Kernel, AutoGPT

Links

  • Documentation: anvil-docs-theta.vercel.app
  • PyPI Package: pypi.org/project/anvil-agent
  • GitHub Repository: github.com/Kart-ing/anvil-sdk
  • Getting Started Guide (Docs: Introduction)
  • How It Works (Docs: Concepts)
  • Self-Healing Deep Dive (Docs: Self-Healing)
  • LangChain Integration (Docs: LangChain)
  • CrewAI Integration (Docs: CrewAI)
  • API Reference (Docs: Reference)

Try it now

# Install
pip install "anvil-agent[anthropic]"

# Initialize (interactive setup)
anvil init

# Start building
python -c "
from anvil import Anvil
anvil = Anvil()
tool = anvil.use_tool('hello_world', intent='Say hello to someone by name')
print(tool.run(name='World'))
"

Built With

  • Python
  • Anthropic Claude API
  • OpenAI API
  • FireCrawl
  • Rich (Terminal UI)
  • Click (CLI)
  • Docker (Sandbox)
  • Vercel (Documentation)

The Vision

We believe the future of AI agents is tools that write themselves.

Not because developers are lazy—but because they shouldn't have to choose between building features and maintaining integrations.

Anvil is the foundation. Write your intent once, and never worry about tool rot again.

It reads the docs. It writes the code. It heals itself.

Ship faster. Break less. Build the future.


Built with ❤️ and too much coffee
